python - How do I read specific lines (by line number) from a file?

Question

I'm using a for loop to read a file, but I only want to read specific lines, say lines #26 and #30. Is there a built-in feature to achieve this?

score 310 · Accepted Answer

If the file to read is big and you don't want to read the whole file into memory at once:

fp = open("file")
for i, line in enumerate(fp):
    if i == 25:
        print(line)  # 26th line
    elif i == 29:
        print(line)  # 30th line
    elif i > 29:
        break  # nothing of interest past line 30
fp.close()

Note that i == n-1 for the nth line, because enumerate() starts counting at 0.

In Python 2.6 or later:

with open("file") as fp:
    for i, line in enumerate(fp):
        if i == 25:
            print(line)  # 26th line
        elif i == 29:
            print(line)  # 30th line
        elif i > 29:
            break  # nothing of interest past line 30
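
The loop above can be wrapped in a small helper. This is only a minimal sketch and not part of the original answer: the name `read_lines_by_number` and its 1-based `wanted` argument are assumptions made for illustration, and an in-memory `io.StringIO` stands in for a real file.

```python
import io


def read_lines_by_number(fp, wanted):
    """Return {line_number: text} for the 1-based line numbers in `wanted`.

    `fp` is any iterable of lines (for example an open file). Reading stops
    as soon as the highest requested line has been seen.
    """
    wanted = set(wanted)
    found = {}
    if not wanted:
        return found
    last = max(wanted)
    for number, line in enumerate(fp, start=1):
        if number in wanted:
            found[number] = line
        if number >= last:
            break  # nothing of interest beyond this point
    return found


# Demonstration on an in-memory "file" with 40 numbered lines.
sample = io.StringIO("".join(f"line {n}\n" for n in range(1, 41)))
result = read_lines_by_number(sample, [26, 30])
print(result)
```

Requesting a line past the end of the file simply leaves that key out of the result instead of raising.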

score 189 · Accepted Answer

The quick answer:

with open('filename') as f:
    lines = f.readlines()
print(lines[25])  # 26th line
print(lines[29])  # 30th line

or:

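lines = [25, 29]  # zero-based indices of lines 26 and 30
i = 0
f = open('filename')
for line in f:
    if i in lines:
        print(line)  # print the line itself, not its index
    i += 1
f.close()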

There is a more elegant solution for extracting many lines: linecache (courtesy of "python: how to jump to a specific line in a huge text file?", an earlier stackoverflow.com question).

Quoting the Python documentation linked above:

>>> import linecache
>>> linecache.getline('/etc/passwd', 4)
'sys:x:3:3:sys:/dev:/bin/sh\n'

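Change the 4 to the line number you want and you're set. Note that linecache numbers lines from 1, so passing 4 returns the fourth line, as in the /etc/passwd example above.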

If the file might be very large and reading it into memory could cause problems, it is a good idea to take @Alok's advice and use enumerate().

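To conclude:

Use fileobject.readlines() or for line in fileobject as a quick solution for small files.
Use linecache for a more elegant solution. It is quite fast when reading from many files, and repeated reads are possible.
Take @Alok's advice and use enumerate() for files that could be very large and might not fit into memory. Note that this method can be slow because the file is read sequentially.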
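
To make the linecache option concrete, here is a small self-contained sketch. The temporary file and the variable names (`path`, `second`, `missing`) are assumptions made for the example; only `linecache.getline` and `linecache.clearcache` come from the standard library.

```python
import linecache
import os
import tempfile

# Write a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

second = linecache.getline(path, 2)    # line numbers start at 1
missing = linecache.getline(path, 99)  # out of range: returns '' instead of raising
print(repr(second), repr(missing))

linecache.clearcache()  # drop the cached copy of the file
os.remove(path)
```

Keep in mind that linecache caches the whole file in memory on first access, so it suits repeated lookups better than a single pass over a very large file.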

score 37 · Accepted Answer

To offer another solution:

import linecache
linecache.getline('Sample.txt', Number_of_Line)

I hope this is quick and easy :)

score 34 · Accepted Answer

A fast and compact approach could be:

def picklines(thefile, whatlines):
  return [x for i, x in enumerate(thefile) if i in whatlines]

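This accepts any open file-like object thefile (leaving it up to the caller whether it is opened from a disk file, a socket, or some other file-like stream) and a set of zero-based line indices whatlines, and returns a list, with a low memory footprint and reasonable speed. If the number of lines to be returned is huge, you might prefer a generator: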

def yieldlines(thefile, whatlines):
  return (x for i, x in enumerate(thefile) if i in whatlines)

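which is basically only good for looping over. Note that the only difference is in the return statement: square brackets make a list comprehension, while parentheses make a generator expression.

Further note that despite the mention of "lines" and "file", these functions are much more general. They work on any iterable, be it an open file or anything else, and return a list (or generator) of items selected by their running item number. So I'd suggest using more appropriately general names ;-).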
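
As a rough illustration of that last point, the same idea with a neutral name works on a file-like object and on a plain range alike. The name `pick_items` is hypothetical and chosen only for this sketch.

```python
import io


def pick_items(iterable, indices):
    """Return a generator over the items whose zero-based position is in `indices`."""
    indices = set(indices)  # set membership tests are O(1)
    return (item for i, item in enumerate(iterable) if i in indices)


from_file = list(pick_items(io.StringIO("a\nb\nc\nd\n"), [0, 2]))
from_range = list(pick_items(range(100, 200), [25, 29]))
print(from_file, from_range)
```

Converting `indices` to a set up front keeps each membership test cheap even when the caller passes a long list.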

score 15 · Accepted Answer

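For the sake of completeness, here is one more option.

Let's start with a definition from the Python docs:

slice: An object usually containing a portion of a sequence. A slice is created using the subscript notation, [] with colons between numbers when several are given, such as in variable_name[1:3:5]. The bracket (subscript) notation uses slice objects internally (or in older versions, __getslice__() and __setslice__()).

Though slice notation is not directly applicable to iterators in general, the itertools package contains a replacement function: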

from itertools import islice

# print the 100th line
with open('the_file') as lines:
    for line in islice(lines, 99, 100):
        print(line)

# print each third line until 100
with open('the_file') as lines:
    for line in islice(lines, 0, 100, 3):
        print(line)

An additional advantage of this function is that it does not read the iterator to the end, so you can do more complex things:

with open('the_file') as lines:
    # print the first 100 lines
    for line in islice(lines, 100):
        print(line)

    # then skip the next 5
    for line in islice(lines, 5):
        pass

    # print the rest
    for line in lines:
        print(line)

And to answer the original question:

# how to read lines #26 and #30
In [365]: list(islice(range(1,100), 25, 30, 4))
Out[365]: [26, 30]
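
The same islice call applied to lines rather than numbers might look like the sketch below. The 40-line `io.StringIO` object is an assumed stand-in for an open file.

```python
import io
from itertools import islice

# Stand-in for an open file: 40 lines, "line 1" .. "line 40".
lines = io.StringIO("".join(f"line {n}\n" for n in range(1, 41)))

# Indices 25 and 29 are lines 26 and 30; a step of 4 skips the three in between.
picked = list(islice(lines, 25, 30, 4))
print(picked)
```

This trick only works because the two wanted lines are a fixed step apart. For arbitrary line numbers, an enumerate() loop with a set of indices is the more general tool.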

score 14 · Accepted Answer


If you want line 7:

line = open("file.txt", "r").readlines()[6]  # index 6 is the 7th line

Answered 2010-10-21T17:07:39.620

score 12 · Accepted Answer

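Reading files is incredibly fast. Reading a 100MB file takes less than 0.1 seconds (see my article Reading and Writing Files with Python). Hence you should read it completely and then work with the individual lines.

Most answers here are not wrong, but they are bad style. Files should always be opened with with, because it makes sure the file is closed again.

So you should do it like this: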

with open("path/to/file.txt") as f:
    lines = f.readlines()
print(lines[25])  # line 26 (list indices start at 0); or whatever you want to do with this line
print(lines[29])  # line 30; or whatever you want to do with this line

Huge files

If you have a huge file and memory consumption is a concern, you can process it line by line:

with open("path/to/file.txt") as f:
    for i, line in enumerate(f):
        pass  # process line i

score 5 · Accepted Answer

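You can make a seek() call, which positions the read head at a specified byte within the file. This won't help you unless you know exactly how many bytes (characters) are written in the file before the line you want to read. Perhaps your file is strictly formatted (each line is X bytes long?), or, if you really want the speed boost, you could count the characters yourself (remember to include invisible characters such as line breaks).

Otherwise, you do have to read every line before the one you want, as in one of the many solutions already proposed here.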
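
For the strictly formatted case, a sketch could look like this. It assumes every line is exactly `RECORD_SIZE` bytes including the newline; the constant, the helper name `read_fixed_line`, and the in-memory `io.BytesIO` data are all made up for the example.

```python
import io

RECORD_SIZE = 8  # assumed: 7 data bytes plus "\n" on every line

# In-memory stand-in for a binary file with fixed-width lines.
data = b"".join(b"row%04d\n" % n for n in range(1, 101))
f = io.BytesIO(data)


def read_fixed_line(f, line_number, record_size=RECORD_SIZE):
    """Return line `line_number` (1-based) by jumping straight to its offset."""
    f.seek((line_number - 1) * record_size)
    return f.read(record_size)


line26 = read_fixed_line(f, 26)
line30 = read_fixed_line(f, 30)
print(line26, line30)
```

With a real file you would open it in binary mode ("rb") so that offsets are counted in bytes rather than decoded characters.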

score 4 · Accepted Answer

def getitems(iterable, items):
  items = list(items) # get a list from any iterable and make our own copy
                      # since we modify it
  if items:
    items.sort()
    for n, v in enumerate(iterable):
      if n == items[0]:
        yield v
        items.pop(0)
        if not items:
          break

print(list(getitems(open("/usr/share/dict/words"), [25, 29])))
# ['Abelson\n', 'Abernathy\n']
# note that index 25 is the 26th item

score 3 · Accepted Answer

If you don't mind importing, fileinput does exactly what you need (that is, it lets you read the line number of the current line).
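
A minimal sketch of the fileinput approach, assuming a throwaway 40-line temporary file; the names `path`, `wanted`, and `picked` are illustrative only.

```python
import fileinput
import os
import tempfile

# Create a 40-line throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("".join(f"line {n}\n" for n in range(1, 41)))
    path = tmp.name

wanted = {26, 30}
picked = []
with fileinput.input(files=[path]) as stream:
    for line in stream:
        # lineno() is the 1-based number of the line that was just read.
        if stream.lineno() in wanted:
            picked.append(line)
        if stream.lineno() >= max(wanted):
            break  # stop once the last wanted line has been seen

os.remove(path)
print(picked)
```

When several files are passed, lineno() keeps counting across them; filelineno() gives the number within the current file.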

score 3 · Accepted Answer

Someone has already mentioned this syntax, but it is by far the easiest way to do it:

inputFile = open("lineNumbers.txt", "r")
lines = inputFile.readlines()
print (lines[0])
print (lines[2])

score 3 · Accepted Answer

Here are my two cents:

def indexLines(filename, lines=[2,4,6,8,10,12,3,5,7,1]):
    fp   = open(filename, "r")
    src  = fp.readlines()
    data = [(index, line) for index, line in enumerate(src) if index in lines]
    fp.close()
    return data


# Usage below
filename = "C:\\Your\\Path\\And\\Filename.txt"
for line in indexLines(filename): # using default list, specify your own list of lines otherwise
    print("Line: %s\nData: %s\n" % (line[0], line[1]))

score 3 · Accepted Answer

How about this:

>>> with open('a', 'r') as fin: lines = fin.readlines()
>>> for i, line in enumerate(lines):
      if i > 30: break
      if i == 26: dox()
      if i == 30: doy()

score 3 · Accepted Answer

I prefer this approach because it is more general-purpose: you can use it on a file, on the result of f.readlines(), on a StringIO object, or anything else.

def read_specific_lines(file, lines_to_read):
   """file is any iterable; lines_to_read is an iterable containing int values"""
   lines = set(lines_to_read)
   last = max(lines)
   for n, line in enumerate(file):
      if n + 1 in lines:
          yield line
      if n + 1 > last:
          return

>>> with open(r'c:\temp\words.txt') as f:
        [s for s in read_specific_lines(f, [1, 2, 3, 1000])]
['A\n', 'a\n', 'aa\n', 'accordant\n']

score 3 · Accepted Answer

A small change that improves on Alok Singhal's answer:

fp = open("file")
for i, line in enumerate(fp, 1):  # start counting at 1 so i is the line number
    if i == 26:
        print(line)  # 26th line
    elif i == 30:
        print(line)  # 30th line
    elif i > 30:
        break
fp.close()

score 1 · Accepted Answer

File objects have a .readlines() method, which gives you a list of the file's contents, one line per list item. After that, you can use normal list slicing techniques.

http://docs.python.org/library/stdtypes.html#file.readlines
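
A short sketch of readlines() plus indexing and slicing. The `io.StringIO` object is an assumed stand-in for a small open file.

```python
import io

# Stand-in for a small file with 40 numbered lines.
f = io.StringIO("".join(f"line {n}\n" for n in range(1, 41)))
lines = f.readlines()

line26, line30 = lines[25], lines[29]  # single lines, zero-based indices
block = lines[25:30]                   # lines 26 through 30 inclusive
print(line26, line30, len(block))
```

This loads the whole file into a list, so it is best kept for files that comfortably fit in memory.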

score 1 · Accepted Answer

@OP, you can use enumerate:

for n,line in enumerate(open("file")):
    if n+1 in [26,30]: # or n in [25,29] 
       print(line.rstrip())

score 1 · Accepted Answer

file = '/path/to/file_to_be_read.txt'
with open(file) as f:
    lines = f.readlines()  # read once; a second readlines() call would return []
    print(lines[25])  # line 26
    print(lines[29])  # line 30

Using the with statement, this opens the file, prints lines 26 and 30, and then closes the file. Simple!

score 1 · Accepted Answer

To print line number 3:

line_number = 3

with open(filename, "r") as file:
    current_line = 1
    for line in file:
        if current_line == line_number:
            print(line)  # print this line; file.readline() would return the next one
            break
        current_line += 1

Original author: Frank Hofmann

score 0 · Accepted Answer

If you want to read specific lines, such as everything after some threshold line, you can use the following code:

file = open("files.txt","r")
lines = file.readlines() ## convert to list of lines
datas = lines[11:] ## read the specific lines

score -1 · Accepted Answer

I think this would work:

open_file1 = open("E:\\test.txt", 'r')
read_it1 = open_file1.read()
myline1 = []
for line1 in read_it1.splitlines():
    myline1.append(line1)
print(myline1[0])

score -1 · Accepted Answer

f = open(filename, 'r')
totalLines = len(f.readlines())
f.close()
f = open(filename, 'r')

lineno = 1
while lineno <= totalLines:  # <= so the last line is processed too
    line = f.readline()

    if lineno == 26:
        doLine26Commmand(line)

    elif lineno == 30:
        doLine30Commmand(line)

    lineno += 1
f.close()

score -3 · Accepted Answer

Reading from a specific line onward:

n = 4   # for reading from 5th line
with open("write.txt",'r') as t:
     for i,line in enumerate(t):
         if i >= n:             # i == n-1 for nth line
            print(line)
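
The same "start at line n" behaviour can also be written with itertools.islice. This is an alternative sketch rather than the answer's own method, and the 10-line `io.StringIO` object is an assumed stand-in for the file.

```python
import io
from itertools import islice

# Stand-in for an open file with 10 numbered lines.
f = io.StringIO("".join(f"line {i}\n" for i in range(1, 11)))

n = 4  # skip the first 4 lines, so reading starts at line 5
tail = list(islice(f, n, None))  # stop=None means "until the end"
print(tail[0], len(tail))
```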
