python - How to read a file line-by-line into a list?

Question

How do I read every line of a file in Python and store each line as an element in a list?

I want to read the file line by line and append each line to the end of the list.

score 2594 · Accepted Answer

This code reads the entire file into memory:

with open(filename) as file:
    lines = file.readlines()

If you want to strip all whitespace characters (newlines and spaces) from the end of each line, use this instead:

with open(filename) as file:
    lines = [line.rstrip() for line in file]

(This avoids allocating an extra list from file.readlines().)
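To make the difference between the two snippets concrete, here is a minimal sketch that runs both against a throwaway file. The temporary file and its contents are assumptions made purely for the demo.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line  \nsecond line\n")
    filename = tmp.name  # hypothetical sample file

with open(filename) as file:
    raw_lines = file.readlines()  # newline characters are kept

with open(filename) as file:
    stripped_lines = [line.rstrip() for line in file]  # trailing whitespace removed

os.remove(filename)

print(raw_lines)       # ['first line  \n', 'second line\n']
print(stripped_lines)  # ['first line', 'second line']
```

Note that rstrip() also removed the two trailing spaces on the first line, not just the newline.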

If you're working with a large file, you should instead read and process it line by line:

with open(filename) as file:
    for line in file:
        print(line.rstrip())

In Python 3.8 and later, you can use a while loop with the walrus operator like so:

with open(filename) as file:
    while line := file.readline():
        print(line.rstrip())

score 1131 · Accepted Answer

See Input and Output:

with open('filename') as f:
    lines = f.readlines()

or, stripping the newline character:

with open('filename') as f:
    lines = [line.rstrip('\n') for line in f]
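Passing '\n' to rstrip matters when trailing spaces or tabs are significant. A small sketch of the difference, using a made-up sample line:

```python
# Hypothetical sample line with trailing whitespace before the newline.
line = "  indented value \t\n"

only_newline_removed = line.rstrip("\n")  # strips just the newline
all_trailing_removed = line.rstrip()      # strips every trailing whitespace character

print(repr(only_newline_removed))  # '  indented value \t'
print(repr(all_trailing_removed))  # '  indented value'
```

Leading whitespace is untouched in both cases, since rstrip only works on the right-hand end.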

score 680 · Accepted Answer

This is more explicit than necessary, but it does what you want.

with open("file.txt") as file_in:
    lines = []
    for line in file_in:
        lines.append(line)

score 306 · Accepted Answer

This will yield an "array" of lines from the file.

lines = tuple(open(filename, 'r'))

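open returns a file object that can be iterated over. When you iterate over a file, you get the lines from that file. tuple can take an iterator and build a tuple instance from it. lines is a tuple created from the lines of the file.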
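The one-liner never closes the file explicitly, so cleanup is left to the interpreter. A possible variant, sketched here with a throwaway file assumed for the demo, keeps the tuple but closes the file deterministically:

```python
import os
import tempfile

# Throwaway sample file (an assumption for this sketch).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a\nb\n")
    filename = tmp.name

# Same idea as tuple(open(filename, 'r')), but the with block closes the file.
with open(filename, "r") as file:
    lines = tuple(file)

os.remove(filename)

print(lines)        # ('a\n', 'b\n')
print(file.closed)  # True
```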

score 229 · Accepted Answer

According to Python's Methods of File Objects, the simplest way to convert a text file into a list is:

with open('file.txt') as f:
    my_list = list(f)
    # my_list = [x.rstrip() for x in f] # remove line breaks


If you just need to iterate over the lines of the text file, you can use:

with open('file.txt') as f:
    for line in f:
        ...

Old answer:

Using with and readlines():

with open('file.txt') as f:
    lines = f.readlines()

If you don't care about closing the file, this one-liner will work:

lines = open('file.txt').readlines()

The traditional way:

f = open('file.txt') # Open file on read mode
lines = f.read().splitlines() # List with stripped line-breaks
f.close() # Close file
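The variants above differ in whether line endings survive. A rough sketch of that difference, with io.StringIO standing in for an opened text file (an assumption made so the example needs no file on disk):

```python
import io

# io.StringIO stands in for an opened text file here.
f = io.StringIO("alpha\nbeta\ngamma\n")

with_newlines = f.readlines()             # each element keeps its '\n'
f.seek(0)                                 # rewind to read the same data again
without_newlines = f.read().splitlines()  # line breaks are dropped

print(with_newlines)     # ['alpha\n', 'beta\n', 'gamma\n']
print(without_newlines)  # ['alpha', 'beta', 'gamma']
```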

score 163 · Accepted Answer

As suggested, you can simply do the following:

with open('/your/path/file') as f:
    my_lines = f.readlines()

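Note that this approach has 2 downsides:

1) You store all the lines in memory. In the general case, this is a very bad idea. The file could be very large, and you could run out of memory. Even if it's not large, it is simply a waste of memory.

2) It does not allow you to process each line as you read it. So if you process the lines afterwards, it is inefficient (it requires two passes rather than one).

A better approach for the general case is the following: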

with open('/your/path/file') as f:
    for line in f:
        process(line)

Here you define your process function any way you want. For example:

def process(line):
    if 'save the world' in line.lower():
        superman.save_the_world()

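(The implementation of the Superman class is left as an exercise for you.)

This works nicely for any file size, and you go through the file in just 1 pass. This is typically how generic parsers work.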
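As a rough illustration of the one-pass pattern, here is a runnable sketch with a concrete process function. The phrase check comes from the example above; the matches list and the in-memory io.StringIO source are stand-ins assumed for the demo.

```python
import io

def process(line, matches):
    """Hypothetical handler: remember lines that mention the phrase."""
    if "save the world" in line.lower():
        matches.append(line.rstrip("\n"))

# io.StringIO stands in for open('/your/path/file').
source = io.StringIO("Nothing here\nPlease SAVE THE WORLD\nbye\n")

matches = []
with source as f:
    for line in f:  # only one line is held at a time
        process(line, matches)

print(matches)  # ['Please SAVE THE WORLD']
```

Only the matching lines are kept, so memory use depends on the results rather than on the size of the input.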

score 45 · Accepted Answer

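Clean and Pythonic Way to Read the Lines of a File Into a List

First and foremost, you should focus on opening your file and reading its contents in an efficient and Pythonic way. Here is an example of the way I personally do NOT prefer: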

infile = open('my_file.txt', 'r')  # Open the file for reading.

data = infile.read()  # Read the contents of the file.

infile.close()  # Close the file since we're done using it.

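Instead, I prefer the method below for opening files, for both reading and writing, as it is very clean and does not require the extra step of closing the file once you are done with it. In the statement below, we open the file for reading and assign it to the variable 'infile'. Once the code within this statement has finished running, the file is closed automatically.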

# Open the file for reading.
with open('my_file.txt', 'r') as infile:

    data = infile.read()  # Read the contents of the file into memory.

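Now we need to focus on bringing this data into a Python list, because lists are iterable, efficient, and flexible. In your case, the goal is to turn each line of the text file into a separate element. To accomplish this, we use the splitlines() method as follows: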

# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()

Final Product:

# Open the file for reading.
with open('my_file.txt', 'r') as infile:

    data = infile.read()  # Read the contents of the file into memory.

# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()

Testing Our Code:

Contents of the text file:

     A fost odatã ca-n povesti,
     A fost ca niciodatã,
     Din rude mãri împãrãtesti,
     O prea frumoasã fatã.

Print statements for testing purposes (Python 2 syntax):

    print my_list  # Print the list.

    # Print each line in the list.
    for line in my_list:
        print line

    # Print the fourth element in this list.
    print my_list[3]

Output (different-looking because of the Unicode characters):

     ['A fost odat\xc3\xa3 ca-n povesti,', 'A fost ca niciodat\xc3\xa3,',
     'Din rude m\xc3\xa3ri \xc3\xaemp\xc3\xa3r\xc3\xa3testi,', 'O prea
     frumoas\xc3\xa3 fat\xc3\xa3.']

     A fost odatã ca-n povesti,
     A fost ca niciodatã,
     Din rude mãri împãrãtesti,
     O prea frumoasã fatã.

     O prea frumoasã fatã.
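The byte escapes in the list output above are what Python 2 shows for undecoded byte strings. On Python 3, a minimal sketch of the same approach would pass an explicit encoding so the lines come back as decoded text. The temporary file and the UTF-8 encoding are assumptions for this demo.

```python
import os
import tempfile

# "\u00e3" is the character "ã"; escapes keep this source ASCII-only.
text = "A fost odat\u00e3 ca-n povesti,\nA fost ca niciodat\u00e3,\n"

# Write a throwaway UTF-8 file (hypothetical stand-in for my_file.txt).
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write(text)
    path = tmp.name

# Open the file for reading with an explicit encoding.
with open(path, "r", encoding="utf-8") as infile:
    data = infile.read()  # Read the contents of the file into memory.

my_list = data.splitlines()  # decoded str objects, one per line
os.remove(path)

print(len(my_list))  # 2
```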

score 30 · Accepted Answer

Here is one more option, using a list comprehension on the file:

lines = [line.rstrip() for line in open('file.txt')]

This is a more efficient way, as most of the work is done inside the Python interpreter.

score 25 · Accepted Answer

If you want to read a file from the command line or from stdin, you can also use the fileinput module:

# reader.py
import fileinput

content = []
for line in fileinput.input():
    content.append(line.strip())

fileinput.close()

Pass files to it like so:

$ python reader.py textfile.txt

Read more here: http://docs.python.org/2/library/fileinput.html
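fileinput can also be given file names directly instead of reading them from the command line. A small sketch, assuming Python 3 (where fileinput.input() works as a context manager) and a throwaway file created for the demo:

```python
import fileinput
import os
import tempfile

# Throwaway sample file (an assumption for this sketch).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\n  two  \n")
    path = tmp.name

# The with block closes the input stream, replacing the explicit close() call.
with fileinput.input(files=[path]) as stream:
    content = [line.strip() for line in stream]

os.remove(path)

print(content)  # ['one', 'two']
```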

score 20 · Accepted Answer

The Simplest Way to Do It

A simple way is to:

1) Read the whole file as a string
2) Split the string line by line

In one line, that would give:

lines = open('C:/path/file.txt').read().splitlines()

However, this is quite an inefficient way, as it stores 2 versions of the content in memory (probably not a big issue for small files, but still). [Thanks Mark Amery].

There are 2 easier ways:

Using the file as an iterator

lines = list(open('C:/path/file.txt'))
# ... or if you want to have a list without EOL characters
lines = [l.rstrip() for l in open('C:/path/file.txt')]

If you are using Python 3.4 or later, it is better to use pathlib to create a path for your file, which you can then reuse for other operations in your program:

from pathlib import Path
file_path = Path("C:/path/file.txt")
lines = file_path.read_text().splitlines()
# ... or ...
lines = [l.rstrip() for l in file_path.open()]
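For reference, the string method that splits on line boundaries is splitlines(), with no underscore. Below is a self-contained sketch of both pathlib variants. The temporary directory, the file contents, and the explicit UTF-8 encoding are assumptions for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample file inside a throwaway directory.
    file_path = Path(tmp_dir) / "file.txt"
    file_path.write_text("red  \ngreen\n", encoding="utf-8")

    # Variant 1: read everything, then split on line boundaries.
    lines = file_path.read_text(encoding="utf-8").splitlines()

    # Variant 2: iterate over the open file and strip trailing whitespace.
    with file_path.open(encoding="utf-8") as f:
        stripped = [l.rstrip() for l in f]

print(lines)     # ['red  ', 'green']
print(stripped)  # ['red', 'green']
```

splitlines() only removes the line breaks, so the trailing spaces after "red" survive in the first variant.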

score 15 · Accepted Answer

Just use the splitlines() function. Here is an example.

inp = "file.txt"
with open(inp) as data:  # the with block closes the file afterwards
    dat = data.read()
lst = dat.splitlines()
print(lst)  # on Python 2: print lst

The output will be the list of lines.

score 11 · Accepted Answer

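If you are faced with a very large file and want to read it faster (imagine you are in a Topcoder/Hackerrank coding contest), you can read a considerably bigger chunk of lines into a memory buffer at one time, rather than just iterating line by line at the file level.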

buffersize = 2**16
with open(path) as f: 
    while True:
        lines_buffer = f.readlines(buffersize)
        if not lines_buffer:
            break
        for line in lines_buffer:
            process(line)
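To see the batching behaviour, here is a scaled-down sketch of the same loop. The tiny buffersize, the in-memory io.StringIO source, and the process helper are all assumptions chosen so the batches are visible; real code would keep a large buffer such as 2**16.

```python
import io

def process(line, sink):
    """Hypothetical handler: store the line without its newline."""
    sink.append(line.rstrip("\n"))

# io.StringIO stands in for open(path); ten short lines of sample data.
source = io.StringIO("".join(f"row {i}\n" for i in range(10)))

buffersize = 16   # deliberately tiny so several batches are needed
processed = []
batch_sizes = []  # how many lines each readlines() call returned

while True:
    # readlines(hint) returns whole lines until roughly `hint` characters are read.
    lines_buffer = source.readlines(buffersize)
    if not lines_buffer:
        break
    batch_sizes.append(len(lines_buffer))
    for line in lines_buffer:
        process(line, processed)

print(len(processed), "lines in", len(batch_sizes), "batches")
```

Every batch contains only complete lines, so no line is ever split across two readlines() calls.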

score 4 · Accepted Answer

Use this:

import pandas as pd
data = pd.read_csv(filename) # You can also add parameters such as header, sep, etc.
array = data.values

data is a DataFrame, and its values attribute gives you an ndarray. You can also get a list by using array.tolist().

score 2 · Accepted Answer

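You could also use the loadtxt command in NumPy. It checks fewer conditions than genfromtxt, so it may be faster. Note that loadtxt parses values as floats by default, so it suits numeric data.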

import numpy
data = numpy.loadtxt(filename, delimiter="\n")
