python - Most Pythonic way to process this text file using Python

Question

I have a text file containing test data like this:

hdr 1

hdr2

hdr3

data1

data2

data3

data1

data2

....

There is a blank line between every pair of lines.

I need to build a list of lists containing:

[[hdr1,hdr2,hdr3],[data1,data2,data3],[data1,data2,...]]

What would be a concise, Pythonic way to do this?

score 3 · Accepted Answer

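This is just a simplified version of Charles Menguy's solution. I'm only adding it as an answer because it would be hard to read as a comment. But here's the key:

First, use the grouper recipe from the itertools documentation to group the file into groups of 6 lines.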

groups = grouper(6, f)
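Note that grouper is not a function in the itertools module itself; it has to be copied from the recipes section of the documentation. Below is a minimal Python 3 sketch that keeps the (n, iterable) argument order used in this answer. The current documentation lists the iterable first, so treat this ordering as an assumption made to match the calls shown here.

```python
from itertools import zip_longest

def grouper(n, iterable, fillvalue=None):
    """Collect items into fixed-length tuples, padding the last one with fillvalue."""
    # n references to the same iterator, so zip_longest pulls n consecutive items per tuple
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

# A plain list stands in for the lines of a file
print(list(grouper(3, ["a", "b", "c", "d"])))
# [('a', 'b', 'c'), ('d', None, None)]
```

The padding matters here: if the number of lines in the file is not a multiple of 6, the last group contains fill values that need to be handled.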

Next, you can throw away every other line just by slicing:

nonblank = [group[::2] for group in groups]

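Or, alternatively, by explicitly filtering out the blank lines. A line read from a file still ends with its newline, so test the stripped line rather than the line itself:

nonblank = [[line for line in group if line and line.strip()] for group in groups]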

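If you need to strip each line, you can use a list comprehension or map. I generally prefer map when it doesn't require building a new function with lambda/partial, and here it doesn't. It's just map(str.strip, group).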
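For comparison, here are both spellings applied to a hypothetical group of three lines. In Python 3, map returns a lazy iterator, so list() is needed to get an actual list.

```python
# Hypothetical group, as it would look after dropping the blank lines
group = ("hdr1\n", "hdr2\n", "hdr3\n")

with_map = list(map(str.strip, group))
with_comprehension = [line.strip() for line in group]

print(with_map)                        # ['hdr1', 'hdr2', 'hdr3']
print(with_map == with_comprehension)  # True
```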

Putting it all together, here's the whole thing as a one-liner (which I think is still pretty readable):

with open('input.txt') as f:
    # list() is needed on Python 3, where map returns an iterator
    arr = [list(map(str.strip, group[::2])) for group in grouper(6, f)]
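Here is a self-contained sketch of the same idea that runs on Python 3 without an input file. The io.StringIO object and the sample string are stand-ins for the real file. The fillvalue="" argument and the `if line.strip()` guard are additions, not part of the answer, to cope with a file whose line count is not a multiple of 6.

```python
import io
from itertools import zip_longest

def grouper(n, iterable, fillvalue=None):
    # Same recipe as above: n references to one iterator yield n-item tuples
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

# Hypothetical sample: every data line is followed by a blank line,
# and the last group is incomplete
sample = "hdr1\n\nhdr2\n\nhdr3\n\ndata1\n\ndata2\n\ndata3\n\ndata1\n\ndata2\n"

with io.StringIO(sample) as f:
    arr = [
        # [::2] keeps the non-blank positions; the guard drops padding
        [line.strip() for line in group[::2] if line.strip()]
        for group in grouper(6, f, fillvalue="")
    ]

print(arr)
# [['hdr1', 'hdr2', 'hdr3'], ['data1', 'data2', 'data3'], ['data1', 'data2']]
```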

score 0 · Accepted Answer

I don't know whether this is the best solution or how Pythonic it is, but you can simply use a regular expression to parse the lines of the file:

import re

# [^\W\d]+ matches word characters except digits, so "data10" splits as ("data", "10")
regex = re.compile(r'^([^\W\d]+)\s*(\d+)')
last_groups = None
group = []
data = []

with open('data.txt', 'r') as f:
    for line in f:
        match = regex.search(line)
        if match:
            if last_groups is None:
                last_groups = match.groups()

            if last_groups[0] == match.groups()[0] and \
                    int(last_groups[1]) <= int(match.groups()[1]):
                last_groups = match.groups()
                group.append(''.join(last_groups))
            else:
                data.append(group)
                last_groups = match.groups()
                group = [''.join(last_groups)]

if group:
    data.append(group)
