python - Extracting the string between two characters from a txt file in Python

Question

I have a txt file that I want Python to read, and from it I want Python to extract the string that lies between two specific characters. Here is an example:

Line a

Line b

Line c

&TESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTESTTEST !

Line d

Line e

What I want is for Python to read the lines and, when it encounters "&", start printing the lines (including the line with the "&") until it encounters "!".

Any suggestions?

score 4 · Accepted Answer

This works:

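data = []
flag = False
with open('/tmp/test.txt', 'r') as f:
    for line in f:
        # Start collecting at a line that begins with '&'.
        if line.startswith('&'):
            flag = True
        if flag:
            data.append(line)
        # Stop after a line that ends with '!' (ignoring trailing whitespace).
        if line.strip().endswith('!'):
            flag = False

print(''.join(data))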
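
The same flag-based loop can be wrapped in a generator so that it works on any iterable of lines, not only an open file. This is a minimal sketch rather than part of the original answer: the name `lines_between` and the `sample` list are made up for illustration, and it assumes the start marker begins a line and the end marker ends one, as in the question.

```python
def lines_between(lines, start="&", end="!"):
    """Yield lines from one starting with `start` through one ending with `end`."""
    inside = False
    for line in lines:
        if line.startswith(start):
            inside = True
        if inside:
            yield line
        # Leave the block once the closing marker ends a line.
        if inside and line.strip().endswith(end):
            inside = False


# Hypothetical sample data standing in for the lines of the file.
sample = ["Line a\n", "&TEST\n", "TEST !\n", "Line d\n"]
result = "".join(lines_between(sample))
print(result, end="")
```

Because a file object is itself an iterable of lines, `lines_between(f)` inside a `with open(...) as f:` block should behave the same way without loading the whole file into memory.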

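If the file is small enough that reading it all into memory is not a problem, and there is no ambiguity about & and ! marking the start and end of the string you want, this is easier: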

with open('/tmp/test.txt', 'r') as f:
    data = f.read()

# Search for '!' only after the '&', so an earlier '!' is not picked up.
start = data.index('&')
print(data[start:data.index('!', start) + 1])

Or, if you want to read the entire file but only use & and ! when they are at the beginning and end of a line respectively, you can use a regular expression:

import re

with open('/tmp/test.txt', 'r') as f:
    data = f.read()

# re.S lets '.' match newlines; re.M lets '^' match at the start of each line.
m = re.search(r'^(&.*!)\s*?\n', data, re.S | re.M)
if m:
    print(m.group(1))
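
If the file may contain more than one &...! block, a lazy quantifier combined with `re.finditer` is one way to collect each block separately. This is a sketch under assumptions, not the answer's own code: the `text` string is invented sample data, and the pattern assumes & starts a line and ! ends one (trailing spaces or tabs allowed).

```python
import re

# Hypothetical sample text standing in for the file contents.
text = "Line a\n&first\nblock!\nLine d\n&second!\nLine e\n"

# '^&' anchors at a line start; the lazy '.*?' stops at the first '!'
# that is followed only by optional spaces/tabs and the end of a line.
pattern = re.compile(r"^&.*?![ \t]*$", re.S | re.M)
blocks = [m.group(0).rstrip() for m in pattern.finditer(text)]
print(blocks)
```

The greedy `.*` in the answer above would instead run to the last matching ! in the file, which merges everything between the first & and the final ! into one match.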

score 0 · Accepted Answer

Here is a (very simple!) example.

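def Printer():
    # Print every line from the one containing "&" through the one containing "!".
    with open("yourfile.txt") as f:
        Pr = False
        for line in f:
            if "&" in line:
                Pr = True
            if Pr:
                # Each line already ends with a newline, so do not add another.
                print(line, end="")
            if "!" in line:
                Pr = False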

score 0 · Accepted Answer

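Here is one simple solution. The code contains many comments so that you can follow each line. The nice part of the code is that it uses the with statement, which closes resources (such as files) even if an exception is raised.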

# Path to the input file (relative to the current working directory).
file_path = "input.txt"

# Open the file in read mode. The with statement closes the file for us,
# even if an exception is raised inside the block.
with open(file_path, "r") as f:
    # Read the whole file. Be careful: this loads the entire file into memory.
    # If the file is too large, prefer iterating over the file object.
    content = f.read()

size = len(content)
start = 0
while start < size:
    # Find the index of the next & after the last ! index.
    start = content.find("&", start)
    # If there are no more & characters, there is nothing left to print.
    if start == -1:
        break
    # Find the index of the next ! after that & index.
    end = content.find("!", start)
    # If no ! is found, go to the end of the contents.
    end = end if end != -1 else size
    # Print the contents between & and ! (excluding both characters).
    # If no ! character is found, print until the end of the file.
    print(content[start + 1:end])
    # Move the cursor past the position of the ! character.
    start = end + 1
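
The point about the with statement in the answer above can be checked directly. The sketch below is an illustration only (the temporary file and the simulated error are made up): it shows that with does not swallow the exception, but it does close the file before the exception leaves the block.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

try:
    with open(path, "w") as f:
        f.write("&TEST!\n")
        raise ValueError("simulated failure inside the block")
except ValueError:
    # The exception still propagates out of the with block...
    pass

# ...but the file was closed on the way out.
closed_after_error = f.closed
print(closed_after_error)
os.remove(path)
```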
