python - Searching for a string between two tags in Python

Question

I am trying to read the content between two tags stored in a file. The content may span multiple lines. Each tag occurs zero times or once in the file.

Example: the file contents are as follows.

title:Corruption Today: Corruption today in
content:Corruption Today: 
Corruption today in 
score:0.91750675

So when reading "content:", my query should return "Corruption Today: Corruption today in". After some googling, I was able to write the following code:

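import re

# 'files' holds the path of the input file.
with open(files, 'r') as myfile:
    filecontent = myfile.read()

# Offset just past the first 'content:' tag (the tag is 8 characters long).
startPtrs = [m.start() + 8 for m in re.finditer('content:', filecontent)]
startPtr = startPtrs[0]
# Offset of the newline that precedes the first 'score:' tag.
endPtrs = [m.start() - 1 for m in re.finditer('score:', filecontent)]
endPtr = endPtrs[0]

content = filecontent[startPtr:endPtr]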

I am not sure how efficient the code above is, since it scans filecontent twice to get the content. Can this be done more efficiently?

score 0 · Accepted Answer

If you want to find the string between two substrings, you can use the re module.

import re

with open(files, 'r') as myfile:
    filecontent = myfile.read()

# Non-greedy match of everything between the two tags, including newlines.
results = re.compile('content:(.*?)score:', re.DOTALL | re.IGNORECASE).findall(filecontent)
print(results)
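Since each tag occurs at most once, a single re.search call may be enough, and it makes the "tag missing" case explicit. The following is a minimal sketch under that assumption; the helper name extract_content and the inline sample string are illustrative and not part of the original answer.

```python
import re

# Hypothetical helper; the tag names mirror the sample file in the question.
CONTENT_RE = re.compile(r"content:(.*?)score:", re.DOTALL | re.IGNORECASE)

def extract_content(text):
    """Return the text between 'content:' and 'score:', or None if absent."""
    match = CONTENT_RE.search(text)
    if match is None:
        return None
    # Collapse runs of whitespace (including newlines) into single spaces.
    return " ".join(match.group(1).split())

sample = (
    "title:Corruption Today: Corruption today in\n"
    "content:Corruption Today: \n"
    "Corruption today in \n"
    "score:0.91750675\n"
)
print(extract_content(sample))                 # Corruption Today: Corruption today in
print(extract_content("title:only a title\n"))  # None
```

Compiling the pattern once and calling search means the file content is scanned in a single pass, and the first match ends the scan.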

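Some explanation:

From the IGNORECASE documentation:

Perform case-insensitive matching; expressions like [A-Z] will match lowercase letters, too. This is not affected by the current locale.

From the DOTALL documentation: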

'.' (Dot.) In the default mode, this matches any character except a newline. If the DOTALL flag has been specified, this matches any character including a newline.
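To see what each flag contributes, here is a small sketch that runs the same pattern with one flag removed at a time. The sample text is made up for illustration.

```python
import re

text = "Content:first line\nsecond line\nScore:1"
pattern = r"content:(.*?)score:"

# Without DOTALL, '.' cannot cross the newline, so nothing matches.
no_dotall = re.findall(pattern, text, re.IGNORECASE)

# Without IGNORECASE, 'content:' does not match 'Content:'.
no_ignorecase = re.findall(pattern, text, re.DOTALL)

# With both flags, the multi-line body is captured.
both = re.findall(pattern, text, re.DOTALL | re.IGNORECASE)

print(no_dotall)      # []
print(no_ignorecase)  # []
print(both)           # ['first line\nsecond line\n']
```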

For compile, see the re.compile documentation.

Several other solutions are also available.
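One alternative that avoids regular expressions is str.partition. This is only a sketch of one possible approach, not necessarily one of the solutions referred to above; the function name between is hypothetical, and unlike the regex version it is case-sensitive.

```python
def between(text, start_tag, end_tag):
    """Return the text between start_tag and end_tag, or None if either is missing."""
    # partition returns (before, separator, after); separator is '' when not found.
    _, found_start, rest = text.partition(start_tag)
    if not found_start:
        return None
    body, found_end, _ = rest.partition(end_tag)
    if not found_end:
        return None
    # Drop leading and trailing whitespace; inner newlines are kept.
    return body.strip()

sample = (
    "title:Corruption Today: Corruption today in\n"
    "content:Corruption Today: \n"
    "Corruption today in \n"
    "score:0.91750675\n"
)
print(repr(between(sample, "content:", "score:")))
```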
