python - Python multiline regular expression

Question

How do I extract all the characters (including newline characters) up to the first occurrence of a given sequence of words? For example, with the following input:

Input text:

"shantaram is an amazing novel.
It is one of the best novels i have read.
the novel is written by gregory david roberts.
He is an australian"

The text I want to extract runs from shantaram up to the first occurrence of the, which is on the second line.

The output should be:

shantaram is an amazing novel.
It is one of the

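I have been trying all morning. I can write an expression that extracts all characters until it encounters a specific character, but here, if I use an expression like:

re.search("shantaram[\s\S]*the", string)

it does not match across the newline.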

score 1 · Accepted Answer

A solution that does not use regular expressions:

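from itertools import takewhile
def upto(a_string, stop):
    # Split on spaces (newlines stay inside the tokens), keep the words before
    # the first `stop`, then re-append it. Assumes `stop` is space-delimited.
    words = takewhile(lambda x: x != stop, a_string.split(" "))
    return " ".join([*words, stop])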
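
For comparison, here is a minimal regex sketch. It assumes the expression in the question overshoots because `*` is greedy: `[\s\S]*` does cross newlines, but it runs on to the last the in the text rather than the first. Making the quantifier lazy (`*?`) stops at the first occurrence. The helper name `extract_until` is hypothetical and not part of the original answer.

```python
import re

def extract_until(text, start, stop):
    """Return the text from `start` through the first `stop` after it, or None."""
    # re.escape keeps any regex metacharacters in start/stop literal.
    # [\s\S] matches any character including newlines; *? makes it lazy,
    # so the match ends at the first `stop` instead of the last one.
    pattern = re.escape(start) + r"[\s\S]*?" + re.escape(stop)
    match = re.search(pattern, text)
    return match.group(0) if match else None

text = (
    "shantaram is an amazing novel.\n"
    "It is one of the best novels i have read.\n"
    "the novel is written by gregory david roberts.\n"
    "He is an australian"
)

print(extract_until(text, "shantaram", "the"))
```

Note that this matches the as a substring, so it would also stop inside a word such as "other"; wrapping the stop sequence in `\b` word boundaries is one way to tighten that if whole words are required.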
