python - How to extract the string between selected strings in Python

Question

If I have a string like this:

str = 'Hello, <code>This is the string i want to extract</code>'

How do I extract the string between <code> and </code>? In the case above, the extracted string would be:

'This is the string i want to extract'

I want to use this string in a Django filter.

score 4 · Accepted Answer

Use a parser such as BeautifulSoup:

>>> from bs4 import BeautifulSoup as BS
>>> text = 'Hello, <code>This is the string i want to extract</code>'
>>> soup = BS(text, 'html.parser')
>>> print(soup.code.text)
This is the string i want to extract

Or, if it is only that one line, you can use a regular expression:

>>> import re
>>> re.search(r'<code>(.*?)</code>', text).group(1)
'This is the string i want to extract'
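The regex approach above raises AttributeError when nothing matches, because re.search returns None. A minimal sketch of a reusable helper follows; the name extract_between and its parameters are hypothetical, not part of the original answer. It escapes the delimiters so they are matched literally and returns None when they are not found. A function of this shape could presumably be wrapped as a custom Django template filter, though that wiring is not shown here.

```python
import re


def extract_between(text, start, end):
    """Return the first substring between start and end, or None if absent.

    Hypothetical helper for illustration only.
    """
    # re.escape makes the delimiters literal; (.*?) is non-greedy so the
    # match stops at the first closing delimiter.
    pattern = re.escape(start) + r'(.*?)' + re.escape(end)
    # re.DOTALL lets the captured text span several lines.
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1) if match else None


text = 'Hello, <code>This is the string i want to extract</code>'
print(extract_between(text, '<code>', '</code>'))
print(extract_between('no tags here', '<code>', '</code>'))
```

The first call prints the extracted sentence and the second prints None instead of raising.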

By the way, don't name your strings str. It shadows the built-in type.
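To illustrate why the name str is a problem, here is a small sketch. The function name demo_shadowing is made up for this example, and the shadowing is kept inside a function so it does not affect the rest of the module.

```python
def demo_shadowing():
    """Show what happens when a local variable is named str."""
    str = 'Hello'  # from here on, str is this string, not the built-in type
    try:
        # This would normally convert 42 to '42', but str is now a string
        # object, and string objects are not callable.
        str(42)
    except TypeError as exc:
        return type(exc).__name__
    return None


print(demo_shadowing())
```

Using a name such as text or sentence avoids the issue entirely.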

score 1 · Accepted Answer

If you need the "Hello" as well, try this:

import re

sentence = 'Hello, <code>This is the string i want to extract</code>'
# Remove every tag and keep the text around and inside them
print(re.sub(r'<[^>]*>', '', sentence))

Hello, This is the string i want to extract
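A regex like the one above can misfire on markup where a > appears inside an attribute value. As an alternative sketch, the standard library's html.parser can collect the text instead; this swaps in a different technique from the one in the answer, and the names TextCollector and strip_tags are hypothetical.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect only the text nodes of a markup string, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.parts.append(data)


def strip_tags(markup):
    """Return markup with all tags removed (illustrative helper)."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()  # flush any text still buffered by the parser
    return ''.join(collector.parts)


sentence = 'Hello, <code>This is the string i want to extract</code>'
print(strip_tags(sentence))
```

For the sample sentence this gives the same result as the re.sub version.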
