python - How do I use a regular expression to reference specific parts?

Question

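I have a Python string containing information that I want to extract with a regular expression.

Example:

"The weather is 75 degrees with a humidity of 13%"

I want to pull out just "75" and "13". Here is what I have tried so far in Python: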

import re

s = "The weather is 75 degrees with a humidity of 13%"
m = re.search(r"The weather is \d+ degrees with a humidity of \d+%", s)
matched = m.group()  # returns the entire matched string

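However, this obviously matches the whole string rather than just the parts I need. How can I extract only the numbers? I looked into backreferences, but they seem to apply only within the regex pattern itself.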

score 2 · Accepted Answer

Perhaps you want to use named groups?

>>> import re
>>> s1 = "The weather is 75 degrees with a humidity of 13%"
>>> m = re.search(r"The weather is (?P<temp>\d+) degrees with a humidity of (?P<humidity>\d+)%", s1)
>>> m.group('temp')
'75'
>>> m.group('humidity')
'13'
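
If both values are needed at once, the match object's groupdict() method returns every named group in a single dict. A minimal sketch, assuming the same sentence shape as in the question; the names pattern and readings are only illustrative:

```python
import re

s1 = "The weather is 75 degrees with a humidity of 13%"
pattern = r"The weather is (?P<temp>\d+) degrees with a humidity of (?P<humidity>\d+)%"

m = re.search(pattern, s1)
# groupdict() maps each group name to the text it captured (still strings),
# so convert to int explicitly if numbers are wanted.
readings = {name: int(value) for name, value in m.groupdict().items()}
print(readings)
```

The captured values are always strings, so the int() conversion is a separate, explicit step.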

score 2 · Accepted Answer

m = re.search(r"The weather is (\d+) degrees with a humidity of (\d+)%", s1)
matched = m.groups()

You need to put parentheses around the parts you want...

>>> s1 = "The weather is 75 degrees with a humidity of 13%"
>>> m = re.search(r"The weather is (\d+) degrees with a humidity of (\d+)%", s1)
>>> m.groups()
('75', '13')
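
One thing to watch for: re.search returns None when the pattern does not match, so calling .groups() on the result would raise AttributeError. A small sketch of a guard, where extract_readings is a hypothetical helper name and not part of the re module:

```python
import re

def extract_readings(text):
    """Return (temperature, humidity) as strings, or None if text does not match."""
    m = re.search(r"The weather is (\d+) degrees with a humidity of (\d+)%", text)
    if m is None:
        # No match: report that to the caller instead of raising.
        return None
    return m.groups()

print(extract_readings("The weather is 75 degrees with a humidity of 13%"))
print(extract_readings("No readings today"))
```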

Or simply use findall to pull the numbers out of any string:

>>> re.findall(r"\d+", s1)
['75', '13']
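
findall returns strings and picks up every run of digits, not only the two readings. A short sketch of both points; s2 is a made-up input used only to show the caveat:

```python
import re

s1 = "The weather is 75 degrees with a humidity of 13%"

# findall returns every non-overlapping match as a string; convert as needed.
numbers = [int(n) for n in re.findall(r"\d+", s1)]
print(numbers)

# Caveat: any other digits in the text are picked up as well.
s2 = "At 9 am the weather is 75 degrees with a humidity of 13%"  # hypothetical input
print(re.findall(r"\d+", s2))
```

If the surrounding text can contain other numbers, the grouped pattern above is the safer choice.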

score 0 · Accepted Answer

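If you want to extract typed data, such as numbers, from text, the third-party parse library is very useful. In many ways it is the inverse of string formatting: it takes a pattern and performs the type conversion for you.

At its simplest, it saves you from having to worry about regex groups and the like.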

>>> from parse import parse
>>> s = "The weather is 75 degrees with a humidity of 13%"
>>> parse("The weather is {} degrees with a humidity of {}%", s)
<Result ('75', '13') {}>

The Result object is very easy to work with.

>>> r = _
>>> r[0]
'75'

You can do better than this by specifying field names and/or type conversions. Here is all it takes to get the results as integers:

>>> parse("The weather is {:d} degrees with a humidity of {:d}%", s)
<Result (75, 13) {}>

If you want to use keys rather than positional indexes, add field names.

>>> parse("The weather is {temp:d} degrees with a humidity of {humidity:d}%", s)
<Result () {'temp': 75, 'humidity': 13}>
>>> r = _
>>> r['temp']
75
