python - Python not matching regex

Question

>>> pattern = re.compile(r'(.*)\\\\(.*)\\\\(.*)')
>>> m = re.match(pattern, 'string1\string2\string3')
>>> m
>>> 
>>> m.groups
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'groups'

I am trying to match strings of the following form with the regex above: string1\string2\string3.

The output above is from Python. Why doesn't it return a proper match object? Is something wrong with my pattern?

score 1 · Accepted Answer

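The problem is that you are trying to escape backslashes inside a raw string. From the Python docs:

When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change, and all backslashes are left in the string.

This means that all eight backslashes remain in the regex, and each pair matches one literal backslash in the test string. The pattern therefore requires two consecutive backslashes at each separator, where the test string has only one. You can fix it by replacing the regex with: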

r'(.*)\\(.*)\\(.*)'
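To make the backslash counting concrete, here is a minimal sketch that compares the original pattern with the corrected one. The variable names (`four`, `original`, `fixed`, `single`, `double`) and the doubled-backslash test string are illustrative assumptions, not part of the question.

```python
import re

# A raw literal keeps every backslash, so r'\\\\' is four characters long.
four = r'\\\\'

original = re.compile(r'(.*)\\\\(.*)\\\\(.*)')  # two literal backslashes per separator
fixed = re.compile(r'(.*)\\(.*)\\(.*)')         # one literal backslash per separator

single = r'string1\string2\string3'    # one backslash between the parts
double = r'string1\\string2\\string3'  # two backslashes between the parts (hypothetical input)

no_match = original.match(single)                # None: each separator is too short
double_groups = original.match(double).groups()  # matches only the doubled form
single_groups = fixed.match(single).groups()     # matches the intended form
print(len(four), no_match, double_groups, single_groups)
```

The original pattern is not invalid; it simply describes a different string from the one being tested.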

score 1 · Accepted Answer

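The issue is that your pattern uses \\\\, which in a raw string is four backslash characters and therefore matches two literal backslashes, while the text to be matched has only a single backslash at each separator. (\s is not a recognized escape sequence in a Python string literal, so the backslash is kept, but relying on that is fragile.)

First, you probably want to make your text a raw string; otherwise Python interprets escape sequences in it, and a sequence such as \t or \n would silently become a tab or a newline.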

re.match(pattern, r'string1\string2\string3')
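As a small illustration of why the raw prefix matters for the test string, the sketch below uses a hypothetical value, dir\temp, whose \t is a recognized escape sequence. The names `plain` and `raw` are assumptions made for this example.

```python
# Without the r prefix, \t is interpreted as a tab character.
plain = 'dir\temp'

# With the r prefix, the backslash and the 't' are both kept.
raw = r'dir\temp'

print(len(plain), len(raw))
print('\\' in plain, '\\' in raw)
```

The plain literal ends up one character shorter and contains no backslash at all, so a pattern looking for a backslash could never match it.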

Second, you need only two consecutive backslashes in your pattern to represent one literal backslash:

pattern = re.compile(r'(.*)\\(.*)\\(.*)')

Finally, rather than m.groups, you want to do m.groups() (call the method). Thus, all together your code would look like:

import re

pattern = re.compile(r'(.*)\\(.*)\\(.*)')
m = re.match(pattern, r'string1\string2\string3')
m.groups()
# ('string1', 'string2', 'string3')
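Since re.match returns None when nothing matches, which is what produced the AttributeError in the question, it can help to check the result before calling groups(). A minimal sketch, assuming a hypothetical helper named `split_parts`:

```python
import re

pattern = re.compile(r'(.*)\\(.*)\\(.*)')

def split_parts(text):
    """Return the three captured parts, or None if text does not match.

    split_parts is a hypothetical helper written for this example.
    """
    m = pattern.match(text)
    if m is None:
        return None
    return m.groups()

print(split_parts(r'string1\string2\string3'))
print(split_parts('no separators here'))
```

Returning None here is just one option; raising an exception with a clearer message would work equally well.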
