python - Match all complete quotes with a regex

Question

It's fairly easy to match quoted strings when you don't know whether the quotes will be single or double.

>>> import re
>>> s ="""this is a "test" that I am "testing" today"""
>>> re.findall('[\'"].*?[\'"]',s)
['"test"', '"testing"']

That finds any string delimited by single or double quotes and returns it together with what lies in between. But here is the problem:

The string gets closed early if it contains the other type of quote! Here are two examples that illustrate what I mean.

>>> s ="""this is a "test" and this "won't work right" at all"""
>>> re.findall('[\'"].*?[\'"]',s)
['"test"', '"won\'']
>>> s ="""something is "test" and this is "an 'inner' string" too"""
>>> re.findall('[\'"].*?[\'"]',s)
['"test"', '"an \'', '\' string"']

The regex '[\'"].*?[\'"]' lets a single quote pair up with a double quote, which is clearly bad.

So which regex will match both types of quotes, but only when the string ends with the same kind of quote it started with?

In case that is confusing, here is my desired output:

s ="""this is a "test" and this "won't work right" at all"""
re.findall(expression,s)
#prints ['"test"','"won\'t work right"']

s ="""something is "test" and this is "an 'inner' string" too"""
re.findall(expression,s)
#prints ['"test"', '"an \'inner\' string"',"'inner'"]

score 4 · Accepted Answer

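Wrap the first character class in a capturing group, then refer back to it on the closing side with \1. Because the pattern now contains groups, re.findall returns one tuple of groups per match (the quote character and the contents) rather than the whole matched text: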

>>> re.findall(r'([\'"])(.*?)\1',s)
[('"', 'test'), ('"', "won't work right")]
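If you want the quoted strings themselves (delimiters included) rather than tuples, one option is re.finditer with group(0). The sketch below also shows one possible way to pick up quotes nested inside another quoted string, as in the second desired output: it re-runs the same pattern on each match's contents. The helper names quoted_strings and quoted_strings_nested are made up for this illustration, and the sketch assumes no escaped quotes and no quoted strings that span multiple lines.

```python
import re

# \1 forces the closing quote to be the same character as the opening one.
QUOTED = re.compile(r'''(['"])(.*?)\1''')

def quoted_strings(text):
    """Return each quoted substring, delimiters included (hypothetical helper)."""
    return [m.group(0) for m in QUOTED.finditer(text)]

def quoted_strings_nested(text):
    """Like quoted_strings, but also search inside each match's contents."""
    found = []
    for m in QUOTED.finditer(text):
        found.append(m.group(0))
        # group(2) is the text between the quotes; recurse to find inner quotes.
        found.extend(quoted_strings_nested(m.group(2)))
    return found

s1 = """this is a "test" and this "won't work right" at all"""
s2 = """something is "test" and this is "an 'inner' string" too"""

print(quoted_strings(s1))         # ['"test"', '"won\'t work right"']
print(quoted_strings_nested(s2))  # ['"test"', '"an \'inner\' string"', "'inner'"]
```

The recursion is needed because re.findall and re.finditer only return non-overlapping matches, so 'inner' is never reported on its own once the surrounding double-quoted string has been consumed.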
