python - Regular expression to filter a list of strings matching a pattern

Question

I use R more often, and this is easier for me to do in R.

> test <- c('bbb', 'ccc', 'axx', 'xzz', 'xaa')
> test[grepl("^x",test)]
[1] "xzz" "xaa"

But how can I do this in Python if test is a list?

P.S. I am learning Python with Google's Python exercises, and I would prefer to use a regular expression.

score 11 · Accepted Answer

In general, you can use

import re                                  # Add the re import declaration to use regex
test = ['bbb', 'ccc', 'axx', 'xzz', 'xaa'] # Define a test list
reg = re.compile(r'^x')                    # Compile the regex
matches = list(filter(reg.search, test))   # Filter into a new list so that test stays unchanged
# => ['xzz', 'xaa']

Or, to invert the result and get all items that do not match the regex:

list(filter(lambda x: not reg.search(x), test))
# >>> ['bbb', 'ccc', 'axx']

See the Python demo.
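The same two filters can also be written as list comprehensions, which some readers find easier to scan than filter with a lambda. This is a minimal sketch, not part of the answer itself; the names matching and non_matching are illustrative.

```python
import re

test = ['bbb', 'ccc', 'axx', 'xzz', 'xaa']
reg = re.compile(r'^x')  # same pattern as in the answer above

# Keep the items where the pattern is found (reg.search returns a match object)
matching = [s for s in test if reg.search(s)]

# Keep the items where the pattern is not found (reg.search returns None)
non_matching = [s for s in test if not reg.search(s)]

print(matching)      # ['xzz', 'xaa']
print(non_matching)  # ['bbb', 'ccc', 'axx']
```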

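Usage notes:

re.search finds the first regex match anywhere in the string and returns a match object; otherwise it returns None.
re.match looks for a match only at the start of the string; it does not require the whole string to match. So re.search(r'^x', text) is equivalent to re.match(r'x', text).
re.fullmatch returns a match only if the whole string matches the pattern, so re.fullmatch(r'x', text) is equivalent to re.match(r'x\Z', text) and to re.search(r'^x\Z', text).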
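The differences between the three functions can be checked with a short sketch. The sample string 'xzz' and the variable names are chosen here purely for illustration.

```python
import re

text = 'xzz'  # sample value, chosen for illustration

# search scans the whole string; match only tries at position 0
found_anywhere = re.search(r'z', text)    # match object: 'z' occurs at index 1
found_at_start = re.match(r'z', text)     # None: the string does not start with 'z'

# Anchoring search with ^ gives the same result as match here
anchored_search = re.search(r'^x', text)  # match object
plain_match = re.match(r'x', text)        # match object

# fullmatch requires the pattern to cover the entire string
whole_short = re.fullmatch(r'x', text)    # None: 'zz' is left unmatched
whole_long = re.fullmatch(r'x.*', text)   # match object

for name, result in [
    ('search z', found_anywhere), ('match z', found_at_start),
    ('search ^x', anchored_search), ('match x', plain_match),
    ('fullmatch x', whole_short), ('fullmatch x.*', whole_long),
]:
    print(name, '->', bool(result))
```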

If you are unsure what the r'' prefix means, see "Python - Should I be using string prefix r when looking for a period (full stop or .) using regex?" and "Python regex - r prefix".

score 4 · Accepted Answer

You can use the following to check whether any of the strings in the list start with 'x':

>>> [e for e in test if e.startswith('x')]
['xzz', 'xaa']
>>> any(e.startswith('x') for e in test)
True

score 2 · Accepted Answer

You can use filter. I assume you want a new list containing certain elements from the old one.

new_test = filter(lambda x: x.startswith('x'), test)

Or, if you want to use a regular expression in the filter function, try the following. You need to import the re module.

new_test = filter(lambda s: re.match("^x", s), test)
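Note that in Python 3, filter returns a lazy iterator rather than a list, so the result has to be wrapped in list() before it can be printed or indexed. A minimal sketch, reusing the test list from the question; the names lazy and leftover are illustrative.

```python
import re

test = ['bbb', 'ccc', 'axx', 'xzz', 'xaa']

# In Python 3, filter() returns a lazy iterator, not a list
lazy = filter(lambda s: re.match("^x", s), test)

# Materialize it once; the iterator is exhausted afterwards
new_test = list(lazy)
leftover = list(lazy)

print(new_test)  # ['xzz', 'xaa']
print(leftover)  # []
```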

score 1 · Accepted Answer

An example of extracting multiple data points from each string in a list:

Input:

2021-02-08 20:43:16 [debug] : [RequestsDispatcher@_execute_request] Requesting: https://test.com&uuid=1623\n

Code:

import re
# Raw string, so that \d and \] are passed to the regex engine unchanged
pat = r'(.* \d\d:\d\d:\d\d) .*_execute_request\] (.*?):.*uuid=(.*?)[\.\n]'
new_list = [re.findall(pat, s) for s in my_list]

Output:

[[('2021-02-08 20:43:16', 'Requesting', '1623')]]
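For a version that runs on its own, the sketch below defines my_list with the sample line from the input and uses named groups so that each field can be read by name. The group names timestamp, action and uuid are illustrative and not part of the answer.

```python
import re

# Sample line copied from the input above; my_list is assumed to hold lines like this
my_list = [
    "2021-02-08 20:43:16 [debug] : [RequestsDispatcher@_execute_request] "
    "Requesting: https://test.com&uuid=1623\n"
]

# Same pattern as above, with named groups (the names are illustrative)
pat = re.compile(
    r'(?P<timestamp>.* \d\d:\d\d:\d\d) .*_execute_request\] '
    r'(?P<action>.*?):.*uuid=(?P<uuid>.*?)[.\n]'
)

records = []
for s in my_list:
    m = pat.search(s)
    if m:  # skip lines that do not match the pattern
        records.append(m.groupdict())

print(records)
```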

score 0 · Accepted Answer

Here is an improvised approach that works well. It may be helpful.

import re
l = ['bbb', 'ccc', 'axx', 'xzz', 'xaa']  # list
s = " ".join(l)                          # flatten the list into one string
re.findall(r'\bx\S*', s)                 # find the words starting with x

['xzz', 'xaa']
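One caveat: joining with spaces assumes that no element contains whitespace itself. The sketch below uses a hypothetical list (not from the question) to show how the joined approach and per-element filtering can diverge.

```python
import re

# Hypothetical list for illustration; one element contains a space
items = ['xaa', 'a xylophone', 'bbb']

# Join-then-findall: 'xylophone' is picked up as if it were its own element
joined = " ".join(items)
from_joined = re.findall(r'\bx\S*', joined)

# Per-element filtering only keeps elements that themselves start with x
per_element = [s for s in items if re.search(r'^x', s)]

print(from_joined)  # ['xaa', 'xylophone']
print(per_element)  # ['xaa']
```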
