python - How can I get all the terms that start with "#"?

Question

I have a string like this: "sometext #Syrup #nshit #thebluntislit"

I want to get a list of all the terms that start with "#".

I used the following code:

import re
line = "blahblahblah #Syrup #nshit #thebluntislit"
ht = re.search(r'#\w*', line)
ht = ht.group(0)
print(ht)

And this is what I get:

#Syrup

I was wondering whether there is a way to get a list like this instead:

[#Syrup,#nshit,#thebluntislit]

That is, all the terms that start with "#", not just the first one.

score 21 · Accepted Answer

In a good programming language like Python, you don't need regular expressions for this.

  hashed = [ word for word in line.split() if word.startswith("#") ]
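As a minimal sketch of how this behaves, assuming the sample string from the question and a hypothetical helper name `hashtags_by_split` (not part of the original answer):

```python
def hashtags_by_split(line):
    """Return the whitespace-separated words that begin with '#'."""
    return [word for word in line.split() if word.startswith("#")]


line = "blahblahblah #Syrup #nshit #thebluntislit"
print(hashtags_by_split(line))  # ['#Syrup', '#nshit', '#thebluntislit']

# No pattern is applied to the word itself, so trailing punctuation stays attached.
print(hashtags_by_split("nice #Syrup, right?"))  # ['#Syrup,']
```

One thing to keep in mind with this approach is that a word such as "beg#end" is ignored, since it does not start with "#", while a bare "#" on its own would be included.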

score 4 · Accepted Answer

You can use:

compiled = re.compile(r'#\w*')
compiled.findall(line)

Output:

['#Syrup', '#nshit', '#thebluntislit']

There is a problem, though. If you search a string like 'blahblahblah #Syrup #nshit #thebluntislit beg#end', the output will be ['#Syrup', '#nshit', '#thebluntislit', '#end'].

This can be addressed with a positive lookbehind:

compiled = re.compile(r'(?<=\s)#\w*')

(You cannot use \b (word boundary) here, because # is not among the \w symbols [0-9a-zA-Z_] that can make up the word whose boundary is being searched for.)
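One caveat with `(?<=\s)` is that it requires an actual whitespace character before the "#", so a term at the very start of the string is skipped. The sketch below compares it with a negative lookbehind, `(?<!\S)`, which is an assumed variant and not part of the original answer. The variable names and the sample string are illustrative, and the variant also uses `\w+` instead of `\w*` so that a bare "#" is not matched.

```python
import re

# Pattern from the answer: '#' must be preceded by a whitespace character.
after_space = re.compile(r'(?<=\s)#\w*')

# Assumed variant: '#' must NOT be preceded by a non-whitespace character,
# which also accepts a '#' at the very start of the string.
not_after_nonspace = re.compile(r'(?<!\S)#\w+')

line = "#first blah #Syrup beg#end"
print(after_space.findall(line))         # ['#Syrup']
print(not_after_nonspace.findall(line))  # ['#first', '#Syrup']
```

In both cases "beg#end" yields no match, because the "#" there is preceded by a non-whitespace character.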

score 1 · Accepted Answer

It looks like re.findall() will do what you want.

matches = re.findall(r'#\w*', line)
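For comparison, here is a small sketch of how `re.search`, `re.findall`, and `re.finditer` differ on the question's sample string. The variable names are illustrative, and `re.finditer` is shown only as an option for when match positions are also needed.

```python
import re

line = "blahblahblah #Syrup #nshit #thebluntislit"

# re.search stops at the first match, which is why the question's code
# printed only '#Syrup'.
first = re.search(r'#\w*', line).group(0)

# re.findall returns every non-overlapping match as a list of strings.
all_tags = re.findall(r'#\w*', line)

# re.finditer yields match objects, so the start offset is available too.
positions = [(m.group(0), m.start()) for m in re.finditer(r'#\w*', line)]

print(first)
print(all_tags)
print(positions)
```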
