python - More efficient way to find multiple keywords in a string column (pandas)

Question

I have a dataframe with many rows of strings: btb['Title']. I would like to identify whether each string contains positive, negative, or neutral keywords. The following works, but it is quite slow:

import numpy as np

positive_kw = ('rise', 'positive', 'high', 'surge')
negative_kw = ('sink', 'lower', 'fall', 'drop', 'slip', 'loss', 'losses')
neutral_kw = ('flat', 'neutral')
# Create the new columns, initialised to NaN
btb['Positive'] = np.nan
btb['Negative'] = np.nan
btb['Neutral'] = np.nan

# Set the value to 1 if any keyword occurs in the title (substring match)
for index, row in btb.iterrows():
    if any(s in row.Title for s in positive_kw):
        # btb.loc[index, col] avoids chained assignment, which may write to a copy
        btb.loc[index, 'Positive'] = 1
    if any(s in row.Title for s in negative_kw):
        btb.loc[index, 'Negative'] = 1
    if any(s in row.Title for s in neutral_kw):
        btb.loc[index, 'Neutral'] = 1

Thank you for your time. I would like to know what I need to do to improve the performance of this code.
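
One possible direction, offered as a minimal sketch and not as the accepted answer: the row-by-row loop can usually be replaced with a vectorized substring search via Series.str.contains, joining each keyword tuple into a single regular expression. The helper name flag_keywords and the sample titles are hypothetical and exist only for illustration. The sketch assumes the same case-sensitive substring matching as the loop above and treats missing titles as non-matches.

```python
import re

import pandas as pd

positive_kw = ('rise', 'positive', 'high', 'surge')
negative_kw = ('sink', 'lower', 'fall', 'drop', 'slip', 'loss', 'losses')
neutral_kw = ('flat', 'neutral')


def flag_keywords(titles, keywords):
    """Hypothetical helper: 1.0 where any keyword is a substring, NaN elsewhere."""
    # Escape each keyword so it is matched literally, then OR them together.
    pattern = '|'.join(re.escape(k) for k in keywords)
    # na=False makes missing titles count as "no match" instead of propagating NaN.
    hits = titles.str.contains(pattern, regex=True, na=False)
    # Keep 1.0 for matches and NaN otherwise, mirroring the original columns.
    return hits.astype(float).where(hits)


# Hypothetical sample data standing in for the real btb dataframe.
btb = pd.DataFrame({'Title': [
    'Stocks surge on earnings',
    'Shares slip as profits fall',
    'Markets flat',
    'No news today',
]})

btb['Positive'] = flag_keywords(btb['Title'], positive_kw)
btb['Negative'] = flag_keywords(btb['Title'], negative_kw)
btb['Neutral'] = flag_keywords(btb['Title'], neutral_kw)
```

Each column is then computed in a single pass over the Title column instead of one Python-level iteration per row. If a plain boolean column is acceptable, returning hits directly would be simpler; the actual speedup depends on the data and is worth measuring on the real dataframe.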
