python - Mapping keys (all occurrences in a given string) to their positions in the string

Question

I'm trying to get all indexes of the keys in a string and store them in a dict, so that every index has a list of keys mapping to it.

Example:

string = "loloo and foofoo at the foo bar"
keys = "foo", "loo", "bar", "lo"

I would expect something like this:

{ 
  0: [lo]
  2: [loo, lo]
 10: [foo]
 13: [foo]
 24: [foo]
 28: [bar]
}

My current answer is as follows:

import re

def get_index_for_string(string, keys):
    """
    Get all indexes of the keys in the string and store them in a dict, so that
    every index has a list of keys mapping to it.
    """
    key_in_string = dict((key, [m.start() for m in re.finditer(key, string)])
                            for key in keys if key in string)
    index_of_keys = {}
    for key, values in key_in_string.items():
        for value in values:
            if not value in index_of_keys:
                index_of_keys[value] = []
            index_of_keys[value].append(key)
    return index_of_keys

Any suggestions on how to improve this?

score 1 · Accepted Answer

First, you will want to re.escape the keys, in case they contain periods or other regex metacharacters. Beyond that, you can take a more direct approach to building the result dict:

import re
from collections import defaultdict

def get_index_for_string(string, keys):
    res = defaultdict(list)
    for key in keys:
        for match in re.finditer(re.escape(key), string):
            res[match.start()].append(key)
    return res
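
To illustrate why the escaping matters, here is a small check. The sample text and the key `a.b` are hypothetical and not from the question; the key simply contains a regex metacharacter. Without re.escape the period matches any character, so the unescaped pattern also reports a position where the literal key does not occur.

```python
import re

text = "axb a.b"   # hypothetical sample text
key = "a.b"        # hypothetical key containing a regex metacharacter

# Unescaped: "." matches any character, so "axb" is reported as well.
unescaped = [m.start() for m in re.finditer(key, text)]

# Escaped: only the literal text "a.b" is matched.
escaped = [m.start() for m in re.finditer(re.escape(key), text)]

print(unescaped)  # [0, 4]
print(escaped)    # [4]
```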

Note: instead of using a defaultdict, you could also use a regular dict and do res.setdefault(match.start(), []).append(key), but it doesn't look as nice.
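
For comparison, the setdefault variant mentioned in the note could look roughly like this. The function name `get_index_for_string_plain` is made up for this sketch; the logic is otherwise the same as in the defaultdict version.

```python
import re

def get_index_for_string_plain(string, keys):
    # Hypothetical name; same logic as the defaultdict version,
    # but the result is built in a regular dict.
    res = {}
    for key in keys:
        for match in re.finditer(re.escape(key), string):
            # setdefault creates the list the first time an index is seen.
            res.setdefault(match.start(), []).append(key)
    return res

string = "loloo and foofoo at the foo bar"
keys = ["foo", "loo", "bar", "lo"]
result = get_index_for_string_plain(string, keys)
for index in sorted(result):
    print(index, result[index])
```

One practical difference: the returned object is a plain dict, so looking up an index that has no keys raises a KeyError instead of silently creating an empty list.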

score 1 · Accepted Answer

Non-regex approach:

Use str.find(). str.find() accepts an optional second argument, which is the index from which to start searching for the word.

def indexes(word, strs):
    ind = 0                        # index to start searching from
    res = []
    while True:
        ans = strs.find(word, ind)
        if ans == -1:              # str.find() returns -1 when nothing is found
            break
        res.append(ans)
        ind = ans + 1              # continue just past the start of this match
    return res

strs = "loloo and foofoo at the foo bar"
keys = ["foo", "loo", "bar", "lo"]

print({x: indexes(x, strs) for x in keys})

Output (Python 3.7+, where dicts keep insertion order):

{'foo': [10, 13, 24], 'loo': [2], 'bar': [28], 'lo': [0, 2]}
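
The dict above maps each key to its positions, while the question asks for the reverse mapping from position to keys. A minimal sketch of that inversion, still using only str.find(), might look like this; the helper name `index_to_keys` is hypothetical.

```python
from collections import defaultdict

def index_to_keys(strs, keys):
    # Hypothetical helper: maps each start index to the keys found there.
    res = defaultdict(list)
    for key in keys:
        ind = strs.find(key)
        while ind != -1:
            res[ind].append(key)
            # Resume one character later, so overlapping hits are found too.
            ind = strs.find(key, ind + 1)
    return dict(res)

strs = "loloo and foofoo at the foo bar"
keys = ["foo", "loo", "bar", "lo"]
result = index_to_keys(strs, keys)
print(sorted(result.items()))
```

Because the search resumes one character after each hit, this version also reports occurrences of a key that overlap with itself (for example "aa" in "aaa" at 0 and 1), whereas re.finditer returns only non-overlapping matches.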

score 0 · Accepted Answer

What kind of "better" are you looking for? If you want better Big-O complexity, use an Aho-Corasick automaton. There are fast implementations available for Python.
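
To give a rough idea of the technique, here is a small pure-Python sketch of an Aho-Corasick automaton. It is a didactic illustration, not one of the fast implementations referred to above; the names `build_automaton` and `find_all_keys` are made up for this example, and the keys are assumed to be non-empty strings.

```python
from collections import defaultdict, deque

def build_automaton(keys):
    # Trie stored as parallel lists: goto[s] maps a character to the next
    # state, fail[s] is the failure link, out[s] lists keys ending at s.
    goto, fail, out = [{}], [0], [[]]
    for key in keys:
        state = 0
        for ch in key:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(key)
    # Breadth-first pass to fill in the failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keys that end at the failure target (proper suffixes).
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def find_all_keys(string, keys):
    # Single left-to-right scan over the string for all keys at once.
    goto, fail, out = build_automaton(keys)
    res = defaultdict(list)
    state = 0
    for pos, ch in enumerate(string):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for key in out[state]:
            # pos is the last character of the match; convert to start index.
            res[pos - len(key) + 1].append(key)
    return dict(res)

string = "loloo and foofoo at the foo bar"
keys = ["foo", "loo", "bar", "lo"]
result = find_all_keys(string, keys)
for index in sorted(result):
    print(index, sorted(result[index]))
```

The string is scanned once regardless of how many keys there are, which is where the complexity advantage comes from. Note that the order of keys within each list depends on where matches end, so it may differ from the order produced by the per-key loops.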
