python - 任意の基準に従って2つのリストを整列させる方法は?

Question

persons_a人のリストが 2 つあるとしましょうpersons_b。そして、リスト内の各人物を、たとえばなどの任意の属性に従って、リスト内persons_aの人物と一致させたいと考えています。persons_bperson.ageperson.town_from

Pythonで最も効率的な方法でそれを行うにはどうすればよいですか? forループを実行するだけですか？

criteria = lambda a, b: a.age == b.age

result = []
for a in persons_a:
    for b in persons_b:
        if critera(a, b):
           result.add(a)

score 5 · Accepted Answer

criteria = lambda a, b: a.age == b.age
cross = itertools.product( persons_a, persons_b )
result = ( a for a, b in cross if criteria( a, b ) )

これはよりPythonicで読みやすいです。これitertoolsは、ネストされた同じ for ループを実行する方法にすぎないため、コードを読みやすくするだけで効率的ではありません。

各組み合わせをループする必要があるため、の持っているものよりも良くなることはできませんO( n^2 )。したがって、ループを短絡するか、貪欲なアルゴリズムの両方のリストを介して単一のパスを考え出さない限り、上記とあなたの最適解です。半構造化データがある場合、ソートされている等長のリストがある場合、単一のパスでリストを取得できるため、コードを高速化できる場合がありますが、何も持っていない場合このような構造の場合、O( n^2 )アルゴリズムに固執する必要があります。

score 3 · Accepted Answer

辞書のアプローチを参照することで、あなたは次のようなものを意味しましたか？

class Store() :

    def __init__(self,types):
        self.a = {}
        for i in types : self.a[i]={}

    def addToStore(self,item) : # item is a dictionary
        for key in item.keys() :
            if key in self.a :
                self.a[key][item[key]] =   self.a[key].setdefault(item[key],[])+[item]

    def printtype(self,atype) :
        print atype
        for i in self.a[atype] : print self.a[atype][i]

if __name__=="__main__" :
    persons= Store(["age","place"])
    persons.addToStore({"name" : "Smith" , "place" : "Bury" , "age" : 32 })
    persons.addToStore({"name" : "Jones" , "place" : "Bolton" , "age" : 35 })
    persons.addToStore({"name" : "Swift" , "place" : "Radcliffe" , "age" : 32 })
    persons.addToStore({"name" : "Issac" , "place" : "Rochdale" , "age" : 32 })
    persons.addToStore({"name" : "Phillips" , "place" : "Bolton" , "age" : 26 })
    persons.addToStore({"name" : "Smith" , "place" : "Bury" , "age" : 41 })
    persons.addToStore({"name" : "Smith" , "place" : "Ramsbottom" , "age" : 25 })
    persons.addToStore({"name" : "Wilson" , "place" : "Bolton" , "age" : 26 })
    persons.addToStore({"name" : "Jones" , "place" : "Heywood" , "age" : 72 })
    persons.printtype("age")
    persons.printtype("place")

score 3 · Accepted Answer

質問や以前の回答のように、ネストされた for ループを使用すると、O(m*n) アルゴリズムが得られます。ここで、m、n はリストaおよびbのサイズです。(または、リストが同じサイズの場合は O(n^2)。) O(n) または O(n log n) アルゴリズムを取得するには、(1) O をサポートするセットまたは辞書データ構造を使用できます。 (1) または O(log n) メンバーシップルックアップ、または (2) O(1) または O(log n) 一致テストを可能にするために、a および/または b を基準要素の順に並べ替えます。O(n log n) 時間かかる 2 つの並べ替えを使用する次のコード例を参照してください。その後、一致するペアを識別する O(n) 比較パスが続き、全体の時間は O(n log n) になります。

import collections
iTuple = collections.namedtuple('item',['data','age'])
p_a, p_b, n, p = [], [], 15, 19

for i in range(n):
   p_a.append(iTuple(i, ( 7*i)%p))
   p_b.append(iTuple(i, (11*i)%p))

sa = sorted(p_a, key=lambda x: x.age)
sb = sorted(p_b, key=lambda x: x.age)
#print ('sa: ', sa, '\n\nsb: ',sb, '\n')

ia, ib, result = 0, 0, []
while ia < n > ib:
   #print (ia, ib)
   if sa[ia].age == sb[ib].age:
      result.append([sa[ia], sb[ib]])
      print ('Match:', sa[ia], '\t', sb[ib],'\tat',ia,ib)
      ia, ib = ia+1, ib+1
   elif sa[ia].age  < sb[ib].age: ia += 1
   elif sa[ia].age >  sb[ib].age: ib += 1

#print ('Result:', result)

上記のプログラムからの出力は次のとおりです。

Match: item(data=0, age=0)   item(data=0, age=0)    at 0 0
Match: item(data=11, age=1)  item(data=7, age=1)    at 1 1
Match: item(data=3, age=2)   item(data=14, age=2)   at 2 2
Match: item(data=14, age=3)  item(data=2, age=3)    at 3 3
Match: item(data=6, age=4)   item(data=9, age=4)    at 4 4
Match: item(data=9, age=6)   item(data=4, age=6)    at 5 5
Match: item(data=1, age=7)   item(data=11, age=7)   at 6 6
Match: item(data=4, age=9)   item(data=6, age=9)    at 8 7
Match: item(data=7, age=11)  item(data=1, age=11)   at 9 9
Match: item(data=2, age=14)  item(data=3, age=14)   at 11 11
Match: item(data=13, age=15) item(data=10, age=15)  at 12 12
Match: item(data=8, age=18)  item(data=12, age=18)  at 14 14

score 1 · Accepted Answer

result = [a for a in persons_a for b in persons_b if critera(a, b)]

リスト内包表記形式のループです。

結果をどのように処理するかによっては、ほぼ同じように見えるジェネレータ式を使用することもできます。

result = (a for a in persons_a for b in persons_b if critera(a, b))

違いは、メモリを占有しないことです。yield代わりに、ジェネレーター関数の場合と同様に、要求された時点で値を生成します。

score 1 · Accepted Answer

そのカスタム基準に従って人物を並べ替えることができると仮定します。

itertools.groupbyを使用して 1 つのリストの辞書を作成し (O(n log n並べ替え用にする必要があります)、他のかなり効率的な (正確な) 一致を見つけます (O(m)これは、他のリストの各人物に対して一定です)。

以下に実装例を示します。

import random
import collections
import itertools
iTuple = collections.namedtuple('Person', ['town', 'age'])

# make up data
random.seed(1)
def random_person():
    age = random.randrange(19,49)
    town = random.choice("Edinburgh Glasgow Aberdeen".split())
    return iTuple(town, age)
n_f, n_m = 15, 20
females = [random_person() for x in xrange(n_f)]
males = [random_person() for x in xrange(n_m)]

# group by criterion of interest: age, town
by_age, by_town = lambda x: x.age, lambda x: x.town
males_by_age = dict((age, list(group)) for age, group in itertools.groupby(
        sorted(males, key=by_age), key=by_age))
males_by_town = dict((age, list(group)) for age, group in itertools.groupby(
        sorted(males, key=by_town), key=by_town))

次に、この辞書にクエリを実行して、一致のリストを取得できます。

# assign random matches according to grouping variable (if available)
print "matches by age:"
for person in females:
    candidates = males_by_age.get(person.age)
    if candidates:
        print person, random.choice(candidates)
    else:
        print person, "no match found"

print "matches by town:"
for person in females:
    candidates = males_by_town.get(person.town)
    if candidates:
        print person, random.choice(candidates)
    else:
        print person, "no match found"

出力は次のようになります。

matches by age:
Person(town='Aberdeen', age=23) no match found
Person(town='Edinburgh', age=41) no match found
Person(town='Glasgow', age=33) no match found
Person(town='Aberdeen', age=38) Person(town='Edinburgh', age=38)
Person(town='Edinburgh', age=21) no match found
Person(town='Glasgow', age=44) Person(town='Glasgow', age=44)
Person(town='Edinburgh', age=41) no match found
Person(town='Aberdeen', age=32) no match found
Person(town='Aberdeen', age=25) Person(town='Edinburgh', age=25)
Person(town='Edinburgh', age=46) no match found
Person(town='Glasgow', age=19) no match found
Person(town='Glasgow', age=47) Person(town='Glasgow', age=47)
Person(town='Glasgow', age=25) Person(town='Glasgow', age=25)
Person(town='Edinburgh', age=19) no match found
Person(town='Glasgow', age=32) no match found
matches by town:
Person(town='Aberdeen', age=23) Person(town='Aberdeen', age=45)
Person(town='Edinburgh', age=41) Person(town='Edinburgh', age=27)
Person(town='Glasgow', age=33) Person(town='Glasgow', age=44)
Person(town='Aberdeen', age=38) Person(town='Aberdeen', age=45)
Person(town='Edinburgh', age=21) Person(town='Edinburgh', age=20)
Person(town='Glasgow', age=44) Person(town='Glasgow', age=24)
Person(town='Edinburgh', age=41) Person(town='Edinburgh', age=38)
Person(town='Aberdeen', age=32) Person(town='Aberdeen', age=34)
Person(town='Aberdeen', age=25) Person(town='Aberdeen', age=40)
Person(town='Edinburgh', age=46) Person(town='Edinburgh', age=38)
Person(town='Glasgow', age=19) Person(town='Glasgow', age=34)
Person(town='Glasgow', age=47) Person(town='Glasgow', age=42)
Person(town='Glasgow', age=25) Person(town='Glasgow', age=34)
Person(town='Edinburgh', age=19) Person(town='Edinburgh', age=27)
Person(town='Glasgow', age=32) Person(town='Glasgow', age=34)

python - 任意の基準に従って2つのリストを整列させる方法は?

5 に答える 5

Related

Reference