python - Merging two arrays by collections of two elements

Question

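I have arrays that contain an even number of integers. Each array represents a sequence of identifier/count pairs, and the pairs are already sorted by identifier. I would like to merge several of these arrays together. I have thought of a few ways to do it, but they are fairly complicated, and I feel there may be an easy way to do this in Python.

That is: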

[<id>, <count>, <id>, <count>]

Input:

[14, 1, 16, 4, 153, 21]
[14, 2, 16, 3, 18, 9]

Output:

[14, 3, 16, 7, 18, 9, 153, 21]

score 8 · Accepted Answer

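Rather than storing these as lists, it would be better to store them as dictionaries (not only for this purpose, but also for other use cases such as extracting the value for a single ID).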

x1 = [14, 1, 16, 4, 153, 21]
x2 = [14, 2, 16, 3, 18, 9]

# turn into dictionaries (could write a function to convert)
d1 = dict([(x1[i], x1[i + 1]) for i in range(0, len(x1), 2)])
d2 = dict([(x2[i], x2[i + 1]) for i in range(0, len(x2), 2)])

print(d1)
# {14: 1, 16: 4, 153: 21}
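
The conversion that the code comment says could be written as a function might look like the sketch below. The name `pairs_to_dict` is hypothetical, and the sketch assumes each id appears at most once per list (a repeated id would keep only its last count).

```python
def pairs_to_dict(flat):
    """Turn [id, count, id, count, ...] into {id: count, ...}."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of items")
    # Even positions hold ids, odd positions hold the matching counts.
    return dict(zip(flat[0::2], flat[1::2]))

d1 = pairs_to_dict([14, 1, 16, 4, 153, 21])
d2 = pairs_to_dict([14, 2, 16, 3, 18, 9])
print(d1)  # {14: 1, 16: 4, 153: 21}
```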

You can then add them together using any of the solutions to this question. For example (taken from the first answer):

import collections

def d_sum(a, b):
    d = collections.defaultdict(int, a)
    for k, v in b.items():
        d[k] += v
    return dict(d)

print(d_sum(d1, d2))
# {14: 3, 16: 7, 153: 21, 18: 9}
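
If the flat, id-sorted list from the question is still needed afterwards, the merged dictionary can be flattened again. This is only a sketch; `dict_to_pairs` is a hypothetical helper name, not something from the answer.

```python
def dict_to_pairs(d):
    """Flatten {id: count, ...} into [id, count, ...], sorted by id."""
    flat = []
    for key in sorted(d):
        flat.extend((key, d[key]))
    return flat

merged = {14: 3, 16: 7, 153: 21, 18: 9}
print(dict_to_pairs(merged))  # [14, 3, 16, 7, 18, 9, 153, 21]
```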

score 5 · Accepted Answer

collections.Counter() is what you need here:

In [21]: lis1=[14, 1, 16, 4, 153, 21]

In [22]: lis2=[14, 2, 16, 3, 18, 9]

In [23]: from collections import Counter

In [24]: dic1=Counter(dict(zip(lis1[0::2],lis1[1::2])))

In [25]: dic2=Counter(dict(zip(lis2[0::2],lis2[1::2])))

In [26]: dic1+dic2
Out[26]: Counter({153: 21, 18: 9, 16: 7, 14: 3})

Or:

In [51]: it1=iter(lis1)

In [52]: it2=iter(lis2)

In [53]: dic1=Counter(dict((next(it1),next(it1)) for _ in range(len(lis1)//2)))

In [54]: dic2=Counter(dict((next(it2),next(it2)) for _ in range(len(lis2)//2)))

In [55]: dic1+dic2
Out[55]: Counter({153: 21, 18: 9, 16: 7, 14: 3})
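
The question mentions merging several arrays, not just two. A minimal sketch of that case, assuming the same flat layout: the third array and the helper name `to_counter` are made up for illustration.

```python
from collections import Counter

def to_counter(flat):
    # Pair each id (even index) with its count (odd index).
    return Counter(dict(zip(flat[0::2], flat[1::2])))

arrays = [
    [14, 1, 16, 4, 153, 21],
    [14, 2, 16, 3, 18, 9],
    [16, 1, 20, 5],  # made-up third array for illustration
]

total = Counter()
for flat in arrays:
    # update() adds counts in place; unlike +, it keeps zero or negative totals.
    total.update(to_counter(flat))

print(sorted(total.items()))
# [(14, 3), (16, 8), (18, 9), (20, 5), (153, 21)]
```

Note that `Counter + Counter` drops any key whose resulting count is zero or negative, which matters if a count of 0 is meaningful in your data.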

score 5 · Accepted Answer

Using collections.Counter:

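import itertools
import collections

# sample data from the question
lst1 = [14, 1, 16, 4, 153, 21]
lst2 = [14, 2, 16, 3, 18, 9]

def grouper(n, iterable, fillvalue=None):
    # collect data into fixed-length chunks: grouper(2, 'ABCD') --> AB CD
    args = [iter(iterable)] * n
    # zip_longest in Python 3 (it was izip_longest in Python 2)
    return itertools.zip_longest(*args, fillvalue=fillvalue)

count1 = collections.Counter(dict(grouper(2, lst1)))
count2 = collections.Counter(dict(grouper(2, lst2)))
result = count1 + count2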

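I used the grouper recipe from the itertools documentation here to convert the data into dictionaries, but as the other answers show, there are other ways to skin that particular cat.

result is a Counter in which each id points to its total count: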

Counter({153: 21, 18: 9, 16: 7, 14: 3})

A Counter is a multiset and makes it easy to keep track of the count for each key. It feels like a much better data structure for your data. For example, Counters support summing, as used above.
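
Beyond summing, a few other Counter behaviours may be useful for id/count data. The sketch below rebuilds the merged result by hand and shows lookups and ranking; the id 999 is an arbitrary value chosen because it is absent.

```python
from collections import Counter

result = Counter({14: 3, 16: 7, 18: 9, 153: 21})

print(result[16])             # 7
print(result[999])            # 0, missing ids count as zero instead of raising KeyError
print(result.most_common(2))  # [(153, 21), (18, 9)]
print(sum(result.values()))   # 40
```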

score 0 · Accepted Answer

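All the previous answers look good, but I think the JSON blob should be properly formed from the start; otherwise (in my experience) it can cause serious problems during debugging. In this case, with id and count as fields, the JSON should look like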

[{"id":1, "count":10}, {"id":2, "count":10}, {"id":1, "count":5}, ...]

Properly formed JSON like that is much easier to deal with, and it probably resembles what you had coming in anyway.
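
If the data only exists in the flat form from the question, it can be reshaped into records of this kind. A minimal sketch, assuming the field names "id" and "count" used above:

```python
import json

flat = [14, 1, 16, 4, 153, 21]

# One record per (id, count) pair.
records = [{"id": i, "count": c} for i, c in zip(flat[0::2], flat[1::2])]

print(json.dumps(records))
# [{"id": 14, "count": 1}, {"id": 16, "count": 4}, {"id": 153, "count": 21}]
```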

This class is a bit general, but it is certainly extensible:


from itertools import groupby

class ListOfDicts:
    def __init__(self, listofD=None):
        self.list = []
        if listofD is not None:
            self.list = listofD

    def key_total(self, group_by_key, aggregate_key):
        """ Aggregate a list of dicts by a specific key, and aggregation key"""
        out_dict = {}
        # groupby only groups consecutive records, so self.list must already
        # be sorted by group_by_key
        for k, g in groupby(self.list, key=lambda r: r[group_by_key]):
            print(k)
            total = 0
            for record in g:
                print("   ", record)
                total += record[aggregate_key]
            out_dict[k] = total
        return out_dict


if __name__ == "__main__":
    z = ListOfDicts([ {'id':1, 'count':2, 'junk':2},
                   {'id':1, 'count':4, 'junk':2},
                   {'id':1, 'count':6, 'junk':2},
                   {'id':2, 'count':2, 'junk':2},
                   {'id':2, 'count':3, 'junk':2},
                   {'id':2, 'count':3, 'junk':2},
                   {'id':3, 'count':10, 'junk':2},
                   ])

    totals = z.key_total("id", "count")
    print(totals)

Which gives


1
    {'id': 1, 'count': 2, 'junk': 2}
    {'id': 1, 'count': 4, 'junk': 2}
    {'id': 1, 'count': 6, 'junk': 2}
2
    {'id': 2, 'count': 2, 'junk': 2}
    {'id': 2, 'count': 3, 'junk': 2}
    {'id': 2, 'count': 3, 'junk': 2}
3
    {'id': 3, 'count': 10, 'junk': 2}

{1: 12, 2: 8, 3: 10}
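
One caveat: itertools.groupby only groups consecutive records, so the class above relies on the list being sorted by the grouping key. When that cannot be guaranteed, as in the JSON example where id 1 appears twice with id 2 in between, a dictionary-based total avoids the requirement. This is a sketch of an alternative, not the answer's own code; the standalone function name `key_total` is reused only for illustration.

```python
from collections import defaultdict

def key_total(records, group_by_key, aggregate_key):
    """Sum aggregate_key per group_by_key; records may be in any order."""
    totals = defaultdict(int)
    for record in records:
        totals[record[group_by_key]] += record[aggregate_key]
    return dict(totals)

records = [
    {"id": 1, "count": 10},
    {"id": 2, "count": 10},
    {"id": 1, "count": 5},  # same id again, not adjacent to the first
]
print(key_total(records, "id", "count"))  # {1: 15, 2: 10}
```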
