python - How to merge dictionaries of dictionaries?

Question

I need to merge multiple dictionaries. Here is what I have, for instance:

dict1 = {1:{"a":{A}}, 2:{"b":{B}}}

dict2 = {2:{"c":{C}}, 3:{"d":{D}}

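with A, B, C and D being leaves of the tree, like {"info1":"value", "info2":"value2"}

The level (depth) of the dictionaries is unknown; it could be {2:{"c":{"z":{"y":{C}}}}}

In my case it represents a directory/file structure, with nodes being docs and leaves being files.

I want to merge them to obtain: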

 dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}

I'm not sure how I could do that easily in Python.

score 181 · Accepted Answer

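This is actually quite tricky, particularly if you want a useful error message when things are inconsistent, while correctly accepting duplicate but consistent entries (something no other answer here does).

Assuming you don't have a huge number of entries, a recursive function is easiest: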

def merge(a, b, path=None):
    "merges b into a"
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a

# works
print(merge({1:{"a":"A"},2:{"b":"B"}}, {2:{"c":"C"},3:{"d":"D"}}))
# has conflict
merge({1:{"a":"A"},2:{"b":"B"}}, {1:{"a":"A"},2:{"b":"C"}})

Note that this mutates a: the contents of b are added to a (which is also returned). If you want to keep a, you could call it like merge(dict(a), b).
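One caveat about merge(dict(a), b): dict(a) copies only the top level, so nested dictionaries inside a are still shared with the copy and still get modified. A minimal sketch, assuming you want a left completely untouched, is to pass copy.deepcopy(a) instead. The function is repeated from above so that the snippet runs on its own, and the sample data is only illustrative.

```python
import copy

def merge(a, b, path=None):
    """Merge b into a (same logic as the function above)."""
    if path is None:
        path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] != b[key]:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a

a = {1: {"a": "A"}, 2: {"b": "B"}}
b = {2: {"c": "C"}, 3: {"d": "D"}}

# dict(a) copies only the top level: a[2] is shared, so it gains "c"
merge(dict(a), b)
shallow_leaked = "c" in a[2]

# copy.deepcopy(a) copies every level: the original stays untouched
a = {1: {"a": "A"}, 2: {"b": "B"}}
merged = merge(copy.deepcopy(a), b)
```

The deep copy costs time and memory proportional to the size of a, so it is only worth it when the original really has to be preserved.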

agf pointed out (below) that you may have more than two dicts, in which case you can use:

reduce(merge, [dict1, dict2, dict3...])

where everything will be added to dict1.

Note: I edited my initial answer to mutate the first argument; that makes the "reduce" easier to explain.

PS: In Python 3, you will also need from functools import reduce
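As a rough sketch of the reduce call with three inputs: the third dictionary below is made up for illustration, and seeding reduce with an empty dict plus deep copies is an assumption on my part, chosen so that none of the inputs is modified.

```python
import copy
from functools import reduce

def merge(a, b, path=None):
    """Merge b into a (same logic as the function above)."""
    if path is None:
        path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] != b[key]:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a

dict1 = {1: {"a": "A"}, 2: {"b": "B"}}
dict2 = {2: {"c": "C"}, 3: {"d": "D"}}
dict3 = {3: {"e": "E"}, 4: {"f": "F"}}  # hypothetical third input

# Everything is accumulated into the fresh {} passed as the initial value;
# the deep copies keep dict1, dict2 and dict3 exactly as they were.
merged = reduce(merge, (copy.deepcopy(d) for d in (dict1, dict2, dict3)), {})
```

Without the copies, reduce(merge, [dict1, dict2, dict3]) gives the same merged result but accumulates it in dict1.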

score 41 · Accepted Answer

Here's an easy way to do it using generators:

def mergedicts(dict1, dict2):
    for k in set(dict1.keys()).union(dict2.keys()):
        if k in dict1 and k in dict2:
            if isinstance(dict1[k], dict) and isinstance(dict2[k], dict):
                yield (k, dict(mergedicts(dict1[k], dict2[k])))
            else:
                # If one of the values is not a dict, you can't continue merging it.
                # Value from second dict overrides one in first and we move on.
                yield (k, dict2[k])
                # Alternatively, replace this with exception raiser to alert you of value conflicts
        elif k in dict1:
            yield (k, dict1[k])
        else:
            yield (k, dict2[k])

dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}

print(dict(mergedicts(dict1, dict2)))

This prints:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

score 29 · Accepted Answer

One issue with this question is that the values of the dict can be arbitrarily complex pieces of data. Based on these and other answers, I came up with this code:

class YamlReaderError(Exception):
    pass

def data_merge(a, b):
    """merges b into a and return merged result

    NOTE: tuples and arbitrary objects are not handled as it is totally ambiguous what should happen"""
    key = None
    # ## debug output
    # sys.stderr.write("DEBUG: %s to %s\n" %(b,a))
    try:
        if a is None or isinstance(a, (str, int, float)):
            # border case for first run or if a is a primitive
            # (in Python 3, str and int cover Python 2's unicode and long)
            a = b
        elif isinstance(a, list):
            # lists can be only appended
            if isinstance(b, list):
                # merge lists
                a.extend(b)
            else:
                # append to list
                a.append(b)
        elif isinstance(a, dict):
            # dicts must be merged
            if isinstance(b, dict):
                for key in b:
                    if key in a:
                        a[key] = data_merge(a[key], b[key])
                    else:
                        a[key] = b[key]
            else:
                raise YamlReaderError('Cannot merge non-dict "%s" into dict "%s"' % (b, a))
        else:
            raise YamlReaderError('NOT IMPLEMENTED "%s" into "%s"' % (b, a))
    except TypeError as e:
        raise YamlReaderError('TypeError "%s" in key "%s" when merging "%s" into "%s"' % (e, key, b, a))
    return a

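My use case is merging YAML files, where I only have to deal with a subset of the possible data types. Hence I can ignore tuples and other objects. For me, a sensible merge logic means:

replace scalars
append to lists
merge dicts by adding missing keys and updating existing keys

Everything else, and anything unforeseen, results in an error.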
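The three rules can be sketched compactly as follows. This is not the code above but a hypothetical, non-mutating variant (merge_values and the sample config are made-up names and data) that returns a new value instead of modifying a, and raises TypeError for anything it does not recognise.

```python
def merge_values(a, b):
    """Hypothetical helper: return the result of merging b into a."""
    if a is None or isinstance(a, (str, int, float)):
        # rule 1: scalars are replaced
        return b
    if isinstance(a, list):
        # rule 2: lists are appended to
        return a + (b if isinstance(b, list) else [b])
    if isinstance(a, dict) and isinstance(b, dict):
        # rule 3: add missing keys, merge existing ones recursively
        merged = dict(a)
        for key, value in b.items():
            merged[key] = merge_values(a[key], value) if key in a else value
        return merged
    # everything else is an error
    raise TypeError('cannot merge %r into %r' % (b, a))

base = {"name": "svc", "ports": [80], "env": {"DEBUG": False}}
override = {"ports": [443], "env": {"DEBUG": True, "REGION": "eu"}}
result = merge_values(base, override)
```

Here result keeps "name", extends "ports" to [80, 443], and merges "env" key by key.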

score 17 · Accepted Answer

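Dictionaries of dictionaries merge

As this is the canonical question (in spite of certain non-generalities), I'm providing the canonical Pythonic approach to solving this issue.

Simplest case: "leaves are nested dicts that end in empty dicts":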

d1 = {'a': {1: {'foo': {}}, 2: {}}}
d2 = {'a': {1: {}, 2: {'bar': {}}}}
d3 = {'b': {3: {'baz': {}}}}
d4 = {'a': {1: {'quux': {}}}}

This is the simplest case for recursion, and I would recommend two naive approaches:

def rec_merge1(d1, d2):
    '''return new merged dict of dicts'''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            d2[k] = rec_merge1(v, d2[k])
    d3 = d1.copy()
    d3.update(d2)
    return d3

def rec_merge2(d1, d2):
    '''update first dict with second recursively'''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            d2[k] = rec_merge2(v, d2[k])
    d1.update(d2)
    return d1

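I believe I would prefer the second to the first, but keep in mind that it updates the first dict in place, so the original state of the first would have to be rebuilt from its origin. Here's the usage: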

>>> from functools import reduce # only required for Python 3.
>>> reduce(rec_merge1, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}
>>> reduce(rec_merge2, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}
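Both functions write into their arguments while they recurse (rec_merge2 updates d1, and both assign into d2). If the inputs are needed again afterwards, one option, sketched here under the assumption that the extra copying is affordable, is to run reduce over deep copies:

```python
import copy
from functools import reduce

def rec_merge2(d1, d2):
    '''update first dict with second recursively (as above)'''
    for k, v in d1.items():
        if k in d2:
            d2[k] = rec_merge2(v, d2[k])
    d1.update(d2)
    return d1

d1 = {'a': {1: {'foo': {}}, 2: {}}}
d2 = {'a': {1: {}, 2: {'bar': {}}}}
d3 = {'b': {3: {'baz': {}}}}
d4 = {'a': {1: {'quux': {}}}}

# snapshot used only to show that the inputs do not change
originals = copy.deepcopy((d1, d2, d3, d4))

# merge copies, so d1..d4 keep their original contents
merged = reduce(rec_merge2, copy.deepcopy((d1, d2, d3, d4)))
```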

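Complex case: "leaves are of any other type":

So if they end in dicts, it's a simple case of merging the final empty dicts. If not, it's not so trivial. If they are strings, how do you merge them? Sets can be updated similarly, so we could give them that treatment, but we would lose the order in which they were merged. So does order matter?

So, in lieu of more information, the simplest approach is to give them the standard update treatment unless both values are dicts: that is, the second dict's value overwrites the first's, even if the second dict's value is None and the first's value is a dict with a lot of info.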

d1 = {'a': {1: 'foo', 2: None}}
d2 = {'a': {1: None, 2: 'bar'}}
d3 = {'b': {3: 'baz'}}
d4 = {'a': {1: 'quux'}}

from collections.abc import MutableMapping

def rec_merge(d1, d2):
    '''
    Update two dicts of dicts recursively, 
    if either mapping has leaves that are non-dicts, 
    the second's leaf overwrites the first's.
    '''
    for k, v in d1.items():
        if k in d2:
            # this next check is the only difference!
            if all(isinstance(e, MutableMapping) for e in (v, d2[k])):
                d2[k] = rec_merge(v, d2[k])
            # we could further check types and merge as appropriate here.
    d3 = d1.copy()
    d3.update(d2)
    return d3

And now

from functools import reduce
reduce(rec_merge, (d1, d2, d3, d4))

returns

{'a': {1: 'quux', 2: 'bar'}, 'b': {3: 'baz'}}
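To make the "None overwrites a dict" consequence concrete, here is a small sketch. rec_merge_pure is a hypothetical variant written for this illustration: it applies the same rule (recurse only when both values are mappings, otherwise the second value wins) but builds a new dict instead of assigning into d2.

```python
from collections.abc import MutableMapping

def rec_merge_pure(d1, d2):
    """Hypothetical variant of rec_merge that does not modify d1 or d2."""
    merged = dict(d1)
    for k, v in d2.items():
        if isinstance(v, MutableMapping) and isinstance(merged.get(k), MutableMapping):
            # both sides are mappings: merge them recursively
            merged[k] = rec_merge_pure(merged[k], v)
        else:
            # standard update treatment: the second value wins
            merged[k] = v
    return merged

rich = {'a': {'lots': 'of', 'useful': 'info'}}
sparse = {'a': None}

# the None in the second dict replaces the whole sub-dict of the first
result = rec_merge_pure(rich, sparse)
```

If losing data this way is not acceptable, the else branch is the place to add a check such as skipping None values or raising an error.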

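Application to the original question:

I've had to remove the curly braces around the letters and put them in single quotes for this to be legitimate Python (otherwise they would be set literals in Python 2.7+), as well as append a missing brace: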

dict1 = {1:{"a":'A'}, 2:{"b":'B'}}
dict2 = {2:{"c":'C'}, 3:{"d":'D'}}

and rec_merge(dict1, dict2) now returns:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}

which matches the desired outcome of the original question (after changing, e.g., {A} to 'A').

score 8 · Accepted Answer

This simple recursive procedure merges one dictionary into another while overriding conflicting keys:

#!/usr/bin/env python2.7

def merge_dicts(dict1, dict2):
    """ Recursively merges dict2 into dict1 """
    if not isinstance(dict1, dict) or not isinstance(dict2, dict):
        return dict2
    for k in dict2:
        if k in dict1:
            dict1[k] = merge_dicts(dict1[k], dict2[k])
        else:
            dict1[k] = dict2[k]
    return dict1

print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {2:{"c":"C"}, 3:{"d":"D"}}))
print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {1:{"a":"A"}, 2:{"b":"C"}}))

Output:

{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
{1: {'a': 'A'}, 2: {'b': 'C'}}

score 6 · Accepted Answer

If you have an unknown level of dictionaries, then I would suggest a recursive function:

def combineDicts(dictionary1, dictionary2):
    # start from a shallow copy so that neither input is modified
    output = dict(dictionary1)
    for item, value in dictionary2.items():
        if isinstance(value, dict) and isinstance(output.get(item), dict):
            # both sides hold a dict under this key: merge them recursively
            output[item] = combineDicts(output[item], value)
        else:
            # new key, or a non-dict conflict: dictionary2's value wins
            output[item] = value
    return output

score 2 · Accepted Answer

Since dictviews support set operations, I was able to greatly simplify jterrace's answer.

def merge(dict1, dict2):
    for k in dict1.keys() - dict2.keys():
        yield (k, dict1[k])

    for k in dict2.keys() - dict1.keys():
        yield (k, dict2[k])

    for k in dict1.keys() & dict2.keys():
        yield (k, dict(merge(dict1[k], dict2[k])))

Any attempt to combine a dict with a non-dict (technically, an object with a 'keys' method and an object without a 'keys' method) will raise an AttributeError. This includes both the initial call to the function and recursive calls. This is exactly what I wanted, so I left it. You could easily catch any AttributeError raised by the recursive call and then yield any value you please.
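A minimal sketch of that last suggestion, assuming the policy "when either side is not a dict, the value from dict2 wins" (that policy is my assumption for the example, not something the answer prescribes):

```python
def merge(dict1, dict2):
    for k in dict1.keys() - dict2.keys():
        yield (k, dict1[k])

    for k in dict2.keys() - dict1.keys():
        yield (k, dict2[k])

    for k in dict1.keys() & dict2.keys():
        try:
            # fails with AttributeError if either value has no .keys()
            value = dict(merge(dict1[k], dict2[k]))
        except AttributeError:
            # assumed policy: fall back to the value from dict2
            value = dict2[k]
        yield (k, value)

left = {'a': {'x': 1}, 'b': 1, 'c': 0}
right = {'a': {'y': 2}, 'b': 2}
result = dict(merge(left, right))
```

The initial call is still unprotected, so dict(merge(1, {})) raises AttributeError as before.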

score 1 · Accepted Answer

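Of course, the code will depend on your rules for resolving merge conflicts. Here's a version that takes an arbitrary number of arguments and merges them recursively to an arbitrary depth, without mutating any object. It uses the following rules to resolve merge conflicts:

dictionaries take precedence over non-dict values ({"foo": {...}} takes precedence over {"foo": "bar"})
later arguments take precedence over earlier arguments (if you merge {"a": 1}, {"a": 2}, and {"a": 3} in order, the result will be {"a": 3})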

try:
    from collections.abc import Mapping  # Python 3.3+
except ImportError:
    from collections import Mapping  # Python 2

def merge_dicts(*dicts):                                                            
    """                                                                             
    Return a new dictionary that is the result of merging the arguments together.   
    In case of conflicts, later arguments take precedence over earlier arguments.   
    """                                                                             
    updated = {}                                                                    
    # grab all keys                                                                 
    keys = set()                                                                    
    for d in dicts:                                                                 
        keys = keys.union(set(d))                                                   

    for key in keys:                                                                
        values = [d[key] for d in dicts if key in d]                                
        # which ones are mapping types? (aka dict)                                  
        maps = [value for value in values if isinstance(value, Mapping)]            
        if maps:                                                                    
            # if we have any mapping types, call recursively to merge them          
            updated[key] = merge_dicts(*maps)                                       
        else:                                                                       
            # otherwise, just grab the last value we have, since later arguments    
            # take precedence over earlier arguments                                
            updated[key] = values[-1]                                               
    return updated
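The two precedence rules can be checked with a short example. The function here is a compact restatement of the one above, written only so that the snippet runs on its own in Python 3; the inputs are the ones used in the rule descriptions.

```python
from collections.abc import Mapping

def merge_dicts(*dicts):
    """Compact restatement of the function above, for demonstration only."""
    updated = {}
    # the union of all keys found in any argument
    for key in set().union(*dicts):
        values = [d[key] for d in dicts if key in d]
        maps = [v for v in values if isinstance(v, Mapping)]
        # mappings are merged recursively; otherwise the last value wins
        updated[key] = merge_dicts(*maps) if maps else values[-1]
    return updated

# rule 1: a dict beats a non-dict value, regardless of argument order
rule1 = merge_dicts({"foo": {"x": 1}}, {"foo": "bar"})

# rule 2: later arguments beat earlier ones
rule2 = merge_dicts({"a": 1}, {"a": 2}, {"a": 3})
```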

score 1 · Accepted Answer

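I had two dictionaries (a and b), each of which could contain any number of nested dictionaries. I wanted to merge them recursively, with b taking precedence over a.

Considering the nested dictionaries as trees, what I wanted was:

To update a so that every path to every leaf in b would be represented in a
To overwrite subtrees of a if a leaf is found in the corresponding path in b
- Maintain the invariant that all b leaf nodes remain leaves.

The existing answers were a little complicated for my taste and left some details on the shelf. I hacked together the following, which passes the unit tests for my data set.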

  def merge_map(a, b):
    if not isinstance(a, dict) or not isinstance(b, dict):
      return b

    for key in b.keys():
      a[key] = merge_map(a[key], b[key]) if key in a else b[key]
    return a

Example (formatted for clarity):

 a = {
    1 : {'a': 'red', 
         'b': {'blue': 'fish', 'yellow': 'bear' },
         'c': { 'orange': 'dog'},
    },
    2 : {'d': 'green'},
    3: 'e'
  }

  b = {
    1 : {'b': 'white'},
    2 : {'d': 'black'},
    3: 'e'
  }


  >>> merge_map(a, b)
  {1: {'a': 'red', 
       'b': 'white',
       'c': {'orange': 'dog'},},
   2: {'d': 'black'},
   3: 'e'}

The paths in b that needed to be maintained were:

1 -> 'b' -> 'white'
2 -> 'd' -> 'black'
3 -> 'e'.

a had the unique and non-conflicting paths of:

1 -> 'a' -> 'red'
1 -> 'c' -> 'orange' -> 'dog'

so they are still represented in the merged map.

score 0 · Accepted Answer

This should help in merging all items from dict2 into dict1:

for item in dict2:
    if item in dict1:
        for leaf in dict2[item]:
            dict1[item][leaf] = dict2[item][leaf]
    else:
        dict1[item] = dict2[item]

Please test it and tell us whether this is what you wanted.

EDIT:

The solution above merges only one level, but it correctly solves the example given by the OP. To merge multiple levels, recursion should be used.
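A recursive version of the same loop might look like the sketch below. merge_into is a hypothetical name, and "dict2 wins when the two values are not both dicts" is an assumed conflict rule that mirrors the one-level loop.

```python
def merge_into(dict1, dict2):
    """Hypothetical recursive version: merge dict2 into dict1 at every depth."""
    for item, value in dict2.items():
        if isinstance(value, dict) and isinstance(dict1.get(item), dict):
            # both sides are dicts: descend one level
            merge_into(dict1[item], value)
        else:
            # new key or non-dict value: take it from dict2
            dict1[item] = value
    return dict1

dict1 = {1: {"a": {"x": 1}}, 2: {"b": "B"}}
dict2 = {1: {"a": {"y": 2}}, 2: {"c": "C"}, 3: {"d": "D"}}
merge_into(dict1, dict2)
```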

score 0 · Accepted Answer

I've been testing your solutions and decided to use this one in my project:

def mergedicts(dict1, dict2, conflict, no_conflict):
    for k in set(dict1.keys()).union(dict2.keys()):
        if k in dict1 and k in dict2:
            yield (k, conflict(dict1[k], dict2[k]))
        elif k in dict1:
            yield (k, no_conflict(dict1[k]))
        else:
            yield (k, no_conflict(dict2[k]))

dict1 = {1:{"a":"A"}, 2:{"b":"B"}}
dict2 = {2:{"c":"C"}, 3:{"d":"D"}}

#this helper function allows for recursion and the use of reduce
def f2(x, y):
    return dict(mergedicts(x, y, f2, lambda x: x))

from functools import reduce  # required in Python 3

print(dict(mergedicts(dict1, dict2, f2, lambda x: x)))
print(dict(reduce(f2, [dict1, dict2])))

Passing functions as parameters is the key to extending jterrace's solution so that it behaves like all the other recursive solutions.

score 0 · Accepted Answer

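Hey there, I had the same problem and came up with a solution, which I'm posting here in case it is useful for others as well: it basically merges nested dictionaries and also adds up the values. I needed to calculate some probabilities, so this one worked great for me: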

#used to copy a nested dict to a nested dict
def deepupdate(target, src):
    for k, v in src.items():
        if k in target:
            for k2, v2 in src[k].items():
                if k2 in target[k]:
                    target[k][k2]+=v2
                else:
                    target[k][k2] = v2
        else:
            target[k] = copy.deepcopy(v)

Using the above method, we can merge:

target = {'6,6': {'6,63': 1}, '63,4': {'4,4': 1}, '4,4': {'4,3': 1}, '6,63': {'63,4': 1}}

src = {'5,4': {'4,4': 1}, '5,5': {'5,4': 1}, '4,4': {'4,3': 1}}

and this will become: {'5,5': {'5,4': 1}, '5,4': {'4,4': 1}, '6,6': {'6,63': 1}, '63,4': {'4,4': 1}, '4,4': {'4,3': 2}, '6,63': {'63,4': 1}}

Also notice the changes here:

target = {'6,6': {'6,63': 1}, '6,63': {'63,4': 1}, '4,4': {'4,3': 1}, '63,4': {'4,4': 1}}

src = {'5,4': {'4,4': 1}, '4,3': {'3,4': 1}, '4,4': {'4,9': 1}, '3,4': {'4,4': 1}, '5,5': {'5,4': 1}}

merge = {'5,4': {'4,4': 1}, '4,3': {'3,4': 1}, '6,63': {'63,4': 1}, '5,5': {'5,4': 1}, '6,6': {'6,63': 1}, '3,4': {'4,4': 1}, '63,4': {'4,4': 1}, '4,4': {'4,3': 1, '4,9': 1}}

Also, don't forget to add the import for copy:

import copy
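deepupdate handles exactly two levels of nesting. If the depth is not fixed, the same idea can be written recursively. The following is only a sketch: deep_add is a hypothetical name, and it assumes that conflicting leaves support += (numbers, in the probability use case).

```python
import copy

def deep_add(target, src):
    """Hypothetical generalisation of deepupdate: add up leaves at any depth."""
    for k, v in src.items():
        if k not in target:
            # new key: copy it so target does not share structure with src
            target[k] = copy.deepcopy(v)
        elif isinstance(v, dict) and isinstance(target[k], dict):
            # both sides are dicts: descend
            deep_add(target[k], v)
        else:
            # both sides are leaves: add the values
            target[k] += v
    return target

target = {'4,4': {'4,3': 1}, '6,6': {'6,63': {'n': 2}}}
src = {'4,4': {'4,3': 1, '4,9': 1}, '6,6': {'6,63': {'n': 3}}}
deep_add(target, src)
```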

score 0 · Accepted Answer

I haven't tested this extensively, so feedback is welcome.

from collections import defaultdict

dict1 = defaultdict(list)
dict2 = defaultdict(list)
dict3 = defaultdict(list)

dict1 = dict(zip(Keys[ ], values[ ]))
dict2 = dict(zip(Keys[ ], values[ ]))

def mergeDict(dict1, dict2):
    dict3 = {**dict1, **dict2}
    for key, value in dict3.items():
        if key in dict1 and key in dict2:
            dict3[key] = [value, dict1[key]]
    return dict3

dict3 = mergeDict(dict1, dict2)

#sort keys alphabetically.
dict3.keys()

Merge two dictionaries and add the values of common keys

score 0 · Accepted Answer

The following function merges b into a.

def mergedicts(a, b):
    for key in b:
        if isinstance(a.get(key), dict) and isinstance(b.get(key), dict):
            mergedicts(a[key], b[key])
        else:
            a[key] = b[key]
    return a

score 0 · Accepted Answer

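And just another slight variation:

Here is a pure Python 3, set-based deep update function. It updates nested dictionaries by looping through one level at a time and calls itself to update the dictionary values at each next level: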

def deep_update(dict_original, dict_update):
    if isinstance(dict_original, dict) and isinstance(dict_update, dict):
        output=dict(dict_original)
        keys_original=set(dict_original.keys())
        keys_update=set(dict_update.keys())
        similar_keys=keys_original.intersection(keys_update)
        similar_dict={key:deep_update(dict_original[key], dict_update[key]) for key in similar_keys}
        new_keys=keys_update.difference(keys_original)
        new_dict={key:dict_update[key] for key in new_keys}
        output.update(similar_dict)
        output.update(new_dict)
        return output
    else:
        return dict_update

A simple example:

x={'a':{'b':{'c':1, 'd':1}}}
y={'a':{'b':{'d':2, 'e':2}}, 'f':2}

print(deep_update(x, y))
>>> {'a': {'b': {'c': 1, 'd': 2, 'e': 2}}, 'f': 2}

score 0 · Accepted Answer

class Utils(object):

    """

    >>> a = { 'first' : { 'all_rows' : { 'pass' : 'dog', 'number' : '1' } } }
    >>> b = { 'first' : { 'all_rows' : { 'fail' : 'cat', 'number' : '5' } } }
    >>> Utils.merge_dict(b, a) == { 'first' : { 'all_rows' : { 'pass' : 'dog', 'fail' : 'cat', 'number' : '5' } } }
    True

    >>> main = {'a': {'b': {'test': 'bug'}, 'c': 'C'}}
    >>> suply = {'a': {'b': 2, 'd': 'D', 'c': {'test': 'bug2'}}}
    >>> Utils.merge_dict(main, suply) == {'a': {'b': {'test': 'bug'}, 'c': 'C', 'd': 'D'}}
    True

    """

    @staticmethod
    def merge_dict(main, suply):
        """
        Return the merged dict: main is primary and suply supplements it; on conflict, main wins
        :return:
        """
        for key, value in suply.items():
            if key in main:
                if isinstance(main[key], dict):
                    if isinstance(value, dict):
                        Utils.merge_dict(main[key], value)
                    else:
                        pass
                else:
                    pass
            else:
                main[key] = value
        return main

if __name__ == '__main__':
    import doctest
    doctest.testmod()
