Merge/join lists of dictionaries based on a common value in Python

19👍

I’d use itertools.groupby to group the elements:

import itertools

# Sort first: groupby only groups consecutive items that share a key.
lst = sorted(itertools.chain(list_a, list_b), key=lambda x: x['user__id'])
list_c = []
for k, v in itertools.groupby(lst, key=lambda x: x['user__id']):
    d = {}
    for dct in v:
        d.update(dct)
    list_c.append(d)
    # could also do:
    # list_c.append(dict(itertools.chain.from_iterable(dct.items() for dct in v)))
    # although that might be a little harder to read.

If you have an aversion to lambda functions, you can always use operator.itemgetter('user__id') instead. (It’s probably slightly more efficient too.)
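As a rough sketch of that suggestion, here is the same grouping with operator.itemgetter in place of the lambda. The sample list_a and list_b are made up to match the shape of the data in the question, and the names get_id and merged are only illustrative.

```python
import itertools
import operator

# Sample inputs (assumed shape: every dict carries a 'user__id' key).
list_a = [{'user__name': 'Joe', 'user__id': 1},
          {'user__name': 'Bob', 'user__id': 3}]
list_b = [{'hours_worked': 25, 'user__id': 3},
          {'hours_worked': 40, 'user__id': 1}]

# One key function, reused for both sorting and grouping.
get_id = operator.itemgetter('user__id')

# groupby only groups consecutive items, so sort by the same key first.
lst = sorted(itertools.chain(list_a, list_b), key=get_id)

list_c = []
for _, group in itertools.groupby(lst, key=get_id):
    merged = {}
    for dct in group:
        merged.update(dct)  # later dicts overwrite earlier keys
    list_c.append(merged)

print(list_c)
```

Reusing a single key function also guarantees that the sort and the grouping agree on what counts as "the same user".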

To demystify lambda/itemgetter a little bit, note that:

def foo(x):
    return x['user__id']

is the same thing* as either of the following:

import operator

foo = operator.itemgetter('user__id')
foo = lambda x: x['user__id']

*There are a few differences, but they’re not important for this problem.
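For the curious, a small sketch of the equivalence and of two differences I'm aware of: the function's __name__, and the fact that itemgetter can fetch several keys at once. The names foo_getter, foo_lambda and record are made up for the illustration.

```python
import operator

def foo(x):
    return x['user__id']

foo_getter = operator.itemgetter('user__id')
foo_lambda = lambda x: x['user__id']

record = {'user__name': 'Joe', 'user__id': 1}

# All three return the same value for the same input.
results = [f(record) for f in (foo, foo_getter, foo_lambda)]
print(results)  # [1, 1, 1]

# Difference 1: a def keeps its name, a lambda is anonymous.
name_def = foo.__name__
name_lambda = foo_lambda.__name__
print(name_def, name_lambda)  # foo <lambda>

# Difference 2: itemgetter accepts several keys and returns a tuple.
pair = operator.itemgetter('user__id', 'user__name')(record)
print(pair)  # (1, 'Joe')
```

None of this changes how the three behave as a key function for sorted or groupby.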

6👍

from collections import defaultdict
from itertools import chain

list_a = [{'user__name': 'Joe', 'user__id': 1},
          {'user__name': 'Bob', 'user__id': 3}]
list_b = [{'hours_worked': 25, 'user__id': 3},
          {'hours_worked': 40, 'user__id': 1}]

collector = defaultdict(dict)

for collectible in chain(list_a, list_b):
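    # dict.update accepts another dict directly (iteritems() is Python 2 only).
    collector[collectible['user__id']].update(collectible)

# itervalues() is Python 2 only; values() works in Python 3.
list_c = list(collector.values())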

As you can see, this just uses another dict to merge the existing dicts. The trick with defaultdict is that it takes out the drudgery of creating a dict for a new entry.

There is no need to group or sort these inputs. The dict takes care of all of that.

A truly bulletproof solution would catch the potential KeyError in case an input dict does not have a 'user__id' key, or use a default value to collect up all of the dicts without such a key.
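A minimal sketch of the default-value variant, assuming that dicts without a 'user__id' should be merged into one separate bucket rather than raise. The MISSING sentinel, the orphans name and the extra 'Ann' record are all made up for the illustration.

```python
from collections import defaultdict
from itertools import chain

list_a = [{'user__name': 'Joe', 'user__id': 1},
          {'user__name': 'Bob', 'user__id': 3},
          {'user__name': 'Ann'}]  # hypothetical record with no 'user__id'
list_b = [{'hours_worked': 25, 'user__id': 3},
          {'hours_worked': 40, 'user__id': 1}]

# Unique sentinel so the default can never collide with a real id.
MISSING = object()

collector = defaultdict(dict)
for collectible in chain(list_a, list_b):
    # .get() avoids the KeyError; key-less dicts land under MISSING.
    key = collectible.get('user__id', MISSING)
    collector[key].update(collectible)

# Separate the key-less bucket from the properly merged records.
orphans = collector.pop(MISSING, {})
list_c = list(collector.values())

print(list_c)
print(orphans)
```

Note that all key-less dicts are merged into a single dict here, so overlapping keys overwrite each other. If that matters, appending them to a list instead would keep each one intact.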

👤Marcin
