How do I merge multiple dictionaries' values having the same key in Python?

I have n dicts like this:
dict_1 = {1: {'Name': 'xyz', 'Title': 'Engineer'},
          2: {'Name': 'abc', 'Title': 'Software'}}
dict_2 = {1: {'Education': 'abc'}, 2: {'Education': 'xyz'}}
dict_3 = {1: {'Experience': 2}, 2: {'Experience': 3}}
.
.
.
dict_n
I just want to combine all of them based on the main key, like this:
final_dict = {1: {'Name': 'xyz', 'Title': 'Engineer', 'Education': 'abc', 'Experience': 2},
              2: {'Name': 'abc', 'Title': 'Software', 'Education': 'xyz', 'Experience': 3}}
Can anybody help me achieve this?

From your question, I take it you have n dicts. Put them in a list and, for each key, collect the values that share it. That alone doesn't give the exact answer, because each key then maps to a list of small dicts. So the second step is to merge those small dicts into one dict per key.
Here is my code:
d1 = {1: {'Name': 'xyz', 'Title': 'Engineer'},
      2: {'Name': 'abc', 'Title': 'Software'}}
d2 = {1: {'Education': 'abc'}, 2: {'Education': 'xyz'}}
d3 = {1: {'Experience': 2}, 2: {'Experience': 3}}
ds = [d1, d2, d3]  # list of your dicts; extend it up to dict_n
big_dict = {}
# Step 1: for each key, collect the inner dicts from every input dict
for k in ds[0]:
    big_dict[k] = [d[k] for d in ds]
# Step 2: merge each list of inner dicts into a single dict
for k in big_dict.keys():
    result = {}
    for d in big_dict[k]:
        result.update(d)
    big_dict[k] = result
print(big_dict)
It gives output like this:
{
1: {'Education': 'abc', 'Title': 'Engineer', 'Name': 'xyz',
'Experience': 2},
2: {'Education': 'xyz', 'Title': 'Software', 'Name': 'abc',
'Experience': 3}
}
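The loop above assumes that every top-level key of the first dict is present in all the other dicts; if one is missing, d[k] raises a KeyError. Below is a minimal sketch that tolerates missing or extra keys, assuming that later dicts should win when inner keys collide. The helper name merge_nested and the modified d3 are hypothetical, made up for illustration.

```python
from collections import defaultdict


def merge_nested(dicts):
    """Merge a sequence of {key: {field: value}} dicts by top-level key."""
    merged = defaultdict(dict)
    for d in dicts:
        for key, fields in d.items():
            # Later dicts overwrite earlier ones when inner keys collide.
            merged[key].update(fields)
    return dict(merged)


d1 = {1: {'Name': 'xyz', 'Title': 'Engineer'},
      2: {'Name': 'abc', 'Title': 'Software'}}
d2 = {1: {'Education': 'abc'}, 2: {'Education': 'xyz'}}
# Hypothetical variation: key 2 is missing here and key 3 is new.
d3 = {1: {'Experience': 2}, 3: {'Experience': 5}}

final_dict = merge_nested([d1, d2, d3])
print(final_dict)
```

Because each inner dict is built with update, the input dicts themselves are left unmodified.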

Related

Python list of dictionaries - adding the dicts with same key names [duplicate]

I have a python list like this:
user = [
{'name': 'ozzy', 'quantity': 5},
{'name': 'frank', 'quantity': 4},
{'name': 'ozzy', 'quantity': 3},
{'name': 'frank', 'quantity': 2},
{'name': 'james', 'quantity': 7},
]
I am trying to write code that joins the dictionaries with the same name and adds up their quantities. The final list should be:
user = [
{'name': 'ozzy', 'quantity': 8},
{'name': 'frank', 'quantity': 6},
{'name': 'james', 'quantity': 7}
]
I have tried a few things but I am struggling to get the right code. The code I have written below only partly adds the values (my actual list is much longer; I have included a small portion for reference).
newList = []
quan = 0
for i in range(0,len(user)):
    originator = user[i]['name']
    for j in range(i+1,len(user)):
        if originator == user[j]['name']:
            quan = user[i]['quantity'] + user[j]['quantity']
            newList.append({'name': originator, 'Quantity': quan})
Can you please help me get the correct code?
Just count the items in a collections.Counter, and expand back to list of dicts if needed:
user = [
{'name': 'ozzy', 'quantity': 5},
{'name': 'frank', 'quantity': 4},
{'name': 'ozzy', 'quantity': 3},
{'name': 'frank', 'quantity': 2},
{'name': 'james', 'quantity': 7},
]
import collections
d = collections.Counter()
for u in user:
    d[u['name']] += u['quantity']
print(dict(d))
newlist = [{'name': k, 'quantity': v} for k, v in d.items()]
print(newlist)
This prints the Counter contents as a plain dict first, which may already be sufficient:
{'frank': 6, 'ozzy': 8, 'james': 7}
and the reformatted output using list of dicts:
[{'name': 'frank', 'quantity': 6}, {'name': 'ozzy', 'quantity': 8}, {'name': 'james', 'quantity': 7}]
The solution is also straightforward with a standard dictionary. No need for Counter or OrderedDict here:
user = [
{'name': 'ozzy', 'quantity': 5},
{'name': 'frank', 'quantity': 4},
{'name': 'ozzy', 'quantity': 3},
{'name': 'frank', 'quantity': 2},
{'name': 'james', 'quantity': 7},
]
dic = {}
for item in user:
    # Look the fields up by key rather than relying on the order of item.values()
    n, q = item['name'], item['quantity']
    dic[n] = dic.get(n, 0) + q
print(dic)
user = [{'name': n, 'quantity': q} for n, q in dic.items()]
print(user)
Result:
{'ozzy': 8, 'frank': 6, 'james': 7}
[{'name': 'ozzy', 'quantity': 8}, {'name': 'frank', 'quantity': 6}, {'name': 'james', 'quantity': 7}]
I would suggest changing the shape of the output so that it is actually usable. Consider something like this:
user = [
{'name': 'ozzy', 'quantity': 5},
{'name': 'frank', 'quantity': 4},
{'name': 'ozzy', 'quantity': 3},
{'name': 'frank', 'quantity': 2},
{'name': 'james', 'quantity': 7},
]
data = {}
for i in user:
    if i['name'] in data:
        # name already seen: add to its running total
        data[i['name']] += i['quantity']
    else:
        # first time we see this name
        data[i['name']] = i['quantity']
print(data)
{'frank': 6, 'james': 7, 'ozzy': 8}
If you need to maintain the original relative order:
from collections import OrderedDict
user = [
{'name': 'ozzy', 'quantity': 5},
{'name': 'frank', 'quantity': 4},
{'name': 'ozzy', 'quantity': 3},
{'name': 'frank', 'quantity': 2},
{'name': 'james', 'quantity': 7},
]
d = OrderedDict()
for item in user:
    d[item['name']] = d.get(item['name'], 0) + item['quantity']
newlist = [{'name': k, 'quantity': v} for k, v in d.items()]
print(newlist)
Output:
[{'name': 'ozzy', 'quantity': 8}, {'name': 'frank', 'quantity': 6}, {'name': 'james', 'quantity': 7}]
user = [
{'name': 'ozzy', 'quantity': 8},
{'name': 'frank', 'quantity': 6},
{'name': 'james', 'quantity': 7}
]
reference_dict = {}
for item in user:
    reference_dict[item['name']] = reference_dict.get(item['name'], 0) + item['quantity']
# Create the new list from the reference dict
updated_user = [{'name': k, 'quantity': v} for k, v in reference_dict.items()]
print(updated_user)

Cartesian product of multiple lists of dictionaries

I have two or more lists, each of which is a list of dictionaries (something like JSON format), for example:
list_1 = [{'Name': 'John' , 'Age': 25} , {'Name': 'Mary' , 'Age': 15}]
list_2 = [{'Product': 'Car', 'Id': 1} , {'Product': 'TV' , 'Id': 2}]
cartesian_product(list_1 * list_2) = [{'Name': 'John', 'Age':25, 'Product': 'Car', 'Id': 1}, {'Name': 'John', 'Age':25, 'Product': 'TV', 'Id': 2}, {'Name': 'Mary' , 'Age': 15, 'Product': 'Car', 'Id': 1}, {'Name': 'Mary' , 'Age': 15, 'Product': 'TV', 'Id': 2}]
How can I do this while being efficient with memory? The way I'm doing it right now runs out of RAM with big lists. I know it probably involves itertools.product, but I couldn't figure out how to use it with a list of dicts. Thank you.
P.S. This is how I'm doing it at the moment:
gen1 = (row for row in self.tables[0])
table = []
for row in gen1:
    gen2 = (dictionary for table in self.tables[1:] for dictionary in table)
    for element in gen2:
        new_row = {}
        new_row.update(row)
        new_row.update(element)
        table.append(new_row)
Thank you!
Here is a solution to the problem posted:
list_1 = [{'Name': 'John' , 'Age': 25} , {'Name': 'Mary' , 'Age': 15}]
list_2 = [{'Product': 'Car', 'Id': 1} , {'Product': 'TV' , 'Id': 2}]
from itertools import product
ret_list = []
for i1, i2 in product(list_1, list_2):
    merged = {}
    merged.update(i1)
    merged.update(i2)
    ret_list.append(merged)
The key here is to use the update method of dicts to add members. This version leaves the parent dicts unmodified, and when keys collide it silently keeps whichever value is seen last.
However, this will not help with memory usage. The simple fact is that if you want to do this operation in memory you will need to be able to store the starting lists and the resulting product. Alternatives include periodically writing to disk or breaking the starting data into chunks and deleting chunks as you go.
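If the merged rows can be consumed one at a time (written to disk, filtered, or aggregated), a generator avoids holding the whole product in memory; the input lists still have to fit. The following is a sketch under that assumption, not a drop-in replacement for the code above, and iter_merged_product is a hypothetical helper name.

```python
from itertools import product


def iter_merged_product(*tables):
    """Yield one merged dict per combination, without building the full list."""
    for combo in product(*tables):
        merged = {}
        for row in combo:
            # Later tables win if the same key appears more than once.
            merged.update(row)
        yield merged


list_1 = [{'Name': 'John', 'Age': 25}, {'Name': 'Mary', 'Age': 15}]
list_2 = [{'Product': 'Car', 'Id': 1}, {'Product': 'TV', 'Id': 2}]

for row in iter_merged_product(list_1, list_2):
    print(row)  # each merged row can be discarded after it is processed
```

The function accepts any number of lists, so it also covers the "two or more" case from the question.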
Just convert the dictionaries to lists, take the product, and back to dictionaries again:
import itertools
list_1 = [{'Name': 'John' , 'Age': 25} , {'Name': 'Mary' , 'Age': 15}]
list_2 = [{'Product': 'Car', 'Id': 1} , {'Product': 'TV' , 'Id': 2}]
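# list(...) is needed on Python 3, where items() returns a view that cannot be joined with +
l1 = [list(l.items()) for l in list_1]
l2 = [list(l.items()) for l in list_2]
print([dict(l[0] + l[1]) for l in itertools.product(l1, l2)])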
The output is:
[{'Age': 25, 'Id': 1, 'Name': 'John', 'Product': 'Car'}, {'Age': 25,
'Id': 2, 'Name': 'John', 'Product': 'TV'}, {'Age': 15, 'Id': 1,
'Name': 'Mary', 'Product': 'Car'}, {'Age': 15, 'Id': 2, 'Name':
'Mary', 'Product': 'TV'}]
If this isn't memory-efficient enough for you, then try:
for l in itertools.product((l1.items() for l1 in list_1),
                           (l2.items() for l2 in list_2)):
    # work with one product at a time
    row = dict(itertools.chain(*l))
For Python 3.5 and later:
import itertools
list_1 = [{'Name': 'John' , 'Age': 25} , {'Name': 'Mary' , 'Age': 15}]
list_2 = [{'Product': 'Car', 'Id': 1} , {'Product': 'TV' , 'Id': 2}]
print([{**l[0], **l[1]} for l in itertools.product(list_1, list_2)])

What Is a Pythonic Way to Build a Dict of Dictionary-Lists by Attribute?

I'm looking for a Pythonic way to convert a list of dicts that looks like this:
res = [{'type': 1, 'name': 'Nick'}, {'type': 2, 'name': 'Helma'}, ...]
into a dict like this:
{1: [{'type': 1, 'name': 'Nick'}, ...], 2: [{'type': 2, 'name': 'Helma'}, ...]}
Right now I do it with code like this (based on this question):
from collections import defaultdict

d = defaultdict(list)
for v in res:
    d[v["type"]].append(v)
Is this a Pythonic way to build a dict of lists of objects by attribute?
I agree with the commentators that here, list comprehension will lack, well, comprehension.
Having said that, here's how it can go:
import itertools
a = [{'type': 1, 'name': 'Nick'}, {'type': 2, 'name': 'Helma'}, {'type': 1, 'name': 'Moshe'}]
by_type = lambda a: a['type']
>>> dict([(k, list(g)) for (k, g) in itertools.groupby(sorted(a, key=by_type), key=by_type)])
{1: [{'name': 'Nick', 'type': 1}, {'name': 'Moshe', 'type': 1}], ...}
The code first sorts by 'type', then uses itertools.groupby to group by the exact same criteria.
I stopped understanding this code 15 seconds after I finished writing it :-)
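The sort is not optional: itertools.groupby only groups consecutive items with equal keys, so unsorted input produces the same key more than once. A small sketch using the sample data from this answer (operator.itemgetter is swapped in for the lambda purely for brevity):

```python
from itertools import groupby
from operator import itemgetter

a = [{'type': 1, 'name': 'Nick'},
     {'type': 2, 'name': 'Helma'},
     {'type': 1, 'name': 'Moshe'}]
by_type = itemgetter('type')

# Without sorting, type 1 shows up as two separate runs.
unsorted_keys = [k for k, _ in groupby(a, key=by_type)]

# Sorting first brings equal keys together, so each key appears once.
grouped = {k: list(g) for k, g in groupby(sorted(a, key=by_type), key=by_type)}

print(unsorted_keys)
print(grouped)
```

Since sorted is stable, items within each group keep their original relative order.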
You could do it with a dictionary comprehension, which wouldn't be as illegible or incomprehensible as the comments suggest (IMHO):
# A collection of name and type dictionaries
res = [{'type': 1, 'name': 'Nick'},
       {'type': 2, 'name': 'Helma'},
       {'type': 3, 'name': 'Steve'},
       {'type': 1, 'name': 'Billy'},
       {'type': 3, 'name': 'George'},
       {'type': 4, 'name': 'Sylvie'},
       {'type': 2, 'name': 'Wilfred'},
       {'type': 1, 'name': 'Jim'}]
# Creating a dictionary by type
res_new = {
    item['type']: [each for each in res
                   if each['type'] == item['type']]
    for item in res
}
>>> res_new
{1: [{'name': 'Nick', 'type': 1},
     {'name': 'Billy', 'type': 1},
     {'name': 'Jim', 'type': 1}],
 2: [{'name': 'Helma', 'type': 2},
     {'name': 'Wilfred', 'type': 2}],
 3: [{'name': 'Steve', 'type': 3},
     {'name': 'George', 'type': 3}],
 4: [{'name': 'Sylvie', 'type': 4}]}
Unless I missed something, this should give you the result you're looking for.

Grouping items in Python Dictionary by common value

I have a nested dictionary that I'm iterating over. I'd like to make a new dictionary derived from the old one that groups certain values together based on a value present in the old dictionary. To illustrate:
{'name': Fido, 'breed': Dalmatian, 'age': 3}
{'name': Rex, 'breed': Dalmatian, 'age': 2}
{'name': Max, 'breed': Dalmatian, 'age': 0}
{'name': Rocky, 'breed': Pitbull, 'age': 6}
{'name': Buster, 'breed': Pitbull, 'age': 7}
Would give me:
Dalmatian: {'name': [Fido, Rex, Max], 'age': [3, 2, 0]}
Pitbull: {'name': [Rocky, Buster], 'age': [6, 7]}
I've tried to find an elegant and pythonic solution to this to no avail.
Here are two possibilities:
Example #1: http://ideone.com/RRzWaL
dogs = [
{'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
{'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
{'name': 'Max', 'breed': 'Dalmatian', 'age': 0},
{'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
{'name': 'Buster', 'breed': 'Pitbull', 'age': 7},
]
# get rid of duplicates
breeds = set([dog['breed'] for dog in dogs])
breed_dict = {}
for breed in breeds:
    # get the names of all dogs corresponding to `breed`
    names = [dog['name'] for dog in dogs if dog['breed'] == breed]
    # get the ages of all dogs corresponding to `breed`
    ages = [dog['age'] for dog in dogs if dog['breed'] == breed]
    # add to the new dict
    breed_dict[breed] = {'age': ages, 'name': names}
I'll also add a simplification of @JohnGordon's code using defaultdict from collections:
Example #2: http://ideone.com/B2xLGR
from collections import defaultdict
doglist = [
{'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
{'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
{'name': 'Max', 'breed': 'Dalmatian', 'age': 0},
{'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
{'name': 'Buster', 'breed': 'Pitbull', 'age': 7},
]
dogdict = defaultdict(lambda: defaultdict(list))
for dog in doglist:
    # `defaultdict` allows us to not have to check whether
    # a key is already in the `dict`, it'll just set it to
    # a default (`[]` in the inner dict in our case)
    # if it's not there, and then append it.
    dogdict[dog['breed']]['name'].append(dog['name'])
    dogdict[dog['breed']]['age'].append(dog['age'])
Note that the second example using defaultdict will be faster than the first example, which has two separate list comprehensions (i.e., two separate inner loops).
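The single-pass defaultdict approach generalizes to any grouping key and any set of fields. Here is a minimal sketch, assuming every record contains the key and all requested fields; group_fields and its parameter names are hypothetical.

```python
from collections import defaultdict


def group_fields(records, key, fields):
    """Group `records` by `key`, collecting each of `fields` into a list."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        for field in fields:
            grouped[record[key]][field].append(record[field])
    # Convert to plain dicts so the result prints and compares cleanly.
    return {k: dict(v) for k, v in grouped.items()}


doglist = [
    {'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
    {'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
    {'name': 'Max', 'breed': 'Dalmatian', 'age': 0},
    {'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
    {'name': 'Buster', 'breed': 'Pitbull', 'age': 7},
]

by_breed = group_fields(doglist, 'breed', ['name', 'age'])
print(by_breed)
```

Converting back to plain dicts at the end also prevents later lookups of unknown breeds from silently creating empty entries.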
doglist = [
{'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
{'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
{'name': 'Max', 'breed': 'Dalmatian', 'age': 0},
{'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
{'name': 'Buster', 'breed': 'Pitbull', 'age': 7},
]
dogdict = {}
for dog in doglist:
    if dog['breed'] in dogdict:
        dogdict[dog['breed']]['name'].append(dog['name'])
        dogdict[dog['breed']]['age'].append(dog['age'])
    else:
        dogdict[dog['breed']] = {'name': [dog['name']], 'age': [dog['age']]}
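Use itertools.groupby to segregate the dictionaries, then construct the new dictionaries. Note that groupby only groups consecutive items, so the list must already be ordered by breed, as it is here.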
import itertools, collections, operator
dees = [{'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
{'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
{'name': 'Max', 'breed': 'Dalmatian', 'age': 0},
{'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
{'name': 'Buster', 'breed': 'Pitbull', 'age': 7}]
breed = operator.itemgetter('breed')
filtr = ['name', 'age']
new_dees = []
for key, group in itertools.groupby(dees, breed):
    d = collections.defaultdict(list)
    for thing in group:
        for k, v in thing.items():
            if k in filtr:
                d[k].append(v)
    new_dees.append({key: d})
As an alternative, you can extract just the values you want instead of using if k in filtr. I haven't decided which alternative I like best, so I'll post this one too.
# using previously defined functions and variables
items_of_interest = operator.itemgetter(*filtr)
new_dees = []  # start from an empty list again
for key, group in itertools.groupby(dees, breed):
    d = collections.defaultdict(list)
    for thing in group:
        values = items_of_interest(thing)
        for k, v in zip(filtr, values):
            d[k].append(v)
    new_dees.append({key: d})

Given a list of dictionaries, how can I eliminate duplicates of one key, and sort by another

I'm working with a list of dict objects that looks like this (the order of the objects differs):
[
{'name': 'Foo', 'score': 1},
{'name': 'Bar', 'score': 2},
{'name': 'Foo', 'score': 3},
{'name': 'Bar', 'score': 3},
{'name': 'Foo', 'score': 2},
{'name': 'Baz', 'score': 2},
{'name': 'Baz', 'score': 1},
{'name': 'Bar', 'score': 1}
]
What I want to do is remove duplicate names, keeping only the one of each name that has the highest 'score'. The results from the above list would be:
[
{'name': 'Baz', 'score': 2},
{'name': 'Foo', 'score': 3},
{'name': 'Bar', 'score': 3}
]
I'm not sure which pattern to use here (aside from a seemingly idiotic loop that keeps checking whether the current dict's 'name' is already in the list and then whether its 'score' is higher than the existing one's 'score').
One way to do that is:
import collections

data = collections.defaultdict(list)
for i in my_list:  # my_list is the list of dicts from the question
    data[i['name']].append(i['score'])
output = [{'name': i, 'score': max(j)} for i, j in data.items()]
so output will be:
[{'score': 2, 'name': 'Baz'},
{'score': 3, 'name': 'Foo'},
{'score': 3, 'name': 'Bar'}]
There's no need for defaultdicts or sets here. You can just use dirt simple dicts and lists.
Summarize the best running score in a dictionary and convert the result back into a list:
>>> s = [
...     {'name': 'Foo', 'score': 1},
...     {'name': 'Bar', 'score': 2},
...     {'name': 'Foo', 'score': 3},
...     {'name': 'Bar', 'score': 3},
...     {'name': 'Foo', 'score': 2},
...     {'name': 'Baz', 'score': 2},
...     {'name': 'Baz', 'score': 1},
...     {'name': 'Bar', 'score': 1}
... ]
>>> d = {}
>>> for entry in s:
...     name, score = entry['name'], entry['score']
...     d[name] = max(d.get(name, 0), score)
...
>>> [{'name': name, 'score': score} for name, score in d.items()]
[{'score': 2, 'name': 'Baz'}, {'score': 3, 'name': 'Foo'}, {'score': 3, 'name': 'Bar'}]
Just for fun, here is a purely functional approach:
>>> list(map(dict, dict(sorted(map(sorted, map(dict.items, s)))).items()))
[{'score': 3, 'name': 'Bar'}, {'score': 2, 'name': 'Baz'}, {'score': 3, 'name': 'Foo'}]
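One caveat about the running-maximum loop above: d.get(name, 0) uses 0 as the starting value, so a name whose scores are all negative would end up with 0. Here is a sketch that keeps the whole best record instead, which needs no default score and preserves any extra fields. The 'Qux' entry is hypothetical, added only to show the negative case.

```python
scores = [
    {'name': 'Foo', 'score': 1},
    {'name': 'Bar', 'score': 2},
    {'name': 'Foo', 'score': 3},
    {'name': 'Bar', 'score': 3},
    {'name': 'Foo', 'score': 2},
    {'name': 'Baz', 'score': 2},
    {'name': 'Baz', 'score': 1},
    {'name': 'Bar', 'score': 1},
    {'name': 'Qux', 'score': -5},  # hypothetical entry with a negative score
]

best = {}
for entry in scores:
    name = entry['name']
    # Store the record itself; replace it only when a higher score appears.
    if name not in best or entry['score'] > best[name]['score']:
        best[name] = entry

result = list(best.values())
print(result)
```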
Sorting is half the battle.
import itertools
import operator
scores = [
{'name': 'Foo', 'score': 1},
{'name': 'Bar', 'score': 2},
{'name': 'Foo', 'score': 3},
{'name': 'Bar', 'score': 3},
{'name': 'Foo', 'score': 2},
{'name': 'Baz', 'score': 2},
{'name': 'Baz', 'score': 1},
{'name': 'Bar', 'score': 1}
]
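result = []
# sort by name, then score, both descending, so the best score comes first for each name
sl = sorted(scores, key=operator.itemgetter('name', 'score'),
            reverse=True)
name = object()  # sentinel that never equals a real name
for el in sl:
    if el['name'] == name:
        continue
    name = el['name']
    result.append(el)
print(result)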
This is the simplest way I can think of:
names = set(d['name'] for d in my_dicts)
new_dicts = []
for name in names:
    d = dict(name=name)
    # a different loop variable avoids confusion with the new dict `d`
    d['score'] = max(x['score'] for x in my_dicts if x['name'] == name)
    new_dicts.append(d)
# new_dicts
[{'score': 2, 'name': 'Baz'},
{'score': 3, 'name': 'Foo'},
{'score': 3, 'name': 'Bar'}]
Personally, I prefer not to import modules when the problem is too small.
In case you haven't heard of groupby, this is a nice use of it:
from itertools import groupby
data=[
{'name': 'Foo', 'score': 1},
{'name': 'Bar', 'score': 2},
{'name': 'Foo', 'score': 3},
{'name': 'Bar', 'score': 3},
{'name': 'Foo', 'score': 2},
{'name': 'Baz', 'score': 2},
{'name': 'Baz', 'score': 1},
{'name': 'Bar', 'score': 1}
]
keyfunc = lambda d: d['name']
data.sort(key=keyfunc)  # groupby needs the data sorted by the same key
ans = []
for k, g in groupby(data, keyfunc):
    ans.append({k: max(d['score'] for d in g)})
print(ans)
>>>
[{'Bar': 3}, {'Baz': 2}, {'Foo': 3}]
I think I can come up with a one-liner here:
result = dict((x['name'],x) for x in sorted(data,key=lambda x: x['score'])).values()
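The one-liner works because the records are sorted by ascending score, so when the dict is built, a later (higher-scoring) record for the same name overwrites an earlier one. On Python 3, values() returns a view rather than a list, so the sketch below wraps it in list(); the dict comprehension is an equivalent spelling of the dict(...) call, and data is the sample list from the question.

```python
data = [
    {'name': 'Foo', 'score': 1},
    {'name': 'Bar', 'score': 2},
    {'name': 'Foo', 'score': 3},
    {'name': 'Bar', 'score': 3},
    {'name': 'Foo', 'score': 2},
    {'name': 'Baz', 'score': 2},
    {'name': 'Baz', 'score': 1},
    {'name': 'Bar', 'score': 1},
]

# Ascending sort by score: the last record seen for each name is its best one.
by_score = sorted(data, key=lambda x: x['score'])
result = list({x['name']: x for x in by_score}.values())
print(result)
```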
