Cartesian product of multiple lists of dictionaries - python

I have two or more lists, each of which is a list of dictionaries (similar to JSON), for example:
list_1 = [{'Name': 'John' , 'Age': 25} , {'Name': 'Mary' , 'Age': 15}]
list_2 = [{'Product': 'Car', 'Id': 1} , {'Product': 'TV' , 'Id': 2}]
cartesian_product(list_1 * list_2) = [{'Name': 'John', 'Age':25, 'Product': 'Car', 'Id': 1}, {'Name': 'John', 'Age':25, 'Product': 'TV', 'Id': 2}, {'Name': 'Mary' , 'Age': 15, 'Product': 'Car', 'Id': 1}, {'Name': 'Mary' , 'Age': 15, 'Product': 'TV', 'Id': 2}]
How can I do this efficiently with respect to memory? The way I'm doing it right now runs out of RAM with big lists. I know it probably involves itertools.product, but I couldn't figure out how to use it with a list of dicts. Thank you.
PS: This is how I'm doing it for the moment:
gen1 = (row for row in self.tables[0])
table = []
for row in gen1:
    gen2 = (dictionary for table in self.tables[1:] for dictionary in table)
    for element in gen2:
        new_row = {}
        new_row.update(row)
        new_row.update(element)
        table.append(new_row)
Thank you!

Here is a solution to the problem posted:
list_1 = [{'Name': 'John' , 'Age': 25} , {'Name': 'Mary' , 'Age': 15}]
list_2 = [{'Product': 'Car', 'Id': 1} , {'Product': 'TV' , 'Id': 2}]
from itertools import product
ret_list = []
for i1, i2 in product(list_1, list_2):
    merged = {}
    merged.update(i1)
    merged.update(i2)
    ret_list.append(merged)
The key here is to use the update method of dicts to add members. This version leaves the parent dicts unmodified and silently drops duplicate keys in favor of whichever value is seen last.
However, this will not help with memory usage. The simple fact is that if you want to do this operation in memory you will need to be able to store the starting lists and the resulting product. Alternatives include periodically writing to disk or breaking the starting data into chunks and deleting chunks as you go.
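If the merged rows can be consumed one at a time (written to a file, sent to a database, and so on), a generator avoids holding the whole product in memory. The following is a minimal sketch under that assumption; the function name iter_merged_product is made up for illustration and is not part of itertools. It accepts any number of lists, and later lists win on duplicate keys, as with update above.

```python
from itertools import product


def iter_merged_product(*tables):
    """Lazily yield one merged dict per combination of rows."""
    for rows in product(*tables):
        merged = {}
        for row in rows:
            # Later tables overwrite earlier ones on duplicate keys.
            merged.update(row)
        yield merged


list_1 = [{'Name': 'John', 'Age': 25}, {'Name': 'Mary', 'Age': 15}]
list_2 = [{'Product': 'Car', 'Id': 1}, {'Product': 'TV', 'Id': 2}]

# Only one merged dict exists at a time inside this loop.
for merged_row in iter_merged_product(list_1, list_2):
    print(merged_row)
```

The input lists still have to fit in memory, since product keeps a reference to each of them, but the result is never stored as a whole unless the caller collects it.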

Just convert the dictionaries to lists of items, take the product, and convert back to dictionaries again (this is Python 2 code):
import itertools
list_1 = [{'Name': 'John' , 'Age': 25} , {'Name': 'Mary' , 'Age': 15}]
list_2 = [{'Product': 'Car', 'Id': 1} , {'Product': 'TV' , 'Id': 2}]
l1 = [l.items() for l in list_1]
l2 = [l.items() for l in list_2]
print [dict(l[0] + l[1]) for l in itertools.product(l1, l2)]
The output is:
[{'Age': 25, 'Id': 1, 'Name': 'John', 'Product': 'Car'}, {'Age': 25,
'Id': 2, 'Name': 'John', 'Product': 'TV'}, {'Age': 15, 'Id': 1,
'Name': 'Mary', 'Product': 'Car'}, {'Age': 15, 'Id': 2, 'Name':
'Mary', 'Product': 'TV'}]
If this isn't memory-efficient enough for you, then try:
for d1, d2 in itertools.product(list_1, list_2):
    merged = dict(d1)
    merged.update(d2)
    # work with one merged dict at a time

For Python 3:
import itertools
list_1 = [{'Name': 'John' , 'Age': 25} , {'Name': 'Mary' , 'Age': 15}]
list_2 = [{'Product': 'Car', 'Id': 1} , {'Product': 'TV' , 'Id': 2}]
print([{**l[0], **l[1]} for l in itertools.product(list_1, list_2)])

Related

How to merge multi array of object in python?

I have two lists of dictionaries and I don't know how to merge them element by element in Python:
[{'name': 'James'}, {'name': 'Abhinay'}, {'name': 'Peter'}]
[{'age': 1}, {'age': 2}, {'age': 3}]
What I want:
[{'name': 'James','age':1}, {'name': 'Abhinay','age':2}, {'name': 'Peter','age':3}]
Here's one approach that can work:
L1 = [{'name': 'James'}, {'name': 'Abhinay'}, {'name': 'Peter'}]
L2 = [{'age': 1}, {'age': 2}, {'age': 3}]
result = [dict(**x, **y) for x, y in zip(L1, L2)]
print(result)
# [{'name': 'James', 'age': 1}, {'name': 'Abhinay', 'age': 2}, {'name': 'Peter', 'age': 3}]
Using the dict union | operator in Python 3.9 or higher:
result = [x | y for x, y in zip(L1, L2)]

Increment a key value in a list of dictionaries

I would like to add an id key to a list of dictionaries, where each id represents the enumerated nested dictionary.
Current list of dictionaries:
current_list_d = [{'id': 0, 'name': 'Paco', 'age': 18},  # all ids are 0
                  {'id': 0, 'name': 'John', 'age': 20},
                  {'id': 0, 'name': 'Claire', 'age': 22}]
Desired output:
output_list_d = [{'id': 1, 'name': 'Paco', 'age': 18},  # ids are counted/enumerated
                 {'id': 2, 'name': 'John', 'age': 20},
                 {'id': 3, 'name': 'Claire', 'age': 22}]
My code:
for d in current_list_d:
    d["id"]+=1
You could use a simple for loop with enumerate and update in-place the id keys in the dictionaries:
for new_id, d in enumerate(current_list_d, start=1):
    d['id'] = new_id
current_list_d
[{'id': 1, 'name': 'Paco', 'age': 18},
{'id': 2, 'name': 'John', 'age': 20},
{'id': 3, 'name': 'Claire', 'age': 22}]
You can also use a counter variable:
id_val = 1
for d in current_list_d:  # avoid naming the loop variable `dict`, which shadows the builtin
    d["id"] = id_val
    id_val += 1

Extract dict value from list of dict in python

I have a list of dicts like this:
[{'name': 'John', 'age': 25}, {'name': 'Matt', 'age': 35} , {'name': 'Peter', 'age': 40}]
How can I get the name for those whose age is between 20-30 ?
Many thanks for your help.
You would use something like this:
dicta = {'name': 'John', 'age': 25}, {'name': 'Matt', 'age': 35} , {'name': 'Peter', 'age': 40}
for val in dicta:
    if 20 <= val['age'] <= 30:
        print(val['name'])
You seem to be new to Python, so I suggest looking at some tutorials, for example the one on TutorialsPoint, which walks you through dictionaries.
Something like this should do it:
list_of_dicts = [{'name': 'John', 'age': 25}, {'name': 'Matt', 'age': 35} , {'name': 'Peter', 'age': 40}]
names = [d['name'] for d in list_of_dicts if 20 <= d['age'] <= 30]

Select list element where a field has the min value

Suppose I have a named list as follows:
myListOfPeople = [{'ID': 0, 'Name': 'Mary', 'Age': 25}, {'ID': 1, 'Name': 'John', 'Age': 28}]
I want to select the element (not only the field) where a specific field meets certain criteria, e.g., the element with the minimum 'Age'. Something like:
youngerPerson = [person for person in myListOfPeople if person = ***person with minimum age***]
And get as the answer:
>>youngerPerson: {'ID': 0, 'Name': 'Mary', 'Age': 25}
How can I do that?
You can use the key parameter of min:
>>> myListOfPeople = [{'ID': 0, 'Name': 'Mary', 'Age': 25}, {'ID': 1, 'Name': 'John', 'Age': 28}]
>>>
>>> min(myListOfPeople, key=lambda x: x["Age"])
{'ID': 0, 'Name': 'Mary', 'Age': 25}
>>>
You can use itemgetter :
from operator import itemgetter
myListOfPeople = [{'ID': 0, 'Name': 'Mary', 'Age': 25}, {'ID': 1, 'Name': 'John', 'Age': 28}]
sorted(myListOfPeople, key=itemgetter('Age'))[0]
# {'ID': 0, 'Name': 'Mary', 'Age': 25}
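The two answers above can be combined: min accepts the same itemgetter key, and it scans the list once instead of sorting all of it. The sketch below also shows the default argument of min (Python 3.4+), which avoids a ValueError on an empty list; the variable names are illustrative only.

```python
from operator import itemgetter

people = [{'ID': 0, 'Name': 'Mary', 'Age': 25},
          {'ID': 1, 'Name': 'John', 'Age': 28}]

# Single pass over the list; on ties, the first minimal element is returned.
youngest = min(people, key=itemgetter('Age'))
print(youngest)

# With an empty list, `default` is returned instead of raising ValueError.
nobody = min([], key=itemgetter('Age'), default=None)
print(nobody)
```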

Grouping items in Python Dictionary by common value

I have a nested dictionary that I'm iterating over. I'd like to make a new dictionary, derived from the old one, that groups certain values together based on a value present in the old dictionary. To illustrate:
{'name': Fido, 'breed': Dalmatian, 'age': 3}
{'name': Rex, 'breed': Dalmatian, 'age': 2}
{'name': Max, 'breed': Dalmatian, 'age': 0}
{'name': Rocky, 'breed': Pitbull, 'age': 6}
{'name': Buster, 'breed': Pitbull, 'age': 7}
Would give me:
Dalmatian: {'name': [Fido, Rex, Max], 'age': [3, 2, 0]}
Pitbull : {'name': [Rocky, Buster], 'age': [6, 7]}
I've tried to find an elegant and pythonic solution to this to no avail.
Here are two possibilities:
Example #1: http://ideone.com/RRzWaL
dogs = [
    {'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
    {'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
    {'name': 'Max', 'breed': 'Dalmatian', 'age': 0},
    {'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
    {'name': 'Buster', 'breed': 'Pitbull', 'age': 7},
]
# get rid of duplicates
breeds = set([ dog['breed'] for dog in dogs ])
breed_dict = {}
for breed in breeds:
    # get the names of all dogs corresponding to `breed`
    names = [ dog['name'] for dog in dogs if dog['breed'] == breed ]
    # get the ages of all dogs corresponding to `breed`
    ages = [ dog['age'] for dog in dogs if dog['breed'] == breed ]
    # add to the new dict
    breed_dict[breed] = { 'age': ages, 'name': names }
I'll also add a simplification of @JohnGordon's code using defaultdict from the collections module:
Example #2: http://ideone.com/B2xLGR
from collections import defaultdict
doglist = [
    {'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
    {'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
    {'name': 'Max', 'breed': 'Dalmatian', 'age': 0},
    {'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
    {'name': 'Buster', 'breed': 'Pitbull', 'age': 7},
]
dogdict = defaultdict(lambda: defaultdict(list))
for dog in doglist:
    # `defaultdict` allows us to not have to check whether
    # a key is already in the `dict`, it'll just set it to
    # a default (`[]` in the inner dict in our case)
    # if it's not there, and then append it.
    dogdict[dog['breed']]['name'].append(dog['name'])
    dogdict[dog['breed']]['age'].append(dog['age'])
Note that the second example using defaultdict will be faster than the first example, which has two separate list comprehensions (i.e., two separate inner loops).
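The single-pass idea can be wrapped in a small helper so that the grouping key and the collected fields are parameters. This is only a sketch; the function name group_fields and its signature are made up for illustration.

```python
from collections import defaultdict


def group_fields(records, key, fields):
    """Group `records` by `record[key]`, collecting each field into a list."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:  # one pass over the data
        for field in fields:
            grouped[record[key]][field].append(record[field])
    # Convert to plain dicts so the result prints and compares normally.
    return {k: dict(v) for k, v in grouped.items()}


sample_dogs = [
    {'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
    {'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
    {'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
]
by_breed = group_fields(sample_dogs, 'breed', ['name', 'age'])
print(by_breed)
```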
doglist = [
    {'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
    {'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
    {'name': 'Max', 'breed': 'Dalmatian', 'age': 0},
    {'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
    {'name': 'Buster', 'breed': 'Pitbull', 'age': 7},
]
dogdict = {}
for dog in doglist:
    if dog['breed'] in dogdict:
        dogdict[dog['breed']]['name'].append(dog['name'])
        dogdict[dog['breed']]['age'].append(dog['age'])
    else:
        dogdict[dog['breed']] = {'name': [dog['name']], 'age': [dog['age']]}
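Use itertools.groupby to segregate the dictionaries, then construct the new dictionaries. Note that groupby only groups consecutive items, so the input must already be sorted by breed (as it is here).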
import itertools, collections, operator
dees = [{'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
        {'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
        {'name': 'Max', 'breed': 'Dalmatian', 'age': 0},
        {'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
        {'name': 'Buster', 'breed': 'Pitbull', 'age': 7}]
breed = operator.itemgetter('breed')
filtr = ['name', 'age']
new_dees = []
for key, group in itertools.groupby(dees, breed):
    d = collections.defaultdict(list)
    for thing in group:
        for k, v in thing.items():
            if k in filtr:
                d[k].append(v)
    new_dees.append({key:d})
As an alternative you can just extract the values you want instead of using if k in filtr. I haven't decided which alternative I like best, so I'll post this also.
# using previously defined functions and variables
items_of_interest = operator.itemgetter(*filtr)
new_dees = []  # start fresh so results are not appended twice
for key, group in itertools.groupby(dees, breed):
    d = collections.defaultdict(list)
    for thing in group:
        values = items_of_interest(thing)
        for k, v in zip(filtr, values):
            d[k].append(v)
    new_dees.append({key:d})
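One caveat with the groupby approach is worth illustrating: groupby starts a new group every time the key changes, so records with the same breed that are not adjacent end up in separate groups. The sketch below uses a deliberately shuffled sample list (made up for illustration) and shows that sorting by the same key first fixes this.

```python
import itertools
import operator

shuffled_dogs = [
    {'name': 'Fido', 'breed': 'Dalmatian', 'age': 3},
    {'name': 'Rocky', 'breed': 'Pitbull', 'age': 6},
    {'name': 'Rex', 'breed': 'Dalmatian', 'age': 2},
]
by_breed_key = operator.itemgetter('breed')

# Without sorting, 'Dalmatian' shows up as two separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(shuffled_dogs, by_breed_key)]

# Sorting by the grouping key first gives one group per breed.
sorted_dogs = sorted(shuffled_dogs, key=by_breed_key)
sorted_keys = [key for key, _ in itertools.groupby(sorted_dogs, by_breed_key)]

print(unsorted_keys)
print(sorted_keys)
```

Sorting costs O(n log n), so for unsorted data the defaultdict approach shown earlier may be the simpler choice.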
