Merge 2 lists containing objects with matching ids - python

I have 2 lists that contain objects that look like this:
list 1:
{'name': 'Nick', 'id': '123456'}
list 2:
{'address': 'London', 'id': '123456'}
Now I want to create a third list, containing objects that look like this:
{'name': 'Nick', 'address': 'London', 'id': '123456'}
i.e., I want to find the matching ids and merge those objects.

You can use groupby to collect the dicts that share an id, then merge each group using ChainMap. groupby only groups adjacent items, so the combined list is sorted by id first:
from itertools import groupby
from operator import itemgetter
from collections import ChainMap
list1 = [{'name': 'Nick', 'id': '123456'}, {'name': 'Donald', 'id': '999'}]
list2 = [{'address': 'London', 'id': '123456'}, {'address': 'NYC', 'id': '999'}]
grouped_subdicts = groupby(sorted(list1 + list2, key=itemgetter("id")), itemgetter("id"))
result = [dict(ChainMap(*g)) for k, g in grouped_subdicts]
print(result)
Output:
[{'id': '123456', 'address': 'London', 'name': 'Nick'},
{'id': '999', 'address': 'NYC', 'name': 'Donald'}]
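If sorting the combined list is not desirable, the same merge can be sketched with a plain dictionary keyed by id. This is only a minimal sketch, not the approach of the answer above: the function name merge_by_id and its key parameter are made up for illustration, and it assumes every dict has an 'id' key.

```python
def merge_by_id(*lists, key="id"):
    """Merge dicts that share the same `key` value (hypothetical helper)."""
    merged = {}
    for lst in lists:
        for d in lst:
            # setdefault creates an empty dict the first time an id is seen
            merged.setdefault(d[key], {}).update(d)
    return list(merged.values())


list1 = [{'name': 'Nick', 'id': '123456'}, {'name': 'Donald', 'id': '999'}]
list2 = [{'address': 'London', 'id': '123456'}, {'address': 'NYC', 'id': '999'}]
result = merge_by_id(list1, list2)
print(result)
```

One difference worth noting: if two dicts with the same id disagree on some other key, this sketch keeps the value from the later list, whereas ChainMap keeps the value from the first mapping it was given.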

Related

Comprehension list of nested dictionary to get values but got keys

Fairly new to list comprehension and have the_list that I want to extract the keys of nested dictionary but I got the values instead. What am I missing or what am I doing wrong?
the_list = [{'size': 0, 'values': [], 'start': 0}, {'size': 2, 'values': [{'user': {'name': 'anna', 'id': 10, 'displayName': 'Anna'}, 'category': 'Secretary'}, {'user': {'name': 'bob', 'id': 11, 'displayName': 'Bobby'}, 'category': 'Manager'}], 'start': 0}, {'size': 1, 'values': [{'user': {'name': 'claire', 'id': 13, 'displayName': 'Clarissa Claire'}, 'category': 'Secretary'}], 'start': 0}]
list_comprehension = []
list_comprehension = [x for x in the_list for x in the_list[1]['values'][0]]
print(list_comprehension)
>> ['user', 'category', 'user', 'category', 'user', 'category']
Want
list_comprehension = [[anna, Secretary], [bob, manager], [claire, Secretary]]
You could use this. I personally try to avoid nested list comprehensions, as they are hard to read and debug.
[[x['category'], x['user']['displayName']] for nest_list in the_list for x in nest_list["values"] ]
Output:
[['Secretary', 'Anna'], ['Manager', 'Bobby'], ['Secretary', 'Clarissa Claire']]
EDIT:
A version without a nested list comprehension. While writing it I realised there is one more level of nesting than I first thought, which makes this version a bit long. So in the end I'm not sure which one I would use in production.
result = []
dict_list = [nest_list["values"] for nest_list in the_list]
for elt in dict_list:
    for d in elt:
        result.append([d['category'], d['user']['displayName']])
I've come up with this solution, but it is not very readable ...
comprehensionList = [[user['user']['name'], user['category']] for x in the_list for user in x['values']]
# Output
[['anna', 'Secretary'], ['bob', 'Manager'], ['claire', 'Secretary']]
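For readers who find either comprehension hard to follow, the same traversal can be written as a small generator. This is a sketch only: the name iter_user_rows is hypothetical, and it assumes every entry under 'values' has 'user' and 'category' keys, as in the_list above.

```python
def iter_user_rows(pages):
    """Yield [name, category] for each entry under every 'values' list."""
    for page in pages:
        # .get tolerates a page that has no 'values' key at all
        for entry in page.get('values', []):
            yield [entry['user']['name'], entry['category']]


the_list = [
    {'size': 0, 'values': [], 'start': 0},
    {'size': 2, 'values': [
        {'user': {'name': 'anna', 'id': 10, 'displayName': 'Anna'}, 'category': 'Secretary'},
        {'user': {'name': 'bob', 'id': 11, 'displayName': 'Bobby'}, 'category': 'Manager'},
    ], 'start': 0},
    {'size': 1, 'values': [
        {'user': {'name': 'claire', 'id': 13, 'displayName': 'Clarissa Claire'}, 'category': 'Secretary'},
    ], 'start': 0},
]
rows = list(iter_user_rows(the_list))
print(rows)
```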

Adding key and value to dictionary in python based on other dictionaries

I am using for loop in python and every loop creates a dictionary. I have the below set of dictionaries created.
{'name': 'xxxx'}
{'name': 'yyyy','age':'28'}
{'name': 'zzzz','age':'27','sex':'F'}
My requirement is to compare all the dictionaries created and find out the missing key values and add the key to missing dictionaries and order every dictionary based on key. Below is the expected output
Expected output:
{'age':'','name': 'xxxx','sex':''}
{'age':'28','name': 'yyyy','sex':''}
{'age':'27','name': 'zzzz','sex':'F'}
How to achieve this in python.
If you want to modify the dicts in-place, dict.setdefault would be easy enough.
my_dicts = [
{'name': 'xxxx'},
{'name': 'yyyy','age':'28'},
{'name': 'zzzz','age':'27','sex':'F'},
]
desired_keys = ['name', 'age', 'sex']
for d in my_dicts:
    for key in desired_keys:
        d.setdefault(key, "")
print(my_dicts)
prints out
[
{'name': 'xxxx', 'age': '', 'sex': ''},
{'name': 'yyyy', 'age': '28', 'sex': ''},
{'name': 'zzzz', 'age': '27', 'sex': 'F'},
]
If you don't want to hard-code the desired_keys list, you can make it a set and gather it from the dicts before the loop above.
desired_keys = set()
for d in my_dicts:
    desired_keys.update(set(d))  # update with keys from `d`
Another option, if you want new dicts instead of modifying them in place, is
desired_keys = ... # whichever method you like
empty_dict = dict.fromkeys(desired_keys, "")
new_dicts = [{**empty_dict, **d} for d in my_dicts]
EDIT based on comments:
The code above doesn't remove keys that are not in desired_keys.
This will leave only the desired keys:
desired_keys = ... # Must be a set
for d in my_dicts:
    for key in desired_keys:
        d.setdefault(key, "")
    for key in set(d) - desired_keys:
        d.pop(key)
However, at that point it might be easier to just create new dicts:
new_dicts = [
    {key: d.get(key, "") for key in desired_keys}
    for d in my_dicts
]
data = [{'name': 'xxxx'},
{'name': 'yyyy','age':'28'},
{'name': 'zzzz','age':'27','sex':'F'}]
First take the dictionary with the most keys to get all the keys (this assumes one dictionary contains every key, as the last one does here).
Then use dict.get with an empty string as the default for each key, and sort each dictionary by key. You can combine a list comprehension and a dict comprehension:
allKD = max(data, key=len)
[dict(sorted({k:d.get(k, '') for k in allKD}.items(), key=lambda x:x[0])) for d in data]
OUTPUT:
[{'age': '', 'name': 'xxxx', 'sex': ''},
{'age': '28', 'name': 'yyyy', 'sex': ''},
{'age': '27', 'name': 'zzzz', 'sex': 'F'}]
One approach:
from operator import or_
from functools import reduce
lst = [{'name': 'xxxx'},
{'name': 'yyyy', 'age': '28'},
{'name': 'zzzz', 'age': '27', 'sex': 'F'}]
# find all the keys
keys = reduce(or_, map(dict.keys, lst))
# update each dictionary with the complement of the keys
for d in lst:
    d.update(dict.fromkeys(keys - d.keys(), ""))
print(lst)
Output
[{'name': 'xxxx', 'age': '', 'sex': ''}, {'name': 'yyyy', 'age': '28', 'sex': ''}, {'name': 'zzzz', 'age': '27', 'sex': 'F'}]
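None of the in-place versions above also sorts each dictionary by key, which the question asked for. Here is a minimal sketch combining both steps, assuming the keys are mutually comparable (for example, all strings). The name normalize is hypothetical, and unlike max(data, key=len) it collects the keys from every dictionary rather than relying on one dictionary holding them all.

```python
def normalize(dicts, fill=""):
    """Return new dicts that all share the same keys, in sorted key order."""
    # union of the keys of every dict, sorted once
    all_keys = sorted(set().union(*dicts))
    return [{key: d.get(key, fill) for key in all_keys} for d in dicts]


data = [{'name': 'xxxx'},
        {'name': 'yyyy', 'age': '28'},
        {'name': 'zzzz', 'age': '27', 'sex': 'F'}]
normalized = normalize(data)
print(normalized)
```

Because it builds new dictionaries, the originals in data are left unchanged.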

How to iterate through 2 zipped list of dictionaries under certain conditions?

There are 4 lists of dictionaries. The first two need to be added to the database; the other two already exist in the database:
to_add1 = [{'name': 'Kate', 'age': 25, 'id': 1234},
{'name': 'Claire', 'age': 25, 'id': 4567},
{'name': 'Bob', 'age': 25, 'id': 8910}]
to_add2 = [{'pets': 5, 'name_id': 1234},
{'pets': 0, 'name_id': 4567},
{'pets': 0, 'name_id': 8910}]
existing1 = [{'name': 'John', 'age': 50, 'id': 0000},
{'name': 'Claire', 'age': 25, 'id': 4567}]
existing2 = [{'pets': 2, 'name_id': 0000},
{'pets': 0, 'name_id': 4567}]
I would like to add to the database (or, in this reproducible example, print) only the items in to_add1 whose name does not already appear in existing1. So in this case Kate and Bob from to_add1 should each be printed once. Because of the double loop, I am getting repetitions. How can I loop through these dictionaries and print the items that do not have a matching name in existing1, without repetitions?
My code:
for person_to_add, pet_to_add in zip(to_add1, to_add2):
    for existing_person, existing_pet in zip(existing1, existing2):
        if person_to_add['name'] not in existing_person['name']:
            print(person_to_add)
Current output:
{'name': 'Kate', 'age': 25, 'id': 1234}
{'name': 'Kate', 'age': 25, 'id': 1234}
{'name': 'Claire', 'age': 25, 'id': 4567}
{'name': 'Bob', 'age': 25, 'id': 8910}
{'name': 'Bob', 'age': 25, 'id': 8910}
Desired output:
{'name': 'Kate', 'age': 25, 'id': 1234}
{'name': 'Bob', 'age': 25, 'id': 8910}
Depending on the complexity of the actual underlying query this might not work, but here is a solution that takes advantage of the list-of-dictionaries structure you are using. It is not clear to me why you would need to use zip.
a = [person_to_add for person_to_add in to_add1 if person_to_add['name'] not in [existing_person['name'] for existing_person in existing1]]
This uses list comprehensions, which should be pretty fast. The output 'a' is in the same format as all your other data, i.e. a list of dicts. Do let me know if you need a further breakdown of how the list comprehension works.
I assume you mean the output should be Kate & Bob because Claire is in existing1:
>>> for person_to_add, pet_to_add in zip(to_add1, to_add2):
...     if person_to_add not in existing1:
...         print(person_to_add)
...
{'name': 'Kate', 'age': 25, 'id': 1234}
{'name': 'Bob', 'age': 25, 'id': 8910}
lst = [print(person_to_add) for person_to_add in to_add1 if person_to_add not in existing1]
It might not be the fastest solution, but it gets the job done. It uses a Python list comprehension and is easy to read.
As you described, you only want to compare the data in existing1 and to_add1, so you don't need nested loops, which add unnecessary complexity. A single loop is enough.
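With larger lists, the names already present can be collected into a set once, so that each membership test does not rescan existing1. This is a sketch that assumes names identify people uniquely (the question compares by name); existing_names and new_people are names introduced here for illustration.

```python
to_add1 = [{'name': 'Kate', 'age': 25, 'id': 1234},
           {'name': 'Claire', 'age': 25, 'id': 4567},
           {'name': 'Bob', 'age': 25, 'id': 8910}]
existing1 = [{'name': 'John', 'age': 50, 'id': 0},
             {'name': 'Claire', 'age': 25, 'id': 4567}]

# build the lookup set once instead of rebuilding a list per candidate
existing_names = {person['name'] for person in existing1}
new_people = [p for p in to_add1 if p['name'] not in existing_names]

for person in new_people:
    print(person)
```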

How to add a list of values to a list of nested dictionaries?

I would like to add each value of a list to each nested dictionary of a different list, with a new key name.
List of dictionaries:
list_dicts = [{'id': 1, 'text': 'abc'}, {'id':2, 'text': 'def'}]
List:
list = ['en', 'nl']
Desired output:
list_dicts = [{'id': 1, 'text': 'abc', 'language': 'en'}, {'id':2, 'text': 'def', 'language':'nl'}]
Current method used:
I transformed the list_dicts to a Pandas data frame, added a new column 'language' that represents the list values. Then, I transformed the Pandas data frame back to a list of dictionaries using df.to_dict('records'). There must be a more efficient way to loop through the list and add each value to a new assigned key in the list of dictionaries without needing to use Pandas at all. Any ideas ?
list = ['en', 'nl'] # Don't use list as variable name tho.
list_dicts = [{'id': 1, 'text': 'abc'}, {'id':2, 'text': 'def'}]
for i,item in enumerate(list):
    list_dicts[i]['language'] = item
That should do the trick, if you only want to assign values to the 'language' key.
Using a list comprehension with zip
Ex:
list_dicts = [{'id': 1, 'text': 'abc'}, {'id':2, 'text': 'def'}]
lst = ['en', 'nl']
list_dicts = [{**n, "language": m} for n,m in zip(list_dicts, lst)]
print(list_dicts)
# --> [{'id': 1, 'text': 'abc', 'language': 'en'}, {'id': 2, 'text': 'def', 'language': 'nl'}]
A simple loop over the zipped lists will do:
for d, lang in zip(list_dicts, list):
    d["language"] = lang
Side note: you shouldn't name a variable list, so as not to shadow the built-in name.
Try it like this (don't use list as a variable name):
list_dicts = [{'id': 1, 'text': 'abc'}, {'id':2, 'text': 'def'}]
langlist = ['en', 'nl']
x = 0
for y in list_dicts:
    y['language'] = langlist[x]
    x=x+1
print(list_dicts)
Simply:
for d, l in zip(list_dicts, list):
    d['language'] = l
Then:
print(list_dicts)
(Assuming that both lists are of the same length)
list_dicts = [{'id': 1, 'text': 'abc'}, {'id':2, 'text': 'def'}]
list_lang = ['en', 'nl']
for i in range(len(list_dicts)):
    list_dicts[i]['language']=list_lang[i]
>>> print(list_dicts)
[{'id': 1, 'text': 'abc', 'language': 'en'}, {'id': 2, 'text': 'def', 'language': 'nl'}]
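zip silently stops at the shorter input, so a length mismatch would leave some dictionaries without a 'language' key. Below is a minimal sketch that checks the lengths first and returns new dictionaries; add_languages is a hypothetical helper name, not something from the answers above.

```python
def add_languages(dicts, languages):
    """Return copies of `dicts`, each with a 'language' key added."""
    if len(dicts) != len(languages):
        raise ValueError("dicts and languages must have the same length")
    # {**d, ...} builds a new dict, leaving the input dicts untouched
    return [{**d, 'language': lang} for d, lang in zip(dicts, languages)]


list_dicts = [{'id': 1, 'text': 'abc'}, {'id': 2, 'text': 'def'}]
langs = ['en', 'nl']
tagged = add_languages(list_dicts, langs)
print(tagged)
```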

List of tuples to dictionary not working as expected

For some reason my small brain is having problems with this. I have a list of tuples list = [('name:john','age:25','location:brazil'),('name:terry','age:32','location:acme')]. I'm trying to move these values into a dictionary for parsing later. I have made a few attempts; the latest is below. I'm not getting all the results into the dict: it ends up with only the last value iterated (it's recreating the dict each time).
people = {}
list = [('name:john','age:25','location:brazil'),('name:terry','age:32','location:acme')]
for value in list:
    people = {'person': [dict(item.split(":",1) for item in value)]}
You can try this one too:
inlist = [('name:john','age:25','location:brazil'),('name:terry','age:32','location:acme')]
d = []
for tup in inlist:
    tempDict = {}
    for elem in tup:
        elem = elem.split(":")
        tempDict.update({elem[0]:elem[1]})
    d.append({'person':tempDict})
print(d)
Output:
[{'person': {'location': 'brazil', 'name': 'john', 'age': '25'}}, {'person': {'location': 'acme', 'name': 'terry', 'age': '32'}}]
If you want a single dictionary with a key 'person' whose value is the list of per-person dictionaries, replace d.append({'person':tempDict}) with d.append(tempDict) and add d = {'person':d} right before printing.
Output:
{'person': [{'location': 'brazil', 'name': 'john', 'age': '25'}, {'location': 'acme', 'name': 'terry', 'age': '32'}]}
You can try this:
l = [('name:john','age:25','location:brazil'),('person:terry','age:32','location:acme')]
people = [{c:d for c, d in [i.split(':') for i in a]} for a in l]
Output:
[{'name': 'john', 'age': '25', 'location': 'brazil'}, {'person': 'terry', 'age': '32', 'location': 'acme'}]
First of all, try not to call your list list. It is the name of a built-in type in Python (used, for example, to build a list from an iterator or a range), and assigning to it shadows the built-in.
I would make a list of people first and then append each person to the people list as separate dictionary as follows:
people = []
my_list = [('name:john','age:25','location:brazil'),('person:terry','age:32','location:acme')]
for tup in my_list:
    person = {}
    for item in tup:
        splitted = item.split(':')
        person.update({splitted[0]:splitted[1]})
    people.append(person)
The output then would be this:
[{'age': '25', 'location': 'brazil', 'name': 'john'},
{'age': '32', 'location': 'acme', 'person': 'terry'}]
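The structure the question was aiming for, a single dict with a 'person' list, can also be built in one pass. This is a sketch: parse_person is a hypothetical helper, and split(':', 1), as in the question's own attempt, keeps any further colons inside the value.

```python
def parse_person(fields):
    """Turn ('name:john', 'age:25', ...) into {'name': 'john', 'age': '25', ...}."""
    # maxsplit=1 so that a value such as 'time:10:30' keeps its colons
    return dict(field.split(':', 1) for field in fields)


records = [('name:john', 'age:25', 'location:brazil'),
           ('name:terry', 'age:32', 'location:acme')]
people = {'person': [parse_person(rec) for rec in records]}
print(people)
```

The list is built once by the comprehension, so nothing is overwritten on each iteration as it was in the original loop.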
