I have a list of dictionaries
l = [
{'firstname': 'joe', 'surname': 'bloggs'},
{'firstname': 'john', 'surname': 'smith'},
{'firstname': 'joe', 'surname': 'bloggs'},
{'firstname': 'jane', 'surname': 'bloggs'}
]
How do I remove duplicates? For example, {'firstname': 'joe', 'surname': 'bloggs'} appears twice, and I want it to appear only once.
Something like this should do the job:
result = [dict(tupleized) for tupleized in set(tuple(sorted(item.items())) for item in l)]
First, I convert each dict into a tuple of its items (sorted, so that key order does not matter), then I put those tuples into a set (which removes duplicate entries), and finally I turn each tuple back into a dict.
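A set does not remember the order in which items were added, so the one-liner above may return the dictionaries in a different order from the input. If order matters, a minimal sketch along these lines keeps the first occurrence of each dict. It assumes every value is hashable and the keys can be sorted; `dedupe_dicts` is a made-up name for illustration.

```python
def dedupe_dicts(dicts):
    """Return the dicts with duplicates removed, keeping first occurrences in order."""
    seen = set()
    unique = []
    for d in dicts:
        # Sorting the items makes the key independent of insertion order.
        # Assumption: values are hashable and keys are mutually comparable.
        key = tuple(sorted(d.items()))
        if key not in seen:
            seen.add(key)
            unique.append(d)
    return unique


l = [
    {'firstname': 'joe', 'surname': 'bloggs'},
    {'firstname': 'john', 'surname': 'smith'},
    {'firstname': 'joe', 'surname': 'bloggs'},
    {'firstname': 'jane', 'surname': 'bloggs'},
]
result = dedupe_dicts(l)
print(result)
```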
import itertools
import operator
import pprint
l = [
{'firstname': 'joe', 'surname': 'bloggs'},
{'firstname': 'john', 'surname': 'smith'},
{'firstname': 'joe', 'surname': 'bloggs'},
{'firstname': 'jane', 'surname': 'bloggs'}
]
getvals = operator.itemgetter('firstname', 'surname')
l.sort(key=getvals)
result = []
for k, g in itertools.groupby(l, getvals):
    # Keep only the first dict of each group of equal (firstname, surname) pairs.
    result.append(next(g))
l[:] = result
pprint.pprint(l)
Related
I am iterating through a list of dictionaries. I need to update the values for one specific key in all the dictionaries and I have the new values stored in a list. The list of new values is ordered so that the 1st new value belongs to a key in the 1st dictionary, 2nd new value to a key in the 2nd dictionary, etc.
My data looks something like this:
dict_list = [{'person':'Tom', 'job':'student'},
{'person':'John', 'job':'teacher'},
{'person':'Mary', 'job':'manager'}]
new_jobs = ['lecturer', 'cook', 'driver']
And I want to transform the list of dictionaries using the list of new jobs according to my description into this:
dict_list = [{'person':'Tom', 'job':'lecturer'},
{'person':'John', 'job':'cook'},
{'person':'Mary', 'job':'driver'}]
As I have a very long list of dictionaries, I would like to define a function that does this automatically, but I am struggling to do it with for loops and zip(). Any suggestions?
I tried the for loop below. I guess the code could work if it were possible to index the dictionaries like this: dictionary['job'][i]. Unfortunately, dictionaries don't work like that as far as I know.
def update_dic_list():
    for dictionary in dict_list:
        for i in range(len(new_jobs)):
            dictionary['job'] = new_jobs[i]
        print(dict_list)
The output the code above gave me was this:
[{'person': 'Tom', 'job': 'driver'}, {'person': 'John', 'job': 'teacher'}, {'person': 'Mary', 'job': 'manager'}]
[{'person': 'Tom', 'job': 'driver'}, {'person': 'John', 'job': 'driver'}, {'person': 'Mary', 'job': 'manager'}]
[{'person': 'Tom', 'job': 'driver'}, {'person': 'John', 'job': 'driver'}, {'person': 'Mary', 'job': 'driver'}]
If your new_jobs list has the right job for each corresponding entry in the dict list, you could use zip:
dict_list = [{'person':'Tom', 'job':'student'},
{'person':'John', 'job':'teacher'},
{'person':'Mary', 'job':'manager'}]
new_jobs = ['lecturer', 'cook', 'driver']
for d, j in zip(dict_list, new_jobs):
    d['job'] = j
print(dict_list)
prints
[{'person': 'Tom', 'job': 'lecturer'}, {'person': 'John', 'job': 'cook'}, {'person': 'Mary', 'job': 'driver'}]
With your loop, for each dictionary, you go through all the new jobs and overwrite that same dictionary's job with each one in turn, so only the last assignment survives. By the end, they are all drivers, because 'driver' is the last job in the list.
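Since the question asks for a function, here is one possible sketch that takes both lists as parameters instead of relying on globals. The name `update_jobs`, the `key` parameter, and the length check are assumptions added for illustration, not part of the original answer.

```python
def update_jobs(dict_list, new_values, key='job'):
    """Set dict_list[i][key] = new_values[i] for every position i (in place)."""
    # zip() silently stops at the shorter input, so check the lengths first.
    if len(dict_list) != len(new_values):
        raise ValueError("dict_list and new_values must have the same length")
    for d, value in zip(dict_list, new_values):
        d[key] = value
    return dict_list


dict_list = [{'person': 'Tom', 'job': 'student'},
             {'person': 'John', 'job': 'teacher'},
             {'person': 'Mary', 'job': 'manager'}]
new_jobs = ['lecturer', 'cook', 'driver']

update_jobs(dict_list, new_jobs)
print(dict_list)
```

The dictionaries are modified in place; the list is also returned so the call can be chained if that is convenient.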
You can do
dict_list = [{'person':'Tom', 'job':'student'},
{'person':'John', 'job':'teacher'},
{'person':'Mary', 'job':'manager'}]
new_jobs = ['lecturer', 'cook', 'driver']
def update_dic_list():
    for job, _dict in zip(new_jobs, dict_list):
        _dict['job'] = job
or
def update_dict_list():
    for i, job in enumerate(new_jobs):
        dict_list[i]['job'] = job
You only need to remove the inner loop, because it reassigns the 'job' key several times for each item of the outer loop:
def update_dic_list():
    i = 0
    for dictionary in dict_list:
        dictionary['job'] = new_jobs[i]
        i += 1
    print(dict_list)
Or alternatively you could use enumerate:
def update_dic_list():
    for i, dictionary in enumerate(dict_list):
        dictionary['job'] = new_jobs[i]
    print(dict_list)
Output:
[{'person': 'Tom', 'job': 'lecturer'}, {'person': 'John', 'job': 'cook'}, {'person': 'Mary', 'job': 'driver'}]
I am trying to separate a list of nested dictionaries. As you can see below, the first entry holds a nested list of dictionaries under its 'firstname' key:
names = [{'firstname': [{'firstname': 'john', 'lastname': 'smith'},{'firstname': 'mary', 'lastname': 'smith'}], 'lastname': 'smith'},
{'firstname': 'henry', 'lastname': 'ford'},
{'firstname': 'henry', 'lastname': 'adams'} ]
Is there a way to split them to:
names2 =[{'firstname':'john', 'lastname':'smith'}, {'firstname':'mary', 'lastname':'smith'}, {'firstname':'henry', 'lastname':'ford'}, {'firstname':'henry', 'lastname':'adams'}]
I searched Stack Overflow, but the examples I found did not use a consistent key; the values were always random.
I tried this:
names2 = []
for idxA in names:
    for idxB in idxA:
        names2.append(idxB)
but it only printed firstname and lastname, without the values.
Thanks.
Check what kind of data is in "firstname" and handle it accordingly.
names2 = []
for dct in names:
    if isinstance(dct["firstname"], str):
        names2.append(dct)
    else:
        names2.extend(dct["firstname"])
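As background on why the attempt in the question produced only firstname and lastname: iterating directly over a dict yields its keys. A small sketch of the difference:

```python
entry = {'firstname': 'henry', 'lastname': 'ford'}

# Iterating over a dict yields its keys, not the key/value pairs.
keys_only = [key for key in entry]
print(keys_only)  # ['firstname', 'lastname']

# .items() yields (key, value) pairs when both are needed.
pairs = list(entry.items())
print(pairs)  # [('firstname', 'henry'), ('lastname', 'ford')]
```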
I have 2 lists that contain objects that look like this:
list 1:
{'name': 'Nick', 'id': '123456'}
list 2:
{'address': 'London', 'id': '123456'}
Now I want to create a third list, containing objects that look like this:
{'name': 'Nick', 'address': 'London', 'id': '123456'}
i.e., I want to find the matching ids and merge those objects.
You can use groupby to collect all the dicts that share an id, then unify each group using ChainMap, like this:
from itertools import groupby
from operator import itemgetter
from collections import ChainMap
list1 = [{'name': 'Nick', 'id': '123456'}, {'name': 'Donald', 'id': '999'}]
list2 = [{'address': 'London', 'id': '123456'}, {'address': 'NYC', 'id': '999'}]
grouped_subdicts = groupby(sorted(list1 + list2, key=itemgetter("id")), itemgetter("id"))
result = [dict(ChainMap(*g)) for k, g in grouped_subdicts]
print(result)
Output:
[{'id': '123456', 'address': 'London', 'name': 'Nick'},
{'id': '999', 'address': 'NYC', 'name': 'Donald'}]
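An alternative that avoids sorting is to accumulate the dicts in a mapping keyed by id. This is only a sketch under the assumption that every dict has an 'id' key; note that if the same non-id key appears in both lists, the value seen later overwrites the earlier one.

```python
from collections import defaultdict

list1 = [{'name': 'Nick', 'id': '123456'}, {'name': 'Donald', 'id': '999'}]
list2 = [{'address': 'London', 'id': '123456'}, {'address': 'NYC', 'id': '999'}]

# One accumulator dict per id; update() folds each matching dict into it.
merged = defaultdict(dict)
for d in list1 + list2:
    merged[d['id']].update(d)

result = list(merged.values())
print(result)
```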
I am using Cassandra with Python and I am executing two queries together. I want to group the results of the two queries into a single list, using one column as the key.
list1 = [{'firstname':'foo','lastname':'bar','id':1},{'firstname':'foo2','lastname':'bar2','id':2}]
list2 = [{'text':'sample','contact_no':'666','id':1},{'text':'sample2','contact_no':'111','id':1}, {'text':'sample3','contact_no':'121','id':2}]
I want to group these two lists together using the id key as the criterion.
Expected result:
[{'firstname':'foo','lastname':'bar','id':1,'text':'sample','contact_no':'666'}, {'firstname':'foo','lastname':'bar','id':1,'text':'sample2','contact_no':'111'},{'firstname':'foo2','lastname':'bar2','id':2,'text':'sample3','contact_no':'121'}]
Please advise on how I can do this in the most Pythonic way. Thanks in advance.
This is one way:
import itertools
list1 = [{'firstname':'foo','lastname':'bar','id':1},
{'firstname':'foo2','lastname':'bar2','id':2}]
list2 = [{'text':'sample','contact_no':'666','id':1},
{'text':'sample2','contact_no':'111','id':1},
{'text':'sample3','contact_no':'121','id':2}]
lst = []
for x, y in itertools.product(list1, list2):
    if x['id'] == y['id']:
        c = x.copy()
        c.update(y)
        lst.append(c)
print(lst)
# [{'firstname': 'foo', 'lastname': 'bar', 'id': 1, 'text': 'sample', 'contact_no': '666'},
# {'firstname': 'foo', 'lastname': 'bar', 'id': 1, 'text': 'sample2', 'contact_no': '111'},
# {'firstname': 'foo2', 'lastname': 'bar2', 'id': 2, 'text': 'sample3', 'contact_no': '121'}]
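itertools.product compares every pair, which grows with len(list1) * len(list2). For larger result sets, one possible sketch is to index the first list by id and then look up each row of the second list. The function name `join_on_key` and its parameters are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def join_on_key(left, right, key='id'):
    """Merge each row of `right` with every row of `left` that has the same key."""
    # Build the index once: key value -> list of matching rows from `left`.
    by_key = defaultdict(list)
    for row in left:
        by_key[row[key]].append(row)

    joined = []
    for row in right:
        for match in by_key.get(row[key], []):
            merged = match.copy()   # copy so the input dicts stay untouched
            merged.update(row)
            joined.append(merged)
    return joined


list1 = [{'firstname': 'foo', 'lastname': 'bar', 'id': 1},
         {'firstname': 'foo2', 'lastname': 'bar2', 'id': 2}]
list2 = [{'text': 'sample', 'contact_no': '666', 'id': 1},
         {'text': 'sample2', 'contact_no': '111', 'id': 1},
         {'text': 'sample3', 'contact_no': '121', 'id': 2}]

lst = join_on_key(list1, list2)
print(lst)
```

The result follows the order of the second list, which for this data is the same order as the product-based loop.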
For some reason my small brain is having problems with this. I have a list of tuples, list = [('name:john','age:25','location:brazil'),('name:terry','age:32','location:acme')]. I'm trying to move these values into a dictionary for parsing later. I have made a few attempts; the latest is below. I'm not getting all the results into the dict: it ends up holding only the last value iterated (it's recreating the dict each time).
people = {}
list = [('name:john','age:25','location:brazil'),('name:terry','age:32','location:acme')]
for value in list:
    people = {'person': [dict(item.split(":",1) for item in value)]}
You can try this one too:
inlist = [('name:john','age:25','location:brazil'),('name:terry','age:32','location:acme')]
d = []
for tup in inlist:
    tempDict = {}
    for elem in tup:
        elem = elem.split(":")
        tempDict.update({elem[0]:elem[1]})
    d.append({'person':tempDict})
print(d)
Output:
[{'person': {'location': 'brazil', 'name': 'john', 'age': '25'}}, {'person': {'location': 'acme', 'name': 'terry', 'age': '32'}}]
If you instead want a single dictionary with the key person whose value is the list of per-person dictionaries, replace d.append({'person':tempDict}) with d.append(tempDict) and add d = {'person':d} right before printing.
Output:
{'person': [{'location': 'brazil', 'name': 'john', 'age': '25'}, {'location': 'acme', 'name': 'terry', 'age': '32'}]}
You can try this:
l = [('name:john','age:25','location:brazil'),('person:terry','age:32','location:acme')]
people = [{c:d for c, d in [i.split(':') for i in a]} for a in l]
Output:
[{'name': 'john', 'age': '25', 'location': 'brazil'}, {'person': 'terry', 'age': '32', 'location': 'acme'}]
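First of all, try not to call your list list. The name is not reserved in Python, but it refers to the built-in list type (used, for example, to build a list from an iterator or a range), and reusing it hides that built-in for the rest of the scope.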
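To illustrate the naming point: list is an ordinary built-in name rather than a keyword, so Python lets you rebind it, and after that the built-in is hidden in that scope. A small sketch (the function name is made up for illustration, and the rebinding is kept inside a function so it does not leak):

```python
import keyword


def shadowing_demo():
    # Rebinding `list` locally hides the built-in type inside this function.
    list = ['a', 'b']
    try:
        list('abc')  # now calls the local list object, not the built-in
    except TypeError as exc:
        return str(exc)
    return None


print(keyword.iskeyword('list'))  # False: `list` is not a reserved word
print(shadowing_demo())
```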
I would make a list of people first and then append each person to the people list as separate dictionary as follows:
people = []
my_list = [('name:john','age:25','location:brazil'),('person:terry','age:32','location:acme')]
for tup in my_list:
    person = {}
    for item in tup:
        splitted = item.split(':')
        person.update({splitted[0]:splitted[1]})
    people.append(person)
The output then would be this:
[{'age': '25', 'location': 'brazil', 'name': 'john'},
{'age': '32', 'location': 'acme', 'person': 'terry'}]
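The same idea can be written more compactly by passing the split pairs straight to dict(), as the question's own attempt does. The sketch below is one possible version; `parse_records` is a hypothetical name, and the maxsplit argument of 1 is an assumption that only the first colon separates key from value.

```python
def parse_records(records):
    """Turn tuples of 'key:value' strings into a list of dicts."""
    # split(':', 1) splits on the first colon only, so values may contain colons.
    return [dict(item.split(':', 1) for item in record) for record in records]


my_list = [('name:john', 'age:25', 'location:brazil'),
           ('name:terry', 'age:32', 'location:acme')]
people = parse_records(my_list)
print(people)
```

Building a new dict per tuple and collecting them in a list avoids the problem from the question, where a single variable was overwritten on every iteration.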