Remove duplicates from a list of dicts - python

I have a list of dicts like this:
[{'ID': 'a', 'Number': 2}, {'ID': 'b', 'Number': 5} , {'ID': 'a', 'Number': 6}, {'ID': 'a', 'Number': 8}, {'ID': 'c', 'Number': 3}]
I want to remove the dicts that share the same 'ID' and keep only the one with the smallest 'Number'. The expected result should be:
[{'ID': 'a', 'Number': 2}, {'Id': 'b', 'Number': 5}, {'ID': 'c', 'Number': 3}]

An efficient solution is to use a temporary lookup dictionary keyed by ID, where each value is the dict with the lowest Number seen so far for that ID.
l = [{'ID': 'a', 'Number': 2},
     {'ID': 'b', 'Number': 5},  # note that I corrected a typo Id --> ID
     {'ID': 'a', 'Number': 6},
     {'ID': 'a', 'Number': 8},
     {'ID': 'c', 'Number': 3}]
lookup_dict = {}
for d in l:
    # Keep d if its ID is new or its Number is lower than the stored one.
    if d['ID'] not in lookup_dict or d['Number'] < lookup_dict[d['ID']]['Number']:
        lookup_dict[d['ID']] = d
output = list(lookup_dict.values())
which gives output as:
[{'ID': 'a', 'Number': 2}, {'ID': 'b', 'Number': 5}, {'ID': 'c', 'Number': 3}]
A piece of advice: since the IDs are now unique, you may be better off representing this final data as a dictionary with the IDs as keys. This would allow for more convenient data access.
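A minimal sketch of that suggestion, assuming the same sample data as in the question. The name `min_numbers` is only illustrative and does not appear in the original answer.

```python
# Sample data copied from the question.
l = [{'ID': 'a', 'Number': 2},
     {'ID': 'b', 'Number': 5},
     {'ID': 'a', 'Number': 6},
     {'ID': 'a', 'Number': 8},
     {'ID': 'c', 'Number': 3}]

# Map each ID directly to its smallest Number (hypothetical name: min_numbers).
min_numbers = {}
for d in l:
    key = d['ID']
    if key not in min_numbers or d['Number'] < min_numbers[key]:
        min_numbers[key] = d['Number']

print(min_numbers)       # {'a': 2, 'b': 5, 'c': 3}
print(min_numbers['a'])  # 2
```

With this shape, looking up the value for a given ID is a single dictionary access instead of a scan over the list.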

Related

Sort list of dicts by specific list of values

I'm trying to code a faster way to solve the following problem but I don't know how to do it:
I have the following list of dicts and list of identifiers:
list_of_dicts = [{'id': 1, 'name': 'A'}, {'id': 2, 'name': 'B'}, {'id': 3, 'name': 'C'}, {'id': 4, 'name': 'D'}]
list_of_ids = [1, 3, 2, 4, 1, 3, 4]
I'd like to have the following output:
[{'id': 1, 'name': 'A'}, {'id': 3, 'name': 'C'}, {'id': 2, 'name': 'B'}, {'id': 4, 'name': 'D'}, {'id': 1, 'name': 'A'}, {'id': 3, 'name': 'C'}, {'id': 4, 'name': 'D'}]
The way I'm doing it is:
list_of_dict_ids = [d['id'] for d in list_of_dicts]
ordered_list_by_ids = [list_of_dicts[list_of_dict_ids.index(i)] for i in list_of_ids]
Is there any faster way to do it?
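You can do it like this:
dic = {d["id"]: d for d in list_of_dicts}
dic
>>>{1: {'id': 1, 'name': 'A'}, 2: {'id': 2, 'name': 'B'}, 3: {'id': 3, 'name': 'C'}, 4: {'id': 4, 'name': 'D'}}
lst = [dic[i] for i in list_of_ids]
lst
>>>[{'id': 1, 'name': 'A'}, {'id': 3, 'name': 'C'}, {'id': 2, 'name': 'B'}, {'id': 4, 'name': 'D'}, {'id': 1, 'name': 'A'}, {'id': 3, 'name': 'C'}, {'id': 4, 'name': 'D'}]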
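The lookup dictionary is faster because each `list.index()` call scans the list, while a dict access takes constant time on average. Here is the same idea wrapped in a function, as a rough sketch. The name `order_by_ids` is hypothetical, and it assumes every id is unique in `list_of_dicts` and that every requested id exists (otherwise a `KeyError` is raised).

```python
def order_by_ids(list_of_dicts, list_of_ids, key='id'):
    """Return the dicts in the order given by list_of_ids (repeats allowed)."""
    # Build the lookup once; a later dict with the same id would replace an earlier one.
    by_id = {d[key]: d for d in list_of_dicts}
    # Each access is a dict lookup instead of a list.index() scan.
    return [by_id[i] for i in list_of_ids]


list_of_dicts = [{'id': 1, 'name': 'A'}, {'id': 2, 'name': 'B'},
                 {'id': 3, 'name': 'C'}, {'id': 4, 'name': 'D'}]
list_of_ids = [1, 3, 2, 4, 1, 3, 4]

ordered = order_by_ids(list_of_dicts, list_of_ids)
print([d['name'] for d in ordered])  # ['A', 'C', 'B', 'D', 'A', 'C', 'D']
```

Note that repeated ids yield the same dict object several times, not copies.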

getting part of dictionary by value in python

I have dictionary:
teamDictionary = {
1: {'name': 'Bob', 'team': 'A', 'status': 'Leave'},
2: {'name': 'George', 'team': 'C', 'status': 'Training'},
3: {'name': 'Sam', 'team': 'B', 'status': 'Travel'},
4: {'name': 'Phil', 'team': 'A', 'status': 'Leave'},
5: {'name': 'Georgia', 'team': 'C', 'status': 'Training'}
}
I need to get all the inner dictionaries where team is C. My code is:
team_leave = [teamDictionary[a] for a, b in teamDictionary.items() if b['team'] == 'C' ]
print(team_leave)
[{'name': 'George', 'team': 'C', 'status': 'Training'}, {'name': 'Georgia', 'team': 'C', 'status': 'Training'}]
But I need to get
{
2: {'name': 'George', 'team': 'C', 'status': 'Training'},
5: {'name': 'Georgia', 'team': 'C', 'status': 'Training'}
}
How should I solve my problem?
You can use a dict comprehension instead:
{k: d for k, d in teamDictionary.items() if d['team'] == 'C'}
You should use Dictionary Comprehension:
team_leave = {key: item for key, item in teamDictionary.items() if item['team'] == 'C'}
print(team_leave)
Output:
{2: {'name': 'George', 'team': 'C', 'status': 'Training'}, 5: {'name': 'Georgia', 'team': 'C', 'status': 'Training'}}
print({key: values for key, values in teamDictionary.items() if values['team'] == 'C'})
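If you need this filter for more than one field or value, the comprehension can be wrapped in a small helper. This is only a sketch: `filter_by_field` is a made-up name, and it uses `.get()` on the assumption that some inner dicts might lack the field.

```python
def filter_by_field(mapping, field, wanted):
    """Keep the entries of mapping whose inner dict has field == wanted."""
    # .get() skips inner dicts that do not have the field at all.
    return {k: v for k, v in mapping.items() if v.get(field) == wanted}


teamDictionary = {
    1: {'name': 'Bob', 'team': 'A', 'status': 'Leave'},
    2: {'name': 'George', 'team': 'C', 'status': 'Training'},
    3: {'name': 'Sam', 'team': 'B', 'status': 'Travel'},
    4: {'name': 'Phil', 'team': 'A', 'status': 'Leave'},
    5: {'name': 'Georgia', 'team': 'C', 'status': 'Training'},
}

team_c = filter_by_field(teamDictionary, 'team', 'C')
print(sorted(team_c))  # [2, 5]
```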

How to merge two list of dictionaries based on a value

I have two lists of dictionaries, let's say:
a = [{'id': 1, 'name': 'a'}]
b = [{'id': 1, 'city': 'b'}]
I want to have a list that merges every dictionary in both lists with the same ID. In this example I expect to have:
a = [{'id': 1, 'name': 'a', 'city': 'b'}]
Is there a cleaner way of doing it than nesting one for loop inside another?
Thanks
You can keep track of the ids with another dict (or defaultdict to make things simpler). Then update the items in that dict as you iterate. In the end the dict's values will have your list.
from collections import defaultdict
d = defaultdict(dict)
a = [{'id': 1, 'name': 'a'}, {'id': 3, 'name': 'a'}]
b = [{'id': 1, 'city': 'b'}, {'id': 2, 'city': 'c'}, {'id': 3, 'city': 'd'}]
for item in a + b:
    d[item['id']].update(item)
list(d.values())
# [{'id': 1, 'name': 'a', 'city': 'b'},
#  {'id': 3, 'name': 'a', 'city': 'd'},
#  {'id': 2, 'city': 'c'}]
Note that this will overwrite duplicate values for keys other than id: if you have two items with id: 1 and two different cities, you will only get the last city.
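If overwriting is not acceptable, one possible variation is to collect every distinct value per field in a list. This is a sketch of a different technique, not part of the answer above, and the sample data with two cities for id 1 is invented for illustration.

```python
from collections import defaultdict

# Hypothetical data: id 1 appears with two different cities.
a = [{'id': 1, 'city': 'b'}]
b = [{'id': 1, 'city': 'x'}, {'id': 2, 'city': 'c'}]

# id -> field name -> list of distinct values seen for that field
merged = defaultdict(lambda: defaultdict(list))
for item in a + b:
    fields = merged[item['id']]
    for field, value in item.items():
        if field == 'id':
            continue
        if value not in fields[field]:
            fields[field].append(value)

result = [{'id': key, **fields} for key, fields in merged.items()]
print(result)  # [{'id': 1, 'city': ['b', 'x']}, {'id': 2, 'city': ['c']}]
```

The trade-off is that every field now holds a list, even when there was only one value.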
One way to do this is to make a dictionary, mapping the identifier that you want to use (id in this case) to a dictionary of merged results.
#!/usr/bin/env python3
import collections

def merge_on_key(list_of_dictionaries, key, result):
    # Fold every dictionary into result, grouped by the value stored under key.
    for d in list_of_dictionaries:
        assert key in d
        result[d[key]].update(d)

a = [{'id': 1, 'name': 'a'}]
b = [{'id': 1, 'city': 'b'}, {'id': 2, 'color': 'blue'}]
print('a', a)
print('b', b)
c = collections.defaultdict(dict)
merge_on_key(a, 'id', c)
merge_on_key(b, 'id', c)
print('merged results in dictionary with id 1', c[1])
The last line prints:
merged results in dictionary with id 1 {'id': 1, 'name': 'a', 'city': 'b'}
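You can use map and a lambda function together with the update method of dictionaries. Note that map pairs the items of a and b by position, so only dicts that sit at the same index in both lists are merged: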
a = [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'a'}, {'id': 3, 'name': 'k'}]
b = [{'id': 1, 'city': 'b'}, {'id': 2, 'city': 'c'}, {'id': 4, 'city': 'cm'}]
a.extend(list(map(lambda x,y: y if x.get('id') != y.get('id') else x.update(y), a, b)))
a = list(filter(None, a))
a will now become a list containing dictionaries of merged values like this:
[{'id': 1, 'name': 'a', 'city': 'b'},
{'id': 2, 'name': 'a', 'city': 'c'},
{'id': 3, 'name': 'k'},
{'id': 4, 'city': 'cm'}]
from collections import defaultdict
from operator import itemgetter

l1 = [{'id': 1, 'City': 'Calcutta'}, {'id': 3, 'Country': 'Germany'}]
l2 = [{'id': 1, 'Country': 'India'}, {'id': 2, 'City': 'Delhi'}, {'id': 3, 'City': 'Berlin'}]

def merge1(l1, l2):
    d = defaultdict(dict)
    for l in (l1, l2):
        for innerdict1 in l:
            d[innerdict1['id']].update(innerdict1)
    # Return the merged dicts sorted by id (print() itself returns None).
    return sorted(d.values(), key=itemgetter("id"))

print(merge1(l1, l2))
"""
[{'id': 1, 'City': 'Calcutta', 'Country': 'India'}, {'id': 2, 'City': 'Delhi'}, {'id': 3, 'Country': 'Germany', 'City': 'Berlin'}]
"""

Using Glom on a nested structure, how do I move top level dictionary fields into a list of dictionaries?

This is a question about the usage of Glom (https://github.com/mahmoud/glom/)
I have a dictionary that includes a list of other dictionaries.
{'date': '2020-01-01',
'location': 'A',
'items': [
{'name': 'A', 'id': 'A1'},
{'name': 'B', 'id': 'B1'},
{'name': 'C', 'id': 'C1'}
]}
I would like to use Glom to move the outer, global dictionary fields 'date' and 'location' into the list of dictionaries for the items.
This is the end result I am trying to reach:
[
{'name': 'A', 'id': 'A1', 'date': '2020-01-01', 'location': 'A'},
{'name': 'B', 'id': 'B1', 'date': '2020-01-01', 'location': 'A'},
{'name': 'C', 'id': 'C1', 'date': '2020-01-01', 'location': 'A'}
]
Alas, when the spec arrives at the 'items' of the dictionary, the other values are no longer accessible and the T object is set to the inner value instead.
from glom import glom, T

def update_dict(x, other_dict):
    x.update({'date': other_dict['date'], 'location': other_dict['location']})
    return x.copy()

spec = (T, 'items', [(lambda x: update_dict(x, T()))])
data = {'date': '2020-01-01',
        'location': 'A',
        'items': [{'name': 'A', 'id': 'A1'},
                  {'name': 'B', 'id': 'B1'},
                  {'name': 'C', 'id': 'C1'}]}
glom(data, spec) # print this
returns
[{'name': 'A', 'id': 'A1', 'date': T()['date'], 'location': T()['location']},
{'name': 'B', 'id': 'B1', 'date': T()['date'], 'location': T()['location']},
{'name': 'C', 'id': 'C1', 'date': T()['date'], 'location': T()['location']}]
Which is useless.
It's not difficult to update the dictionaries with regular Python code, but
is there a way to do this within a Glom spec?
The trick is to also pass the target as the global scope.
This way, the Assign command can access the full target.
from glom import S, glom, Assign, Spec
target = {'date': '2020-04-01',
          'location': 'A',
          'items': [
              {'name': 'A', 'id': 'A1'},
              {'name': 'B', 'id': 'B1'},
              {'name': 'C', 'id': 'C1'}
          ]}
spec = Spec(('items',
             [Assign('date', Spec(S['date']))],
             [Assign('location', Spec(S['location']))]))
glom(target, spec, scope=target)
Results in
[{'name': 'A', 'id': 'A1', 'date': '2020-04-01', 'location': 'A'},
{'name': 'B', 'id': 'B1', 'date': '2020-04-01', 'location': 'A'},
{'name': 'C', 'id': 'C1', 'date': '2020-04-01', 'location': 'A'}]
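For comparison, here is roughly what the same transformation looks like in plain Python without Glom, as the question hints. This is only a sketch and not a Glom spec; the name `shared` is illustrative, and it assumes every top-level key other than 'items' should be copied into each item.

```python
data = {'date': '2020-01-01',
        'location': 'A',
        'items': [{'name': 'A', 'id': 'A1'},
                  {'name': 'B', 'id': 'B1'},
                  {'name': 'C', 'id': 'C1'}]}

# Top-level fields to copy into every item (everything except 'items').
shared = {k: v for k, v in data.items() if k != 'items'}

# Build new dicts, leaving the original items unmodified.
result = [{**item, **shared} for item in data['items']]
print(result[0])  # {'name': 'A', 'id': 'A1', 'date': '2020-01-01', 'location': 'A'}
```

Unlike an in-place update, this leaves the dicts inside `data['items']` untouched.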

python - split array of objects

I have a data structure that looks like this
arrayObjects = [{id: 1, array1: [a,b,c]}, {id: 2, array1: [d,e,f]}]
and would like to transform it into this:
newArrayObjects = [{id: 1, term: a}, {id:1, term: b}, ... {id:2, term: f} ]
Any idea how to do this?
This is my minimal version right now:
for item in arrayObjects:
    for term in item['array1']:
        print(term, item['id'])
To clarify: I know how to do this with a nested loop; I'm just going for the most Pythonic version possible.
You can use a list comprehension:
>>> a = [{'id': 1, 'array': ['a','b','c']}, {'id': 2, 'array': ['d','e','f']}]
>>> [{'id': d['id'], 'term': v } for d in a for v in d['array']]
[{'id': 1, 'term': 'a'}, {'id': 1, 'term': 'b'}, {'id': 1, 'term': 'c'}, {'id': 2, 'term': 'd'}, {'id': 2, 'term': 'e'}, {'id': 2, 'term': 'f'}]
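If the input is large or you only need to iterate over the result once, the same nested loop can be written as a generator. This is a sketch under the assumption that the keys are the strings 'id' and 'array1' from the question; the function name `explode` is made up for illustration.

```python
def explode(records, list_key, item_key, id_key='id'):
    """Yield one small dict per element of each record's list."""
    for record in records:
        for value in record[list_key]:
            yield {id_key: record[id_key], item_key: value}


array_objects = [{'id': 1, 'array1': ['a', 'b', 'c']},
                 {'id': 2, 'array1': ['d', 'e', 'f']}]

new_array_objects = list(explode(array_objects, 'array1', 'term'))
print(new_array_objects[0])   # {'id': 1, 'term': 'a'}
print(new_array_objects[-1])  # {'id': 2, 'term': 'f'}
```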
