Generate list from two different lists by key - python

Considering the following structure:
myObj1 = [{"id":1, "name":"john"},
{"id":2, "name":"roger"},
{"id":3, "name":"carlos"}]
myObj2 = [{"group": "myGroup1","persons":[1, 2, 3]},
{"group": "myGroup2", "persons":[2]},
{"group": "myGroup3", "persons":[1,3]}]
I would like to produce the following result:
result = [{"group": "myGroup1","persons":[{"id":1, "name":"john"},
{"id":2, "name":"roger"},
{"id":3, "name":"carlos"}]},
{"group": "myGroup2", "persons":[{"id":2, "name":"roger"}]},
{"group": "myGroup3", "persons":[{"id":1, "name":"john"},
{"id":3, "name":"carlos"}]}]
The challenge is to replace each value in the "persons" array with the entire myObj1 item whose id matches.
I could achieve that with about three for loops, but I want to know if there's a Pythonic way of doing this using interpolation, map, filter, sets, etc. I'm new to the Python world but got this question from an interviewer, who told me I was supposed to do it in 1-2 lines of code.
UPDATE:
Here's what was my newbie approach:
for item in myObj1:
    id = item["id"]
    for item2 in myObj2:
        for i in range(len(item2["persons"])):
            if item2["persons"][i] == id:
                item2["persons"][i] = item

result = myObj2.copy()
for d in result:
    d['persons'] = [[j for j in myObj1 if j['id']==i][0] for i in d['persons']]
result
Output:
[{'group': 'myGroup1',
'persons': [{'id': 1, 'name': 'john'},
{'id': 2, 'name': 'roger'},
{'id': 3, 'name': 'carlos'}]},
{'group': 'myGroup2', 'persons': [{'id': 2, 'name': 'roger'}]},
{'group': 'myGroup3',
'persons': [{'id': 1, 'name': 'john'}, {'id': 3, 'name': 'carlos'}]}]

How about the following:
result = [dict(x) for x in myObj2]
for grp in result:
    grp["persons"] = [p for p in myObj1 if p["id"] in grp["persons"]]
We create a new list (using dict(x) to ensure we don't retain references to the elements of myObj2), and then update it accordingly.
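A possible variation on the same idea, sketched under the assumption that every id listed in "persons" exists in myObj1: index the people by id first, so each lookup takes constant time and the order of ids within each group is kept. The name by_id is introduced here for illustration only.

```python
myObj1 = [{"id": 1, "name": "john"},
          {"id": 2, "name": "roger"},
          {"id": 3, "name": "carlos"}]
myObj2 = [{"group": "myGroup1", "persons": [1, 2, 3]},
          {"group": "myGroup2", "persons": [2]},
          {"group": "myGroup3", "persons": [1, 3]}]

# Hypothetical helper name: index the people by id once.
by_id = {person["id"]: person for person in myObj1}

# Build new group dicts; myObj2 itself is left unmodified.
result = [{**group, "persons": [by_id[i] for i in group["persons"]]}
          for group in myObj2]
```

An id that is missing from myObj1 would raise KeyError here, which may or may not be the behaviour you want.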

You can try this:
myObj1 = [{"id":1, "name":"john"},
{"id":2, "name":"roger"},
{"id":3, "name":"carlos"}]
myObj2 = [{"group": "myGroup1","persons":[1, 2, 3]},
{"group": "myGroup2", "persons":[2]},
{"group": "myGroup3", "persons":[1,3]}]
final_dict = [{a:b if a != "persons" else c for a,b in d.items()} for c, d in zip(myObj1, myObj2)]
Output:
[{'persons': {'id': 1, 'name': 'john'}, 'group': 'myGroup1'}, {'persons': {'id': 2, 'name': 'roger'}, 'group': 'myGroup2'}, {'persons': {'id': 3, 'name': 'carlos'}, 'group': 'myGroup3'}]

What you're trying to do is essentially a group by operation followed by mapping over dictionary values. This is an example of where the itertools module really shines.
from itertools import chain, groupby
def concat(lists):
    """
    Helper function to make concatenating lists/iterables easier
    """
    return list(chain.from_iterable(lists))
by_group = {
    id: list(people)
    for id, people in groupby(myObj1, key=lambda person: person['id'])
}
result = [
    {'group': group['group'],
     'persons': concat(by_group[id] for id in group['persons'])}
    for group in myObj2
]
In this example, you still need the for-loops, but it is now clear what those loops are trying to do. The first is making an intermediate data structure to keep track of who has what id. The second is then going through another data structure and calculating who's in what group based on the groupby operation.
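One caveat worth illustrating: itertools.groupby only merges consecutive items with equal keys, so the approach above relies on myObj1 already being ordered by id. The sketch below uses made-up sample data (not from the question) to show the difference sorting makes.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data, deliberately not sorted by id.
people = [{"id": 2, "name": "roger"},
          {"id": 1, "name": "john"},
          {"id": 2, "name": "rita"}]

get_id = itemgetter("id")

# Without sorting, id 2 shows up as two separate groups.
unsorted_keys = [key for key, _ in groupby(people, key=get_id)]

# Sorting by the same key first yields one group per id.
by_id = {key: list(group)
         for key, group in groupby(sorted(people, key=get_id), key=get_id)}

print(unsorted_keys)   # [2, 1, 2]
print(sorted(by_id))   # [1, 2]
```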

One approach is to substitute the values for the "persons" key, like this:
[group.update({'persons':[myObj1[next(index for (index, d) in enumerate(myObj1) if d["id"] == idstud)] for idstud in group['persons'] if idstud in [i['id'] for i in myObj1]]}) for group in myObj2]

Related

Adding multiple elements to dictionary inside the list and appending the dictionary every time and element is added

brand = [1,2,3]
Consider this the list whose values I want to add to the list of dictionaries below:
r = [{'time': 1,
      'id': 1,
      'region': [{'brand': 1}]}]
Every time I add an element, I want a new dictionary to be created within the list. Example:
r = [{'time': 1,
      'id': 1,
      'region': [{'brand': 1}]},
     {'time': 1,
      'id': 1,
      'region': [{'brand': 2}]},
     {'time': 1,
      'id': 1,
      'region': [{'brand': 3}]}]
I am new to Python and am not able to figure out how to do it. Thanks in advance.
This should work:
l = [1,2,3]
r = []
for i in l:
    new_dict = {'time':1, 'id':1, 'region':[{"brand":i}]}
    r.append(new_dict)
Output:
[{'id': 1, 'region': [{'brand': 1}], 'time': 1},
{'id': 1, 'region': [{'brand': 2}], 'time': 1},
{'id': 1, 'region': [{'brand': 3}], 'time': 1}]
Edit: To address your question in the comment, bear in mind that this only works as long as brand and time have the same length.
brand = [1,2,3]
time = [1,2,3]
r = []
for i in range(len(brand)):
    new_dict = {'time':time[i], 'id':1, 'region':[{"brand":brand[i]}]}
    r.append(new_dict)
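The index-based loop can also be written with zip, which pairs the two lists element by element. This is only a sketch of an alternative; note that zip silently stops at the shorter list instead of raising an error when the lengths differ.

```python
brand = [1, 2, 3]
time = [1, 2, 3]

# zip pairs the two lists element by element and stops at the shorter one.
r = [{'time': t, 'id': 1, 'region': [{'brand': b}]}
     for t, b in zip(time, brand)]

print(r[0])  # {'time': 1, 'id': 1, 'region': [{'brand': 1}]}
```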
This should work too:
brand = [1,2,3]
r = list()
for i in brand:
    r.append(dict(time=1, id=1, region=[dict(brand=i)]))
Output:
[{'time': 1, 'id': 1, 'region': [{'brand': 1}]},
{'time': 1, 'id': 1, 'region': [{'brand': 2}]},
{'time': 1, 'id': 1, 'region': [{'brand': 3}]}]
And if you want you can set the values of 'time' and 'id' just like 'region':
brand = [1,2,3]
r = list()
for i in brand:
    r.append(dict(time=i, id=i, region=[dict(brand=i)]))
You can also do something like that:
brand = [1, 2, 3]
r = [{'time': 1, 'id': 1, 'region':[{"brand": i}]} for i in brand]

How to find latest entry for specific value in a dict?

I have a list of dictionaries which basically follows a structure like this:
elements = [{'id':1, 'date':1}, {'id':1, 'date':5}, {'id':2, 'date': 6}]
I want to write a function that only keeps the latest dict for each duplicate id, based on the date value:
[{'id':1, 'date':5}, {'id':2, 'date': 6}]
Is there an efficient way of doing this? So far I always end up with nested for loops and conditionals and I am sure there is a pythonic solution to this...
You don't need nested loops. This seems reasonably straightforward:
elements = [{'id':1, 'date':1}, {'id':1, 'date':5}, {'id':2, 'date': 6}]
latest = {}
for ele in elements:
    if ele['id'] not in latest or ele['date'] > latest[ele['id']]['date']:
        latest[ele['id']] = ele
print(list(latest.values()))
This will output:
[{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]
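Since the question asks for a function, the same single-pass idea could be wrapped up roughly as follows. The function name keep_latest and the sample input are assumptions made for this sketch; the sample puts the newest entry for id 1 first to show that input order does not matter.

```python
def keep_latest(elements):
    """Return one dict per id, keeping the entry with the largest date."""
    latest = {}
    for ele in elements:
        current = latest.get(ele['id'])
        if current is None or ele['date'] > current['date']:
            latest[ele['id']] = ele
    return list(latest.values())

# Hypothetical input in which the newest entry for id 1 comes first.
sample = [{'id': 1, 'date': 5}, {'id': 2, 'date': 6}, {'id': 1, 'date': 1}]
print(keep_latest(sample))  # [{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]
```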
First sort the list, then create a dict with id as the key and the elements of the list as values; later entries overwrite earlier ones with the same id.
Now you can extract the values from this dict:
>>> elements = [{'id':1, 'date':1}, {'id':1, 'date':5}, {'id':2, 'date': 6}]
>>> dct = {d['id']:d for d in sorted(elements, key=lambda d: list(d.values()))}
>>> list(dct.values())
[{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]
You can use itertools.groupby to group the dicts by id; operator.itemgetter is a better helper here than a lambda (make sure the list is sorted by id first).
As for getting the "latest", which I assume means the last entry for each id, you can use a collections.deque with maxlen=1 to get the last element in each group:
from itertools import groupby
from collections import deque
from operator import itemgetter
elements = [{'id':1, 'date':1}, {'id':1, 'date':5}, {'id':2, 'date': 6}]
get_id = itemgetter('id')
s_elements = sorted(elements, key=itemgetter('id', 'date'))
output = [deque(g, maxlen=1).pop() for _, g in groupby(s_elements, get_id)]
Output:
[{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]
Does this work? It turns the dicts into tuples and then sorts them. Then when converting back to a dictionary any duplicates will be replaced by the last one.
def remove_duplicates(elements):
    elements_as_tuples = sorted((d['id'], d['date']) for d in elements)
    return dict(elements_as_tuples)
print(remove_duplicates(elements))
{1: 5, 2: 6}
The output is not in the original dictionary format but will it do?
If not you could add these steps:
elements_dict = remove_duplicates(elements)
list_of_dicts = [{'id': k, 'date': v} for k, v in elements_dict.items()]
print(list_of_dicts)
[{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]

Deleting an element from a list inside a dict in Python

{
'tbl':'test',
'col':[
{
'id':1,
'name':"a"
},
{
'id':2,
'name':"b"
},
{
'id':3,
'name':"c"
}
]
}
I have a dictionary like the one above and I want to remove the element with id=2 from the list inside it. I wasted half a day wondering why modify2 is not working with del operation. Tried pop and it seems to be working but I don't completely understand why del doesn't work.
Is there a way to delete using del or pop is the ideal way to address this use case?
import copy
test_dict = {'tbl': 'test', 'col':[{'id':1, 'name': "a"}, {'id':2, 'name': "b"}, {'id':3, 'name': "c"}]}
def modify1(dict):
    new_dict = copy.deepcopy(dict)
    # new_dict = dict.copy()
    for i in range(len(dict['col'])):
        if dict['col'][i]['id'] == 2:
            new_dict['col'].pop(i)
    return new_dict
def modify2(dict):
    new_dict = copy.deepcopy(dict)
    # new_dict = dict.copy()
    for i in new_dict['col']:
        if i['id']==2:
            del i
    return new_dict
print("Output 1 : " + str(modify1(test_dict)))
print("Output 2 : " + str(modify2(test_dict)))
Output:
Output 1 : {'tbl': 'test', 'col': [{'id': 1, 'name': 'a'}, {'id': 3, 'name': 'c'}]}
Output 2 : {'tbl': 'test', 'col': [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}, {'id': 3, 'name': 'c'}]}
I tried looking for answers on similar questions but didn't find the one that clears my confusion.
In Python 3, you can do this:
test_dict = {**test_dict, 'col': [x for x in test_dict['col'] if x['id'] != 2]}
del i just tells the interpreter that i (an arbitrary local variable/name that happens to reference to a dictionary) should not reference that dictionary any more. It does not change the content of that dictionary whatsoever.
This can be visualized on http://www.pythontutor.com/visualize.html:
Before del i. Note i references the second dictionary (noted by the blue line):
After del i. Note how the local variable i is removed from the local namespace (the blue box) but the dictionary it referenced to still exists.
Contrary to del i (which only removes the name i and leaves the dictionary it referenced untouched), new_dict['col'].pop(i) modifies the list itself, just as dict.pop(key) modifies a dictionary.
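The difference between deleting a name and deleting a list item can be checked with a few lines. This is a minimal sketch using the 'col' list from the question; the variable names after_del_name and after_del_item are introduced here for illustration.

```python
col = [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}, {'id': 3, 'name': 'c'}]

i = col[1]          # i is just another name for the second dictionary
del i               # removes the name only; the list still holds the dictionary
after_del_name = len(col)

del col[1]          # removes the list's own reference at index 1
after_del_item = len(col)

print(after_del_name, after_del_item)  # 3 2
```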
This is one approach using a comprehension.
Ex:
data = {'tbl': 'test', 'col': [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}, {'id': 3, 'name': 'c'}]}
data['col'] = [i for i in data['col'] if i["id"] != 2]
print(data)
Output:
{'col': [{'id': 1, 'name': 'a'}, {'id': 3, 'name': 'c'}], 'tbl': 'test'}
The reason it is not working is that you are using del wrong.
If you have a dictionary d = {'a': [{'id':1}, {'id':2}]}, then to delete the second element of the list you use del d['a'][1], which leaves
d = {'a': [{'id':1}]}
So for your problem, iterate to find the position of id 2 in the list, and then simply do del dict['col'][ix], where ix is the index of id 2 in the list.
You can't delete a list element through the loop variable (the i) of the for loop:
l = [1,2,3]
for i in l:
    if i == 2:
        del i
This won't work; l will still be [1,2,3].
What you can do is get the index of that element and delete it by index:
l = [1,2,3]
for idx, elem in enumerate(l):
    if elem == 2:
        del l[idx]
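Deleting by index inside the loop works for a single match, but it can skip elements when matches sit next to each other, because every deletion shifts the remaining items left. The sketch below uses a made-up list with two adjacent matches to show the effect, alongside a comprehension that builds a new list instead.

```python
# Hypothetical list with two adjacent matches.
values = [1, 2, 2, 3]

# Deleting inside the loop shifts later items left, so the second 2 is skipped.
in_place = list(values)
for idx, elem in enumerate(in_place):
    if elem == 2:
        del in_place[idx]

# Building a new list avoids the problem.
filtered = [elem for elem in values if elem != 2]

print(in_place)  # [1, 2, 3]
print(filtered)  # [1, 3]
```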

Is there better way to merge dictionaries contained in two lists in Python? [closed]

I have two lists containing dictionaries:
list_a:
[{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'},....]
list_b:
[{'id': 1, 'age': 10}, {'id': 2, 'age': 20}, ....]
I want to merge these two lists with the result being:
[{'id': 1, 'name': 'test', 'age': 10}, {'id': 2, 'name': 'test1', 'age': 20}....]
I want to use a nested loop to do it:
result = []
for i in list_a:
    for j in list_b:
        if i['id'] == j['id']:
            i['age'] = j['age']
            result.append(i)
However, list_a has 2000 elements. Every id in list_b also appears in list_a, but list_b may have fewer than 2000 elements. The time complexity of this method is too high; is there a better way to merge them?
Not really, but dict.setdefault and dict.update probably are your friends for this.
data = {}
lists = [
[{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'},],
[{'id': 1, 'age': 10}, {'id': 2, 'age': 20},]
]
for each_list in lists:
    for each_dict in each_list:
        data.setdefault(each_dict['id'], {}).update(each_dict)
Result:
>>> data
{1: {'age': 10, 'id': 1, 'name': 'test'},
2: {'age': 20, 'id': 2, 'name': 'test1'}}
This way you can look up by id (or just get data.values() if you want a plain list). It's been 20 years since I took my algorithms class, but I guess this is close to O(n), while your sample is closer to O(n²). This solution has some interesting properties: it does not mutate the original lists, works for any number of lists, and works for uneven lists containing distinct sets of "id".
answer = {}
for d in list_a:
    answer[d['id']] = d
for d in list_b:
    if d['id'] not in answer:  # this id appears only in list_b
        answer[d['id']] = d
        continue
    for k, v in d.items():
        answer[d['id']][k] = v
answer = [d for k, d in sorted(answer.items(), key=lambda s: s[0])]
No, I think this is the best way because you are joining all data in the simplest data structure.
You can know how to implement it here
I hope my answer will be helpful for you.
It could be done in one line, provided the items in list1 and list2 are in the same order by id.
result = [item1 for item1, item2 in zip(list1, list2) if not item1.update(item2)]
Or, more verbosely:
for item1, item2 in zip(list1, list2):
    item1.update(item2)
# list1 will be mutated to the result
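The question mentions that list_b may be shorter than list_a, in which case pairing by position no longer lines up. A rough sketch that pairs by id instead is shown below; the third entry of list_a, the ordering of list_b, and the name b_by_id are assumptions made for illustration.

```python
list_a = [{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'},
          {'id': 3, 'name': 'test2'}]
list_b = [{'id': 2, 'age': 20}, {'id': 1, 'age': 10}]  # shorter, different order

# Index list_b by id once (b_by_id is a name introduced for this sketch).
b_by_id = {item['id']: item for item in list_b}

# One pass over list_a; entries without a match are copied unchanged.
result = [{**a, **b_by_id.get(a['id'], {})} for a in list_a]

print(result[0])  # {'id': 1, 'name': 'test', 'age': 10}
print(result[2])  # {'id': 3, 'name': 'test2'}
```

This builds new dictionaries rather than mutating list_a, and does one dictionary lookup per element instead of an inner loop.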
To find a better way one needs to know how the values are generated and how they will be used.
For example, if you have them as CSV files, you can use a table-like module such as pandas (I'll create the frames from your lists, but pandas can also load CSV files directly with read_csv):
import pandas as pd
df1 = pd.DataFrame.from_dict([{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'}])
df2 = pd.DataFrame.from_dict([{'id': 1, 'age': 10}, {'id': 2, 'age': 20}])
pd.merge(df1, df2, on='id')
Or, if they come from a database, most databases already support JOIN ... ON (for example MySQL).

remove a dict entry in a list of dict python

How can I remove a key from a dict in a list?
For example:
My_list= [{'ID':0,'Name':'Paul','phone':'1234'},{'ID':1,'Name':'John','phone':'5678'}]
I want to remove the phone key from the entry with ID 1:
My_list= [{'ID':0,'Name':'Paul','phone':'1234'},{'ID':1,'Name':'John'}]
Thanks in advance for your help.
Just iterate through the list, check whether 'ID' equals 1, and if so delete the 'phone' key. This should work:
for d in My_list:
    if d["ID"] == 1:
        del d["phone"]
And finally print the list:
print(My_list)
When the id you are looking for matches 1, then reconstruct the dictionary excluding the key phone, otherwise use the dictionary as it is, like this
l = [{'ID': 0, 'Name': 'Paul', 'phone': '1234'},
{'ID': 1, 'Name': 'John', 'phone': '5678'}]
k, f = 1, {"phone"}
print([{k: i[k] for k in i.keys() - f} if i["ID"] == k else i for i in l])
# [{'phone': '1234', 'ID': 0, 'Name': 'Paul'}, {'ID': 1, 'Name': 'John'}]
Here, k is the value of ID you are looking for and f is a set of keys which need to be excluded in the resulting dictionary, if the id matches.
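If some matching entries might not have the key at all, dict.pop with a default is a forgiving alternative to del. The sketch below adds a made-up third entry without a 'phone' key to show that case.

```python
My_list = [{'ID': 0, 'Name': 'Paul', 'phone': '1234'},
           {'ID': 1, 'Name': 'John', 'phone': '5678'},
           {'ID': 1, 'Name': 'Jane'}]  # hypothetical entry with no 'phone' key

for d in My_list:
    if d['ID'] == 1:
        # pop with a default does nothing when the key is absent,
        # whereas del d['phone'] would raise KeyError.
        d.pop('phone', None)

print(My_list[1])  # {'ID': 1, 'Name': 'John'}
```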
