How to find latest entry for specific value in a dict? - python

I have a list of dictionaries which basically follows a structure like this:
elements = [{'id':1, 'date':1}, {'id':1, 'date':5}, {'id':2, 'date': 6}]
I want to write a function that only keeps the latest dict for each duplicate id, based on the date value:
[{'id':1, 'date':5}, {'id':2, 'date': 6}]
Is there an efficient way of doing this? So far I always end up with nested for loops and conditionals and I am sure there is a pythonic solution to this...

You don't need nested loops. This seems reasonably straightforward:
elements = [{'id':1, 'date':1}, {'id':1, 'date':5}, {'id':2, 'date': 6}]
latest = {}
for ele in elements:
    if ele['id'] not in latest or ele['date'] > latest[ele['id']]['date']:
        latest[ele['id']] = ele
print(list(latest.values()))
This will output:
[{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]
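Since the question asks for a function, the same loop can be wrapped in one. This is only a minimal sketch: the name keep_latest and its id_key/date_key parameters are made up for illustration, and it assumes every dict has both keys and that the date values can be compared with >.

```python
def keep_latest(elements, id_key='id', date_key='date'):
    """Return one dict per id: the one with the largest date (hypothetical helper)."""
    latest = {}
    for ele in elements:
        current = latest.get(ele[id_key])
        # Keep the element if the id is new or its date is more recent
        if current is None or ele[date_key] > current[date_key]:
            latest[ele[id_key]] = ele
    return list(latest.values())


elements = [{'id': 1, 'date': 1}, {'id': 1, 'date': 5}, {'id': 2, 'date': 6}]
print(keep_latest(elements))
# [{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]
```

This makes a single pass over the list, so no sorting is needed.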

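First sort the list, then create a dict with id as the key and the elements of the list as values. Because later entries overwrite earlier ones with the same key, the entry with the latest date wins for each id. (The sort key below relies on each dict listing id before date.)
Now you can extract the values from this dict: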
>>> elements = [{'id':1, 'date':1}, {'id':1, 'date':5}, {'id':2, 'date': 6}]
>>> dct = {d['id']:d for d in sorted(elements, key=lambda d: list(d.values()))}
>>> list(dct.values())
[{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]

You can use itertools.groupby to group the dicts by id; operator.itemgetter is a better helper for this than a lambda (make sure the list is sorted by id first).
As for getting the "latest", which I assume means the last entry for each id, you can use a collections.deque with maxlen=1 to keep only the last element of each group:
from itertools import groupby
from collections import deque
from operator import itemgetter
elements = [{'id':1, 'date':1}, {'id':1, 'date':5}, {'id':2, 'date': 6}]
get_id = itemgetter('id')
s_elements = sorted(elements, key=itemgetter('id', 'date'))
output = [deque(g, maxlen=1).pop() for _, g in groupby(s_elements, get_id)]
Output:
[{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]
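A possible variation on the groupby idea, offered as a sketch rather than a replacement: sort by id only and pick the entry with the largest date in each group using max. The sample list is deliberately shuffled here to show that the order of dates within a group does not matter.

```python
from itertools import groupby
from operator import itemgetter

elements = [{'id': 2, 'date': 6}, {'id': 1, 'date': 5}, {'id': 1, 'date': 1}]
get_id = itemgetter('id')

# groupby only groups adjacent items, so sort by id first
by_id = sorted(elements, key=get_id)
# max picks the entry with the largest date within each id group
output = [max(group, key=itemgetter('date')) for _, group in groupby(by_id, get_id)]
print(output)
# [{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]
```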

Does this work? It turns the dicts into tuples and then sorts them. Then when converting back to a dictionary any duplicates will be replaced by the last one.
def remove_duplicates(elements):
    elements_as_tuples = sorted((d['id'], d['date']) for d in elements)
    return dict(elements_as_tuples)
print(remove_duplicates(elements))
{1: 5, 2: 6}
The output is not in the original dictionary format but will it do?
If not you could add these steps:
elements_dict = remove_duplicates(elements)
list_of_dicts = [{'id': k, 'date': v} for k, v in elements_dict.items()]
print(list_of_dicts)
[{'id': 1, 'date': 5}, {'id': 2, 'date': 6}]

Related

Adding multiple elements to dictionary inside the list and appending the dictionary every time an element is added

brand = [1,2,3]
Consider this the list whose values I want to add to the list of dictionaries below:
r = [{'time': 1,
      'id': 1,
      'region': [{'brand': 1}]}]
Every time I add an element, I want a new dictionary to be created within the list. Example:
r = [{'time': 1,
      'id': 1,
      'region': [{'brand': 1}]},
     {'time': 1,
      'id': 1,
      'region': [{'brand': 2}]},
     {'time': 1,
      'id': 1,
      'region': [{'brand': 3}]}]
I am new to Python and not able to figure out how to do it. Thanks in advance.
This should work:
l = [1,2,3]
r = []
for i in l:
    new_dict = {'time':1, 'id':1, 'region':[{"brand":i}]}
    r.append(new_dict)
Output:
[{'id': 1, 'region': [{'brand': 1}], 'time': 1},
{'id': 1, 'region': [{'brand': 2}], 'time': 1},
{'id': 1, 'region': [{'brand': 3}], 'time': 1}]
Edit: To address your question in the comment, bear in mind that this only works as long as brand and time are of the same length.
brand = [1,2,3]
time = [1,2,3]
r = []
for i in range(len(brand)):
    new_dict = {'time':time[i], 'id':1, 'region':[{"brand":brand[i]}]}
    r.append(new_dict)
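When two lists have to be walked in step like this, zip avoids the index arithmetic. A small sketch with made-up time values; note that zip silently stops at the shorter list, so the same-length caveat still applies.

```python
brand = [1, 2, 3]
time = [10, 20, 30]  # hypothetical values, one per brand

# Pair each time with its brand and build one dictionary per pair
r = [{'time': t, 'id': 1, 'region': [{'brand': b}]} for t, b in zip(time, brand)]
print(r)
```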
This should work too:
brand = [1,2,3]
r = list()
for i in brand:
    r.append(dict(time=1, id=1, region=[dict(brand=i)]))
Output:
[{'time': 1, 'id': 1, 'region': [{'brand': 1}]},
{'time': 1, 'id': 1, 'region': [{'brand': 2}]},
{'time': 1, 'id': 1, 'region': [{'brand': 3}]}]
And if you want you can set the values of 'time' and 'id' just like 'region':
brand = [1,2,3]
r = list()
for i in brand:
    r.append(dict(time=i, id=i, region=[dict(brand=i)]))
You can also use a list comprehension:
brand = [1, 2, 3]
r = [{'time': 1, 'id': 1, 'region':[{"brand": i}]} for i in brand]

How to unpack list of attributes to .filter argument of sqlalchemy orm query? [duplicate]

My code is
index = 0
for key in dataList[index]:
    print(dataList[index][key])
Seems to work fine for printing the values of dictionary keys for index = 0. However, I can't figure out how to iterate through an unknown number of dictionaries in dataList.
You could iterate over the indices of your list using range(len(...)):
dataList = [{'a': 1}, {'b': 3}, {'c': 5}]
for index in range(len(dataList)):
    for key in dataList[index]:
        print(dataList[index][key])
Or you could use a while loop with an index counter:
dataList = [{'a': 1}, {'b': 3}, {'c': 5}]
index = 0
while index < len(dataList):
    for key in dataList[index]:
        print(dataList[index][key])
    index += 1
You could even iterate over the elements in the list directly:
dataList = [{'a': 1}, {'b': 3}, {'c': 5}]
for dic in dataList:
    for key in dic:
        print(dic[key])
You could even avoid the lookups by iterating over the values of the dictionaries directly:
dataList = [{'a': 1}, {'b': 3}, {'c': 5}]
for dic in dataList:
    for val in dic.values():
        print(val)
Or wrap the iterations inside a list-comprehension or a generator and unpack them later:
dataList = [{'a': 1}, {'b': 3}, {'c': 5}]
print(*[val for dic in dataList for val in dic.values()], sep='\n')
The possibilities are endless. Which one you use is a matter of preference.
You can easily do this:
for dict_item in dataList:
    for key in dict_item:
        print(dict_item[key])
It will iterate over the list, and for each dictionary in the list, it will iterate over the keys and print its values.
use = [{'id': 29207858, 'isbn': '1632168146', 'isbn13': '9781632168146', 'ratings_count': 0}]
for dic in use:
    for val, cal in dic.items():
        print(f'{val} is {cal}')
The shortest way (a single line) to extract data from a list of dictionaries is to combine map with a lambda:
def extract_fullnames_as_string(list_of_dictionaries):
    return list(map(lambda e: "{} {}".format(e['first'], e['last']), list_of_dictionaries))
names = [{'first': 'Zhibekchach', 'last': 'Myrzaeva'}, {'first': 'Gulbara', 'last': 'Zholdoshova'}]
print(extract_fullnames_as_string(names))
The approach that offers the most flexibility, and seems more dynamically appropriate to me, is to loop through the list inside a function:
def extract_fullnames_as_string(list_of_dictionaries):
    # Flatten the values of every dictionary into a single list
    result = [val for dic in list_of_dictionaries for val in dic.values()]
    return 'My Dictionary List is = {}'.format(result)
dataList = [{'first': 3, 'last': 4}, {'first': 5, 'last': 7}, {'first': 15, 'last': 9}, {'first': 51, 'last': 71}, {'first': 53, 'last': 79}]
print(extract_fullnames_as_string(dataList))
This way, dataList can hold dictionaries with any keys you throw at it; otherwise, I found, you can end up dealing with format issues. Try the following and it will still work:
dataList1 = [{'a': 1}, {'b': 3}, {'c': 5}]
dataList2 = [{'first': 'Zhibekchach', 'last': 'Myrzaeva'}, {'first': 'Gulbara', 'last': 'Zholdoshova'}]
print(extract_fullnames_as_string(dataList1))
print(extract_fullnames_as_string(dataList2))
Another pythonic solution is using collections module.
Here is an example where I want to generate a dict containing only 'Name' and 'Last Name' values:
from collections import defaultdict
test_dict = [{'Name': 'Maria', 'Last Name': 'Bezerra', 'Age': 31},
{'Name': 'Ana', 'Last Name': 'Mota', 'Age': 31},
{'Name': 'Gabi', 'Last Name': 'Santana', 'Age': 31}]
collect = defaultdict(dict)
# on each iteration, 'key' is one dict from your list of dicts
for key in test_dict:
    collect[key['Name']] = key['Last Name']
print(dict(collect))
Output:
{'Maria': 'Bezerra', 'Ana': 'Mota', 'Gabi': 'Santana'}
There are multiple ways to iterate through a list of dictionaries. However, if you are into Pythonic code, consider the following ways, but first, let's use data_list instead of dataList because in Python snake_case is preferred over camelCase.
Way #1: Iterating over a dictionary's keys
# let's assume that data_list is the following list of dictionaries
data_list = [{'Alice': 10}, {'Bob': 7}, {'Charlie': 5}]
for element in data_list:
    for key in element:
        print(key, element[key])
Output
Alice 10
Bob 7
Charlie 5
Explanation:
for element in data_list: -> element is one dictionary from data_list at each iteration, i.e., {'Alice': 10} in the first iteration, {'Bob': 7} in the second, and {'Charlie': 5} in the third.
for key in element: -> key is a key of element at each iteration, so when element is {'Alice': 10}, key will be 'Alice'. Keep in mind that element could contain more keys, but in this particular example it has just one.
print(key, element[key]) -> prints key and the value stored in element under key, i.e., it accesses the value of key in element.
Way #2: Iterating over a dictionary's keys and values
# let's assume that data_list is the following list of dictionaries
data_list = [{'Alice': 10}, {'Bob': 7}, {'Charlie': 5}]
for element in data_list:
    for key, value in element.items():
        print(key, value)
The output for this code snippet is the same as the previous one.
Explanation:
for element in data_list: -> it has the same explanation as the one in the code before.
for key, value in element.items(): -> at each iteration, element.items() yields a tuple of two elements, which is unpacked into key and value. The first is the key and the second is the value associated with that key, so when element is {'Alice': 10}, key will be 'Alice' and value will be 10. Keep in mind that this dictionary has only one key-value pair.
print(key, value) -> it prints key and value.
As stated before, there are multiple ways to iterate through a list of dictionaries, but to keep your code more Pythonic, avoid using indices or while loops.
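If the position of each dictionary is actually needed, enumerate supplies it without manual index bookkeeping. A short sketch reusing the sample data from above; the lines list is only there to collect the output and is not part of the original answer.

```python
data_list = [{'Alice': 10}, {'Bob': 7}, {'Charlie': 5}]

lines = []
# enumerate yields (position, element) pairs, so no counter variable is needed
for position, element in enumerate(data_list):
    for key, value in element.items():
        lines.append(f'{position}: {key} {value}')

print('\n'.join(lines))
# 0: Alice 10
# 1: Bob 7
# 2: Charlie 5
```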
I had a similar issue and fixed it by using a single for loop to iterate over the list; see the code snippet:
de = {"file_name":"jon","creation_date":"12/05/2022","location":"phc","device":"s3","day":"1","time":"44692.5708703703","year":"1900","amount":"3000","entity":"male"}
se = {"file_name":"bone","creation_date":"13/05/2022","location":"gar","device":"iphone","day":"2","time":"44693.5708703703","year":"2022","amount":"3000","entity":"female"}
re = {"file_name":"cel","creation_date":"12/05/2022","location":"ben car","device":"galaxy","day":"1","time":"44695.5708703703","year":"2022","amount":"3000","entity":"male"}
te = {"file_name":"teiei","creation_date":"13/05/2022","location":"alcon","device":"BB","day":"2","time":"44697.5708703703","year":"2022","amount":"3000","entity":"female"}
ye = {"file_name":"js","creation_date":"12/05/2022","location":"woji","device":"Nokia","day":"1","time":"44699.5708703703","year":"2022","amount":"3000","entity":"male"}
ue = {"file_name":"jsdjd","creation_date":"13/05/2022","location":"town","device":"M4","day":"5","time":"44700.5708703703","year":"2022","amount":"3000","entity":"female"}
d_list = [de,se,re,te,ye,ue]
for dic in d_list:
    print(dic['file_name'], dic['creation_date'])

Generate list from two different lists by key

Considering the following structure:
myObj1 = [{"id":1, "name":"john"},
{"id":2, "name":"roger"},
{"id":3, "name":"carlos"}]
myObj2 = [{"group": "myGroup1","persons":[1, 2, 3]},
{"group": "myGroup2", "persons":[2]},
{"group": "myGroup3", "persons":[1,3]}]
I would like to produce the following result:
result = [{"group": "myGroup1","persons":[{"id":1, "name":"john"},
{"id":2, "name":"roger"},
{"id":3, "name":"carlos"}]},
{"group": "myGroup2", "persons":[{"id":2, "name":"roger"}]},
{"group": "myGroup3", "persons":[{"id":1, "name":"john"},
{"id":3, "name":"carlos"}]}]
The challenge is to replace each value in the "persons" array with the entire myObj1 item whose id matches.
I could achieve that using something like 3 for loops, but I want to know if there's a Pythonic way of doing this using interpolation, map, filter, sets, etc. I'm new to the Python world, but I got this question from an interviewer, who told me that I was supposed to do it in 1-2 lines of code.
UPDATE:
Here's what was my newbie approach:
for item in myObj1:
    id = item["id"]
    for item2 in myObj2:
        for i in range(len(item2["persons"])):
            if item2["persons"][i] == id:
                item2["persons"][i] = item
result = myObj2.copy()
for d in result:
    d['persons'] = [[j for j in myObj1 if j['id']==i][0] for i in d['persons']]
result
Output:
[{'group': 'myGroup1',
'persons': [{'id': 1, 'name': 'john'},
{'id': 2, 'name': 'roger'},
{'id': 3, 'name': 'carlos'}]},
{'group': 'myGroup2', 'persons': [{'id': 2, 'name': 'roger'}]},
{'group': 'myGroup3',
'persons': [{'id': 1, 'name': 'john'}, {'id': 3, 'name': 'carlos'}]}]
How about the following:
result = [dict(x) for x in myObj2]
for grp in result:
    grp["persons"] = [p for p in myObj1 if p["id"] in grp["persons"]]
We create a new list (using dict(x) to ensure we don't retain references to the elements of myObj2), and then update accordingly.
You can try this:
myObj1 = [{"id":1, "name":"john"},
{"id":2, "name":"roger"},
{"id":3, "name":"carlos"}]
myObj2 = [{"group": "myGroup1","persons":[1, 2, 3]},
{"group": "myGroup2", "persons":[2]},
{"group": "myGroup3", "persons":[1,3]}]
final_dict = [{a:b if a != "persons" else c for a,b in d.items()} for c, d in zip(myObj1, myObj2)]
Output:
[{'persons': {'id': 1, 'name': 'john'}, 'group': 'myGroup1'}, {'persons': {'id': 2, 'name': 'roger'}, 'group': 'myGroup2'}, {'persons': {'id': 3, 'name': 'carlos'}, 'group': 'myGroup3'}]
What you're trying to do is essentially a group by operation followed by mapping over dictionary values. This is an example of where the itertools module really shines.
from itertools import chain, groupby
def concat(lists):
    """
    Helper function to make concatenating lists/iterables easier
    """
    return list(chain.from_iterable(lists))
by_group = {
    id: list(people)
    for id, people in groupby(myObj1, key=lambda person: person['id'])
}
result = [
    {'group': group['group'],
     'persons': concat(by_group[id] for id in group['persons'])}
    for group in myObj2
]
In this example, you still need the for-loops, but it is now clear what those loops are trying to do. The first is making an intermediate data structure to keep track of who has what id. The second is then going through another data structure and calculating who's in what group based on the groupby operation.
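The "intermediate data structure" idea can also be expressed with a plain dictionary keyed by id, which gets close to the one-or-two-line form the interviewer asked for. This is a sketch, not the interviewer's intended answer: the name people_by_id is made up here, and the code assumes every id listed under "persons" exists in myObj1 (otherwise it raises KeyError).

```python
myObj1 = [{"id": 1, "name": "john"},
          {"id": 2, "name": "roger"},
          {"id": 3, "name": "carlos"}]
myObj2 = [{"group": "myGroup1", "persons": [1, 2, 3]},
          {"group": "myGroup2", "persons": [2]},
          {"group": "myGroup3", "persons": [1, 3]}]

# Build the id -> person lookup once, then resolve every id through it
people_by_id = {person["id"]: person for person in myObj1}
result = [{**group, "persons": [people_by_id[pid] for pid in group["persons"]]}
          for group in myObj2]
print(result)
```

Each lookup is a constant-time dictionary access, and myObj2 itself is left unmodified because every group is copied into a new dict.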
One approach is to substitute the values for the "persons" key in place, like this:
[group.update({'persons':[myObj1[next(index for (index, d) in enumerate(myObj1) if d["id"] == idstud)] for idstud in group['persons'] if idstud in [i['id'] for i in myObj1]]}) for group in myObj2]

Python - Get dictionary element in a list of dictionaries after an if statement

How can I get a dictionary value in a list of dictionaries, based on the dictionary satisfying some condition? For instance, if one of the dictionaries in the list has the id=5, I want to print the value corresponding to the name key of that dictionary:
list = [{'name': 'Mike', 'id': 1}, {'name': 'Ellen', 'id': 5}]
id = 5
if any(m['id'] == id for m in list):
    print(m['name'])
This won't work because m is not defined outside the if statement.
You have a list of dictionaries, so you can use a list comprehension:
[d for d in lst if d['id'] == 5]
# [{'id': 5, 'name': 'Ellen'}]
new_list = [m['name'] for m in list if m['id']==5]
print('\n'.join(new_list))
This will be easy to accomplish with a single for-loop:
for d in list:
    if 'id' in d and d['id'] == 5:
        print(d['name'])
There are two key concepts to learn here. The first is that we used a for loop to "go through each element of the list". The second is that we used the in keyword to check whether a dictionary had a certain key.
How about the following?
for entry in list:
    if entry['id'] == 5:
        print(entry['name'])
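ChainMap doesn't exist in Python 2, but in Python 3 you could use a collections.ChainMap instead of a list. Note that a lookup returns the value from the first mapping that contains the key, so this checks whether any dictionary has an 'id' key rather than finding the one whose id is 5.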
import collections
d = collections.ChainMap(*[{'name':'Mike', 'id': 1}, {'name':'Ellen', 'id': 5}])
if 'id' in d:
    print(d['id'])
You can do it by using the filter function:
lis = [ {'name': 'Mike', 'id': 1}, {'name':'Ellen', 'id': 5}]
# filter returns an iterator in Python 3, so take the first match with next()
result = next(filter(lambda dic: dic['id'] == 5, lis))['name']
print(result)
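To fetch only the first matching dictionary without building a list, and without failing when nothing matches, next can be given a generator expression and a default. A minimal sketch; the names people and wanted_id are chosen here to avoid shadowing the built-ins list and id.

```python
people = [{'name': 'Mike', 'id': 1}, {'name': 'Ellen', 'id': 5}]
wanted_id = 5

# next returns the first match, or the default (None) if there is none
match = next((person for person in people if person['id'] == wanted_id), None)
if match is not None:
    print(match['name'])  # Ellen

# An id that is not present yields the default instead of raising StopIteration
missing = next((person for person in people if person['id'] == 42), None)
```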

Is there better way to merge dictionaries contained in two lists in Python? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 6 years ago.
I have two lists containing dictionaries:
list_a:
[{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'},....]
list_b:
[{'id': 1, 'age': 10}, {'id': 2, 'age': 20}, ....]
I want to merge these two lists with the result being:
[{'id': 1, 'name': 'test', 'age': 10}, {'id': 2, 'name': 'test1', 'age': 20}....]
I want to use a nested loop to do it:
result= []
for i in list_a:
    for j in list_b:
        if i['id'] == j['id']:
            i['age'] = j['age']
            result.append(i)
However, list_a has 2000 elements. Every id in list_b also appears in list_a, but list_b may have fewer than 2000 elements. The time complexity of this method is too high; is there a better way to merge them?
Not really, but dict.setdefault and dict.update probably are your friends for this.
data = {}
lists = [
[{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'},],
[{'id': 1, 'age': 10}, {'id': 2, 'age': 20},]
]
for each_list in lists:
    for each_dict in each_list:
        data.setdefault(each_dict['id'], {}).update(each_dict)
Result:
>>> data
{1: {'age': 10, 'id': 1, 'name': 'test'},
2: {'age': 20, 'id': 2, 'name': 'test1'}}
This way you can look up by id (or just take data.values() if you want a plain list). It's been 20 years since I took my algorithms class, but I guess this is close to O(n), while your sample is closer to O(n²). This solution has some interesting properties: it does not mutate the original lists, it works for any number of lists, and it works for uneven lists containing distinct sets of "id".
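The setdefault/update pattern can be packaged as a small helper. This is a sketch under stated assumptions: merge_by_id and its key parameter are hypothetical names, and the third entry in list_a is added only to show what happens when list_b has no matching id.

```python
def merge_by_id(*lists, key='id'):
    """Merge dicts that share the same key value across any number of lists."""
    merged = {}
    for each_list in lists:
        for each_dict in each_list:
            # Create an empty dict for a new id, then copy the fields into it
            merged.setdefault(each_dict[key], {}).update(each_dict)
    return list(merged.values())


list_a = [{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'}, {'id': 3, 'name': 'test2'}]
list_b = [{'id': 1, 'age': 10}, {'id': 2, 'age': 20}]
result = merge_by_id(list_a, list_b)
print(result)
```

Entries without a counterpart in the other list are kept as they are, and the input dictionaries are not modified.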
answer = {}
for d in list_a:
    answer[d['id']] = d
for d in list_b:
    # An id that is not in list_a gets a new entry
    if d['id'] not in answer:
        answer[d['id']] = d
        continue
    for k, v in d.items():
        answer[d['id']][k] = v
answer = [d for k, d in sorted(answer.items(), key=lambda s: s[0])]
No, I think this is the best way because you are joining all data in the simplest data structure.
You can know how to implement it here
I hope my answer will be helpful for you.
It could be done in one line, given the items in list1 and list2 are in the same order, id wise, I mean.
result = [item1 for item1, item2 in zip(list1, list2) if not item1.update(item2)]
For a more lengthy one
for item1, item2 in zip(list1, list2):
    item1.update(item2)
# list1 will be mutated to the result
To find a better way one needs to know how the values are generated and how they will be used.
For example, if you have them as CSV files, you can use a table-like module such as pandas (I'll create the frames from your lists, but pandas can also load them directly with read_csv):
import pandas as pd
df1 = pd.DataFrame.from_dict([{'id': 1, 'name': 'test'}, {'id': 2, 'name': 'test1'}])
df2 = pd.DataFrame.from_dict([{'id': 1, 'age': 10}, {'id': 2, 'age': 20}])
pd.merge(df1, df2, on='id')
Or, if they come from a database, most databases (for example, MySQL) already support JOIN ... ON.
