How to search in list - python

I'm getting a list from MongoDB and sorting it:
from operator import itemgetter
results = list(db1.zaklad.find({"name": "cola", "stav": '+'}))
sorted_results = sorted(results, key=itemgetter('weight'), reverse=True)
print(sorted_results)
I'm getting: [{'_id': ObjectId('5a13a8c396fb3488bb6a0648'), 'name': 'cola', 'weight': '3', 'url': 'goo.gl/2BgLmm', 'stav': '+', 'time_exp': datetime.datetime(2017, 11, 17, 23, 37, 31, 946000)}, {'_id': ObjectId('5a13a8bc96fb3488bb6a0647'), 'name': 'cola', 'weight': '2', 'url': 'goo.gl/2BgLmm', 'stav': '+', 'time_exp': datetime.datetime(2017, 11, 17, 23, 37, 31, 946000)}, {'_id': ObjectId('5a13a8ca96fb3488bb6a0649'), 'name': 'cola', 'weight': '2', 'url': 'goo.gl/2BgLmm', 'stav': '+', 'time_exp': datetime.datetime(2017, 11, 17, 23, 37, 31, 946000)}]
From this list I want to get all the distinct weights (in the example above: 3, 2).
So, how do I search in this list?
Or is it better to build a dictionary with dict(enumerate(results))?
Thanks for your help.

If the list is already sorted, create a new list, put the first weight of sorted_results in it, remember that weight, and iterate over the remaining items:
If an item has the same weight as the remembered one, ignore it; if it has a different weight, add it to the new list and remember the new weight instead of the previous one.
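The procedure described above could be sketched roughly as follows. The function name `unique_adjacent_weights` and the trimmed-down sample records are made up for illustration, and the sketch assumes the list is already sorted by weight, as sorted_results is in the question.

```python
def unique_adjacent_weights(sorted_results):
    """Collect each weight once from a list already sorted by 'weight'."""
    unique = []
    for item in sorted_results:
        weight = item["weight"]
        # Equal weights are adjacent in a sorted list, so comparing with
        # the most recently kept weight is enough to skip duplicates.
        if not unique or unique[-1] != weight:
            unique.append(weight)
    return unique


# Trimmed-down stand-in for sorted_results (hypothetical sample data).
sample = [{"weight": "3"}, {"weight": "2"}, {"weight": "2"}]
print(unique_adjacent_weights(sample))  # ['3', '2']
```

Because only neighbouring items are compared, this would let duplicates through on an unsorted list.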

Get the weights:
weights = [dic["weight"] for dic in results]
Remove duplicates by converting to a set (note that a set does not preserve the original order):
set(weights)
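If the order of first appearance matters, one possible variant is to deduplicate through dictionary keys instead of a set. This is a minimal sketch assuming Python 3.7 or later, where plain dicts keep insertion order; the `results` list here is a reduced stand-in for the MongoDB documents.

```python
# Reduced stand-in for the documents returned by the query (sample data).
results = [{"weight": "3"}, {"weight": "2"}, {"weight": "2"}]

weights = [doc["weight"] for doc in results]

# dict.fromkeys keeps one key per distinct value, in first-seen order.
unique_weights = list(dict.fromkeys(weights))
print(unique_weights)  # ['3', '2']
```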

If you want to look up records from results by their distinct weight value:
from collections import OrderedDict
ordered_results = OrderedDict((k['weight'], k) for k in sorted(results, key=lambda x: x['weight'], reverse=True))
This gives an ordered result:
OrderedDict([('3', {'stav': '+', 'name': 'cola', 'weight': '3', 'url': 'goo.gl/2BgLmm', 'time_exp': datetime.datetime(2017, 11, 17, 23, 37, 31, 946000), '_id': ObjectId('5a13a8c396fb3488bb6a0648')}), ('2', {'stav': '+', 'name': 'cola', 'weight': '2', 'url': 'goo.gl/2BgLmm', 'time_exp': datetime.datetime(2017, 11, 17, 23, 37, 31, 946000), '_id': ObjectId('5a13a8ca96fb3488bb6a0649')})])
You can get the record whose 'weight' value is '3' with ordered_results['3']. When several records share a weight, the last one in sorted order is kept.

Related

Python create list of dicts

I am new to Python and I am trying to construct a data structure from existing data.
I have the following:
[
{'UserName': 'aaa', 'AccessKeyId': 'AKIAYWQTISJD6X27YVK', 'Status': 'Active', 'CreateDate': datetime.datetime(2022, 9, 8, 15, 56, 39, tzinfo=tzutc())},
{'UserName': 'eee', 'AccessKeyId': 'AKIAYWQTISJD6QXMAKY', 'Status': 'Active', 'CreateDate': datetime.datetime(2023, 1, 24, 12, 30, 59, tzinfo=tzutc())},
{'UserName': 'eee', 'AccessKeyId': 'AKIAYWQTISJDUARK6FV', 'Status': 'Active', 'CreateDate': datetime.datetime(2023, 1, 24, 16, 58, 38, tzinfo=tzutc())}
]
I need to get this:
{
"aaa": [
{'AccessKeyId': 'AKIAYWQTISJD6X27YVK', 'Status': 'Active', 'CreateDate': datetime.datetime(2022, 9, 8, 15, 56, 39, tzinfo=tzutc())}],
"eee": [
{'AccessKeyId': 'AKIAYWQTISJD6QXMAKY', 'Status': 'Active', 'CreateDate': datetime.datetime(2023, 1, 24, 12, 30, 59, tzinfo=tzutc())},
{'AccessKeyId': 'AKIAYWQTISJDUARK6FV', 'Status': 'Active', 'CreateDate': datetime.datetime(2023, 1, 24, 16, 58, 38, tzinfo=tzutc())}
]
}
I tried the following:
import copy
list_per_user = {i['UserName']: copy.deepcopy(i) for i in key_list}
for obj in list_per_user:
    del list_per_user[obj]['UserName']
but this does not give me a list per user: if a user has two keys, only the last one is kept. I don't know how to build the list I need for each user.
Thanks!
Create an external dict that maps username -> list of entries.
data = [
{'UserName': 'aaa', 'AccessKeyId': 'AKIAYWQTISJD6X27YVK', 'Status': 'Active', 'CreateDate': datetime.datetime(2022, 9, 8, 15, 56, 39, tzinfo=tzutc())},
{'UserName': 'eee', 'AccessKeyId': 'AKIAYWQTISJD6QXMAKY', 'Status': 'Active', 'CreateDate': datetime.datetime(2023, 1, 24, 12, 30, 59, tzinfo=tzutc())},
{'UserName': 'eee', 'AccessKeyId': 'AKIAYWQTISJDUARK6FV', 'Status': 'Active', 'CreateDate': datetime.datetime(2023, 1, 24, 16, 58, 38, tzinfo=tzutc())}
]
new_data = {}
for entry in data:
    new_data.setdefault(entry["UserName"], []).append(
        {k: v for k, v in entry.items() if k != "UserName"}
    )
print(new_data)
Output (some fields hidden because I don't want to import those libraries in my repl, but they'll be there when you run it)
{'aaa': [{'AccessKeyId': 'AKIAYWQTISJD6X27YVK', 'Status': 'Active'}],
'eee': [{'AccessKeyId': 'AKIAYWQTISJD6QXMAKY', 'Status': 'Active'},
{'AccessKeyId': 'AKIAYWQTISJDUARK6FV', 'Status': 'Active'}]}
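The same grouping can also be written with collections.defaultdict, which some readers find easier to scan than setdefault. This is only a sketch: the function name `group_by_username` and the shortened key ids are made up for illustration, and the CreateDate field is left out of the sample to avoid extra imports.

```python
from collections import defaultdict


def group_by_username(entries):
    """Map each UserName to a list of its entries without the UserName key."""
    grouped = defaultdict(list)
    for entry in entries:
        # Build a new dict so the input entries are left untouched.
        rest = {k: v for k, v in entry.items() if k != "UserName"}
        grouped[entry["UserName"]].append(rest)
    return dict(grouped)


# Shortened, hypothetical sample data.
sample = [
    {"UserName": "aaa", "AccessKeyId": "KEY1", "Status": "Active"},
    {"UserName": "eee", "AccessKeyId": "KEY2", "Status": "Active"},
    {"UserName": "eee", "AccessKeyId": "KEY3", "Status": "Active"},
]
grouped = group_by_username(sample)
print(grouped)
```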

Python - filter list of dicts by top value in key

I'm trying to narrow down a list of dicts by filtering on the value of one of the keys.
My current code does that, but I don't know how to retain the entire dictionary rather than only the fields I filter by.
final_list = []
jobs = [glue_client.job_status(e) for e in j]
for e in jobs:
    for page in e:
        final_list.append(page["JobRuns"])
flat_list = [item for sublist in final_list for item in sublist]
sorted_list = sorted(flat_list, key=lambda k: (k['JobName'], k['StartedOn']), reverse=True)
# need to have following keys: "JobName", "JobRunState", "StartedOn" and "Id"
latest_jobs = [
    {'JobName': key, 'StartedOn': max(item['StartedOn'] for item in values)}
    for key, values in groupby(flat_list, lambda dct: dct['JobName'])
]
print(latest_jobs)
The data in the sorted_list variable looks like this:
list_of_dicts = [
{'JobName': 'a', 'StartedOn': datetime.datetime(2022, 10, 18, 13, 0, 47, 306000, tzinfo=tzlocal()), 'JobRunState': 'fail', 'id': 'xyz'},
{'JobName': 'a', 'StartedOn': datetime.datetime(2021, 10, 18, 13, 0, 47, 306000, tzinfo=tzlocal()), 'JobRunState': 'ok', 'id': 'xyz'},
{'JobName': 'b', 'StartedOn': datetime.datetime(2022, 10, 18, 13, 0, 47, 306000, tzinfo=tzlocal()), 'JobRunState': 'fail', 'id': 'xyz'},
{'JobName': 'a', 'StartedOn': datetime.datetime(2020, 10, 18, 13, 0, 47, 306000, tzinfo=tzlocal()), 'JobRunState': 'fai;', 'id': 'xyz'},
{'JobName': 'b', 'StartedOn': datetime.datetime(2021, 10, 18, 13, 0, 47, 306000, tzinfo=tzlocal()), 'JobRunState': 'ok', 'id': 'xyz'}
]
Expected output:
filtered_list = [
{'JobName': 'a', 'StartedOn': datetime.datetime(2022, 10, 18, 13, 0, 47, 306000, tzinfo=tzlocal()), 'JobRunState': 'fail', 'id': 'xyz'},
{'JobName': 'b', 'StartedOn': datetime.datetime(2022, 10, 18, 13, 0, 47, 306000, tzinfo=tzlocal()), 'JobRunState': 'fail', 'id': 'xyz'}
]
Some judicious use of itertools.groupby, sorted, and max.
list_of_dicts = [
{'JobName': 'a', 'StartedOn': datetime.datetime(2022, 10, 18, 13, 0, 47, 306000), 'JobRunState': 'fail', 'id': 'xyz'},
{'JobName': 'a', 'StartedOn': datetime.datetime(2021, 10, 18, 13, 0, 47, 306000), 'JobRunState': 'ok', 'id': 'xyz'},
{'JobName': 'b', 'StartedOn': datetime.datetime(2022, 10, 18, 13, 0, 47, 306000), 'JobRunState': 'fail', 'id': 'xyz'},
{'JobName': 'a', 'StartedOn': datetime.datetime(2020, 10, 18, 13, 0, 47, 306000), 'JobRunState': 'fai;', 'id': 'xyz'},
{'JobName': 'b', 'StartedOn': datetime.datetime(2021, 10, 18, 13, 0, 47, 306000), 'JobRunState': 'ok', 'id': 'xyz'}
]
from itertools import groupby
from operator import itemgetter
# groupby only groups adjacent items, so sort by JobName first
lst = sorted(list_of_dicts, key=itemgetter('JobName'))
[max(jobs, key=itemgetter('StartedOn'))
 for jn, jobs in groupby(lst, key=itemgetter('JobName'))]
# [{'JobName': 'a', 'StartedOn': datetime.datetime(2022, 10, 18, 13, 0, 47, 306000), 'JobRunState': 'fail', 'id': 'xyz'},
# {'JobName': 'b', 'StartedOn': datetime.datetime(2022, 10, 18, 13, 0, 47, 306000), 'JobRunState': 'fail', 'id': 'xyz'}]
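As a possible alternative that avoids sorting, the latest run per job can be tracked in a dictionary during a single pass. This is a sketch, not the approach from the answer above: the function name `latest_per_job` is hypothetical, and the sample dates are simplified to naive datetimes.

```python
import datetime


def latest_per_job(runs):
    """Return the full dict of the most recent run for each JobName."""
    latest = {}
    for run in runs:
        name = run["JobName"]
        # Keep this run if it is the first seen for the job or newer
        # than the one stored so far.
        if name not in latest or run["StartedOn"] > latest[name]["StartedOn"]:
            latest[name] = run
    return list(latest.values())


# Simplified sample data (dates only, no time zone).
runs = [
    {"JobName": "a", "StartedOn": datetime.datetime(2022, 10, 18), "JobRunState": "fail", "id": "xyz"},
    {"JobName": "a", "StartedOn": datetime.datetime(2021, 10, 18), "JobRunState": "ok", "id": "xyz"},
    {"JobName": "b", "StartedOn": datetime.datetime(2022, 10, 18), "JobRunState": "fail", "id": "xyz"},
    {"JobName": "b", "StartedOn": datetime.datetime(2021, 10, 18), "JobRunState": "ok", "id": "xyz"},
]
filtered = latest_per_job(runs)
print(filtered)
```

Each returned item is the original dictionary, so all keys such as JobRunState and id are retained.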

How to create a dictionary from one list, making the first item mapped to the second, the third mapped to the fourth, and so on

I have a list of strings and integers:
students = ['Janet', 21, 'Bill', 19, 'Amanda', 22, 'Mike', 25, 'Susan', 24, 'Jen', 29, 'Sara', 30, 'Maria', 18, 'Kathy', 20, 'Andrew', 27]
I need to make a dictionary called people that maps each name to that person's age, which is the integer right after the name. I thought I would have to iterate over the list, but I've had no luck. Here is what I have so far:
students = ['Janet', 21, 'Bill', 19, 'Amanda', 22, 'Mike', 25, 'Susan', 24, 'Jen', 29, 'Sara', 30, 'Maria', 18, 'Kathy', 20, 'Andrew', 27]
people = {}
for i in students:
    if not isinstance(i, int):
        # here I would take i and make it a key in the dictionary, then map the following integer to its value
        pass
students = ['Janet', 21, 'Bill', 19, 'Amanda', 22, 'Mike', 25, 'Susan', 24, 'Jen', 29, 'Sara', 30, 'Maria', 18, 'Kathy', 20, 'Andrew', 27]
print(dict(zip(students[::2], students[1::2])))
Prints:
{'Janet': 21, 'Bill': 19, 'Amanda': 22, 'Mike': 25, 'Susan': 24, 'Jen': 29, 'Sara': 30, 'Maria': 18, 'Kathy': 20, 'Andrew': 27}
dict([x for x in zip(*[iter(students)]*2)])
people = {}
for i in range(0, len(students), 2):
    people[students[i]] = students[i + 1]
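The zip(*[iter(students)]*2) idiom used above can look cryptic. A spelled-out sketch of the same idea, using a shortened students list as sample data, might be:

```python
students = ['Janet', 21, 'Bill', 19, 'Amanda', 22]

# A single iterator is passed to zip twice, so each tuple is built by
# pulling two consecutive items from it: first a name, then an age.
it = iter(students)
people = dict(zip(it, it))
print(people)  # {'Janet': 21, 'Bill': 19, 'Amanda': 22}
```

Since zip stops as soon as one of its inputs runs out, a trailing name without an age would be dropped silently.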

Trying To Sort A List Of Dicts After Combining Django Query

I have a Django view that needs to query several different models, combine the results, and then order them by date ('created_at'). Right now, when I combine the models, I get a list of dicts like the one below. How can I sort it by date?
[{'content': u'Just another another message', 'created_at':
datetime.datetime(2018, 4, 22, 15, 35, 11, 577175, tzinfo=<UTC>),
u'successmatch_id': 5, u'id': 8, 'reciever': u'UserA'},
{'content': u'testing blah', 'created_at': datetime.datetime(2018, 4,
22, 15, 33, 28, 84469, tzinfo=<UTC>), u'successmatch_id': 5, u'id': 7,
'reciever': u'UserB'}, {'content': u'Hi how are you',
'created_at': datetime.datetime(2018, 4, 22, 13, 29, 49, 516701,
tzinfo=<UTC>), u'successmatch_id': 5, u'id': 6, 'reciever':
u'UserA'}]
Python's built-in sorting has the ability to specify what metric to sort by:
x = [{"test": 1}, {"test": 2}, {"test": 0}]
x.sort(key=lambda item: item["test"])
x is edited in place, and is now:
[{'test': 0}, {'test': 1}, {'test': 2}]
So, in your case, assuming your list is called my_list, you'd want to do:
my_list.sort(key=lambda item: item["created_at"])
Or, if you wanted the newest dicts to occur first,
my_list.sort(key=lambda item: item["created_at"], reverse=True)
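If the original list should stay unchanged, sorted() returns a new list instead of sorting in place, and operator.itemgetter can replace the lambda. A small sketch with reduced sample records (only the id and created_at fields are kept here):

```python
import datetime
from operator import itemgetter

utc = datetime.timezone.utc

# Reduced sample records; only the fields needed for the sort are kept.
my_list = [
    {"id": 8, "created_at": datetime.datetime(2018, 4, 22, 15, 35, 11, tzinfo=utc)},
    {"id": 7, "created_at": datetime.datetime(2018, 4, 22, 15, 33, 28, tzinfo=utc)},
    {"id": 6, "created_at": datetime.datetime(2018, 4, 22, 13, 29, 49, tzinfo=utc)},
]

# sorted() leaves my_list as it is and returns a new, ordered list.
oldest_first = sorted(my_list, key=itemgetter("created_at"))
newest_first = sorted(my_list, key=itemgetter("created_at"), reverse=True)

print([d["id"] for d in oldest_first])  # [6, 7, 8]
print([d["id"] for d in newest_first])  # [8, 7, 6]
```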
If you are happy using a 3rd party library, you can use pandas, which accepts a list of dictionaries.
But note that datetime objects may be converted to pandas.Timestamp objects.
import pandas as pd
import datetime
lst = [{'content': u'Just another another message',
'created_at': datetime.datetime(2018, 4, 22, 15, 35, 11, 577175, tzinfo=None),
u'successmatch_id': 5, u'id': 8, 'reciever': u'UserA'},
{'content': u'testing blah',
'created_at': datetime.datetime(2018, 4, 22, 15, 33, 28, 84469, tzinfo=None),
u'successmatch_id': 5, u'id': 7, 'reciever': u'UserB'},
{'content': u'Hi how are you',
'created_at': datetime.datetime(2018, 4, 22, 13, 29, 49, 516701, tzinfo=None),
u'successmatch_id': 5, u'id': 6, 'reciever': u'UserA'}]
res = pd.DataFrame(lst).sort_values('created_at').T.to_dict().values()
Result:
dict_values([{'content': 'Hi how are you', 'created_at': Timestamp('2018-04-22 13:29:49.516701'),
'id': 6, 'reciever': 'UserA', 'successmatch_id': 5},
{'content': 'testing blah', 'created_at': Timestamp('2018-04-22 15:33:28.084469'),
'id': 7, 'reciever': 'UserB', 'successmatch_id': 5},
{'content': 'Just another another message', 'created_at': Timestamp('2018-04-22 15:35:11.577175'),
'id': 8, 'reciever': 'UserA', 'successmatch_id': 5}])

Difficulty getting the item count for the combinations of list of items from python dictionary

I have the following input, a dictionary containing a list of dictionaries:
inpdata = {"cat": [{"categories": [{"cid": 27}, {"cid": 66}, {"cid": 29}], "id": 20},
                   {"categories": [{"cid": 66}], "id": 21},
                   {"categories": [{"cid": 66}, {"cid": 27}], "id": 22},
                   {"categories": [{"cid": 66}, {"cid": 27}], "id": 23},
                   {"categories": [{"cid": 66}, {"cid": 29}, {"cid": 27}], "id": 24}]}
I am trying to get the number of ids for each cid, along with the id values. I used the code below for that:
allcategories = set(sec['cid'] for record in inpdata['cat'] for sec in record['categories'])
summarize = lambda record: record['id']
fs_cat = [
    {
        'cat': cid,
        'count': len(matches),
        'ids': [summarize(match) for match in matches]
    }
    for cid in allcategories
    for matches in [[
        record for record in inpdata['cat'] if cid in [sec['cid'] for sec in record['categories']]
    ]]
]
print(fs_cat)
This gives the following output:
[{'cat': 66, 'count': 5, 'ids': [20, 21, 22, 23, 24]},
{'cat': 27, 'count': 4, 'ids': [20, 22, 23, 24]},
{'cat': 29, 'count': 2, 'ids': [20, 24]}
]
But how can I get the counts for combinations of the categories {66, 27, 29}?
I tried the approach below to generate the combinations; it gives every combination of items from the set:
import itertools
allcategories = {66, 27, 29}
for subset in itertools.chain.from_iterable(itertools.combinations(allcategories, n) for n in range(len(allcategories) + 1)):
    print(subset)
But I couldn't figure out how to use this approach to get the result below for the categories {66, 27, 29} from inpdata:
result=[{'cat': '66', 'count': 5, 'ids': [20, 21, 22, 23, 24]},
{'cat': '27', 'count': 4, 'ids': [20, 22, 23, 24]},
{'cat': '29', 'count': 2, 'ids': [20, 24]},
{'cat': '66&27', 'count': 4, 'ids': [20, 22, 23, 24]},
{'cat': '66&29', 'count': 2, 'ids': [20, 24]},
{'cat': '27&29', 'count': 2, 'ids': [20, 24]},
{'cat': '66&27&29', 'count': 2, 'ids': [20, 24]}
]
Could you please suggest how I can achieve this?
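Calling itertools.combinations(fs_cat, 1), itertools.combinations(fs_cat, 2), ... up to itertools.combinations(fs_cat, n), where n = len(fs_cat), gives you all non-empty combinations of the entries in fs_cat. For each combination, join the category ids and intersect the id sets: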
import itertools
import operator
from functools import reduce
fs_cat = [
{'cat': 66, 'count': 5, 'ids': [20, 21, 22, 23, 24]},
{'cat': 27, 'count': 4, 'ids': [20, 22, 23, 24]},
{'cat': 29, 'count': 2, 'ids': [20, 24]},
]
result = []
for n in range(1, len(fs_cat) + 1):  # 1, 2, ..., len(fs_cat)
    for xs in itertools.combinations(fs_cat, n):
        cat = '&'.join(map(str, sorted(x['cat'] for x in xs)))
        ids = sorted(reduce(operator.and_, (set(x['ids']) for x in xs)))
        result.append({'cat': cat, 'count': len(ids), 'ids': ids})
>>> result
[{'cat': '66', 'count': 5, 'ids': [20, 21, 22, 23, 24]},
{'cat': '27', 'count': 4, 'ids': [20, 22, 23, 24]},
{'cat': '29', 'count': 2, 'ids': [20, 24]},
{'cat': '27&66', 'count': 4, 'ids': [20, 22, 23, 24]},
{'cat': '29&66', 'count': 2, 'ids': [20, 24]},
{'cat': '27&29', 'count': 2, 'ids': [20, 24]},
{'cat': '27&29&66', 'count': 2, 'ids': [20, 24]}]
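The intermediate fs_cat list is not strictly required. As a sketch of a variant that works directly on inpdata, one could first map each cid to the set of ids that carry it and then intersect those sets per combination. The helper name `ids_by_cid` is made up here, and the category labels follow first-seen order rather than the sorted order used above.

```python
import itertools

inpdata = {"cat": [{"categories": [{"cid": 27}, {"cid": 66}, {"cid": 29}], "id": 20},
                   {"categories": [{"cid": 66}], "id": 21},
                   {"categories": [{"cid": 66}, {"cid": 27}], "id": 22},
                   {"categories": [{"cid": 66}, {"cid": 27}], "id": 23},
                   {"categories": [{"cid": 66}, {"cid": 29}, {"cid": 27}], "id": 24}]}

# Map each cid to the set of record ids that contain it.
ids_by_cid = {}
for record in inpdata["cat"]:
    for sec in record["categories"]:
        ids_by_cid.setdefault(sec["cid"], set()).add(record["id"])

result = []
cids = list(ids_by_cid)  # cids in the order they were first seen
for n in range(1, len(cids) + 1):
    for combo in itertools.combinations(cids, n):
        # Ids that belong to every category in this combination.
        common = set.intersection(*(ids_by_cid[c] for c in combo))
        result.append({
            "cat": "&".join(str(c) for c in combo),
            "count": len(common),
            "ids": sorted(common),
        })

for row in result:
    print(row)
```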
