Searching dict within a list - python

If I have the following list of items for a shopping store:
shop_list = [{'item': 'apple', 'amount': 10, 'cost': 5},
{'item': 'banana', 'amount': 12, 'cost': 6},
{'item': 'strawberry', 'amount': 8, 'cost': 9}]
So I have several dicts within a list. I want to find out how to get the item dict knowing the item. For example:
def x(item):
    # do something to get the dict
    print(dict)
x('apple') #print {'item': 'apple', 'amount': 10, 'cost': 5}
x('banana') #print {'item': 'banana', 'amount': 12, 'cost': 6}
What's the shortest, most efficient way to do this?

If you intend to look up entries by their 'item', consider using a dict whose keys are the 'item' values instead of a list of dicts.
shop_list = {
'apple': {'amount': 10, 'cost': 5},
'banana': {'amount': 12, 'cost': 6},
'strawberry': {'amount': 8, 'cost': 9}
}
shop_list['banana'] # {'amount': 12, 'cost': 6}
In particular, this makes the lookup O(1) instead of the O(n) required for traversing the list.
If you cannot update the code that generated the original shop_list, then you can transform the already existing data with a dict-comprehension.
formatted_shop_list = {product['item']: product for product in shop_list}
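As a minimal sketch of the approach above (the names index and find_item are illustrative, not part of the original answer), the index can be built once and then queried with dict.get, which returns None for a missing item instead of raising KeyError:

```python
shop_list = [
    {'item': 'apple', 'amount': 10, 'cost': 5},
    {'item': 'banana', 'amount': 12, 'cost': 6},
    {'item': 'strawberry', 'amount': 8, 'cost': 9},
]

# Build the index once: O(n). Assumes every 'item' value is unique;
# if a name repeats, the last entry wins.
index = {product['item']: product for product in shop_list}

def find_item(item):
    # Hypothetical helper: O(1) average-case lookup, None if absent.
    return index.get(item)

print(find_item('banana'))  # {'item': 'banana', 'amount': 12, 'cost': 6}
print(find_item('grape'))   # None
```

The index holds references to the original dicts, so changes made through it are also visible in shop_list.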

def x(shop_list, item): # remove first parameter if you want to use global variable
    for i in shop_list:
        if i['item'] == item:
            return i
Then, you can call this function as:
>>> todays_shop_list = [{'item': 'apple', 'amount': 10, 'cost': 5},
... {'item': 'banana', 'amount': 12, 'cost': 6},
... {'item': 'strawberry', 'amount': 8, 'cost': 9}]
>>> x(todays_shop_list, 'apple')
{'item': 'apple', 'amount': 10, 'cost': 5}

You can try iterating through the list and just extract the dict that matches your shop item:
def shop_item(shop_list, item):
    return next((x for x in shop_list if x['item'] == item), None)
    # or next(filter(lambda x: x['item'] == item, shop_list), None)
Which works as follows:
>>> shop_list = [{'item': 'apple', 'amount': 10, 'cost': 5},
... {'item': 'banana', 'amount': 12, 'cost': 6},
... {'item': 'strawberry', 'amount': 8, 'cost': 9}]
>>> shop_item(shop_list, 'apple')
{'item': 'apple', 'amount': 10, 'cost': 5}
>>> shop_item(shop_list, 'grape')
None
The above uses next() with a generator expression to iterate through the list until the condition is met, and returns None if the item is not found.

You can try this:
def x(item):
    return [elements for elements in shop_list if elements['item'] == item]
x('apple') # returns [{'item': 'apple', 'amount': 10, 'cost': 5}]
x('banana') # returns [{'item': 'banana', 'amount': 12, 'cost': 6}]
This returns a list containing every matching dict, for example
[{'amount': 12, 'cost': 6, 'item': 'banana'}]
and if no match is found, an empty list is returned.

Related

Dictionary list/dict comparison

I would really appreciate any help with the following. I want to collapse duplicate names into a single entry and total the price for each name from a list of dicts. I have put together the code below as an example:
l = [{'id': 1, 'name': 'apple', 'price': '100', 'year': '2000', 'currency': 'eur'},
{'id': 2, 'name': 'apple', 'price': '150', 'year': '2071', 'currency': 'eur'},
{'id': 3, 'name': 'apple', 'price': '1220', 'year': '2076', 'currency': 'eur'},
{'id': 4, 'name': 'cucumber', 'price': '90000000', 'year': '2080', 'currency': 'eur'},
{'id': 5, 'name': 'pear', 'price': '1000', 'year': '2000', 'currency': 'eur'},
{'id': 6, 'name': 'apple', 'price': '150', 'year': '2022', 'currency': 'eur'},
{'id': 9, 'name': 'apple', 'price': '100', 'year': '2000', 'currency': 'eur'},
{'id': 10, 'name': 'grape', 'price': '150', 'year': '2022', 'currency': 'eur'},
]
new_list = []
for d in l:
    if d['name'] not in new_list:
        new_list.append(d['name'])
print(new_list)
price_list = []
for price in l:
    if price['price'] not in price_list:
        price_list.append(price['price'])
print(price_list)
The output I am hoping to achieve is:
[{'name': 'apple'}, {'price': <The total price for all apples>}]
Use a dictionary whose keys are the names and whose values are lists of prices. Then calculate the total of each list.
d = {}
for item in l:
    d.setdefault(item['name'], []).append(int(item['price']))
for name, prices in d.items():
    d[name] = sum(prices)
print(d)
Actually, I thought this was the same as yesterday's question, where you wanted the average. If you just want the total, you don't need the lists. Use a defaultdict containing integers, and just add the price to it.
from collections import defaultdict
d = defaultdict(int)
for item in l:
    d[item['name']] += int(item['price'])
print(d)
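To get from these totals to one record per name, a possible sketch follows. It assumes a shortened copy of the sample data, and the names totals and as_list are illustrative:

```python
from collections import defaultdict

# Shortened sample in the same shape as the question's list `l`.
l = [
    {'id': 1, 'name': 'apple', 'price': '100'},
    {'id': 2, 'name': 'apple', 'price': '150'},
    {'id': 5, 'name': 'pear', 'price': '1000'},
]

totals = defaultdict(int)
for item in l:
    # Prices are stored as strings, so convert before adding.
    totals[item['name']] += int(item['price'])

# One dict per name, rather than alternating name/price dicts.
as_list = [{'name': name, 'price': total} for name, total in totals.items()]
print(as_list)
```

Keeping the name and its total in the same dict avoids having to pair up neighbouring list elements later.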
This method only requires one loop:
prices = {}
for item in l:
    prices.update({item['name']: prices.get(item['name'], 0) + int(item['price'])})
print(prices)
Just for fun I decided to also implement the functionality with the item and price dictionaries separated as asked in the question, which gave the following horrendous code:
prices = []
for item in l:
    # get indices of prices of corresponding items
    price_idx = [n+1 for n, x in enumerate(prices) if item['name'] == x.get('name') and n % 2 == 0]
    if not price_idx:
        prices.append({'name': item['name']})
        prices.append({'price': int(item['price'])})
    else:
        prices[price_idx[0]].update({'price': prices[price_idx[0]]['price'] + int(item['price'])})
print(prices)
And requires the following function to retrieve prices:
def get_price(name):
    # names sit at even indices; the matching price is the next element
    for n, x in enumerate(prices):
        if n % 2 == 0 and x['name'] == name:
            return prices[n+1]['price']
Which honestly completely defeats the point of having a data structure. But if it answers your question, there you go.
This could be another one:
result = {}
for item in l:
    if item['name'] not in result:
        result[item['name']] = {'name': item['name'], 'price': 0}
    result[item['name']]['price'] += int(item['price'])

How to remove empty key-value from dictionary comprehension when applying filter

I am new to Python and learning how to use dictionary comprehensions. I have a list of movie cast dictionaries that I would like to filter on a specific value using a dictionary comprehension. I was able to get it to work, but for some reason empty dictionaries are added as well when the condition is not met. Why does this happen? And how can I ensure they are not included?
movie_cast = [{'id': 90633,'name': 'Gal Gadot','cast_id': 0, 'order': 0},
{'id': 62064, 'name': 'Chris Pine','cast_id': 15, 'order': 1},
{'id': 41091, 'name': 'Kristen Wiig', 'cast_id': 12,'order': 2},
{'id': 41092, 'name': 'Pedro Pascal', 'cast_id': 13, 'order': 3},
{'id': 32, 'name': 'Robin Wright', 'cast_id': 78, 'order': 4}]
limit = 1
cast_limit = []
for dict in movie_cast:
    d = {key:value for (key,value) in dict.items() if dict['order'] < limit}
    cast_limit.append(d)
print(cast_limit)
current_result = [{'id': 90633,'name': 'Gal Gadot','cast_id': 0, 'order': 0},
{'id': 62064, 'name': 'Chris Pine','cast_id': 15, 'order': 1},{},{},{}]
desired_result = [{'id': 90633,'name': 'Gal Gadot','cast_id': 0, 'order': 0},
{'id': 62064, 'name': 'Chris Pine','cast_id': 15, 'order': 1}]
Try with this (you need a list comprehension, not a dict comprehension):
cast_limit = [dct for dct in movie_cast if dct['order'] < limit]
I.e., you need to filter out elements of the list, not elements of a dict.
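The difference can be seen in a small sketch (a shortened copy of the cast data; the names per_dict and per_list are illustrative). The condition in the dict comprehension does not depend on key or value, so it keeps either every pair or none, and the resulting empty dict is still placed in the list:

```python
movie_cast = [
    {'id': 90633, 'name': 'Gal Gadot', 'order': 0},
    {'id': 62064, 'name': 'Chris Pine', 'order': 1},
    {'id': 41091, 'name': 'Kristen Wiig', 'order': 2},
]
limit = 2

# Filtering inside each dict: non-matching dicts become {} but stay in the list.
per_dict = [{k: v for k, v in d.items() if d['order'] < limit} for d in movie_cast]

# Filtering the list itself: non-matching dicts are dropped entirely.
per_list = [d for d in movie_cast if d['order'] < limit]

print(per_dict)
print(per_list)
```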

Find only duplicate entries in a list of dictionaries and then do something with them

I have a list of dictionary items that includes some duplicates. What I would like to do is iterate through this list, pick out all of the duplicate items, and then do something with them.
For example if I have the following list of dictionary:
animals = [
{'name': 'aardvark', 'value': 1},
{'name': 'badger', 'value': 2},
{'name': 'cat', 'value': 3},
{'name': 'aardvark', 'value': 4},
{'name': 'cat', 'value': 5}]
I would like to go through the list "animals" and extract the two dictionary entries for aardvark and cat and then do something with them.
for example:
duplicates = []
for duplicate in animals:
duplicates.append(duplicate)
The output I would like is for the list 'duplicates' to contain:
{'name': 'aardvark', 'value': 1},
{'name': 'cat', 'value': 3},
{'name': 'aardvark', 'value': 4},
{'name': 'cat', 'value': 5}
As always, any help is greatly appreciated and will hopefully go a long way towards helping me learn more about Python.
This works!!!
animals = [
{'name': 'aardvark', 'value': 1},
{'name': 'badger', 'value': 2},
{'name': 'cat', 'value': 3},
{'name': 'aardvark', 'value': 4},
{'name': 'cat', 'value': 5},
{'name': 'lion', 'value': 6},
{'name': 'lion', 'value': 6},
]
uniq = dict()
dup_list = list()
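for i in animals:
    if i["name"] not in uniq:
        # first time this name is seen: remember it
        uniq[i["name"]] = i["name"]
    else:
        # name seen before: only this repeat occurrence is collected
        dup_list.append(i)
print(dup_list)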
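The loop above collects only the second and later occurrences of each name. If every entry whose name appears more than once is wanted, in the original order, one possible sketch uses collections.Counter. This is a swapped-in technique, not the method of the answer above, and the name name_counts is illustrative:

```python
from collections import Counter

animals = [
    {'name': 'aardvark', 'value': 1},
    {'name': 'badger', 'value': 2},
    {'name': 'cat', 'value': 3},
    {'name': 'aardvark', 'value': 4},
    {'name': 'cat', 'value': 5},
]

# First pass: count how often each name occurs.
name_counts = Counter(a['name'] for a in animals)

# Second pass: keep every entry whose name occurs more than once.
duplicates = [a for a in animals if name_counts[a['name']] > 1]
print(duplicates)
```

Both passes are linear in the length of the list.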
You can sort the name of all the animals so that duplicates will be one next to the other. The time it takes is O(n log n).
names = [a['name'] for a in animals]
names.sort()
duplicates = []
prev, curr = None, None
for n in names:
    if prev is None:
        prev = n
        continue
    curr = n
    if curr == prev:
        duplicates.append(n)
    prev = curr
For this you can iterate through the list with two nested for loops to check all possible pairs and see whether their names match. Edited to give the desired output. Something like this:
animals = [
{'name': 'aardvark', 'value': 1},
{'name': 'badger', 'value': 2},
{'name': 'cat', 'value': 3},
{'name': 'aardvark', 'value': 4},
{'name': 'cat', 'value': 5}
]
duplicates = []
for i in range(len(animals)):
    for j in range(i+1, len(animals)):
        if animals[i]['name'] == animals[j]['name']:
            duplicates.extend([animals[i], animals[j]])
print(duplicates)
With good old defaultdict:
from collections import defaultdict
import pprint
d = defaultdict(list)
animals = [
{'name': 'aardvark', 'value': 1}, {'name': 'badger', 'value': 2},
{'name': 'cat', 'value': 3}, {'name': 'aardvark', 'value': 4},
{'name': 'cat', 'value': 5}]
for an in animals:
    d[an['name']].append(an)
dups = [v for k,v in d.items() if len(v) > 1]
pprint.pprint(dups)
The output (list of lists/dups):
[[{'name': 'aardvark', 'value': 1}, {'name': 'aardvark', 'value': 4}],
[{'name': 'cat', 'value': 3}, {'name': 'cat', 'value': 5}]]
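If a flat list is preferred over a list of groups, the groups can be flattened with itertools.chain.from_iterable. A minimal sketch, with groups and flat as illustrative names; note that the result is ordered group by group rather than in the original list order:

```python
from collections import defaultdict
from itertools import chain

animals = [
    {'name': 'aardvark', 'value': 1}, {'name': 'badger', 'value': 2},
    {'name': 'cat', 'value': 3}, {'name': 'aardvark', 'value': 4},
    {'name': 'cat', 'value': 5}]

# Group the entries by name.
groups = defaultdict(list)
for an in animals:
    groups[an['name']].append(an)

# Keep only groups with more than one entry, then flatten them.
dups = [v for v in groups.values() if len(v) > 1]
flat = list(chain.from_iterable(dups))
print(flat)
```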
To achieve what you want, you can transform your animals data into a pandas DataFrame like this:
import pandas as pd
df = pd.DataFrame(animals)
You'll obtain a table like this:
       name  value
0  aardvark      1
1    badger      2
2       cat      3
3  aardvark      4
4       cat      5
Pandas DataFrames are structures that help you manipulate data.
(https://pandas.pydata.org/pandas-docs/stable/getting_started/index.html)
You can perform many operations on them, for instance detecting duplicates as follows:
# Using the duplicated() method
df.duplicated(subset=['name'], keep=False)
# It gives you a Series of booleans, indexed like the DataFrame:
0     True
1    False
2     True
3     True
4     True
Once you know which rows are duplicates, you can filter your data like this and obtain the desired result:
duplicates = df[df.duplicated(subset=['name'], keep=False)]
# Gives you the following output:
       name  value
0  aardvark      1
2       cat      3
3  aardvark      4
4       cat      5
Good luck with learning Python!

Python list of dictionaries - adding the dicts with same key names [duplicate]

I have a python list like this:
user = [
{'name': 'ozzy', 'quantity': 5},
{'name': 'frank', 'quantity': 4},
{'name': 'ozzy', 'quantity': 3},
{'name': 'frank', 'quantity': 2},
{'name': 'james', 'quantity': 7},
]
I am trying to write code that merges the dictionaries with the same name and adds up their quantities. The final list should look like this:
user = [
{'name': 'ozzy', 'quantity': 8},
{'name': 'frank', 'quantity': 6},
{'name': 'james', 'quantity': 7}
]
I have tried a few things but I am struggling to get the right code. The code I have written below is somewhat adding the values (actually my list is much longer, I have just added a small portion for reference).
newList = []
quan = 0
for i in range(0,len(user)):
    originator = user[i]['name']
    for j in range(i+1,len(user)):
        if originator == user[j]['name']:
            quan = user[i]['quantity'] + user[j]['quantity']
            newList.append({'name': originator, 'Quantity': quan})
Can you please help me get the correct code?
Just count the items in a collections.Counter, and expand back to list of dicts if needed:
user = [
{'name': 'ozzy', 'quantity': 5},
{'name': 'frank', 'quantity': 4},
{'name': 'ozzy', 'quantity': 3},
{'name': 'frank', 'quantity': 2},
{'name': 'james', 'quantity': 7},
]
import collections
d = collections.Counter()
for u in user:
    d[u['name']] += u['quantity']
print(dict(d))
newlist = [{'name' : k, 'quantity' : v} for k,v in d.items()]
print(newlist)
This outputs the Counter as a dict first, which may already be sufficient:
{'frank': 6, 'ozzy': 8, 'james': 7}
and the reformatted output using list of dicts:
[{'name': 'frank', 'quantity': 6}, {'name': 'ozzy', 'quantity': 8}, {'name': 'james', 'quantity': 7}]
The solution is also straightforward with a standard dictionary. No need for Counter or OrderedDict here:
user = [
{'name': 'ozzy', 'quantity': 5},
{'name': 'frank', 'quantity': 4},
{'name': 'ozzy', 'quantity': 3},
{'name': 'frank', 'quantity': 2},
{'name': 'james', 'quantity': 7},
]
dic = {}
for item in user:
    # read the keys explicitly rather than relying on the order of item.values()
    n, q = item['name'], item['quantity']
    dic[n] = dic.get(n,0) + q
print(dic)
user = [{'name':n, 'quantity':q} for n,q in dic.items()]
print(user)
Result:
{'ozzy': 8, 'frank': 6, 'james': 7}
[{'name': 'ozzy', 'quantity': 8}, {'name': 'frank', 'quantity': 6}, {'name': 'james', 'quantity': 7}]
I would suggest changing the way the output dictionary looks so that it is actually useable. Consider something like this
user = [
{'name': 'ozzy', 'quantity': 5},
{'name': 'frank', 'quantity': 4},
{'name': 'ozzy', 'quantity': 3},
{'name': 'frank', 'quantity': 2},
{'name': 'james', 'quantity': 7},
]
data = {}
for i in user:
    if i['name'] in data:
        data[i['name']] += i['quantity']
    else:
        data.update({i['name']: i['quantity']})
print(data)
{'frank': 6, 'james': 7, 'ozzy': 8}
If you need to maintain the original relative order:
from collections import OrderedDict
user = [
{'name': 'ozzy', 'quantity': 5},
{'name': 'frank', 'quantity': 4},
{'name': 'ozzy', 'quantity': 3},
{'name': 'frank', 'quantity': 2},
{'name': 'james', 'quantity': 7},
]
d = OrderedDict()
for item in user:
    d[item['name']] = d.get(item['name'], 0) + item['quantity']
newlist = [{'name' : k, 'quantity' : v} for k, v in d.items()]
print(newlist)
Output:
[{'name': 'ozzy', 'quantity': 8}, {'name': 'frank', 'quantity': 6}, {'name': 'james', 'quantity': 7}]
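On Python 3.7 and later, plain dicts preserve insertion order as a language guarantee, so the same result can be obtained without OrderedDict. A minimal sketch under that assumption (the name totals is illustrative):

```python
user = [
    {'name': 'ozzy', 'quantity': 5},
    {'name': 'frank', 'quantity': 4},
    {'name': 'ozzy', 'quantity': 3},
    {'name': 'frank', 'quantity': 2},
    {'name': 'james', 'quantity': 7},
]

# A plain dict keeps keys in first-insertion order (Python 3.7+).
totals = {}
for item in user:
    totals[item['name']] = totals.get(item['name'], 0) + item['quantity']

newlist = [{'name': k, 'quantity': v} for k, v in totals.items()]
print(newlist)
```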
user = [
{'name': 'ozzy', 'quantity': 8},
{'name': 'frank', 'quantity': 6},
{'name': 'james', 'quantity': 7}
]
reference_dict = {}
for item in user:
    reference_dict[item['name']] = reference_dict.get(item['name'],0) + item['quantity']
# Creating new list from reference dict
updated_user = [{'name' : k , 'quantity' : v} for k,v in reference_dict.items()]
print(updated_user)

Sorting list of dictionaries---what is the default behaviour (without key parameter)?

I'm trying to sort a list of dicts using sorted:
>>> help(sorted)
Help on built-in function sorted in module __builtin__:
sorted(...)
sorted(iterable, cmp=None, key=None, reverse=False) --> new sorted list
I have just given list to sorted and it sorts according to id.
>>> l = [{'id': 4, 'quantity': 40}, {'id': 1, 'quantity': 10}, {'id': 2, 'quantity': 20}, {'id': 3, 'quantity': 30}, {'id': 6, 'quantity': 60}, {'id': 7, 'quantity': -30}, {'id': -1, 'quantity': -10}]
>>> sorted(l) # sorts by id
[{'id': -1, 'quantity': -10}, {'id': 1, 'quantity': 10}, {'id': 2, 'quantity': 20}, {'id': 3, 'quantity': 30}, {'id': 4, 'quantity': 40}, {'id': 6, 'quantity': 60}, {'id': 7, 'quantity': -30}]
>>> l.sort()
>>> l # sorts by id
[{'id': -1, 'quantity': -10}, {'id': 1, 'quantity': 10}, {'id': 2, 'quantity': 20}, {'id': 3, 'quantity': 30}, {'id': 4, 'quantity': 40}, {'id': 6, 'quantity': 60}, {'id': 7, 'quantity': -30}]
Many examples of sorted say it requires a key to sort a list of dicts. But I didn't give any key. Why didn't it sort according to quantity? How did it choose to sort by id?
I tried another example with name & age,
>>> a
[{'age': 1, 'name': 'john'}, {'age': 3, 'name': 'shyam'}, {'age': 30,'name': 'ram'}, {'age': 15, 'name': 'rita'}, {'age': 5, 'name': 'sita'}]
>>> sorted(a) # sorts by age
[{'age': 1, 'name': 'john'}, {'age': 3, 'name': 'shyam'}, {'age': 5, 'name':'sita'}, {'age': 15, 'name': 'rita'}, {'age': 30, 'name': 'ram'}]
>>> a.sort() # sorts by age
>>> a
[{'age': 1, 'name': 'john'}, {'age': 3, 'name': 'shyam'}, {'age': 5, 'name':'sita'}, {'age': 15, 'name': 'rita'}, {'age': 30, 'name': 'ram'}]
Here it sorts according to age but not name. What am I missing about the default behavior of these methods?
From some old Python docs:
Mappings (dictionaries) compare equal if and only if their sorted (key, value) lists compare equal. Outcomes other than equality are resolved consistently, but are not otherwise defined.
Earlier versions of Python used lexicographic comparison of the sorted (key, value) lists, but this was very expensive for the common case of comparing for equality. An even earlier version of Python compared dictionaries by identity only, but this caused surprises because people expected to be able to test a dictionary for emptiness by comparing it to {}.
Ignore the default behaviour and just provide a key.
By default it will compare against the first difference it finds. If you are sorting dictionaries this is quite dangerous (consistent yet undefined).
Pass a function to the key= parameter that takes an element of the list (in this case a dictionary) and returns the value to sort by.
>>> a
[{'age': 1, 'name': 'john'}, {'age': 3, 'name': 'shyam'}, {'age': 30,'name': 'ram'}, {'age': 15, 'name': 'rita'}, {'age': 5, 'name': 'sita'}]
>>> sorted(a, key=lambda d : d['name']) # sorts by name
[{'age': 1, 'name': 'john'}, {'age': 30, 'name': 'ram'}, {'age': 15, 'name': 'rita'}, {'age': 3, 'name': 'shyam'}, {'age': 5, 'name': 'sita'}]
See https://wiki.python.org/moin/HowTo/Sorting
The key parameter is quite powerful as it can cope with all sorts of data to be sorted, although maybe not very intuitive.
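Note that the transcript in the question comes from Python 2. On Python 3, dicts do not support ordering comparisons, so sorted(a) without a key raises TypeError. A short sketch using operator.itemgetter as the key (the names by_name, by_age_desc and default_sort_ok are illustrative):

```python
from operator import itemgetter

a = [
    {'age': 1, 'name': 'john'},
    {'age': 3, 'name': 'shyam'},
    {'age': 30, 'name': 'ram'},
    {'age': 15, 'name': 'rita'},
    {'age': 5, 'name': 'sita'},
]

# itemgetter('name') behaves like lambda d: d['name'].
by_name = sorted(a, key=itemgetter('name'))
by_age_desc = sorted(a, key=itemgetter('age'), reverse=True)

# Without a key, Python 3 cannot order dicts and raises TypeError.
try:
    sorted(a)
    default_sort_ok = True
except TypeError:
    default_sort_ok = False

print([d['name'] for d in by_name])
print([d['age'] for d in by_age_desc])
print(default_sort_ok)
```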
