List of dictionaries into one dictionary with condition - python

I have a list of dictionaries:
foo = [{'name':'John Doe', 'customer':'a'},
{'name':'John Doe', 'customer':'b'},
{'name':'Jenny Wang', 'customer':'c'},
{'name':'Mary Rich', 'customer': None}
]
Is there a way to use the value of the first key ('name') as the key of a new dict, and to collect the values of the second key ('customer') in a list as its value?
Expected result:
{'John Doe':['a', 'b'], 'Jenny Wang':['c'], 'Mary Rich':[]}

You could use dict.setdefault. The idea is to initialize an empty list as the value for each new key, then append every non-None customer to the list for its name. (The explicit is not None check is necessary: with a plain truthiness test, other falsy values such as False would be dropped as well; thanks @Grismar.)
out = {}
for d in foo:
    out.setdefault(d['name'], [])
    if d['customer'] is not None:
        out[d['name']].append(d['customer'])
Output:
{'John Doe': ['a', 'b'], 'Jenny Wang': ['c'], 'Mary Rich': []}
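For comparison, here is a minimal sketch of the same idea with collections.defaultdict instead of dict.setdefault. The names grouped and customers are only illustrative, and the final dict(...) conversion is an optional choice, not something the answer above requires.

```python
from collections import defaultdict

foo = [
    {'name': 'John Doe', 'customer': 'a'},
    {'name': 'John Doe', 'customer': 'b'},
    {'name': 'Jenny Wang', 'customer': 'c'},
    {'name': 'Mary Rich', 'customer': None},
]

grouped = defaultdict(list)
for d in foo:
    # Looking the name up creates an empty list the first time it is seen,
    # so 'Mary Rich' still gets an entry even though her customer is None.
    customers = grouped[d['name']]
    if d['customer'] is not None:
        customers.append(d['customer'])

# Convert back to a plain dict so that later lookups of unknown names
# raise KeyError instead of silently creating new empty lists.
out = dict(grouped)
print(out)
```

The behaviour is the same as the setdefault loop; which one reads better is mostly a matter of taste.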

@enke's answer is crystal clear, but I'm adding mine in case it helps.
A slightly different implementation could be:
foo = [{'name':'John Doe', 'customer':'a'},
{'name':'John Doe', 'customer':'b'},
{'name':'Jenny Wang', 'customer':'c'},
{'name':'Mary Rich', 'customer': None}
]
new_dict = dict()
for fo in foo:
    if fo['name'] not in new_dict:
        # first time we see this name: start its list
        if fo['customer'] is None:
            new_dict[fo['name']] = []
        else:
            new_dict[fo['name']] = [fo['customer']]
    elif fo['customer'] is not None:
        # known name: only add real customers (a bare append() would raise TypeError)
        new_dict[fo['name']].append(fo['customer'])
print(new_dict)
Output
{'John Doe': ['a', 'b'], 'Jenny Wang': ['c'], 'Mary Rich': []}

There is a function in itertools called groupby. It groups consecutive items of your input list by a criterion you provide (so items with the same name must be adjacent, as they are here). It can look like this:
from itertools import groupby
foo = [{'name':'John Doe', 'customer':'a'},
{'name':'John Doe', 'customer':'b'},
{'name':'Jenny Wang', 'customer':'c'},
{'name':'Mary Rich', 'customer': None}
]
def func_group(item):
    return item['name']
def main():
    for key, value in groupby(foo, func_group):
        print(key)
        print(list(value))
main()
That does not quite produce your expected output, but it comes close:
John Doe
[{'name': 'John Doe', 'customer': 'a'}, {'name': 'John Doe', 'customer': 'b'}]
Jenny Wang
[{'name': 'Jenny Wang', 'customer': 'c'}]
Mary Rich
[{'name': 'Mary Rich', 'customer': None}]
(You could now process each group a second time to get your desired output. I just showed the principle here :-) )
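One possible way to finish the job, sketched under the assumption that sorting the input by name first is acceptable (it changes the order of the keys in the result). The names get_name and result are illustrative and not part of the answer above.

```python
from itertools import groupby
from operator import itemgetter

foo = [
    {'name': 'John Doe', 'customer': 'a'},
    {'name': 'John Doe', 'customer': 'b'},
    {'name': 'Jenny Wang', 'customer': 'c'},
    {'name': 'Mary Rich', 'customer': None},
]

get_name = itemgetter('name')

# groupby only merges adjacent items, so sort by the same key first.
# sorted() is stable, so each person's customers keep their original order.
result = {
    name: [d['customer'] for d in group if d['customer'] is not None]
    for name, group in groupby(sorted(foo, key=get_name), key=get_name)
}
print(result)
```

If the input is already ordered by name, as in the question, the sorted(...) call can be dropped.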

Related

Adding key and value to dictionary in python based on other dictionaries

I am using for loop in python and every loop creates a dictionary. I have the below set of dictionaries created.
{'name': 'xxxx'}
{'name': 'yyyy','age':'28'}
{'name': 'zzzz','age':'27','sex':'F'}
My requirement is to compare all the dictionaries created, find the keys that are missing from each one, add those keys with an empty value, and order every dictionary by key.
Expected output:
{'age':'','name': 'xxxx','sex':''}
{'age':'28','name': 'yyyy','sex':''}
{'age':'27','name': 'zzzz','sex':'F'}
How can I achieve this in Python?
If you want to modify the dicts in-place, dict.setdefault would be easy enough.
my_dicts = [
{'name': 'xxxx'},
{'name': 'yyyy','age':'28'},
{'name': 'zzzz','age':'27','sex':'F'},
]
desired_keys = ['name', 'age', 'sex']
for d in my_dicts:
    for key in desired_keys:
        d.setdefault(key, "")
print(my_dicts)
prints out
[
{'name': 'xxxx', 'age': '', 'sex': ''},
{'name': 'yyyy', 'age': '28', 'sex': ''},
{'name': 'zzzz', 'age': '27', 'sex': 'F'},
]
If you don't want to hard-code the desired_keys list, you can make it a set and gather it from the dicts before the loop above.
desired_keys = set()
for d in my_dicts:
    desired_keys.update(set(d)) # update with keys from `d`
Another option, if you want new dicts instead of modifying them in place, is
desired_keys = ... # whichever method you like
empty_dict = dict.fromkeys(desired_keys, "")
new_dicts = [{**empty_dict, **d} for d in my_dicts]
EDIT based on comments:
Note that the approaches above don't remove keys that are not in desired_keys.
This will leave only the desired keys:
desired_keys = ... # Must be a set
for d in my_dicts:
    for key in desired_keys:
        d.setdefault(key, "")
    for key in set(d) - desired_keys:
        d.pop(key)
However, at that point it might be easier to just create new dicts:
new_dicts = [
    {key: d.get(key, "") for key in desired_keys}
    for d in my_dicts
]
data = [{'name': 'xxxx'},
{'name': 'yyyy','age':'28'},
{'name': 'zzzz','age':'27','sex':'F'}]
First take the dictionary with the most keys. This assumes that the largest dictionary contains every key that occurs, as it does here.
Then use dict.get with an empty string as the default for each of those keys, and sort each dictionary by key. You can combine a list comprehension and a dict comprehension:
allKD = max(data, key=len)
[dict(sorted({k:d.get(k, '') for k in allKD}.items(), key=lambda x:x[0])) for d in data]
OUTPUT:
[{'age': '', 'name': 'xxxx', 'sex': ''},
{'age': '28', 'name': 'yyyy', 'sex': ''},
{'age': '27', 'name': 'zzzz', 'sex': 'F'}]
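Note that max(data, key=len) only finds every key when one dictionary happens to contain them all. A minimal sketch that instead collects the union of the keys is shown below; the sample data is changed on purpose so that no single dictionary is complete, and the names all_keys and filled are illustrative.

```python
# Hypothetical input in which no single dict has every key.
data = [
    {'name': 'xxxx'},
    {'name': 'yyyy', 'age': '28'},
    {'name': 'zzzz', 'sex': 'F'},
]

# Iterating a dict yields its keys, so union(*data) gathers every key once.
all_keys = sorted(set().union(*data))

# Build new dicts in sorted key order, filling the gaps with ''.
filled = [{k: d.get(k, '') for k in all_keys} for d in data]
print(filled)
```

Because the dicts are built by iterating over the sorted keys, no separate sorting step per dictionary is needed.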
One approach:
from operator import or_
from functools import reduce
lst = [{'name': 'xxxx'},
{'name': 'yyyy', 'age': '28'},
{'name': 'zzzz', 'age': '27', 'sex': 'F'}]
# find all the keys
keys = reduce(or_, map(dict.keys, lst))
# update each dictionary with the complement of the keys
for d in lst:
    d.update(dict.fromkeys(keys - d.keys(), ""))
print(lst)
Output
[{'name': 'xxxx', 'age': '', 'sex': ''}, {'name': 'yyyy', 'age': '28', 'sex': ''}, {'name': 'zzzz', 'age': '27', 'sex': 'F'}]

Write Values to Empty List of Dictionaries

I have a list of names of unknown length. I want to write each name, together with its key ('name'), into an initially empty list of dictionaries.
Input:
names = ['john', 'bill']
Output:
[{'name': 'john'}, {'name': 'bill'}]
Here it is:
names = ['john', 'bill']
dic = [{'name': name} for name in names]
print(dic)
#[{'name': 'john'}, {'name': 'bill'}]

How to return multiple values in Python map?

I have a list of dicts:
>>> adict = {'name': 'John Doe', 'age': 18}
>>> bdict = {'name': 'Jane Doe', 'age': 20}
>>> l = []
>>> l.append(adict)
>>> l.append(bdict)
>>> l
[{'age': 18, 'name': 'John Doe'}, {'age': 20, 'name': 'Jane Doe'}]
Now I want to split up the values of each dict per key. Currently, this is how I do that:
>>> name_vals, age_vals = [], []
>>> for i in l:
...     name_vals.append(i['name'])
...     age_vals.append(i['age'])
...
>>> name_vals
['John Doe', 'Jane Doe']
>>> age_vals
[18, 20]
Is it possible to achieve this via map? So that I don't have to call map multiple times, but just once?
name_vals, age_vals = map(lambda ....)
A simple & flexible way to do this is to "transpose" your list of dicts into a dict of lists. IMHO, this is easier to work with than creating a bunch of separate lists.
lst = [{'age': 18, 'name': 'John Doe'}, {'age': 20, 'name': 'Jane Doe'}]
out = {}
for d in lst:
    for k, v in d.items():
        out.setdefault(k, []).append(v)
print(out)
output
{'age': [18, 20], 'name': ['John Doe', 'Jane Doe']}
But if you really want to use map and separate lists, there are several options. Willem has shown one way. Here's another.
from operator import itemgetter
lst = [{'age': 18, 'name': 'John Doe'}, {'age': 20, 'name': 'Jane Doe'}]
keys = 'age', 'name'
age_lst, name_lst = [list(map(itemgetter(k), lst)) for k in keys]
print(age_lst, name_lst)
output
[18, 20] ['John Doe', 'Jane Doe']
If you're using Python 2, then the list wrapper around the map call isn't necessary, but it's a good idea to use it to make your code compatible with Python 3.
If all dictionaries have the 'name' and 'age' key, and we can use the zip builtin, we can do this with:
from operator import itemgetter
names, ages = zip(*map(itemgetter('name', 'age'), l))
This will produce tuples:
>>> names
('John Doe', 'Jane Doe')
>>> ages
(18, 20)
If you need lists instead of tuples, wrap the result in an additional map(list, ...).
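A short sketch of that extra map(list, ...) step, assuming every dictionary has both keys and the input list is not empty (with an empty list the unpacking into two names would fail):

```python
from operator import itemgetter

l = [{'name': 'John Doe', 'age': 18}, {'name': 'Jane Doe', 'age': 20}]

# itemgetter('name', 'age') turns each dict into a (name, age) tuple,
# zip(*...) transposes those pairs into one tuple per key,
# and map(list, ...) converts each of those tuples into a list.
names, ages = map(list, zip(*map(itemgetter('name', 'age'), l)))

print(names)  # ['John Doe', 'Jane Doe']
print(ages)   # [18, 20]
```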

dictionary add values for the same keys

I have a list of dictionaries:
[{'name':'Jay', 'value':'1'},{'name':'roc', 'value':'9'},{'name':'Jay', 'value':'7'},{'name':'roc', 'value':'2'}]
I want it to be:
[{'name':'Jay', 'value':'8'},{'name':'roc', 'value':'11'}]
I tried looping through but I am not able to find an example where I can do this. Any hint or idea will be appreciated.
You can use a defaultdict:
lst = [{'name':'Jay', 'value':'1'},{'name':'roc', 'value':'9'},{'name':'Jay', 'value':'7'},{'name':'roc', 'value':'2'}]
1) sum values for each name:
from collections import defaultdict
result = defaultdict(int)
for d in lst:
    result[d['name']] += int(d['value'])
2) convert the name-value pair to a dictionary within a list:
[{'name': name, 'value': value} for name, value in result.items()]
# [{'name': 'roc', 'value': 11}, {'name': 'Jay', 'value': 8}]
Or if you want the value as str type as commented by @Kevin:
[{'name': name, 'value': str(value)} for name, value in result.items()]
# [{'name': 'roc', 'value': '11'}, {'name': 'Jay', 'value': '8'}]
This is a good use case for itertools.groupby.
from itertools import groupby
from operator import itemgetter
orig = [{'name':'Jay', 'value':'1'},
{'name':'roc', 'value':'9'},
{'name':'Jay', 'value':'7'},
{'name':'roc', 'value':'2'}]
get_name = itemgetter('name')
result = [{'name': name, 'value': str(sum(int(d['value']) for d in dicts))}
          for name, dicts in groupby(sorted(orig, key=get_name), key=get_name)]
Breaking it down:
get_name is a function that, given a dictionary, returns the value of its "name" key, i.e. it behaves like get_name = lambda x: x['name'].
sorted returns the list of dictionaries sorted by the value of the "name" key.
groupby returns an iterator of (name, dicts) pairs, where dicts is an iterator over the dicts that share name as the value of the "name" key. (Grouping only occurs for consecutive items with the same key value, hence the need to sort the list in the previous step.)
The result is a list of new dictionaries using the given name and the sum of all the related "value" elements.
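To see why the sort matters, here is a small sketch that counts the group sizes with and without it. The names unsorted_groups and sorted_groups are illustrative only.

```python
from itertools import groupby
from operator import itemgetter

orig = [{'name': 'Jay', 'value': '1'},
        {'name': 'roc', 'value': '9'},
        {'name': 'Jay', 'value': '7'},
        {'name': 'roc', 'value': '2'}]
get_name = itemgetter('name')

# Without sorting, every change of name starts a new group.
unsorted_groups = [(name, len(list(dicts)))
                   for name, dicts in groupby(orig, key=get_name)]
print(unsorted_groups)  # [('Jay', 1), ('roc', 1), ('Jay', 1), ('roc', 1)]

# After sorting, equal names are adjacent and end up in a single group.
sorted_groups = [(name, len(list(dicts)))
                 for name, dicts in groupby(sorted(orig, key=get_name), key=get_name)]
print(sorted_groups)  # [('Jay', 2), ('roc', 2)]
```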
Similar to Psidom's answer but using collections.Counter which is the perfect candidate for accumulating integer values.
import collections
d =[{'name':'Jay', 'value':'1'},{'name':'roc', 'value':'9'},{'name':'Jay', 'value':'7'},{'name':'roc', 'value':'2'}]
c = collections.Counter()
for sd in d:
    c[sd["name"]] += int(sd["value"])
Then, you need to rebuild the dicts if needed, by converting back to string.
print([{"name":n,"value":str(v)} for n,v in c.items()])
result:
[{'name': 'Jay', 'value': '8'}, {'name': 'roc', 'value': '11'}]
For the sake of completeness, without collections.defaultdict:
data = [{'name': 'Jay', 'value': '1'}, {'name': 'roc', 'value': '9'},
{'name': 'Jay', 'value': '7'}, {'name': 'roc', 'value': '2'}]
result = {}
# accumulate the sum per name
for element in data:
    result[element["name"]] = result.get(element["name"], 0) + int(element["value"])
# unpack
result = [{"name": element, "value": result[element]} for element in result]
# optionally, you can loop through result.items()
# you can, also, turn back result[elements] to str if needed
print(result)
# prints: [{'name': 'Jay', 'value': 8}, {'name': 'roc', 'value': 11}]
Another way to solve your question by using groupby from itertools module:
from itertools import groupby
a = [{'name':'Jay', 'value':'1'},{'name':'roc', 'value':'9'},{'name':'Jay', 'value':'7'},{'name':'roc', 'value':'2'}]
final = []
for k,v in groupby(sorted(a, key= lambda x: x["name"]), lambda x: x["name"]):
    final.append({"name": k, "value": str(sum(int(j["value"]) for j in list(v)))})
print(final)
Output:
[{'name': 'Jay', 'value': '8'}, {'name': 'roc', 'value': '11'}]
ld = [{'name':'Jay', 'value':'1'},{'name':'roc', 'value':'9'},{'name':'Jay', 'value':'7'},{'name':'roc', 'value':'2'}]
tempDict = {}
finalList = []
for d in ld:
    name = d['name']
    value = d['value']
    if name not in tempDict:
        tempDict[name] = 0
    tempDict[name] += int(value)
#tempDict => {'Jay': 8, 'roc': 11}
for name,value in tempDict.items():
    finalList.append({'name':name,'value':value})
print(finalList)
# [{'name': 'Jay', 'value': 8}, {'name': 'roc', 'value': 11}]
Here's another way using pandas:
import pandas as pd
names = [{'name':'Jay', 'value':'1'},{'name':'roc', 'value':'9'},{'name':'Jay', 'value':'7'},
{'name':'roc', 'value':'2'}]
df = pd.DataFrame(names)
df['value'] = df['value'].astype(int)
group = df.groupby('name')['value'].sum().to_dict()
result = [{'name': name, 'value': value} for name, value in group.items()]
Which outputs:
[{'value': 8, 'name': 'Jay'}, {'value': 11, 'name': 'roc'}]

How should I remove all dicts from a list that have None as one of their values?

Suppose I have a list like so:
[{'name': 'Blah1', 'age': x}, {'name': 'Blah2', 'age': y}, {'name': None, 'age': None}]
It is guaranteed that both 'name' and 'age' values will either be filled or empty.
I tried this:
for person_dict in list:
    if person_dict['name'] == None:
        list.remove(person_dict)
But obviously that does not work because the for loop skips over an index sometimes and ignores some blank people.
I am relatively new to Python, and I am wondering if there is a list method that can target dicts with a certain value associated with a key.
EDIT: Fixed tuple notation to list as comments pointed out
Just test for the presence of None in each dict's values; this checks ALL of the dict's keys for a None value:
>>> ToD=({'name': 'Blah1', 'age': 'x'}, {'name': 'Blah2', 'age': 'y'}, {'name': None, 'age': None})
>>> [e for e in ToD if None not in e.values()]
[{'name': 'Blah1', 'age': 'x'}, {'name': 'Blah2', 'age': 'y'}]
Or, use filter (on Python 3, filter returns an iterator, so wrap it in tuple or list to see the result):
>>> tuple(filter(lambda d: None not in d.values(), ToD))
({'name': 'Blah1', 'age': 'x'}, {'name': 'Blah2', 'age': 'y'})
Or, if the test is limited to 'name':
>>> tuple(filter(lambda d: d['name'], ToD))
({'name': 'Blah1', 'age': 'x'}, {'name': 'Blah2', 'age': 'y'})
You can use list comprehension as a filter like this
[c_dict for c_dict in dict_lst if all(c_dict[key] is not None for key in c_dict)]
This will make sure that you get only the dictionaries where all the values are not None.
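Deleting from a list while iterating over it forwards skips elements, so walk the indices backwards:
for index in range(len(lis) - 1, -1, -1):
    if lis[index]['name'] is None:
        del lis[index]
You can also try:
lis = [person_dict for person_dict in lis if person_dict['name'] is not None]
Never use list as a variable name; it shadows the built-in type.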
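If other parts of the program hold a reference to the same list, rebinding the name to a new list will not update them. A minimal sketch of filtering in place with slice assignment follows; the names people and alias are hypothetical, and the sample data puts two empty entries next to each other, which is the case where removing during iteration goes wrong.

```python
people = [
    {'name': 'Blah1', 'age': 'x'},
    {'name': None, 'age': None},
    {'name': None, 'age': None},
    {'name': 'Blah2', 'age': 'y'},
]
alias = people  # a second reference to the same list object

# The comprehension builds the filtered list first; the slice assignment
# then replaces the contents of the existing list object in one step.
people[:] = [p for p in people if p['name'] is not None]

print(people)
print(alias is people)  # True: both names still refer to the same list
```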
You can create a new list with the accepted data. If you have a tuple, you have to create a new list anyway, because a tuple cannot be modified in place.
A list comprehension could be faster, but this version is more readable for beginners.
data = ({'name': 'Blah1', 'age': 'x'}, {'name': 'Blah2', 'age': 'y'}, {'name': None, 'age': None})
new_data = []
for x in data:
    if x['name']: # if x['name'] is not None and x['name'] != ''
        new_data.append(x)
print(new_data)
