Average value in multiple dictionaries based on key in Python? - python

I have three dictionaries (or more):
A = {'a':1,'b':2,'c':3,'d':4,'e':5}
B = {'b':1,'c':2,'d':3,'e':4,'f':5}
C = {'c':1,'d':2,'e':3,'f':4,'g':5}
How can I get a dictionary of the average values of every key in the three dictionaries?
For example, given the above dictionaries, the output would be:
{'a':1/1, 'b':(2+1)/2, 'c':(3+2+1)/3, 'd':(4+3+2)/3, 'e':(5+4+3)/3, 'f':(5+4)/2, 'g':5/1}

You can use Pandas, like this:
import pandas as pd
df = pd.DataFrame([A,B,C])
answer = dict(df.mean())
print(answer)

I use Counter to solve this problem. Please try the following code :)
from collections import Counter
A = {'a':1,'b':2,'c':3,'d':4,'e':5}
B = {'b':1,'c':2,'d':3,'e':4,'f':5}
C = {'c':1,'d':2,'e':3,'f':4,'g':5}
sums = Counter()
counters = Counter()
for itemset in [A, B, C]:
    sums.update(itemset)
    counters.update(itemset.keys())
ret = {x: float(sums[x])/counters[x] for x in sums.keys()}
print(ret)

The easiest way would be to use collections.Counter as explained here, like this:
from collections import Counter
sums = dict(Counter(A) + Counter(B) + Counter(C))
# Which is {'a': 1, 'c': 6, 'b': 3, 'e': 12, 'd': 9, 'g': 5, 'f': 9}
means = {k: sums[k] / float((k in A) + (k in B) + (k in C)) for k in sums}
The result would be:
>>> means
{'a': 1.0, 'b': 1.5, 'c': 2.0, 'd': 3.0, 'e': 4.0, 'f': 4.5, 'g': 5.0}
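One caveat with adding Counter objects: the result keeps only keys whose total is positive, so a key whose values sum to zero or less silently disappears from sums. Below is a minimal sketch that avoids this and accepts any number of dictionaries. The helper name `average_by_key` is made up for illustration and does not come from the answers above.

```python
from collections import defaultdict

def average_by_key(*dicts):
    """Return {key: mean of that key's values across the dicts that contain it}."""
    totals = defaultdict(float)   # running sum per key
    counts = defaultdict(int)     # number of dicts that contain each key
    for d in dicts:
        for key, value in d.items():
            totals[key] += value
            counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}

A = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
B = {'b': 1, 'c': 2, 'd': 3, 'e': 4, 'f': 5}
C = {'c': 1, 'd': 2, 'e': 3, 'f': 4, 'g': 5}
means = average_by_key(A, B, C)
print(means)
```

Because the sums are kept in a plain mapping rather than a Counter, a key such as 'x' in {'x': -1} and {'x': 1} still shows up with an average of 0.0.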

If you don't want to use any packages and are working in Python 2.7 or later, you can use the following:
keys = set(A) | set(B) | set(C)
D = {key:(A.get(key,0)+B.get(key,0)+C.get(key,0))/float((key in A)+(key in B)+(key in C)) for key in keys}
which outputs
D
{'a': 1.0, 'c': 2.0, 'b': 1.5, 'e': 4.0, 'd': 3.0, 'g': 5.0, 'f': 4.5}
This doesn't work in Python 2.6 and below, though, because those versions lack dict comprehensions.

Here's a very general way to do so (i.e. you can easily switch to any aggregation function):
def aggregate_dicts(dicts, operation=lambda x: sum(x) / len(x)):
    """
    Aggregate a sequence of dictionaries to a single dictionary using `operation`. `operation` should
    reduce a list of all values with the same key. Keys that are not found in one dictionary will
    be mapped to `None`; `operation` can then choose how to deal with those.
    """
    all_keys = set().union(*[el.keys() for el in dicts])
    return {k: operation([dic.get(k, None) for dic in dicts]) for k in all_keys}
Example:
dicts_diff_keys = [{'x': 0, 'y': 1}, {'x': 1, 'y': 2}, {'x': 2, 'y': 3, 'c': 4}]
def mean_no_none(l):
    l_no_none = [el for el in l if el is not None]
    return sum(l_no_none) / len(l_no_none)
aggregate_dicts(dicts_diff_keys, operation=mean_no_none)
# {'x': 1.0, 'c': 4.0, 'y': 2.0}

Related

Unpack list of dictionaries in Python

Question
According to this answer, in Python 3.5 or greater, it is possible to merge two dictionaries x and y by unpacking them:
z = {**x, **y}
Is it possible to unpack a variadic list of dictionaries? Something like
def merge(*dicts):
    return {***dicts} # this fails, of course. What should I use here?
For instance, I would expect that
list_of_dicts = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]
{***list_of_dicts} == {'a': 1, 'b': 2, 'c': 3, 'd': 4}
Note that this question is not about how to merge lists of dictionaries since the link above provides an answer to this. The question here is: is it possible, and how, to unpack lists of dictionaries?
Edit
As stated in the comments, this question is very similar to this one. However, unpacking a list of dictionaries is different from simply merging them. Supposing that there was an operator *** designed to unpack lists of dictionaries, and given
def print_values(a, b, c, d):
    print('a =', a)
    print('b =', b)
    print('c =', c)
    print('d =', d)
list_of_dicts = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]
it would be possible to write
print_values(***list_of_dicts)
instead of
print_values(**merge(*list_of_dicts))
Another solution is using collections.ChainMap
from collections import ChainMap
dict(ChainMap(*list_of_dicts[::-1]))
Out[88]: {'a': 1, 'b': 2, 'c': 3, 'd': 4}
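The reversal matters when keys repeat: ChainMap looks a key up in its first mapping first, so reversing the list makes later dictionaries win, which matches the behaviour of {**x, **y} and dict.update. A small sketch with made-up data:

```python
from collections import ChainMap

dicts = [{'a': 1, 'b': 2}, {'b': 20, 'c': 3}]

# Lookup order: dicts[0], then dicts[1], so the earliest value of 'b' is kept.
first_wins = dict(ChainMap(*dicts))

# Lookup order: dicts[1], then dicts[0], so the latest value of 'b' is kept.
last_wins = dict(ChainMap(*dicts[::-1]))

print(first_wins['b'])
print(last_wins['b'])
```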
You could just iterate over the list and use update:
lst = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]
dct = {}
for item in lst:
    dct.update(item)
print(dct)
# {'a': 1, 'b': 2, 'c': 3, 'd': 4}
There's no syntax for that, but you can use itertools.chain to concatenate the key/value tuples from each dict into a single stream that dict can consume.
from itertools import chain
def merge(*dicts):
    return dict(chain.from_iterable(d.items() for d in dicts))
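You can also build the list of key/value pairs with a list comprehension and pass it to dict (unpacking a list of items views with * would hand dict several positional arguments, and it accepts at most one):
def merge(*dicts):
    return dict([kv for d in dicts for kv in d.items()])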
To merge multiple dictionaries you can use the function reduce:
from functools import reduce
lst = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]
reduce(lambda x, y: dict(**x, **y), lst)
# {'a': 1, 'b': 2, 'c': 3, 'd': 4}
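Note that dict(**x, **y) passes the keys as keyword arguments, so it raises TypeError when a key repeats or is not a string. The sketch below shows the failure and the same reduce written with the {**x, **y} display, which has neither restriction. The sample data is made up for illustration.

```python
from functools import reduce

lst = [{'a': 1, 'b': 2}, {'b': 20, 3: 'three'}]

# Keyword-argument merging fails here: 'b' repeats and 3 is not a string.
try:
    reduce(lambda x, y: dict(**x, **y), lst)
    keyword_merge_failed = False
except TypeError:
    keyword_merge_failed = True

# The dict display accepts any hashable key; later values overwrite earlier ones.
merged = reduce(lambda x, y: {**x, **y}, lst, {})
print(keyword_merge_failed, merged)
```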
You could use a list comprehension to collect the key/value pairs and pass the resulting list as an argument to dict:
def merge(*dicts):
    lst = [kv for d in dicts for kv in d.items()]
    return dict(lst)
You can use a generator expression to iterate over all the dicts in the list, then over each of those dicts' items, and finally convert the resulting pairs to a dict:
>>> lst = [{'a':1}, {'b':2}, {'c':1}, {'d':2}]
>>> dict(kv for d in lst for kv in d.items())
{'a': 1, 'b': 2, 'c': 1, 'd': 2}
You can use reduce to merge two dicts at a time using dict.update
>>> from functools import reduce
>>> lst = [{'a':1}, {'b':2}, {'c':1}, {'d':2}]
>>> reduce(lambda d1, d2: d1.update(d2) or d1, lst, {})
{'a': 1, 'b': 2, 'c': 1, 'd': 2}
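The empty dict passed as the third argument is what keeps the inputs intact: without it, reduce uses the first list element as the accumulator, and update modifies that element in place. A short sketch of the difference (the helper name `merge_in_place` is made up):

```python
from functools import reduce

def merge_in_place(d1, d2):
    d1.update(d2)   # update returns None, so return the accumulator explicitly
    return d1

safe = [{'a': 1}, {'b': 2}]
merged = reduce(merge_in_place, safe, {})        # accumulates into a fresh dict

unsafe = [{'a': 1}, {'b': 2}]
merged_unsafe = reduce(merge_in_place, unsafe)   # accumulates into unsafe[0]

print(safe[0])     # the first input is unchanged
print(unsafe[0])   # the first input now holds the merged result
```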
When you declare *dicts, the arguments are packed into a tuple, so if you pass the list itself you can pull it back out with dicts[0], then use this comprehension for non-uniform keys:
list_of_dicts = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]
def merge(*dicts):
    return dict(j for i in dicts[0] for j in i.items())
print(merge(list_of_dicts))
{'a': 1, 'b': 2, 'c': 3, 'd': 4}

Finding all the dicts of max len in a list of dicts

I have a list of dictionaries
ld = [{'a': 1}, {'b': 2, 'c': 3}, {'d': 4, 'e': 5}]
I need to get all the elements with the longest length from my list, i.e.
{'b': 2, 'c': 3} and {'d': 4, 'e': 5}.
I'm not very knowledgeable in Python but I found that:
>>> max(ld, key=len)
{'b': 2, 'c': 3}
And, an even better solution that returns the index of the longest length dictionary:
>>> max(enumerate(ld), key=lambda tup: len(tup[1]))
(1, {'b': 2, 'c': 3})
I would like to use an expression that would return something like
(1: {'b': 2, 'c': 3}, 2: {'d': 4, 'e': 5})
and I feel like I'm not far from the solution (or maybe I am) but I just don't know how to get it.
You can find the length of the longest dictionary in the structure, and then filter with a generator expression passed to dict:
ld = [{'a':1}, {'b':2, 'c':3}, {'d':4, 'e':5}]
_max = max(map(len, ld))
new_result = dict(i for i in enumerate(ld) if len(i[-1]) == _max)
Output:
{1: {'b': 2, 'c': 3}, 2: {'d': 4, 'e': 5}}
Ajax1234 provided a really good solution. If you want something of a beginner level, here's a solution.
ld = [{'a':1}, {'b':2, 'c':3}, {'d':4, 'e':5}]
ans = dict()
for value in ld:
    if len(value) in ans:
        ans[len(value)].append(value)
    else:
        ans[len(value)] = list()
        ans[len(value)].append(value)
ans[max(ans)]
Basically, you group the dictionaries by length: each key of ans is a dictionary size and each value is the list of dictionaries of that size. You then look up the list stored under the largest key.
There are a number of ways you could do this in python. Here's one example which illustrates a few different python capabilities:
ld = [{'a':1}, {'b':2, 'c':3}, {'d':4, 'e':5}]
lengths = list(map(len, ld)) # [1, 2, 2]
max_len = max(lengths) # 2
index_to_max_length_dictionaries = {
    index: dictionary
    for index, dictionary in enumerate(ld)
    if len(dictionary) == max_len
}
# output: {1: {'b': 2, 'c': 3}, 2: {'d': 4, 'e': 5}}
Find the maximum length, and then use a dictionary comprehension to find the dictionaries with that length:
max_l = len(max(ld, key=len))
result = {i: d for i, d in enumerate(ld) if len(d) == max_l}
This is the simplest and most readable approach you can take.
Below is another, more efficient (but more verbose) approach:
max_length = 0
result = dict()
for i, d in enumerate(ld):
    l = len(d)
    if l == max_length:
        result[i] = d
    elif l > max_length:
        max_length = l
        result = {i: d}
This is the most efficient approach. It iterates through the full input list only once.
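The two approaches also differ on an empty list: max raises ValueError when given no items, while the single-pass loop simply returns an empty dict. A sketch wrapping both in functions (the function names are made up for illustration):

```python
def longest_two_pass(ld):
    """Find the max length first, then filter. Raises ValueError if ld is empty."""
    max_l = max(map(len, ld))
    return {i: d for i, d in enumerate(ld) if len(d) == max_l}

def longest_one_pass(ld):
    """Track the longest length seen so far while iterating once."""
    max_length = 0
    result = {}
    for i, d in enumerate(ld):
        length = len(d)
        if length == max_length:
            result[i] = d
        elif length > max_length:
            max_length = length
            result = {i: d}   # a longer dict invalidates earlier candidates
    return result

ld = [{'a': 1}, {'b': 2, 'c': 3}, {'d': 4, 'e': 5}]
print(longest_two_pass(ld))
print(longest_one_pass(ld))
print(longest_one_pass([]))
```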

Outputting dictionary values by min/max value

Let's say I have a dictionary:
dict1 = {'a': 3, 'b': 1.2, 'c': 1.6, 'd': 3.88, 'e': 0.72}
I need to be able to sort this by min and max value and call on them using this function I am still writing (note: 'occurrences', 'avg_scores' and 'std_dev' are all dictionaries, and 'words' are the dictionaries' keys):
def sort(words, occurrences, avg_scores, std_dev):
    '''sorts and prints the output'''
    menu = menu_validate("You must choose one of the valid choices of 1, 2, 3, 4 \n Sort Options\n 1. Sort by Avg Ascending\n 2. Sort by Avg Descending\n 3. Sort by Std Deviation Ascending\n 4. Sort by Std Deviation Descending", 1, 4)
    print ("{}{}{}{}\n{}".format("Word", "Occurence", "Avg. Score", "Std. Dev.", "="*51))
    if menu == 1:
        for i in range (len(word_list)):
            print ("{}{}{}{}".format(cnt_list.sorted[i],)
I'm sure I am making this way more difficult on myself than necessary and any help would be appreciated. Thanks!
You can sort the keys based on the associated value. For instance:
>>> dict1 = {'a': 3, 'b': 1.2, 'c': 1.6, 'd': 3.88, 'e': 0.72}
>>> for k in sorted(dict1, key=dict1.get):
...     print(k, dict1[k])
...
e 0.72
b 1.2
c 1.6
a 3
d 3.88
Use min and max with key:
dict1 = {'a': 3, 'b': 1.2, 'c': 1.6, 'd': 3.88, 'e': 0.72}
min_v = min(dict1.items(), key=lambda x: x[1])
max_v = max(dict1.items(), key=lambda x: x[1])
print(min_v, max_v)
You can't sort a dict, only its representation.
But you can use an OrderedDict instead.
from collections import OrderedDict
dictionary = OrderedDict(
    sorted(
        {'a': 3, 'b': 1.2, 'c': 1.6, 'd': 3.88, 'e': 0.72
        }.items(), key=lambda x: x[1], reverse=True))
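On Python 3.7 and later, plain dicts preserve insertion order, so an OrderedDict is optional there. A sketch of sorting by value in either direction with a plain dict (the helper name `sort_by_value` is hypothetical):

```python
def sort_by_value(d, descending=False):
    """Return a new dict whose items are ordered by value."""
    return dict(sorted(d.items(), key=lambda item: item[1], reverse=descending))

dict1 = {'a': 3, 'b': 1.2, 'c': 1.6, 'd': 3.88, 'e': 0.72}
ascending = sort_by_value(dict1)
descending = sort_by_value(dict1, descending=True)

# Iterating a dict yields its keys in insertion order.
print(list(ascending))
print(list(descending))
```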

How to combine a list of dictionaries to one dictionary

I have a list of dicts:
d =[{'a': 4}, {'b': 20}, {'c': 5}, {'d': 3}]
I want to remove the curly braces and convert d to a single dict which looks like:
d ={'a': 4, 'b': 20, 'c': 5, 'd': 3}
If you don't mind duplicate keys replacing earlier keys you can use:
from functools import reduce # Python 3 compatibility
d = reduce(lambda a, b: dict(a, **b), d)
This merges the first two dictionaries then merges each following dictionary into the result built so far.
Demo:
>>> d =[{'a': 4}, {'b': 20}, {'c': 5}, {'d': 3}]
>>> reduce(lambda a, b: dict(a, **b), d)
{'a': 4, 'c': 5, 'b': 20, 'd': 3}
Or, if you need this to work for arbitrary (non-string) keys (and you are using Python 3.5 or greater):
>>> d =[{4: 4}, {20: 20}, {5: 5}, {3: 3}]
>>> reduce(lambda a, b: dict(a, **b), d) # This won't work
TypeError: keywords must be strings
>>> reduce(lambda a, b: {**a, **b}, d) # Use this instead
{4: 4, 20: 20, 5: 5, 3: 3}
The first solution hacks the behaviour of keyword arguments to the dict function. The second solution is using the more general ** operator introduced in Python 3.5.
You just need to iterate over d and merge each element into a new dict (e.g. newD) with update():
d =[{'a': 4}, {'b': 20}, {'c': 5}, {'d': 3}]
newD = {}
for entry in d:
    newD.update(entry)
>>> newD
{'c': 5, 'b': 20, 'a': 4, 'd': 3}
Note: If there are duplicate keys in d, the value from the last one will appear in newD.
If overwriting the values of existing keys is acceptable, a brute-force, naive solution is
nd = {}
for el in d:
    for k,v in el.items():
        nd[k] = v
or, written as a dictionary comprehension:
d = {k:v for el in d for k,v in el.items()}
a = [{'a': 4}, {'b': 20}, {'c': 5}, {'d': 3}]
b = {}
[b.update(c) for c in a]
# b is now {'a': 4, 'b': 20, 'c': 5, 'd': 3}
If order is important:
from collections import OrderedDict
a = [{'a': 4}, {'b': 20}, {'c': 5}, {'d': 3}]
newD = OrderedDict()
[newD.update(c) for c in a]
out = dict(newD)

Python - Find non mutual items in two dicts

Let's say I have two dictionaries:
a = {'a': 1, 'b': 2, 'c': 3}
b = {'b': 2, 'c': 3, 'd': 4, 'e': 5}
What's the most pythonic way to find the non mutual items between the two of them such that for a and b I would get:
{'a': 1, 'd': 4, 'e': 5}
I had thought:
{key: b[key] for key in b if not a.get(key)}
but that only goes one way (b items not in a) and
a_only = {key: a[key] for key in a if not b.get(key)}.items()
b_only = {key: b[key] for key in b if not a.get(key)}.items()
dict(a_only + b_only)
seems very messy. Any other solutions?
>>> dict(set(a.items()) ^ set(b.items()))
{'a': 1, 'e': 5, 'd': 4}
Try the symmetric difference of set():
out = {}
for key in set(a.keys()) ^ set(b.keys()):
    out[key] = a.get(key, b.get(key))
diff = {key: a[key] for key in a if key not in b}
diff.update((key,b[key]) for key in b if key not in a)
This is just a slightly cheaper version of what you have.
>>> a = {'a': 1, 'b': 2, 'c': 3}
>>> b = {'b': 2, 'c': 3, 'd': 4, 'e': 5}
>>> keys = set(a.keys()).symmetric_difference(set(b.keys()))
>>> result = {}
>>> for k in keys: result[k] = a.get(k, b.get(k))
...
>>> result
{'a': 1, 'e': 5, 'd': 4}
Whether this is less messy than your version is debatable, but at least it doesn't re-implement symmetric_difference.
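The answers above are not equivalent when both dicts share a key with different values: the symmetric difference of the items treats ('b', 2) and ('b', 99) as distinct pairs and keeps both, while the key-based versions drop the key entirely. The sketch below shows the difference with made-up data. It relies on Python 3 key and item views supporting ^ directly, which for item views requires hashable values.

```python
a = {'a': 1, 'b': 2}
b = {'b': 99, 'd': 4}

# Key-based: 'b' is in both dicts, so it is excluded regardless of its values.
by_keys = {k: (a[k] if k in a else b[k]) for k in a.keys() ^ b.keys()}

# Item-based: ('b', 2) and ('b', 99) differ, so both pairs survive.
item_pairs = a.items() ^ b.items()

print(by_keys)
print(sorted(item_pairs))
```

Passing item_pairs to dict would keep 'b' with whichever of the two values happens to be iterated last, so the key-based form is the safer choice when values can differ.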
