If I have a dictionary of keys with their corresponding frequency values:
numbers = {'a': 1, 'b': 4, 'c': 1, 'd': 3, 'e': 3}
To find the highest, what I know is:
mode = max(numbers, key=numbers.get)
print(mode)
and that prints:
b
But if I have:
numbers = {'a': 1, 'b': 0, 'c': 1, 'd': 3, 'e': 3}
and apply the 'max' function above, the output is:
d
What I need is:
d,e
Or something similar, displaying both keys.
numbers = {'a': 1, 'b': 0, 'c': 1, 'd': 3, 'e': 3}
max_value = max(numbers.values())
[k for k,v in numbers.items() if v == max_value]
prints
['e', 'd']
This loops over all entries via .items(), checks whether each value equals the maximum, and if so adds the key to a list.
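The same idea can be wrapped in a small helper. This is only a sketch: the name keys_with_max_value is made up for illustration, and it assumes a non-empty dictionary, since max() raises ValueError on an empty sequence.

```python
def keys_with_max_value(mapping):
    """Return every key whose value equals the largest value in mapping."""
    max_value = max(mapping.values())  # raises ValueError if mapping is empty
    return [k for k, v in mapping.items() if v == max_value]


numbers = {'a': 1, 'b': 0, 'c': 1, 'd': 3, 'e': 3}
tied = keys_with_max_value(numbers)
# sorted() so the display does not depend on dict ordering
print(sorted(tied))
```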
numbers = {'a': 1, 'b': 4, 'c': 1, 'd':4 , 'e': 3}
mx_tuple = max(numbers.items(), key=lambda x: x[1])  # max returns the (key, value) tuple with the largest value
max_list = [i[0] for i in numbers.items() if i[1] == mx_tuple[1]]  # mx_tuple[1] is the maximum value in the dictionary
print(max_list)
This code runs in O(n): one O(n) pass to find the maximum value and one O(n) pass for the list comprehension, so the total stays O(n).
Note: O(2n) is equivalent to O(n).
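If a single pass is preferred over two, the tied keys can be collected while scanning. The following is a hedged sketch, not part of the answer above; the function name keys_with_max_single_pass is hypothetical. It is still O(n), just with one traversal.

```python
def keys_with_max_single_pass(mapping):
    """Collect all keys that share the largest value in one traversal."""
    best_value = None
    best_keys = []
    for key, value in mapping.items():
        if not best_keys or value > best_value:
            # strictly larger value found: restart the list of tied keys
            best_value = value
            best_keys = [key]
        elif value == best_value:
            # same value as the current maximum: record the tie
            best_keys.append(key)
    return best_keys


numbers = {'a': 1, 'b': 4, 'c': 1, 'd': 4, 'e': 3}
result = keys_with_max_single_pass(numbers)
print(sorted(result))
```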
The collections.Counter object is useful for this as well. It gives you a .most_common() method, which returns the keys and their counts ordered from the most common to the least:
from collections import Counter
numbers = Counter({'a': 1, 'b': 0, 'c': 1, 'd': 3, 'e': 3})
values = list(numbers.values())
max_value = max(values)
count = values.count(max_value)
numbers.most_common(n=count)
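To make the result of that call concrete, here is a small sketch that also pulls out just the keys. The variable names top and top_keys are made up for illustration.

```python
from collections import Counter

numbers = Counter({'a': 1, 'b': 0, 'c': 1, 'd': 3, 'e': 3})
max_value = max(numbers.values())
# how many entries share the maximum count
count = list(numbers.values()).count(max_value)
# the first `count` entries of most_common() are exactly the tied maxima
top = numbers.most_common(count)
top_keys = [key for key, _ in top]
print(sorted(top_keys))
```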
You can use the .items() method and sort by a tuple of (count, key); for equal counts the key will decide:
d = ['a','b','c','b','c','d','c','d','e','d','b']
from collections import Counter
get_data = Counter(d)
# sort by count, then key
maxmax = sorted(get_data.items(), key=lambda a: (a[1], a[0]))
for elem in maxmax:
    # the list is sorted ascending, so the maximum count sits at the end
    if elem[1] == maxmax[-1][1]:
        print(elem)
Output:
('b', 3)
('c', 3)
('d', 3) # the last one is the one with "highest" key
To get the "highest" key, use maxmax[-1].
How can I turn a list of dicts like this
dico = [{'a':1}, {'b':2}, {'c':1}, {'d':2}, {'e':2}, {'d':3}, {'g':1}, {'h':4}, {'h':2}, {'f':6}, {'a':2}, {'b':2}]
Into a single dict like this
{'a':3, 'b':4, 'c':1, 'd':5,'e':2,'f':6 , 'g':1 ,'h':6}
At the moment when doing this
result = {}
for d in dico:
    result.update(d)
print(result)
Result :
{'a': 2, 'b': 2, 'c': 1, 'd': 3, 'e': 2, 'g': 1, 'h': 2, 'f': 6}
Just replace your dictionary with collections.Counter and it will work:
from collections import Counter
dico = [{'a':1}, {'b':2}, {'c':1}, {'d':2}, {'e':2}, {'d':3}, {'g':1}, {'h':4}, {'h':2}, {'f':6}, {'a':2}, {'b':2}]
result = Counter()
for d in dico:
    result.update(d)
print(result)
Output:
Counter({'h': 6, 'f': 6, 'd': 5, 'b': 4, 'a': 3, 'e': 2, 'c': 1, 'g': 1})
Why update works here for Counter, from the docs:
Elements are counted from an iterable or added-in from another mapping (or counter). Like dict.update() but adds counts instead of replacing them. Also, the iterable is expected to be a sequence of elements, not a sequence of (key, value) pairs.
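A minimal side-by-side sketch of the difference described in that passage (the variable names plain and counted are just for illustration):

```python
from collections import Counter

plain = {'a': 1}
plain.update({'a': 2})      # dict.update replaces the existing value

counted = Counter({'a': 1})
counted.update({'a': 2})    # Counter.update adds to the existing count

print(plain)
print(counted)
```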
Here's a fancy way to do it using collections.Counter, which is a kind of dictionary:
from collections import Counter
def add_dicts(dicts):
    return sum(map(Counter, dicts), Counter())
The above is not efficient for a large number of dictionaries since it creates many intermediate Counter objects for the result, rather than updating one result in-place, so it runs in quadratic time. Here's a similar solution which runs in linear time:
from collections import Counter
def add_dicts(dicts):
    out = Counter()
    for d in dicts:
        out += d
    return out
Using a defaultdict:
from collections import defaultdict
dct = defaultdict(int)
for element in dico:
    for key, value in element.items():
        dct[key] += value
print(dct)
Which yields
defaultdict(<class 'int'>,
{'a': 3, 'b': 4, 'c': 1, 'd': 5, 'e': 2, 'g': 1, 'h': 6, 'f': 6})
As for time measurements, this is a comparison between the four answers:
from collections import defaultdict, Counter
from timeit import timeit
def solution_dani():
    result = sum((Counter(e) for e in dico), Counter())
def solution_kaya():
    return sum(map(Counter, dico), Counter())
def solution_roadrunner():
    result = Counter()
    for d in dico:
        result.update(d)
    return result
def solution_jan():
    dct = defaultdict(int)
    for element in dico:
        for key, value in element.items():
            dct[key] += value
    return dct
print(timeit(solution_dani, number=10000))
print(timeit(solution_kaya, number=10000))
print(timeit(solution_roadrunner, number=10000))
print(timeit(solution_jan, number=10000))
On my MacBookAir this yields
0.839742998
0.8093687279999999
0.18643740100000006
0.04764247300000002
So the solution with a defaultdict is by far the fastest (by a factor of 15-20 over the two sum-based solutions), followed by @RoadRunner's.
Use collections.Counter and sum:
from collections import Counter
dico = [{'a':1}, {'b':2}, {'c':1}, {'d':2}, {'e':2}, {'d':3}, {'g':1}, {'h':4}, {'h':2}, {'f':6}, {'a':2}, {'b':2}]
result = sum((Counter(e) for e in dico), Counter())
print(result)
Output
Counter({'h': 6, 'f': 6, 'd': 5, 'b': 4, 'a': 3, 'e': 2, 'c': 1, 'g': 1})
If you need a strict dictionary, do:
result = dict(sum((Counter(e) for e in dico), Counter()))
print(result)
You could modify your approach, like this:
result = {}
for d in dico:
    for key, value in d.items():
        result[key] = result.get(key, 0) + value
print(result)
The update method will replace the values of existing keys, from the documentation:
Update the dictionary with the key/value pairs from other, overwriting
existing keys.
import collections
counter = collections.Counter()
for d in dico:
    counter.update(d)
result = dict(counter)
print(result)
Output
{'a': 3, 'b': 4, 'c': 1, 'd': 5, 'e': 2, 'g': 1, 'h': 6, 'f': 6}
I need a method to merge two dicts, keeping the max value when a key is present in both dicts.
dict_a maps "A", "B", "C" to 3, 2, 6
dict_b maps "B", "C", "D" to 7, 4, 1
final_dict maps "A", "B", "C", "D" to 3, 7, 6, 1
I got the job half done, but I couldn't figure out how to keep the max value for the 'C' key-value pair.
I used itertools chain() or update().
OK so this works by making a union set of all possible keys dict_a.keys() | dict_b.keys() and then using dict.get which by default returns None if the key is not present (rather than throwing an error). We then take the max (of the one which isn't None).
def none_max(a, b):
    if a is None:
        return b
    if b is None:
        return a
    return max(a, b)
def max_dict(dict_a, dict_b):
    all_keys = dict_a.keys() | dict_b.keys()
    return {k: none_max(dict_a.get(k), dict_b.get(k)) for k in all_keys}
Note that this will work with any comparable values -- many of the other answers fail for negatives or zeros.
Example:
Inputs:
dict_a = {'a': 3, 'b': 2, 'c': 6}
dict_b = {'b': 7, 'c': 4, 'd': 1}
Outputs:
max_dict(dict_a, dict_b) # == {'b': 7, 'c': 6, 'd': 1, 'a': 3}
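To illustrate the note about negatives and zeros, here is a small self-contained check. It repeats the two functions from this answer; the sample values and the name naive are made up, and naive stands for any merge that uses 0 as the default for missing keys.

```python
def none_max(a, b):
    if a is None:
        return b
    if b is None:
        return a
    return max(a, b)


def max_dict(dict_a, dict_b):
    all_keys = dict_a.keys() | dict_b.keys()
    return {k: none_max(dict_a.get(k), dict_b.get(k)) for k in all_keys}


dict_a = {'x': -5, 'y': 0}
dict_b = {'y': -2, 'z': -1}

merged = max_dict(dict_a, dict_b)
# a default of 0 silently overrides negative values of keys found in only one dict
naive = {k: max(dict_a.get(k, 0), dict_b.get(k, 0))
         for k in dict_a.keys() | dict_b.keys()}

print(sorted(merged.items()))
print(sorted(naive.items()))
```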
What about
{
k:max(
dict_a.get(k,-float('inf')),
dict_b.get(k,-float('inf'))
) for k in dict_a.keys()|dict_b.keys()
}
which returns
{'A': 3, 'D': 1, 'C': 6, 'B': 7}
With
>>> dict_a = {'A':3, 'B':2, 'C':6}
>>> dict_b = {'B':7, 'C':4, 'D':1}
Here is a working one liner
from itertools import chain
x = dict(a=30,b=40,c=50)
y = dict(a=100,d=10,c=30)
x = {k:max(x.get(k, 0), y.get(k, 0)) for k in set(chain(x,y))}
In[83]: sorted(x.items())
Out[83]: [('a', 100), ('b', 40), ('c', 50), ('d', 10)]
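For common keys this takes the max of the two values; otherwise it takes the existing value from the corresponding dict. Note that the default of 0 means a key present in only one dict with a negative value would come out as 0, so this works as stated only for non-negative values.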
Extending this so you can have any number of dictionaries in a list rather than just two:
a = {'a': 3, 'b': 2, 'c': 6}
b = {'b': 7, 'c': 4, 'd': 1}
c = {'c': 1, 'd': 5, 'e': 7}
all_dicts = [a,b,c]
from functools import reduce
all_keys = reduce((lambda x,y : x | y),[d.keys() for d in all_dicts])
max_dict = { k : max(d.get(k,0) for d in all_dicts) for k in all_keys }
If you know that all your values are non-negative (or have a clear smallest number), then this one-liner can solve your issue:
a = dict(a=3,b=2,c=6)
b = dict(b=7,c=4,d=1)
merged = { k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b) }
Use your smallest possible number instead of the 0 (e.g. float('-inf') or similar).
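If picking a sentinel is awkward, one alternative is to compare against whatever is already stored. This is a sketch under that assumption; the name merge_max is hypothetical, and it accepts any number of dicts with mutually comparable values.

```python
def merge_max(*dicts):
    """Merge dicts, keeping the largest value seen for each key."""
    merged = {}
    for d in dicts:
        for key, value in d.items():
            # no default needed: a key seen for the first time is stored as-is
            if key not in merged or value > merged[key]:
                merged[key] = value
    return merged


a = dict(a=3, b=2, c=6)
b = dict(b=7, c=4, d=1)
merged = merge_max(a, b)
print(sorted(merged.items()))
```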
Yet another solution:
a = {"A":3, "B":2, "C":6}
b = {"B":7, "C":4, "D":1}
Two liner:
b.update({k: max(a[k], b[k]) for k in a if k in b})  # "k in b" also handles values such as 0
res = {**a, **b}
Or if you don't want to change b:
b_copy = dict(b)
b_copy.update({k: max(a[k], b[k]) for k in a if k in b})  # "k in b" also handles values such as 0
res = {**a, **b_copy}
> {'A': 3, 'B': 7, 'C': 6, 'D': 1}
I have a large numpy array, with each row containing a dict of words, in a similar format to below:
data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}, ... ]
Could someone please point me in the right direction for computing the sum of the values for each unique key across the dicts in the numpy array? From the example above, I would hope to obtain something like this:
result = {'a': 5, 'c': 2, 'ba': 3, ...}
At the moment, the only way I can think of is to iterate through each row of the data and then through each key of the dict: if a new key is found, add it to the result dict with its value; if the key is already in the result, add the value to the existing entry. This seems like an inefficient way to do it, though.
You could use a Counter() and update it with each dictionary contained in data, in a loop:
from collections import Counter
data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}]
c = Counter()
for d in data:
    c.update(d)
output:
Counter({'a': 5, 'ba': 3, 'c': 2})
Alternative one-liner:
(as proposed by @AntonVBR in the comments)
sum((Counter(dict(x)) for x in data), Counter())
A pure Python solution using for-loops:
data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}]
result = {}
for d in data:
    for k, v in d.items():
        if k in result:
            result[k] += v
        else:
            result[k] = v
output:
{'c': 2, 'a': 5, 'ba': 3}
I would like to write a function that takes the following data, 'd', and generates 'L'. How can I achieve this?
def func(**d):
    'do something'
    return [....]
source:
d = {'a': 1, 'b': 2, 'c': [3, 4, 5]}
or
d = {'a': 1, 'b': 2, 'c': [3, 4, 5], 'd': [6, 7]}
TO:
L=[{'a':1,'b':2,'c':3},
{'a':1,'b':2,'c':4},
{'a':1,'b':2,'c':5}]
or
L=[{'a': 1, 'b': 2, 'c': 3, 'd': 6},
{'a': 1, 'b': 2, 'c': 3, 'd': 7},
{'a': 1, 'b': 2, 'c': 4, 'd': 6},
{'a': 1, 'b': 2, 'c': 4, 'd': 7},
{'a': 1, 'b': 2, 'c': 5, 'd': 6},
{'a': 1, 'b': 2, 'c': 5, 'd': 7}]
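import copy
d = {'a': 1, 'b': 2, 'c': [3, 4, 5]}
temp_d = copy.deepcopy(d)  # real copy, so pop() below leaves d untouched
L = []
# the outer loop runs once per key (3 times), which only matches the
# 3 list elements in this example
for key in temp_d:
    item = {}
    for key in temp_d:
        if isinstance(temp_d[key], list):
            item[key] = temp_d[key].pop(0)
        else:
            item[key] = temp_d[key]
    L.append(item)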
Basically, what I am doing here is:
I create a copy of the dictionary 'd' named 'temp_d';
I go through every key in the 'temp_d' dictionary and create an empty dictionary 'item';
I loop again through all the keys in 'temp_d' and check whether the value of the current key is a list. If it is, I add the key to the dictionary 'item' with the first value of the list, using pop(index) (which removes an element from a list and returns it). If the value of the current key isn't a list, I just add the key to 'item' with that value.
After filling the dictionary 'item', I append it to 'L'.
Example in this case:
first key ('a'):
item = {}
first key of second loop ('a'):
is the value of 'a' a list?
no. adds the value.
new item{'a': 1}
second key of second loop ('b'):
is the value of 'b' a list?
no. adds the value.
new item{'a': 1, 'b': 2}
third key of second loop ('c'):
is the value of 'c' a list?
yes. adds the first element of the list, removing it from the list
(the list was [3, 4, 5], now is [4, 5])
new item{'a': 1, 'b': 2, 'c': 3}
appends the item to L
(the 'L' was [], now is [{'a': 1, 'b': 2, 'c': 3}])
etc until the end.
This will work with Python 3:
d = {'a': 1, 'b': 2, 'c': [3, 4, 5]}
def f(**d):
    return [{**d, 'c': i} for i in d.pop('c')]
Your problem can be solved as follows:
from itertools import cycle
def func(indict):
    dictlist = [dict(indict)]  # make copy to not change original dict
    for key in indict:  # loop keys to find lists
        if type(indict[key]) == list:
            listlength = len(indict[key])
            dictlist = listlength * dictlist  # elements are not unique
            for dictindex, listelement in zip(range(len(dictlist)), cycle(indict[key])):
                dictlist[dictindex] = dict(dictlist[dictindex])  # uniquify
                dictlist[dictindex][key] = listelement  # replace list by list element
    return dictlist
In the general case you can have multiple lists in your dict. My solution assumes you want to unroll all of these.
Looking at the details of the solution: it starts by adding a copy of your original dict to dictlist. It then loops over the keys, and whenever it finds a list, it multiplies dictlist by the length of that list. This ensures that dictlist contains the correct number of elements.
However, the elements will not be unique, as they will be references to the same underlying dicts.
To fix this, the elements of dictlist are "uniquified": the loop replaces every element with a copy of itself, and in each copy the list stored under the key is replaced by a single list element, cycling through the list's elements across dictlist.
I know my explanation is a bit messy. I'm sorry about that, but I find it hard to explain in a short and simple way.
Also, the order of the elements in the resulting list is not identical to what you ask for in the question. Since the key-value pairs of a dict are not ordered (before Python 3.7, where insertion order became guaranteed), it is not possible to control the order in which the lists are unrolled, so the order of the resulting list is not guaranteed either.
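For the case with several lists, a different technique is itertools.product, which enumerates every combination directly. The sketch below swaps that in for the cycle-based loop; the name unroll is hypothetical, and the output order assumes Python 3.7+ dict ordering. As far as I can tell, the cycle-based version can repeat combinations when two list lengths share a common factor (for example two lists of length 2), which product avoids.

```python
from itertools import product


def unroll(indict):
    """Expand list-valued entries into one dict per combination of elements."""
    keys = list(indict)
    # wrap scalars in one-element lists so product() treats every value alike
    value_lists = [v if isinstance(v, list) else [v] for v in indict.values()]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


d = {'a': 1, 'b': 2, 'c': [3, 4, 5], 'd': [6, 7]}
L = unroll(d)
for row in L:
    print(row)
```

Because product varies the last list fastest, the rows come out in the same order as the desired output in the question.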