Python collection counter
Curious if there is a better way to multiply two Counters key by key. Would overriding a Counter class method be the way to go?
Counter does not define the * operator itself, so for now I loop over both counters:
from collections import Counter
a = Counter({'b': 4, 'c': 2, 'a': 1})
b = Counter({'b': 8, 'c': 4, 'a': 2})
newcounter = Counter()
for x in a.elements():
    for y in b.elements():
        if x == y:
            newcounter[x] = a[x]*b[y]
>>> newcounter
Counter({'b': 32, 'c': 8, 'a': 2})
Assuming a and b always have the same keys, you can achieve this with a dictionary comprehension as follows:
a = Counter({'b': 4, 'c': 2, 'a': 1})
b = Counter({'b': 8, 'c': 4, 'a': 2})
c = Counter({k:a[k]*b[k] for k in a})
print(c)
Output
Counter({'b': 32, 'c': 8, 'a': 2})
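One caveat with the comprehension above: a Counter returns 0 for a missing key, so if a has a key that b lacks, that key still appears in the result with a product of 0. A minimal sketch, where the extra key 'd' is made up for illustration:

```python
from collections import Counter

a = Counter({'b': 4, 'c': 2, 'a': 1, 'd': 4})  # 'd' is an extra, illustrative key
b = Counter({'b': 8, 'c': 4, 'a': 2})

# b['d'] evaluates to 0 because Counter returns 0 for missing keys
c = Counter({k: a[k] * b[k] for k in a})
print(c['d'])    # 0
print('d' in c)  # True
```

If zero entries are unwanted, restricting the comprehension to the keys common to both counters avoids them.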
You can get the intersection of the keys if you don't have identical dicts:
from collections import Counter
a = Counter({'b': 4, 'c': 2, 'a': 1, "d":4})
b = Counter({'b': 8, 'c': 4, 'a': 2})
# Python 3; on Python 2 use a.viewkeys() instead of a.keys()
print(Counter({k: a[k] * b[k] for k in a.keys() & b}))
Counter({'b': 32, 'c': 8, 'a': 2})
Or, if you want the union of the keys, you can OR the key views and use dict.get with a default of 1:
from collections import Counter
a = Counter({'b': 4, 'c': 2, 'a': 1, "d":4})
b = Counter({'b': 8, 'c': 4, 'a': 2})
print(Counter({k: a.get(k, 1) * b.get(k, 1) for k in a.keys() | b}))
Counter({'b': 32, 'c': 8, 'd': 4, 'a': 2})
If you wanted to be able to use the * operator on the Counter dicts you would have to roll your own:
class _Counter(Counter):
    def __mul__(self, other):
        # multiply only the keys present in both counters
        return _Counter({k: self[k] * other[k] for k in self.keys() & other})
a = _Counter({'b': 4, 'c': 2, 'a': 1, "d": 4})
b = _Counter({'b': 8, 'c': 4, 'a': 2})
print(a * b)
Which would give you:
_Counter({'b': 32, 'c': 8, 'a': 2})
If you want the in-place operator (*=) as well:
from collections import Counter
class _Counter(Counter):
    def __imul__(self, other):
        return _Counter({k: self[k] * other[k] for k in self.keys() & other})
Output:
In [28]: a = _Counter({'b': 4, 'c': 2, 'a': 1, "d": 4})
In [29]: b = _Counter({'b': 8, 'c': 4, 'a': 2})
In [30]: a *= b
In [31]: a
Out[31]: _Counter({'a': 2, 'b': 32, 'c': 8})
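Note that the __imul__ above builds and returns a new _Counter, so a is rebound rather than modified. If other references to the same object need to see the change, a variant that mutates self could look like the sketch below. The class name InPlaceCounter is hypothetical, chosen only for this example:

```python
from collections import Counter

class InPlaceCounter(Counter):  # hypothetical name, for illustration only
    def __imul__(self, other):
        if not isinstance(other, Counter):
            return NotImplemented
        # iterate over a snapshot of the keys because we delete while looping
        for k in list(self):
            if k in other:
                self[k] *= other[k]
            else:
                del self[k]  # keep only keys present in both counters
        return self

a = InPlaceCounter({'b': 4, 'c': 2, 'a': 1, 'd': 4})
b = InPlaceCounter({'b': 8, 'c': 4, 'a': 2})
alias = a
a *= b
print(alias is a)  # True
print(dict(a))     # {'b': 32, 'c': 8, 'a': 2}
```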
This seems a bit better:
a = Counter({'b': 4, 'c': 2, 'a': 1})
b = Counter({'b': 8, 'c': 4, 'a': 2})
newcounter = Counter({k:a[k]*v for k,v in b.items()})
>>> newcounter
Counter({'b': 32, 'c': 8, 'a': 2})
Related
I have this list of dictionaries:
list_of_dicts = [{'A':1,'B':2,'C':3,'D':4,'E':5}, {'A':1,'B':1,'C':1,'D':1,'E':1}, {'A':2,'B':2,'C':2,'D':2,'E':2}]
To sum up values, I can use counter like this:
from collections import Counter
import functools, operator
# sum the values with same keys
counter = Counter()
for d in list_of_dicts:
    counter.update(d)
result = dict(counter)
result
{'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E': 8}
But how can I sum the values when some keys hold a list?
list_of_dicts = [{'A':1,'B':2,'C':3,'D':4,'E':[1,2,3]}, {'A':1,'B':1,'C':1,'D':1,'E':[1,2,3]}, {'A':2,'B':2,'C':2,'D':2,'E':[1,2,3]}]
I want to get this result:
{'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E':[3,6,9]}
If you cannot use numpy, you can try this (using collections.defaultdict):
from collections import defaultdict
list_of_dicts = [{'A':1,'B':2,'C':3,'D':4,'E':[1,2,3]},
                 {'A':1,'B':1,'C':1,'D':1,'E':[1,2,3]},
                 {'A':2,'B':2,'C':2,'D':2,'E':[1,2,3]}]
dct = defaultdict(list)
# collect all values per key
for l in list_of_dicts:
    for k,v in l.items():
        dct[k].append(v)
# reduce each collected list: element-wise for lists, plain sum otherwise
for k,v in dct.items():
    if isinstance(v[0],list):
        dct[k] = [sum(x) for x in zip(*v)]
    else:
        dct[k] = sum(v)
Output:
>>> dct
defaultdict(list, {'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E': [3, 6, 9]})
If you can use numpy you can try this:
import numpy as np
dct = defaultdict(list)
for l in list_of_dicts:
    for k,v in l.items():
        dct[k].append(v)
for k,v in dct.items():
    dct[k] = np.array(v).sum(axis=0)
Output:
>>> dct
defaultdict(list, {'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E': array([3, 6, 9])})
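For reference, the same result can be built in a single pass without collecting the values first. This is only a sketch: the helper name sum_dicts is made up, and it assumes that lists stored under the same key all have the same length (zip silently truncates otherwise).

```python
def sum_dicts(dicts):
    """Sum values key by key; list values are summed element-wise."""
    total = {}
    for d in dicts:
        for k, v in d.items():
            if k not in total:
                # copy lists so the input dictionaries are never mutated
                total[k] = list(v) if isinstance(v, list) else v
            elif isinstance(v, list):
                total[k] = [x + y for x, y in zip(total[k], v)]
            else:
                total[k] += v
    return total

list_of_dicts = [{'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': [1, 2, 3]},
                 {'A': 1, 'B': 1, 'C': 1, 'D': 1, 'E': [1, 2, 3]},
                 {'A': 2, 'B': 2, 'C': 2, 'D': 2, 'E': [1, 2, 3]}]
print(sum_dicts(list_of_dicts))
# {'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E': [3, 6, 9]}
```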
I have a list of dictionaries that all share the same keys:
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]
I need to get something like this:
a = [1, 4, 7]
b = [2, 5, 8]
c = [3, 6, 9]
I know how to do it using for .. in .., but is there a way to do it without looping?
If I do
a, b, c = zip(*my_list)
I get
a = ('a', 'a', 'a')
b = ('b', 'b', 'b')
c = ('c', 'c', 'c')
Any solution?
You need to extract all the values in my_list. You could try:
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]
a, b, c = zip(*map(lambda d: d.values(), my_list))
print(a, b, c)
# (1, 4, 7) (2, 5, 8) (3, 6, 9)
As Alexandre pointed out, this works only when the dicts are ordered and every dict lists its keys in the same order. If you cannot guarantee the order, consider yatu's answer.
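One way to keep the zip approach without depending on key order is to name the keys explicitly. A minimal sketch using operator.itemgetter, with the sample dicts deliberately shuffled to show that order no longer matters:

```python
from operator import itemgetter

# keys appear in a different order in each dict on purpose
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'c': 6, 'a': 4, 'b': 5}, {'b': 8, 'c': 9, 'a': 7}]

# itemgetter('a', 'b', 'c') returns a tuple (d['a'], d['b'], d['c']) for each dict
a, b, c = zip(*map(itemgetter('a', 'b', 'c'), my_list))
print(a, b, c)
# (1, 4, 7) (2, 5, 8) (3, 6, 9)
```

The trade-off is that the key names must be known in advance.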
You will have to loop to obtain the values from the inner dictionaries. Probably the most appropriate structure is a dictionary that maps each letter to a list of its values. Assigning to separate variables is usually not the best idea, as it only works with a fixed number of variables.
You can iterate over the inner dictionaries, and append to a defaultdict as:
from collections import defaultdict
out = defaultdict(list)
for d in my_list:
    for k,v in d.items():
        out[k].append(v)
print(out)
#defaultdict(list, {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]})
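If every dictionary is known to contain the same keys, the same mapping can also be written as a nested comprehension. This sketch assumes my_list is non-empty and takes the key set from its first element:

```python
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]

# for each key of the first dict, gather that key's value from every dict
out = {k: [d[k] for d in my_list] for k in my_list[0]}
print(out)
# {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}
```

Unlike the defaultdict version, this raises a KeyError if a later dictionary is missing one of the keys.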
pandas has a DataFrame factory method for exactly this, so it is an option if you already have pandas as a dependency or if the input data is large enough:
import pandas as pd
my_list = ...
df = pd.DataFrame.from_records(my_list)
a = list(df['a']) # df['a'] is a pandas Series, essentially a wrapped C array
b = list(df['b'])
c = list(df['c'])
Please find the code below. I believe that the version with a loop is much easier to read.
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]
# we assume that all dictionaries have the same keys
a, b, c = map(list, map(lambda k: map(lambda d: d[k], my_list), my_list[0]))
print(a,b,c)
Given any number of dictionaries, how would one go about merging them all together, such that the merged dictionary contains all the dictionaries' keys and sums the values of keys they share?
E.g.:
d1 = {'a': 2, 'b': 3, 'c': 1}
d2 = {'a': 3, 'b': 2, 'c': 3}
d3 = {'b': 8, 'd': 2}
Our merged dictionary would look like this:
{'a': 5, 'b': 13, 'c': 4, 'd': 2}
Can this be done via kwargs? I am aware that one can do:
{**d1, **d2, **d3}
But can this be done for n-defined dictionaries?
You can use a Counter:
from collections import Counter
d1 = {'a': 2, 'b': 3, 'c': 1}
d2 = {'a': 3, 'b': 2, 'c': 3}
d3 = {'b': 8, 'd': 2}
list_of_dicts = [d1, d2, d3]
cnt = Counter()
for d in list_of_dicts:
    cnt.update(d)
print(cnt)
Counter({'b': 13, 'a': 5, 'c': 4, 'd': 2})
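For positive values, the loop can also be folded into a single expression with sum() and an empty Counter as the start value. This is a sketch rather than a drop-in replacement: the + operator on Counters discards keys whose total is zero or negative, whereas update() keeps them.

```python
from collections import Counter

dicts = [{'a': 2, 'b': 3, 'c': 1}, {'a': 3, 'b': 2, 'c': 3}, {'b': 8, 'd': 2}]

# sum() repeatedly applies Counter + Counter, starting from an empty Counter
merged = sum(map(Counter, dicts), Counter())
print(dict(merged))
# {'a': 5, 'b': 13, 'c': 4, 'd': 2}
```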
Per your comment regarding defaultdict, here is an approach along those lines. That said, I prefer the Counter approach in Raphael's answer.
from collections import defaultdict
def mergesum(*dicts):
    merged = defaultdict(int)
    # flatten all (key, value) pairs and accumulate per key
    for k, v in (item for d in dicts for item in d.items()):
        merged[k] += v
    return merged
d1 = {'a': 2, 'b': 3, 'c': 1}
d2 = {'a': 3, 'b': 2, 'c': 3}
d3 = {'b': 8, 'd': 2}
result = mergesum(d1, d2, d3)
print(result)
# defaultdict(<class 'int'>, {'a': 5, 'b': 13, 'c': 4, 'd': 2})
As I understand it, when I convert dicts to Counters and add them, keys whose value is zero disappear from the result.
from collections import Counter
a = {"a": 1, "b": 5, "d": 0}
b = {"b": 1, "c": 2}
print(Counter(a) + Counter(b))
How can I keep those keys?
This is my expected result:
Counter({'b': 6, 'c': 2, 'a': 1, 'd': 0})
You can use the update() method of Counter instead of the + operator. Example:
>>> a = {"a": 1, "b": 5, "d": 0}
>>> b = {"b": 1, "c": 2}
>>> x = Counter(a)
>>> x.update(Counter(b))
>>> x
Counter({'b': 6, 'c': 2, 'a': 1, 'd': 0})
The update() method adds counts instead of replacing them, and it does not remove keys with a zero count either. We can also build Counter(b) first and then update it with Counter(a). Example:
>>> y = Counter(b)
>>> y.update(Counter(a))
>>> y
Counter({'b': 6, 'c': 2, 'a': 1, 'd': 0})
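To see the difference side by side, here is a small sketch using the dicts from the question. It also shows that update() accepts a plain mapping, so wrapping the argument in Counter is optional:

```python
from collections import Counter

a = {"a": 1, "b": 5, "d": 0}
b = {"b": 1, "c": 2}

added = Counter(a) + Counter(b)  # binary + keeps only positive counts
updated = Counter(a)
updated.update(b)                # in place; zero counts survive

print('d' in added)    # False
print('d' in updated)  # True
print(updated['d'])    # 0
```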
Unfortunately, when summing two counters with +, only elements with a positive count are kept.
If you want to keep the elements with a count of zero, you could define a function like this:
def addall(a, b):
    c = Counter(a)    # copy the counter a, preserving the zero elements
    for x in b:       # for each key in the other counter
        c[x] += b[x]  # add the value in the other counter to the first
    return c
You can just subclass Counter and adjust its __add__ method:
from collections import Counter
class MyCounter(Counter):
    def __add__(self, other):
        """Add counts from two counters.

        Preserves counts with zero values.

        >>> MyCounter('abbb') + MyCounter('bcc')
        MyCounter({'b': 4, 'c': 2, 'a': 1})
        >>> MyCounter({'a': 1, 'b': 0}) + MyCounter({'a': 2, 'c': 3})
        MyCounter({'a': 3, 'c': 3, 'b': 0})
        """
        if not isinstance(other, Counter):
            return NotImplemented
        result = MyCounter()
        for elem, count in self.items():
            newcount = count + other[elem]
            result[elem] = newcount
        for elem, count in other.items():
            if elem not in self:
                result[elem] = count
        return result
counter1 = MyCounter({'a': 1, 'b': 0})
counter2 = MyCounter({'a': 2, 'c': 3})
print(counter1 + counter2) # MyCounter({'a': 3, 'c': 3, 'b': 0})
To add to Anand S Kumar's answer:
Even if your dict contains negative values, update() still keeps those keys, whereas + drops them.
from collections import Counter
a = {"a": 1, "b": 5, "d": -1}
b = {"b": 1, "c": 2}
print(Counter(a) + Counter(b))
#Counter({'b': 6, 'c': 2, 'a': 1})
x = Counter(a)
x.update(Counter(b))
print(x)
#Counter({'b': 6, 'c': 2, 'a': 1, 'd': -1})
I have two dictionaries in Python:
d1 = {'a': 10, 'b': 9, 'c': 8, 'd': 7}
d2 = {'a': 1, 'b': 2, 'c': 3, 'e': 2}
I want to subtract the values of d2 from d1 (d1 - d2) and get the result:
d3 = {'a': 9, 'b': 7, 'c': 5, 'd': 7 }
Right now I'm using two nested loops, but this solution is not very fast:
for x,i in enumerate(d2.keys()):
for y,j in enumerate(d1.keys()):
I think a very Pythonic way would be using dict comprehension:
d3 = {key: d1[key] - d2.get(key, 0) for key in d1}
Note that this only works in Python 2.7+ or 3.
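The comprehension above keeps only the keys of d1, which matches the expected result in the question. If keys that exist only in d2 should appear as well (as negative values), a variant over the union of both key sets is one option. A sketch for Python 3, where the name d3_all is made up for this example:

```python
d1 = {'a': 10, 'b': 9, 'c': 8, 'd': 7}
d2 = {'a': 1, 'b': 2, 'c': 3, 'e': 2}

# iterate over the union of keys; a key missing on either side counts as 0
d3_all = {key: d1.get(key, 0) - d2.get(key, 0) for key in d1.keys() | d2.keys()}
print(d3_all['d'])  # 7
print(d3_all['e'])  # -2
```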
Use collections.Counter, but only if all resulting values are known to be strictly positive. The syntax is very easy:
>>> from collections import Counter
>>> d1 = Counter({'a': 10, 'b': 9, 'c': 8, 'd': 7})
>>> d2 = Counter({'a': 1, 'b': 2, 'c': 3, 'e': 2})
>>> d3 = d1 - d2
>>> print(d3)
Counter({'a': 9, 'b': 7, 'd': 7, 'c': 5})
Mind the following if not all values are known to remain strictly positive:
- elements with values that become zero will be omitted from the result
- elements with values that become negative will be missing, or replaced with wrong values. E.g., print(d2-d1) can yield Counter({'e': 2}).
Just an update to Haidro's answer.
It is recommended to use the subtract method instead of "-".
d1.subtract(d2)
When - is used, only keys with a positive resulting count are kept in the result.
See the examples below:
c = Counter(a=4, b=2, c=0, d=-2)
d = Counter(a=1, b=2, c=3, d=4)
a = c-d
print(a) # --> Counter({'a': 3})
c.subtract(d)
print(c) # --> Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})
Note that subtract() modifies the Counter in place.
Finally, use dict(c) to get a plain dictionary from the Counter object.
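Applied to the dictionaries from the question, subtract() also creates an entry for 'e' (with value -2), which the expected result does not contain. A minimal sketch that filters the result back down to the keys of d1:

```python
from collections import Counter

d1 = {'a': 10, 'b': 9, 'c': 8, 'd': 7}
d2 = {'a': 1, 'b': 2, 'c': 3, 'e': 2}

c = Counter(d1)
c.subtract(d2)              # in place; 'e' ends up at -2
d3 = {k: c[k] for k in d1}  # keep only the keys that were in d1
print(d3)
# {'a': 9, 'b': 7, 'c': 5, 'd': 7}
```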
Haidro posted an easy solution, but even without collections you only need one loop:
d1 = {'a': 10, 'b': 9, 'c': 8, 'd': 7}
d2 = {'a': 1, 'b': 2, 'c': 3, 'e': 2}
d3 = {}
for k, v in d1.items():
    d3[k] = v - d2.get(k, 0) # returns value if k exists in d2, otherwise 0
print(d3) # {'c': 5, 'b': 7, 'a': 9, 'd': 7}