Creating Python defaultdict using nested list of tuples - python

The scenario is that I have a 2-D list. Each item of the inner list is tuple (key, value pair). The key might repeat in the list. I want to create a default-dict on the fly, in such a way that finally, the dictionary stores the key, and the cumulative sum of all the values of that key from the 2-D list.
To put the code :
listOfItems = [[('a', 1), ('b', 3)], [('a', 6)], [('c', 0), ('d', 5), ('b', 2)]]
finalDict = defaultdict(int)
for eachItem in listOfItems:
for key, val in eachItem:
finalDict[key] += val
print(finalDict)
This is giving me what I want : defaultdict(<class 'int'>, {'a': 7, 'b': 5, 'c': 0, 'd': 5}) but I am looking for a more 'Pythonic' way using comprehensions. So I tried the below :
finalDict = defaultdict(int)
finalDict = {key : finalDict[key]+val for eachItem in listOfItems for key, val in eachItem}
print(finalDict)
But the output is : {'a': 6, 'b': 2, 'c': 0, 'd': 5} What is it that I am doing wrong? Or is it that when using comprehension the Dictionary is not created and modified on the fly?

Yes a comprehension can't be updated on-the-fly. Anyway, this task might be better suited to collections.Counter() with .update() calls:
>>> from collections import Counter
>>> c = Counter()
>>> for eachItem in listOfItems:
... c.update(dict(eachItem))
...
>>> c
Counter({'a': 7, 'b': 5, 'd': 5, 'c': 0})

This is because you do not assign any value to your finalDict inside your dict in comprehension.
In your dict in comprehension you are literally changing the type of finalDict
As far as I know you cannot assign value to your dict inside a dict in comprehension.
Here is a way to get the dictionnary you want
from functools import reduce
listOfItems = [[('a', 1), ('b', 3)], [('a', 6)], [('c', 0), ('d', 5), ('b', 2)]]
list_dict = [{key: val} for eachItem in listOfItems for key, val in eachItem]
def sum_dict(x, y):
return {k: x.get(k, 0) + y.get(k, 0) for k in set(x) | set(y)}
print(reduce(sum_dict, list_dict))

Simple solution without using additional modules:
inp_list = [[('a', 1), ('b', 3)], [('a', 6)], [('c', 0), ('d', 5), ('b', 2)]]
l = [item for sublist in inp_list for item in sublist] # flatten the list
sums = [(key, sum([b for (a,b) in l if a == key])) for key in dict(l)]
print(sums)

trying to use python's built-in methods instead of coding the functionality myself:
The long and explained solution
from itertools import chain, groupby
from operator import itemgetter
listOfItems = [[('a', 1), ('b', 3)], [('a', 6)], [('c', 0), ('d', 5), ('b', 2)]]
# just flat the list of lists into 1 list..
flatten_list = chain(*listOfItems)
# get all elements grouped by the key, e.g 'a', 'b' etc..
first = itemgetter(0)
groupedByKey = groupby(sorted(flatten_list, key=first), key=first))
#sum
summed_by_key = ((k, sum(item[1] for item in tups_to_sum)) for k, tups_to_sum in groupedByKey)
# create a dict
d = dict(summed_by_key)
print(d) # {'a': 7, 'b': 5, 'c': 0, 'd': 5}
~one line solution
from itertools import chain, groupby
from operator import itemgetter
first = itemgetter(0)
d = dict((k, sum(item[1] for item in tups_to_sum)) for k, tups_to_sum in groupby(sorted(chain(*listOfItems), key=first), key=first))
print(d) # {'a': 7, 'b': 5, 'c': 0, 'd': 5}

Related

Python: if first element is equal tuple together other elements

Apologies if this has been asked before, but I couldn't find it. If I have something like:
lst = [(('a', 'b'), 1, 2), (('a', 'b'), 3, 4), (('b', 'c'), 5, 6)]
and I want to obtain a shorter list:
new = [(('a', 'b'), (1, 3), (2, 4)), (('b', 'c'), 5, 6)]
so that it groups together the other elements in a tuple by first matching element, what is the fastest way to go about it?
You are grouping, based on a key. If your input groups are always consecutive, you can use itertools.groupby(), otherwise use a dictionary to group the elements. If order matters, use a dictionary that preserves insertion order (> Python 3.6 dict or collections.OrderedDict).
Using groupby():
from itertools import groupby
from operator import itemgetter
new = [(k, *zip(*(t[1:] for t in g))) for k, g in groupby(lst, key=itemgetter(0))]
The above uses Python 3 syntax to interpolate tuple elements from an iterable (..., *iterable)`.
Using a dictionary:
groups = {}
for key, *values in lst:
groups.setdefault(key, []).append(values)
new = [(k, *zip(*v)) for k, v in groups.items()]
In Python 3.6 or newer, that'll preserve the input order of the groups.
Demo:
>>> from itertools import groupby
>>> from operator import itemgetter
>>> lst = [(('a', 'b'), 1, 2), (('a', 'b'), 3, 4), (('b', 'c'), 5, 6)]
>>> [(k, *zip(*(t[1:] for t in g))) for k, g in groupby(lst, key=itemgetter(0))]
[(('a', 'b'), (1, 3), (2, 4)), (('b', 'c'), (5,), (6,))]
>>> groups = {}
>>> for key, *values in lst:
... groups.setdefault(key, []).append(values)
...
>>> [(k, *zip(*v)) for k, v in groups.items()]
[(('a', 'b'), (1, 3), (2, 4)), (('b', 'c'), (5,), (6,))]
If you are using Python 2, you'd have to use:
new = [(k,) + tuple(zip(*(t[1:] for t in g))) for k, g in groupby(lst, key=itemgetter(0))]
or
from collections import OrderedDict
groups = OrderedDict()
for entry in lst:
groups.setdefault(entry[0], []).append(entry[1:])
new = [(k,) + tuple(zip(*v)) for k, v in groups.items()]
You could also use a collections.defaultdict to group your tuple keys:
from collections import defaultdict
lst = [(('a', 'b'), 1, 2), (('a', 'b'), 3, 4), (('b', 'c'), 5, 6)]
d = defaultdict(tuple)
for tup, fst, snd in lst:
d[tup] += fst, snd
# defaultdict(<class 'tuple'>, {('a', 'b'): (1, 2, 3, 4), ('b', 'c'): (5, 6)})
for key, value in d.items():
d[key] = value[0::2], value[1::2]
# defaultdict(<class 'tuple'>, {('a', 'b'): ((1, 3), (2, 4)), ('b', 'c'): ((5,), (6,))})
result = [(k, v1, v2) for k, (v1, v2) in d.items()]
Which Outputs:
[(('a', 'b'), (1, 3), (2, 4)), (('b', 'c'), (5,), (6,))]
The logic of the above code:
Group the tuples into a defaultdict of tuples.
Split the values into firsts and seconds with slicing [0::2] and [1::2].
Wrap this updated dictionary into the correct tuple structure with a list comprehension.
Depending on your use case, you might find using a dictionary or defaultdict more useful. It will scale better too.
from collections import defaultdict
listmaker = lambda: ([],[]) # makes a tuple of 2 lists for the values.
my_data = defaultdict(listmaker)
for letter_tuple, v1, v2 in lst:
my_data[letter_tuple][0].append(v1)
my_data[letter_tuple][1].append(v2)
Then you’ll get a new tuple of lists for each unique (x,y) key. Python handles the checking to see if the key already exists and it’s fast. If you absolutely need it to be a list, you can always convert it too:
new = [(k, tuple(v1s), tuple(v2s)) for k, (v1s, v2s) in my_data.items()]
This list comprehension is a bit opaque, but it will unpack your dictionary into the form specified [(('a', 'b'), (1,3), (2,4)), ... ]

Sum numbers by letter in list of tuples

I have a list of tuples:
[ ('A',100), ('B',50), ('A',50), ('B',20), ('C',10) ]
I am trying to sum up all numbers that have the same letter. I.e. I want to output
[('A', 150), ('B', 70), ('C',10)]
I have tried using set to get the unique values but then when I try and compare the first elements to the set I get
TypeError: unsupported operand type(s) for +: 'int' and 'str'
Any quick solutions to match the numbers by letter?
Here is a one(and a half?)-liner: group by letter (for which you need to sort before), then take the sum of the second entries of your tuples.
from itertools import groupby
from operator import itemgetter
data = [('A', 100), ('B', 50), ('A', 50), ('B', 20), ('C', 10)]
res = [(k, sum(map(itemgetter(1), g)))
for k, g in groupby(sorted(data, key=itemgetter(0)), key=itemgetter(0))]
print(res)
// => [('A', 150), ('B', 70), ('C', 10)]
The above is O(n log n) — sorting is the most expensive operation. If your input list is truly large, you might be better served by the following O(n) approach:
from collections import defaultdict
data = [('A', 100), ('B', 50), ('A', 50), ('B', 20), ('C', 10)]
d = defaultdict(int)
for letter, value in data:
d[letter] += value
res = list(d.items())
print(res)
// => [('B', 70), ('C', 10), ('A', 150)]
>>> from collections import Counter
>>> c = Counter()
>>> for k, num in items:
c[k] += num
>>> c.items()
[('A', 150), ('C', 10), ('B', 70)]
Less efficient (but nicer looking) one liner version:
>>> Counter(k for k, num in items for i in range(num)).items()
[('A', 150), ('C', 10), ('B', 70)]
How about this: (assuming a is the name of the tuple you have provided)
letters_to_numbers = {}
for i in a:
if i[0] in letters_to_numbers:
letters_to_numbers[i[0]] += i[1]
else:
letters_to_numbers[i[0]] = i[1]
b = letters_to_numbers.items()
The elements of the resulting tuple b will be in no particular order.
In order to achieve this, firstly create a dictionary to store your values. Then convert the dict object to tuple list using .items() Below is the sample code on how to achieve this:
my_list = [ ('A',100), ('B',50), ('A',50), ('B',20), ('C',10) ]
my_dict = {}
for key, val in my_list:
if key in my_dict:
my_dict[key] += val
else:
my_dict[key] = val
my_dict.items()
# Output: [('A', 150), ('C', 10), ('B', 70)]
What is generating the list of tuples? Is it you? If so, why not try a defaultdict(list) to append the values to the right letter at the time of making the list of tuples. Then you can simply sum them. See example below.
>>> from collections import defaultdict
>>> val_store = defaultdict(list)
>>> # next lines are me simulating the creation of the tuple
>>> val_store['A'].append(10)
>>> val_store['B'].append(20)
>>> val_store['C'].append(30)
>>> val_store
defaultdict(<class 'list'>, {'C': [30], 'A': [10], 'B': [20]})
>>> val_store['A'].append(10)
>>> val_store['C'].append(30)
>>> val_store['B'].append(20)
>>> val_store
defaultdict(<class 'list'>, {'C': [30, 30], 'A': [10, 10], 'B': [20, 20]})
>>> for val in val_store:
... print(val, sum(val_store[val]))
...
C 60
A 20
B 40
Try this:
a = [('A',100), ('B',50), ('A',50), ('B',20), ('C',10) ]
letters = set([s[0] for s in a])
new_a = []
for l in letters:
nums = [s[1] for s in a if s[0] == l]
new_a.append((l, sum(nums)))
print new_a
Results:
[('A', 150), ('C', 10), ('B', 70)]
A simpler approach
x = [('A',100),('B',50),('A',50),('B',20),('C',10)]
y = {}
for _tuple in x:
if _tuple[0] in y:
y[_tuple[0]] += _tuple[1]
else:
y[_tuple[0]] = _tuple[1]
print [(k,v) for k,v in y.iteritems()]
A one liner:
>>> x = [ ('A',100), ('B',50), ('A',50), ('B',20), ('C',10) ]
>>> {
... k: reduce(lambda u, v: u + v, [y[1] for y in x if y[0] == k])
... for k in [y[0] for y in x]
... }.items()
[('A', 150), ('C', 10), ('B', 70)]

How to cast a list to a dictionary

I have a list as a input made from tuples where the origin is the 1st object and the neighbour is the 2nd object of the tuple.
for example :
inp : lst = [('a','b'),('b','a'),('c',''),('a','c')]
out : {'a': ('a', ['b', 'c']), 'b': ('b', ['a']), 'c': ('c', [])}
first i tried to cast the list into a dictonary,
like this
dictonary = dict(lst)
but i got an error say that
dictionary update sequence element #0 has length 1; 2 is required
The simplest is probably inside a try / except block:
lst = [('a','b'),('b','a'),('c',''),('a','c')]
out = dict()
for k, v in lst:
try:
if v != '':
out[k][1].append(v)
else:
out[k][1].append([])
except KeyError:
if v != '':
out[k] = (k, [v])
else:
out[k] = (k, [])
print out
Which gives:
{'a': ('a', ['b', 'c']), 'b': ('b', ['a']), 'c': ('c', [])}
Here's how I did it, gets the result you want, you can blend the two operations into the same loop, make a function out of it etc, have fun! Written without Python one liners kung-fu for beginner friendliness!
>>> lst = [('a','b'),('b','a'),('c',''),('a','c')]
>>> out = {}
>>> for pair in lst:
... if pair[0] not in out:
... out[pair[0]] = (pair[0], [])
...
>>> out
{'a': ('a', []), 'c': ('c', []), 'b': ('b', [])}
>>> for pair in lst:
... out[pair[0]][1].append(pair[1])
...
>>> out
{'a': ('a', ['b', 'c']), 'c': ('c', ['']), 'b': ('b', ['a'])}
Just here to mention setdefault
lst = [('a','b'),('b','a'),('c',''),('a','c')]
d = {}
for first, second in lst:
tup = d.setdefault(first, (first, []))
if second and second not in tup[1]:
tup[1].append(second)

Returning all keys that have the same corresponding value in a dictionary with python

I'm new to this site, and I have a problem that I need some help with. I am trying to find the highest integer value in a dictionary and the corresponding key and then check if there are other keys with the same value. If there are duplicate values i want to randomly select one of them and return it. As of now the code can find the highest value in the dictionary and return the key, but it returns the same key each time. I'm not able to check for other keys with the same value.
def lvl2():
global aiMove2
posValueD = {}
for x in moveList(): #Movelist returns a list of tuples
m = aiFlip(x) #aiFlip returns an integer
posValueD[x] = m
aiMove2 = max(posValueD, key = posValueD.get)
return aiMove2
After getting the maximum, you can check each key of their values. This comprehension list returns a list of keys where the value associated if the same as aiMove2.
keys = [x for x,y in posValueD.items() if y == posValueD[aiMove2]]
Here's an example in Python shell:
>>> a = {'a':1, 'b':2, 'c':2}
>>> [x for x,y in a.items() if y == 2]
['c', 'b']
You could write something like this:
max_value = 0
max_keys = []
for key,value in myDict.iteritems():
if value > max_value:
max_value = value
max_keys = [key]
elif value == max_value:
max_keys.append(key)
if max_keys:
return random.choice(max_keys)
return None
You could use itertools groupby:
from itertools import groupby
di={'e': 0, 'd': 1, 'g': 2, 'f': 0, 'a': 1, 'c': 3, 'b': 2, 'l': 2, 'i': 1, 'h': 3, 'k': 0, 'j': 1}
groups=[]
for k, g in groupby(sorted(di.items(), key=lambda t: (-t[1], t[0])), lambda t: t[1]):
groups.append(list(g))
print(groups)
# [[('c', 3), ('h', 3)],
[('b', 2), ('g', 2), ('l', 2)],
[('a', 1), ('d', 1), ('i', 1), ('j', 1)],
[('e', 0), ('f', 0), ('k', 0)]]
Or, more succinctly:
print([list(g) for k, g in groupby(
sorted(di.items(), key=lambda t: (-t[1], t[0])),
lambda t: t[1])])
Then just take the first list in the groups list of lists.

Combine two lists: aggregate values that have similar keys

I have two lists or more than . Some thing like this:
listX = [('A', 1, 10), ('B', 2, 20), ('C', 3, 30), ('D', 4, 30)]
listY = [('a', 5, 50), ('b', 4, 40), ('c', 3, 30), ('d', 1, 20),
('A', 6, 60), ('D', 7, 70])
i want to get the result that move the duplicate elements like this:
my result is to get all the list from listX + listY,but in the case there are duplicated
for example
the element ('A', 1, 10), ('D', 4, 30) of listX is presented or exitst in listY.so the result so be like this
result = [('A', 7, 70), ('B', 2, 20), ('C', 3, 30), ('D', 11, 100),
('a', 5, 50), ('b', 4, 40), ('c', 3, 30), ('d', 1, 20)]
(A, 7, 70) is obtained by adding ('A', 1, 10) and ('A', '6', '60') together
Anybody could me to solve this problem.?
Thanks.
This is pretty easy if you use a dictionary.
combined = {}
for item in listX + listY:
key = item[0]
if key in combined:
combined[key][0] += item[1]
combined[key][1] += item[2]
else:
combined[key] = [item[1], item[2]]
result = [(key, value[0], value[1]) for key, value in combined.items()]
You appear to be using lists like a dictionary. Any reason you're using lists instead of dictionaries?
My understanding of this garbled question, is that you want to add up values in tuples where the first element in the same.
I'd do something like this:
counter = dict(
(a[0], (a[1], a[2]))
for a in listX
)
for key, v1, v2 in listY:
if key not in counter:
counter[key] = (0, 0)
counter[key][0] += v1
counter[key][1] += v2
result = [(key, value[0], value[1]) for key, value in counter.items()]
I'd say use a dictionary:
result = {}
for eachlist in (ListX, ListY,):
for item in eachlist:
if item[0] not in result:
result[item[0]] = item
It's always tricky do do data manipulation if you have data in a structure that doesn't represent the data well. Consider using better data structures.
Use dictionary and its 'get' method.
d = {}
for x in (listX + listY):
y = d.get(x[0], (0, 0, 0))
d[x[0]] = (x[0], x[1] + y[1], x[2] + y[2])
d.values()

Categories

Resources