Add up values of same key in a nested dictionary

I'm trying to compute the total of the values in sub-dictionaries that share the same subkeys. I have a list, mylist, containing the relevant keys, and I only need the total of the values for each element of that list.
mylist = ['age','answ1', 'answ2', 'answ3']
d = {'01': {'age':19, 'answ1':3, 'answ2':7, 'answ3':2}, '02': {'age':52, 'answ1':8, 'answ2':1, 'answ3':10},...}
I've tried
tot = []
for k, v in d.items():
    for ke, va in v.items():
        for i in mylist[0:]:
            count = 0
            if ke == i:
                count += v[ke]
                tot.append(count)
but instead of the sum of the values with same key, I get the values of different keys in the order of appearance in the dictionary.
The expected outcome would be
tot = [71, 11, 8, 12]
What I get is
tot = [19, 3, 7, 2, 52, 8, 1, 10]

With collections.Counter:
>>> from collections import Counter
>>> ctr = sum(map(Counter, d.values()), Counter())
>>> [ctr[x] for x in mylist]
[71, 11, 8, 12]
Or:
>>> [sum(e[k] for e in d.values()) for k in mylist]
[71, 11, 8, 12]
In case some sub dicts can have keys missing, just use e.get(k, 0). The Counter solution doesn't need it, it supplies zeros by default.
Hmm, since you now accepted a dict result solution...
>>> dict(sum(map(Counter, d.values()), Counter()))
{'age': 71, 'answ1': 11, 'answ2': 8, 'answ3': 12}
Or maybe just
>>> sum(map(Counter, d.values()), Counter())
Counter({'age': 71, 'answ3': 12, 'answ1': 11, 'answ2': 8})
Although these might have more keys than just the desired ones, if there are more in your data.
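If the sub dicts may carry extra keys or lack some of the wanted ones, a dict comprehension restricted to mylist is one way to keep only the desired totals. This is a minimal sketch; the 'extra' key and the '03' entry are made up for illustration and are not part of the original data:

```python
mylist = ['age', 'answ1', 'answ2', 'answ3']
d = {
    '01': {'age': 19, 'answ1': 3, 'answ2': 7, 'answ3': 2},
    '02': {'age': 52, 'answ1': 8, 'answ2': 1, 'answ3': 10, 'extra': 99},  # hypothetical extra key
    '03': {'age': 30},  # hypothetical entry with missing keys
}

# Only keys named in mylist are totalled; missing keys count as 0.
totals = {k: sum(e.get(k, 0) for e in d.values()) for k in mylist}
print(totals)
```

The result contains exactly the keys of mylist, in that order, whatever else the sub dicts hold.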

If you wish to store your result in a dictionary, you can create one with the keys from your list and calculate the results there.
result = {i: 0 for i in mylist}
for k, v in d.items():
    result['age'] += v['age']
    result['answ1'] += v['answ1']
    result['answ2'] += v['answ2']
    result['answ3'] += v['answ3']
result
{'age': 71, 'answ1': 11, 'answ2': 8, 'answ3': 12}
However, this relies on the key names not changing; their order does not matter.
EDIT
You can do this regardless of key names with the following update. Note it adds one extra iteration.
result = {i: 0 for i in mylist}
for k, v in d.items():
    for ke, va in v.items():
        result[ke] += v[ke]
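Note that the nested loop raises KeyError as soon as a sub-dictionary contains a key that is not in mylist. One possible guard is sketched below; the 'extra' key is hypothetical and only there to show the effect:

```python
mylist = ['age', 'answ1', 'answ2', 'answ3']
d = {
    '01': {'age': 19, 'answ1': 3, 'answ2': 7, 'answ3': 2, 'extra': 5},  # 'extra' is hypothetical
    '02': {'age': 52, 'answ1': 8, 'answ2': 1, 'answ3': 10},
}

result = {key: 0 for key in mylist}
for sub in d.values():
    for key, value in sub.items():
        if key in result:  # skip keys we were not asked to total
            result[key] += value
print(result)
```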

mylist = ['age','answ1', 'answ2', 'answ3']
d = {'01': {'age':19, 'answ1':3, 'answ2':7, 'answ3':2}, '02': {'age':52, 'answ1':8, 'answ2':1, 'answ3':10}}
tot = [0] * len(mylist)
for k in d:
    for idx, i in enumerate(mylist):
        tot[idx] += d[k].get(i, 0)
print(tot)
Prints:
[71, 11, 8, 12]

Try the following code:
tot = []
for i in mylist:
    count = 0
    for k, v in d.items():
        for ke, va in v.items():
            if ke == i:
                count += va
    tot.append(count)

You can accomplish this using list comprehensions, zip and map. Firstly, we want to extract the corresponding values from each sub-dict:
>>> vals = [[v[k] for k in mylist] for v in d.values()]
>>> vals
[[19, 3, 7, 2], [52, 8, 1, 10]]
Now we want to perform an element-wise sum across all the sub-lists:
>>> result = map(sum, zip(*vals))
>>> list(result)
[71, 11, 8, 12]
Putting it all in one line:
>>> result = map(sum, zip(*([v[k] for k in mylist] for v in d.values())))
>>> list(result)
[71, 11, 8, 12]
This approach has the benefit of only accessing the keys that we want to build instead of constructing a full Counter and then afterwards extracting the data.

Same thing but different.
>>> import operator
>>> f = operator.itemgetter(*mylist)
>>> vals = map(f,d.values())
>>> sums = map(sum,zip(*vals))
>>> result = dict(zip(mylist,sums))
>>> result
{'age': 71, 'answ1': 11, 'answ2': 8, 'answ3': 12}
If you don't want the dict, skip that and use result = list(sums)

Related

Efficient way to find key by value in Python dict where dict values are iterables

I have an iterable of unique numbers:
lst = [14, 11, 8, 55]
where every value is somewhere among numbers of dict's iterable-values, say lists:
dict_itms.items() = dict_items([(1, [0, 1, 2, 3]), (2, [11, 14, 12]), (3, [30, 8, 42]), (4, [55, 6])])
I need to look up each element of lst in the dict so that I end up with a list of keys, one for each element of lst and in the same order.
This method:
keys_ = []
for a in lst:
    for k, v in dict_itms.items():
        if a in v:
            keys_ += [k]
            break
        else:
            continue
gives:
[2, 2, 3, 4]
Is there a more efficient way to find the key for each number in lst?

You can use any in a list comprehension:
print([k for k,v in dict_itms.items() if any(x in lst for x in v)])
Output:
[2, 3, 4]
Update
According to this answer not set(v).isdisjoint(lst) is the fastest:
print([k for k,v in dict_itms.items() if not set(v).isdisjoint(lst)])

It's unclear what you mean by 'efficient'; do you need this to be efficient in a given pass or in aggregate? The reason I ask is that typically the best way to handle this in aggregate is by doing a pre-processing pass that flips your key-value relation:
reverse_lookup = dict()
for k, v in d.items():
    for i in v:
        keys = reverse_lookup.get(i, [])  # Provide an empty list if this item not yet found
        keys.append(k)
        reverse_lookup[i] = keys
Now that you have your reverse lookup processed, you can use it in a straightforward manner:
result = [reverse_lookup.get(i) for i in lst]
# `result` is actually a list of lists, so as to allow duplicates. You will need to flatten it, or change the reverse lookup to ignore dupes.
The initial processing for the reverse lookup is O(n*m), where n*m is the total length of the original dictionary values summed. However, each lookup for the lst portion is O(1), so if you squint and have enough lookups this is O(p), where p is the length of lst. This will be wildly more efficient than other approaches if you have to do it a lot, and much less efficient if you're only ever passing over a given dictionary once.
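If every number is known to appear under exactly one key, as in the question's data, the reverse lookup can map each number straight to its key, and the result then needs no flattening. A minimal sketch under that assumption:

```python
dict_itms = {1: [0, 1, 2, 3], 2: [11, 14, 12], 3: [30, 8, 42], 4: [55, 6]}
lst = [14, 11, 8, 55]

# Assumes each number occurs under a single key; if it occurred under
# several, the key seen last would silently win.
reverse_lookup = {value: key for key, values in dict_itms.items() for value in values}

# One O(1) dictionary lookup per element of lst, in lst's order.
keys_ = [reverse_lookup[a] for a in lst]
print(keys_)  # [2, 2, 3, 4]
```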

A simple and Pythonic implementation:
d = dict([(1, [0, 1, 2, 3]), (2, [11, 14, 12]), (3, [30, 8, 42]), (4, [55, 6])])
xs = [14, 11, 8, 55]
keys = [k for k, v in d.items() if set(v).intersection(xs)]
print(keys)
However, this doesn't duplicate the 2 key, which your example does - not sure if that's behaviour you need?

Remove all same characters in list

I want to know how to remove all repeated values from the list below (without using set()). I tried list.remove(), but it only removes the first occurrence of a value.
Ex: input = [23,42,65,73,5,2,73,51]
output = [23,42,65,5,2,51]
def number_repeated(lst):
    index = int(input('Remove index : '))
    a = 0
    for num in lst:
        if num == index:
            a += 1
    print('The number of digits are repeated: {}'.format(a))
    list(set(lst)).remove(index)
    return lst
print(number_repeated([23, 42, 65, 73, 5, 2, 73, 51]))
Output:
[65, 2, 5, 42, 51, 23]
Also, why in this code above when I use set() is the output not sorted?

You can use collections.Counter() to remove all the duplicates:
Example:
from collections import Counter
originalList = [23,42,65,73,5,2,73,51]
filteredList = [k for k, v in Counter(originalList).items() if v == 1]
print(filteredList)
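For comparison, here is a sketch that uses neither set() nor Counter and keeps the original order. The function name remove_repeated is made up for this example, and because list.count rescans the list for every element it is quadratic, so it only suits short lists:

```python
def remove_repeated(lst):
    """Keep only the values that occur exactly once, in their original order."""
    return [x for x in lst if lst.count(x) == 1]

print(remove_repeated([23, 42, 65, 73, 5, 2, 73, 51]))  # [23, 42, 65, 5, 2, 51]
```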

Try this
def myFunc(lst):
    list2 = []
    for x in lst:
        if x not in list2:
            list2.append(x)
    list2 = sorted(list2)
    return list2
print(myFunc([23,42,65,73,5,2,73,51]))
Output:
[2, 5, 23, 42, 51, 65, 73]

Checking to see if values of both defaultdict match for same keys

I have two defaultdicts and essentially I want to see if the values on both dictionaries match up for the same corresponding keys. For example: {1,4} {1,4}. So it looks for matching keys which is 1 and then checks to see if their value match up 4 which it does.
So in my case I have:
keyOne = [30, 30, 60, 70, 90]
valueOne = [3, 4, 6, 7, 0]
KeyTwo = [30, 30, 60, 70, 90]
valueTwo = [4, 5, 6, -10, 9]
I create two defaultdicts as such:
one = defaultdict(list)
for k, v in zip(keyOne, valueOne):
    one[k].append(v)
two = defaultdict(list)
for k, v in zip(keyTwo, valueTwo):
    two[k].append(v)
I then want to add the entries where the keys match but the values don't - so I write this, but it doesn't work:
three = defaultdict(list)
for k,v in one.items():
    for key in k:
        if key in two.items():
            if (value != v):
                three[k].append(value)
I am not sure where I am going wrong and it would mean a lot if someone could help me fix it. I'm new to programming and really want to learn

You got a typo and can simplify your loop:
from collections import defaultdict
keyOne = [30, 30, 60, 70, 90]
valueOne = [3, 4, 6, 7, 0]
keyTwo = [30, 30, 60, 70, 90] # typo KeyTwo
valueTwo = [4, 5, 6, -10, 9]
one = defaultdict(list)
for k, v in zip(keyOne, valueOne):
    one[k].append(v)
two = defaultdict(list)
for k, v in zip(keyTwo, valueTwo):
    two[k].append(v)
three = defaultdict(list)
for k1, v1 in one.items():  # there is no need for a double loop if you just want
    v2 = two.get(k1)        # to check the keys that are duplicate - simply query
    if v2:                  # the dict 'two' for this key and see if it has value
        three[k1] = [value for value in v2 if value not in v1]
        # delete key again if empty list (or create a temp list and only add if non empty)
        if not three[k1]:
            del three[k1]
print(three)
Output:
# all values in `two` for "keys" in `one` that are not values of `one[key]`
defaultdict(<class 'list'>, {30: [5], 70: [-10], 90: [9]})
dict.get(key) returns None if the key is not in the dictionary, which removes the need for an if check before retrieval. You still need the if afterwards, but I think this code is "cleaner".
See Why dict.get(key) instead of dict[key]?
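With a defaultdict there is one more reason to prefer get: indexing a missing key inserts it, while get leaves the dictionary untouched. A small sketch (the key 99 is arbitrary and chosen only for illustration):

```python
from collections import defaultdict

two = defaultdict(list)
two[30].append(4)

missing_via_get = two.get(99)   # None; the defaultdict is not modified
keys_after_get = list(two)      # still only the key 30

missing_via_index = two[99]     # an empty list, and key 99 is now inserted
keys_after_index = list(two)    # keys 30 and 99

print(keys_after_get, keys_after_index)
```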

Python: split list into indices based on consecutive identical values

Could you advise me how to write a script that splits a list according to the number of repeated values? I mean:
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
Here the counts are 11: 4, 12: 2, 15: 6, 20: 3.
So the next list, for example range(0, 100), has to be split into parts of 4, 2, 6 and 3 items.
I counted the repeated values and wrote a function to split a list, but it doesn't work with a list of sizes:
div=Counter(my_list).values() ##counts same values in the list
def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())
What I need:
Out: [[0,1,2,3],[4,5],[6,7,8,9,10,11], etc...]

You can use enumerate, itertools.groupby, and operator.itemgetter:
In [45]: import itertools
In [46]: import operator
In [47]: [[e[0] for e in d[1]] for d in itertools.groupby(enumerate(my_list), key=operator.itemgetter(1))]
Out[47]: [[0, 1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11], [12, 13, 14]]
What this does is as follows:
First it enumerates the items.
It groups them, using the second item in each enumeration tuple (the original value).
In the resulting list per group, it uses the first item in each tuple (the enumeration)
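The three steps above can also be spelled out as an explicit loop, which may be easier to follow than the one-liner. A sketch; the function name index_runs is made up for this example:

```python
from itertools import groupby
from operator import itemgetter

def index_runs(values):
    """Return one list of indices per run of consecutive identical values."""
    runs = []
    # enumerate yields (index, value); groupby splits on the value (item 1).
    for _, group in groupby(enumerate(values), key=itemgetter(1)):
        # Keep only the index (item 0) of each pair in the run.
        runs.append([index for index, _ in group])
    return runs

my_list = [11, 11, 11, 11, 12, 12, 15, 15, 15, 15, 15, 15, 20, 20, 20]
print(index_runs(my_list))
```

Because groupby only groups adjacent items, a value that reappears later in the list starts a new run.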

Solution in Python 3, if you are only using Counter:
from collections import Counter
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
count = Counter(my_list)
div= list(count.keys()) # take only keys
div.sort()
l = []
num = 0
for i in div:
    t = []
    for j in range(count[i]): # loop number of times it occurs in the list
        t.append(num)
        num+=1
    l.append(t)
print(l)
Output:
[[0, 1, 2, 3], [4, 5], [6, 7, 8, 9, 10, 11], [12, 13, 14]]
Alternate solution using set:
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
val = set(my_list) # filter only unique elements
ans = []
num = 0
for i in val:
    temp = []
    for j in range(my_list.count(i)): # loop till number of occurrence of each unique element
        temp.append(num)
        num+=1
    ans.append(temp)
print(ans)
EDIT:
Changes made to get the desired output, as mentioned in the comments by @Protoss Reed:
my_list =[11,11,11,11,12,12,15,15,15,15,15,15,20,20,20]
val = list(set(my_list)) # filter only unique elements
val.sort() # because set is not sorted by default
ans = []
index = 0
l2 = [54,21,12,45,78,41,235,7,10,4,1,1,897,5,79]
for i in val:
    temp = []
    for j in range(my_list.count(i)): # loop till number of occurrence of each unique element
        temp.append(l2[index])
        index+=1
    ans.append(temp)
print(ans)
Output:
[[54, 21, 12, 45], [78, 41], [235, 7, 10, 4, 1, 1], [897, 5, 79]]
Here I have to convert the set into a list because a set is not sorted; I think the rest is self-explanatory.
Another solution if the input is not always sorted (using OrderedDict):
from collections import OrderedDict
v = OrderedDict({})
my_list=[12,12,11,11,11,11,20,20,20,15,15,15,15,15,15]
l2 = [54,21,12,45,78,41,235,7,10,4,1,1,897,5,79]
for i in my_list: # maintain count in dict
    if i in v:
        v[i]+=1
    else:
        v[i]=1
ans =[]
index = 0
for key,values in v.items():
    temp = []
    for j in range(values):
        temp.append(l2[index])
        index+=1
    ans.append(temp)
print(ans)
Output:
[[54, 21], [12, 45, 78, 41], [235, 7, 10], [4, 1, 1, 897, 5, 79]]
Here I use OrderedDict to maintain the order of the input sequence, which is unpredictable in the case of a set.
Although I prefer @Ami Tavory's solution, which is more pythonic.
[Extra work: if anybody can convert this solution into a list comprehension, that would be awesome. I tried but could not do it. If you succeed, please post it in the comments; it will help me understand.]
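One way the list-comprehension version asked for above might look is to turn the group sizes into slice boundaries with itertools.accumulate. This is a sketch, not the answer author's code, and it assumes that equal values are always adjacent in my_list, since groupby counts consecutive runs rather than total occurrences:

```python
from itertools import accumulate, groupby

my_list = [12, 12, 11, 11, 11, 11, 20, 20, 20, 15, 15, 15, 15, 15, 15]
l2 = [54, 21, 12, 45, 78, 41, 235, 7, 10, 4, 1, 1, 897, 5, 79]

# Length of each run of identical values, in input order.
sizes = [len(list(group)) for _, group in groupby(my_list)]
# Running totals give the end index of each chunk; starts are shifted by one.
ends = list(accumulate(sizes))
starts = [0] + ends[:-1]

ans = [l2[start:end] for start, end in zip(starts, ends)]
print(ans)
```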

Python - Dictionary with Lists and tuples

End Goal to create the following Dictionary
DictFinal = {'peach': [7,33], 'berries': [33,47], 'grapes': [47,98], 'apple': [98,200]}
snippets of code
FinalEndofline = 200
List1 = ["apple","peach","grapes","berries"]
List2 = [98,7,47,33]
Step 1: Create a dictionary from key-value pairs. List1 holds the keys and List2 the values.
professions_dict = dict(zip(List1, List2))
print professions_dict
Output - {'apple': 98, 'peach': 7, 'grapes': 47, 'berries': 33}
Step 2: Sort the dictionary by value.
import operator
sorted_x = sorted(professions_dict.items(), key=operator.itemgetter(1))
print sorted_x
Output - [('peach', 7), ('berries', 33), ('grapes', 47), ('apple', 98)]
Step 3: Now how do I achieve
DictFinal = {'peach': [7,33], 'berries': [33,47], 'grapes': [47,98], 'apple': [98,200]}
DictFinal is again a key-value mapping, but each value is a list holding that key's own value followed by the next key's value; for the last key, the FinalEndofline variable is used as the second element.

>>> List1 = ["apple","peach","grapes","berries"]
>>> List2 = [98,7,47,33]
>>> List1 = [x[1] for x in sorted(zip(List2, List1))]
>>> List2.sort()
>>> List2.append(200)
>>> DictFinal = dict((key, List2[i:i+2]) for i, key in enumerate(List1))
>>> DictFinal
{'berries': [33, 47], 'grapes': [47, 98], 'peach': [7, 33], 'apple': [98, 200]}
That's fairly straightforward. This is probably a bit more efficient, though -- only requires one sort(). If efficiency really matters, you could also use itertools to do the slice on the second zip (and, of course, with Python 2, you would want to use izip instead of zip).
>>> List1 = ["apple","peach","grapes","berries"]
>>> List2 = [98,7,47,33]
>>> zipped = sorted(zip(List2, List1)) + [(200,)]
>>> FinalDict = dict((x[1], [x[0], y[0]]) for x, y in zip(zipped, zipped[1:]))
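For Python 3, the same idea can be written with a dict comprehension. A minimal sketch; the lowercase variable names are made up for this example and do not come from the question:

```python
final_end_of_line = 200
names = ["apple", "peach", "grapes", "berries"]
starts = [98, 7, 47, 33]

# Sort the (value, name) pairs by value, then append the final end marker
# so that every name has a "next" boundary.
pairs = sorted(zip(starts, names))
bounds = [value for value, _ in pairs] + [final_end_of_line]

dict_final = {name: [low, high] for (low, name), high in zip(pairs, bounds[1:])}
print(dict_final)
```

Since Python 3.7 a dict keeps insertion order, so the result also iterates in ascending order of value.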

Maybe try:
List2 = ["""Blah""", FinalEndofline]
unsorted = dict(zip(List1, [[List2[i], List2[i+1]] for i in range(len(List2) - 1)]))
DictFinal = sorted(unsorted.items(), key = lambda x: x[1][0])
This seemed to work for me, if I understand your problem fully. List2 just needs FinalEndofline at the end, and List1 and List2 must already be in ascending order of value for the pairs to line up.
