Summing up numbers in a defaultdict(list)


I've been experimenting to get this to work, and I've exhausted every idea and web search. Nothing seems to do the trick. I need to sum the numbers in a defaultdict(list) and I only need the final result, but whatever I do, I can only reach it by iterating and printing every intermediate sum along the way. Here is what I've generally been trying:
d = { key : [1,2,3] }
running_total = 0
# Iterate over the values
for value in d.itervalues():
    # Iterate through the list inside each value
    for x in value:
        running_total += x
        print running_total
The result is :
1,3,6
I understand it's doing this because it's iterating through the for loop. What I don't get is how else I can reach each of these list values without using a loop. Or is there some method I've overlooked?
To be clear, I just want the final number returned, e.g. 6.
EDIT: I neglected a huge factor: the items in the list are timedelta objects, so I have to use .seconds to turn them into integers before adding. The solutions below make sense and I've tried similar ones, but putting the .seconds conversion into the sum statement throws an error.
d = { key : [timedelta_Obj1,timedelta_Obj2,timedelta_Obj3] }

I think this will work for you:
sum(td.seconds for sublist in d.itervalues() for td in sublist)
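One caveat that may matter here: td.seconds is only the seconds component of a timedelta and ignores whole days, so any duration of a day or longer would be undercounted. Below is a minimal sketch, assuming Python 3 (values() instead of itervalues()) and made-up sample data, that compares .seconds with total_seconds():

```python
from collections import defaultdict
from datetime import timedelta

# Hypothetical sample data; the key names are made up for illustration.
d = defaultdict(list)
d["job1"].append(timedelta(seconds=30))
d["job1"].append(timedelta(days=1, seconds=5))
d["job2"].append(timedelta(minutes=2))

# .seconds drops the "days" part of each timedelta.
seconds_only = sum(td.seconds for sublist in d.values() for td in sublist)

# .total_seconds() counts the full duration and returns a float.
full_total = sum(td.total_seconds() for sublist in d.values() for td in sublist)

print(seconds_only)  # 155
print(full_total)    # 86555.0
```

If none of the durations can reach a full day, both expressions give the same number.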

Try this approach:
from datetime import timedelta as TD
d = {'foo' : [TD(seconds=1), TD(seconds=2), TD(seconds=3)],
     'bar' : [TD(seconds=4), TD(seconds=5), TD(seconds=6), TD(seconds=7)],
     'baz' : [TD(seconds=8)]}
print sum(sum(td.seconds for td in values) for values in d.itervalues())

You could just sum each of the lists in the dictionary, then take one final sum of the returned list.
>>> d = {'foo' : [1,2,3], 'bar' : [4,5,6,7], 'foobar' : [10]}
# sum each value in the dictionary
>>> [sum(d[i]) for i in d]
[10, 6, 22]
# sum each of the sums in the list
>>> sum([sum(d[i]) for i in d])
38

If you don't want to iterate or to use comprehensions you can use this:
d = {'1': [1, 2, 3], '2': [3, 4, 5], '3': [5], '4': [6, 7]}
print(sum(map(sum, d.values())))
If you use Python 2 and your dict has a lot of keys, it is better to use imap (from itertools) and itervalues, which avoid building intermediate lists:
from itertools import imap
print sum(imap(sum, d.itervalues()))

Your question was how to get the value "without using a loop". Well, you can't. But there is one thing you can do: use the high performance itertools.
If you use chain you won't have an explicit loop in your code. chain manages that for you.
>>> data = {'a': [1, 2, 3], 'b': [10, 20], 'c': [100]}
>>> import itertools
>>> sum(itertools.chain.from_iterable(data.itervalues()))
136
If you have timedelta objects you can use the same recipe.
>>> from datetime import timedelta
>>> data = {'a': [timedelta(minutes=1),
...               timedelta(minutes=2),
...               timedelta(minutes=3)],
...         'b': [timedelta(minutes=10),
...               timedelta(minutes=20)],
...         'c': [timedelta(minutes=100)]}
>>> sum(td.seconds for td in itertools.chain.from_iterable(data.itervalues()))
8160
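For readers on Python 3, where itervalues() no longer exists, the same recipe might look like the sketch below. It assumes you are happy to get a timedelta back rather than an integer; timedelta objects can be added to each other directly, as long as sum() is given a timedelta as its start value.

```python
from datetime import timedelta
from itertools import chain

data = {
    "a": [timedelta(minutes=1), timedelta(minutes=2), timedelta(minutes=3)],
    "b": [timedelta(minutes=10), timedelta(minutes=20)],
    "c": [timedelta(minutes=100)],
}

# sum() starts from 0 by default, and int + timedelta raises TypeError,
# so pass an empty timedelta as the start value.
total = sum(chain.from_iterable(data.values()), timedelta())

print(total)                  # 2:16:00
print(total.total_seconds())  # 8160.0
```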

Related

Update multiple key/value pairs in python at once

Say I have
d = {"a":0,"b":0,"c":0}
Is there a way to update the keys a and b at the same time, instead of looping over them? Something like:
update_keys = ["a","b"]
d.some_function(update_keys) +=[10,5]
print(d)
{"a":10,"b":5,"c":0}

Yes, you can use update like this:
d.update({'a':10, 'b':5})
Thus, your code would look this way:
d = {"a":0,"b":0,"c":0}
d.update({'a':10, 'b':5})
print(d)
and shows:
{"a":10,"b":5,"c":0}

If you mean a function that can add a new value to the existing value without an explicit loop, you can do it like this:
add_value = lambda d,k,v: d.update(zip(k,list(map(lambda _k,_v:d[_k]+_v,k,v)))) or d
and you can use it like this
>>> d = {"a":2,"b":3}
>>> add_value(d,["a","b"],[2,-3])
{'a': 4, 'b': 0}
There is nothing tricky here. I just replace the loop with a map and a lambda that compute the new values, and wrap the map in list so Python evaluates it immediately. Then I use zip to pair each key with its updated value and pass the pairs to the dict's update method. However, I really doubt this has any practical use, since it is definitely harder to read than a for loop and adds complexity to the code.
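For comparison, here is a sketch of the plain for-loop version. The function name add_values is made up for illustration; it assumes every key already exists in the dictionary.

```python
def add_values(d, keys, deltas):
    """Add each delta to the value stored under the matching key, in place."""
    for key, delta in zip(keys, deltas):
        d[key] += delta
    return d

d = {"a": 0, "b": 0, "c": 0}
add_values(d, ["a", "b"], [10, 5])
print(d)  # {'a': 10, 'b': 5, 'c': 0}
```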

Update values of multiple keys in dictionary
d = {"a":0,"b":0,"c":0}
d.update({'a': 40, 'b': 41, 'c': 89})
print(d)
{'a': 40, 'b': 41, 'c': 89}

If you are just storing integer values, then you can use the Counter class provided by the python module "collections":
from collections import Counter
d = Counter({"a":0,"b":0,"c":0})
result = d + Counter({"a":10, "b":5})
'result' will have the value of
Counter({'a': 10, 'b': 5})
And since Counter is a subclass of dict, you probably do not have to change anything else in your code.
>>> isinstance(result, dict)
True
You do not see the 'c' key in the result because the + operator on Counter objects keeps only keys with positive counts.
See the collections.Counter documentation for more details.
Storing other numeric types is supported, with some conditions:
"For in-place operations such as c[key] += 1, the value type need only support addition and subtraction. So fractions, floats, and decimals would work and negative values are supported. The same is also true for update() and subtract() which allow negative and zero values for both inputs and outputs."
To undo the addition in place while keeping zero and negative counts, use the subtract method rather than the "-" operator, which is a noteworthy gotcha.
>>> d = Counter({"a":10, "b":5})
>>> result.subtract(d)
>>> result
Counter({'a': 0, 'b': 0})
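If the zero-valued keys need to survive, one option is Counter.update, which adds counts in place instead of building a new Counter. A minimal sketch, using the dictionary from the question:

```python
from collections import Counter

d = Counter({"a": 0, "b": 0, "c": 0})

# update() adds the given counts to the existing ones in place
# and leaves the untouched zero entry for "c" alone.
d.update({"a": 10, "b": 5})
print(dict(d))  # {'a': 10, 'b': 5, 'c': 0}

# For comparison, + builds a new Counter and keeps only positive counts.
combined = Counter({"a": 0, "b": 0, "c": 0}) + Counter({"a": 10, "b": 5})
print(dict(combined))  # {'a': 10, 'b': 5}
```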

Python dictionary - list compute to average

I have a dictionary with a list as value.
I want to have an average of this list.
How do I compute that?
dict1 = {
'Monty Python and the Holy Grail': [[9, 10, 9.5, 8.5, 3, 7.5, 8]],
"Monty Python's Life of Brian": [[10, 10, 0, 9, 1, 8, 7.5, 8, 6, 9]],
"Monty Python's Meaning of Life": [[7, 6, 5]],
'And Now For Something Completely Different': [[6, 5, 6, 6]]
}
I have tried
dict2 = {}
for key in dict1:
    dict2[key] = sum(dict1[key])
but it says: "TypeError: unsupported operand type(s) for +: 'int' and 'list'"

As noted in other posts, the first issue is that your dictionary values are lists of lists, not simple lists. The second issue is that you were calling sum without then dividing by the number of elements, which would not give you an average.
If you are willing to use numpy, try this:
import numpy as np
dict_of_means = {k:np.mean(v) for k,v in dict1.items()}
>>> dict_of_means
{'Monty Python and the Holy Grail': 7.9285714285714288, "Monty Python's Life of Brian": 6.8499999999999996, "Monty Python's Meaning of Life": 6.0, 'And Now For Something Completely Different': 5.75}
Or, without using numpy or any external packages, you can do it manually by first flattening the lists of lists in the values, then using the same type of dict comprehension, taking the sum of the flattened list and dividing by the number of elements in it:
dict_of_means = {k: sum([i for x in v for i in x])/len([i for x in v for i in x])
                 for k, v in dict1.items()}
Note that [i for x in v for i in x] takes a list of lists v and flattens it to a simple list.
FYI, the dictionary comprehension syntax is more or less equivalent to this for loop:
dict_of_means = {}
for k,v in dict1.items():
    dict_of_means[k] = sum([i for x in v for i in x])/len([i for x in v for i in x])
There is an in-depth description of dictionary comprehensions in the question I linked above.

If you don't want to use external libraries and you want to keep that structure:
dict2 = {}
for key in dict1:
    dict2[key] = sum(dict1[key][0])/len(dict1[key][0])

The problem is that your values are not 1D lists, they're 2D lists. If you simply remove the extra brackets, your solution should work.
Also don't forget to divide the sum of the list by the length of the list (and, if you're using Python 2, to enable true division with from __future__ import division).
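On Python 3.4 or later, the standard library's statistics.mean handles the sum-and-divide step for you. Here is a sketch that keeps the nested structure from the question and simply takes the single inner list of each value:

```python
from statistics import mean

dict1 = {
    "Monty Python and the Holy Grail": [[9, 10, 9.5, 8.5, 3, 7.5, 8]],
    "Monty Python's Life of Brian": [[10, 10, 0, 9, 1, 8, 7.5, 8, 6, 9]],
    "Monty Python's Meaning of Life": [[7, 6, 5]],
    "And Now For Something Completely Different": [[6, 5, 6, 6]],
}

# Each value is a one-element list wrapping the real list of ratings,
# so nested[0] is the list that actually gets averaged.
dict2 = {title: mean(nested[0]) for title, nested in dict1.items()}

for title, avg in dict2.items():
    print(title, avg)
```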

You can do that simply by using itertools.chain and a helper function to compute average.
Here is the helper function to compute average
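def average(iterable):
    total = 0.0  # named total to avoid shadowing the built-in sum()
    count = 0
    for v in iterable:
        total += v
        count += 1
    if count > 0:
        return total / count
    return None  # an empty iterable has no average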
If you want to average for each key, you can simply do that using dictionary comprehension and helper function we wrote above:
from itertools import chain
averages = {k: average(chain.from_iterable(v)) for k, v in dict1.items()}
Or, if you want the average across all the keys:
from itertools import chain
average(chain.from_iterable(chain.from_iterable(dict1.values())))

Your lists are nested, all being lists of a single item, which is itself a list of the actual numbers. Here I extract these lists using val[0], val being the outer lists:
for key, val in dict1.copy().items():
    the_list = val[0]
    dict1[key] = sum(the_list)/len(the_list)
This replaces all these nested lists with the average you are after. Mutating a container while looping over it is risky, so the loop above iterates over a copy of the dict.
Alternatively you could make use of the fancier dictionary comprehension:
dict2 = {key: sum(the_list)/len(the_list) for key, (the_list,) in dict1.items()}
Note the clever but subtle way the inner list is extracted here.

In Python how to obtain a partial view of a dict?

Is it possible to get a partial view of a dict in Python, analogous to pandas df.tail()/df.head()? Say you have a very long dict, and you just want to check some of its elements (the beginning, the end, etc.). Something like:
dict.head(3) # To see the first 3 elements of the dictionary.
{[1,2], [2, 3], [3, 4]}
Thanks

Kinda strange desire, but you can get that by using this
from itertools import islice
# Python 2.x
dict(islice(mydict.iteritems(), 0, 2))
# Python 3.x
dict(islice(mydict.items(), 0, 2))
or for short dictionaries
# Python 2.x
dict(mydict.items()[0:2])
# Python 3.x
dict(list(mydict.items())[0:2])

Edit:
in Python 3.x:
Without using libraries, it's possible to do it this way. Use the method
.items()
which returns a view of the dictionary's key/value pairs.
The view must be converted to a list first, otherwise slicing raises TypeError: 'dict_items' object is not subscriptable. Slice the list with square brackets, then convert the result back to a dictionary.
dict(list(my_dict.items())[:3])

import itertools
def glance(d):
    return dict(itertools.islice(d.iteritems(), 3))
>>> x = {1:2, 3:4, 5:6, 7:8, 9:10, 11:12}
>>> glance(x)
{1: 2, 3: 4, 5: 6}
However:
>>> x['a'] = 2
>>> glance(x)
{1: 2, 3: 4, u'a': 2}
Notice that inserting a new element changed what the "first" three elements were in an unpredictable way. This is what people mean when they tell you dicts aren't ordered. You can get three elements if you want, but you can't know which three they'll be.

I know this question is 3 years old but here a pythonic version (maybe simpler than the above methods) for Python 3.*:
[print(v) for i, v in enumerate(my_dict.items()) if i < n]
It will print the first n elements of the dictionary my_dict

One-upping @Neb's solution with a Python 3 dict comprehension:
{k: v for i, (k, v) in enumerate(my_dict.items()) if i < n}
It returns a dict rather than printouts

For those who would rather solve this problem with a pandas DataFrame: put your dictionary mydict into a DataFrame, transpose it, and get the first few rows:
pd.DataFrame(mydict, index=[0]).T.head()
0 hi0
1 hi1
2 hi2
3 hi3
4 hi4

From the documentation:
CPython implementation detail: Keys and values are listed in an
arbitrary order which is non-random, varies across Python
implementations, and depends on the dictionary’s history of insertions
and deletions.
I've only toyed around at best with other Python implementations (eg PyPy, IronPython, etc), so I don't know for certain if this is the case in all Python implementations, but the general idea of a dict/hashmap/hash/etc is that the keys are unordered.
That being said, you can use an OrderedDict from the collections library. OrderedDicts remember the order of the keys as you entered them.
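On Python 3.7 and later, plain dicts also keep insertion order, so head/tail helpers become meaningful. Here is a sketch under that assumption; the names head and tail are made up for illustration and are not dict methods.

```python
from itertools import islice

def head(d, n=3):
    """Return a new dict with the first n items of d, in insertion order."""
    return dict(islice(d.items(), n))

def tail(d, n=3):
    """Return a new dict with the last n items of d, in insertion order."""
    # Dict views cannot be sliced, so materialise the items first.
    items = list(d.items())
    return dict(items[max(len(items) - n, 0):])

sample = {1: 2, 3: 4, 5: 6, 7: 8, 9: 10, 11: 12}
print(head(sample))     # {1: 2, 3: 4, 5: 6}
print(tail(sample, 2))  # {9: 10, 11: 12}
```

head never copies the whole dictionary, while tail does build a full list of items, which may matter for very large dicts.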

If the keys are sortable in some way, you can do this:
head = dict([(key, mydict[key]) for key in sorted(mydict.keys())[:3]])
Or perhaps:
head = dict(sorted(mydict.items(), key=lambda x: x[0])[:3])
Where x[0] is the key of each key/value pair.

list(reverse_word_index.items())[:10]
Change the number from 10 to however many items of the dictionary reverse_word_index you want to preview

A quick and short solution can be this:
import pandas as pd
d = {"a": [1,2], "b": [2, 3], "c": [3, 4]}
pd.Series(d).head()
a [1, 2]
b [2, 3]
c [3, 4]
dtype: object

This gives back a dictionary:
dict(list(my_dictname.items())[0:n])
If you just want to have a glance of your dict, then just do:
list(freqs.items())[0:n]

Order of items in a dictionary is preserved in Python 3.7+, so this question makes sense.
To get a dictionary with only 10 items from the start you can use pandas:
d = {"a": [1,2], "b": [2, 3], "c": [3, 4]}
import pandas as pd
result = pd.Series(d).head(10).to_dict()
print(result)
This will produce a new dictionary.

d = {"a": 1,"b": 2,"c": 3}
for k, v in list(d.items())[:2]:
    print('{}:{}'.format(k, v))
a:1
b:2

How to only store 3 values for a key in a dictionary? Python

I tried to make the program store only the last 3 scores (values) for each key (name). However, the program either stores the first 3 scores and then never updates them, or appends more values than it should.
The code I have so far:
#appends values if a key already exists
while tries < 3:
    d.setdefault(name, []).append(scores)
    tries = tries + 1

Though I could not fully understand your question, what I gather from it is that you want to store only the last three scores in the list. That is a simple task.
d.setdefault(name,[]).append(scores)
if len(d[name])>3:
    del d[name][0]
This code checks after every addition whether the length of the list exceeds 3. If it does, the first element (the one added before the last three) is deleted.

Use a collections.defaultdict + collections.deque with a max length set to 3:
from collections import deque,defaultdict
d = defaultdict(lambda: deque(maxlen=3))
Then call d[name].append(score): if the key does not exist, the key/value pair is created; if it does exist, the score is simply appended.
This also avoids deleting from the start of a list, which is inefficient because every remaining element has to shift.
Demo:
from random import randint
for _ in range(10):
    for name in range(4):
        d[name].append(randint(1,10))
print(d)
defaultdict(<function <lambda> at 0x7f06432906a8>, {0: deque([9, 1, 1], maxlen=3), 1: deque([5, 5, 8], maxlen=3), 2: deque([5, 1, 3], maxlen=3), 3: deque([10, 6, 10], maxlen=3)})
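Since the demo above uses random numbers, here is a deterministic sketch of the same idea, which makes it easier to see which scores are dropped. The names "alice" and "bob" and the scores are made up for illustration.

```python
from collections import defaultdict, deque

# Every new key gets its own deque that holds at most three items.
scores = defaultdict(lambda: deque(maxlen=3))

for score in [4, 7, 9, 2, 8]:
    # Once the deque is full, appending discards the oldest score.
    scores["alice"].append(score)

scores["bob"].append(5)

print(list(scores["alice"]))  # [9, 2, 8]
print(list(scores["bob"]))    # [5]
```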

One good way to keep the last N items in Python is to use a deque with maxlen N, so in this case you can use the defaultdict and deque classes from the collections module.
example :
>>> from collections import defaultdict ,deque
>>> l=[1,2,3,4,5]
>>> d=defaultdict(deque)
>>> d['q']=deque(maxlen=3)
>>> for i in l:
... d['q'].append(i)
...
>>> d
defaultdict(<type 'collections.deque'>, {'q': deque([3, 4, 5], maxlen=3)})

A slight variation on another answer in case you want to extend the list in the entry name
d.setdefault(name,[]).extend(scores)
if len(d[name])>3:
    del d[name][:-3]

from collections import defaultdict
d = defaultdict(list)
# Append, then keep only the last three values
d[key].append(val)
d[key] = d[key][-3:]
# One-line alternative: append only while fewer than three values are stored
len(d[key])>2 or d[key].append(value)

Set dictionary values based on range

index = {
u'when_air': 0,
u'chrono': 1,
u'age_marker': 2,
u'name': 3
}
How can I build this in a cleaner (and clearer) way than manually setting each value?
like:
index = dict_from_range(
[u'when_air', u'chrono', u'age_marker', u'name'],
range(4)
)

You can feed the results of zip() to the builtin dict():
>>> names = [u'when_air', u'chrono', u'age_marker', u'name']
>>> print(dict(zip(names, range(4))))
{'chrono': 1, 'name': 3, 'age_marker': 2, 'when_air': 0}
zip() will return a list of tuples (an iterator of tuples in Python 3), where the ith tuple pairs the ith element of names with the ith element of range(4). dict() knows how to create a dictionary from that.
Notice that if you give sequences of uneven lengths to zip(), the results are truncated. Thus it might be smart to use range(len(names)) as the argument, to guarantee an equal length.
>>> print(dict(zip(names, range(len(names)))))
{'chrono': 1, 'name': 3, 'age_marker': 2, 'when_air': 0}
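To make the truncation concrete, here is a small sketch, assuming Python 3.7+ so the printed order matches insertion order. It shows what happens when the range is shorter than the list of names.

```python
names = ["when_air", "chrono", "age_marker", "name"]

# zip() stops at the shortest input, so a too-short range silently drops keys.
truncated = dict(zip(names, range(2)))
print(truncated)  # {'when_air': 0, 'chrono': 1}

# Deriving the range from the list keeps the two lengths in step.
complete = dict(zip(names, range(len(names))))
print(complete)  # {'when_air': 0, 'chrono': 1, 'age_marker': 2, 'name': 3}
```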

You can use a dict comprehension together with the built-in function enumerate to build the dictionary from the keys in the desired order.
Example:
keys = [u'when_air', u'chrono', u'age_marker', u'name']
d = {k: i for i,k in enumerate(keys)}
print d
The output is:
{u'age_marker': 2, u'when_air': 0, u'name': 3, u'chrono': 1}
Note that with Python 3.4 the enum module was added. It may provide the desired semantics more conveniently than a dictionary.
For reference:
http://legacy.python.org/dev/peps/pep-0274/
https://docs.python.org/2/library/functions.html#enumerate
https://docs.python.org/3/library/enum.html

index = {k:v for k,v in zip(['when_air','chrono','age_marker','name'],range(4))}

This?
keys = [u'when_air', u'chrono', u'age_marker', u'name']
from itertools import count
print dict(zip(keys, count()))
