Python dictionary - list compute to average

I have a dictionary with a list as value.
I want to have an average of this list.
How do I compute that?
dict1 = {
'Monty Python and the Holy Grail': [[9, 10, 9.5, 8.5, 3, 7.5, 8]],
"Monty Python's Life of Brian": [[10, 10, 0, 9, 1, 8, 7.5, 8, 6, 9]],
"Monty Python's Meaning of Life": [[7, 6, 5]],
'And Now For Something Completely Different': [[6, 5, 6, 6]]
}
I have tried
dict2 = {}
for key in dict1:
    dict2[key] = sum(dict1[key])
but it says: "TypeError: unsupported operand type(s) for +: 'int' and 'list'"

As noted in other posts, the first issue is that your dictionary values are lists of lists, not simple lists. The second issue is that you were calling sum without then dividing by the number of elements, so even a working sum would not give you an average.
If you are willing to use numpy, try this:
import numpy as np
dict_of_means = {k:np.mean(v) for k,v in dict1.items()}
>>> dict_of_means
{'Monty Python and the Holy Grail': 7.9285714285714288, "Monty Python's Life of Brian": 6.8499999999999996, "Monty Python's Meaning of Life": 6.0, 'And Now For Something Completely Different': 5.75}
Or, without using numpy or any external packages, you can do it manually by first flattening the lists of lists in the values, then using the same type of dict comprehension, but taking the sum of the flattened list and dividing by the number of elements in that flattened list:
dict_of_means = {k: sum([i for x in v for i in x])/len([i for x in v for i in x])
                 for k, v in dict1.items()}
Note that [i for x in v for i in x] takes a list of lists v and flattens it to a simple list.
FYI, the dictionary comprehension syntax is more or less equivalent to this for loop:
dict_of_means = {}
for k,v in dict1.items():
    dict_of_means[k] = sum([i for x in v for i in x])/len([i for x in v for i in x])
There is an in-depth description of dictionary comprehensions in the question I linked above.

If you don't want to use external libraries and you want to keep that structure:
dict2 = {}
for key in dict1:
    dict2[key] = sum(dict1[key][0])/len(dict1[key][0])

The problem is that your values are not 1D lists, they're 2D lists. If you simply remove the extra brackets, your solution should work.
Also don't forget to divide the sum of the list by the length of the list (and if you're using python 2, to import the new division).
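For readers on Python 3, here is a minimal sketch of the same idea using only the standard library. It assumes the values are lists of lists as in the question (shortened here to two entries); the name dict_of_means is just illustrative.

```python
from itertools import chain
from statistics import mean

dict1 = {
    "Monty Python's Meaning of Life": [[7, 6, 5]],
    'And Now For Something Completely Different': [[6, 5, 6, 6]],
}

# Flatten each list of lists, then average the flattened numbers.
dict_of_means = {k: mean(chain.from_iterable(v)) for k, v in dict1.items()}
print(dict_of_means)
```

statistics.mean does the dividing for you, so there is no separate sum/len step to forget.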

You can do that simply by using itertools.chain and a helper function to compute average.
Here is the helper function to compute average
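def average(iterable):
    total = 0.0  # named total so the built-in sum is not shadowed
    count = 0
    for v in iterable:
        total += v
        count += 1
    if count > 0:
        return total / count
    # an empty iterable falls through and returns None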
If you want to average for each key, you can simply do that using dictionary comprehension and helper function we wrote above:
from itertools import chain
averages = {k: average(chain.from_iterable(v)) for k, v in dict1.items()}
Or, if you want to get the average across all the keys:
from itertools import chain
average(chain.from_iterable(chain.from_iterable(dict1.values())))

Your lists are nested, all being lists of a single item, which is itself a list of the actual numbers. Here I extract these lists using val[0], val being the outer lists:
for key, val in dict1.copy().items():
    the_list = val[0]
    dict1[key] = sum(the_list)/len(the_list)
This replaces all these nested lists with the average you are after. Also, you should never mutate anything while looping over it. Therefore, a copy of the dict is used above.
Alternatively you could make use of the fancier dictionary comprehension:
dict2 = {key: sum(the_list)/len(the_list) for key, (the_list,) in dict1.items()}
Note the clever but subtle way the inner list is extracted here.
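To spell out what that unpacking does, here is a small sketch with made-up keys "a" and "b". The pattern (the_list,) only matches an outer list holding exactly one item, so it also acts as a sanity check on the data shape.

```python
dict1 = {"a": [[7, 6, 5]], "b": [[6, 5, 6, 6]]}

# "(the_list,)" is a one-element unpacking target: it pulls the single
# inner list out of each value.
dict2 = {key: sum(the_list) / len(the_list) for key, (the_list,) in dict1.items()}
print(dict2)

# If a value held two inner lists, the unpacking would raise ValueError.
try:
    (only,) = [[1, 2], [3, 4]]
except ValueError as exc:
    print("unpacking failed:", exc)
```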

Related

when I use a tuple for a key of a dictionary, can I search data with only one element of the tuple key?

Let's assume that there's a dictionary variable 'dict' like below. with tuple type keys in this case.
dict = {(a,2019): 6, (a,2020): 7 , (a,2021):8, (a,2022):9, (b,2020):8, (b,2021):10}
And then I want to search all values with keys that has 'a' for the first element of the key.
So after search I want to put the result set into a list 'result'. result will have the values like below.
result = [6,7,8,9]
I would be able to get values like below
result.append(dict.get((a,2019)))
result.append(dict.get((a,2020)))
....
but I wanted to search data by matching only once for example using regex in this case like
result=dict.get((a, "\d{4}"))
Obviously, this doesn't work.
I just want to know if there's a way that I can search data by matching only one element of tuple type keys in this case.

You may just want a dictionary of dictionaries. If you define:
from collections import defaultdict
mydict = defaultdict(dict)
Then you can write things like mydict['a'][2010] = 100 and have what you expect.
Looking at the value of mydict['a'] will return a dictionary of all years for which the first part of the key is 'a'.
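As a rough sketch of that restructuring, assuming the first key element is the string 'a' or 'b' (the question leaves a and b undefined), an existing tuple-keyed dict can be converted once and then queried by its first element. The names flat and nested are illustrative.

```python
from collections import defaultdict

flat = {('a', 2019): 6, ('a', 2020): 7, ('a', 2021): 8, ('a', 2022): 9,
        ('b', 2020): 8, ('b', 2021): 10}

# Build name -> {year: value} from the tuple keys.
nested = defaultdict(dict)
for (name, year), value in flat.items():
    nested[name][year] = value

result = list(nested['a'].values())
print(result)  # [6, 7, 8, 9]
```

The conversion costs one pass over the data; after that each lookup by first element is a single dictionary access rather than a scan of all keys.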

How about a list comprehension to see if 'a' is in the key of your dictionary?
mydict = {('a',2019): 6, ('a',2020): 7 , ('a',2021):8, ('a',2022):9, ('b',2020):8, ('b',2021):10}
result = [v for k,v in mydict.items() if 'a' in k]
# [6, 7, 8, 9]

You can use list comprehension:
dct = {('a',2019): 6, ('a',2020): 7 , ('a',2021):8, ('a',2022):9, ('b',2020):8, ('b',2021):10}
result = [v for k, v in dct.items() if k[0] == 'a']
print(result) # [6, 7, 8, 9]

Delete items in a dictionary with values that don't equal the highest value in Python

Essentially I want to delete every key in a dictionary if its value doesn't equal the highest value.
Let's say this is the dictionary:
myDict = {"Bob": 1, "Bill": 5, "Barry": 4, "Steve": 5}
I'm able to sort it by value using this:
myDict = sorted(myDict, key=myDict.get, reverse=True)
Now I want to remove any key in the dictionary that doesn't equal the highest value (in this case '5'). To end up with this:
myDict = {"Bill": 5, "Steve": 5}
I've tried using this for loop:
for item, v in myDict:
    if v < myDict[0]:
        del myDict[v]
But I get this error:
ValueError: too many values to unpack (expected 2)
This is a) my first time posting here, and b) I've only been learning Python for a few months so I'm sorry if I've made any stupid mistakes.

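Iterating with for item, v in myDict yields only the keys, so Python tries to unpack each key into item and v, which is what raises the error. Use myDict.items() (or myDict.iteritems() in Python 2) instead, compare against the highest value, and delete by key. This assumes myDict is still the original dictionary, not the sorted list of keys:
highest = max(myDict.values())
for item, v in list(myDict.items()):
    if v < highest:
        del myDict[item]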
To get Highest value of myDict
max(myDict.values())
When deleting keys, never change the dictionary you are iterating over; it will give you a RuntimeError. Iterate over a copy instead and change the original, as Anand S Kumar suggested.

You should never alter the object you're iterating over, that usually yields unexpected results (internal pointers get shifted and you miss elements in your iteration and such). You best gather the keys you want to delete and then remove the keys in a separate iteration:
keys = [k for k in myDict.keys() if myDict[k] != max(myDict.values())]
for k in keys: del myDict[k]
It might be best to put the max expression in a variable too so it doesn't get evaluated multiple times. Not sure if Python's able to optimize that for you (probably not).
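A small sketch of that suggestion, with the maximum computed once and the keys to remove collected before anything is deleted. The names highest and to_delete are just illustrative.

```python
myDict = {"Bob": 1, "Bill": 5, "Barry": 4, "Steve": 5}

highest = max(myDict.values())  # evaluated once, not once per key

# Collect the keys first, then delete in a separate pass.
to_delete = [k for k, v in myDict.items() if v != highest]
for k in to_delete:
    del myDict[k]

print(myDict)  # {'Bill': 5, 'Steve': 5}
```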

You can use dictionary comprehension to create a new dictionary:
newDict = {k: v for k,v in myDict.items() if v == max(myDict.values())}
The output for newDict:
{'Steve': 5, 'Bill': 5}

In Python how to obtain a partial view of a dict?

Is it possible to get a partial view of a dict in Python, analogous to pandas df.tail()/df.head()? Say you have a very long dict, and you just want to check some of the elements (the beginning, the end, etc.) of the dict. Something like:
dict.head(3) # To see the first 3 elements of the dictionary.
{[1,2], [2, 3], [3, 4]}
Thanks

Kinda strange desire, but you can get that by using this
from itertools import islice
# Python 2.x
dict(islice(mydict.iteritems(), 0, 2))
# Python 3.x
dict(islice(mydict.items(), 0, 2))
or for short dictionaries
# Python 2.x
dict(mydict.items()[0:2])
# Python 3.x
dict(list(mydict.items())[0:2])

Edit:
in Python 3.x:
Without using libraries it's possible to do it this way. Use the method:
.items()
which returns a view of the dictionary's (key, value) pairs.
It's necessary to convert that view to a list first, otherwise an error will occur: 'dict_items' object is not subscriptable. The list can then be sliced with square brackets and converted back to a dictionary.
dict(list(my_dict.items())[:3])

import itertools
def glance(d):
    return dict(itertools.islice(d.iteritems(), 3))
>>> x = {1:2, 3:4, 5:6, 7:8, 9:10, 11:12}
>>> glance(x)
{1: 2, 3: 4, 5: 6}
However:
>>> x['a'] = 2
>>> glance(x)
{1: 2, 3: 4, u'a': 2}
Notice that inserting a new element changed what the "first" three elements were in an unpredictable way. This is what people mean when they tell you dicts aren't ordered. You can get three elements if you want, but you can't know which three they'll be.

I know this question is 3 years old but here a pythonic version (maybe simpler than the above methods) for Python 3.*:
[print(v) for i, v in enumerate(my_dict.items()) if i < n]
It will print the first n elements of the dictionary my_dict

One-upping @Neb's solution with a Python 3 dict comprehension:
{k: v for i, (k, v) in enumerate(my_dict.items()) if i < n}
It returns a dict rather than printouts

For those who would rather solve this problem with pandas dataframes. Just stuff your dictionary mydict into a dataframe, rotate it, and get the first few rows:
pd.DataFrame(mydict, index=[0]).T.head()
0 hi0
1 hi1
2 hi2
3 hi3
4 hi4

From the documentation:
CPython implementation detail: Keys and values are listed in an
arbitrary order which is non-random, varies across Python
implementations, and depends on the dictionary’s history of insertions
and deletions.
I've only toyed around at best with other Python implementations (eg PyPy, IronPython, etc), so I don't know for certain if this is the case in all Python implementations, but the general idea of a dict/hashmap/hash/etc is that the keys are unordered.
That being said, you can use an OrderedDict from the collections library. OrderedDicts remember the order of the keys as you entered them.

If the keys are sortable, you can do this:
head = dict([(key, myDict[key]) for key in sorted(myDict.keys())[:3]])
Or perhaps:
head = dict(sorted(myDict.items(), key=lambda x: x[0])[:3])
Where x[0] is the key of each key/value pair.

list(reverse_word_index.items())[:10]
Change the number from 10 to however many items of the dictionary reverse_word_index you want to preview

A quick and short solution can be this:
import pandas as pd
d = {"a": [1,2], "b": [2, 3], "c": [3, 4]}
pd.Series(d).head()
a [1, 2]
b [2, 3]
c [3, 4]
dtype: object

This gives back a dictionary:
dict(list(my_dictname.items())[0:n])
If you just want to have a glance of your dict, then just do:
list(freqs.items())[0:n]

Order of items in a dictionary is preserved in Python 3.7+, so this question makes sense.
To get a dictionary with only 10 items from the start you can use pandas:
d = {"a": [1,2], "b": [2, 3], "c": [3, 4]}
import pandas as pd
result = pd.Series(d).head(10).to_dict()
print(result)
This will produce a new dictionary.
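If pandas is more than you need, here is a standard-library sketch of the same thing, assuming Python 3.7+ so that insertion order is guaranteed. The helper names head and tail are made up to mirror the pandas methods; they are not dict methods.

```python
from itertools import islice

def head(d, n=5):
    """Return a new dict with the first n items of d (insertion order)."""
    return dict(islice(d.items(), n))

def tail(d, n=5):
    """Return a new dict with the last n items of d (insertion order)."""
    items = list(d.items())
    start = max(len(items) - n, 0)
    return dict(items[start:])

d = {"a": [1, 2], "b": [2, 3], "c": [3, 4]}
print(head(d, 2))  # {'a': [1, 2], 'b': [2, 3]}
print(tail(d, 2))  # {'b': [2, 3], 'c': [3, 4]}
```

head stops after n items without copying the whole dict, while this tail builds a full list first, which may matter for very large dictionaries.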

d = {"a": 1,"b": 2,"c": 3}
for k, v in list(d.items())[:2]:
    print('{}:{}'.format(k, v))
a:1
b:2

Summing up numbers in a defaultdict(list)

I've been experimenting trying to get this to work and I've exhausted every idea and web search. Nothing seems to do the trick. I need to sum numbers in a defaultdict(list) and I just need the final result, but no matter what I do I can only get to the final result by iterating and printing all the partial sums leading up to it. What I've been trying, generally:
d = { key : [1,2,3] }
running_total = 0
#Iterate values
for value in d.itervalues():
    #iterate through list inside value
    for x in value:
        running_total += x
        print running_total
The result is :
1,3,6
I understand it's doing this because it's iterating through the for loop. What I don't get is how else I can get to each of these list values without using a loop. Or is there some sort of method I've overlooked?
To be clear, I just want the final number returned, e.g. 6.
EDIT: I neglected a huge factor: the items in the list are timedelta objects, so I have to use .seconds to turn them into integers for adding. The solutions below make sense and I've tried similar, but trying to throw the .seconds conversion into the sum statement throws an error.
d = { key : [timedelta_Obj1,timedelta_Obj2,timedelta_Obj3] }

I think this will work for you:
sum(td.seconds for sublist in d.itervalues() for td in sublist)

Try this approach:
from datetime import timedelta as TD
d = {'foo' : [TD(seconds=1), TD(seconds=2), TD(seconds=3)],
'bar' : [TD(seconds=4), TD(seconds=5), TD(seconds=6), TD(seconds=7)],
'baz' : [TD(seconds=8)]}
print sum(sum(td.seconds for td in values) for values in d.itervalues())

You could just sum each of the lists in the dictionary, then take one final sum of the returned list.
>>> d = {'foo' : [1,2,3], 'bar' : [4,5,6,7], 'foobar' : [10]}
# sum each value in the dictionary
>>> [sum(d[i]) for i in d]
[10, 6, 22]
# sum each of the sums in the list
>>> sum([sum(d[i]) for i in d])
38

If you don't want to iterate or to use comprehensions you can use this:
d = {'1': [1, 2, 3], '2': [3, 4, 5], '3': [5], '4': [6, 7]}
print(sum(map(sum, d.values())))
If you use Python 2 and your dict has a lot of keys it's better you use imap (from itertools) and itervalues
from itertools import imap
print sum(imap(sum, d.itervalues()))

Your question was how to get the value "without using a loop". Well, you can't. But there is one thing you can do: use the high performance itertools.
If you use chain you won't have an explicit loop in your code. chain manages that for you.
>>> data = {'a': [1, 2, 3], 'b': [10, 20], 'c': [100]}
>>> import itertools
>>> sum(itertools.chain.from_iterable(data.itervalues()))
136
If you have timedelta objects you can use the same recipe.
>>> data = {'a': [timedelta(minutes=1),
timedelta(minutes=2),
timedelta(minutes=3)],
'b': [timedelta(minutes=10),
timedelta(minutes=20)],
'c': [timedelta(minutes=100)]}
>>> sum(td.seconds for td in itertools.chain.from_iterable(data.itervalues()))
8160
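A Python 3 variant of the same recipe, as a sketch: timedelta objects can be added to each other directly, so sum works on them if it is given timedelta() as the start value instead of the default 0. Note also that .seconds holds only the seconds part of a timedelta (the days are stored separately), whereas total_seconds() covers the whole duration.

```python
from datetime import timedelta
from itertools import chain

data = {'a': [timedelta(minutes=1), timedelta(minutes=2), timedelta(minutes=3)],
        'b': [timedelta(minutes=10), timedelta(minutes=20)],
        'c': [timedelta(minutes=100)]}

# Start the sum from an empty timedelta rather than the integer 0.
total = sum(chain.from_iterable(data.values()), timedelta())
print(total.total_seconds())  # 8160.0
```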

Selecting elements of a Python dictionary greater than a certain value

I need to select elements of a dictionary of a certain value or greater. I am aware of how to do this with lists, Return list of items in list greater than some value.
But I am not sure how to translate that into something functional for a dictionary. I managed to get the tags that correspond (I think) to values greater than or equal to a number, but using the following gives only the tags:
[i for i in dict if dict.values() >= x]

.items() returns (key, value) pairs that you can use to reconstruct a filtered dict: a generator expression is fed into the dict() constructor, which accepts any iterable of (key, value) tuples:
>>> d = dict(a=1, b=10, c=30, d=2)
>>> d
{'a': 1, 'c': 30, 'b': 10, 'd': 2}
>>> d = dict((k, v) for k, v in d.items() if v >= 10)
>>> d
{'c': 30, 'b': 10}
If you don't care about running your code on Python older than version 2.7, see @opatut's answer using "dict comprehensions":
{k:v for (k,v) in dict.items() if v > something}

While nmaier's solution would have been my way to go, notice that since python 2.7+ there has been a "dict comprehension" syntax:
{k:v for (k,v) in dict.items() if v > something}
Found here: Create a dictionary with list comprehension in Python. I found this by googling "python dictionary list comprehension", top post.
Explanation
{ .... } encloses the dict comprehension
k:v is the element (key and value) to add to the dict
for (k,v) in dict.items() iterates over all tuples (key-value pairs) of the dict
if v > something is the condition a value has to satisfy to be included
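Putting those pieces together, a runnable sketch with made-up data. The names d, threshold and filtered are only for illustration, and d is used rather than dict so the built-in type is not shadowed.

```python
d = dict(a=1, b=10, c=30, d=2)
threshold = 10

# Keep only the pairs whose value is at least the threshold.
filtered = {k: v for k, v in d.items() if v >= threshold}
print(filtered)  # {'b': 10, 'c': 30}
```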

You want dict[i] not dict.values(). dict.values() will return the whole list of values that are in the dictionary.
dict = {2:5, 6:2}
x = 4
print [dict[i] for i in dict if dict[i] >= x] # prints [5]
