How do I use a dictionary comprehension to obtain the average of the student scores?
co_dct = {"Juan":[90,85,98], "Lana":[94,80,100], "Alicia":[100,90], "Sam":[]}
co_dct = d/d[] for d in co_dct
print(co_dct)
As a dictionary comprehension:
>>> co_dct = {"Juan":[90,85,98], "Lana":[94,80,100], "Alicia":[100,90], "Sam":[]}
>>> {k: sum(co_dct[k])/float(len(co_dct[k])) for k in co_dct if co_dct[k]}
{'Juan': 91.0, 'Lana': 91.33333333333333, 'Alicia': 95.0}
Note the use of a filter to guard against division by zero errors when the sample list is empty. This results in the loss of keys that have empty samples, but that seems reasonable since you can't produce an average without data.
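If dropping the students with no scores is not acceptable, one option is to keep every key and store a placeholder instead. A minimal sketch, assuming None is an acceptable marker for "no data" (that choice is an assumption, not something the question specifies):

```python
co_dct = {"Juan": [90, 85, 98], "Lana": [94, 80, 100], "Alicia": [100, 90], "Sam": []}

# Keep every key; use None where there are no scores to average.
averages = {
    name: (sum(scores) / len(scores) if scores else None)
    for name, scores in co_dct.items()
}
print(averages)
```

The conditional expression sits in the value position, so no key is filtered out and the division is still never attempted on an empty list.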
Since you are using Python 3 another way is to use statistics.mean():
>>> from statistics import mean
>>> {k: mean(co_dct[k]) for k in co_dct if co_dct[k]}
{'Lana': 91.33333333333333, 'Alicia': 95, 'Juan': 91}
A minor optimisation might be to use co_dct.items() to avoid multiple dict lookups:
>>> {k: mean(values) for k, values in co_dct.items() if values}
I have a dictionary with an int as the value for each key, and a total stored in a variable. I want to compute the percentage of the total that each value represents and store that percentage in the dictionary alongside the original value, under the same key.
I tried extracting the values into a list, computing the percentages, and appending the results to another list, but I don't know how to get that list back into the dictionary.
total = 1000
d = {"key_1":150, "key_2":350, "key_3":500}
lst = list(d.values())
percentages = [100 * (i/total) for i in lst]
# Desired dictionary
d
{"key_1": [15%, 150],
"key_2": [35%, 350],
"key_3": [50%, 500]
}
You're better off avoiding the intermediate list and just updating each key as you go:
total = 1000
d = {"key_1":150, "key_2":350, "key_3":500}
for k, v in d.items():
    d[k] = [100 * (v / total), v]
While it's technically possible to zip the dict's keys with the values of the list, as long as the keys aren't changed and the list order is kept in line with the values extracted from the dict, the resulting code would reek of code smell, and it's just easier to avoid that list entirely anyway.
Note that this won't put a % sign in the representation, because there is no such thing as a percentage type. The only simple way to shove one in there would be to store it as a string, not a float, e.g. replacing the final line with:
d[k] = [f'{100 * (v / total)}%', v]
to format the calculation as a string, and shove a % on the end.
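If the percentage is only needed for display, Python's format mini-language has a % presentation type that multiplies by 100 and appends the sign. A small sketch, assuming zero decimal places are wanted (the name d2 is only for illustration):

```python
total = 1000
d = {"key_1": 150, "key_2": 350, "key_3": 500}

# ".0%" multiplies the ratio by 100, rounds to 0 decimals, and appends "%".
d2 = {k: [f"{v / total:.0%}", v] for k, v in d.items()}
print(d2)
```

Changing the format spec to ".1%" would keep one decimal place instead.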
Here
total = 1000
d = {"key_1": 150, "key_2": 350, "key_3": 500}
d1 = {k: ['{}%'.format(v * 100 / total), v] for k, v in d.items()}
print(d1)
output
{'key_1': ['15.0%', 150], 'key_2': ['35.0%', 350], 'key_3': ['50.0%', 500]}
c = [{'text': 'LahoreRightNow', 'indices': [111, 126]},
{'text': 'PakvsSL', 'indices': [127, 135]}]
I want to access the text of both dictionaries. I can get them with c[0]['text'] and c[1]['text'].
Is there a way to do this with a single command?
If a list comprehension satisfies your single command constraint, use
>>> [dic['text'] for dic in c]
['LahoreRightNow', 'PakvsSL']
which is shorthand for
>>> result = []
>>> for dic in c:
...     result.append(dic['text'])
...
>>> result
['LahoreRightNow', 'PakvsSL']
It does not get more single-command-esque than the comprehension here, but you could hide the for loop if you prefer the functional programming style:
>>> from operator import itemgetter
>>> map(itemgetter('text'), c)
['LahoreRightNow', 'PakvsSL']
(convert the map-object to a list in Python3 with list(map(...)).)
There is no way to access all text keys in a single operation from a list of dictionaries.
You can create a function, but that function would still perform an operation on each dictionary individually.
What you can do is create a new dictionary after performing the aggregation yourself. The optimal way to do this is via collections.defaultdict:
from collections import defaultdict
d = defaultdict(list)
for my_d in c:
    for k, v in my_d.items():
        d[k].append(v)
# defaultdict(<class 'list'>, {'text': ['LahoreRightNow', 'PakvsSL'],
# 'indices': [[111, 126], [127, 135]]})
Then d['text'] will return the list you require.
@timgeb's list comprehension is fine for the single-key case, but the above method will be more efficient if you need the values for multiple keys.
Try this short approach:
print(list(map(lambda x:x['text'],c)))
output:
['LahoreRightNow', 'PakvsSL']
I have a dictionary with a list as value.
I want to have an average of this list.
How do I compute that?
dict1 = {
'Monty Python and the Holy Grail': [[9, 10, 9.5, 8.5, 3, 7.5, 8]],
"Monty Python's Life of Brian": [[10, 10, 0, 9, 1, 8, 7.5, 8, 6, 9]],
"Monty Python's Meaning of Life": [[7, 6, 5]],
'And Now For Something Completely Different': [[6, 5, 6, 6]]
}
I have tried
dict2 = {}
for key in dict1:
    dict2[key] = sum(dict1[key])
but it says: "TypeError: unsupported operand type(s) for +: 'int' and 'list'"
As noted in other posts, the first issue is that your dictionary values are lists of lists, not simple lists. The second issue is that you were calling sum without then dividing by the number of elements, which would not give you an average.
If you are willing to use numpy, try this:
import numpy as np
dict_of_means = {k:np.mean(v) for k,v in dict1.items()}
>>> dict_of_means
{'Monty Python and the Holy Grail': 7.9285714285714288, "Monty Python's Life of Brian": 6.8499999999999996, "Monty Python's Meaning of Life": 6.0, 'And Now For Something Completely Different': 5.75}
Or, without using numpy or any external packages, you can do it manually by first flattening the lists of lists stored as values, then using the same type of dict comprehension, taking the sum of the flattened list and dividing by the number of elements in it:
dict_of_means = {k: sum([i for x in v for i in x])/len([i for x in v for i in x])
for k, v in dict1.items()}
Note that [i for x in v for i in x] takes a list of lists v and flattens it to a simple list.
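To see the flattening step in isolation, here is a small sketch using one value copied from dict1 (the names v, flat and mean_value are only for illustration). Flattening once and reusing the result also avoids building the same list twice:

```python
# One value from dict1: an outer list holding a single inner list.
v = [[7, 6, 5]]

# "for x in v" walks the inner lists, "for i in x" walks their numbers.
flat = [i for x in v for i in x]
mean_value = sum(flat) / len(flat)

print(flat)
print(mean_value)
```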
FYI, the dictionary comprehension syntax is more or less equivalent to this for loop:
dict_of_means = {}
for k, v in dict1.items():
    dict_of_means[k] = sum([i for x in v for i in x])/len([i for x in v for i in x])
There is an in-depth description of dictionary comprehensions in the question I linked above.
If you don't want to use external libraries and you want to keep that structure:
dict2 = {}
for key in dict1:
    dict2[key] = sum(dict1[key][0])/len(dict1[key][0])
The problem is that your values are not 1D lists, they're 2D lists. If you simply remove the extra brackets, your solution should work.
Also, don't forget to divide the sum of the list by its length (and, if you're using Python 2, to enable true division with from __future__ import division).
You can do that simply by using itertools.chain and a helper function to compute average.
Here is the helper function to compute average
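def average(iterable):
    total = 0.0  # named total to avoid shadowing the built-in sum()
    count = 0
    for v in iterable:
        total += v
        count += 1
    if count > 0:
        return total / count
    # falls through and returns None for an empty iterable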
If you want to average for each key, you can simply do that using dictionary comprehension and helper function we wrote above:
from itertools import chain
averages = {k: average(chain.from_iterable(v)) for k, v in dict1.items()}
Or, if you want the average across all the keys:
from itertools import chain
average(chain.from_iterable(chain.from_iterable(dict1.values())))
Your lists are nested, all being lists of a single item, which is itself a list of the actual numbers. Here I extract these lists using val[0], val being the outer lists:
for key, val in dict1.copy().items():
    the_list = val[0]
    dict1[key] = sum(the_list)/len(the_list)
This replaces all these nested lists with the average you are after. Also, you should never mutate anything while looping over it. Therefore, a copy of the dict is used above.
Alternatively you could make use of the fancier dictionary comprehension:
dict2 = {key: sum(the_list)/len(the_list) for key, (the_list,) in dict1.items()}
Note the clever but subtle way the inner list is extracted here.
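The extraction relies on single-element unpacking: the target (the_list,) matches an outer list that holds exactly one item. A short sketch, using two entries copied from dict1 plus a made-up two-element list to show what happens when that assumption does not hold:

```python
dict1 = {
    "Monty Python's Meaning of Life": [[7, 6, 5]],
    'And Now For Something Completely Different': [[6, 5, 6, 6]],
}

# (the_list,) unpacks the single inner list out of each value.
dict2 = {key: sum(the_list) / len(the_list) for key, (the_list,) in dict1.items()}
print(dict2)

# Hypothetical malformed value with two inner lists: unpacking raises ValueError.
try:
    (the_list,) = [[1, 2], [3, 4]]
    unpack_failed = False
except ValueError:
    unpack_failed = True
print(unpack_failed)
```

Compared with val[0], the unpacking form fails loudly on unexpected shapes rather than silently ignoring extra inner lists.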
I have a dictionary d with 100 keys where the values are variable length lists, e.g.
In[165]: d.values()[0]
Out[165]:
[0.0432,
0.0336,
0.0345,
0.044,
0.0394,
0.0555]
In[166]: d.values()[1]
Out[166]:
[0.0236,
0.0333,
0.0571]
Here's what I'd like to do: for every list in d.values(), I'd like to organize the values into 10 bins (where a value gets tossed into a bin if it satisfies the criteria, e.g. is between 0.03 and 0.04, 0.04 and 0.05, etc.).
What I'd like to end up with is something that looks exactly like d, but instead of d.values()[0] being a list of numbers, I'd like it to be a list of lists, like so:
In[167]: d.values()[0]
Out[167]:
[[0.0336,0.0345,0.0394],
[0.0432,0.044],
[0.0555]]
Each key would still be associated with the same values, but they'd be structured into the 10 bins.
I've been going crazy with nested for loops and if/elses, etc. What is the best way to go about this?
EDIT: Hi, all. Just wanted to let you know I resolved my issues. I used a variation of @Brent Washburne's answer. Thanks for the help!
def bin(values):
    bins = [[] for _ in range(10)]  # create ten bins
    for n in values:
        b = int(n * 100)  # normalize the value to the bin number
        bins[b].append(n)  # add the number to the bin
    return bins
d = [0.0432,
0.0336,
0.0345,
0.044,
0.0394,
0.0555]
print(bin(d))
The result is:
[[], [], [], [0.0336, 0.0345, 0.0394], [0.0432, 0.044], [0.0555], [], [], [], []]
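The int(n * 100) indexing assumes every value lies in [0, 0.10); anything at or above 0.10 raises an IndexError and a negative value lands in the wrong bin. A hedged sketch of one alternative, using explicit bin edges with the standard-library bisect module and applying it to a whole dictionary (the name bin_values, the edges list and the keys "a" and "b" are made up for illustration):

```python
from bisect import bisect_right

def bin_values(values, edges):
    """Place each value in the bin whose lower edge is the largest edge <= value."""
    bins = [[] for _ in edges]
    for n in values:
        idx = bisect_right(edges, n) - 1
        if idx < 0:
            continue  # value is below the first edge: skipped in this sketch
        bins[idx].append(n)  # values at or above the last edge land in the last bin
    return bins

edges = [i / 100 for i in range(10)]  # lower edges 0.00, 0.01, ..., 0.09

d = {
    "a": [0.0432, 0.0336, 0.0345, 0.044, 0.0394, 0.0555],
    "b": [0.0236, 0.0333, 0.0571],
}
binned = {k: bin_values(v, edges) for k, v in d.items()}
print(binned["a"])
```

Passing the edges explicitly also makes it straightforward to switch to unevenly spaced bins.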
You can use itertools.groupby() function by passing a proper key-function in order to categorize your items. And in this case you can use floor(x*100) as your key-function:
>>> from math import floor
>>> from itertools import groupby
>>> lst = [0.0432, 0.0336, 0.0345, 0.044, 0.0394, 0.0555]
>>> [list(g) for _,g in groupby(sorted(lst), key=lambda x: floor(x*100))]
[[0.0336, 0.0345, 0.0394], [0.0432, 0.044], [0.0555]]
And for applying this on your values you can use a dictionary comprehension:
def categorizer(val):
    # sort and group the argument itself, not the global lst
    return [list(g) for _, g in groupby(sorted(val), key=lambda x: floor(x*100))]
new_dict = {k: categorizer(v) for k, v in old_dict.items()}
As another approach, which is more optimized in terms of execution speed, you can use a dictionary for categorizing:
>>> def categorizer(val):
...     d = {}  # fresh dict per call; a mutable default argument would accumulate across calls
...     for i in val:
...         d.setdefault(floor(i*100), []).append(i)
...     return list(d.values())
Why not make the values a set of dictionaries, where the key is the bin indicator and the value is a list of the items that are in that bin?
You would define
newd = [{bin1:[], bin2:[], ...binn:[]}, ... ]
newd[0][bin1] = (list of items in d[0] that belong in bin1)
You now have a list of dictionaries each of which has the appropriate bin listings.
newd[0] is now the equivalent of a dictionary built from d[0]: each key (which I call bin1, bin2, ... binn) maps to a list of the values that belong in that bin. Thus we have newd[0][bin1], newd[0][bin2], ... newd[k][lastbin].
Dictionary creation allows you to create the appropriate key and value list as you go along. If there is not yet a particular bin key, create the empty list and then the append of the value to the list will succeed.
Now when you want to identify the elements of a bin, you can loop through the list newd and extract whichever bin you want. This lets you have bins with no entries without having to create empty lists: if a bin key is not in a dictionary, the lookup can be made to return an empty list as a default, which avoids a KeyError.
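The description above could be sketched roughly as follows. This is only one reading of the idea: the helper name bin_as_dict is made up, the bin key is assumed to be floor(value * 100), and d is assumed to be a list of the value lists from the question:

```python
from math import floor

def bin_as_dict(values):
    """Map each bin number to the list of values that fall in that bin."""
    bins = {}
    for v in values:
        # setdefault creates the empty list the first time a bin key is seen
        bins.setdefault(floor(v * 100), []).append(v)
    return bins

d = [
    [0.0432, 0.0336, 0.0345, 0.044, 0.0394, 0.0555],
    [0.0236, 0.0333, 0.0571],
]
newd = [bin_as_dict(values) for values in d]

print(newd[0][3])          # values of d[0] in the 0.03-0.04 bin
print(newd[0].get(7, []))  # a bin with no entries falls back to an empty list
```

Using dict.get with a default is what allows empty bins to be omitted entirely while lookups still succeed.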
Say I have a list of lists like this (suppose you have no idea how many lists it contains):
list=[['food','fish'],['food','meat'],['food','veg'],['sports','football']..]
how can I merge the items in the list like the following:
list=[['food','fish','meat','veg'],['sports','football','basketball']....]
i.e, merge all the nested lists into the same list if they contain one of the same items.
Use defaultdict to make a dictionary that maps a type to values and then get the items:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> items = [['food','fish'],['food','meat'],['food','veg'],['sports','football']]
>>> for key, value in items:
...     d[key].append(value)
...
>>> [[key] + values for key, values in d.items()]
[['food', 'fish', 'meat', 'veg'], ['sports', 'football']]
The "compulsory" alternative to defaultdict is itertools.groupby, which can work better for data that is already ordered by key and when you don't want to build data structures from it (i.e., you just want to work on the groups):
data = [['food','fish'],['food','meat'],['food','veg'],['sports','football']]
from itertools import groupby
print([[k] + [i[1] for i in v] for k, v in groupby(data, lambda L: L[0])])
But defaultdict is more flexible and easier to understand, so go with @Blender's answer.
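One caveat with groupby is that it only merges adjacent items, so input that is not already ordered by key needs sorting first. A minimal sketch, with the sample data deliberately shuffled for illustration:

```python
from itertools import groupby
from operator import itemgetter

# Same pairs as above, but not ordered by key.
data = [['food', 'fish'], ['sports', 'football'], ['food', 'meat'], ['food', 'veg']]

first = itemgetter(0)
# sorted() is stable, so values keep their original relative order within each key.
merged = [
    [key] + [pair[1] for pair in group]
    for key, group in groupby(sorted(data, key=first), key=first)
]
print(merged)
```

Without the sort, the two runs of 'food' would come out as separate groups.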