Averaging a list of dicts in Python - python

I've got a list of daily values ordered into a list of dicts like so:
vals = [
{'date': '1-1-2014', 'a': 10, 'b': 33.5, 'c': 82, 'notes': 'high repeat rate'},
{'date': '2-1-2014', 'a': 5, 'b': 11.43, 'c': 182, 'notes': 'normal operations'},
{'date': '3-1-2014', 'a': 0, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'},
...]
What I'd like to do is get an average of a, b & c for the month.
Is there a better way than doing something like:
val_points = {}
val_len = len(vals)
for day in vals:
    for p in ['a', 'b', 'c']:
        if p in val_points:
            val_points[p] += day[p]
        else:
            val_points[p] = day[p]
val_avg = dict([(p, val_points[p] / val_len) for p in val_points])
I haven't run the code above, may have glitches but I hope I'm getting the idea across. I know there's probably a better way using some combination of operator, itertools and collections.

{p:sum(map(lambda x:x[p],vals))/len(vals) for p in ['a','b','c']}
output:
{'a': 5, 'c': 88, 'b': 15.143333333333333}
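Note that the output above reflects integer division as done by Python 2 (c comes out as 88 rather than 88.67). A minimal sketch for Python 3, assuming every dict contains all three keys; the helper name average_by_key is hypothetical, and statistics.mean stands in for the sum/len expression:

```python
from statistics import mean

vals = [
    {'date': '1-1-2014', 'a': 10, 'b': 33.5, 'c': 82, 'notes': 'high repeat rate'},
    {'date': '2-1-2014', 'a': 5, 'b': 11.43, 'c': 182, 'notes': 'normal operations'},
    {'date': '3-1-2014', 'a': 0, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'},
]

def average_by_key(rows, keys=('a', 'b', 'c')):
    """Return {key: mean of row[key] over all rows}; assumes each row has every key."""
    return {key: mean(row[key] for row in rows) for key in keys}

averages = average_by_key(vals)
print(averages)
```

A row that lacks one of the keys raises KeyError here, which may be preferable to silently averaging in zeros.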

This might be slightly longer than Elisha's answer, but it builds fewer intermediate data structures, so it might be faster:
from functools import reduce  # reduce is not a builtin in Python 3
KEYS = ['a', 'b', 'c']
def sum_and_count(sums_and_counts, item, key):
    # fall back to (0, 0) when nothing has been accumulated for this key yet
    prev_sum, prev_count = sums_and_counts.get(key, (0, 0))
    # treat a key that is missing from item as 0
    return (prev_sum + item.get(key, 0), prev_count + 1)
sums_and_counts = reduce(lambda sc, item: {key: sum_and_count(sc, item, key) for key in KEYS}, vals, {})
averages = {k: float(total) / no for (k, (total, no)) in sums_and_counts.items()}
print(averages)
output:
{'a': 5.0, 'c': 88.66666666666667, 'b': 15.143333333333333}

Since you want the average per month (assuming the dates are in 'dd-mm-yyyy' format):
vals = [
{'date': '1-1-2014', 'a': 10, 'b': 33.5, 'c': 82, 'notes': 'high repeat rate'},
{'date': '2-1-2014', 'a': 5, 'b': 11.43, 'c': 182, 'notes': 'normal operations'},
{'date': '3-1-2014', 'a': 20, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'},
{'date': '3-2-2014', 'a': 0, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'},
{'date': '4-2-2014', 'a': 20, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'}
]
month = {}
for x in vals:
    newKey = x['date'].split('-')[1]
    if newKey not in month:
        month[newKey] = {}
    for k in 'abc':
        if k in month[newKey]:
            month[newKey][k].append(x[k])
        else:
            month[newKey][k] = [x[k]]
output = {}
for y in month:
    if y not in output:
        output[y] = {}
    for z in month[y]:
        output[y][z] = sum(month[y][z])/float(len(month[y][z]))
print(output)
OUTPUT:
{'1': {'a': 11.666666666666666, 'c': 88.66666666666667, 'b': 15.143333333333333},
'2': {'a': 10.0, 'c': 2.0, 'b': 0.5}}
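The same grouping can be written more compactly with collections.defaultdict. This is only a sketch under the same 'dd-mm-yyyy' assumption; monthly_averages is a hypothetical name, and like the code above it keys on the month alone, so the same month in different years would be merged:

```python
from collections import defaultdict

vals = [
    {'date': '1-1-2014', 'a': 10, 'b': 33.5, 'c': 82, 'notes': 'high repeat rate'},
    {'date': '2-1-2014', 'a': 5, 'b': 11.43, 'c': 182, 'notes': 'normal operations'},
    {'date': '3-1-2014', 'a': 20, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'},
    {'date': '3-2-2014', 'a': 0, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'},
    {'date': '4-2-2014', 'a': 20, 'b': 0.5, 'c': 2, 'notes': 'high failure rate'},
]

def monthly_averages(rows, keys=('a', 'b', 'c')):
    """Group rows by the month field of a 'dd-mm-yyyy' date and average each key."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        month = row['date'].split('-')[1]  # second field is the month
        for key in keys:
            grouped[month][key].append(row[key])
    # true division: the sketch targets Python 3
    return {
        month: {key: sum(v) / len(v) for key, v in per_key.items()}
        for month, per_key in grouped.items()
    }

output = monthly_averages(vals)
print(output)
```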

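If you have multiple months' data, pandas will make your life a lot easier:
import pandas
df = pandas.DataFrame(vals)
# pandas.datetools and resample(..., how=...) were removed from later pandas releases
df['date'] = pandas.to_datetime(df['date'], dayfirst=True)
df.set_index('date', inplace=True)
means = df[['a', 'b', 'c']].resample('M').mean()  # 'M' = month end; pandas 2.2+ prefers the alias 'ME'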
Results in:
                a          b          c
date
2014-01-31      5  15.143333  88.666667

Related

Compare a dictionary with list of dictionaries and return index from the list which has higher value than the separate dictionary

I have a list of dictionaries and a separate dictionary with the same keys; only the values differ. For example, the list of dictionaries looks like this:
[{'A': 0.102, 'B': 0.568, 'C': 0.33}, {'A': 0.026, 'B': 0.590, 'C': 0.382}, {'A': 0.005, 'B': 0.857, 'C': 0.137}, {'A': 0.0, 'B': 0.962, 'C': 0.036}, {'A': 0.0, 'B': 0.991, 'C': 0.008}]
and the separate dictionary looks like this:
{'A': 0.005, 'B': 0.956, 'C': 0.038}
I want to compare the separate dictionary with the list of dictionaries and return the indices from the list that have a higher value than the separate dictionary. In this example, the indices would be 3 and 4, because the dictionaries at indices 3 and 4 have a higher value for key 'B', and 'B' has the highest value in the separate dictionary.
Any ideas on how I should approach this problem?
You can use enumerate for finding index of max value:
org = [
{'A': 0.102, 'B': 0.568, 'C': 0.33},
{'A': 0.026, 'B': 0.590, 'C': 0.382},
{'A': 0.005, 'B': 0.857, 'C': 0.137},
{'A': 0.0, 'B': 0.962, 'C': 0.036},
{'A': 0.0, 'B': 0.991, 'C': 0.008}
]
com = {'A': 0.005, 'B': 0.956, 'C': 0.038}
def fnd_index(org, com):
    key_max, val_max = max(com.items(), key=lambda x: x[1])
    print('key_max:', key_max)
    print('val_max:', val_max)
    res = []
    for idx, dct in enumerate(org):
        if dct[key_max] > val_max:
            res.append(idx)
    return res
res = fnd_index(org, com)
print('result:', res)
Output:
key_max: B
val_max: 0.956
result: [3, 4]
Are you sure that it should be only index 4?
dict_list = [{'A': 0.102, 'B': 0.568, 'C': 0.33},
{'A': 0.026, 'B': 0.590, 'C': 0.382},
{'A': 0.005, 'B': 0.857, 'C': 0.137},
{'A': 0.0, 'B': 0.962, 'C': 0.036},
{'A': 0.0, 'B': 0.991, 'C': 0.008}]
d = {'A': 0.005, 'B': 0.956, 'C': 0.038}
max_val = max(d.values())
idxmax = [i for i,j in enumerate(dict_list) if max(j.values()) > max_val]
print(idxmax) # [3, 4]
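Note that the last snippet compares each dictionary's largest value whatever its key, which happens to give the same result for this data. A minimal sketch that compares only on the reference dictionary's largest key; indices_above is a hypothetical name:

```python
def indices_above(dict_list, reference):
    """Indices whose value for the reference's largest key exceeds the reference value."""
    top_key = max(reference, key=reference.get)  # key with the highest value
    return [i for i, d in enumerate(dict_list) if d[top_key] > reference[top_key]]

dict_list = [{'A': 0.102, 'B': 0.568, 'C': 0.33},
             {'A': 0.026, 'B': 0.590, 'C': 0.382},
             {'A': 0.005, 'B': 0.857, 'C': 0.137},
             {'A': 0.0, 'B': 0.962, 'C': 0.036},
             {'A': 0.0, 'B': 0.991, 'C': 0.008}]
d = {'A': 0.005, 'B': 0.956, 'C': 0.038}

result = indices_above(dict_list, d)
print(result)
```

The two approaches differ for an entry such as {'A': 0.99, 'B': 0.0, 'C': 0.01}: its largest value exceeds 0.956, but its 'B' value does not.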

Python: rearrange a grouped dictionary

I am writing some code and half of it is done: I am grouping my dictionary. I want to create a function that goes back from b_dict to a_dict.
I want it to print this.
Expected output:
a_dict: {'A': 1, 'B': 2, 'C': 3, 'D': 1, 'E': 2, 'F': 3} # Original dictionary
Grouped dict: {1: ['A', 'D'], 2: ['B', 'E'], 3: ['C', 'F']} # Grouped dictionary
Expected dict: {'A': 1, 'D': 1, 'B': 2, 'E': 2, 'C': 3, 'F': 3} # Expected second output from the go_back function; the current code cannot produce this
Code:
a_dict = {'A': 1, 'B': 2, 'C': 3, 'D': 1, 'E': 2, 'F': 3}
print('a_dict: ', a_dict)
def fun_dict(a_dict):
    b_dict = {}
    for i, v in a_dict.items():
        b_dict[v] = [i] if v not in b_dict.keys() else b_dict[v] + [i]
    return b_dict
def go_back(b_dict):
    #
    # Need a function to convert b_dict to c_dict to go back as the expected output
    #
    pass  # placeholder so the stub is valid Python
b_dict = fun_dict(a_dict)
print('Grouped dict: ', b_dict)
c_dict = fun_dict(b_dict)
print('Went to the original dict: ', c_dict)
The go_back you want could be like this:
def go_back(b_dict):
    r = {}
    for k, vv in b_dict.items():
        for v in vv:
            r[v] = k
    return r
Result:
Went to the original dict: {'A': 1, 'D': 1, 'B': 2, 'E': 2, 'C': 3, 'F': 3}
Here is a proposal:
def go_back(b_dict):
    return {e: k for k, v in b_dict.items() for e in v}
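A quick round-trip check of the grouping and its inverse. This is a sketch for Python 3.7+, where dicts keep insertion order; group_by_value is a hypothetical stand-in for fun_dict that uses collections.defaultdict:

```python
from collections import defaultdict

def group_by_value(a_dict):
    """Map each value to the list of keys that carry it."""
    grouped = defaultdict(list)
    for key, value in a_dict.items():
        grouped[value].append(key)
    return dict(grouped)

def go_back(b_dict):
    """Invert the grouping: every key in a group maps back to the group's value."""
    return {key: value for value, keys in b_dict.items() for key in keys}

a_dict = {'A': 1, 'B': 2, 'C': 3, 'D': 1, 'E': 2, 'F': 3}
b_dict = group_by_value(a_dict)
c_dict = go_back(b_dict)
print('Grouped dict: ', b_dict)
print('Went to the original dict: ', c_dict)
```

The result compares equal to a_dict, although its keys come out in group order (A, D, B, E, C, F) rather than in the original order.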

Creating a function that filters a nested dictionary by given values

I am a beginner in Python trying to create a function that filters my nested dictionary by several values given in a dictionary like
filtered_options = {'a': 5, 'b': 'Cloth'}
For my dictionary
my_dict = {1.0:{'a': 1, 'b': 'Food', 'c': 500, 'd': 'Yams'},
2.0:{'a': 5, 'b': 'Cloth', 'c': 210, 'd': 'Linen'}}
If I pass my dictionary to the filter function with such options, I should get something that looks like
filtered_dict(my_dict, filtered_options = {'a': 5, 'b': 'Cloth'})
which outputs the 2nd key, along with any other keys in my dictionary that match the same filter options.
This should do what you want.
def dict_matches(d, filters):
    return all(k in d and d[k] == v for k, v in filters.items())
def filter_dict(d, filters=None):
    filters = filters or {}
    return {k: v for k, v in d.items() if dict_matches(v, filters)}
Here's what happens when you test it:
>>> filters = {'a': 5, 'b': 'Cloth'}
>>> my_dict = {
... 1.0: {'a': 1, 'b': 'Food', 'c': 500, 'd': 'Yams'},
... 2.0: {'a': 5, 'b': 'Cloth', 'c': 210, 'd': 'Linen'}
... }
>>> filter_dict(my_dict, filters)
{2.0: {'b': 'Cloth', 'a': 5, 'd': 'Linen', 'c': 210}}
You can do this:
import operator
from functools import reduce
def multi_level_indexing(nested_dict, key_list):
    """Index into the nested dictionary nested_dict by following the keys in key_list, one level per key."""
    return reduce(operator.getitem, key_list, nested_dict)
def filtered_dict(my_dict, filtered_options):
    return {k : v for k, v in my_dict.items() if all(multi_level_indexing(my_dict, [k,f_k]) == f_v for f_k, f_v in filtered_options.items())}
So that:
my_dict = {1.0:{'a': 1, 'b': 'Food', 'c': 500, 'd': 'Yams'},
2.0:{'a': 5, 'b': 'Cloth', 'c': 210, 'd': 'Linen'}}
will give you:
print(filtered_dict(my_dict, {'a': 5, 'b': 'Cloth'}))
# prints {2.0: {'a': 5, 'b': 'Cloth', 'c': 210, 'd': 'Linen'}}
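Be aware that indexing with operator.getitem raises KeyError when an inner dictionary lacks one of the filtered keys. A minimal sketch that tolerates missing keys by testing whether the filter's items are a subset of each inner dictionary's items; filter_by_items is a hypothetical name, and the values are assumed to be simple hashable types such as numbers and strings:

```python
def filter_by_items(nested, filters):
    """Keep entries whose inner dict contains every (key, value) pair in filters."""
    # dict item views are set-like, so <= tests for a subset
    return {k: inner for k, inner in nested.items() if filters.items() <= inner.items()}

my_dict = {1.0: {'a': 1, 'b': 'Food', 'c': 500, 'd': 'Yams'},
           2.0: {'a': 5, 'b': 'Cloth', 'c': 210, 'd': 'Linen'}}

matches = filter_by_items(my_dict, {'a': 5, 'b': 'Cloth'})
print(matches)
```

A filter on a key that no inner dictionary has simply yields an empty result instead of raising.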

Update list value in list of dictionaries

I have a list of dictionaries (much like in JSON). I want to apply a function to a key in every dictionary of the list.
>> d = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
# Desired value
[{'a': 200, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}]
# If I do this, I can only get the changed key
>> map(lambda x: {k: v * 100 for k, v in x.iteritems() if k == 'a'}, d)
[{'a': 200}, {'a': 100}, {'a': 100}, {'a': 100}]
# I try to add the non-modified key-values but get an error
>> map(lambda x: {k: v * 100 for k, v in x.iteritems() if k == 'a' else k:v}, d)
SyntaxError: invalid syntax
File "<stdin>", line 1
map(lambda x: {k: v * 100 for k, v in x.iteritems() if k == 'a' else k:v}, d)
How can I achieve this?
EDIT: 'a' and 'b' are not the only keys. These were selected for demo purposes only.
Iterate through the list and update the desired dict item:
lst = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
for d in lst:
    d['a'] *= 100
A list comprehension is usually fast, but it creates a new list and n new dicts. That is useful if you don't want to mutate your list:
new_lst = [{**d, 'a': d['a']*100} for d in lst]
In Python 2.x we can't use {**d}, so I built custom_update on top of the update method:
def custom_update(d):
    new_dict = dict(d)
    new_dict.update({'a':d['a']*100})
    return new_dict
[custom_update(d) for d in lst]
If you want to update a different key for each item in the list:
keys = ['a', 'b', 'a', 'b'] # keys[0] corresponds to lst[0], keys[1] to lst[1], ...
for index, d in enumerate(lst):
    key = keys[index]
    d[key] *= 100
Using a list comprehension:
[{**d, keys[index]: d[keys[index]] * 100} for index, d in enumerate(lst)]
In Python 2.x the equivalent is:
def custom_update(d, key):
    new_dict = dict(d)
    new_dict.update({key: d[key]*100})
    return new_dict
[custom_update(d, keys[index]) for index, d in enumerate(lst)]
You can use your inline conditionals (ternaries) in a better location within a comprehension:
>>> d = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
>>> d2 = [{k: v * 100 if k == 'a' else v for k, v in i.items()} for i in d]
>>> d2
[{'a': 200, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}]
Your map() call is close to working, you just need to change the order of your dict comprehension, and turn else k:v into else v:
>>> d = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
>>> list(map(lambda x: {k: v * 100 if k == 'a' else v for k, v in x.items()}, d))
[{'a': 200, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}]
If you are using a function, you may want to provide a target key and corresponding value:
d = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
f = lambda new_val, d1, key='a': {a:b*new_val if a == key else b for a, b in d1.items()}
new_d = list(map(lambda x:f(100, x), d))
Output:
[{'a': 200, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}]
After your edit ("'a' and 'b' are not the only keys. These were selected for demo purposes only"), here is a very simple map call that alters only the value of 'a' in place and leaves the rest as is:
map(lambda x: x.update({'a': x['a']*100}), d)  # Python 2 only: map is lazy in Python 3, so wrap it in list(...) there
My original answer was:
I think the simplest and most appropriate way to do this is to iterate over d, using the fact that each item in d is a dictionary with keys 'a' and 'b':
res = [{'a':e['a']*100, 'b':e['b']} for e in d]
Result:
[{'a': 200, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}]
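To apply an arbitrary function to one key while copying everything else, the comprehension approach can be wrapped in a small helper. A minimal sketch for Python 3.5+; apply_to_key is a hypothetical name, and leaving dicts that lack the key unchanged is an assumption the question does not state:

```python
def apply_to_key(dicts, key, func):
    """Return new dicts with func applied to the value at key; inputs are not mutated."""
    return [{**d, key: func(d[key])} if key in d else dict(d) for d in dicts]

d = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
new_d = apply_to_key(d, 'a', lambda v: v * 100)
print(new_d)
```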

Python How to Extract data from Nested Dict

I have output from python networkX code:
flow_value, flow_dict = nx.maximum_flow(T, 'O', 'T')
print(flow_dict)
# Output as follows:
#{'O': {'A': 4, 'B': 6, 'C': 4}, 'A': {'B': 1, 'D': 3}, 'B': {'C': 0, 'E': 3,'D': 4}, 'C': {'E': 4}, 'E': {'D': 1, 'T': 6}, 'D': {'T': 8}, 'T': {}}
I want to extract all the data in the form looks like:
#('O','A',4),('O','B','6'),('O','C','4'),('A','B',1),......,('D','T',8)
Is there a way to traverse the nested dict and get the data I need?
I tried this and it works. It uses some type checking to capture only string values:
def retrieve_all_strings_from_dict(nested_dict, keys_to_ignore=None):
    values = []
    if not keys_to_ignore:
        keys_to_ignore = []
    elif isinstance(keys_to_ignore, str):
        # the original called an undefined to_list() helper here; assumed to wrap a single key in a list
        keys_to_ignore = [keys_to_ignore]
    if not isinstance(nested_dict, dict):
        return values
    # the list grows while it is iterated, so nested dicts are visited breadth-first
    dict_stack = [nested_dict]
    for dict_var in dict_stack:
        data_list = [v for k, v in dict_var.items() if isinstance(v, str) and k not in keys_to_ignore]
        additional_dicts = [v for k, v in dict_var.items() if isinstance(v, dict)]
        dict_stack.extend(additional_dicts)
        values.extend(data_list)
    return values
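The function above collects only string values, whereas the flow values in the question are integers. For the (source, target, flow) triples the question asks for, a nested comprehension over the two dictionary levels may be enough. A minimal sketch, with flow_dict copied from the question and flow_edges as a hypothetical name:

```python
flow_dict = {'O': {'A': 4, 'B': 6, 'C': 4}, 'A': {'B': 1, 'D': 3},
             'B': {'C': 0, 'E': 3, 'D': 4}, 'C': {'E': 4},
             'E': {'D': 1, 'T': 6}, 'D': {'T': 8}, 'T': {}}

def flow_edges(nested):
    """Flatten {source: {target: flow}} into a list of (source, target, flow) tuples."""
    return [(src, dst, flow)
            for src, targets in nested.items()
            for dst, flow in targets.items()]

edges = flow_edges(flow_dict)
print(edges)
```

Nodes with no outgoing flow, such as 'T', contribute nothing to the list.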
