Merge dicts in one and find mean

I am new to programming.
I have a list of dicts:
[{'Program Analysis': 0},
 {'Algorithms': 0},
 {'Number systems': 0},
 {'Game theory': 1},
 {'Algorithms': 1},
 {'Number systems': 0},
 {'Program Analysis': 0}]
I want to merge all the dicts into one, collecting the values for each key:
{'Program Analysis': [0, 0],
 'Algorithms': [0, 1],
 'Number systems': [0, 0],
 'Game theory': [1]}
and then find the mean of each list:
{'Program Analysis': 0,
 'Algorithms': 0.5,
 'Number systems': 0,
 'Game theory': 1}

def avg(num_list):
    # Arithmetic mean of a non-empty list of numbers.
    total = 0
    for num in num_list:
        total += num
    return total * 1.0 / len(num_list)

my_list = [{'Program Analysis': 0}, {'Algorithms': 0}, {'Number systems': 0}, {'Game theory': 1}, {'Algorithms': 1}, {'Number systems': 0}, {'Program Analysis': 0}]

# Collect all values for each key into a list.
result = dict()
for d in my_list:
    for name, num in d.items():
        if name in result:
            result[name].append(num)
        else:
            result[name] = [num]

# Replace each list with its mean.
for name, num_list in result.items():
    result[name] = avg(num_list)

You can use a defaultdict to group the data and the mean function from the statistics module to calculate the averages:
from collections import defaultdict
from statistics import mean

data = [
    {'Program Analysis': 0},
    {'Algorithms': 0},
    {'Number systems': 0},
    {'Game theory': 1},
    {'Algorithms': 1},
    {'Number systems': 0},
    {'Program Analysis': 0}
]

# Group the values by key.
grouped_data = defaultdict(list)
for d in data:
    for k, v in d.items():
        grouped_data[k].append(v)
grouped_data = dict(grouped_data)
print(grouped_data)
# {'Program Analysis': [0, 0], 'Algorithms': [0, 1], 'Number systems': [0, 0], 'Game theory': [1]}

# Compute the mean of each group.
mean_data = {
    k: mean(v) for k, v in grouped_data.items()
}
print(mean_data)
# {'Program Analysis': 0, 'Algorithms': 0.5, 'Number systems': 0, 'Game theory': 1}
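If the intermediate lists are not needed, a running total and count per key is enough. The sketch below is one possible variant rather than the method used in the answers above; the helper name mean_by_key is made up for illustration, and it assumes every value is numeric.

```python
def mean_by_key(dicts):
    """Return {key: mean of all values seen for that key} (hypothetical helper)."""
    totals = {}
    counts = {}
    for d in dicts:
        for key, value in d.items():
            # Keep a running sum and a running count instead of a list.
            totals[key] = totals.get(key, 0) + value
            counts[key] = counts.get(key, 0) + 1
    return {key: totals[key] / counts[key] for key in totals}


data = [
    {'Program Analysis': 0},
    {'Algorithms': 0},
    {'Number systems': 0},
    {'Game theory': 1},
    {'Algorithms': 1},
    {'Number systems': 0},
    {'Program Analysis': 0},
]
means = mean_by_key(data)
print(means)
```

Because this uses true division, every mean comes back as a float, even for keys whose values are all integers.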

Related

python nested dictionary to pandas DataFrame

main_dict = {
    'NSE:ACC': {'average_price': 0,
                'buy_quantity': 0,
                'depth': {'buy': [{'orders': 0, 'price': 0, 'quantity': 0},
                                  {'orders': 0, 'price': 0, 'quantity': 0},
                                  {'orders': 0, 'price': 0, 'quantity': 0},
                                  {'orders': 0, 'price': 0, 'quantity': 0},
                                  {'orders': 0, 'price': 0, 'quantity': 0}],
                          'sell': [{'orders': 0, 'price': 0, 'quantity': 0},
                                   {'orders': 0, 'price': 0, 'quantity': 0},
                                   {'orders': 0, 'price': 0, 'quantity': 0},
                                   {'orders': 0, 'price': 0, 'quantity': 0},
                                   {'orders': 0, 'price': 0, 'quantity': 0}]},
                'instrument_token': 5633,
                'last_price': 2488.9,
                'last_quantity': 0,
                'last_trade_time': '2022-09-23 15:59:10',
                'lower_circuit_limit': 2240.05,
                'net_change': 0,
                'ohlc': {'close': 2555.7,
                         'high': 2585.5,
                         'low': 2472.2,
                         'open': 2575},
                'oi': 0,
                'oi_day_high': 0,
                'oi_day_low': 0,
                'sell_quantity': 0,
                'timestamp': '2022-09-23 18:55:17',
                'upper_circuit_limit': 2737.75,
                'volume': 0},
}
I want to convert this dict to a pandas DataFrame, for example:
symbol   last_price  net_change  Open  High    Low     Close
NSE:ACC  2488.9      0           2575  2585.5  2472.2  2555.7
I tried pd.DataFrame.from_dict(main_dict), but it does not work.
Please suggest the best approach.

I would first select the necessary data from your dict and then pass that as input to pd.DataFrame():
import pandas as pd

# Build one flat record per symbol, keeping only the fields of interest.
df_input = [{
    "symbol": symbol,
    "last_price": main_dict.get(symbol).get("last_price"),
    "net_change": main_dict.get(symbol).get("net_change"),
    "open": main_dict.get(symbol).get("ohlc").get("open"),
    "high": main_dict.get(symbol).get("ohlc").get("high"),
    "low": main_dict.get(symbol).get("ohlc").get("low"),
    "close": main_dict.get(symbol).get("ohlc").get("close")
} for symbol in main_dict]
df = pd.DataFrame(df_input)
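One thing to watch with the chained .get() calls: if a symbol has no "ohlc" entry, .get("ohlc") returns None and the next .get() raises AttributeError. A possible way to guard against that is sketched below using only the standard library. The helper name quote_to_row and the "NSE:XYZ" entry are invented for illustration, and the sample is a trimmed-down stand-in for main_dict.

```python
def quote_to_row(symbol, quote):
    """Flatten one quote into a single-level dict (hypothetical helper)."""
    # Fall back to an empty dict when "ohlc" is missing or None.
    ohlc = quote.get("ohlc") or {}
    return {
        "symbol": symbol,
        "last_price": quote.get("last_price"),
        "net_change": quote.get("net_change"),
        "open": ohlc.get("open"),
        "high": ohlc.get("high"),
        "low": ohlc.get("low"),
        "close": ohlc.get("close"),
    }


# Trimmed-down sample in the same shape as main_dict.
sample = {
    "NSE:ACC": {
        "last_price": 2488.9,
        "net_change": 0,
        "ohlc": {"close": 2555.7, "high": 2585.5, "low": 2472.2, "open": 2575},
    },
    "NSE:XYZ": {"last_price": 10.0},  # made-up entry with no "ohlc"
}
rows = [quote_to_row(symbol, quote) for symbol, quote in sample.items()]
```

The resulting list of flat dicts can be passed to pd.DataFrame(rows) in the same way as df_input above; fields that were absent simply come through as missing values.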

How to count duplicate dictionaries inside a list?

I was wondering if anyone could help me with counting duplicate dictionaries. I have this list:
a = [{'key1':1, 'key2':2, 'key3':3, 'count': 0},
     {'key1':1, 'key2':2, 'key3':4, 'count': 0},
     {'key1':3, 'key2':2, 'key3':4, 'count': 0},
     {'key1':1, 'key2':2, 'key3':3, 'count': 0}]
I'm looking to count all duplicates: once a match is found, remove the copy and increment the remaining dictionary's ['count'] by 1. So the final result might look like:
a = [{'key1':1, 'key2':2, 'key3':3, 'count': 2},
     {'key1':1, 'key2':2, 'key3':4, 'count': 0},
     {'key1':3, 'key2':2, 'key3':4, 'count': 0}]
I tried a simple approach like the following, which didn't work:
a = [{'key1':1, 'key2':2, 'key3':3, 'count': 0},
     {'key1':1, 'key2':2, 'key3':4, 'count': 0},
     {'key1':3, 'key2':2, 'key3':4, 'count': 0},
     {'key1':1, 'key2':2, 'key3':3, 'count': 0}]
for i in range(len(a)):
    for n in range(len(a)):
        if a[i]['key1'] == a[n]['key1'] and a[i]['key2'] == a[n]['key2'] and a[i]['key3'] == a[n]['key3']:
            a[i]['count'] += 1
            del a[n]
I was also thinking there should be a simpler approach. Thanks.

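You can leverage the Counter class in the collections module. Note that the dict | operator used here requires Python 3.9 or later, and the unpacking assumes 'count' is the last key in each dict.
import collections

# Count identical dicts by converting each one to a hashable tuple of items.
count = collections.Counter([tuple(d.items()) for d in a])
# Rebuild each dict, replacing its 'count' value with the number of occurrences.
[dict(k) | {c: v} for (*k, (c, _)), v in count.items()]
This looks like:
[{'key1': 1, 'key2': 2, 'key3': 3, 'count': 2},
 {'key1': 1, 'key2': 2, 'key3': 4, 'count': 1},
 {'key1': 3, 'key2': 2, 'key3': 4, 'count': 1}]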
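For Python versions without the dict | operator, or when the 'count' key may not come last or may already hold different values, the same idea can be written as an explicit loop. This is only a sketch: the function name count_duplicates and its count_key parameter are made up for illustration, and it assumes all values are hashable.

```python
def count_duplicates(dicts, count_key="count"):
    """Merge dicts that are equal apart from count_key (hypothetical helper)."""
    merged = {}
    for d in dicts:
        # Identify a dict by its items, ignoring the counter field and key order.
        identity = frozenset((k, v) for k, v in d.items() if k != count_key)
        if identity not in merged:
            # Keep a copy of the first occurrence with the counter reset.
            merged[identity] = {**d, count_key: 0}
        merged[identity][count_key] += 1
    return list(merged.values())


a = [{'key1': 1, 'key2': 2, 'key3': 3, 'count': 0},
     {'key1': 1, 'key2': 2, 'key3': 4, 'count': 0},
     {'key1': 3, 'key2': 2, 'key3': 4, 'count': 0},
     {'key1': 1, 'key2': 2, 'key3': 3, 'count': 0}]
deduplicated = count_duplicates(a)
```

Like the Counter version, this records the number of occurrences, so unique entries end up with a count of 1 rather than 0. The input list is left unmodified.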

Cumulative Dictionary

I am trying to write a Python function where, for each key (the dates), the value is the sum of that day's result and the results of all previous days (a running total, somewhat like the logic of the Fibonacci sequence).
For example, I have:
{20200516: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 0}, 20200517: {'Level1': 0, 'Level2': 0, 'Level3': 0, 'Level4': 1}, 20200518: {'Level1': 1, 'Level2': 0, 'Level3': 0, 'Level4': 0}, 20200519: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 1}}
but I want to have:
{20200516: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 0}, 20200517: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 1}, 20200518: {'Level1': 1, 'Level2': 1, 'Level3': 0, 'Level4': 1}, 20200519: {'Level1': 1, 'Level2': 2, 'Level3': 0, 'Level4': 2}}
What I have done until now:
def summing(d):
    '''
    each key after the first one is the sum of the one before and its own result
    >>> {20200516: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 0},
         20200517: {'Level1': 0, 'Level2': 0, 'Level3': 0, 'Level4': 1},
         20200518: {'Level1': 1, 'Level2': 0, 'Level3': 0, 'Level4': 0},
         20200519: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 1}}
    {20200516: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 0},
     20200517: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 1},
     20200518: {'Level1': 1, 'Level2': 1, 'Level3': 0, 'Level4': 1},
     20200519: {'Level1': 1, 'Level2': 2, 'Level3': 0, 'Level4': 2}}
    '''
    #STILL IN PROGRESS
    c={}
    for key in d:
        if key == 20200516:
            c[20200516]=d[20200516]
        else:
            c[key]=d[key-1]+d[key]
    return c

You made a good effort, but you can't just add dicts like that. Here's a minimal change to get from your input to the desired output, using a dict comprehension to add the values for each entry in the daily record:
from pprint import pprint

def summing_oneday(d1, d2):
    # Add the values of two daily records key by key.
    return {key: d1[key] + d2[key] for key in d2}

def summing(data):
    result = {}
    for day in sorted(data.keys()):
        if not result:
            # The first day has nothing before it to add.
            result[day] = data[day]
        else:
            result[day] = summing_oneday(previous, data[day])
        # Remember the running total for the next day.
        previous = result[day]
    return result

data = {20200516: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 0}, 20200517: {'Level1': 0, 'Level2': 0, 'Level3': 0, 'Level4': 1}, 20200518: {'Level1': 1, 'Level2': 0, 'Level3': 0, 'Level4': 0}, 20200519: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 1}}
pprint(summing(data))
I'm assuming all the keys are present in all the daily records. Otherwise we'll have to deal with that.
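If some daily records might lack a key, one possible way to handle it is to treat a missing level as 0 and let itertools.accumulate carry the running total. This is a sketch of a different technique, not the code above; the names add_levels and cumulative are made up for illustration.

```python
from itertools import accumulate


def add_levels(running, today):
    """Add two records key by key, treating a missing key as 0 (hypothetical helper)."""
    keys = running.keys() | today.keys()
    return {k: running.get(k, 0) + today.get(k, 0) for k in sorted(keys)}


def cumulative(data):
    """Return a dict whose value for each day is the total of all days so far."""
    days = sorted(data)
    # accumulate yields the first record unchanged, then each running total.
    totals = accumulate((data[day] for day in days), add_levels)
    return dict(zip(days, totals))


data = {
    20200516: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 0},
    20200517: {'Level1': 0, 'Level2': 0, 'Level3': 0, 'Level4': 1},
    20200518: {'Level1': 1, 'Level2': 0, 'Level3': 0, 'Level4': 0},
    20200519: {'Level1': 0, 'Level2': 1, 'Level3': 0, 'Level4': 1},
}
result = cumulative(data)
```

A key that first appears on a later day is only included from that day onward; earlier days keep the keys they had.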

Cleanest way to sum list of nested dicts

Is there a cleaner/more pythonic way of summing the contents of a list of nested dicts? Here's what I'm doing, but I suspect that there may be a better way:
list_of_nested_dicts = [{'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}, 'class3': {'TP': 0, 'FP': 0, 'FN': 0}, 'class4': {'TP': 1, 'FP': 0, 'FN': 2}},
{'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}, 'class3': {'TP': 0, 'FP': 0, 'FN': 0}, 'class4': {'TP': 1, 'FP': 0, 'FN': 2}},
{'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}, 'class3': {'TP': 0, 'FP': 0, 'FN': 0}, 'class4': {'TP': 1, 'FP': 0, 'FN': 2}},
{'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}, 'class3': {'TP': 0, 'FP': 0, 'FN': 0}, 'class4': {'TP': 1, 'FP': 0, 'FN': 2}}]
total_counts = {k: {'TP': 0, 'FP': 0, 'FN': 0} for k in list_of_nested_dicts[0].keys()}
for d in list_of_nested_dicts:
    for label, counts_dict in d.items():
        for k, v in counts_dict.items():
            total_counts[label][k] += v
print(total_counts)
(Assuming all keys are exactly the same, but values could be any integer)

You can get slightly tighter code using collections (similar result to @blhsing):
import collections

counts = collections.defaultdict(collections.Counter)
for d in list_of_nested_dicts:
    for k, v in d.items():
        # Counter.update adds the counts instead of replacing them.
        counts[k].update(v)
This will give you a defaultdict of Counters instead of plain dicts, but they behave similarly. You can also explicitly cast them to dicts at the end if you want.
{'class1': {'FN': 8, 'FP': 0, 'TP': 4},
'class2': {'FN': 0, 'FP': 0, 'TP': 0},
'class3': {'FN': 0, 'FP': 0, 'TP': 0},
'class4': {'FN': 8, 'FP': 0, 'TP': 4}}
vs
defaultdict(<class 'collections.Counter'>,
{'class1': Counter({'FN': 8, 'TP': 4, 'FP': 0}),
'class2': Counter({'TP': 0, 'FP': 0, 'FN': 0}),
'class3': Counter({'TP': 0, 'FP': 0, 'FN': 0}),
'class4': Counter({'FN': 8, 'TP': 4, 'FP': 0})})
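The cast to plain dicts mentioned above could look like the sketch below. The two-record input is a shortened stand-in for list_of_nested_dicts, and the variable names label_counts and plain are made up for illustration.

```python
import collections

# Shortened stand-in for the list in the question.
list_of_nested_dicts = [
    {'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}},
    {'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}},
]

counts = collections.defaultdict(collections.Counter)
for d in list_of_nested_dicts:
    for label, label_counts in d.items():
        counts[label].update(label_counts)

# Convert the outer defaultdict and every inner Counter to plain dicts.
plain = {label: dict(counter) for label, counter in counts.items()}
print(plain)
```

Counter.update keeps keys whose total is zero, so entries such as 'FP': 0 survive the conversion.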

One thing in your code that stands out as "unclean" is the fact that you are hard-coding the keys of the sub-dicts in the initialization of total_counts. You can avoid such hard-coding by using the dict.setdefault and dict.get methods as you iterate over the items of the sub-dicts instead:
total_counts = {}
for d in list_of_nested_dicts:
    for label, counts_dict in d.items():
        for k, v in counts_dict.items():
            total_counts[label][k] = total_counts.setdefault(label, {}).get(k, 0) + v

Given a set of strings, create a dictionary of dictionaries using them as keys to entries with default values

I have this:
set_of_strings = {'abc', 'def', 'xyz'}
And I want to create this:
dict_of_dicts = {
    'abc': {'pr': 0, 'wt': 0},
    'def': {'pr': 0, 'wt': 0},
    'xyz': {'pr': 0, 'wt': 0}
}
What's the pythonic way? (Python 2.7)

Like this?
>>> set_of_strings = {'abc', 'def', 'xyz'}
>>> dict_of_dicts = {}
>>> for key in set_of_strings:
...     dict_of_dicts[key] = {'pr':0, 'wt':0}
...
>>> print dict_of_dicts
{'xyz': {'pr': 0, 'wt': 0}, 'abc': {'pr': 0, 'wt': 0}, 'def': {'pr': 0, 'wt': 0}}
As a dictionary comprehension:
>>> {k:{'pr':0, 'wt':0} for k in {'abc', 'def', 'xyz'}}
{'xyz': {'pr': 0, 'wt': 0}, 'abc': {'pr': 0, 'wt': 0}, 'def': {'pr': 0, 'wt': 0}}
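Alternatively, you can do something like the following (note that [value]*len(...) repeats references to the same dict, so every key ends up sharing one value object):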
>>> set_of_strings = {'abc', 'def', 'xyz'}
>>> value = {'pr': 0, 'wt': 0}
>>> dict(zip(set_of_strings, [value]*len(set_of_strings)))
{'xyz': {'pr': 0, 'wt': 0}, 'abc': {'pr': 0, 'wt': 0}, 'def': {'pr': 0, 'wt': 0}}

You can also use dict.fromkeys:
>>> d = dict.fromkeys({'abc', 'def', 'xyz'}, {'pr': 0, 'wt': 0})
>>> d
{'xyz': {'pr': 0, 'wt': 0}, 'abc': {'pr': 0, 'wt': 0}, 'def': {'pr': 0, 'wt': 0}}
NOTE:
The value specified ({'pr': 0, 'wt': 0}) is shared by all keys.
>>> d['xyz']['py'] = 1
>>> d
{'xyz': {'pr': 0, 'py': 1, 'wt': 0}, 'abc': {'pr': 0, 'py': 1, 'wt': 0}, 'def': {'pr': 0, 'py': 1, 'wt': 0}}

As the other answers show, there are several ways to achieve this, but IMO the most (only?) pythonic way is using a dict comprehension:
keys = ...
{ k: { 'pr': 0, 'wt': 0 } for k in keys }
If the values are immutable, dict.fromkeys is a good choice and is probably faster than a dict comprehension.
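The difference between the two approaches shows up as soon as one of the inner dicts is modified. The following is a small sketch to illustrate it; the variable names independent and shared are made up, and the keys are the ones from the question.

```python
keys = {'abc', 'def', 'xyz'}

# The comprehension evaluates the inner literal once per key,
# so each key gets its own dict.
independent = {k: {'pr': 0, 'wt': 0} for k in keys}
independent['abc']['pr'] = 5

# dict.fromkeys evaluates the value once and reuses that single object.
shared = dict.fromkeys(keys, {'pr': 0, 'wt': 0})
shared['abc']['pr'] = 5

print(independent['def']['pr'])  # still 0
print(shared['def']['pr'])       # 5, because all keys point at the same dict
```

With immutable values such as 0 or a string, this sharing is harmless, which is why dict.fromkeys is fine in that case.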
