I would like to create a nested dict in Python 3. I have the following list (from an SQL query):
[('madonna', 'Portland', 'Oregon', '0.70', '+5551234', 'music', datetime.date(2016, 9, 8), datetime.date(2016, 9, 1)), ('jackson', 'Laredo', 'Texas', '2.03', '+555345', 'none', datetime.date(2016, 5, 23), datetime.date(2016, 5, 16)), ('bohlen', 'P', 'P', '2.27', '+555987', 'PhD Student', datetime.date(2016, 9, 7))]
I would like to have the following output:
{madonna:{city:Portland, State:Oregon, Index: 0.70, Phone:+5551234, art:music, exp-date:2016, 9, 8, arrival-date:datetime.date(2016, 5, 23)},jackson:{city: Laredo, State:Texas........etc...}}
Can somebody show me some easy-to-understand code?
Here is what I tried:
from collections import defaultdict
usercheck = defaultdict(list)
for accname, div_ort, standort, raum, telefon, position, exp, dep in cur.fetchall():
    usercheck(accname).append[..]
but this doesn't work, and I can't get any further on my own.
You can use a dict comprehension to build the dictionary dynamically from the elements of the list:
import datetime

sql_list = [
    ('madonna', 'Portland', 'Oregon', '0.70', '+5551234', 'music', datetime.date(2016, 9, 8), datetime.date(2016, 9, 1)),
    ('jackson', 'Laredo', 'Texas', '2.03', '+555345', 'none', datetime.date(2016, 5, 23), datetime.date(2016, 5, 16)),
    ('bohlen', 'P', 'P', '2.27', '+555987', 'PhD Student', datetime.date(2016, 9, 7))
]
# the first element of each tuple becomes the outer key
sql_dict = {
    element[0]: {
        'city': element[1],
        'state': element[2],
        'index': element[3],
        'phone': element[4],
        'art': element[5],
    } for element in sql_list
}
Keep in mind that every item in the dictionary needs to have a key and a value, and in your example you have a few values with no key.
If you have a list of the columns, you can use the zip function:
from collections import defaultdict
import datetime
# list of columns returned from your database query
columns = ["city", "state", "index", "phone", "art", "exp-date", "arrival-date"]
usercheck = defaultdict(list)
for row in cur.fetchall():
    # the first column is the account name; pair the rest with the column names
    usercheck[row[0]] = defaultdict(list, zip(columns, row[1:]))
print(usercheck)
This will output a dictionary like:
defaultdict(<class 'list'>, {'madonna': defaultdict(<class 'list'>, {'city': 'Portland', 'state': 'Oregon', 'index': '0.70', 'phone': '+5551234', 'art': 'music', 'exp-date': datetime.date(2016, 9, 8), 'arrival-date': datetime.date(2016, 9, 1)}), 'jackson': defaultdict(<class 'list'>, {'city': 'Laredo', 'state': 'Texas', 'index': '2.03', 'phone': '+555345', 'art': 'none', 'exp-date': datetime.date(2016, 5, 23), 'arrival-date': datetime.date(2016, 5, 16)}), 'bohlen': defaultdict(<class 'list'>, {'city': 'P', 'state': 'P', 'index': '2.27', 'phone': '+555987', 'art': 'PhD Student', 'exp-date': datetime.date(2016, 9, 7), 'arrival-date': None})})
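If a row can come back shorter than the column list (as the 'bohlen' tuple in the question does), plain zip silently drops the missing key. A minimal sketch of one way around that, assuming the rows are available as a list called rows (a stand-in for cur.fetchall(), not part of the original answer), is to pad with itertools.zip_longest:

```python
import datetime
from itertools import zip_longest

columns = ["city", "state", "index", "phone", "art", "exp-date", "arrival-date"]

# Sample rows copied from the question; the last one is one value short.
rows = [
    ('madonna', 'Portland', 'Oregon', '0.70', '+5551234', 'music',
     datetime.date(2016, 9, 8), datetime.date(2016, 9, 1)),
    ('bohlen', 'P', 'P', '2.27', '+555987', 'PhD Student',
     datetime.date(2016, 9, 7)),
]

# zip_longest pads the shorter input with None, so every column gets a key.
usercheck = {
    name: dict(zip_longest(columns, values))
    for name, *values in rows
}

print(usercheck['bohlen']['arrival-date'])  # None
```

This produces plain dicts rather than defaultdicts, so a mistyped column name raises KeyError instead of quietly creating an empty list.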
When using defaultdict, the argument is a factory that is called to create the default value for a missing key, so for nested dictionaries pass dict rather than list:
from collections import defaultdict
usercheck = defaultdict(dict)
for accname, div_ort, standort, raum, telefon, position, exp, dep in cur.fetchall():
    usercheck[accname]['city'] = div_ort
    usercheck[accname]['state'] = standort
    ...
The keys in the dictionary are referenced using [key], not (key).
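To see the factory behaviour in isolation, here is a small self-contained sketch. The rows list is a hypothetical stand-in for cur.fetchall() with only three columns, chosen just to keep the example short:

```python
from collections import defaultdict

# Hypothetical stand-in for cur.fetchall(): (accname, city, state) only.
rows = [('madonna', 'Portland', 'Oregon'), ('jackson', 'Laredo', 'Texas')]

usercheck = defaultdict(dict)  # a missing key gets a fresh {} from dict()
for accname, city, state in rows:
    usercheck[accname]['city'] = city    # first access creates the inner dict
    usercheck[accname]['state'] = state

# Convert to a plain dict once filling is done, so later lookups of
# unknown names raise KeyError instead of creating empty entries.
nested = dict(usercheck)
print(nested)
```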
Related
I have this data :
[
{'name': 'INV/2021/0913', 'invoice_date': datetime.date(2021, 3, 12), 'qty_total': 5.0},
{'name': 'INV/2021/0965', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 6.0},
{'name': 'INV/2021/0966', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 7.0},
{'name': 'INV/2021/0967', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 3.0},
{'name': 'INV/2021/0992', 'invoice_date': datetime.date(2021, 3, 15), 'qty_total': 4.0}
]
As can be seen, the middle three dicts have the same date.
I want to combine the dictionaries that share an invoice_date and sum up their qty_total.
The name attribute should be set to "" for the combined dictionaries.
The result should look like this:
[
{'name': 'INV/2021/0913', 'invoice_date': datetime.date(2021, 3, 12), 'qty_total': 5.0},
{'name': '', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 16.0},
{'name': 'INV/2021/0992', 'invoice_date': datetime.date(2021, 3, 15), 'qty_total': 4.0}
]
Use itertools.groupby, which needs the input sorted by the grouping key:
from datetime import datetime
from itertools import groupby
l = [
{'name': 'INV/2021/0913', 'invoice_date': datetime(2021, 3, 12).date(), 'qty_total': 5.0},
{'name': 'INV/2021/0965', 'invoice_date': datetime(2021, 3, 14).date(), 'qty_total': 6.0},
{'name': 'INV/2021/0966', 'invoice_date': datetime(2021, 3, 14).date(), 'qty_total': 7.0},
{'name': 'INV/2021/0967', 'invoice_date': datetime(2021, 3, 14).date(), 'qty_total': 3.0},
{'name': 'INV/2021/0992', 'invoice_date': datetime(2021, 3, 15).date(), 'qty_total': 4.0}
]
res = []
for k, v in groupby(sorted(l, key=lambda x: x["invoice_date"]), key=lambda x: x["invoice_date"]):
    val = list(v)  # materialize the group so it can be measured and summed
    res.append(
        {"name": "" if len(val) > 1 else val[0]["name"], "invoice_date": k, "qty_total": sum(vals["qty_total"] for vals in val)}
    )
print(res)
Output
[{'name': 'INV/2021/0913',
'invoice_date': datetime.date(2021, 3, 12),
'qty_total': 5.0},
{'name': '', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 16.0},
{'name': 'INV/2021/0992',
'invoice_date': datetime.date(2021, 3, 15),
'qty_total': 4.0}]
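As an alternative sketch that avoids the sort, the same merge can be done in one pass with a plain dict keyed by date. The names invoices, merged and result are illustrative only, and the output order here follows the first appearance of each date rather than sorted order:

```python
import datetime

invoices = [
    {'name': 'INV/2021/0913', 'invoice_date': datetime.date(2021, 3, 12), 'qty_total': 5.0},
    {'name': 'INV/2021/0965', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 6.0},
    {'name': 'INV/2021/0966', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 7.0},
    {'name': 'INV/2021/0967', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 3.0},
    {'name': 'INV/2021/0992', 'invoice_date': datetime.date(2021, 3, 15), 'qty_total': 4.0},
]

merged = {}
for inv in invoices:
    day = inv['invoice_date']
    if day in merged:
        # second or later invoice on this date: blank the name, add the quantity
        merged[day]['name'] = ''
        merged[day]['qty_total'] += inv['qty_total']
    else:
        merged[day] = dict(inv)  # copy so the input list is left untouched

result = list(merged.values())
print(result)
```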
{'YOU': {'HE': {'EST': 8, 'OLM': 6}, 'SLO': {'WLR': 8}},
'ARE': {'KLP': {'EST': 6}, 'POL': {'WLR': 4}},
'DOING': {'TIS': {'OIL': 8}},
'GREAT': {'POL': {'EOL': 6}},
'WORK': {'KOE': {'RIW': 8, 'PNG': 4}, 'ROE': {'ERC': 8, 'WQD': 6}},
'KEEP': {'PAR': {'KOM': 8, 'RTW': 6}, 'PIL': {'XCE': 4, 'ACE': 8}},
'ROCKING': {'OUL': {'AZS': 6, 'RVX': 8}}}
I need to perform a calculation on the numbers in this dictionary.
Eg: {'YOU': {'HE': {'EST': 8, 'OLM': 6}, 'SLO': {'WLR': 8}},
'WORK': {'KOE': {'RIW': 8, 'PNG': 4}, 'ROE': {'ERC': 8, 'WQD': 6}}} for this example the output would be
[(8+6)x8]+[(8+4)x(8+6)]
[14x8]+[12x14]
112+168
280
Following is the code I tried :
a = [tuple([k]+list(v.keys())+list(j.values())) for k,v in data.items() for i,j in v.items()]
and it gives :
[('YOU', 'HE', 'SLO', 8, 6),
('YOU', 'HE', 'SLO', 8),
('ARE', 'KLP', 'POL', 6),
('ARE', 'KLP', 'POL', 4),
('DOING', 'TIS', 8),
('GREAT', 'POL', 6),
('WORK', 'KOE', 'ROE', 8, 4),
('WORK', 'KOE', 'ROE', 8, 6),
('KEEP', 'PAR', 'PIL', 8, 6),
('KEEP', 'PAR', 'PIL', 4, 8),
('ROCKING', 'OUL', 6, 8)]
The rules aren't well-defined, but I'll give it a shot anyway. I am assuming you only want this calculation to apply to keys YOU and WORK in your nested dictionary. I think a list comprehension will get pretty complicated, and it's more readable to work with loops.
For each of the keys YOU and WORK, I summed the values of each innermost dict (8+6 and 8 for YOU; 8+4 and 8+6 for WORK), multiplied those sums together (14*8 for YOU and 12*14 for WORK), then added the products to get the result, 280.
dict_nested = {'YOU': {'HE': {'EST': 8, 'OLM': 6}, 'SLO': {'WLR': 8}},
'ARE': {'KLP': {'EST': 6}, 'POL': {'WLR': 4}},
'DOING': {'TIS': {'OIL': 8}},
'GREAT': {'POL': {'EOL': 6}},
'WORK': {'KOE': {'RIW': 8, 'PNG': 4}, 'ROE': {'ERC': 8, 'WQD': 6}},
'KEEP': {'PAR': {'KOM': 8, 'RTW': 6}, 'PIL': {'XCE': 4, 'ACE': 8}},
'ROCKING': {'OUL': {'AZS': 6, 'RVX': 8}}}
keys = ['YOU','WORK']
result = 0
for key in keys:
    inner_keys = dict_nested[key].keys()
    # multiply together the sums of the values under each inner key
    inner_product = 1
    for inner_key in inner_keys:
        inner_product *= sum(dict_nested[key][inner_key].values())
    # print(inner_product)
    result += inner_product
Output:
>>> result
280
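The same rule can be written more compactly with math.prod (Python 3.8+). This is only a sketch under the same assumption as above, namely that the calculation applies to an explicit list of top-level keys; the function name sum_of_products is made up for illustration:

```python
from math import prod

def sum_of_products(nested, keys):
    """For each key, multiply the sums of its inner dicts, then add the products."""
    return sum(
        prod(sum(inner.values()) for inner in nested[key].values())
        for key in keys
    )

data = {
    'YOU': {'HE': {'EST': 8, 'OLM': 6}, 'SLO': {'WLR': 8}},
    'WORK': {'KOE': {'RIW': 8, 'PNG': 4}, 'ROE': {'ERC': 8, 'WQD': 6}},
}

total = sum_of_products(data, ['YOU', 'WORK'])
print(total)  # (14 * 8) + (12 * 14) = 280
```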
NOTE
Don't use eval by any means; it is insecure ("eval is evil").
For more details about the harm eval can do (there is a lot written about it; I've just cherry-picked one source), read here.
Some Inspiration Towards a Solution
As others, smarter than me, have already noted, I couldn't find a reasonable explanation for how the operators are assigned in the example you've provided.
However, this is a little try - hope it will help you with the challenge.
So here you go:
import json
d = {'YOU': {'HE': {'EST': 8, 'OLM': 6}, 'SLO': {'WLR': 8}}, 'WORK': {'KOE': {'RIW': 8, 'PNG': 4}, 'ROE': {'ERC': 8, 'WQD': 6}}}
# Convert the dictionary to a JSON string
r = json.dumps(d)
# Convert the string to a list of characters
chars = list(r)
# Characters allowed in the final expression
legal_chars = ['{', '}', ','] + [str(n) for n in range(10)]
# Keep only the legal characters
filtered_chars = [x for x in chars if x in legal_chars]
# Replace {} with () and , with +
expression = ''.join(filtered_chars).replace('{', '(').replace('}', ')').replace(',', '+')
# Evaluate the expression
result = eval(expression)
# (((8+6)+(8))+((8+4)+(8+6)))=48
print(f'{expression}={result}')
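Since the expression built above only ever adds numbers, the same total can be obtained without eval by walking the dictionary recursively. This is a hedged sketch (the helper name leaf_sum is invented here), and like the snippet above it only sums the leaves; it does not implement the multiply-then-add rule from the question:

```python
def leaf_sum(node):
    """Add up every number found at any depth of a nested dict."""
    if isinstance(node, dict):
        return sum(leaf_sum(child) for child in node.values())
    return node  # a leaf: assumed to be a number

d = {'YOU': {'HE': {'EST': 8, 'OLM': 6}, 'SLO': {'WLR': 8}},
     'WORK': {'KOE': {'RIW': 8, 'PNG': 4}, 'ROE': {'ERC': 8, 'WQD': 6}}}

total = leaf_sum(d)
print(total)  # 48
```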
I'm having some trouble accessing a value that is inside an array that contains a dictionary and another array.
It looks like this:
[{'name': 'Alex',
'number_of_toys': [{'classification': 3, 'count': 383},
{'classification': 1, 'count': 29},
{'classification': 0, 'count': 61}],
'total_toys': 473},
{'name': 'John',
'number_of_toys': [{'classification': 3, 'count': 8461},
{'classification': 0, 'count': 3825},
{'classification': 1, 'count': 1319}],
'total_toys': 13605}]
I want to access the 'count' number for each 'classification'. For example, for 'name' Alex, if 'classification' is 3, then the code returns the 'count' of 383, and so on for the other classifications and names.
Thanks for your help!
Not sure what your question asks, but if it's just a mapping exercise this will get you on the right track.
def get_toys(personDict):
    # (classification, count) pairs for one person
    person_toys = personDict.get('number_of_toys')
    return [(toys.get('classification'), toys.get('count')) for toys in person_toys]

def get_person_toys(database):
    # (name, pairs) for every person in the list
    return [(personDict.get('name'), get_toys(personDict)) for personDict in database]
Calling get_person_toys on the data from the question gives:
[('Alex', [(3, 383), (1, 29), (0, 61)]), ('John', [(3, 8461), (0, 3825), (1, 1319)])]
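If the goal is to look up one count directly by name and classification, a nested dict is a convenient shape for that. The following is a sketch, assuming names are unique and each classification appears at most once per person; the variable counts is introduced here for illustration:

```python
data = [{'name': 'Alex',
         'number_of_toys': [{'classification': 3, 'count': 383},
                            {'classification': 1, 'count': 29},
                            {'classification': 0, 'count': 61}],
         'total_toys': 473},
        {'name': 'John',
         'number_of_toys': [{'classification': 3, 'count': 8461},
                            {'classification': 0, 'count': 3825},
                            {'classification': 1, 'count': 1319}],
         'total_toys': 13605}]

# {name: {classification: count}}
counts = {
    person['name']: {toy['classification']: toy['count']
                     for toy in person['number_of_toys']}
    for person in data
}

print(counts['Alex'][3])  # 383
print(counts['John'][0])  # 3825
```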
This isn't as elegant as the previous answer because it doesn't iterate over the values, but if you want to select specific elements, this is one way to do that:
data = [{'name': 'Alex',
'number_of_toys': [{'classification': 3, 'count': 383},
{'classification': 1, 'count': 29},
{'classification': 0, 'count': 61}],
'total_toys': 473},
{'name': 'John',
'number_of_toys': [{'classification': 3, 'count': 8461},
{'classification': 0, 'count': 3825},
{'classification': 1, 'count': 1319}],
'total_toys': 13605}]
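import pandas as pd
df = pd.DataFrame(data)
# select by row label and column name rather than by integer position
print(df.loc[0, 'name'])
print(df.loc[0, 'number_of_toys'][0]['classification'])
print(df.loc[0, 'number_of_toys'][0]['count'])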
which gives:
Alex
3
383
I want to convert this nested dict into a list of tuples in Python.
Before :
{'NewYork': {'Paris': 12, 'Hawaii': 8, 'Tokyo': 11, 'Incheon': 12, 'LA': 2},
'Beijing': {'Hongkong': 3, 'Cebu': 5},
'Incheon': {'Cairo': 10, 'LA': 11, 'Tokyo': 1},
'Tokyo': {'NewYork': 12, 'Paris': 14, 'LA': 9}}
After :
[("NewYork","Paris",12),
("NewYork","Hawaii",8),
("NewYork","Tokyo",11),
("NewYork","Incheon",12),
("NewYork","LA",2),
("Beijing","Hongkong",3),
("Beijing","Cebu",5),
("Incheon","Cairo",10),
("Incheon","LA",11),
("Incheon","Tokyo",1),
("Tokyo","NewYork",12),
("Tokyo","Paris",14),
("Tokyo","LA",9)]
How can I do this?
>>> before = {'NewYork': {'Paris': 12, 'Hawaii': 8, 'Tokyo': 11, 'Incheon': 12, 'LA': 2},
... 'Beijing': {'Hongkong': 3, 'Cebu': 5}, 'Incheon': {'Cairo': 10, 'LA': 11, 'Tokyo': 1},
... 'Tokyo': {'NewYork': 12, 'Paris': 14, 'LA': 9}}
>>>
>>> print [(key,k,v) for key,val in before.iteritems() for k,v in val.iteritems()]
[('NewYork', 'Paris', 12), ('NewYork', 'LA', 2), ('NewYork', 'Hawaii', 8), ('NewYork', 'Incheon', 12), ('NewYork', 'Tokyo', 11), ('Beijing', 'Hongkong', 3), ('Beijing', 'Cebu', 5), ('Incheon', 'Cairo', 10), ('Incheon', 'Tokyo', 1), ('Incheon', 'LA', 11), ('Tokyo', 'NewYork', 12), ('Tokyo', 'Paris', 14), ('Tokyo', 'LA', 9)]
You can build the list with a list comprehension that has two nested for clauses:
>>> city_pairings = {'NewYork': {'Paris': 12,
... 'Hawaii': 8,
... 'Tokyo': 11,
... 'Incheon': 12,
... 'LA': 2},
... 'Beijing': {'Hongkong': 3,
... 'Cebu': 5},
... 'Incheon': {'Cairo': 10,
... 'LA': 11,
... 'Tokyo': 1},
... 'Tokyo': {'NewYork': 12,
... 'Paris': 14,
... 'LA': 9}}
>>> flat = [(city, other_city, value)
... for city, pairings in city_pairings.iteritems()
... for other_city, value in pairings.iteritems()]
>>> from pprint import pprint
>>> pprint(flat)
[('NewYork', 'Paris', 12),
('NewYork', 'LA', 2),
('NewYork', 'Hawaii', 8),
('NewYork', 'Incheon', 12),
('NewYork', 'Tokyo', 11),
('Beijing', 'Hongkong', 3),
('Beijing', 'Cebu', 5),
('Incheon', 'Cairo', 10),
('Incheon', 'Tokyo', 1),
('Incheon', 'LA', 11),
('Tokyo', 'NewYork', 12),
('Tokyo', 'Paris', 14),
('Tokyo', 'LA', 9)]
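The inner cities aren't exactly in the order you wanted because this session ran on Python 2, where a plain dict stores its keys in an arbitrary order (iteritems() is also Python 2 only). To keep the cities in the order they were inserted there, you'll have to use something else, such as collections.OrderedDict. On Python 3.7 and later, plain dicts preserve insertion order and the method is items().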
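For reference, a Python 3 sketch of the same flattening, using items() in place of iteritems(). On Python 3.7+ the tuples come out in insertion order, which matches the order asked for in the question:

```python
city_pairings = {
    'NewYork': {'Paris': 12, 'Hawaii': 8, 'Tokyo': 11, 'Incheon': 12, 'LA': 2},
    'Beijing': {'Hongkong': 3, 'Cebu': 5},
    'Incheon': {'Cairo': 10, 'LA': 11, 'Tokyo': 1},
    'Tokyo': {'NewYork': 12, 'Paris': 14, 'LA': 9},
}

# Outer loop walks the origin cities, inner loop walks each city's pairings.
flat = [
    (city, other_city, value)
    for city, pairings in city_pairings.items()
    for other_city, value in pairings.items()
]

for triple in flat:
    print(triple)
```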
This will do the trick:
dict.items()
I'm trying to write a function, in an elegant way, that will group a list of dictionaries and aggregate (sum) the values of like-keys.
Example:
my_dataset = [
    {
        'date': datetime.date(2013, 1, 1),
        'id': 99,
        'value1': 10,
        'value2': 10
    },
    {
        'date': datetime.date(2013, 1, 1),
        'id': 98,
        'value1': 10,
        'value2': 10
    },
    {
        'date': datetime.date(2013, 1, 2),
        'id': 99,
        'value1': 10,
        'value2': 10
    }
]
group_and_sum_dataset(my_dataset, 'date', ['value1', 'value2'])
"""
Should return:
[
    {
        'date': datetime.date(2013, 1, 1),
        'value1': 20,
        'value2': 20
    },
    {
        'date': datetime.date(2013, 1, 2),
        'value1': 10,
        'value2': 10
    }
]
"""
I've tried doing this using itertools for the groupby and summing each like-key value pair, but am missing something here. Here's what my function currently looks like:
def group_and_sum_dataset(dataset, group_by_key, sum_value_keys):
    keyfunc = operator.itemgetter(group_by_key)
    dataset.sort(key=keyfunc)
    new_dataset = []
    for key, index in itertools.groupby(dataset, keyfunc):
        d = {group_by_key: key}
        d.update({k:sum([item[k] for item in index]) for k in sum_value_keys})
        new_dataset.append(d)
    return new_dataset
You can use collections.Counter and collections.defaultdict.
Using a dict this can be done in O(N), while sorting requires O(NlogN) time.
from collections import defaultdict, Counter
def solve(dataset, group_by_key, sum_value_keys):
    dic = defaultdict(Counter)
    for item in dataset:
        key = item[group_by_key]
        vals = {k:item[k] for k in sum_value_keys}
        dic[key].update(vals)
    return dic
>>> d = solve(my_dataset, 'date', ['value1', 'value2'])
>>> d
defaultdict(<class 'collections.Counter'>,
{
datetime.date(2013, 1, 2): Counter({'value2': 10, 'value1': 10}),
datetime.date(2013, 1, 1): Counter({'value2': 20, 'value1': 20})
})
The advantage of Counter is that its update method automatically adds the values of matching keys:
Example:
>>> c = Counter(**{'value1': 10, 'value2': 5})
>>> c.update({'value1': 7, 'value2': 3})
>>> c
Counter({'value1': 17, 'value2': 8})
Thanks, I forgot about Counter. I still wanted to maintain the output format and sorting of my returned dataset, so here's what my final function looks like:
def group_and_sum_dataset(dataset, group_by_key, sum_value_keys):
    container = defaultdict(Counter)
    for item in dataset:
        key = item[group_by_key]
        values = {k:item[k] for k in sum_value_keys}
        container[key].update(values)
    # list(...) is needed on Python 3, where items() returns a view, not a list
    new_dataset = [
        dict([(group_by_key, item[0])] + list(item[1].items()))
        for item in container.items()
    ]
    new_dataset.sort(key=lambda item: item[group_by_key])
    return new_dataset
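For completeness, the likely reason the original itertools.groupby attempt went wrong: each group that groupby yields is a one-shot iterator, so the first summed key consumes it and every later key sums an empty sequence. A sketch of the groupby version with that fixed by materializing each group first (it also sorts a copy instead of mutating the input, which is a choice made here, not something the question required):

```python
import datetime
import itertools
import operator

def group_and_sum_dataset(dataset, group_by_key, sum_value_keys):
    keyfunc = operator.itemgetter(group_by_key)
    new_dataset = []
    # sorted() returns a new list, so the caller's dataset is not reordered
    for key, group in itertools.groupby(sorted(dataset, key=keyfunc), keyfunc):
        rows = list(group)  # the group iterator can only be read once
        d = {group_by_key: key}
        d.update({k: sum(row[k] for row in rows) for k in sum_value_keys})
        new_dataset.append(d)
    return new_dataset

my_dataset = [
    {'date': datetime.date(2013, 1, 1), 'id': 99, 'value1': 10, 'value2': 10},
    {'date': datetime.date(2013, 1, 1), 'id': 98, 'value1': 10, 'value2': 10},
    {'date': datetime.date(2013, 1, 2), 'id': 99, 'value1': 10, 'value2': 10},
]

grouped = group_and_sum_dataset(my_dataset, 'date', ['value1', 'value2'])
print(grouped)
```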
Here's an approach using more_itertools where you simply focus on how to construct output.
Given
import datetime
import collections as ct
import more_itertools as mit
dataset = [
{"date": datetime.date(2013, 1, 1), "id": 99, "value1": 10, "value2": 10},
{"date": datetime.date(2013, 1, 1), "id": 98, "value1": 10, "value2": 10},
{"date": datetime.date(2013, 1, 2), "id": 99, "value1": 10, "value2": 10}
]
Code
# Step 1: Build helper functions
kfunc = lambda d: d["date"]
vfunc = lambda d: {k:v for k, v in d.items() if k.startswith("val")}
rfunc = lambda lst: sum((ct.Counter(d) for d in lst), ct.Counter())
# Step 2: Build a dict
reduced = mit.map_reduce(dataset, keyfunc=kfunc, valuefunc=vfunc, reducefunc=rfunc)
reduced
Output
defaultdict(None,
{datetime.date(2013, 1, 1): Counter({'value1': 20, 'value2': 20}),
datetime.date(2013, 1, 2): Counter({'value1': 10, 'value2': 10})})
The items are grouped by date and pertinent values are reduced as Counters.
Details
Steps
1. Build helper functions to customize construction of keys, values and reduced values in the final defaultdict. Here we want to:
   - group by date (kfunc)
   - build dicts keeping the "value*" parameters (vfunc)
   - aggregate the dicts (rfunc) by converting to collections.Counters and summing them. See an equivalent rfunc below+.
2. Pass in the helper functions to more_itertools.map_reduce.
Simple Groupby
... say in that example you wanted to group by id and date?
No problem.
>>> kfunc2 = lambda d: (d["date"], d["id"])
>>> mit.map_reduce(dataset, keyfunc=kfunc2, valuefunc=vfunc, reducefunc=rfunc)
defaultdict(None,
{(datetime.date(2013, 1, 1),
99): Counter({'value1': 10, 'value2': 10}),
(datetime.date(2013, 1, 1),
98): Counter({'value1': 10, 'value2': 10}),
(datetime.date(2013, 1, 2),
99): Counter({'value1': 10, 'value2': 10})})
Customized Output
While the resulting data structure clearly and concisely presents the outcome, the OP's expected output can be rebuilt as a simple list of dicts:
>>> [{**dict(date=k), **v} for k, v in reduced.items()]
[{'date': datetime.date(2013, 1, 1), 'value1': 20, 'value2': 20},
{'date': datetime.date(2013, 1, 2), 'value1': 10, 'value2': 10}]
For more on map_reduce, see the docs. Install via > pip install more_itertools.
+An equivalent reducing function:
import typing

def rfunc(lst: typing.List[dict]) -> ct.Counter:
    """Return reduced mappings from map-reduce values."""
    c = ct.Counter()
    for d in lst:
        c += ct.Counter(d)
    return c