I have two dictionaries: one is a nested dictionary and the other is a flat dictionary. I want to do some divisions:
dict1 = {'document1': {'a': 3, 'b': 1, 'c': 5}, 'document2': {'d': 2, 'e': 4}}
dict2 = {'document1': 28, 'document2': 36}
I want to divide the inner dictionary values from dict1 by the value of the matching document in dict2. The expected output would be:
dict3 = {'document1': {'a': 3/28, 'b': 1/28, 'c': 5/28}, 'document2': {'d': 2/36, 'e': 4/36}}
I tried using two for loops to iterate over both dictionaries, but the values get duplicated multiple times and I have no idea how to fix this. Does anyone have an idea of how to achieve this? I would appreciate it!
You can achieve this using dictionary comprehension.
dict3 = {} # create a new dictionary
# iterate over dict1's keys, look up the divisor in dict2, and divide dict1's inner values by it
for d in dict1:
    y = dict2[d]
    dict3[d] = {k: v / y for k, v in dict1[d].items()}
You can try the following code
dict1 = {'document1': {'a': 3, 'b': 1, 'c': 5},
         'document2': {'d': 2, 'e': 4}}
dict2 = {'document1': 28, 'document2': 36}
for k, v in dict1.items():
    for ki, vi in v.items():
        dict1[k][ki] /= dict2[k]
print(dict1)
# output
#{'document1': {'a': 0.10714285714285714, 'b': 0.03571428571428571, 'c': 0.17857142857142858},
#'document2': {'d': 0.05555555555555555, 'e': 0.1111111111111111}}
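The floats printed above are the decimal forms of 3/28, 1/28 and so on. If you want the exact ratios written in the question rather than floats, here is a minimal sketch using the standard library's fractions.Fraction. The name `exact` is only for illustration, and this version builds a new dictionary instead of modifying dict1 in place:

```python
from fractions import Fraction

dict1 = {'document1': {'a': 3, 'b': 1, 'c': 5}, 'document2': {'d': 2, 'e': 4}}
dict2 = {'document1': 28, 'document2': 36}

# Build a new nested dict of exact ratios; dict1 itself is left unchanged.
exact = {
    doc: {term: Fraction(count, dict2[doc]) for term, count in terms.items()}
    for doc, terms in dict1.items()
}

print(exact['document1']['a'])  # 3/28
print(exact['document2']['d'])  # 1/18 (Fraction reduces 2/36 automatically)
```

Call float() on any of the values when you need the decimal form.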
In one line, using nested dictionary comprehensions:
dict3 = {doc_key: {k: (v/doc_value) for k, v in dict1[doc_key].items()} for doc_key, doc_value in dict2.items()}
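All of the approaches above assume that every document appears in both dictionaries and that no total is zero. If that is not guaranteed, a guarded version might look like the sketch below. The function name `divide_by_totals` and the choice to skip problem documents (rather than raise) are my assumptions, not something the question specifies:

```python
def divide_by_totals(counts, totals):
    """Return a new nested dict with each inner value divided by its document total.

    Assumption for this sketch: documents missing from `totals`,
    or with a total of 0, are skipped.
    """
    result = {}
    for doc, inner in counts.items():
        total = totals.get(doc)
        if not total:  # covers both a missing key (None) and a zero total
            continue
        result[doc] = {key: value / total for key, value in inner.items()}
    return result


dict1 = {'document1': {'a': 3, 'b': 1, 'c': 5}, 'document2': {'d': 2, 'e': 4}}
dict2 = {'document1': 28}  # 'document2' deliberately left out

dict3 = divide_by_totals(dict1, dict2)
print(dict3)  # only 'document1' is present in the result
```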
I have a complex data structure: a list of lists of dictionaries, where some of those dictionaries in turn contain lists of dictionaries. I am trying to extract specific key:value pairs from the innermost nested dicts. Hopefully the example below shows what I am trying to achieve:
complex_data =
[[{'A': 'test1'},
{'A': 'test2'},
{'B': [{'C': {'testabc': {'A': 'xxx'}}},
{'C': {'test123': {'A': 'yyy'}, 'test456': {'A': '111abc'}}},
{'C': {'test123': {'A': 'yyy'}, 'test456': {'A': '111def'}}}]}],
.
.
[{'A': 'test11'},
{'A': 'test22'}],
.
.
[{'A': 'test33'},
{'A': 'test44'},
{'B': []}],
.
[{'A': 'test3'},
{'A': 'test4'},
{'B': [{'C': {'testabc': {'A': '111'}}},
{'C': {'test123': {'A': 'yyy'}, 'test456': {'A': '999abc'}}},
{'C': {'test123': {'A': 'yyy'}, 'test456': {'A': '999def'}}}]}]]
Now the output should be a nested list of dictionaries like:
desired_output = [[{'A': 'test1'}, {'A': 'test2'}, {'test456': {'A': '111def'}}],
.
.
[{'A': 'test3'}, {'A': 'test4'}, {'test456': {'A': '999def'}}]]
I am doing
for y in complex_data:
    desired_output.append([y[2]['B'][2]['C'] for y in row] for row in y)
But this doesn't work. The variable y doesn't iterate over list B. Can anyone please tell me what the issue is and how to resolve it? I am using Python 3.9.
Update: In some cases, the complete list B could be missing or could be empty {'B': []}.
Thanks in advance.
P.S: Please let me know if any info is missing or not clear.
The main idea here is to convert the data to a DataFrame, transform the third column, and then convert the DataFrame back to a list of rows.
Code:
Step 1:
import pandas as pd

df = pd.json_normalize(complex_data)
df[2] = df[2].apply(lambda x: {k:v for k , v in dict(map(dict.popitem, x['B']))['C'].items() if k=='test456'})
df
#Output
0 1 2
0 {'A': 'test1'} {'A': 'test2'} {'test456': {'A': '111def'}}
1 {'A': 'test3'} {'A': 'test4'} {'test456': {'A': '999def'}}
Step 2:
desired_output = df.values.tolist()
desired_output
#output
[[{'A': 'test1'}, {'A': 'test2'}, {'test456': {'A': '111def'}}],
[{'A': 'test3'}, {'A': 'test4'}, {'test456': {'A': '999def'}}]]
Update: you can handle a missing or empty value with a conditional expression (if..else), as below:
df[2].apply(lambda x: {} if len(x['B'])==0 else({} if not x['B'][-1] else ({'test456':x['B'][-1]['C']['test456']} if 'test456' in x['B'][-1]['C'].keys() else {})))
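If you would rather not pull in pandas, here is a plain-Python sketch of the same extraction. It rests on a few assumptions of mine: the function name `extract_last` and its `wanted` parameter are made up for illustration, the value is always taken from the last element of 'B', and rows with a missing or empty 'B' are kept with just their other dicts:

```python
def extract_last(rows, wanted='test456'):
    """For each row, keep the plain dicts and replace the 'B' dict with
    {wanted: ...} taken from the last element of the 'B' list, if present."""
    output = []
    for row in rows:
        new_row = []
        for item in row:
            if 'B' not in item:
                new_row.append(item)  # ordinary dict such as {'A': 'test1'}
                continue
            b_list = item['B']
            if not b_list:            # 'B' is empty: nothing to extract
                continue
            inner = b_list[-1].get('C', {})
            if wanted in inner:
                new_row.append({wanted: inner[wanted]})
        output.append(new_row)
    return output


complex_data = [
    [{'A': 'test1'}, {'A': 'test2'},
     {'B': [{'C': {'testabc': {'A': 'xxx'}}},
            {'C': {'test123': {'A': 'yyy'}, 'test456': {'A': '111abc'}}},
            {'C': {'test123': {'A': 'yyy'}, 'test456': {'A': '111def'}}}]}],
    [{'A': 'test11'}, {'A': 'test22'}],             # no 'B' at all
    [{'A': 'test33'}, {'A': 'test44'}, {'B': []}],  # empty 'B'
]

for row in extract_last(complex_data):
    print(row)
```

If rows without a usable 'B' should be dropped entirely, filter the result afterwards, for example by keeping only rows whose last dict contains the wanted key.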
In Python I have a list of dictionaries, each containing a dictionary.
list = [{a: {b:1, c:2}}, {d: {b:3, c:4}}, {a: {b:2, c:3}}]
I want one final list of dictionaries in which entries with the same outer key are merged, with the values of their inner dictionaries summed key by key, i.e. the result will be:
result = [{a: {b:3, c:5}}, {d: {b:3, c:4}}]
N.B.: every inner dictionary in the list contains the same number of key/value pairs.
Code:
lst = [{'a': {'b': 1, 'c': 2}}, {'d': {'b': 3, 'c': 4}}, {'a': {'b': 2, 'c': 3}}]
p = {}
for l in lst:
    for key, val in l.items():
        if key in p:
            # key seen before: add the new inner values to the running totals
            p.update({key: {k: p[key].get(k, 0) + val.get(k, 0) for k in set(p[key])}})
        else:
            # first occurrence: store a copy of the inner dict
            p[key] = dict(val)
Output:
{'a': {'c': 5, 'b': 3}, 'd': {'b': 3, 'c': 4}}
I don't know if it is the best way to do this, but here it goes:
First, I made a for loop to collect all of the primary and secondary keys:
# list containing the data
lista = [{'a': {'b':1, 'c':2}}, {'d': {'b':3, 'c':4}}, {'a': {'b':2, 'c':3}}]
# empty list with keys
primary_keys = []
secondary_keys = []
# for each dict in the list it appends the primary key and the secondary key
for dic in lista:
    for key in dic.keys():
        primary_keys.append(key)
        for key2 in dic[key].keys():
            secondary_keys.append(key2)
# prints all the keys
print('Primary keys:',primary_keys)
print('Secondary keys:',secondary_keys)
Results:
Primary keys: ['a', 'd', 'a']
Secondary keys: ['b', 'c', 'b', 'c', 'b', 'c']
Then I built a final dict with all the key combinations, initialised to zero:
# creates the final dict from the list
dict_final = dict.fromkeys(primary_keys)
# for all primary keys creates a secondary key
for pkey in dict_final.keys():
    dict_final[pkey] = dict.fromkeys(secondary_keys)
    # for all secondary keys puts a zero
    for skey in dict_final[pkey].keys():
        dict_final[pkey][skey] = 0
# prints the dict
print(dict_final)
Results:
{'a': {'b': 0, 'c': 0}, 'd': {'b': 0, 'c': 0}}
Finally, I looped through each dictionary item and added its values to the corresponding keys in the final dict:
# for each primary and secondary keys in the dic list sums into the final dict
for dic in lista:
    for pkey in dict_final.keys():
        for skey in dict_final[pkey].keys():
            try:
                dict_final[pkey][skey] += dic[pkey][skey]
            except KeyError:
                # this dict has no value for (pkey, skey); skip it
                pass
# prints the final dict
print(dict_final)
Results:
{'a': {'b': 3, 'c': 5}, 'd': {'b': 3, 'c': 4}}
By using defaultdict, we can simplify the logic a bit:
# Using default dict
from collections import defaultdict
original_list = [{'a': {'b':1, 'c':2}}, {'d': {'b':3, 'c':4}}, {'a': {'b':2, 'c':3}}]
out = defaultdict(lambda: defaultdict(int))
for outer_dict in original_list:
    for outer_key, inner_dict in outer_dict.items():
        for inner_key, inner_value in inner_dict.items():
            out[outer_key][inner_key] += inner_value
print(out)
out_list = [{key: dict(value) for key, value in out.items()}]
print(out_list)
Output:
defaultdict(<function <lambda> at 0x103c18a60>, {'a': defaultdict(<class 'int'>, {'b': 3, 'c': 5}), 'd': defaultdict(<class 'int'>, {'b': 3, 'c': 4})})
[{'a': {'b': 3, 'c': 5}, 'd': {'b': 3, 'c': 4}}]
Notes
out is a nested defaultdict: its values are themselves defaultdicts, whose values are integers.
After the three nested loops, we convert the defaultdict out to a list of dictionaries to conform to the output requirement.
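Note that the question asks for one single-key dictionary per outer key, i.e. [{a: ...}, {d: ...}]. Here is a sketch of a variant that produces exactly that shape, using collections.Counter for the inner sums. The variable names `totals` and `result` are just for illustration, and it assumes the inner values are numbers:

```python
from collections import Counter, defaultdict

original_list = [{'a': {'b': 1, 'c': 2}}, {'d': {'b': 3, 'c': 4}}, {'a': {'b': 2, 'c': 3}}]

totals = defaultdict(Counter)
for outer_dict in original_list:
    for outer_key, inner_dict in outer_dict.items():
        # Counter.update adds to existing counts instead of overwriting them
        totals[outer_key].update(inner_dict)

# One single-key dict per outer key, in first-seen order.
result = [{key: dict(counts)} for key, counts in totals.items()]
print(result)
# [{'a': {'b': 3, 'c': 5}}, {'d': {'b': 3, 'c': 4}}]
```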
In a given list:
unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
Find all pairs of dictionaries that share the same key and print them; if no pair is found for a given dictionary, print that dictionary on its own.
What I managed to write so far sort of works, but it keeps testing items of the list even though they were already matched. I'm not sure how to fix it.
for i in range(len(unmatched_items_array)):
    for j in range(i + 1, len(unmatched_items_array)):
        # when keys are the same print matching dictionary pairs
        if unmatched_items_array[i].keys() == unmatched_items_array[j].keys():
            print(unmatched_items_array[i], unmatched_items_array[j])
            break
    # when no matching pairs print currently processed dictionary
    print(unmatched_items_array[i])
Output:
{'c': 45} {'c': 35}
{'c': 45}
{'c': 35}
{'d': 5}
{'a': 3.2} {'a': 3}
{'a': 3.2}
{'a': 3}
What the output should be:
{'c': 45} {'c': 35}
{'d': 5}
{'a': 3.2} {'a': 3}
What am I doing wrong here?
Using collections.defaultdict
Ex:
from collections import defaultdict
unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
result = defaultdict(list)
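for i in unmatched_items_array:
    key = next(iter(i))  # each dict has a single key (i.items()[0] only works on Python 2)
    result[key].append(i)  # Group by key.
for v in result.values():  # Print result.
    print(v)
Output (on Python 3.7+, where dicts keep insertion order):
[{'c': 45}, {'c': 35}]
[{'d': 5}]
[{'a': 3.2}, {'a': 3}]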
With itertools.groupby:
from itertools import groupby
unmatched_items_array = [{'d': 5}, {'c': 35}, {'a': 3}, {'a': 3.2}, {'c': 45}]
for v, g in groupby(sorted(unmatched_items_array, key=lambda k: tuple(k.keys())), lambda k: tuple(k.keys())):
    print([*g])
Prints:
[{'a': 3}, {'a': 3.2}]
[{'c': 35}, {'c': 45}]
[{'d': 5}]
EDIT: If your items in the list are sorted by keys already, then you can skip the sorted() call:
for v, g in groupby(unmatched_items_array, lambda k: tuple(k.keys()) ):
    print([*g])
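As for what goes wrong in the original loop: the final print runs for every i, whether or not a match was found, and items that were already matched as the second half of a pair are visited again. Here is a sketch of one possible fix that keeps the nested-loop shape and remembers which indices were already used. The helper name `pair_up` is mine, and it pairs each dict with the first later unmatched dict that has the same keys:

```python
def pair_up(items):
    """Return a list of tuples: (a, b) for a matched pair, (a,) for a lone dict."""
    used = set()
    result = []
    for i, current in enumerate(items):
        if i in used:  # already printed as the second half of a pair
            continue
        for j in range(i + 1, len(items)):
            if j not in used and current.keys() == items[j].keys():
                used.update((i, j))
                result.append((current, items[j]))
                break
        else:
            # the inner loop finished without break: no partner was found
            result.append((current,))
    return result


unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
for group in pair_up(unmatched_items_array):
    print(*group)
# {'c': 45} {'c': 35}
# {'d': 5}
# {'a': 3.2} {'a': 3}
```

The for/else does the work here: the else branch only runs when the inner loop was not left through break.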
Imagine that you have to sort a list of dicts by the value of a particular key. Note that the key might be missing from some of the dicts, in which case the value for that key should default to 0.
sample input
input = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
sample output (sorted by value of key 'a')
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
note that {'b': 5} is first in the sort-order because it has the lowest value for 'a' (0)
I would've used input.sort(key=operator.itemgetter('a')), if all the dicts were guaranteed to have the key 'a'. Or I could convert the input dicts to collections.defaultdict and then sort.
Is there a way to do this in place without having to create new dicts or update the existing dicts? Can operator.itemgetter handle missing keys?
>>> items = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
>>> sorted(items, key=lambda d: d.get('a', 0))
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
Or, to sort the existing list in place:
items.sort(key=lambda d: d.get('a', 0))
Or, with a conditional expression as the key in sorted:
>>> items = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
>>> sorted(items,key=lambda x: x['a'] if 'a' in x else 0)
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
>>>
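On the second part of the question: as far as I know, operator.itemgetter has no default argument, so it raises KeyError for a dict that lacks the key. If you like the itemgetter style, a small factory gives you the same feel. This is just a sketch, and the helper name `get_or` is made up:

```python
from operator import itemgetter

items = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]

# itemgetter('a') does d['a'] under the hood, so the third dict breaks it.
try:
    sorted(items, key=itemgetter('a'))
except KeyError as exc:
    print('itemgetter failed on missing key:', exc)


def get_or(key, default=0):
    """Return a key function that looks up `key`, falling back to `default`."""
    return lambda d: d.get(key, default)


items.sort(key=get_or('a'))  # sorts the list in place, dicts are untouched
print(items)
# [{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
```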
Having a dict like:
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
I'd like to have a new key total with the sum of each key in the subdictionaries, like:
x['total'] = {'a': 3, 'b': 7}
I've tried adapting the answer from this question but found no success.
Could someone shed some light?
Assuming all the values of x are dictionaries, you can iterate over their items to compose your new dictionary.
from collections import defaultdict
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
total = defaultdict(int)
for d in x.values():
    for k, v in d.items():
        total[k] += v
print(total)
# defaultdict(<class 'int'>, {'a': 3, 'b': 7})
A variation of Patrick's answer, using collections.Counter and just update, since the sub-dicts are already in the proper format:
from collections import Counter
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
total = Counter()
for d in x.values():
    total.update(d)
print(total)
result:
Counter({'b': 7, 'a': 3})
(update works differently for Counter: it doesn't overwrite the keys but adds to the current value. That's one of the subtle differences from defaultdict(int).)
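To make that difference concrete, here is a tiny side-by-side sketch (the variable names `plain` and `counted` are only for illustration):

```python
from collections import Counter

plain = {'a': 1, 'b': 3}
plain.update({'a': 2, 'b': 4})    # dict.update overwrites existing values
print(plain)                      # {'a': 2, 'b': 4}

counted = Counter({'a': 1, 'b': 3})
counted.update({'a': 2, 'b': 4})  # Counter.update adds to existing counts
print(dict(counted))              # {'a': 3, 'b': 7}
```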
You can use a dictionary comprehension:
x = {'1': {'a': 1, 'b': 3}, '2': {'a': 2, 'b': 4}}
full_sub_keys = {i for b in map(dict.keys, x.values()) for i in b}
x['total'] = {i:sum(b.get(i, 0) for b in x.values()) for i in full_sub_keys}
Output:
{'1': {'a': 1, 'b': 3}, '2': {'a': 2, 'b': 4}, 'total': {'b': 7, 'a': 3}}
from collections import defaultdict
dictionary = defaultdict(int)
x = {
'1': {'a': 1, 'b': 3},
'2': {'a': 2, 'b': 4}
}
for numbers in x.values():
    for key, num in numbers.items():
        dictionary[key] += num
x['total'] = {key: value for key, value in dictionary.items()}
print(x)
We can create a defaultdict, iterate through each of the key/value pairs in the nested dictionaries, and sum up the total for each key. That makes a evaluate to 3 and b evaluate to 7. After incrementing the values, a simple dictionary comprehension turns the totals into a regular dictionary, with a and b as the keys and their sums as the values. Here is your output:
{'1': {'a': 1, 'b': 3}, '2': {'a': 2, 'b': 4}, 'total': {'a': 3, 'b': 7}}
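One thing to watch with all of these: once x['total'] exists, running the same code again would count the old total as if it were another sub-dictionary. Here is a sketch of a helper that is safe to call repeatedly. The name `add_total` and its `total_key` parameter are my own, and it assumes every other value in the dict is a sub-dictionary of numbers:

```python
from collections import Counter


def add_total(data, total_key='total'):
    """Store the per-key sums of all sub-dicts under `total_key` and return data."""
    totals = Counter()
    for key, sub in data.items():
        if key == total_key:  # ignore a total left over from an earlier call
            continue
        totals.update(sub)    # Counter.update adds rather than overwrites
    data[total_key] = dict(totals)
    return data


x = {
    '1': {'a': 1, 'b': 3},
    '2': {'a': 2, 'b': 4}
}
add_total(x)
add_total(x)  # calling it again does not double the sums
print(x['total'])
# {'a': 3, 'b': 7}
```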