How to merge data from multiple dictionaries with repeating keys?

I have two dictionaries:
dict1 = {'a': '2', 'b': '10'}
dict2 = {'a': '25', 'b': '7'}
I need to save all the values for the same key in a new dictionary.
The best I can do so far is:
from collections import defaultdict
dd = defaultdict(list)
for d in (dict1, dict2):
    for key, value in d.items():
        dd[key].append(value)
print(dd)
# defaultdict(<class 'list'>, {'a': ['2', '25'], 'b': ['10', '7']})
That does not fully solve the problem, since the desired result is:
a = {'dict1':'2', 'dict2':'25'}
b = {'dict1':'10', 'dict2':'7'}
Ideally, the keys of the new dictionaries would also match the names of the original dictionaries.

Your main problem is that you're trying to cross the implementation boundary between a string value and a variable name. This is almost always bad design. Instead, start with all of your labels as string data:
table = {
    "dict1": {'a': '2', 'b': '10'},
    "dict2": {'a': '25', 'b': '7'}
}
... or, in terms of your original post:
table = {
    "dict1": dict1,
    "dict2": dict2
}
From here, you should be able to invert the levels to obtain
invert = {
    "a": {'dict1': '2', 'dict2': '25'},
    "b": {'dict1': '10', 'dict2': '7'}
}
Is that enough to get your processing where it needs to be? Keeping the data in comprehensive dicts like this will make it easier to iterate through the sub-dicts as needed.
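One way the inversion could be done is sketched below. It is only an illustration, assuming every value in table is itself a dict; the helper name invert_levels is made up for this example and is not part of the original answer.

```python
from collections import defaultdict

dict1 = {'a': '2', 'b': '10'}
dict2 = {'a': '25', 'b': '7'}
table = {"dict1": dict1, "dict2": dict2}

def invert_levels(table):
    """Swap the outer and inner keys of a dict of dicts (hypothetical helper)."""
    inverted = defaultdict(dict)
    for label, inner in table.items():
        for key, value in inner.items():
            # The inner key becomes the outer key; the label moves inside
            inverted[key][label] = value
    return dict(inverted)

invert = invert_levels(table)
print(invert)
```

Because the labels are ordinary string keys, adding a third source dict only means adding one more entry to table.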

As @Prune suggested, structuring your result as a nested dictionary will be easier:
{'a': {'dict1': '2', 'dict2': '25'}, 'b': {'dict1': '10', 'dict2': '7'}}
Which could be achieved with a dict comprehension:
{k: {"dict%d" % i: v2 for i, v2 in enumerate(v1, start=1)} for k, v1 in dd.items()}
If you prefer doing it without a comprehension, you could do this instead:
result = {}
for k, v1 in dd.items():
    inner_dict = {}
    for i, v2 in enumerate(v1, start=1):
        inner_dict["dict%d" % i] = v2
    result[k] = inner_dict
Note: This assumes you always want to keep the "dict1", "dict2", ... key structure.
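One thing to watch with numbering by position: the label comes from where a value sits in the list, not from which dictionary it came from. The sketch below uses a hypothetical variation of the question's data (dict1 has no 'b' key) to show the difference, along with one possible way to label by source instead. The names by_position and by_source are made up for this illustration.

```python
from collections import defaultdict

dict1 = {'a': '2'}              # hypothetical: 'b' is missing here
dict2 = {'a': '25', 'b': '7'}

# Same grouping as in the question
dd = defaultdict(list)
for d in (dict1, dict2):
    for key, value in d.items():
        dd[key].append(value)

# Labelling by list position: the '7' from dict2 ends up under 'dict1'
by_position = {
    k: {"dict%d" % i: v for i, v in enumerate(vals, start=1)}
    for k, vals in dd.items()
}

# Labelling while iterating over the sources keeps the origin correct
by_source = defaultdict(dict)
for i, d in enumerate((dict1, dict2), start=1):
    for key, value in d.items():
        by_source[key]["dict%d" % i] = value
by_source = dict(by_source)

print(by_position)
print(by_source)
```

If every dictionary always has the same keys, both approaches give the same result.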

Related

Reshaping a large dictionary

I am working on XBRL document parsing. I have reached a point where I have a large dict structured like this:
[image: sample of the dictionary I'm working on]
Since the pattern I'm trying to achieve is a bit hard to describe, here is an example of what I'd like it to be:
[image: sample of what I'm trying to achieve]
Since I'm fairly new to programming, I have been struggling with this for days, trying different approaches with loops and list and dict comprehensions, starting from here:
for k in storage_gaap:
    if 'context_ref' in storage_gaap[k]:
        for _k in storage_gaap[k]['context_ref']:
            storage_gaap[k]['context_ref'] = {_k}
storage_gaap is the master dictionary. Sorry for attaching pictures, but it is much clearer to see the dictionary that way.
I'd really appreciate any and all help.

Here's a solution that uses zip and a dictionary comprehension to do this, with toy data in a similar structure.
import itertools
import pprint
# Sample data similar to provided screenshots
data = {
    'a': {
        'id': 'a',
        'vals': ['a1', 'a2', 'a3'],
        'val_num': [1, 2, 3]
    },
    'b': {
        'id': 'b',
        'vals': ['b1', 'b2', 'b3'],
        'val_num': [4, 5, 6]
    }
}
# Takes a tuple of keys and a list of tuples of values, and transforms them into a list of dicts
# e.g. ('id', 'val'), [('a', 1), ('b', 2)] => [{'id': 'a', 'val': 1}, {'id': 'b', 'val': 2}]
def get_list_of_dict(keys, list_of_tuples):
    list_of_dict = [dict(zip(keys, values)) for values in list_of_tuples]
    return list_of_dict
def process_dict(key, values):
    # Transform the dict with lists of values into a list of dicts
    list_of_dicts = get_list_of_dict(
        ('id', 'val', 'val_num'),
        zip(itertools.repeat(key, len(values['vals'])), values['vals'], values['val_num']))
    # Dictionary comprehension to group them based on the 'val' property of each dict
    return {d['val']: {k: v for k, v in d.items() if k != 'val'} for d in list_of_dicts}
# Reorganize to put dict under a 'context_values' key
processed = {k: {'context_values': process_dict(k, v)} for k, v in data.items()}
# {'a': {'context_values': {'a1': {'id': 'a', 'val_num': 1},
#                           'a2': {'id': 'a', 'val_num': 2},
#                           'a3': {'id': 'a', 'val_num': 3}}},
#  'b': {'context_values': {'b1': {'id': 'b', 'val_num': 4},
#                           'b2': {'id': 'b', 'val_num': 5},
#                           'b3': {'id': 'b', 'val_num': 6}}}}
pprint.pprint(processed)

OK, here is the updated solution for my case. The catch for me was the zip function, since it stops at the shortest iterable passed to it. The solution was itertools.cycle. Here is the code:
import itertools
import pprint
data = {
    'us-gaap_WeightedAverageNumberOfDilutedSharesOutstanding': {
        'context_ref': ['D20210801-20220731', 'D20200801-20210731', 'D20190801-20200731',
                        'D20210801-20220731', 'D20200801-20210731', 'D20190801-20200731'],
        'decimals': ['-5', '-5', '-5', '-5', '-5', '-5'],
        'id': ['us-gaap:WeightedAverageNumberOfDilutedSharesOutstanding'],
        'master_id': ['us-gaap_WeightedAverageNumberOfDilutedSharesOutstanding'],
        'unit_ref': ['shares', 'shares', 'shares', 'shares', 'shares', 'shares'],
        'value': ['98500000', '96400000', '96900000', '98500000', '96400000', '96900000'],
    },
}
def get_list_of_dict(keys, list_of_tuples):
    list_of_dict = [dict(zip(keys, values)) for values in list_of_tuples]
    return list_of_dict
def process_dict(k, values):
    # The single-element 'id' and 'master_id' lists are cycled so zip does not stop after one row
    list_of_dicts = get_list_of_dict(
        ('context_ref', 'decimals', 'id', 'master_id', 'unit_ref', 'value'),
        zip(values['context_ref'], values['decimals'], itertools.cycle(values['id']),
            itertools.cycle(values['master_id']), values['unit_ref'], values['value']))
    return {d['context_ref']: {k: v for k, v in d.items() if k != 'context_ref'} for d in list_of_dicts}
processed = {k: {'context_values': process_dict(k, v)} for k, v in data.items()}
pprint.pprint(processed)
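To make the zip behaviour concrete, here is a small sketch on made-up data (the names context_ref, ids and value below are a hypothetical miniature of the lists above, not the real XBRL values). It shows zip stopping at the shortest input, itertools.cycle repeating the single id, and what happens when the same context_ref appears more than once.

```python
import itertools

# Hypothetical miniature: three entries, but only one id
context_ref = ['D1', 'D2', 'D1']
value = ['10', '20', '30']
ids = ['some-id']

# zip stops as soon as the shortest input (ids) is exhausted
truncated = list(zip(context_ref, ids, value))

# cycle repeats the single id indefinitely, so the finite lists set the length
cycled = list(zip(context_ref, itertools.cycle(ids), value))

# Keying by context_ref: a repeated key keeps only the last entry seen
by_context = {ctx: {'id': i, 'value': v} for ctx, i, v in cycled}

print(truncated)
print(cycled)
print(by_context)
```

In the data posted above, the repeated context_ref entries carry identical values, so the overwrite does no harm there, but it would silently drop rows if they differed.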

Convert list to dictionary with duplicate keys using dict comprehension [duplicate]

Good day all,
I am trying to convert a list of length-2 items to a dictionary using the below:
my_list = ["b4", "c3", "c5"]
my_dict = {key: value for (key, value) in my_list}
The issue is that when a key occurs more than once in the list, only its last occurrence and value are kept.
So in this case instead of
my_dict = {'c': '3', 'c': '5', 'b': '4'}
I get
my_dict = {'c': '5', 'b': '4'}
How can I keep all key:value pairs even if there are duplicate keys?
Thanks

A dictionary can store only one value per key.
You can choose to make that value a list:
{'b': ['4'], 'c': ['3', '5']}
The following code will do that for you:
new_dict = {}
for (key, value) in my_list:
    if key in new_dict:
        new_dict[key].append(value)
    else:
        new_dict[key] = [value]
print(new_dict)
# output: {'b': ['4'], 'c': ['3', '5']}
The same thing can be done with setdefault. Thanks @Aadit M Shah for pointing it out.
new_dict = {}
for (key, value) in my_list:
    new_dict.setdefault(key, []).append(value)
print(new_dict)
# output: {'b': ['4'], 'c': ['3', '5']}
The same thing can be done with defaultdict. Thanks @MMF for pointing it out.
from collections import defaultdict
new_dict = defaultdict(list)
for (key, value) in my_list:
    new_dict[key].append(value)
print(new_dict)
# output: defaultdict(<class 'list'>, {'b': ['4'], 'c': ['3', '5']})
You can also choose to store the values as a list of dictionaries:
[{'b': '4'}, {'c': '3'}, {'c': '5'}]
The following code will do that for you:
new_list = [{key: value} for (key, value) in my_list]

If you don't care about the O(n^2) asymptotic behaviour you can use a dict comprehension including a list comprehension:
>>> {key: [i[1] for i in my_list if i[0] == key] for (key, value) in my_list}
{'b': ['4'], 'c': ['3', '5']}
or the iteration_utilities.groupedby function (which might be even faster than using collections.defaultdict):
>>> from iteration_utilities import groupedby
>>> from operator import itemgetter
>>> groupedby(my_list, key=itemgetter(0), keep=itemgetter(1))
{'b': ['4'], 'c': ['3', '5']}
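If installing a third-party package is not an option, a similar grouping can be sketched with itertools.groupby from the standard library. This is an alternative that the answer above does not mention. Note that groupby only groups adjacent items, so the input has to be sorted by key first, which makes this O(n log n) rather than O(n^2).

```python
from itertools import groupby
from operator import itemgetter

my_list = ["b4", "c3", "c5"]
first = itemgetter(0)

# sorted() is stable, so values keep their original relative order within a key
grouped = {
    key: [item[1] for item in group]
    for key, group in groupby(sorted(my_list, key=first), key=first)
}
print(grouped)
```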

You can use defaultdict to avoid checking whether a key is already in the dictionary:
from collections import defaultdict
my_dict = defaultdict(list)
for k, v in my_list:
    my_dict[k].append(v)
Output:
defaultdict(list, {'b': ['4'], 'c': ['3', '5']})

Creating a nested dict out of a list with unknown length

I'm trying to get a method that gets 3 lists and turns them into a nested dict.
The first and the second list can have any amount of entries > 0.
The values list always has len(firstlist) * len(secondlist) entries.
For example:
givenlist1 = ["First", "Second"]
givenlist2 = ["A.B.D", "A.Y.Z", "A.B.E"]
Values = ["10", "2", "3", "4", "1", "3"]
Should return a dict like this:
{'First': {'A': {'B': {'D': '10', 'E': '3'}, 'Y': {'Z': '2'}}},
'Second': {'A': {'B': {'D': '4', 'E': '3'}, 'Y': {'Z': '1'}}}}
I tried a lot with .update but I just can't get an idea how to do it with a variable amount of entries in the second list.

You can use itertools.product to get the required combinations of the entries in givenlist1 and givenlist2, and use zip to associate them with the corresponding items from values. Then you need to .split the individual letter keys from the letters in the givenlist2 items to get the nested keys, creating new dicts as necessary.
from itertools import product
from pprint import pprint
givenlist1 = ["First", "Second"]
givenlist2 = ["A.B.D", "A.Y.Z", "A.B.E"]
values = ["10", "2", "3", "4", "1", "3"]
result = {k1: {} for k1 in givenlist1}
for (k1, k2), v in zip(product(givenlist1, givenlist2), values):
    d = result[k1]
    keys = k2.split('.')
    for k in keys[:-1]:
        d = d.setdefault(k, {})
    d[keys[-1]] = v
pprint(result)
output
{'First': {'A': {'B': {'D': '10', 'E': '3'}, 'Y': {'Z': '2'}}},
'Second': {'A': {'B': {'D': '4', 'E': '3'}, 'Y': {'Z': '1'}}}}
Here's a less compact but possibly more readable way to write the inner for loop:
for k in keys[:-1]:
    if k not in d:
        d[k] = {}
    d = d[k]
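The same idea can be wrapped in a reusable function, roughly as sketched below. The function name build_nested and its sep parameter are made up for this illustration. The length check is an addition: zip silently stops at the shorter input, so a values list of the wrong size would otherwise go unnoticed.

```python
from itertools import product

def build_nested(outer_keys, dotted_keys, values, sep="."):
    """Build a nested dict from two key lists and a flat list of values (hypothetical helper)."""
    expected = len(outer_keys) * len(dotted_keys)
    if len(values) != expected:
        raise ValueError("expected %d values, got %d" % (expected, len(values)))
    result = {}
    for (outer, dotted), value in zip(product(outer_keys, dotted_keys), values):
        node = result.setdefault(outer, {})
        # Everything before the last path component is an intermediate dict
        *parents, leaf = dotted.split(sep)
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result

nested = build_nested(
    ["First", "Second"],
    ["A.B.D", "A.Y.Z", "A.B.E"],
    ["10", "2", "3", "4", "1", "3"],
)
print(nested)
```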

Working with dictionary of dictionaries in python

For whichever dictionary has the smaller value under its "temporary" key, I want the code below to add an item to that dictionary with the key "permanent" and the same value as its "temporary" key.
Variable = "a"
a_list = {'a': {'c': '2', 'b': '1'}, 'c': {'a': '2', 'b': '3'}, 'b': {'a': '1', 'c': '3'}}
a_list[Variable]["permanent"] = "1"
for item in a_list[Variable].keys():
    try:
        if a_list[Variable][item] != 0:
            a_list[item]["temporary"] = a_list[Variable][item]
    except KeyError:
        pass
# Iterate over a copy of the keys so entries can be deleted inside the loop
for item in list(a_list.keys()):
    if "permanent" in a_list[item].keys():
        del a_list[item]
print(a_list)
The output now is
{'c': {'a': '2', 'b': '3', 'temporary': '2'}, 'b': {'a': '1', 'c': '3', 'temporary': '1'}}
But after adding a statement, I want the output to be
{'c': {'a': '2', 'b': '3', 'temporary': '2'}, 'b': {'a': '1', 'c': '3', 'temporary': '1', 'permanent': '1'}}
I don't know how to achieve this by comparing the "temporary" values in the two dictionaries.
Would very much appreciate any help!

The min function iterates over an iterable. If given the key= named parameter, it calls that one-argument function on each item to extract the value used for comparison.
Your dictionaries contain only strings of digits, and a string such as 'A' compares greater than any of them (letters sort after digits in ASCII order), so it can serve as a default that never wins. So:
max_value = 'A'
min_temp = min(a_list.keys(), key=lambda k: a_list[k].get('temporary', max_value))
At this point, min_temp is the first key in a_list whose sub-dict has the lowest value for 'temporary', or, if no sub-dict has that key, the first key returned by keys() (every entry having defaulted to max_value). So let's double-check that it was a valid match:
if 'temporary' in a_list[min_temp]:
    a_list[min_temp]['permanent'] = a_list[min_temp]['temporary']
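One caveat: the values are strings, so they are compared character by character, which means '10' sorts before '2'. If the values are meant as numbers, a variation along the following lines may be safer. It is a sketch that assumes every 'temporary' value is a string holding an integer; the 'd' entry is a hypothetical addition to show a sub-dict without the key.

```python
a_list = {
    'c': {'a': '2', 'b': '3', 'temporary': '2'},
    'b': {'a': '1', 'c': '3', 'temporary': '1'},
    'd': {'a': '5'},  # hypothetical entry without a 'temporary' key
}

# Only consider sub-dicts that actually have a 'temporary' value
candidates = [k for k, sub in a_list.items() if 'temporary' in sub]
if candidates:
    # int() makes '10' compare greater than '2', unlike plain string comparison
    min_key = min(candidates, key=lambda k: int(a_list[k]['temporary']))
    a_list[min_key]['permanent'] = a_list[min_key]['temporary']

print(a_list)
```

Filtering the candidates first also removes the need for a sentinel default such as 'A'.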

Create a list of unique keys in Python

I have a list of dicts:
[{"1":"value"},{"1":"second_value"},{"2":"third_value"},{"2":"fourth_value"},{"3":"fifth_value"}]
and want to convert it into:
[{"1":"value","2":"third_value","3":"fifth_value"},{"1":"second_value","2":"fourth_value"}]

There is probably a cleaner way of doing this; input is appreciated:
d = [{"1":"value"},{"1":"second_value"},{"2":"third_value"},{"2":"fourth_value"},{"3":"fifth_value"}]
results = [{}]
for item in d:
    # Assumes each input dict contains exactly one key-value pair
    j, k = next(iter(item.items()))
    for result in results:
        if j not in result:
            result[j] = k
            break
        if result is results[-1]:
            # Every existing dict already has this key, so start a new one
            results.append({j: k})
            break
print(results)
Result:
[{'1': 'value', '2': 'third_value', '3': 'fifth_value'}, {'1': 'second_value', '2': 'fourth_value'}]
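The same result can also be reached without scanning the result list for every item, by counting how many times each key has been seen so far: the n-th occurrence of a key goes into the n-th output dict. The sketch below is one possible way to write that; the function name spread_duplicates is made up for this illustration.

```python
from collections import defaultdict

def spread_duplicates(dicts):
    """Put the n-th occurrence of each key into the n-th output dict (hypothetical helper)."""
    seen = defaultdict(int)   # how many times each key has appeared so far
    results = []
    for item in dicts:
        for key, value in item.items():
            index = seen[key]
            if index == len(results):
                # No output dict exists yet for this occurrence number
                results.append({})
            results[index][key] = value
            seen[key] += 1
    return results

d = [{"1": "value"}, {"1": "second_value"}, {"2": "third_value"},
     {"2": "fourth_value"}, {"3": "fifth_value"}]
print(spread_duplicates(d))
```

Unlike the loop above, this version also copes with input dicts that hold more than one key-value pair.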

You can use collections.defaultdict:
>>> import collections
>>> result = collections.defaultdict(list)
>>> for item in d:
...     # each input dict is assumed to hold a single key-value pair
...     result[next(iter(item.values()))].append(next(iter(item)))
...
>>> [{key: value for key in keys} for value, keys in result.items()]
[{'1': 'second_value', '2': 'second_value'}, {'1': 'value', '3': 'value', '2': 'value'}]
Note that second_value comes before value here because this output is from an older Python in which dict ordering is arbitrary; on Python 3.7+ dicts preserve insertion order, so the group for whichever value appears first in the input comes first. The output shown is for input whose values are only "value" and "second_value"; with the five distinct values in the question as posted, each value ends up in its own single-key dict.

You can use collections.defaultdict here. Iterate over the list (called lis below), use the values as keys, and collect all the keys related to a value in a list.
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for dic in lis:
...     for k, v in dic.items():
...         d[v].append(k)
...
Now d becomes:
>>> d
defaultdict(<type 'list'>,
{'second_value': ['1', '2'],
'value': ['1', '2', '3']})
Now iterate over d to get the desired result:
>>> [{v1:k for v1 in v} for k, v in d.items()]
[{'1': 'second_value', '2': 'second_value'}, {'1': 'value', '3': 'value', '2': 'value'}]
