How can I add different set of dictionaries into one [duplicate] - python

How can I merge three different dictionaries into a new one? Suppose the input is:
d1 = {'name':'tom', 'age':'14', 'sex':'m'}
d2 = {'color':'w', 'weight':'58','style':'good'}
d3 = {'sports':'cricket','music':'rock','dance':'disco'}
Output should be d = {'name':'tom', 'age':'14', 'sex':'m','color':'w', 'weight':'58','style':'good','sports':'cricket','music':'rock','dance':'disco'}
I tried the update method, but it seems suitable for only two dictionaries; when I use three, it results in repetition. So how can I merge three dictionaries into a single one?

If you're using a recent version of Python (>= 3.5), you can take advantage of unpacking in mapping literals:
d1 = {'name':'tom', 'age':'14', 'sex':'m'}
d2 = {'color':'w', 'weight':'58','style':'good'}
d3 = {'sports':'cricket','music':'rock','dance':'disco'}
new_dict = {**d1, **d2, **d3}

update() will work, though note that it modifies d1 in place:
d1 = {'name':'tom', 'age':'14', 'sex':'m'}
d2 = {'color':'w', 'weight':'58','style':'good'}
d3 = {'sports':'cricket','music':'rock','dance':'disco'}
d1.update(d2)
d1.update(d3)
print(d1)

The Pythonic way:
{**d1, **d2, **d3}
Watch out for duplicate keys in the dictionaries.
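To illustrate the duplicate-key caveat, here is a minimal sketch. The dictionaries `a` and `b` are made-up examples, not taken from the question. When the same key appears in several unpacked dictionaries, the value from the rightmost one is kept:

```python
# Hypothetical dictionaries that share the key 'age'
a = {'name': 'tom', 'age': '14'}
b = {'age': '15', 'color': 'w'}

# Later mappings override earlier ones for duplicate keys
merged = {**a, **b}
print(merged['age'])  # '15', the value from b

# The inputs themselves are left untouched
print(a['age'])       # '14'
```

Swapping the order to `{**b, **a}` would keep `'14'` instead, so the order of unpacking decides which value survives.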

Is there a problem using two lines? If not, I would recommend:
d1.update(d2)
d1.update(d3)

Yet another option is to use collections.ChainMap:
>>> from collections import ChainMap
>>> d1 = {'name':'tom', 'age':'14', 'sex':'m'}
>>> d2 = {'color':'w', 'weight':'58','style':'good'}
>>> d3 = {'sports':'cricket','music':'rock','dance':'disco'}
>>> result = ChainMap(d1, d2, d3)
>>> result
ChainMap({'age': '14', 'name': 'tom', 'sex': 'm'}, {'color': 'w', 'style': 'good', 'weight': '58'}, {'music': 'rock', 'dance': 'disco', 'sports': 'cricket'})
The ChainMap will behave mostly like the merged dict you want. And you can always convert it to a plain dict by:
>>> dict(result)
{'name': 'tom', 'dance': 'disco', 'color': 'w', 'weight': '58', 'style': 'good', 'age': '14', 'music': 'rock', 'sex': 'm', 'sports': 'cricket'}
You could also write a simple wrapper function for ease of use:
from collections import ChainMap
def merge_dicts(*dicts):
    return dict(ChainMap(*dicts))
Note: ChainMap was introduced in Python 3.3.
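One difference worth knowing when keys overlap: a ChainMap looks keys up from left to right, so the first dictionary wins, whereas update() and `{**d1, **d2}` let the last one win. A small sketch of this, using made-up dictionaries `a` and `b` (not from the question):

```python
from collections import ChainMap

def merge_dicts(*dicts):
    # ChainMap searches its mappings left to right, so for a
    # duplicate key the value from the FIRST dict is returned
    return dict(ChainMap(*dicts))

# Hypothetical dictionaries that share the key 'age'
a = {'age': '14'}
b = {'age': '15', 'color': 'w'}

first_wins = merge_dicts(a, b)   # 'age' comes from a
last_wins = {**a, **b}           # 'age' comes from b

print(first_wins['age'])  # '14'
print(last_wins['age'])   # '15'
```

If you want the wrapper to follow update() semantics, one option is to pass the dictionaries in reverse order, for example `ChainMap(*reversed(dicts))`.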

Related

Reshaping a large dictionary

I am working on XBRL document parsing. I got to a point where I have a large dict structured like this:
sample of a dictionary I'm working on
Since it's a bit challenging to describe the pattern of what I'm trying to achieve, I'll just give an example of what I'd like it to be:
sample of what I'm trying to achieve
Since I'm fairly new to programming, I've been struggling with this for days, trying different approaches with loops, list and dict comprehensions, starting from here:
for k in storage_gaap:
    if 'context_ref' in storage_gaap[k]:
        for _k in storage_gaap[k]['context_ref']:
            storage_gaap[k]['context_ref']={_k}
storage_gaap is the master dictionary. Sorry for attaching pictures, but it's much clearer to see the dictionary that way.
I'd really appreciate any and all help.
Here's a solution using zip and dictionary comprehension to do what you're trying to do using toy data in a similar structure.
import itertools
import pprint
# Sample data similar to provided screenshots
data = {
    'a': {
        'id': 'a',
        'vals': ['a1', 'a2', 'a3'],
        'val_num': [1, 2, 3]
    },
    'b': {
        'id': 'b',
        'vals': ['b1', 'b2', 'b3'],
        'val_num': [4, 5, 6]
    }
}
# Takes a tuple of keys, and a list of tuples of values, and transforms them into a list of dicts
# i.e. ('id', 'val'), [('a', 1), ('b', 2)] => [{'id': 'a', 'val': 1}, {'id': 'b', 'val': 2}]
def get_list_of_dict(keys, list_of_tuples):
    list_of_dict = [dict(zip(keys, values)) for values in list_of_tuples]
    return list_of_dict
def process_dict(key, values):
    # Transform the dict with lists of values into a list of dicts
    list_of_dicts = get_list_of_dict(
        ('id', 'val', 'val_num'),
        zip(itertools.repeat(key, len(values['vals'])), values['vals'], values['val_num']))
    # Dictionary comprehension to group them based on the 'val' property of each dict
    return {d['val']: {k: v for k, v in d.items() if k != 'val'} for d in list_of_dicts}
# Reorganize to put dict under a 'context_values' key
processed = {k: {'context_values': process_dict(k, v)} for k, v in data.items()}
# {'a': {'context_values': {'a1': {'id': 'a', 'val_num': 1},
#                           'a2': {'id': 'a', 'val_num': 2},
#                           'a3': {'id': 'a', 'val_num': 3}}},
#  'b': {'context_values': {'b1': {'id': 'b', 'val_num': 4},
#                           'b2': {'id': 'b', 'val_num': 5},
#                           'b3': {'id': 'b', 'val_num': 6}}}}
pprint.pprint(processed)
OK, here is the updated solution for my case. The catch for me was the zip function, since it stops at the shortest iterable passed. The solution was itertools.cycle. Here is the code:
import itertools
import pprint
data = {'us-gaap_WeightedAverageNumberOfDilutedSharesOutstanding': {
    'context_ref': ['D20210801-20220731',
                    'D20200801-20210731',
                    'D20190801-20200731',
                    'D20210801-20220731',
                    'D20200801-20210731',
                    'D20190801-20200731'],
    'decimals': ['-5',
                 '-5',
                 '-5',
                 '-5',
                 '-5',
                 '-5'],
    'id': ['us-gaap:WeightedAverageNumberOfDilutedSharesOutstanding'],
    'master_id': ['us-gaap_WeightedAverageNumberOfDilutedSharesOutstanding'],
    'unit_ref': ['shares',
                 'shares',
                 'shares',
                 'shares',
                 'shares',
                 'shares'],
    'value': ['98500000',
              '96400000',
              '96900000',
              '98500000',
              '96400000',
              '96900000']}}
def get_list_of_dict(keys, list_of_tuples):
    list_of_dict = [dict(zip(keys, values)) for values in list_of_tuples]
    return list_of_dict
def process_dict(k, values):
    # itertools.cycle repeats the single-element lists so zip does not stop after one item
    list_of_dicts = get_list_of_dict(('context_ref', 'decimals', 'id', 'master_id', 'unit_ref', 'value'),
                                     zip(values['context_ref'], values['decimals'], itertools.cycle(values['id']),
                                         itertools.cycle(values['master_id']), values['unit_ref'], values['value']))
    return {d['context_ref']: {k: v for k, v in d.items() if k != 'context_ref'} for d in list_of_dicts}
processed = {k: {'context_values': process_dict(k, v)} for k, v in data.items()}
pprint.pprint(processed)

How to merge data from multiple dictionaries with repeating keys?

I have two dictionaries:
dict1 = {'a': '2', 'b': '10'}
dict2 = {'a': '25', 'b': '7'}
I need to save all the values for the same key in a new dictionary.
The best I can do so far is:
from collections import defaultdict
dd = defaultdict(list)
for d in (dict1, dict2):
    for key, value in d.items():
        dd[key].append(value)
print(dd)
# defaultdict(<class 'list'>, {'a': ['2', '25'], 'b': ['10', '7']})
That does not fully solve the problem, since the desired result is:
a = {'dict1':'2', 'dict2':'25'}
b = {'dict1':'10', 'dict2':'7'}
Also, if possible, I would like the keys of the new dictionaries to match the names of the original dictionaries.
Your main problem is that you're trying to cross the implementation boundary between a string value and a variable name. This is almost always bad design. Instead, start with all of your labels as string data:
table = {
    "dict1": {'a': '2', 'b': '10'},
    "dict2": {'a': '25', 'b': '7'}
}
... or, in terms of your original post:
table = {
    "dict1": dict1,
    "dict2": dict2
}
From here, you should be able to invert the levels to obtain
invert = {
    "a": {'dict1': '2', 'dict2': '25'},
    "b": {'dict1': '10', 'dict2': '7'}
}
Is that enough to get your processing where it needs to be? Keeping the data in comprehensive dicts like this will make it easier to iterate through the sub-dicts as needed.
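The inversion step itself is not spelled out above. One possible way to do it, sketched here with a plain nested loop and dict.setdefault (this particular approach is an assumption, not necessarily what the answer had in mind):

```python
# Labels stored as string data, as suggested above
table = {
    "dict1": {'a': '2', 'b': '10'},
    "dict2": {'a': '25', 'b': '7'},
}

invert = {}
for name, inner in table.items():
    for key, value in inner.items():
        # Create the inner dict for `key` the first time it is seen,
        # then record the value under the label of its source dict
        invert.setdefault(key, {})[name] = value

print(invert['a'])  # {'dict1': '2', 'dict2': '25'}
print(invert['b'])  # {'dict1': '10', 'dict2': '7'}
```

This keeps the source labels as ordinary dictionary keys, so no variable names need to be looked up at runtime.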
As @Prune suggested, structuring your result as a nested dictionary will be easier:
{'a': {'dict1': '2', 'dict2': '25'}, 'b': {'dict1': '10', 'dict2': '7'}}
Which could be achieved with a dict comprehension:
{k: {"dict%d" % i: v2 for i, v2 in enumerate(v1, start=1)} for k, v1 in dd.items()}
If you prefer doing it without a comprehension, you could do this instead:
result = {}
for k, v1 in dd.items():
    inner_dict = {}
    for i, v2 in enumerate(v1, start=1):
        inner_dict["dict%d" % i] = v2
    result[k] = inner_dict
Note: This assumes you always want to keep the "dict1", "dict2", ... key structure.

Multiplying the values of dictionaries with different keys

I solved the problem of multiplying the values of two dictionaries with the same keys as follows:
v1={'name1': '10', 'name2': '20'}
v2={'name1': '4', 'name2': '5'}
foo = lambda dct_1, dct_2: {key: int(dct_2[key]) * int(dct_1[key]) for key in dct_2}
foo(v1, v2)
# Out: {'name1': 40, 'name2': 100}
How can I multiply the values of two dictionaries in the same way, but with different keys?
v1={'name1': '10', 'name2': '20'}
v2={'quantity1': '4', 'quantity2': '5'}
#OUT: {'name1':'40', 'name2': '100'}
Assuming you always have corresponding nameX and quantityX values, you could use a simple replace on the keys:
foo = lambda dct_1, dct_2: {key: int(dct_2[key.replace('name', 'quantity')]) * int(dct_1[key]) for key in dct_1}
multiplydicts = lambda x, y: {key: str(int(x[key]) * int(val)) for key, val in zip(x.keys(), y.values())}
Assuming that your dictionaries are the same size, this should do the trick; y.values() returns the values of the second dictionary in order of construction (guaranteed from Python 3.7 onwards).
You can do:
>>> v1={'name1': '10', 'name2': '20'}
>>> v2={'quantity1': '4', 'quantity2': '5'}
>>> d={'name'+str(i+1) : int(v1['name'+str(i+1)])*int(v2['quantity'+str(i+1)]) for i in range(len(v1))}
>>> d
{'name2': 100, 'name1': 40}
You just need to add something to map the keys in the first dictionary to those in the second. The easiest way to do it is with a third dictionary, named keymap in the code below. The keys in the first dictionary determine the ones that will appear in the one returned.
This is needed because, before Python 3.7, the order of keys in ordinary dictionaries was undefined, so you couldn't rely on or predict the order in which they would appear when you iterate over them.
v1={'name1': '10', 'name2': '20'}
v2={'quantity1': '4', 'quantity2': '5'}
keymap = {'name1': 'quantity1', 'name2': 'quantity2'} # Added.
foo = (lambda dct_1, dct_2:
       {key: int(dct_2[keymap[key]]) * int(dct_1[key]) for key in dct_1})
print(foo(v1, v2)) # -> {'name1': 40, 'name2': 100}
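If the keys really do follow the nameN / quantityN pattern from the question, the keymap could be built rather than typed out by hand. A rough sketch under that assumption; `build_keymap` and `multiply` are hypothetical helper names introduced only for illustration:

```python
def build_keymap(names, old_prefix='name', new_prefix='quantity'):
    # Assumes every key starts with old_prefix followed by a suffix,
    # e.g. 'name1' -> 'quantity1'
    return {key: new_prefix + key[len(old_prefix):] for key in names}

def multiply(dct_1, dct_2, keymap):
    # The keys of dct_1 determine the keys of the result
    return {key: int(dct_1[key]) * int(dct_2[keymap[key]]) for key in dct_1}

v1 = {'name1': '10', 'name2': '20'}
v2 = {'quantity1': '4', 'quantity2': '5'}

keymap = build_keymap(v1)
result = multiply(v1, v2, keymap)
print(result)  # {'name1': 40, 'name2': 100}
```

Because the pairing is done by key rather than by position, the result does not depend on the order in which either dictionary was built.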

What's a smart way of merging two nested dictionaries of depth 2?

What is the "best" way to merge two nested dictionaries of depth 2?
For example, I'd like to merge the following two dictionaries:
dicA: - user1 {name,age,sex}
      - user2 {name,age,sex}
dicB: - user1 {location,job}
      - user3 {location,job}
In order to get:
dic_merged: - user1 {name,age,sex,location,job}
            - user2 {name,age,sex}
            - user3 {location,job}
Note that the subvalues in dicA and dicB will always be disjoint.
Currently, I'm using:
def merge(dicA,dicB):
    for user in dicB:
        if user in dicA:
            dicA[user].update(dicB[user])
        else:
            dicA[user] = dicB[user]
    return dicA
Is there an alternative to update or a one-liner that can merge nested dictionaries?
You can use a dict comprehension (Python 2):
def merge(d1, d2):
    return {key: dict(d1.get(key, {}).items() + d2.get(key, {}).items()) for key in d1.keys() + d2.keys()}
This creates a list of all keys to use in a dict comprehension, and uses the more robust dict.get to prevent errors if a key is in only one of the dicts.
In Python 3, dict.keys() and dict.items() return dict view objects, so you can use itertools.chain (or wrap each dict view object in list):
from itertools import chain
def merge(d1, d2):
    return {key: dict(chain(d1.get(key, {}).items(), d2.get(key, {}).items())) for key in chain(d1.keys(), d2.keys())}
Note: Using collections.defaultdict(dict) would make the whole thing nicer.
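As a rough illustration of the defaultdict(dict) idea, here is one way it might look. The sample data is made up for the sketch, and this version builds a new dictionary instead of modifying its inputs, which is a design choice not stated in the answer:

```python
from collections import defaultdict

def merge(*dicts):
    # Missing users automatically get an empty inner dict,
    # so no membership test or try/except is needed
    merged = defaultdict(dict)
    for d in dicts:
        for user, fields in d.items():
            merged[user].update(fields)
    return dict(merged)

# Hypothetical sample data with disjoint inner keys
dicA = {'user1': {'name': 'John'}, 'user2': {'name': 'Jane'}}
dicB = {'user1': {'job': 'janitor'}, 'user3': {'job': 'Python-wrangler'}}

merged = merge(dicA, dicB)
print(merged['user1'])  # {'name': 'John', 'job': 'janitor'}
```

Since the inner dicts are filled with update(), the original `dicA` and `dicB` are left unchanged.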
I don't think there's too much wrong with the way you suggest, though I would use a try...except clause:
dicA = {'user1': { 'name': 'John', 'age': 45, 'sex': 'M'},
        'user2': { 'name': 'Jane', 'age': 42, 'sex': 'F'}}
dicB = {'user1': {'job': 'janitor', 'location': 'HK'},
        'user3': {'job': 'Python-wrangler', 'location': 'NY'}}
def merge(dicA,dicB):
    for user, d in dicB.items():
        try:
            dicA[user].update(dicB[user])
        except KeyError:
            dicA[user] = d
    return dicA
The result is then, for dicA:
{'user1': {'job': 'janitor', 'age': 45, 'name': 'John',
'location': 'HK', 'sex': 'M'},
'user2': {'age': 42, 'name': 'Jane', 'sex': 'F'},
'user3': {'job': 'Python-wrangler', 'location': 'NY'}}

Use of dictionary in Python

I'm writing a concept learning program, where I need to convert from indices to category names.
For example:
# binary concept learning
# candidate elimination learning algorithm
import numpy as np
import csv
def main():
    d1={0:'0', 1:'Japan', 2: 'USA', 3: 'Korea', 4: 'Germany', 5:'?'}
    d2={0:'0', 1:'Honda', 2: 'Chrysler', 3: 'Toyota', 4:'?'}
    d3={0:'0', 1:'Blue', 2:'Green', 3: 'Red', 4:'White', 5:'?'}
    d4={0:'0', 1:1970,2:1980, 3:1990, 4:2000, 5:'?'}
    d5={0:'0', 1:'Economy', 2:'Sports', 3:'SUV', 4:'?'}
    a=[0,1,2,3,4]
    print(a)
if __name__=="__main__":
    main()
So [0,1,2,3,4] should convert to ['0', 'Honda', 'Green', '1990', '?']. What is the most pythonic way to do this?
I think you need a basic dictionary crash course.
This is a proper dictionary:
>>> d1 = {'tires': 'yoko', 'manufacturer': 'honda', 'vtec': 'no'}
You can access individual items in the dictionary easily:
>>> d1['tires']
'yoko'
>>> d1['vtec'] = 'yes'  # mad vtec yo
>>> d1['vtec']
'yes'
Dictionaries are made up of two parts, the key and the value:
testDict = {'key': 'value'}
You were using a dictionary exactly the same way as a list:
>>> test = {0: "thing0", 1: "thing1"}  # dictionary
>>> test[0]
'thing0'
which is pretty much the same as saying
>>> test = ['thing0', 'thing1']  # list
>>> test[0]
'thing0'
In your particular case, you may want to format your dictionaries properly. I would suggest something like
masterdictionary = {'country': ['germany', 'france', 'USA', 'japan'], 'manufacturer': ['honda', 'ferrari', 'hoopty']}
and so on, because you could then look up each individual item much more easily.
With that same dictionary:
>>> masterdictionary['country'][0]
'germany'
which is
dictionaryName['key'][iteminlistindex]
Of course, there is nothing preventing you from putting dictionaries as values inside of dictionaries, inside values of other dictionaries...
You can do:
data = [d1,d2,d3,d4,d5]
print([d[key] for key, d in zip(a, data)])
The function zip() can be used to combine two iterables; lists in this case.
You've already got the answer to your direct question, but you may wish to consider re-structuring the data. To me, the following makes a lot more sense, and will enable you to more easily index into it for what you asked, and for any possible later queries:
from pprint import pprint
items = [[el.get(i, '?') for el in (d1,d2,d3,d4,d5)] for i in range(6)]
pprint(items)
[['0', '0', '0', '0', '0'],
['Japan', 'Honda', 'Blue', 1970, 'Economy'],
['USA', 'Chrysler', 'Green', 1980, 'Sports'],
['Korea', 'Toyota', 'Red', 1990, 'SUV'],
['Germany', '?', 'White', 2000, '?'],
['?', '?', '?', '?', '?']]
I would use a list of dicts d = [d1, d2, d3, d4, d5], and then a list comprehension:
[d[i][key] for i, key in enumerate(a)]
To make the whole thing more readable, use nested dictionaries - each of your dictionaries seems to represent something you could give a more descriptive name than d1 or d2:
data = {'country': {0: 'Japan', 1: 'USA' ... }, 'brand': {0: 'Honda', ...}, ...}
car = {'country': 1, 'brand': 2 ... }
[data[attribute][key] for attribute, key in car.items()]
Note that before Python 3.7 this would not necessarily be in order, if that is important; collections.OrderedDict is an ordered dictionary type.
As suggested by the comment, a dictionary with contiguous integers as keys can be replaced by a list:
data = {'country': ['Japan', 'USA', ...], 'brand': ['Honda', ...], ...}
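Filled in with the values from the question, the list-based layout might look like the sketch below. The attribute names other than 'country' and 'brand' ('color', 'year', 'type') are assumptions chosen for illustration, as is the explicit `attributes` list used to fix the order:

```python
# Lookup tables from the question's d1..d5, stored as lists
data = {
    'country': ['0', 'Japan', 'USA', 'Korea', 'Germany', '?'],
    'brand':   ['0', 'Honda', 'Chrysler', 'Toyota', '?'],
    'color':   ['0', 'Blue', 'Green', 'Red', 'White', '?'],
    'year':    ['0', 1970, 1980, 1990, 2000, '?'],
    'type':    ['0', 'Economy', 'Sports', 'SUV', '?'],
}

# Explicit attribute order, so the result does not depend on dict ordering
attributes = ['country', 'brand', 'color', 'year', 'type']

a = [0, 1, 2, 3, 4]
decoded = [data[attr][index] for attr, index in zip(attributes, a)]
print(decoded)  # ['0', 'Honda', 'Green', 1990, '?']
```

Note that the year comes back as the integer 1990, because that is how it is stored in the question's d4.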
If you need to keep d1, d2, etc. as is:
newA = [locals()["d%d"%(i+1)][a_value] for i,a_value in enumerate(a)]
Pretty ugly, and fragile, but it should work with your existing code.
You don't need a dictionary for this at all. Lists in Python automatically support indexing.
def main():
    d1=['0','Japan','USA','Korea','Germany',"?"]
    d2=['0','Honda','Chrysler','Toyota','?']
    d3=['0','Blue','Green','Red','White','?']
    d4=['0', 1970,1980,1990,2000,'?']
    d5=['0','Economy','Sports','SUV','?']
    ds = [d1, d2, d3, d4, d5] #This holds all your lists
    #This is what range is for
    a=range(5)
    #Find the nth index from the nth list, seems to be what you want
    print([ds[n][n] for n in a]) #This is a list comprehension, look it up.
