How to change one dictionary value in a dictionary of dictionaries

I am running a variation of the following script:
text1={'file1':0,'file2':0}
text2=['100-200','200-300','300-400']
text3=['1','2','3','4']
level1={}
level2={}
for i in text2:
    level1[i]=text1
for n in text3:
    level2[n]=level1
level2['3']['100-200']['file1']=level2['3']['100-200']['file1']+1
Unfortunately this changes the dictionary from:
{'1': {'200-300': {'file2': 0, 'file1': 0}, '300-400': {'file2': 0, 'file1': 0}, '100-200': {'file2': 0, 'file1': 0}}, '2': {'200-300': {'file2': 0, 'file1': 0}, '300-400': {'file2': 0, 'file1': 0}, '100-200': {'file2': 0, 'file1': 0}}, '3': {'200-300': {'file2': 0, 'file1': 0}, '300-400': {'file2': 0, 'file1': 0}, '100-200': {'file2': 0, 'file1': 0}}, '4': {'200-300': {'file2': 0, 'file1': 0}, '300-400': {'file2': 0, 'file1': 0}, '100-200': {'file2': 0, 'file1': 0}}}
to:
{'1': {'200-300': {'file2': 0, 'file1': 1}, '300-400': {'file2': 0, 'file1': 1}, '100-200': {'file2': 0, 'file1': 1}}, '2': {'200-300': {'file2': 0, 'file1': 1}, '300-400': {'file2': 0, 'file1': 1}, '100-200': {'file2': 0, 'file1': 1}}, '3': {'200-300': {'file2': 0, 'file1': 1}, '300-400': {'file2': 0, 'file1': 1}, '100-200': {'file2': 0, 'file1': 1}}, '4': {'200-300': {'file2': 0, 'file1': 1}, '300-400': {'file2': 0, 'file1': 1}, '100-200': {'file2': 0, 'file1': 1}}}
How do I change only one of the file values and not all of them?

Use a dict comprehension to build the structure. The value expressions in a comprehension are evaluated afresh on each iteration, so every key gets its own new dictionary:
level2 = {n: {i: {'file1': 0, 'file2': 0} for i in text2} for n in text3}
Your original loops do not create copies of the dictionaries; they merely store references to one dictionary object.
Each time you stored text1 you stored a reference to it, not a copy, and the same goes for each time you stored level1. Incrementing a value through any path therefore changes the single shared dictionary.
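To see the difference between shared references and independent copies, here is a minimal sketch that reuses the names from the question. The variable `aliased` is introduced only for illustration, and `copy.deepcopy` is one possible way to get a fresh copy of the template per slot, not the only one.

```python
import copy

text1 = {'file1': 0, 'file2': 0}
text2 = ['100-200', '200-300', '300-400']
text3 = ['1', '2', '3', '4']

# Aliased version: every slot refers to the same inner dictionaries.
level1 = {i: text1 for i in text2}
aliased = {n: level1 for n in text3}
print(aliased['1'] is aliased['3'])  # True: one object, many names

# Independent version: copy the template for every slot.
level2 = {n: {i: copy.deepcopy(text1) for i in text2} for n in text3}
level2['3']['100-200']['file1'] += 1

print(level2['3']['100-200']['file1'])  # 1
print(level2['1']['100-200']['file1'])  # 0
print(text1['file1'])                   # 0, the template is untouched
```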

Related

How to obtain a subset of nodes in NetworkX?

I've been working on a multipartite layout graph with NetworkX. The graph looks something like this:
Each node on my graph has a 'trajectory' attribute and a 'layer' attribute. The layer indicates which column the node belongs to.
An example of the first two columns of nodes is shown:
[('0.0', {'layer': 0, 'trajectory': 0}), ('1.0', {'layer': 0, 'trajectory': 1}), ('2.0', {'layer': 0, 'trajectory': 2}), ('3.0', {'layer': 0, 'trajectory': 3}), ('4.0', {'layer': 0, 'trajectory': 4}), ('5.0', {'layer': 0, 'trajectory': 5}), ('6.0', {'layer': 0, 'trajectory': 6}), ('7.0', {'layer': 0, 'trajectory': 7}), ('8.0', {'layer': 0, 'trajectory': 8}), ('9.0', {'layer': 0, 'trajectory': 9}), ('10.0', {'layer': 0, 'trajectory': 10}), ('11.0', {'layer': 0, 'trajectory': 11}), ('12.0', {'layer': 0, 'trajectory': 12}), ('13.0', {'layer': 0, 'trajectory': 13}), ('14.0', {'layer': 0, 'trajectory': 14}), ('0.1', {'layer': 1, 'trajectory': 0}), ('1.1', {'layer': 1, 'trajectory': 1}), ('2.1', {'layer': 1, 'trajectory': 2}), ('3.1', {'layer': 1, 'trajectory': 3}), ('4.1', {'layer': 1, 'trajectory': 4}), ('5.1', {'layer': 1, 'trajectory': 5}), ('6.1', {'layer': 1, 'trajectory': 6}), ('7.1', {'layer': 1, 'trajectory': 7}), ('8.1', {'layer': 1, 'trajectory': 8}), ('9.1', {'layer': 1, 'trajectory': 9}), ('10.1', {'layer': 1, 'trajectory': 10}), ('11.1', {'layer': 1, 'trajectory': 11}), ('12.1', {'layer': 1, 'trajectory': 12}), ('13.1', {'layer': 1, 'trajectory': 13}), ('14.1', {'layer': 1, 'trajectory': 14}), ('15.1', {'layer': 1, 'trajectory': '15'})]
I need to retrieve all the nodes in a specific column. If I were to pick the 2nd column then to access it I could do:
column = 2
for nodename, nodeattrs in G.nodes(data=True):
    if nodeattrs['layer'] == column:
        print('I am a node in the column: ' + str(column))
        # I do more stuff here
I don't think that is a very efficient or elegant way to solve my issue, since I have to check every node in the graph. My graph will have thousands of columns, so I believe there must be a way to obtain the subset of nodes I want without checking whether each node has the specified layer.
Is there a better way to implement this?
EDIT: I found an answer to my question here:
Select nodes and edges form networkx graph with attributes
In my case it would be something along the lines of:
dict( (n,d['layer']) for n,d in G.nodes().items() if d['layer'] == 2)
Which returns a dictionary I can save.
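Note that the filter above still visits every node each time it runs. If many columns need to be looked up, one option is to group the nodes by layer in a single pass and then index into the result. The sketch below uses a plain list, `node_data`, as a hypothetical stand-in for what `G.nodes(data=True)` yields, so it runs without NetworkX; with a real graph you would iterate over `G.nodes(data=True)` instead.

```python
from collections import defaultdict

# Stand-in for G.nodes(data=True): (node name, attribute dict) pairs.
node_data = [
    ('0.0', {'layer': 0, 'trajectory': 0}),
    ('1.0', {'layer': 0, 'trajectory': 1}),
    ('0.1', {'layer': 1, 'trajectory': 0}),
    ('1.1', {'layer': 1, 'trajectory': 1}),
    ('0.2', {'layer': 2, 'trajectory': 0}),
]

# One pass over all nodes builds a layer -> [node names] index.
nodes_by_layer = defaultdict(list)
for name, attrs in node_data:
    nodes_by_layer[attrs['layer']].append(name)

# Each later lookup is a plain dictionary access.
print(nodes_by_layer[1])  # ['0.1', '1.1']
```

The index has to be rebuilt (or updated) if nodes are added or their 'layer' attribute changes.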

Cleanest way to sum list of nested dicts

Is there a cleaner/more pythonic way of summing the contents of a list of nested dicts? Here's what I'm doing, but I suspect that there may be a better way:
list_of_nested_dicts = [{'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}, 'class3': {'TP': 0, 'FP': 0, 'FN': 0}, 'class4': {'TP': 1, 'FP': 0, 'FN': 2}},
{'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}, 'class3': {'TP': 0, 'FP': 0, 'FN': 0}, 'class4': {'TP': 1, 'FP': 0, 'FN': 2}},
{'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}, 'class3': {'TP': 0, 'FP': 0, 'FN': 0}, 'class4': {'TP': 1, 'FP': 0, 'FN': 2}},
{'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}, 'class3': {'TP': 0, 'FP': 0, 'FN': 0}, 'class4': {'TP': 1, 'FP': 0, 'FN': 2}}]
total_counts = {k:{'TP': 0, 'FP': 0, 'FN': 0} for k in list_of_nested_dicts[0].keys()}
for d in list_of_nested_dicts:
    for label,counts_dict in d.items():
        for k,v in counts_dict.items():
            total_counts[label][k] += v
print(total_counts)
(Assuming all keys are exactly the same, but values could be any integer)

You can write slightly tighter code using collections (the result is similar to @blhsing's answer):
import collections
counts = collections.defaultdict(collections.Counter)
for d in list_of_nested_dicts:
    for k, v in d.items():
        counts[k].update(v)
This will give you a defaultdict of counters instead of only dicts, but they behave similarly. You can also explicitly cast them to dicts at the end if you want.
{'class1': {'FN': 8, 'FP': 0, 'TP': 4},
'class2': {'FN': 0, 'FP': 0, 'TP': 0},
'class3': {'FN': 0, 'FP': 0, 'TP': 0},
'class4': {'FN': 8, 'FP': 0, 'TP': 4}}
vs
defaultdict(<class 'collections.Counter'>,
{'class1': Counter({'FN': 8, 'TP': 4, 'FP': 0}),
'class2': Counter({'TP': 0, 'FP': 0, 'FN': 0}),
'class3': Counter({'TP': 0, 'FP': 0, 'FN': 0}),
'class4': Counter({'FN': 8, 'TP': 4, 'FP': 0})})
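The cast back to plain dicts mentioned above could look like the following sketch. The input is shortened to two classes and two entries for brevity, and the name `plain` is introduced only for illustration.

```python
import collections

list_of_nested_dicts = [
    {'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}},
    {'class1': {'TP': 1, 'FP': 0, 'FN': 2}, 'class2': {'TP': 0, 'FP': 0, 'FN': 0}},
]

# Sum the inner dicts per label; Counter.update adds counts key by key.
counts = collections.defaultdict(collections.Counter)
for d in list_of_nested_dicts:
    for k, v in d.items():
        counts[k].update(v)

# Convert the defaultdict of Counters into a plain dict of dicts.
plain = {label: dict(counter) for label, counter in counts.items()}
print(plain)
```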

One thing in your code that stands out as "unclean" is the fact that you are hard-coding the keys of the sub-dicts in the initialization of total_counts. You can avoid such hard-coding by using the dict.setdefault and dict.get methods as you iterate over the items of the sub-dicts instead:
total_counts = {}
for d in list_of_nested_dicts:
    for label, counts_dict in d.items():
        for k, v in counts_dict.items():
            total_counts[label][k] = total_counts.setdefault(label, {}).get(k, 0) + v

pandas - pd.replace and TypeError

I have all_data dataframe. I want to replace some categorical values in certain columns with numerical values. I'm trying to use this nested dictionary notation (I've checked that the brackets and curly brackets are in place, I don't think that's the issue):
all_data = all_data.replace({'Street': {'Pave': 1, 'Grvl': 0}},
{'LotShape': {'IR3': 1, 'IR2': 2, 'IR1': 3, 'Reg': 4}},
{'Utilities': {'ELO': 0, 'NoSeWa': 0, 'NoSewr': 0, 'AllPub': 1}},
{'LandSlope': {'Sev': 1, 'Mod': 2, 'Gtl': 3}},
{'ExterQual': {'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4, 'Ex': 5}},
{'ExterCond': {'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4, 'Ex': 5}},
{'BsmtQual': {'NA': 0, 'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4,'Ex': 5}},
{'BsmtCond': {'NA': 0, 'Po': 1, 'Fa': 2, 'TA': 3, 'Gd': 4,'Ex': 5}},
{'BsmtExposure': {'NA': 0, 'No': 1, 'Mn': 2, 'Av': 3, 'Gd': 4}},
{'BsmtFinType1': {'NA': 0, 'Unf': 1, 'LwQ': 2, 'Rec': 3, 'BLQ': 4, 'ALQ': 5, 'GLQ': 6}},
{'BsmtFinType2': {'NA': 0, 'Unf': 1,'LwQ': 2,'Rec': 3, 'BLQ': 4,'ALQ': 5, 'GLQ': 6}},
{'HeatingQC': {'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'CentralAir': {'No': 0,'Yes': 1}},
{'KitchenQual': {'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'Functional': {'Sal': -7,'Sev': -6,'Maj1': -5,'Maj2': -4,'Mod': -3,'Min2': -2,'Min1': -1,
'Typ': 0}},
{'FireplaceQu': {'NA': 0,'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'GarageFinish': {'NA': 0,'Unf': 1,'RFn': 2, 'Fin': 3}},
{'GarageQual': {'NA': 0, 'Po': 1,'Fa': 2, 'TA': 3,'Gd': 4, 'Ex': 5}},
{'GarageCond': {'NA': 0,'Po': 1,'Fa': 2,'TA': 3,'Gd': 4,'Ex': 5}},
{'PavedDrive': {'N': 0,'P': 0, 'Y': 1}},
{'Fence': {'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4}},
{'SaleCondition': {'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
'Partial': 0}}
)
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-40-f9c9c28b7237> in <module>()
22 {'Fence': {'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4}},
23 {'SaleCondition': {'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
---> 24 'Partial': 0}}
25 )
TypeError: replace() takes from 1 to 8 positional arguments but 23 were given
If I remove the 'SaleCondition' row from the above code, the error is again there but this time referring to 'Fence', and so on, for each line of code from bottom up. I've googled but have no idea what this means. Help MUCH appreciated.

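You should do something like:
df.replace({'Fence':{'NA': 0, 'MnWw': 1,'GdWo': 2,'MnPrv': 3,'GdPrv': 4},'SaleCondition':{'Abnorml': 1, 'Alloca': 1, 'AdjLand': 1, 'Family': 1, 'Normal': 0,
'Partial': 0}})
The format should be .replace({'col1':{},'col2':{}}), not .replace({'col1':{}},{'col2':{}}). In the second form each dict is passed as a separate positional argument, which is why the error complains about the number of positional arguments.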
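If the per-column mappings already exist as separate dicts, they can be merged into the single nested dict before calling replace. This is a minimal sketch with three of the columns from the question; the names `separate` and `mapping` are hypothetical, and the final DataFrame call is shown only as a comment so the snippet runs without pandas.

```python
# Per-column mappings, written the way the question passed them.
separate = [
    {'Street': {'Pave': 1, 'Grvl': 0}},
    {'CentralAir': {'No': 0, 'Yes': 1}},
    {'PavedDrive': {'N': 0, 'P': 0, 'Y': 1}},
]

# Merge them into one {'column': {old_value: new_value}} dict.
mapping = {}
for column_map in separate:
    mapping.update(column_map)

print(mapping)
# The DataFrame call then takes a single argument:
# all_data = all_data.replace(mapping)
```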

Python sort multi dimensional dict

input={11: {'perc': 0, 'name': u'B test', 'cid': 11, 'total': 0, 'pending': 0, 'complete': 0}, 10: {'perc': 0, 'name': u'C test', 'cid': 10, 'total': 0, 'pending': 0,'complete': 0}, 3: {'perc': 9, 'name': u'Atest Pre-requisites', 'cid': 3, 'total': 11, 'pending': 10, 'complete': 1}}
I want to sort this dict based on the name field. I'm new to Python; can anyone please help me?

First, you should avoid using the names of built-ins (such as input) as variable names: input is now rebound and no longer refers to the built-in function input().
Also, a dictionary cannot be sorted in place. If you don't need the keys, you can take the dictionary's values as a list and sort that. The code would look like this:
input_dict = {11: {'perc': 0, 'name': u'B test', 'cid': 11, 'total': 0, 'pending': 0, 'complete': 0}, 10: {'perc': 0, 'name': u'C test', 'cid': 10, 'total': 0, 'pending': 0,'complete': 0}, 3: {'perc': 9, 'name': u'Atest Pre-requisites', 'cid': 3, 'total': 11, 'pending': 10, 'complete': 1}}
input_list = sorted(input_dict.values(), key=lambda x: x['name'])
print(input_list)
# prints [{'perc': 9, 'complete': 1, 'cid': 3, 'total': 11, 'pending': 10, 'name': u'Atest Pre-requisites'}, {'perc': 0, 'complete': 0, 'cid': 11, 'total': 0, 'pending': 0, 'name': u'B test'}, {'perc': 0, 'complete': 0, 'cid': 10, 'total': 0, 'pending': 0, 'name': u'C test'}]
EDIT
If you wish to keep the keys and use iteritems() as you said in the comments, use this code instead (iteritems() exists only in Python 2; in Python 3 use items()):
input_dict = {11: {'perc': 0, 'name': u'B test', 'cid': 11, 'total': 0, 'pending': 0, 'complete': 0}, 10: {'perc': 0, 'name': u'C test', 'cid': 10, 'total': 0, 'pending': 0,'complete': 0}, 3: {'perc': 9, 'name': u'Atest Pre-requisites', 'cid': 3, 'total': 11, 'pending': 10, 'complete': 1}}
input_list = sorted(input_dict.iteritems(), key=lambda x: x[1]['name'])
print(input_list)
# prints [(3, {'perc': 9, 'complete': 1, 'cid': 3, 'total': 11, 'pending': 10, 'name': u'Atest Pre-requisites'}), (11, {'perc': 0, 'complete': 0, 'cid': 11, 'total': 0, 'pending': 0, 'name': u'B test'}), (10, {'perc': 0, 'complete': 0, 'cid': 10, 'total': 0, 'pending': 0, 'name': u'C test'})]
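On Python 3.7 and later, where dicts preserve insertion order, the sorted pairs can be fed back into dict() to get a dictionary that iterates in name order. The following is a sketch under that assumption, with the records trimmed to two fields for brevity; `sorted_dict` is a name introduced here for illustration.

```python
input_dict = {
    11: {'name': 'B test', 'cid': 11},
    10: {'name': 'C test', 'cid': 10},
    3: {'name': 'Atest Pre-requisites', 'cid': 3},
}

# Sort the (key, record) pairs by the record's name, then rebuild a dict.
sorted_dict = dict(sorted(input_dict.items(), key=lambda item: item[1]['name']))

print(list(sorted_dict))  # [3, 11, 10]
```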

Given a set of strings, create a dictionary of dictionaries using them as keys to entries with default values

I have this:
set_of_strings = {'abc', 'def', 'xyz'}
And I want to create this:
dict_of_dicts = {
'abc': {'pr': 0, 'wt': 0},
'def' : {'pr': 0, 'wt': 0},
'xyz' : {'pr': 0, 'wt': 0}
}
What's the pythonic way? (Python 2.7)

Like this?
>>> set_of_strings = {'abc', 'def', 'xyz'}
>>> dict_of_dicts = {}
>>> for key in set_of_strings:
...     dict_of_dicts[key] = {'pr':0, 'wt':0}
...
>>> print dict_of_dicts
{'xyz': {'pr': 0, 'wt': 0}, 'abc': {'pr': 0, 'wt': 0}, 'def': {'pr': 0, 'wt': 0}}
As a dictionary comprehension:
>>> {k:{'pr':0, 'wt':0} for k in {'abc', 'def', 'xyz'}}
{'xyz': {'pr': 0, 'wt': 0}, 'abc': {'pr': 0, 'wt': 0}, 'def': {'pr': 0, 'wt': 0}}
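Alternatively, you can do something like the following (note that every key then maps to the same dict object, so a change made through one key shows up under all of them):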
>>> set_of_strings = {'abc', 'def', 'xyz'}
>>> value = {'pr': 0, 'wt': 0}
>>> dict(zip(set_of_strings, [value]*len(set_of_strings)))
{'xyz': {'pr': 0, 'wt': 0}, 'abc': {'pr': 0, 'wt': 0}, 'def': {'pr': 0, 'wt': 0}}

You can also use dict.fromkeys:
>>> d = dict.fromkeys({'abc', 'def', 'xyz'}, {'pr': 0, 'wt': 0})
>>> d
{'xyz': {'pr': 0, 'wt': 0}, 'abc': {'pr': 0, 'wt': 0}, 'def': {'pr': 0, 'wt': 0}}
NOTE:
The value specified ({'pr': 0, 'wt': 0}) is shared by all keys.
>>> d['xyz']['py'] = 1
>>> d
{'xyz': {'pr': 0, 'py': 1, 'wt': 0}, 'abc': {'pr': 0, 'py': 1, 'wt': 0}, 'def': {'pr': 0, 'py': 1, 'wt': 0}}

As the other answers show, there are several ways to achieve this, but IMO the most (only?) pythonic way is using a dict comprehension:
keys = ...
{ k: { 'pr': 0, 'wt': 0 } for k in keys }
If the values are immutable, dict.fromkeys works well and is probably faster than a dict comprehension.
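A short sketch of that distinction, using hypothetical names `flat` and `nested`: with an immutable default such as 0, sharing one value across keys is harmless, while a mutable default needs a fresh object per key.

```python
keys = {'abc', 'def', 'xyz'}

# Immutable default: every key starts at the same 0, and += rebinds
# only the key being updated.
flat = dict.fromkeys(keys, 0)
flat['abc'] += 1

# Mutable default: the comprehension evaluates the dict literal once
# per key, so each key gets its own inner dict.
nested = {k: {'pr': 0, 'wt': 0} for k in keys}
nested['abc']['pr'] = 5

print(flat['abc'], flat['def'])                  # 1 0
print(nested['abc']['pr'], nested['def']['pr'])  # 5 0
```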
