Updating a multidimensional dictionary of dictionaries - python

I am trying to update the values in a dictionary of dictionaries of uncertain depth. This dictionary of dictionaries is loaded via h5py. As a MWE one could try:
C = {'A':{'B':12, 'C':13, 'D':{'E':20}}, 'F':14, 'G':{'H':15}}
To set the value of 'E' to 42, I would use
C['A']['D']['E'] = 42
This part is easy. Now suppose I want to set a value in this dict that a user supplies in the form:
([keys], value) -> e.g. (['A','D','E'], 42)
How could I set this value? Do I have to write a setter function for each possible depth?
I tried using the update function of dict like this:
def dod(keys, value):
    """ Generate a dict of dicts from a list of keys and one value """
    if len(keys) == 1:
        return {keys[0]: value}
    else:
        return {keys[0]: dod(keys[1:], value)}
C.update(dod(['A', 'D', 'E'], 42))
But using this deletes all the elements from depth two onwards, resulting in:
C = {'A': {'D': {'E': 42}}, 'G': {'H': 15}, 'F': 14}
Did I make a mistake or is there even a simpler way of setting a value with unknown depth?

This answer simply implements the methodology found here for your particular example:
from functools import reduce
import operator
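def getFromDict(dataDict, mapList):
    # Walk down the nested dicts, one key at a time
    return reduce(operator.getitem, mapList, dataDict)
def setInDict(dataDict, mapList, value):
    # Fetch the parent of the target key, then assign into it
    getFromDict(dataDict, mapList[:-1])[mapList[-1]] = value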
C = {'A':{'B':12, 'C':13, 'D':{'E':20}}, 'F':14, 'G':{'H':15}}
setInDict(C, ['A','D','E'], 42)
print(C)
Yields:
{'A': {'B': 12, 'C': 13, 'D': {'E': 42}}, 'F': 14, 'G': {'H': 15}}
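Note that setInDict raises a KeyError if any intermediate key is missing. If intermediate levels should be created on demand, a minimal sketch could look like the following. It assumes plain Python dicts (not h5py groups), and the name set_in_dict_creating is made up for illustration:

```python
def set_in_dict_creating(data, keys, value):
    """Set data[k0][k1]...[kn] = value, creating missing intermediate dicts."""
    current = data
    for key in keys[:-1]:
        # setdefault returns the existing sub-dict, or inserts and returns a new one
        current = current.setdefault(key, {})
    current[keys[-1]] = value

C = {'A': {'B': 12, 'C': 13, 'D': {'E': 20}}, 'F': 14, 'G': {'H': 15}}
set_in_dict_creating(C, ['A', 'D', 'E'], 42)   # existing path, siblings untouched
set_in_dict_creating(C, ['G', 'X', 'Y'], 1)    # 'X' does not exist yet and is created
print(C)
```

Unlike C.update(...), this only touches the one leaf it is asked to set, so 'B' and 'C' under 'A' survive.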

Related

Why is this dictionary turning into a tuple?

I have a complex dictionary:
l = {10: [{'a':1, 'T':'y'}, {'a':2, 'T':'n'}], 20: [{'a':3,'T':'n'}]}
When I iterate over the dictionary, I don't get a dictionary whose values are lists of dictionaries. Instead I get tuples, like so:
for m in l.items():
    print(m)
(10, [{'a': 1, 'T': 'y'}, {'a': 2, 'T': 'n'}])
(20, [{'a': 3, 'T': 'n'}])
But when I just print l I get my original dictionary:
In [7]: l
Out[7]: {10: [{'a': 1, 'T': 'y'}, {'a': 2, 'T': 'n'}], 20: [{'a': 3, 'T': 'n'}]}
How do I iterate over the dictionary? I still need the keys and to process each dictionary in the value list.
There are two questions here. First, you ask why this is turned into a "tuple". The answer is that this is what the .items() method on dictionaries produces: it yields each key/value pair as a (key, value) tuple.
Knowing this, you can then decide how to use this information. You can choose to unpack the tuple into its two parts during iteration:
for k, v in l.items():
    # Now k has the value of the key and v is the value
    # So you can either use the value directly
    print(v[0])
    # or access using the key
    value = l[k]
    print(value[0])
    # Both yield the same value
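To make the tuple point concrete, here is a small sketch using the dictionary from the question. The variable names first_item and rows are only for illustration:

```python
l = {10: [{'a': 1, 'T': 'y'}, {'a': 2, 'T': 'n'}], 20: [{'a': 3, 'T': 'n'}]}

# Each element produced by .items() is a (key, value) tuple
first_item = next(iter(l.items()))
print(type(first_item).__name__)

# Unpack the key and walk the list of inner dicts in one pass
rows = [(key, entry['a'], entry['T'])
        for key, entries in l.items()
        for entry in entries]
print(rows)
```

The dictionary itself is never converted; only the items you iterate over are tuples.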
When iterating over a dictionary's items, you can unpack each pair into two loop variables:
for key, value in l.items():
    print(key, value)
I often rely on pprint when processing a nested object, to see at a glance what structure I am dealing with.
from pprint import pprint
l = {10: [{'a':1, 'T':'y'}, {'a':2, 'T':'n'}], 20: [{'a':3,'T':'n'}]}
pprint(l, indent=4, width=40)
Output:
{ 10: [ {'T': 'y', 'a': 1},
{'T': 'n', 'a': 2}],
20: [{'T': 'n', 'a': 3}]}
Others have already answered with implementations.
Thanks for all the help. I did figure out how to process this. Here is the implementation I came up with:
for m in l.items():
    k, v = m
    print(f"key: {k}, val: {v}")
    for n in v:
        print(f"key: {n['a']}, val: {n['T']}")
Thanks for everyone's help!

Output dictionary where each element is a string's character with an index

UPD: inserting from collections import OrderedDict into one of my code cells helped
I want to write a program that accepts a string and outputs a dictionary where each element is one of the string's characters with its index.
Input: hello
Output: {'h': 1, 'e': 2, 'l': 3, 'l': 4, 'o': 5}
I've come up with several ways to create this kind of dictionary. However, in every case I have the following output with above-mentioned input: {'e': 2, 'h': 1, 'l': 4, 'o': 5}
#solution 1
s = str(input())
dic = {}
for index, symb in enumerate(s, 1):
    dic[symb] = index
dic
#solution 2
s = input()
d4 = {}
d4 = dict(zip(s, range(1, len(s)+1)))
d4
What could be the issue here? I will appreciate any help. Thanks in advance.
P.S. I use Google Colab for coding.
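Dictionaries only allow one value per key, so there won't be two values for the key l, no matter how hard you try; the later assignment ('l': 4) overwrites the earlier one. Also, before Python 3.7 dictionaries were not guaranteed to keep insertion order, so the items could appear in a different order than they were inserted. From Python 3.7 on, insertion order is preserved.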
Simple and easy:
v_str = 'hello'
class Dictlist(dict):
    def __setitem__(self, key, value):
        try:
            self[key]
        except KeyError:
            # First time this key is seen: start an empty list for it
            super(Dictlist, self).__setitem__(key, [])
        self[key].append(value)
p_dict = Dictlist()
for i, j in enumerate(v_str):
    p_dict[j] = i
print(p_dict)
# output: {'h': [0], 'e': [1], 'l': [2, 3], 'o': [4]}
As @IoaTzimas correctly points out in the comments, you have a fundamental misunderstanding of how dict keys work. Dict keys must be unique (think of a house key) because each key maps to exactly one value.
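If the goal is to keep every occurrence of a repeated character, two common alternatives are a list of (character, index) pairs, or a dict mapping each character to a list of indices. A minimal sketch, using 1-based indices as in the question (the names pairs and positions are just for illustration):

```python
from collections import defaultdict

s = 'hello'

# Option A: a list of pairs keeps duplicates and order
pairs = [(ch, i) for i, ch in enumerate(s, 1)]
print(pairs)

# Option B: one key per character, with all of its positions collected in a list
positions = defaultdict(list)
for i, ch in enumerate(s, 1):
    positions[ch].append(i)
print(dict(positions))
```

Option B gives the same shape of result as the Dictlist class above, without subclassing dict.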

Python: update dict string that has placeholders?

Consider this string: "{'a': A, 'b': B, 'c': 10}". Now I want to update this "string" and add a new key d with, say, the value 20, so the result would be "{'a': A, 'b': B, 'c': 10, 'd': 20}"
Normally, you could just evaluate the string (eval or literal_eval) into a dict, update it the way you want, and convert it back to a string. But in this case there are placeholders, which would not be recognized when evaluating.
What would be the best way to update it so that the old values are kept the same but the "dict-string" is updated properly?
For a more robust solution that properly parses the dict, you can subclass lib2to3.refactor.RefactoringTool to refactor the code using a fixer that is a subclass of lib2to3.fixer_base.BaseFix with a pattern that looks for a dictsetmaker node, and a transform method that extends the children list with leaf nodes that consist of the tokens that will make for a new key-value pair in the dict:
from lib2to3 import fixer_base, refactor, pytree
from lib2to3.pgen2 import token
class AddKeyValue(fixer_base.BaseFix):
    PATTERN = "dictsetmaker"
    def transform(self, node, results):
        # Append the tokens for ", 'd': 20" to the dict's children
        node.children.extend((
            pytree.Leaf(token.COMMA, ','),
            pytree.Leaf(token.STRING, "'d'", prefix=' '),
            pytree.Leaf(token.COLON, ':'),
            pytree.Leaf(token.NUMBER, 20, prefix=' ')
        ))
        return node
class Refactor(refactor.RefactoringTool):
    def __init__(self, fixers):
        self._fixers = [cls(None, None) for cls in fixers]
        super().__init__(None)
    def get_fixers(self):
        return self._fixers, []
s = "{'a': A, 'b': B, 'c': 10}"
print(Refactor([AddKeyValue]).refactor_string(s + '\n', ''))
This outputs:
{'a': A, 'b': B, 'c': 10, 'd': 20}
lib2to3 is round-trip stable so all white spaces are preserved after the transformation, and a new node should be specified with a prefix if whitespaces are to be inserted before it.
You can find the definition of the Python grammar in Grammar.txt of the lib2to3 module.
Demo: https://repl.it/@blhsing/RudeLimegreenConcentrate
This is by no means the best solution, but here is one approach:
dict_str = "{'a': A, 'b': B, 'c': 10}"
def update_dict(dict_str, **keyvals):
    """Create an updated dict_str.
    Parameters:
        dict_str (str): current dict_str
        **keyvals: variable number of key-values
    Returns:
        str: updated string
    """
    # Build a "'key': value" fragment for each key-value pair and join them with ", "
    new_entries = ", ".join(f"'{key}': {val}" for key, val in keyvals.items())
    # Split at the last '}' only, so any earlier '}' in the string is left alone
    head, brace, tail = dict_str.rpartition("}")
    return f"{head}, {new_entries}{brace}{tail}"
Usage and output:
updated = update_dict(dict_str,
    d = 20,
    e = 30
)
print(updated)
{'a': A, 'b': B, 'c': 10, 'd': 20, 'e': 30}
some_dict = {
    'g': 2,
    'h': 3
}
updated = update_dict(dict_str,
    **some_dict
)
print(updated)
{'a': A, 'b': B, 'c': 10, 'g': 2, 'h': 3}
I think that you can:
Option 1 - Adding
Insert the new string ", key: value" at the end of the string, before the "}".
Option 2 - Regex for adding/updating
1 - Use find() to search for the key. If it exists, use a regex to substitute the value (the function is re.sub; there is no re.replace):
re.sub(regex_search, regex_replace, contents)
So using something like:
string = re.sub(r"'key': [^,}]+", "'key': value", string)
2 - If find() fails, fall back to the addition from option 1.
If it's just about adding at the end of the string...
this_string = "{'a': A, 'b': B, 'c': 10}"
this_add = "'d': 20"
this_string = f"{this_string[:-1]}, {this_add}{this_string[-1]}"
print(this_string)
will output
{'a': A, 'b': B, 'c': 10, 'd': 20}
If you need to insert the new string in between you can do something similar using string.find to locate the index and use that index number instead.
This basically rewrites the entire string, but since strings are immutable there is no way around that.
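The add-or-update idea from the options above can be sketched as a single helper. This is only a sketch under stated assumptions: the dict-string is flat, keys are single-quoted, and values contain no commas or braces. The name upsert_key is made up for illustration:

```python
import re

def upsert_key(dict_str, key, value_src):
    """Replace the value of 'key' if present, otherwise append "'key': value_src"."""
    # Group 1 is "'key': " (with its spacing), group 2 is the old value text
    pattern = re.compile(r"('" + re.escape(key) + r"'\s*:\s*)([^,}]+)")
    if pattern.search(dict_str):
        return pattern.sub(lambda m: m.group(1) + value_src, dict_str, count=1)
    # Key not present: insert before the closing brace
    head = dict_str.rstrip()[:-1].rstrip()
    sep = "" if head.endswith("{") else ", "
    return f"{head}{sep}'{key}': {value_src}}}"

s = "{'a': A, 'b': B, 'c': 10}"
added = upsert_key(s, 'd', '20')
replaced = upsert_key(s, 'c', '99')
print(added)
print(replaced)
```

Because the string is never evaluated, placeholders such as A and B pass through untouched.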

Iterate through nested dictionary

I'm trying to create a function increase_by_one which takes in a dictionary and modifies it by increasing all values in it by 1. The function should leave all keys unchanged and finally return the modified dictionary. If the dictionary is empty, return it without changing it. (Dictionaries can be nested.)
e.g.
increase_by_one({'1':2.7, '11':16, '111':{'a':5, 't':8}})
would give
{'1': 3.7, '11': 17, '111': {'a': 6, 't': 9}}
I'm not sure how to do it for multiple (and an unknown number of) nested dictionaries. Thank you. I would prefer the code to be as simple as possible.
This is a simple way to solve the problem using recursion and dict comprehension:
def increase_by_one(d):
    try:
        return d + 1
    except TypeError:  # d is not a number, so treat it as a dict
        return {k: increase_by_one(v) for k, v in d.items()}
In case there are values contained in the dict apart from numbers which can be added or other dictionaries, further type checking might be necessary.
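As one possible form of that extra type checking, here is a minimal sketch that only increments real numbers and leaves everything else (strings, None, booleans) as it is. Leaving non-numeric values unchanged is an assumption, not something the question specifies. Like the comprehension version, it builds a new dictionary rather than modifying the input:

```python
from numbers import Number

def increase_by_one(d):
    result = {}
    for key, value in d.items():
        if isinstance(value, dict):
            # Recurse into nested dictionaries
            result[key] = increase_by_one(value)
        elif isinstance(value, Number) and not isinstance(value, bool):
            # bool is a subclass of int, so exclude it explicitly
            result[key] = value + 1
        else:
            # Assumption: non-numeric values are passed through unchanged
            result[key] = value
    return result

out = increase_by_one({'1': 2.7, '11': 16, '111': {'a': 5, 't': 8}, 'name': 'x', 'flag': True})
print(out)
```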
Assuming the values are either a number or a dictionary, you could consider:
def increase_by_one(d):
    for key in d:
        if type(d[key]) == dict:
            d[key] = increase_by_one(d[key])
        else:
            d[key] += 1
    return d
For your input:
print(increase_by_one({'1':2.7, '11':16, '111':{'a':5, 't':8}}))
I got:
{'1': 3.7, '11': 17, '111': {'a': 6, 't': 9}}
def increase_by_one(d):
    for key in d:
        try:
            d[key] += 1
        except TypeError:  # cannot increase, so it's not a number
            increase_by_one(d[key])
    return d  # only necessary because of spec
def increase_by_one(dictio):
    for d in dictio:
        if isinstance(dictio[d], (int, float)):
            dictio[d] += 1
        else:
            increase_by_one(dictio[d])
    return dictio
increase_by_one({'1':2.7, '11':16, '111':{'a':5, 't':8}})
Using recursion
In-place modification of dict:
def increase_by_one(my_dict):
    for k, v in my_dict.items():
        if isinstance(v, (float, int)):
            my_dict.update({k: v + 1})
        elif isinstance(v, dict):
            my_dict.update({k: increase_by_one(v)})
    return my_dict
v = {'1': 2.7, '11': 16, '111': {'a': 5, 't': 8}}
print(increase_by_one(v)) # prints (Python 3.7+): {'1': 3.7, '11': 17, '111': {'a': 6, 't': 9}}

How to use dict.get() with multidimensional dict?

I have a multidimensional dict, and I'd like to be able to retrieve a value by a key:key pair, and return 'NA' if the first key doesn't exist. All of the sub-dicts have the same keys.
d = { 'a': {'j':1,'k':2},
'b': {'j':2,'k':3},
'd': {'j':1,'k':3}
}
I know I can use d.get('c','NA') to get the sub-dict if it exists and return 'NA' otherwise, but I really only need one value from the sub-dict. I'd like to do something like d.get('c['j']','NA') if that existed.
Right now I'm just checking to see if the top-level key exists and then assigning the sub-value to a variable if it exists or 'NA' if not. However, I'm doing this about 500k times and also retrieving/generating other information about each top-level key from elsewhere, and I'm trying to speed this up a little bit.
How about
d.get('a', {'j': 'NA'})['j']
?
If not all subdicts have a j key, then
d.get('a', {}).get('j', 'NA')
To cut down on identical objects created, you can devise something like
class DefaultNASubdict(dict):
    class NADict(object):
        def __getitem__(self, k):
            return 'NA'
    NA = NADict()  # one shared instance for every missing top-level key
    def __missing__(self, k):
        return self.NA
nadict = DefaultNASubdict({
    'a': {'j':1,'k':2},
    'b': {'j':2,'k':3},
    'd': {'j':1,'k':3}
})
print(nadict['a']['j']) # 1
print(nadict['b']['j']) # 2
print(nadict['c']['j']) # NA
Same idea using defaultdict:
import collections
class NADict(object):
    def __getitem__(self, k):
        return 'NA'
    @staticmethod
    def instance():
        return NADict._instance
NADict._instance = NADict()
nadict = collections.defaultdict(NADict.instance, {
    'a': {'j':1,'k':2},
    'b': {'j':2,'k':3},
    'd': {'j':1,'k':3}
})
Another way to read from a multidimensional dict is to call the get method twice (this returns None rather than 'NA' when a key is missing):
d.get('a', {}).get('j')
Here's a simple and efficient way to do it with ordinary dictionaries, nested an arbitrary number of levels. The example code works in both Python 2 and 3.
from __future__ import print_function
try:
    from functools import reduce
except ImportError:  # Assume it's built-in (Python 2.x)
    pass
def chained_get(dct, *keys):
    SENTRY = object()
    def getter(level, key):
        # Once a key is missing, keep passing SENTRY along the rest of the chain
        return SENTRY if level is SENTRY else level.get(key, SENTRY)
    result = reduce(getter, keys, dct)
    # A miss at any level, including the last one, leaves SENTRY behind
    return 'NA' if result is SENTRY else result
d = {'a': {'j': 1, 'k': 2},
     'b': {'j': 2, 'k': 3},
     'd': {'j': 1, 'k': 3},
    }
print(chained_get(d, 'a', 'j')) # 1
print(chained_get(d, 'b', 'k')) # 3
print(chained_get(d, 'k', 'j')) # NA
It could also be done recursively:
# Recursive version.
def chained_get(dct, *keys):
    SENTRY = object()
    def getter(level, keys):
        # Check for a miss first so that a miss on the last key also yields 'NA'
        return ('NA' if level is SENTRY else
                level if keys[0] is SENTRY else
                getter(level.get(keys[0], SENTRY), keys[1:]))
    return getter(dct, keys+(SENTRY,))
Although this way of doing it isn't quite as efficient as the first.
Rather than a hierarchy of nested dict objects, you could use one dictionary whose keys are a tuple representing a path through the hierarchy.
In [34]: d2 = {(x,y):d[x][y] for x in d for y in d[x]}
In [35]: d2
Out[35]:
{('a', 'j'): 1,
('a', 'k'): 2,
('b', 'j'): 2,
('b', 'k'): 3,
('d', 'j'): 1,
('d', 'k'): 3}
In [36]: timeit [d[x][y] for x,y in d2.keys()]
100000 loops, best of 3: 2.37 us per loop
In [37]: timeit [d2[x] for x in d2.keys()]
100000 loops, best of 3: 2.03 us per loop
Accessing this way looks like it's about 15% faster. You can still use the get method with a default value:
In [38]: d2.get(('c','j'),'NA')
Out[38]: 'NA'
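The comprehension above handles exactly two levels. For nesting of arbitrary depth, the same tuple-key idea can be sketched recursively. The helper name flatten is hypothetical, and the sketch assumes that every non-dict value is a leaf:

```python
def flatten(nested, prefix=()):
    """Turn nested dicts into one dict keyed by tuples of the path to each leaf."""
    flat = {}
    for key, value in nested.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

d = {'a': {'j': 1, 'k': 2}, 'b': {'j': 2, 'k': 3}, 'd': {'j': 1, 'k': 3}}
d2 = flatten(d)
print(d2[('a', 'j')])
print(d2.get(('c', 'j'), 'NA'))
```

After flattening once, every lookup is a single dict.get call with a default.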
For a functional approach very similar to martineau's answer, I've gone with the following:
from typing import Any
def chained_get(dictionary: dict, *args, default: Any = None) -> Any:
    """
    Get a value nested in a dictionary by its nested path.
    """
    value_path = list(args)
    dict_chain = dictionary
    while value_path:
        try:
            dict_chain = dict_chain.get(value_path.pop(0))
        except AttributeError:
            # The previous level was missing (None) or not a dict
            return default
    return dict_chain
It's a slightly simpler implementation that uses a loop instead of reduce or recursion, and it optionally allows a default value.
The usage is identical to martineau's answer:
from typing import Any
def chained_get(dictionary: dict, *args, default: Any = None) -> Any:
    """
    Get a value nested in a dictionary by its nested path.
    """
    value_path = list(args)
    dict_chain = dictionary
    while value_path:
        try:
            dict_chain = dict_chain.get(value_path.pop(0))
        except AttributeError:
            return default
    return dict_chain
def main() -> None:
    dct = {
        "a": {"j": 1, "k": 2},
        "b": {"j": 2, "k": 3},
        "d": {"j": 1, "k": 3},
    }
    print(chained_get(dct, "a", "j")) # 1
    print(chained_get(dct, "b", "k")) # 3
    print(chained_get(dct, "k", "j")) # None
    print(chained_get(dct, "k", "j", default="NA")) # NA
if __name__ == "__main__":
    main()
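One behaviour of this version is worth knowing: when only the last key is missing, dict.get returns None and that None is returned directly, so the default is not applied. If the default should cover that case too, a sentinel-based variant is one option. The name chained_get_strict is made up for illustration:

```python
from typing import Any

_MISSING = object()  # unique marker that can never be a real stored value

def chained_get_strict(dictionary: dict, *keys, default: Any = None) -> Any:
    """Return the nested value, or default if any key along the path is missing."""
    current = dictionary
    for key in keys:
        if not isinstance(current, dict):
            return default
        current = current.get(key, _MISSING)
        if current is _MISSING:
            return default
    return current

dct = {"a": {"j": 1, "k": 2}, "b": {"j": 2, "k": 3}}
print(chained_get_strict(dct, "a", "j", default="NA"))
print(chained_get_strict(dct, "a", "z", default="NA"))
print(chained_get_strict(dct, "x", "j", default="NA"))
```

A stored None is still returned as None here, because only the sentinel counts as "missing".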
