Use dictionary as key for another dictionary? [duplicate] - python

Python doesn't allow dictionaries to be used as keys in other dictionaries. Is there a workaround for using non-nested dictionaries as keys?
The general problem with more complicated non-hashable objects and my specific use case has been moved here. My original description of my use case was incorrect.

If you have a really immutable dictionary (although it isn't clear to me why you don't just use a list of pairs: e.g. [('content-type', 'text/plain'), ('host', 'example.com')]), then you may convert your dict into:
A tuple of pairs. You've already done that in your question. A tuple is required instead of list because the results rely on the ordering and the immutability of the elements.
>>> tuple(sorted(a.items()))
A frozen set. It is a more suitable approach from the mathematical point of view, as it requires only the equality relation on the elements of your immutable dict, while the first approach requires the ordering relation besides equality.
>>> frozenset(a.items())
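A minimal sketch of both conversions in use. The dictionaries `a` and `b` and the `cache` mapping are made-up names for illustration. The sorted-tuple form assumes the keys can be ordered against each other; the frozenset form only assumes that keys and values are hashable.

```python
a = {'content-type': 'text/plain', 'host': 'example.com'}
b = {'host': 'example.com', 'content-type': 'text/plain'}  # same pairs, other order

tuple_key = tuple(sorted(a.items()))
set_key = frozenset(a.items())

cache = {tuple_key: 'via tuple', set_key: 'via frozenset'}

# Both forms ignore insertion order, so b finds the entries stored for a.
print(cache[tuple(sorted(b.items()))])   # via tuple
print(cache[frozenset(b.items())])       # via frozenset
```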

If I needed to use dictionaries as keys, I would flatten the dictionary into a tuple of tuples.
You might find this SO question useful: What is the best way to implement nested dictionaries?
And here is an example of a flatten module that will flatten dictionaries: http://yawpycrypto.sourceforge.net/html/public/Flatten.Flatten-module.html
I don't fully understand your use case and I suspect that you are trying to prematurely optimize something that doesn't need optimization.

To turn a someDictionary into a key, do this
key = tuple(sorted(someDictionary.items()))
You can easily reverse this with dict( key )

One way to do this would be to subclass the dict and provide a hash method. ie:
class HashableDict(dict):
    def __hash__(self):
        # items() works on Python 2 and 3; iteritems() is Python 2 only
        return hash(tuple(sorted(self.items())))
>>> d = HashableDict(a=1, b=2)
>>> d2 = { d : "foo"}
>>> d2[HashableDict(a=1, b=2)]
"foo"
However, bear in mind the reasons why dicts (or any mutable types) don't do this: mutating the object after it has been added to a hashtable will change the hash, which means the dict will now have it in the wrong bucket, and so incorrect results will be returned.
If you go this route, either be very sure that dicts will never change after they have been put in the other dictionary, or actively prevent them (eg. check that the hash never changes after the first call to __hash__, and throw an exception if not.)
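One possible sketch of the "check that the hash never changes" idea. The class name `GuardedHashableDict` and the `_cached_hash` attribute are invented for this example, and it assumes all values are hashable. It only detects a mutation the next time the object is hashed; it does not prevent the mutation itself.

```python
class GuardedHashableDict(dict):
    """Hypothetical dict subclass that remembers its first hash value."""

    _cached_hash = None  # filled in on the first call to __hash__

    def __hash__(self):
        current = hash(frozenset(self.items()))
        if self._cached_hash is None:
            self._cached_hash = current
        elif current != self._cached_hash:
            # The contents changed after the object was first hashed.
            raise RuntimeError("dict was mutated after being hashed")
        return current


d = GuardedHashableDict(a=1, b=2)
outer = {d: "foo"}
print(outer[GuardedHashableDict(a=1, b=2)])  # foo
```

Mutating `d` afterwards (for example `d["c"] = 3`) makes the next `hash(d)` raise instead of silently landing in the wrong bucket.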

Hmm, isn't your use case just memoizing function calls? Using a decorator, you will have easy support for arbitrary functions. And yes, they often pickle the arguments, and using circular reasoning, this works for non-standard types as long as they can be pickled.
See e.g. this memoization sample
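A rough sketch of such a decorator, keyed on the pickled arguments. The names `memoize`, `total` and `calls` are made up here. Note that pickle output is not guaranteed to be canonical: two equal dicts built in a different insertion order can pickle to different bytes and therefore miss the cache.

```python
import functools
import pickle


def memoize(func):
    """Cache results, using the pickled arguments as the dictionary key."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # bytes are hashable, so unhashable arguments such as dicts are fine
        key = pickle.dumps((args, sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper


calls = []  # records how often the real function runs


@memoize
def total(mapping):
    calls.append(1)
    return sum(mapping.values())


settings = {"a": 1, "b": 2}
print(total(settings))  # 3, computed
print(total(settings))  # 3, served from the cache
print(len(calls))       # 1
```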

I'll sum up the options and add one of my own.
You can:
make a subclass of dict and provide a hash function
flatten the dict into a tuple
pickle the dict
convert the Dict into a string using the json module (as shown below)
import json
Dict = {'key' :'value123'}
stringifiedDict = json.dumps(Dict)
print(stringifiedDict)
# {"key": "value123"}
newDict = {stringifiedDict: 12345}
print(newDict[stringifiedDict])
# 12345
for key, val in newDict.items():
    print(json.loads(key))
    # {'key': 'value123'}
    print(json.loads(key)['key'])
    # value123
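One caveat with the JSON approach, sketched below: the string depends on key order, so two equal dictionaries can produce different keys unless `sort_keys=True` is passed. The variable names are illustrative, and the first comparison assumes Python 3.7+, where dicts keep insertion order.

```python
import json

first = {'a': 1, 'b': 2}
second = {'b': 2, 'a': 1}  # equal to first, different insertion order

# Without sort_keys the two strings differ, so they would be different keys.
print(json.dumps(first) == json.dumps(second))     # False

key = json.dumps(first, sort_keys=True)
lookup = {key: 12345}
print(lookup[json.dumps(second, sort_keys=True)])  # 12345
```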

I don't see why you'd ever want to do this, but if you really really do need to, you could try pickling the dictionary:
import pickle
mydict = {"a":1, "b":{"c":10}}
key = pickle.dumps(mydict)
d = {}  # the outer dictionary
d[key] = "some value"  # placeholder value

this function will convert a nested dictionary to an immutable tuple of tuples which you can use as a key:
def convert_dictionary_tuple(input_dict):
    """
    this function receives a nested dictionary and convert it to an immutable tuple of tuples with all the given
    dictionary data
    :param input_dict: a nested dictionary
    :return: immutable tuple of tuples with all the given dictionary data
    """
    tuples_dict = {}
    # items() works on Python 2 and 3; iteritems() is Python 2 only
    for key, value in input_dict.items():
        if isinstance(value, dict):
            tuples_dict[key] = convert_dictionary_tuple(value)
        elif isinstance(value, list):
            tuples_dict[key] = tuple([convert_dictionary_tuple(v) if isinstance(v, dict) else v for v in value])
        else:
            tuples_dict[key] = value
    return tuple(sorted(tuples_dict.items()))

Class name... OK :/
My solution is to create a class, with dict features, but implemented as a list with {key, value} objects. key and value can be anything then.
class DictKeyDictException(Exception):
    pass
class DictKeyDict():
    def __init__(self, *args):
        values = [self.__create_element(key, value) for key, value in args]
        self.__values__ = values
    def __setitem__(self, key, value):
        self.set(key, value)
    def __getitem__(self, key):
        return self.get(key)
    def __len__(self):
        return len(self.__values__)
    def __delitem__(self, key):
        keys = self.keys()
        if key in keys:
            index = keys.index(key)
            del self.__values__[index]
    def clear(self):
        self.__values__ = []
    def copy(self):
        return self.__values__.copy()
    def has_key(self, k):
        return k in self.keys()
    def update(self, *args, **kwargs):
        if kwargs:
            raise DictKeyDictException(f"no kwargs allowed in '{self.__class__.__name__}.update' method")
        for key, value in args:
            self[key] = value
        return self.__values__
    def __repr__(self) -> str:
        return repr(self.__values__)
    @classmethod
    def __create_element(cls, key, value):
        return {"key": key, "value": value}
    def set(self, key, value) -> list:
        keys = self.keys()
        if key in keys:
            index = keys.index(key)
            self.__values__[index] = self.__create_element(key, value)
        else:
            self.__values__.append(self.__create_element(key, value))
        return self.__values__
    def keys(self):
        return [dict_key_value["key"] for dict_key_value in self.__values__]
    def values(self):
        return [value["value"] for value in self.__values__]
    def items(self):
        return [(dict_key_value["key"], dict_key_value["value"]) for dict_key_value in self.__values__]
    def pop(self, key, default=None):
        keys = self.keys()
        if key in keys:
            index = keys.index(key)
            value = self.__values__.pop(index)["value"]
        else:
            value = default
        return value
    def get(self, key, default=None):
        keys = self.keys()
        if key in keys:
            index = keys.index(key)
            value = self.__values__[index]["value"]
        else:
            value = default
        return value
    def __iter__(self):
        return iter(self.keys())
and usage :
dad = {"name": "dad"}
mom = {"name": "mom"}
boy = {"name": "son"}
girl = {"name": "daughter"}
# set
family = DictKeyDict()
family[dad] = {"age": 44}
family[mom] = {"age": 43}
# or
family.set(dad, {"age": 44, "children": [boy, girl]})
# or
family = DictKeyDict(
    (dad, {"age": 44, "children": [boy, girl]}),
    (mom, {"age": 43, "children": [boy, girl]}),
)
# update
family.update((mom, {"age": 33}))  # oops: this replaces the whole value, so "children" is lost
family.set({"pet": "cutty"}, "cat")
del family[{"pet": "cutty"}]  # cutty left...
family.set({"pet": "buddy"}, "dog")
family[{"pet": "buddy"}] = "wolf"  # buddy was not a dog
print(family.keys())
print(family.values())
for k, v in family.items():
    print(k, v)

I don't know whether I understand your question correctly, but I'll give it a try:
d[repr(a)]=value
You can iterate over the dictionary like this:
for el1 in d:
    for el2 in eval(el1):
        print el2,eval(el1)[el2]

Related

How can I access a deeply nested dictionary using tuples?

I would like to expand on the autovivification example given in a previous answer from nosklo to allow dictionary access by tuple.
nosklo's solution looks like this:
class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value
Testing:
a = AutoVivification()
a[1][2][3] = 4
a[1][3][3] = 5
a[1][2]['test'] = 6
print a
Output:
{1: {2: {'test': 6, 3: 4}, 3: {3: 5}}}
I have a case where I want to set a node given some arbitrary tuple of subscripts. If I don't know how many layers deep the tuple will be, how can I design a way to set the appropriate node?
I'm thinking that perhaps I could use syntax like the following:
mytuple = (1,2,3)
a[mytuple] = 4
But I'm having trouble coming up with a working implementation.
Update
I have a fully working example based on @JCash's answer:
class NestedDict(dict):
    """
    Nested dictionary of arbitrary depth with autovivification.
    Allows data access via extended slice notation.
    """
    def __getitem__(self, keys):
        # Let's assume *keys* is a list or tuple.
        if not isinstance(keys, basestring):
            try:
                node = self
                for key in keys:
                    node = dict.__getitem__(node, key)
                return node
            except TypeError:
                # *keys* is not a list or tuple.
                pass
        try:
            return dict.__getitem__(self, keys)
        except KeyError:
            raise KeyError(keys)
    def __setitem__(self, keys, value):
        # Let's assume *keys* is a list or tuple.
        if not isinstance(keys, basestring):
            try:
                node = self
                for key in keys[:-1]:
                    try:
                        node = dict.__getitem__(node, key)
                    except KeyError:
                        node[key] = type(self)()
                        node = node[key]
                return dict.__setitem__(node, keys[-1], value)
            except TypeError:
                # *keys* is not a list or tuple.
                pass
        dict.__setitem__(self, keys, value)
Which can achieve the same output as above using extended slice notation:
d = NestedDict()
d[1,2,3] = 4
d[1,3,3] = 5
d[1,2,'test'] = 6
This seems to work
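def __setitem__(self, key, value):
    if isinstance(key, tuple):
        node = self
        for i in key[:-1]:
            try:
                node = dict.__getitem__(node, i)
            except KeyError:
                # create the missing level, then step into it
                # (node = node[i] = ... would rebind node before storing)
                node[i] = type(self)()
                node = node[i]
        # the last element of the tuple is the key that receives the value
        return dict.__setitem__(node, key[-1], value)
    return dict.__setitem__(self, key, value)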
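If subclassing dict is not required, the same effect can be sketched with a plain helper function. `set_by_path` is a made-up name; it assumes a non-empty sequence of keys and that every existing intermediate value is itself a dict.

```python
def set_by_path(root, keys, value):
    """Set root[k0][k1]...[kn] = value, creating missing dict levels."""
    node = root
    for key in keys[:-1]:
        # setdefault returns the existing child or installs a new empty dict
        node = node.setdefault(key, {})
    node[keys[-1]] = value


a = {}
set_by_path(a, (1, 2, 3), 4)
set_by_path(a, (1, 3, 3), 5)
set_by_path(a, (1, 2, 'test'), 6)
print(a)  # {1: {2: {3: 4, 'test': 6}, 3: {3: 5}}}
```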

Finding a key recursively in a dictionary

I'm trying to write a very simple function to recursively search through a possibly nested (in the most extreme cases ten levels deep) Python dictionary and return the first value it finds from the given key.
I cannot understand why my code doesn't work for nested dictionaries.
def _finditem(obj, key):
    if key in obj: return obj[key]
    for k, v in obj.items():
        if isinstance(v,dict):
            _finditem(v, key)
print _finditem({"B":{"A":2}},"A")
It returns None.
It does work, however, for _finditem({"B":1,"A":2},"A"), returning 2.
I'm sure it's a simple mistake but I cannot find it. I feel like there already might be something for this in the standard library or collections, but I can't find that either.
If you are looking for a general explanation of what is wrong with code like this, the canonical is Why does my recursive function return None?. The answers here are mostly specific to the task of searching in a nested dictionary.
When you recurse, you need to return the result of _finditem:
def _finditem(obj, key):
    if key in obj: return obj[key]
    for k, v in obj.items():
        if isinstance(v,dict):
            return _finditem(v, key)  #added return statement
To fix the actual algorithm, you need to realize that _finditem returns None if it didn't find anything, so you need to check that explicitly to prevent an early return:
def _finditem(obj, key):
    if key in obj: return obj[key]
    for k, v in obj.items():
        if isinstance(v,dict):
            item = _finditem(v, key)
            if item is not None:
                return item
Of course, that will fail if you have None values in any of your dictionaries. In that case, you could set up a sentinel object() for this function and return that in the case that you don't find anything -- Then you can check against the sentinel to know if you found something or not.
Here's a function that searches a dictionary that contains both nested dictionaries and lists. It creates a list of the values of the results.
def get_recursively(search_dict, field):
    """
    Takes a dict with nested lists and dicts,
    and searches all dicts for a key of the field
    provided.
    """
    fields_found = []
    # items() works on Python 2 and 3; iteritems() is Python 2 only
    for key, value in search_dict.items():
        if key == field:
            fields_found.append(value)
        elif isinstance(value, dict):
            results = get_recursively(value, field)
            for result in results:
                fields_found.append(result)
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    more_results = get_recursively(item, field)
                    for another_result in more_results:
                        fields_found.append(another_result)
    return fields_found
Here is a way to do this using a "stack" and the "stack of iterators" pattern (credits to Gareth Rees):
def search(d, key, default=None):
    """Return a value corresponding to the specified key in the (possibly
    nested) dictionary d. If there is no item with that key, return
    default.
    """
    stack = [iter(d.items())]
    while stack:
        for k, v in stack[-1]:
            if isinstance(v, dict):
                stack.append(iter(v.items()))
                break
            elif k == key:
                return v
        else:
            stack.pop()
    return default
Calling print(search({"B": {"A": 2}}, "A")) prints 2.
Just trying to make it shorter:
def get_recursively(search_dict, field):
    if isinstance(search_dict, dict):
        if field in search_dict:
            return search_dict[field]
        for key in search_dict:
            item = get_recursively(search_dict[key], field)
            if item is not None:
                return item
    elif isinstance(search_dict, list):
        for element in search_dict:
            item = get_recursively(element, field)
            if item is not None:
                return item
    return None
Here's a Python 3.5+ solution (yield from needs 3.3, the typing import needs 3.5) which can handle lists of lists of dicts.
It also uses duck typing, so it can handle any iterable, or object implementing the 'items' method.
from typing import Iterator
def deep_key_search(obj, key: str) -> Iterator:
    """ Do a deep search of {obj} and return the values of all {key} attributes found.
    :param obj: Either a dict type object or an iterator.
    :return: Iterator of all {key} values found"""
    if isinstance(obj, str):
        # When duck-typing iterators recursively, we must exclude strings
        return
    try:
        # Assume obj is like a dict and look for the key
        for k, v in obj.items():
            if k == key:
                yield v
            else:
                yield from deep_key_search(v, key)
    except AttributeError:
        # Not a dict type object. Is it iterable like a list?
        try:
            for v in obj:
                yield from deep_key_search(v, key)
        except TypeError:
            pass  # Not iterable either.
Pytest:
import pytest
@pytest.mark.parametrize(
    "data, expected, dscr", [
        ({}, [], "Empty dict"),
        ({'Foo': 1, 'Bar': 2}, [1], "Plain dict"),
        ([{}, {'Foo': 1, 'Bar': 2}], [1], "List[dict]"),
        ([[[{'Baz': 3, 'Foo': 'a'}]], {'Foo': 1, 'Bar': 2}], ['a', 1], "Deep list"),
        ({'Foo': 1, 'Bar': {'Foo': 'c'}}, [1, 'c'], "Dict of Dict"),
        (
            {'Foo': 1, 'Bar': {'Foo': 'c', 'Bar': 'abcdef'}},
            [1, 'c'], "Contains a non-selected string value"
        ),
    ])
def test_deep_key_search(data, expected, dscr):
    assert list(deep_key_search(data, 'Foo')) == expected
I couldn't add a comment to the accepted solution proposed by @mgilston because of lack of reputation. The solution doesn't work if the key being searched for is inside a list.
Looping through the elements of the lists and calling the recursive function should extend the functionality to find elements inside nested lists:
def _finditem(obj, key):
    if key in obj: return obj[key]
    for k, v in obj.items():
        if isinstance(v,dict):
            item = _finditem(v, key)
            if item is not None:
                return item
        elif isinstance(v,list):
            for list_item in v:
                item = _finditem(list_item, key)
                if item is not None:
                    return item
print(_finditem({"C": {"B": [{"A":2}]}}, "A"))
I had to create a general-case version that finds a uniquely-specified key (a minimal dictionary that specifies the path to the desired value) in a dictionary that contains multiple nested dictionaries and lists.
For the example below, a target dictionary is created to search, and the key is created with the wildcard "???". When run, it returns the value "D"
from typing import Dict, List  # needed for the annotations below
def lfind(query_list:List, target_list:List, targ_str:str = "???"):
    for tval in target_list:
        #print("lfind: tval = {}, query_list[0] = {}".format(tval, query_list[0]))
        if isinstance(tval, dict):
            val = dfind(query_list[0], tval, targ_str)
            if val:
                return val
        elif tval == query_list[0]:
            return tval
def dfind(query_dict:Dict, target_dict:Dict, targ_str:str = "???"):
    for key, qval in query_dict.items():
        tval = target_dict[key]
        #print("dfind: key = {}, qval = {}, tval = {}".format(key, qval, tval))
        if isinstance(qval, dict):
            val = dfind(qval, tval, targ_str)
            if val:
                return val
        elif isinstance(qval, list):
            return lfind(qval, tval, targ_str)
        else:
            if qval == targ_str:
                return tval
            if qval != tval:
                break
def find(target_dict:Dict, query_dict:Dict):
    result = dfind(query_dict, target_dict)
    return result
target_dict = {"A":[
    {"key1":"A", "key2":{"key3": "B"}},
    {"key1":"C", "key2":{"key3": "D"}}]
}
query_dict = {"A":[{"key1":"C", "key2":{"key3": "???"}}]}
result = find(target_dict, query_dict)
print("result = {}".format(result))
Thought I'd throw my hat in the ring, this will allow for recursive requests on anything that implements a __getitem__ method.
def _get_recursive(obj, args, default=None):
    """Apply successive requests to an obj that implements __getitem__ and
    return result if something is found, else return default"""
    if not args:
        return obj
    try:
        key, *args = args
        # subscript the object directly; the base class object has no __getitem__
        _obj = obj[key]
        return _get_recursive(_obj, args, default=default)
    except (KeyError, IndexError, TypeError, AttributeError):
        return default

Checking a nested dictionary using a dot notation string "a.b.c.d.e", automatically create missing levels

Given the following dictionary:
d = {"a":{"b":{"c":"winning!"}}}
I have this string (from an external source, and I can't change this metaphor).
k = "a.b.c"
I need to determine if the dictionary has the key 'c', so I can add it if it doesn't.
This works swimmingly for retrieving a dot notation value:
reduce(dict.get, k.split("."), d)
but I can't figure out how to 'reduce' a has_key check or anything like that.
My ultimate problem is this: given "a.b.c.d.e", I need to create all the elements necessary in the dictionary, but not stomp them if they already exist.
You could use an infinite, nested defaultdict:
>>> from collections import defaultdict
>>> infinitedict = lambda: defaultdict(infinitedict)
>>> d = infinitedict()
>>> d['key1']['key2']['key3']['key4']['key5'] = 'test'
>>> d['key1']['key2']['key3']['key4']['key5']
'test'
Given your dotted string, here's what you can do:
>>> import operator
>>> keys = "a.b.c".split(".")
>>> lastplace = reduce(operator.getitem, keys[:-1], d)
>>> keys[-1] in lastplace
False
You can set a value:
>>> lastplace[keys[-1]] = "something"
>>> reduce(operator.getitem, keys, d)
'something'
>>> d['a']['b']['c']
'something'
... or using recursion:
def put(d, keys, item):
    if "." in keys:
        key, rest = keys.split(".", 1)
        if key not in d:
            d[key] = {}
        put(d[key], rest, item)
    else:
        d[keys] = item
def get(d, keys):
    if "." in keys:
        key, rest = keys.split(".", 1)
        return get(d[key], rest)
    else:
        return d[keys]
How about an iterative approach?
def create_keys(d, keys):
    for k in keys.split("."):
        if not k in d: d[k] = {}  #if the key isn't there yet add it to d
        d = d[k]  #go one level down and repeat
If you need the last key value to map to anything else than a dictionary you could pass the value as an additional argument and set this after the loop:
def create_keys(d, keys, value):
    keys = keys.split(".")
    for k in keys[:-1]:
        if not k in d: d[k] = {}
        d = d[k]
    d[keys[-1]] = value
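The question also asks how to test for the dotted key without creating anything. A small sketch, with `has_path` as an invented helper name, walks the path and stops at the first missing level:

```python
def has_path(d, dotted):
    """Return True if every key in the dotted path exists in nested dicts."""
    node = d
    for key in dotted.split("."):
        # stop as soon as a level is missing or is not a dictionary
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


d = {"a": {"b": {"c": "winning!"}}}
print(has_path(d, "a.b.c"))    # True
print(has_path(d, "a.b.x"))    # False
print(has_path(d, "a.b.c.d"))  # False, "winning!" is not a dict
```

Unlike a reduce over dict.get, this never raises when an intermediate level is absent, and it leaves `d` unchanged.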
I thought this discussion was very useful, but for my purpose to only get a value (not setting it), I ran into issues when a key was not present. So, just to add my flair to the options, you can use reduce in combination of an adjusted dict.get() to accommodate the scenario that the key is not present, and then return None:
from functools import reduce
import re
from typing import Any, Optional
def find_key(dot_notation_path: str, payload: dict) -> Any:
    """Try to get a deep value from a dict based on a dot-notation"""
    def get_despite_none(payload: Optional[dict], key: str) -> Any:
        """Try to get value from dict, even if dict is None"""
        if not payload or not isinstance(payload, (dict, list)):
            return None
        # can also access lists if needed, e.g., if key is '[1]'
        if (num_key := re.match(r"^\[(\d+)\]$", key)) is not None:
            try:
                return payload[int(num_key.group(1))]
            except IndexError:
                return None
        else:
            return payload.get(key, None)
    found = reduce(get_despite_none, dot_notation_path.split("."), payload)
    # compare to None, as the key could exist and be empty
    if found is None:
        raise KeyError()
    return found
In my use case, I need to find a key within an HTTP request payload, which can often include lists as well. The following examples work:
payload = {
    "haystack1": {
        "haystack2": {
            "haystack3": None,
            "haystack4": "needle"
        }
    },
    "haystack5": [
        {"haystack6": None},
        {"haystack7": "needle"}
    ],
    "haystack8": {},
}
find_key("haystack1.haystack2.haystack4", payload)
# "needle"
find_key("haystack5.[1].haystack7", payload)
# "needle"
find_key("[0].haystack5.[1].haystack7", [payload, None])
# "needle"
find_key("haystack8", payload)
# {}
find_key("haystack1.haystack2.haystack4.haystack99", payload)
# KeyError
EDIT: added list accessor
d = {"a":{}}
k = "a.b.c".split(".")
def f(d, i):
    if i >= len(k):
        return "winning!"
    c = k[i]
    d[c] = f(d.get(c, {}), i + 1)
    return d
print f(d, 0)
"{'a': {'b': {'c': 'winning!'}}}"

Python force dict entries to be utf-8

I spent the better part of an afternoon trying to patch dictionary objects to be utf-8 encoded in lieu of unicode. I am trying to find the fastest and best performing way to extend a dictionary object and ensure that its entries, both keys and values, are utf-8.
Here is what I have come up with, it does the job but I'm wondering what improvements could be made.
class UTF8Dict(dict):
    def __init__(self, *args, **kwargs):
        d = dict(*args, **kwargs)
        d = _decode_dict(d)
        super(UTF8Dict,self).__init__(d)
    def __setitem__(self,key,value):
        if isinstance(key,unicode):
            key = key.encode('utf-8')
        if isinstance(value,unicode):
            value = value.encode('utf-8')
        return super(UTF8Dict,self).__setitem__(key,value)
def _decode_list(data):
    rv = []
    for item in data:
        if isinstance(item, unicode):
            item = item.encode('utf-8')
        elif isinstance(item, list):
            item = _decode_list(item)
        elif isinstance(item, dict):
            item = _decode_dict(item)
        rv.append(item)
    return rv
def _decode_dict(data):
    rv = {}
    for key, value in data.iteritems():
        if isinstance(key, unicode):
            key = key.encode('utf-8')
        if isinstance(value, unicode):
            value = value.encode('utf-8')
        elif isinstance(value, list):
            value = _decode_list(value)
        elif isinstance(value, dict):
            value = _decode_dict(value)
        rv[key] = value
    return rv
Suggestions that improve any of the following would be very helpful:
Performance
Cover more edge-cases
Error handling
I agree with the comments that say that this may be misguided. That said, here are some holes in your current scheme:
d.setdefault can be used to add unicode objects to your dict:
>>> d = UTF8Dict()
>>> d.setdefault(u'x', u'y')
d.update can be used to add unicode objects to your dict:
>>> d = UTF8Dict()
>>> d.update({u'x': u'y'})
the list values contained in a dict could be modified to include unicode objects, using any standard list operations. E.g.:
>>> d = UTF8Dict(x=[])
>>> d['x'].append(u'x')
Why do you want to ensure that your data structure contains only utf-8 strings?

how to get the values from a dictionary in the sorted-order of the keys?

I have a problem with my Python class. It contains a method that goes through all the keys of a multi-dimensional dictionary. The dictionary keys may be in the following order (1->(2,3), 2->(5,6)). The problem is that when the method attempts to get the keys, sometimes it gets them in the right order (1,2) and sometimes in the wrong order (2,1). Any help will be appreciated. Below is a very simple example of what the code might look like:
class tree:
    tree_as_string = ""
    def __init__(self):
        self.id = ""
        self.daughters = {1 = 'node0', 2 = 'node1'}
    def get_as_string(self):
        s = ''
        for key in self.daughters:
            tree_as_string = s.join([tree_as_string, key])
        return tree_as_string
Note that dictionaries are unordered (since Python 3.7 they keep insertion order, which is still not sorted order), so to be sure that values are handled in sorted order you need to sort the keys first. A sample:
d={1:{2:'tst', 3:'tst2'}, 4:{...} }
for key in sorted(d):
    for skey in sorted(d[key]):
        pass  #do something
OR something like this:
from operator import itemgetter
d={1:{2:'tst', 3:'tst2'}, 4:{6:'tst7', 7:'tst12'} }
for key, val in sorted(d.items(), key=itemgetter(0)):
    for skey, sval in sorted(val.items(), key=itemgetter(0)):
        print key, skey, sval
This means that in your case:
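class tree(object):
    tree_as_string = ""
    def __init__(self):
        self.id = ""
        self.daughters = {1: 'node0', 2: 'node1'}  # dict literals use key: value
    def get_as_string(self):
        s = ''
        tree_as_string = ''  # start from an empty local string
        for key in sorted(self.daughters):
            # keys are ints, so convert them before joining
            tree_as_string = s.join([tree_as_string, str(key)])
        return tree_as_string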
You can use sorted (which I would suggest, because it shortens your code even further; an example is below), or just call sort on the keys. sort doesn't return a value; it sorts the list it is called on in place.
class tree:
    def __init__(self):
        self.id = ""
        self.daughters = {10: "test10", 2 : 'node2', 1 :'node1', 0 : 'node0'}
    def get_as_string_using_sorted(self):
        ''' Makes me happy'''
        return '->'.join(str(k) for k in sorted(self.daughters))
    def get_as_string(self):
        s = '->'
        # list() keeps this working on Python 3, where keys() is a view without sort()
        keys = list(self.daughters.keys())
        keys.sort()
        return s.join(str(k) for k in keys)
t = tree()
print t.get_as_string()
print t.get_as_string_using_sorted()
Side note: I changed your code a bit.
I fixed your dict syntax: it's k:v versus k=v.
I initialized tree_as_string = ""; you defined a class variable but never used it.
I added str(key) because key is an int.
Added more test numbers.
Changed s to '->'.
Simplified your for loop.
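The difference between the two approaches mentioned above, shown as a short Python 3 sketch with an illustrative `daughters` dictionary:

```python
daughters = {10: 'test10', 2: 'node2', 1: 'node1', 0: 'node0'}

# sorted() accepts any iterable and returns a new list.
print(sorted(daughters))   # [0, 1, 2, 10]

# list.sort() works in place and returns None, so keep the list in a variable.
keys = list(daughters)
result = keys.sort()
print(result)              # None
print(keys)                # [0, 1, 2, 10]
```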
