In Python 3, I need a function that dynamically returns a value from a nested key.
nesteddict = {'a':'a1','b':'b1','c':{'cn':'cn1'}}
print(nesteddict['c']['cn']) #gives cn1
def nestedvalueget(keys):
    print(nesteddict[keys])
nestedvalueget(['n']['cn'])
How should nestedvalueget be written?
I'm not sure the title is phrased well, but I don't know how else to describe this.
If you want to traverse dictionaries, use a loop:
def nestedvalueget(*keys):
    ob = nesteddict
    for key in keys:
        ob = ob[key]
    return ob
or use functools.reduce():
from functools import reduce
from operator import getitem
def nestedvalueget(*keys):
    return reduce(getitem, keys, nesteddict)
then use either version as:
nestedvalueget('c', 'cn')
Note that either version takes a variable number of arguments, letting you pass zero or more keys as positional arguments.
Demos:
>>> nesteddict = {'a':'a1','b':'b1','c':{'cn':'cn1'}}
>>> def nestedvalueget(*keys):
...     ob = nesteddict
...     for key in keys:
...         ob = ob[key]
...     return ob
...
>>> nestedvalueget('c', 'cn')
'cn1'
>>> from functools import reduce
>>> from operator import getitem
>>> def nestedvalueget(*keys):
...     return reduce(getitem, keys, nesteddict)
...
>>> nestedvalueget('c', 'cn')
'cn1'
And to clarify your error message: you passed the expression ['n']['cn'] to your function call. That expression creates a list with one element (['n']) and then tries to index it with the string 'cn'. List indices can only be integers (or slices):
>>> ['n']['cn']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str
>>> ['n'][0]
'n'
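A variation on the loop above, as a minimal sketch: it takes the dictionary as an explicit argument instead of reading the global nesteddict, and returns a fallback when the path cannot be followed. The name get_nested and the default parameter are hypothetical and not part of the original answer.

```python
def get_nested(mapping, *keys, default=None):
    """Follow keys one level at a time; return default if the path breaks."""
    ob = mapping
    for key in keys:
        try:
            ob = ob[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, out-of-range index, or a value that cannot be
            # indexed with this key (for example a string leaf).
            return default
    return ob


nesteddict = {'a': 'a1', 'b': 'b1', 'c': {'cn': 'cn1'}}
print(get_nested(nesteddict, 'c', 'cn'))
print(get_nested(nesteddict, 'c', 'missing', default='n/a'))
```

Passing the mapping explicitly keeps the function reusable for other dictionaries; swallowing the exceptions is a design choice that hides typos in key names, so raising may be preferable in some code.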
dict.update([other]) says
Update the dictionary with the key/value pairs from other, overwriting existing keys. Return None.
update() accepts either another dictionary object or an iterable of key/value pairs (as tuples or other iterables of length two). If keyword arguments are specified, the dictionary is then updated with those key/value pairs: d.update(red=1, blue=2).
But
>>> {}.update( ("key", "value") )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: dictionary update sequence element #0 has length 3; 2 is required
So why does Python apparently try to use the first string of the tuple?
The argument needs to be an iterable of tuples (or other iterables of length two), e.g. a list:
>>> d = {}
>>> d.update([("key", "value")])
>>> d
{'key': 'value'}
or a tuple. However, this fails, because the extra parentheses alone do not create an outer tuple:
>>> d = {}
>>> d.update((("key", "value")))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: dictionary update sequence element #0 has length 3; 2 is required
The Python documentation on tuple again solves this mystery:
Note that it is actually the comma which makes a tuple, not the parentheses. The parentheses are optional, except in the empty tuple case, or when they are needed to avoid syntactic ambiguity.
I.e. (None) is not a tuple at all, but (None,) is:
>>> type( (None,) )
<class 'tuple'>
So this works:
>>> d = {}
>>> d.update((("key", "value"),))
>>> d
{'key': 'value'}
>>>
You can't omit the parentheses because
>>> d = {}
>>> d.update(("key", "value"),)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: dictionary update sequence element #0 has length 3; 2 is required
that would be the syntactic ambiguity mentioned above: inside a call, the comma is the argument separator, so a trailing comma there does not create a tuple.
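To summarise the argument forms discussed above, here is a small sketch. The keys and values are made up for illustration. The last update() call shows why the error message talks about "length 3": every element of the iterable is itself treated as a key/value pair, and a two-character string happens to qualify as one.

```python
d = {}
d.update({"a": 1})        # another mapping
d.update([("b", 2)])      # a list of pairs
d.update((("c", 3),))     # a one-element tuple of pairs (note the comma)
d.update(e=5)             # keyword arguments
d.update(["xy"])          # any length-two iterable is a pair: 'x' -> 'y'
print(d)

try:
    # The tuple is iterated; its first element "key" has three characters,
    # so it cannot be unpacked into a (key, value) pair.
    {}.update(("key", "value"))
except ValueError as exc:
    print(exc)
```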
In my script I work with a large and complex object (a multi-dimensional list that contains strings, dictionaries, and instances of custom classes). I need to copy, pickle (cache) and unpickle it, as well as send it between child processes through an MPI interface. At some points I doubt that the data transfer is error-free, i.e. that I end up with the same object.
Therefore, I want to calculate its hash sum or some other kind of fingerprint. I know there is, for example, the hashlib library; however, it is limited in the object types it accepts:
>>> import hashlib
>>> a = "123"
>>> hashlib.sha224(a.encode()).hexdigest()
'78d8045d684abd2eece923758f3cd781489df3a48e1278982466017f'
>>> a = [1, 2, 3]
>>> hashlib.sha224(a).hexdigest()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object supporting the buffer API required
Thus, the question: is there some analog of this function that works with objects of any type?
One option would be to recursively convert all elements of the structure into hashable counterparts, i.e. lists into tuples, dicts and objects into frozensets, and then simply apply hash() to the whole thing. An illustration:
def to_hashable(s):
    # Recursively replace mutable containers with hashable equivalents.
    if isinstance(s, dict):
        return frozenset((x, to_hashable(y)) for x, y in s.items())
    if isinstance(s, list):
        return tuple(to_hashable(x) for x in s)
    if isinstance(s, set):
        return frozenset(s)
    if isinstance(s, MyObject):
        # Treat an object as its class name plus its attribute dictionary.
        d = {'__class__': s.__class__.__name__}
        d.update(s.__dict__)
        return to_hashable(d)
    return s
class MyObject:
    pass
class X(MyObject):
    def __init__(self, zzz):
        self.zzz = zzz
my_list = [
    1,
    {'a': [1,2,3], 'b': [4,5,6]},
    {1,2,3,4,5},
    X({1:2,3:4}),
    X({5:6,7:8})
]
print(hash(to_hashable(my_list)))
my_list2 = [
    1,
    {'b': [4,5,6], 'a': [1,2,3]},
    {5,4,3,2,1},
    X({3:4,1:2}),
    X({7:8,5:6})
]
print(hash(to_hashable(my_list2)))  # the same as above
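One caveat when comparing fingerprints across processes: in Python 3 the built-in hash() of str and bytes values is salted per interpreter process by default, so two MPI ranks can report different hash() values for identical data. A minimal sketch of a process-independent alternative follows, assuming the structure holds only dicts, lists, tuples, sets, plain objects with a __dict__, and leaves whose repr() is deterministic. The names canonical and fingerprint are hypothetical, and lists and tuples are deliberately treated alike.

```python
import hashlib


def canonical(obj):
    """Build a deterministic string form of a nested structure."""
    if isinstance(obj, dict):
        # Sort by the canonical text of the items so insertion order is ignored.
        items = sorted((canonical(k), canonical(v)) for k, v in obj.items())
        return "{" + ",".join(k + ":" + v for k, v in items) + "}"
    if isinstance(obj, (list, tuple)):
        return "[" + ",".join(canonical(x) for x in obj) + "]"
    if isinstance(obj, (set, frozenset)):
        return "<" + ",".join(sorted(canonical(x) for x in obj)) + ">"
    if hasattr(obj, "__dict__"):
        # Custom object: class name plus its attributes.
        return type(obj).__name__ + canonical(vars(obj))
    return repr(obj)


def fingerprint(obj):
    """SHA-256 hex digest of the canonical form."""
    return hashlib.sha256(canonical(obj).encode("utf-8")).hexdigest()


class X:
    def __init__(self, zzz):
        self.zzz = zzz


first = [1, {'a': [1, 2, 3], 'b': [4, 5, 6]}, {1, 2, 3}, X({1: 2, 3: 4})]
second = [1, {'b': [4, 5, 6], 'a': [1, 2, 3]}, {3, 2, 1}, X({3: 4, 1: 2})]
changed = [1, {'b': [4, 5, 6], 'a': [1, 2, 3]}, {3, 2, 1}, X({3: 4, 1: 9})]

print(fingerprint(first) == fingerprint(second))
print(fingerprint(first) == fingerprint(changed))
```

Because the digest is computed from text rather than from hash(), it does not depend on the per-process hash seed.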
pickle.dumps(...) returns a bytes object (a str in Python 2), which is hashable. You can use it as follows:
import pickle
a = [1, 2, 3, 4]
h = pickle.dumps(a)
print(hash(h))
# or like this
from hashlib import sha512
print(sha512(h).hexdigest())
c = pickle.loads(h)
assert c == a
I want to define a list containing an integer and a defaultdict in Python.
I am creating a parent dictionary that should return such a list for each key.
I am unable to define the list type.
def index_struct(): return defaultdict(list_struct)
def list_struct(): return list(int,post_struct)
def post_struct(): return defaultdict(list)
Currently I get an error because list() can't take two arguments.
Thanks in advance for the help.
You're right that list() only takes one argument. Use the square-bracket notation instead. Also note that [int, post_struct] on its own won't work, because nothing calls the two constructors. You need to call them yourself by adding parentheses:
from collections import defaultdict
def index_struct(): return defaultdict(list_struct)
def list_struct(): return [int(), post_struct()]
def post_struct(): return defaultdict(list)
>>> d = index_struct()
>>> d['somekey'][0] = 5
>>> d['somekey'][1]['anotherkey'] = 6
>>> d
defaultdict(<function list_struct at 0x10252ff50>, {'somekey': [5, defaultdict(<type 'list'>, {'anotherkey': 6})]})
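As a small check of the structure above, the sketch below shows that the factory runs once per missing key, so every entry gets its own fresh list and its own inner defaultdict. The key names 'first', 'second' and 'tags' are made up for illustration.

```python
from collections import defaultdict


def post_struct():
    return defaultdict(list)


def list_struct():
    # Called once for each missing key, so nothing is shared between entries.
    return [int(), post_struct()]


index = defaultdict(list_struct)
index['first'][0] += 1
index['first'][1]['tags'].append('x')
index['second'][0] += 10

print(index['first'][0], index['second'][0])      # 1 10
print(index['first'][1] is index['second'][1])    # False
```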
I'm coding an Nth-order Markov chain.
It goes something like this:
class Chain:
    def __init__(self, order):
        self.order = order
        self.state_table = {}
    def train(self, next_state, *prev_states):
        if len(prev_states) != self.order: raise ValueError("prev_states does not match chain order")
        if prev_states in self.state_table:
            if next_state in self.state_table[prev_states]:
                self.state_table[prev_states][next_state] += 1
            else:
                self.state_table[prev_states][next_state] = 0
        else:
            self.state_table[prev_states] = {next_state: 0}
Unfortunately, lists and tuples are unhashable, and I cannot use them as keys in dicts...
I hope I have explained my problem well enough for you to understand what I am trying to achieve.
Any good ideas for how I can use multiple values as a dictionary key?
Followup question:
I did not know that tuples are hashable.
But the entropy of the hashes seems low. Are hash collisions possible for tuples?
Tuples are hashable when their contents are.
>>> a = {}
>>> a[(1,2)] = 'foo'
>>> a[(1,[])]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
As for collisions, when I try a bunch of very similar tuples, I see them being mapped widely apart:
>>> hash((1,2))
3713081631934410656
>>> hash((1,3))
3713081631933328131
>>> hash((2,2))
3713082714462658231
>>> abs(hash((1,2)) - hash((1,3)))
1082525
>>> abs(hash((1,2)) - hash((2,2)))
1082528247575
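Applied to the chain from the question, a minimal sketch could look like this. It relies on the fact that *prev_states already arrives as a tuple. Using defaultdict(Counter) is a swapped-in simplification rather than the asker's code, and it counts the first observation as 1, whereas the original stores 0 for it.

```python
from collections import Counter, defaultdict


class Chain:
    def __init__(self, order):
        self.order = order
        # Maps a tuple of previous states to a Counter of next states.
        self.state_table = defaultdict(Counter)

    def train(self, next_state, *prev_states):
        if len(prev_states) != self.order:
            raise ValueError("prev_states does not match chain order")
        # prev_states is a tuple, so it can be used as a key directly
        # as long as every state in it is hashable.
        self.state_table[prev_states][next_state] += 1


chain = Chain(2)
chain.train('c', 'a', 'b')
chain.train('c', 'a', 'b')
chain.train('d', 'a', 'b')
print(chain.state_table[('a', 'b')])   # Counter({'c': 2, 'd': 1})
```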
You can use tuples as dictionary keys; they are hashable as long as their contents are hashable (as @larsman said).
Don't worry about collisions; Python's dict takes care of them by comparing keys for equality whenever two hashes match.
>>> hash('a')
12416037344
>>> hash(12416037344)
12416037344
>>> hash('a') == hash(12416037344)
True
>>> {'a': 'one', 12416037344: 'two'}
{'a': 'one', 12416037344: 'two'}
In this example I took a string and an integer, but it works the same way with tuples; I just didn't have any idea how to find two tuples with identical hashes.
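For what it's worth, one way to get two tuples with the same hash, sketched under the assumption that the interpreter is CPython: there hash(-1) and hash(-2) both evaluate to -2, and a tuple's hash is derived from the hashes of its elements. Other implementations may behave differently, so the first print is left without an expected value.

```python
# CPython detail (assumption): hash(-1) == hash(-2) == -2, so these two
# one-element tuples are expected to share a hash there.
t1, t2 = (-1,), (-2,)
print(hash(t1) == hash(t2))

# The dict still keeps both keys, because colliding keys are compared with ==.
d = {t1: 'one', t2: 'two'}
print(len(d), d[t1], d[t2])   # 2 one two
```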
Folks,
Relative n00b to Python, trying to find the diff of two lists of dictionaries.
If these were just regular lists, I could create sets and then do a '-' (difference) or intersection operation.
However, set operations do not work on lists of dictionaries:
>>> l = []
>>> pool1 = {}
>>> l.append(pool1)
>>> s = set(l)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
You need a "hashable" dictionary.
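The items() method returns the dictionary's key/value pairs as tuples. Sort them and pass the result to tuple() and you have a hashable version of the dictionary (provided the keys can be sorted and the values are hashable too).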
tuple( sorted( some_dict.items() ) )
You can define your own dict subclass that defines a __hash__ method:
class HashableDict(dict):
    def __hash__(self):
        return hash(tuple(sorted(self.items())))
This wrapper is safe as long as you do not modify the dictionary while finding the intersection.
Python won't allow you to use a dictionary as a key in either a set or dictionary because it has no default __hash__ method defined. Unfortunately, collections.OrderedDict is also not hashable. There also isn't a built-in dictionary analogue to frozenset. You can either create a subclass of dict with your own hash method, or do something like this:
>>> def dict_item_set(dict_list):
...     return set(tuple(sorted(d.items())) for d in dict_list)
...
>>> a = [{1:2}, {3:4}]
>>> b = [{3:4}, {5:6}]
>>> [dict(items) for items in dict_item_set(a) - dict_item_set(b)]
[{1: 2}]
>>> [dict(items) for items in dict_item_set(a) & dict_item_set(b)]
[{3: 4}]
Of course, this is neither efficient nor pretty.
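A somewhat tidier variant, as a sketch only: frozenset(d.items()) gives a hashable stand-in for each dictionary without sorting, so keys of mixed types also work, and the result keeps the original dict objects in their original order. It still assumes all values are hashable. The function name diff_dicts is hypothetical.

```python
def diff_dicts(left, right):
    """Return the dicts in left that have no equal dict in right."""
    seen = {frozenset(d.items()) for d in right}
    return [d for d in left if frozenset(d.items()) not in seen]


a = [{1: 2}, {3: 4}, {'x': 1, 2: 'y'}]
b = [{3: 4}, {5: 6}]
print(diff_dicts(a, b))   # [{1: 2}, {'x': 1, 2: 'y'}]
print(diff_dicts(b, a))   # [{5: 6}]
```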