Efficiently accessing arbitrarily deep dictionaries - python

Suppose I have a multi-level dictionary like this
mydict = {
'first': {
'second': {
'third': {
'fourth': 'the end'
}
}
}
}
I'd like to access it like this
test = get_entry(mydict, 'first.second.third.fourth')
What I have so far is
def get_entry(dict, keyspec):
    keys = keyspec.split('.')
    result = dict[keys[0]]
    for key in keys[1:]:
        result = result[key]
    return result
Are there more efficient ways to do it? According to %timeit the runtime of the function is 1.26us, while accessing the dictionary the standard way like this
foo = mydict['first']['second']['third']['fourth']
takes 541ns. I'm looking for ways to trim it to 800ns range if possible.
Thanks

I got a 20% performance boost by tightening up the code a bit, and a speedup of more than 3x by caching the split strings. The cache only makes a difference if you use the same spec multiple times. Here are sample implementations and a profile script to test.
test.py
mydict = {
'first': {
'second': {
'third': {
'fourth': 'the end'
}
}
}
}
# original
def get_entry(dict, keyspec):
    keys = keyspec.split('.')
    result = dict[keys[0]]
    for key in keys[1:]:
        result = result[key]
    return result
# tighten up code
def get_entry_2(mydict, keyspec):
    for key in keyspec.split('.'):
        mydict = mydict[key]
    return mydict
# use a cache
cache = {}
def get_entry_3(mydict, keyspec):
    global cache
    try:
        spec = cache[keyspec]
    except KeyError:
        spec = tuple(keyspec.split('.'))
        cache[keyspec] = spec
    for key in spec:
        mydict = mydict[key]
    return mydict
if __name__ == "__main__":
    test = get_entry(mydict, 'first.second.third.fourth')
    print(test)
profile.py
from timeit import timeit
print("original get_entry")
print(timeit("get_entry(mydict, 'first.second.third.fourth')",
setup="from test import get_entry, mydict"))
print("get_entry_2 with tighter code")
print(timeit("get_entry_2(mydict, 'first.second.third.fourth')",
setup="from test import get_entry_2, mydict"))
print("get_entry_3 with cache of split spec")
print(timeit("get_entry_3(mydict, 'first.second.third.fourth')",
setup="from test import get_entry_3, mydict"))
print("just splitting a spec")
print(timeit("x.split('.')", setup="x='first.second.third.fourth'"))
The timing on my machine is
original get_entry
4.148535753000033
get_entry_2 with tighter code
3.2986323120003362
get_entry_3 with cache of split spec
1.3073233439990872
just splitting a spec
1.0949148639992927
Notice that splitting the spec is a comparatively expensive operation for this function. That's why caching helps.
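The hand-rolled cache above can also be written with the standard library's functools.lru_cache. This is only a sketch of the same idea, not a measured improvement; the names split_spec and get_entry_cached are made up for illustration.

```python
from functools import lru_cache

# Hypothetical helper: memoize the split so each distinct spec is parsed once.
@lru_cache(maxsize=None)
def split_spec(keyspec):
    # Return a tuple so the cached value cannot be mutated by callers.
    return tuple(keyspec.split('.'))

# Hypothetical name; same lookup loop as get_entry_2, minus the repeated split.
def get_entry_cached(mapping, keyspec):
    for key in split_spec(keyspec):
        mapping = mapping[key]
    return mapping
```

As with the explicit cache, this only pays off when the same spec strings are looked up repeatedly.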

There's really only one solution. Rebuild your dictionary. But do it just once.
def recursive_flatten(mydict):
    d = {}
    for k, v in mydict.items():
        if isinstance(v, dict):
            for k2, v2 in recursive_flatten(v).items():
                d[k + '.' + k2] = v2
        else:
            d[k] = v
    return d
In [786]: new_dict = recursive_flatten(mydict); new_dict
Out[786]: {'first.second.third.fourth': 'the end'}
(Some more tests)
In [788]: recursive_flatten({'x' : {'y' : 1, 'z' : 2}, 'y' : {'a' : 5}, 'z' : 2})
Out[788]: {'x.y': 1, 'x.z': 2, 'y.a': 5, 'z': 2}
In [789]: recursive_flatten({'x' : 1, 'y' : {'x' : 234}})
Out[789]: {'x': 1, 'y.x': 234}
Every access becomes constant time from here on.
Now, just access your value using new_dict['first.second.third.fourth']. Should work for any arbitrarily nested dictionary that does not contain a self-reference.
Note that every solution has its tradeoffs, and this one is no exception. If you're firing millions of queries at your data, so that the one-time preprocessing is an acceptable overhead, then this is the way to go. The other solutions only sidestep the problem instead of addressing it, which is dealing with the dictionary's structure. On the other hand, if you're going to run a single query on each of many similar data structures, it makes no sense to preprocess each one, in which case you may prefer one of the other solutions.
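If the nesting can get deep enough to hit Python's recursion limit, the same flattening can be done with an explicit stack. The sketch below is one possible variant, not the answer's own code; flatten_iterative and its sep parameter are hypothetical names. Like the recursive version, it assumes string keys, drops empty sub-dictionaries, and does not guard against self-references or keys that already contain the separator.

```python
def flatten_iterative(nested, sep='.'):
    """Flatten a nested dict into {'a.b.c': value} without recursion."""
    flat = {}
    # Each stack entry is (path prefix or None for the top level, dict to scan).
    stack = [(None, nested)]
    while stack:
        prefix, node = stack.pop()
        for key, value in node.items():
            path = key if prefix is None else prefix + sep + key
            if isinstance(value, dict):
                stack.append((path, value))  # visit the sub-dict later
            else:
                flat[path] = value
    return flat
```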

I updated the answer from How to use a dot "." to access members of dictionary? to use an initial conversion which will then work for nested dictionaries:
You can use the following class to allow dot-indexing of dictionaries:
class dotdict(dict):
    """dot.notation access to dictionary attributes"""
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
However, this only supports nesting if all nested dictionaries are also of type dotdict. That's where the following helper function comes in:
def dct_to_dotdct(d):
    if isinstance(d, dict):
        d = dotdict({k: dct_to_dotdct(v) for k, v in d.items()})
    return d
This function has to be run once on your nested dictionary, and the result can then be indexed using dot-indexing.
Here are some examples:
In [13]: mydict
Out[13]: {'first': {'second': {'third': {'fourth': 'the end'}}}}
In [14]: mydict = dct_to_dotdct(mydict)
In [15]: mydict.first.second
Out[15]: {'third': {'fourth': 'the end'}}
In [16]: mydict.first.second.third.fourth
Out[16]: 'the end'
A note about performance: this answer is slow compared to standard dictionary access. I just wanted to present an option that actually uses "dot access" to a dictionary.
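One behaviour of the dotdict class worth knowing: because __getattr__ is dict.get, a missing key returns None instead of raising. If you would rather get an AttributeError, a variant along these lines might do; StrictDotDict is a hypothetical name, and this is a sketch rather than part of the original answer.

```python
class StrictDotDict(dict):
    """Dot access that raises AttributeError for missing keys."""

    def __getattr__(self, name):
        # __getattr__ is only called after normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
```

Raising AttributeError also keeps hasattr() and getattr() with a default behaving the way callers usually expect.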

Here is a solution similar to chrisz's, but you do not have to do anything to your dict a priori:
class dictDotter(dict):
    def __getattr__(self,key):
        val = self[key]
        return val if type(val) != dict else dictDotter(val)
and just x = dictDotter(originalDict) will let you have arbitrary dot getting (x.first.second...). I'll note this is twice as slow as chrisz's solution, and his is 9 times as slow as yours (on my machine, approximately).
So, if you insist on making this work, @tdelaney seems to have provided the only real performance improvement.
Another option that does better than what you have (in terms of run time):
class dictObjecter:
    def __init__(self,adict):
        for k,v in adict.items():
            self.__dict__[k] = v
            if type(v) == dict: self.__dict__[k] = dictObjecter(v)
which will make an object out of your dict, so dot notation works as usual. This makes lookups about 3 times faster than what you have, so not bad, but at the cost of going over your dict and replacing it with something else.
Here is the total testing code:
from timeit import timeit
class dictObjecter:
    def __init__(self,adict):
        for k,v in adict.items():
            self.__dict__[k] = v
            if type(v) == dict: self.__dict__[k] = dictObjecter(v)
class dictDotter(dict):
    def __getattr__(self,key):
        val = self[key]
        return val if type(val) != dict else dictDotter(val)
def get_entry(dict, keyspec):
    keys = keyspec.split('.')
    result = dict[keys[0]]
    for key in keys[1:]:
        result = result[key]
    return result
class dotdict(dict):
    """dot.notation access to dictionary attributes"""
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
def dct_to_dotdct(d):
    if isinstance(d, dict):
        d = dotdict({k: dct_to_dotdct(v) for k, v in d.items()})
    return d
x = {'a':{'b':{'c':{'d':1}}}}
y = dictDotter(x)
z = dct_to_dotdct(x)
w = dictObjecter(x)
print('{:15} : {}'.format('dict dotter',timeit('y.a.b.c.d',globals=locals(),number=1000)))
print('{:15} : {}'.format('dot dict',timeit('z.a.b.c.d',globals=locals(),number=1000)))
print('{:15} : {}'.format('dict objecter',timeit('w.a.b.c.d',globals=locals(),number=1000)))
print('{:15} : {}'.format('original',timeit("get_entry(x,'a.b.c.d')",globals=locals(),number=1000)))
print('{:15} : {:.20f}'.format('best ref',timeit("x['a']['b']['c']['d']",globals=locals(),number=1000)))
I provided the last regular lookup as a best reference. The results on a Windows Ubuntu subsystem:
dict dotter : 0.0035500000003594323
dot dict : 0.0017939999997906853
dict objecter : 0.00021699999979318818
original : 0.0006629999998040148
best ref : 0.00007999999979801942
So the objectified dict is about 3 times as slow as a regular dictionary lookup. If speed is important, why would you want this?

I had the same need, so I created the Prodict.
For your case, you can do it in one line:
mydict = {
'first': {
'second': {
'third': {
'fourth': 'the end'
}
}
}
}
dotdict = Prodict.from_dict(mydict)
print(dotdict.first.second.third.fourth) # "the end"
After that, use dotdict just like a dict, because it is a subclass of dict:
dotdict.first == dotdict['first'] # True
You can also add more keys dynamically with dot notation:
dotdict.new_key = 'hooray'
print(dotdict.new_key) # "hooray"
It works even if the new keys are nested dictionaries:
dotdict.it = {'just': 'works'}
print(dotdict.it.just) # "works"
Lastly, if you define your keys beforehand, you get auto completion and auto type conversion:
class User(Prodict):
    user_id: int
    name: str
user = User(user_id="1", name="Ramazan")
type(user.user_id) # <class 'int'>
# IDE will be able to auto complete 'user_id' and 'name' properties
UPDATE:
This is the test result for the same code written by @kabanus:
x = {'a': {'b': {'c': {'d': 1}}}}
y = dictDotter(x)
z = dct_to_dotdct(x)
w = dictObjecter(x)
p = Prodict.from_dict(x)
print('{:15} : {}'.format('dict dotter', timeit('y.a.b.c.d', globals=locals(), number=10000)))
print('{:15} : {}'.format('prodict', timeit('p.a.b.c.d', globals=locals(), number=10000)))
print('{:15} : {}'.format('dot dict', timeit('z.a.b.c.d', globals=locals(), number=10000)))
print('{:15} : {}'.format('dict objecter', timeit('w.a.b.c.d', globals=locals(), number=10000)))
print('{:15} : {}'.format('original', timeit("get_entry(x,'a.b.c.d')", globals=locals(), number=10000)))
print('{:15} : {:.20f}'.format('prodict getitem', timeit("p['a']['b']['c']['d']", globals=locals(), number=10000)))
print('{:15} : {:.20f}'.format('best ref', timeit("x['a']['b']['c']['d']", globals=locals(), number=10000)))
And results:
dict dotter : 0.04535976458466595
prodict : 0.02860781018446784
dot dict : 0.019078164088831673
dict objecter : 0.0017378700050722368
original : 0.006594238310349346
prodict getitem : 0.00510931794975705289
best ref : 0.00121740293554022105
As you can see, its performance is between "dict dotter" and "dot dict".
Any performance enhancement suggestion will be appreciated.

The code should be less iterative and more dynamic!!
data
mydict = {
'first': {
'second': {
'third': {
'fourth': 'the end'
}
}
}
}
Function
def get_entry(dict, keyspec):
    for keys in keyspec.split('.'):
        dict = dict[keys]
    return dict
call the function
res = get_entry(mydict, 'first.second.third.fourth')
This takes less time to execute, even though the key path is still resolved dynamically.

You can use reduce (functools.reduce in Python 3):
from functools import reduce  # also available on Python 2.6+
import operator
def get_entry(dct, keyspec):
    return reduce(operator.getitem, keyspec.split('.'), dct)
It looks nicer, but performs slightly worse.
Your version timeit:
>>> timeit("get_entry_original(mydict, 'first.second.third.fourth')",
"from __main__ import get_entry_original, mydict", number=1000000)
0.5646841526031494
with reduce:
>>> timeit("get_entry(mydict, 'first.second.third.fourth')",
"from __main__ import get_entry, mydict")
0.6140949726104736
As tdelaney noticed, split consumes almost as much CPU time as looking up the keys in the dict:
def split_keys(keyspec):
    keys = keyspec.split('.')
timeit("split_keys('first.second.third.fourth')",
"from __main__ import split_keys")
0.28857898712158203
Just move string splitting away from get_entry function:
def get_entry(dct, keyspec_list):
    return reduce(operator.getitem, keyspec_list, dct)
timeit("get_entry(mydict, ['first', 'second', 'third', 'fourth'])",
"from __main__ import get_entry, mydict")
0.37825703620910645
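If callers still want to write dotted specs, one way to keep the split out of the hot path is to parse the spec once and return a reusable getter. The sketch below assumes that pattern fits your code; make_getter is a hypothetical name, and no timing is claimed for it.

```python
from functools import reduce
import operator

def make_getter(keyspec):
    """Parse the dotted spec once and return a function that does the lookups."""
    keys = tuple(keyspec.split('.'))  # split happens here, not per lookup

    def getter(mapping):
        return reduce(operator.getitem, keys, mapping)

    return getter
```

You would build the getter once, for example get_fourth = make_getter('first.second.third.fourth'), and then call get_fourth(mydict) wherever the value is needed.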

Related

python: couple of values to be called reciprocally? [duplicate]

This question already has answers here:
How to implement an efficient bidirectional hash table?
(8 answers)
Closed 2 years ago.
I'm doing this switchboard thing in python where I need to keep track of who's talking to whom, so if Alice --> Bob, then that implies that Bob --> Alice.
Yes, I could populate two hash maps, but I'm wondering if anyone has an idea to do it with one.
Or suggest another data structure.
There are no multiple conversations. Let's say this is for a customer service call center, so when Alice dials into the switchboard, she's only going to talk to Bob. His replies also go only to her.
You can create your own dictionary type by subclassing dict and adding the logic that you want. Here's a basic example:
class TwoWayDict(dict):
    def __setitem__(self, key, value):
        # Remove any previous connections with these values
        if key in self:
            del self[key]
        if value in self:
            del self[value]
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)
    def __delitem__(self, key):
        dict.__delitem__(self, self[key])
        dict.__delitem__(self, key)
    def __len__(self):
        """Returns the number of connections"""
        return dict.__len__(self) // 2
And it works like so:
>>> d = TwoWayDict()
>>> d['foo'] = 'bar'
>>> d['foo']
'bar'
>>> d['bar']
'foo'
>>> len(d)
1
>>> del d['foo']
>>> d['bar']
Traceback (most recent call last):
File "<stdin>", line 7, in <module>
KeyError: 'bar'
I'm sure I didn't cover all the cases, but that should get you started.
In your special case you can store both in one dictionary:
relation = {}
relation['Alice'] = 'Bob'
relation['Bob'] = 'Alice'
Since what you are describing is a symmetric relationship. A -> B => B -> A
I know it's an older question, but I wanted to mention another great solution to this problem, namely the python package bidict. It's extremely straight forward to use:
from bidict import bidict
map = bidict(Bob = "Alice")
print(map["Bob"])
print(map.inv["Alice"])
I would just populate a second hash, with
reverse_map = dict((reversed(item) for item in forward_map.items()))
Two hash maps is actually probably the fastest-performing solution, assuming you can spare the memory. I would wrap those in a single class. The burden on the programmer is in ensuring that the two hash maps stay in sync.
A less verbose way, still using reversed:
dict(map(reversed, my_dict.items()))
You have two separate issues.
You have a "Conversation" object. It refers to two Persons. Since a Person can have multiple conversations, you have a many-to-many relationship.
You have a Map from Person to a list of Conversations. A Conversation will have a pair of Persons.
Do something like this
from collections import defaultdict
switchboard= defaultdict( list )
x = Conversation( "Alice", "Bob" )
y = Conversation( "Alice", "Charlie" )
for c in ( x, y ):
switchboard[c.p1].append( c )
switchboard[c.p2].append( c )
No, there is really no way to do this without creating two dictionaries. How would it be possible to implement this with just one dictionary while continuing to offer comparable performance?
You are better off creating a custom type that encapsulates two dictionaries and exposes the functionality you want.
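As a rough illustration of such a wrapper, the sketch below keeps a forward and a backward dict in step. The class name BiMap and its method names are invented for this example, and it assumes both keys and values are hashable.

```python
class BiMap:
    """Hypothetical wrapper that keeps two dicts in sync (one-to-one pairs)."""

    def __init__(self):
        self._forward = {}   # key -> value
        self._backward = {}  # value -> key

    def add(self, key, value):
        # Drop any stale pair involving this key or this value first,
        # so the mapping stays one-to-one.
        if key in self._forward:
            del self._backward[self._forward[key]]
        if value in self._backward:
            del self._forward[self._backward[value]]
        self._forward[key] = value
        self._backward[value] = key

    def value_for(self, key):
        return self._forward[key]

    def key_for(self, value):
        return self._backward[value]

    def remove(self, key):
        value = self._forward.pop(key)
        del self._backward[value]

    def __len__(self):
        return len(self._forward)
```

Because all writes go through add and remove, the two dictionaries cannot drift apart, which is the synchronization burden mentioned in the answers above.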
You may be able to use a DoubleDict as shown in recipe 578224 on the Python Cookbook.
Another possible solution is to implement a subclass of dict that holds the original dictionary and keeps track of a reversed version of it. Keeping two separate dicts can be useful if keys and values overlap.
class TwoWayDict(dict):
    def __init__(self, my_dict):
        dict.__init__(self, my_dict)
        self.rev_dict = {v : k for k,v in my_dict.items()}
    def __setitem__(self, key, value):
        dict.__setitem__(self, key, value)
        self.rev_dict.__setitem__(value, key)
    def pop(self, key):
        self.rev_dict.pop(self[key])
        dict.pop(self, key)
    # The above is just an idea; other methods
    # should also be overridden.
Example:
>>> d = {'a' : 1, 'b' : 2} # suppose we need to use d and its reversed version
>>> twd = TwoWayDict(d) # create a two-way dict
>>> twd
{'a': 1, 'b': 2}
>>> twd.rev_dict
{1: 'a', 2: 'b'}
>>> twd['a']
1
>>> twd.rev_dict[2]
'b'
>>> twd['c'] = 3 # we add to twd and reversed version also changes
>>> twd
{'a': 1, 'c': 3, 'b': 2}
>>> twd.rev_dict
{1: 'a', 2: 'b', 3: 'c'}
>>> twd.pop('a') # we pop elements from twd and reversed version changes
>>> twd
{'c': 3, 'b': 2}
>>> twd.rev_dict
{2: 'b', 3: 'c'}
There's the collections-extended library on pypi: https://pypi.python.org/pypi/collections-extended/0.6.0
Using the bijection class is as easy as:
RESPONSE_TYPES = bijection({
0x03 : 'module_info',
0x09 : 'network_status_response',
0x10 : 'trust_center_device_update'
})
>>> RESPONSE_TYPES[0x03]
'module_info'
>>> RESPONSE_TYPES.inverse['network_status_response']
0x09
I like the suggestion of bidict in one of the comments.
pip install bidict
Usage:
# This normalization method should save hugely as aDaD ~ yXyX have the same form of smallest grammar.
# To get back to your grammar's alphabet use trans
def normalize_string(s, nv=None):
    if nv is None:
        nv = ord('a')
    trans = bidict()
    r = ''
    for c in s:
        if c not in trans.inverse:
            a = chr(nv)
            nv += 1
            trans[a] = c
        else:
            a = trans.inverse[c]
        r += a
    return r, trans
def translate_string(s, trans):
    res = ''
    for c in s:
        res += trans[c]
    return res
if __name__ == "__main__":
    s = "bnhnbiodfjos"
    n, tr = normalize_string(s)
    print(n)
    print(tr)
    print(translate_string(n, tr))
There aren't many docs about it, but I've got all the features I need from it working correctly.
Prints:
abcbadefghei
bidict({'a': 'b', 'b': 'n', 'c': 'h', 'd': 'i', 'e': 'o', 'f': 'd', 'g': 'f', 'h': 'j', 'i': 's'})
bnhnbiodfjos
The kjbuckets C extension module provides a "graph" data structure which I believe gives you what you want.
Here's one more two-way dictionary implementation by extending pythons dict class in case you didn't like any of those other ones:
class DoubleD(dict):
    """ Access and delete dictionary elements by key or value. """
    def __getitem__(self, key):
        if key not in self:
            inv_dict = {v:k for k,v in self.items()}
            return inv_dict[key]
        return dict.__getitem__(self, key)
    def __delitem__(self, key):
        if key not in self:
            inv_dict = {v:k for k,v in self.items()}
            dict.__delitem__(self, inv_dict[key])
        else:
            dict.__delitem__(self, key)
Use it as a normal python dictionary except in construction:
dd = DoubleD()
dd['foo'] = 'bar'
A way I like to do this kind of thing is something like:
{my_dict[key]: key for key in my_dict.keys()}

Python Conditional Operator Usage

I'm trying to better understand a way to implement a conditional variable into a python requests' http request.
I'm doing the following:
def make_post(parameter1_value, parameter2='None'):
    payload = {
        "parameter1": parameter1_value,
        "parameter2": "None"
    }
    r = requests.post('https://myrequesturl.com/location/', params=payload)
I'm trying to find the best way to NOT include the "parameter2": "None" value in the params unless the value is not equal to None.
I realize I could have several conditional statements to select a proper format of request depending on the parameters, but then it would become sticky to scale for each additional parameterN needed in the function.
I'm wondering if there is a way to include a conditional type of variable in the payload that would have the effect of including that parameter only if it isn't equal to the default value set; in this case 'None'.
So if the value of parameter2 = None:
payload = {
"parameter1": parameter1_value
}
But if the value of parameter2 = anything other than None
payload = {
"parameter1": parameter1_value,
"parameter2": "Non_default_value"
}
I'm trying to avoid the following type of approach:
if some_parameter != 'None':
    payload = {
        "one": "arrangement"
    }
if some_some_other_parameter != 'None':
    payload = {
        "one": "arrangement"
    }
It seems a bit impractical in an example of only two parameters, but if I were to have a function with many parameters it would seem that a one line include/exclude type expression would greatly reduce the amount of overall code required. It's been my experience when I can't figure out how to get python to do something clever it's only because I don't know how, and not that python can't.
Something like this?
def foo(v1, v2=None):
    params = {k:v for k,v in locals().items() if v is not None}
    print(params)
foo('hello')
foo('hello', 'world')
Output:
{'v1': 'hello'}
{'v1': 'hello', 'v2': 'world'}
Or
Something like this?
def foo(a={}, url=''):
    params = {k:v for k,v in a.items() if v is not None}
    print(params)
foo({'k1':'v1','k2':2,'k3':None, 'k4':'v4'})
foo()
Output:
{'k2': 2, 'k1': 'v1', 'k4': 'v4'}
{}
I think something like this would work for you, especially if the default values are specific to the key:
def optional_include_mut(payload, key, value, default=None):
    """
    Mutate the given payload dict by adding the given key/value pair if the value is not equal to default.
    :param payload: Dict to mutate
    :param key: Key to insert into dict
    :param value: Value to insert into dict
    :param default: Default value to control insertion
    :return: A mutated version of the payload dict
    """
    if value != default:
        payload[key] = value
    return payload
>>> payload = dict(good=1)
>>> payload = optional_include_mut(payload, 'foo', 'bar', default='bar')
{'good': 1}
>>> payload = optional_include_mut(payload, 'foo', 'bar', default='not bar')
{'good': 1, 'foo': 'bar'}
You could use a dictionary comprehension to filter the values:
>>> payload = dict(a='value1',b='value2',c='None',d='None',e='value3')
>>> payload
{'a': 'value1', 'b': 'value2', 'c': 'None', 'd': 'None', 'e': 'value3'}
>>> {k:v for k,v in payload.items() if v != 'None'}
{'a': 'value1', 'b': 'value2', 'e': 'value3'}
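Applied to the function in the question, the same filter could sit in a small helper that builds the payload from keyword arguments. This is a sketch under the assumption that None (the object, not the string 'None') marks "not supplied"; build_payload is a hypothetical name, and the requests call is left as a comment so the block runs without network access.

```python
def build_payload(**params):
    """Keep only the parameters that were actually given a value."""
    return {key: value for key, value in params.items() if value is not None}


def make_post(parameter1_value, parameter2=None):
    payload = build_payload(parameter1=parameter1_value, parameter2=parameter2)
    # In real code this is where the request from the question would go:
    # r = requests.post('https://myrequesturl.com/location/', params=payload)
    return payload
```

Adding another optional parameterN then only means adding one keyword argument to the build_payload call.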
how about something like this
>>> def fun(*param):
return {"parameter{}".format(k): v for k,v in enumerate(param,1) if v not in (None,"None")}
>>> fun("hello","None","world",None,42)
{'parameter1': 'hello', 'parameter5': 42, 'parameter3': 'world'}
>>>
(*param is how you say you want a variable number of arguments.)
This way you can pass as many parameters as you want and filter out those with invalid values.
With Python 3 syntax you can also write it as
def fun(*param, invalid=(None,"None")):
    return {"parameter{}".format(k): v for k,v in enumerate(param,1) if v not in invalid}
In Python 2, the equivalent would be
def fun(*param, **karg):
    invalid = karg.get("invalid", (None,"None") )
    return {"parameter{}".format(k): v for k,v in enumerate(param,1) if v not in invalid}
another example
>>> fun("hello","None","world",None,42,32, invalid=("None",None,32))
{'parameter1': 'hello', 'parameter5': 42, 'parameter3': 'world'}
>>>

Is it possible to turn a list into a nested dict of keys *without* recursion?

Supposing I had a list as follows:
mylist = ['a','b','c','d']
Is it possible to create, from this list, the following dict without using recursion/a recursive function?
{
'a': {
'b': {
'c': {
'd': { }
}
}
}
}
For the simple case, simply iterate and build, either from the end or the start:
result = {}
for name in reversed(mylist):
    result = {name: result}
or
result = current = {}
for name in mylist:
    current[name] = {}
    current = current[name]
The first solution can also be expressed as a one-liner using reduce():
reduce(lambda res, name: {name: res}, reversed(mylist), {})
For this simple case at least, yes:
my_list = ['a', 'b', 'c', 'd']
cursor = built_dict = {}
for value in my_list:
    cursor[value] = {}
    cursor = cursor[value]
Or, for fanciness and reduced readability:
built_dict = reduce(lambda x, y: {y: x}, reversed(my_list), {})
It's worth mentioning that every recursion can be converted into iteration, although sometimes that might not be so easy. For the particular example in the question, it is simple enough, it's just a matter of accumulating the expected result in a variable and traversing the input list in the appropriate order. This is what I mean:
def convert(lst):
    acc = {}
    for e in reversed(lst):
        acc = {e: acc}
    return acc
Or even shorter, the above algorithm can be expressed as a one-liner (assuming Python 2.x, in Python 3.x reduce was moved to the functools module). Notice how the variable names in the previous solution correspond to the lambda's parameters, and how in both cases the initial value of the accumulator is {}:
def convert(lst):
    return reduce(lambda acc, e: {e: acc}, reversed(lst), {})
Either way, the function convert works as expected:
mylist = ['a','b','c','d']
convert(mylist)
=> {'a': {'b': {'c': {'d': {}}}}}
mydict = dict()
currentDict = mydict
for el in mylist:
    currentDict[el] = dict()
    currentDict = currentDict[el]

Combine two dictionaries of dictionaries (Python)

Is there an easy way to combine two dictionaries of dictionaries in Python? Here's what I need:
dict1 = {'A' : {'B' : 'C'}}
dict2 = {'A' : {'D' : 'E'}}
result = dict_union(dict1, dict2)
# => result = {'A' : {'B' : 'C', 'D' : 'E'}}
I created a brute-force function that does it, but I was looking for a more compact solution:
def dict_union(train, wagon):
    for key, val in wagon.items():
        if not isinstance(val, dict):
            train[key] = val
        else:
            subdict = train.setdefault(key, {})
            dict_union(subdict, val)
    return train
Here is a class, RUDict (for Recursive-Update dict) that implements the behaviour you're looking for:
class RUDict(dict):
    def __init__(self, *args, **kw):
        super(RUDict,self).__init__(*args, **kw)
    def update(self, E=None, **F):
        if E is not None:
            if 'keys' in dir(E) and callable(getattr(E, 'keys')):
                for k in E:
                    if k in self: # existing ...must recurse into both sides
                        self.r_update(k, E)
                    else: # doesn't currently exist, just update
                        self[k] = E[k]
            else:
                for (k, v) in E:
                    self.r_update(k, {k:v})
        for k in F:
            self.r_update(k, {k:F[k]})
    def r_update(self, key, other_dict):
        if isinstance(self[key], dict) and isinstance(other_dict[key], dict):
            od = RUDict(self[key])
            nd = other_dict[key]
            od.update(nd)
            self[key] = od
        else:
            self[key] = other_dict[key]
def test():
    dict1 = {'A' : {'B' : 'C'}}
    dict2 = {'A' : {'D' : 'E'}}
    dx = RUDict(dict1)
    dx.update(dict2)
    print(dx)
if __name__ == '__main__':
    test()
>>> import RUDict
>>> RUDict.test()
{'A': {'B': 'C', 'D': 'E'}}
>>>
This solution is pretty compact. It's ugly, but you're asking for some rather complicated behavior:
dict_union = lambda d1,d2: dict((x,(dict_union(d1.get(x,{}),d2[x]) if
    isinstance(d2.get(x),dict) else d2.get(x,d1.get(x)))) for x in
    set(d1) | set(d2))
My solution is designed to combine any number of dictionaries as you had and could probably be cut down to look neater by limiting it to combining only two dictionaries but the logic behind it should be fairly easy to use in your program.
def dictCompressor(*args):
    output = {x:{} for mydict in args for x,_ in mydict.items()}
    for mydict in args:
        for x,y in mydict.items():
            output[x].update(y)
    return output
You could subclass dict and wrap the original dict.update() method with a version which would call update() on the subdicts rather than directly overwriting subdicts. That may end up taking at least as much effort as your existing solution, though.
Has to be recursive, since dictionaries may nest. Here's my first take on it, you probably want to define your behavior when dictionaries nest at different depths.
def join(A, B):
    if not isinstance(A, dict) or not isinstance(B, dict):
        return A or B
    return dict([(a, join(A.get(a), B.get(a))) for a in set(A.keys()) | set(B.keys())])
def main():
    A = {'A': {'B': 'C'}, 'D': {'X': 'Y'}}
    B = {'A': {'D': 'E'}}
    print(join(A, B))
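For comparison, here is a sketch of a recursive merge that leaves both inputs untouched and makes one explicit choice for mismatched depths: when only one side holds a dict at a given key, or both hold plain values, the right-hand value wins. The name deep_merge and that conflict rule are assumptions for illustration, not something the question specifies.

```python
def deep_merge(left, right):
    """Return a new dict combining left and right; right wins on conflicts."""
    merged = dict(left)  # shallow copy so the caller's left dict is not mutated
    for key, value in right.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts here: merge them one level down.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged
```

Sub-dictionaries that appear on only one side are shared with the input rather than copied, so copy them first if the result will be mutated later.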
There is not enough information for me to be sure, but here is my sample code anyway:
dict1 = {'A' : {'B' : 'C'}}
dict2 = {'A' : {'D' : 'E'}, 'B':{'C':'D'}}
output = {}
for key in (set(dict1) | set(dict2)):
    output[key] = {}
    (key in dict1 and output[key].update(dict1.get(key)))
    (key in dict2 and output[key].update(dict2.get(key)))

