How can I implement vice versa mapping in Python?

I have to convert a bunch of strings into numbers, process the numbers, and convert them back.
I thought of a map to which I add two entries for every string I provide:
Key1: (string, number);
Key2: (number, string).
But this is not optimal in terms of memory.
What I need to achieve, as an example:
my_cool_class.get('string') # outputs 1
my_cool_class.get(1) # outputs 'string'
Is there a better way to do this in Python?
Thanks in advance!

You can implement your own two-way dict like this:
class TwoWayDict(dict):
    def __len__(self):
        # Each pair is stored twice; floor division keeps the result an int
        return dict.__len__(self) // 2
    def __setitem__(self, key, value):
        # Store the mapping in both directions
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)
my_cool_class = TwoWayDict()
my_cool_class[1] = 'string'
print(my_cool_class[1])  # 'string'
print(my_cool_class['string'])  # 1
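One gap in the class above is that reassigning or deleting a key leaves the old reverse entry behind. Below is a minimal sketch of one way to keep both directions in sync, assuming keys and values never overlap. The name `SyncedTwoWayDict`, the `__delitem__` override and the stale-pair cleanup are additions for illustration, not part of the answer.

```python
class SyncedTwoWayDict(dict):
    """Hypothetical variant of TwoWayDict that cleans up stale pairs."""

    def __len__(self):
        # Every pair is stored twice, once per direction.
        return dict.__len__(self) // 2

    def __setitem__(self, key, value):
        # Remove any pair that already involves key or value.
        if key in self:
            del self[key]
        if value in self:
            del self[value]
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)

    def __delitem__(self, key):
        # Remove both directions of the pair.
        value = dict.pop(self, key)
        dict.pop(self, value, None)


m = SyncedTwoWayDict()
m[1] = 'string'
m[1] = 'other'           # replaces the pair (1, 'string')
print(m[1], m['other'])  # other 1
print('string' in m)     # False
print(len(m))            # 1
```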

Instead of allocating more memory for a second dict, you can find the key from the value, at the cost of run time: each lookup is a linear scan over the values. In Python 2:
mydict = {'george':16,'amber':19}
print (mydict.keys()[mydict.values().index(16)])
>>> 'george'
EDIT:
Note that in Python 3, dict.values() (along with dict.keys() and dict.items()) returns a view rather than a list. You therefore need to wrap the calls to dict.keys() and dict.values() in list(), like so:
mydict = {'george':16,'amber':19}
print (list(mydict.keys())[list(mydict.values()).index(16)])
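The same linear search can be written without building two temporary lists. This is a sketch only; `key_for` is a hypothetical helper name, and raising `KeyError` for a missing value is an assumption about the desired behavior.

```python
def key_for(mapping, target):
    """Return the first key whose value equals target (hypothetical helper)."""
    for key, value in mapping.items():
        if value == target:
            return key
    raise KeyError(target)


mydict = {'george': 16, 'amber': 19}
print(key_for(mydict, 16))  # george
```

Each call is still O(n) in the size of the dictionary, so this trades time for memory just like the one-liner above.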

If optimal memory usage is an issue, you may not want to use Python in the first place. To solve your immediate problem, just add both the string and the number as keys to the dictionary. Remember that only a reference to the original objects will be stored. Additional copies will not be made:
d = {}
s = '123'
n = int(s)
d[s] = n
d[n] = s
Now you can access the value by the opposite key just like you wanted. This method has the advantage of O(1) lookup time.
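The claim about references can be checked with the `is` operator. This is a small illustration with made-up sample values; note that the dictionary itself still needs two entries per pair, even though the string and the number are not copied.

```python
d = {}
s = 'my first string'
n = 1
d[s] = n
d[n] = s

# The dictionary holds references, so the stored objects are the originals.
print(d[n] is s)  # True
print(d[s] is n)  # True
```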

You can create a dictionary of tuples; this way you just need to check the type of the argument to decide which element to return.
Example:
class your_cool_class(object):
    def __init__(self):
        # example dictionary: string key -> (string, number)
        self.your_dictionary = {'3': ('3', 3), '4': ('4', 4)}
    def get(self, number):
        is_string = isinstance(number, str)
        number = str(number)
        n = self.your_dictionary.get(number)
        if n is not None:
            # a string argument yields the number, and vice versa
            return n[1] if is_string else n[0]
>>> my_cool_class = your_cool_class()
>>> my_cool_class.get(3)
'3'
>>> my_cool_class.get('3')
3

Related

How to check if a dictionary is invertible

I am working on a question that asks me to point out the problem in a function that determines whether a dictionary is invertible (for every value appearing in the dictionary, there is only one key that maps to that value). The code is below:
def is_invertible(adict):
    inv_dict = make_inv_dict(adict)
    return adict == inv_dict
def make_inv_dict(adict):
    if len(adict) > 0:
        key, val = adict.popitem()
        adict = make_inv_dict(adict)
        if val not in adict.values():
            adict[key] = val
        return adict
    else:
        return {}
Currently, this returns False for {'a': 'b', 'b': 'e', 'c': 'f'} when it is supposed to be True. I am sure that there is an issue in make_inv_dict function; is it simply because adict is not an appropriate variable name in adict = make_inv_dict(adict)? Or is there another reason why the function returns a wrong outcome?
There are at least three problems with the function you've given:
1. The condition adict == inv_dict checks whether the dictionary is its own inverse, not merely that it's invertible.
2. It uses popitem to remove a key/value pair from the input dictionary, and then inserts it backwards, so the function operates in place. By the time it's finished, adict's original contents will be completely destroyed, so the comparison will be meaningless anyway.
3. The line adict[key] = val inserts the key/value pair in the original order; the inverse order would be adict[val] = key. So this function doesn't do what its name promises, which is to make an inverse dictionary.
It should be noted that if not for the destruction of the dictionary (2), the mistakes (1) and (3) would cancel out, because the outcome of the function is to rebuild the original dictionary but without duplicate values.
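The destruction of the input can be observed directly. The sketch below reuses make_inv_dict from the question unchanged; the names `original`, `snapshot` and `rebuilt` are only for illustration.

```python
def make_inv_dict(adict):
    # Function from the question, unchanged.
    if len(adict) > 0:
        key, val = adict.popitem()
        adict = make_inv_dict(adict)
        if val not in adict.values():
            adict[key] = val
        return adict
    else:
        return {}


original = {'a': 'b', 'b': 'e', 'c': 'f'}
snapshot = dict(original)          # copy taken before the call
rebuilt = make_inv_dict(original)

print(original)             # {}  (emptied by popitem)
print(rebuilt == snapshot)  # True
```

Because the caller's dictionary ends up empty, comparing it with the rebuilt one returns False for any non-empty input.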
I'm guessing some people will find this question if they're looking for a correct way to invert a dictionary, so here is one: this function returns the inverse dictionary if it's possible, or None otherwise.
def invert_dict(d):
    out = dict()
    for k, v in d.items():
        if v in out:
            # v already maps back to another key, so d is not invertible
            return None
        out[v] = k
    return out
Helper function returning a boolean for whether a dictionary is invertible:
def is_invertible(d):
    return invert_dict(d) is not None
My Answer:
def is_invertible(dict_var):
    return len(dict_var.values()) == len(set(dict_var.values()))
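A quick check of the set-based approach on the dictionary from the question and on one with a repeated value. This sketch assumes every value is hashable, since building the set would raise TypeError otherwise.

```python
def is_invertible(mapping):
    values = list(mapping.values())
    # Duplicated values collapse in the set, so the lengths differ.
    return len(values) == len(set(values))


print(is_invertible({'a': 'b', 'b': 'e', 'c': 'f'}))  # True
print(is_invertible({'a': 1, 'b': 1}))                # False
```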

How to increment a value (in defaultdict of defaultdicts)?

How can I increment d['a']['b']['c'][1][2][3], where d is a defaultdict of defaultdicts, without code duplication?
from collections import defaultdict
nested_dict_type = lambda: defaultdict(nested_dict_type)
nested_dict = nested_dict_type()
# incrementation
if type(nested_dict['a']['b']['c']['d'][1][2][3][4][5][6]) != int:
    nested_dict['a']['b']['c']['d'][1][2][3][4][5][6] = 0
nested_dict['a']['b']['c']['d'][1][2][3][4][5][6] += 1 # ok, now it contains 1
Here we can see that we duplicated (in the code) a chain of keys 3 times.
Question: Is it possible to write a function inc that will take nested_dict['a']['b']...[6] and do the same job as above? So:
def inc(x):
    if type(x) != int:
        x = 0
    x += 1
inc(nested_dict['a']['b']['c']['d'][1][2][3][4][5][6]) # ok, now it contains 1
Update (20 Aug 2018):
There is still no answer to the question. It's clear that there are ways to get the result I want, but the question is straightforward: there is a "value", we pass it to a function, and the function modifies it. It looks like that's not possible.
Just a value, without any "additional keys", etc.
If it is so, can we make an answer more generic?
Notes:
What is defaultdict of defaultdicts - SO.
This question is not about "storing of integers in a defaultdict", so I'm not looking for a hierarchy of defaultdicts with an int type at the leaves.
Assume that type (int in the examples) is known in advance / can be even parametrized (including the ability to perform += operator) - the question is how to dereference the object, pass it for modification and store back in the context of defaultdict of defaultdicts.
Is the answer to this question related to the mutability? See example below:
Example:
def inc(x):
    x += 1
d = {'a': int(0)}
inc(d['a'])
# d['a'] == 0, immutable
d = {'a': Int(0)}
inc(d['a'])
# d['a'] == 1, mutated
Where Int is:
class Int:
    def __init__(self, value):
        self.value = value
    def __add__(self, v):
        self.value += v
        return self
    def __repr__(self):
        return str(self.value)
It's not exactly about mutability; it's more about how assignment performs name binding.
When you do x = 0 in your inc function you bind a new object to the name x, and any connection between that name and the previous object bound to that name is lost. That doesn't depend on whether or not x is mutable.
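The difference between rebinding a name and mutating an object can be seen in a minimal sketch (the function names `rebind` and `mutate` are made up for illustration):

```python
def rebind(x):
    x = 0            # binds the local name x to a new object; the caller sees nothing


def mutate(items):
    items.append(0)  # changes the object that the caller also refers to


value = 5
rebind(value)
print(value)   # 5

data = []
mutate(data)
print(data)    # [0]
```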
But since x is an item in a mutable object we can achieve what you want by passing the parent mutable object to inc along with the key needed to access the desired item.
from collections import defaultdict
nested_dict_type = lambda: defaultdict(nested_dict_type)
nested_dict = nested_dict_type()
# incrementation
def inc(ref, key):
    if not isinstance(ref[key], int):
        ref[key] = 0
    ref[key] += 1
d = nested_dict['a']['b']['c']['d'][1][2][3][4][5]
inc(d, 6)
print(d)
Output:
defaultdict(<function <lambda> at 0xb730553c>, {6: 1})
Now we aren't binding a new object, we're merely mutating an existing one, so the original d object gets updated correctly.
BTW, that deeply nested dict is a bit painful to work with. Maybe there's a better way to organize your data... But anyway, one thing that can be handy when working with deep nesting is to use lists or tuples of keys. Eg,
q = nested_dict
keys = 'a', 'b', 'c', 'd', 1, 2, 3, 4, 5
for k in keys:
    q = q[k]
q now refers to nested_dict['a']['b']['c']['d'][1][2][3][4][5]
You can't have multiple default types with defaultdict. You have the following options:
1. Nested defaultdict of defaultdict objects indefinitely;
2. defaultdict of int objects, which likely won't suit your needs;
3. defaultdict of defaultdict down to a specific level with int defined for the last level, e.g. d = defaultdict(lambda: defaultdict(int)) for a single nesting;
4. Similar to (3), but for counting you can use collections.Counter instead, i.e. d = defaultdict(Counter).
I recommend the 3rd or 4th options if you are always going to go down to a set level. In other words, a scalar value will only be supplied at the nth level, where n is constant.
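A short sketch of the 3rd and 4th options for a fixed depth of two; the variable names and keys are illustrative only.

```python
from collections import Counter, defaultdict

# Option 3: two levels, with int at the leaves.
two_level = defaultdict(lambda: defaultdict(int))
two_level['a']['b'] += 1
two_level['a']['b'] += 1
print(two_level['a']['b'])     # 2

# Option 4: the same shape, using Counter for the inner level.
counts = defaultdict(Counter)
counts['a']['b'] += 1
print(counts['a']['b'])        # 1
print(counts['a']['missing'])  # 0
```

In both cases a plain += works at the leaf level, so no type test is needed.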
Otherwise, one manual option is to have a function perform the type-testing. In this case, try / except may be a good alternative. Here we also use functools.reduce to walk a list of keys, rather than writing out the chained __getitem__ calls by hand.
from collections import defaultdict
from functools import reduce
from operator import getitem
nested_dict_type = lambda: defaultdict(nested_dict_type)
d = nested_dict_type()
d[1][2] = 10
def inc(d_in, L):
    try:
        reduce(getitem, L[:-1], d_in)[L[-1]] += 1
    except TypeError:
        reduce(getitem, L[:-1], d_in)[L[-1]] = 1
inc(d, [1, 2])
inc(d, [1, 3])
print(d)
defaultdict({1: defaultdict({2: 11, 3: 1})})

How to use any value as a dictionary key?

I'd like to use instances of any type as a key in a single dict.
def add_to_dict(my_object, d, arbitrary_val = '123'):
    d[ my_object ] = arbitrary_val
d = {}
add_to_dict('my_str', d)
add_to_dict(my_list, d)
add_to_dict(my_int, d)
my_object = myclass()
my_object.__hash__ = None
add_to_dict(my_object, d)
The above won't work because my_list and my_object can't be hashed.
My first thought was to just pass in the id value of the object using the id() function.
def add_to_dict(my_object, d, arbitrary_val = '123'):
    d[ id(my_object) ] = arbitrary_val
However, that won't work because id('some string') == id('some string') is not guaranteed to always be True.
My second thought was to test if the object has the __hash__ attribute. If it does, use the object, otherwise, use the id() value.
def add_to_dict(my_object, d, arbitrary_val = '123'):
    d[ my_object if my_object.__hash__ else id(my_object) ] = arbitrary_val
However, since hash() and id() both return ints, I believe I will eventually get a collision.
How can I write add_to_dict(obj, d) above to ensure that no matter what obj is (list, int, str, object, dict), it will correctly set the item in the dictionary and do so without collision?
We could make some kind of dictionary that allows us to insert mutable objects as well:
class DictionaryMutable:
    nullobject = object()
    def __init__(self):
        self._inner_dic = {}
        self._inner_list = []
    def __getitem__(self, name):
        try:
            return self._inner_dic[name]
        except TypeError:
            for key, val in self._inner_list:
                if name == key:
                    return val
            raise KeyError(name)
    def __setitem__(self, name, value):
        try:
            self._inner_dic[name] = value
        except TypeError:
            for elm in self._inner_list:
                if name == elm[0]:
                    elm[1] = value
                    break
            else:
                self._inner_list.append([name,value])
    # ...
This works as follows: the DictionaryMutable consists of a dictionary and a list. The dictionary contains the hashable keys; the list contains sublists, each holding two elements: a key and a value.
For each lookup we first attempt a lookup on the dictionary. If the key name is unhashable, a TypeError is raised. In that case we iterate through the list, check whether one of the keys matches, and return the corresponding value if it does. If no such element exists, we raise a KeyError.
Setting elements works approximately the same way: first we attempt to set the element in the dictionary. If it turns out the key is unhashable, we search linearly through the list and try to update the matching entry. If no entry matches, we append the pair to the end of the list.
This implementation has some major disadvantages:
if the dictionary lookup fails because the key is unhashable, we fall back to a linear search, which can significantly slow down the lookup; and
if you alter an object that is stored as a key, the stored key changes with it, so a search for the original value will fail. This can result in unpredictable behavior.
This is only a basic implementation. For instance __iter__, etc. need to be implemented as well.
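As a rough sketch of how the remaining methods might look, the version below adds `__len__`, `__iter__` and `__contains__` to the class and then demonstrates the second disadvantage. These three methods are assumptions about a reasonable design, not part of the answer.

```python
class DictionaryMutable:
    def __init__(self):
        self._inner_dic = {}    # hashable keys
        self._inner_list = []   # [key, value] pairs for unhashable keys

    def __getitem__(self, name):
        try:
            return self._inner_dic[name]
        except TypeError:
            for key, val in self._inner_list:
                if name == key:
                    return val
            raise KeyError(name)

    def __setitem__(self, name, value):
        try:
            self._inner_dic[name] = value
        except TypeError:
            for elm in self._inner_list:
                if name == elm[0]:
                    elm[1] = value
                    break
            else:
                self._inner_list.append([name, value])

    def __len__(self):
        return len(self._inner_dic) + len(self._inner_list)

    def __iter__(self):
        yield from self._inner_dic
        for key, _ in self._inner_list:
            yield key

    def __contains__(self, name):
        try:
            self[name]
        except KeyError:
            return False
        return True


dm = DictionaryMutable()
dm['a'] = 1
key = [1, 2]
dm[key] = 'list value'
found_before = [1, 2] in dm   # True: an equal list matches the stored key
key.append(3)                 # mutating the key object changes what is stored
found_after = [1, 2] in dm    # False: the stored key is now [1, 2, 3]
print(found_before, found_after, len(dm))
```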
Instead of the id() of the object, you could use the pickled byte-stream representation that pickle.dumps() returns for it. pickle works with most built-in types, and there are ways to extend it to handle most values it can't pickle automatically.
Note: I used the repr() of the object as its "arbitrary value" in an effort to make it easier to identify them in the output displayed.
import pickle  # Python 3 uses the C accelerator automatically when available
from pprint import pprint
def add_to_dict(d, obj, arbitrary_val='123'):
    d[pickle.dumps(obj)] = arbitrary_val
class MyClass: pass
my_string = 'spam'
my_list = [13, 'a']
my_int = 42
my_instance = MyClass()
d = {}
add_to_dict(d, my_string, repr(my_string))
add_to_dict(d, my_list, repr(my_list))
add_to_dict(d, my_int, repr(my_int))
add_to_dict(d, my_instance, repr(my_instance))
pprint(d)
Output:
{b'\x80\x03K*.': '42',
b'\x80\x03X\x04\x00\x00\x00spamq\x00.': "'spam'",
b'\x80\x03]q\x00(K\rX\x01\x00\x00\x00aq\x01e.': "[13, 'a']",
b'\x80\x03c__main__\nMyClass\nq\x00)\x81q\x01.': '<__main__.MyClass object at '
'0x021C1630>'}
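One consequence of keying on pickled bytes is that the key is a snapshot of the object's content at insertion time, not its identity. The sketch below illustrates this with simple lists. Note that pickle does not promise identical bytes for all equal objects in general, so treat the first result as specific to simple cases like this one.

```python
import pickle

d = {}
a = [1, 2]
d[pickle.dumps(a)] = 'first'

b = [1, 2]                      # a different object with equal content
print(pickle.dumps(b) in d)     # True

a.append(3)                     # the original object changes...
print(pickle.dumps(a) in d)     # False: its pickled bytes no longer match
```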

How to swap 2 values in a dictionary, given the 3rd's key?

Suppose that I have dictionary with 3 keys 'x', 'y' and 'z'. What I need to do is to write a function that, given 'x' as argument, swaps the values stored in 'y' and 'z'.
def swap(d, key):
    a, b = [_ for _ in d if _ != key]
    d[a], d[b] = d[b], d[a]
This is what I came up with, but I'm looking for a simpler, more concise way. Is there one, as far as you know?
You can use a slightly more clever means of determining the keys to swap by doing:
a, b = d.keys() - {key} # On Py3; on Python 2.7, you'd use d.viewkeys()
but it's a pretty minor "improvement"; using set operations moves more work to the C layer, avoiding the Python layer iteration of a list comprehension, but the difference when you're talking about iterating three values is pretty trivial.
It's using a KeysView (a live, set-like view of the dict's keys) to get set operations to preserve the two keys not passed.
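Put together as a complete function, the keys-view approach could look like this on Python 3. It is a sketch that assumes the dictionary has exactly three keys, as in the question; otherwise the unpacking raises ValueError.

```python
def swap(d, key):
    # Set difference on the keys view leaves the two other keys.
    a, b = d.keys() - {key}
    d[a], d[b] = d[b], d[a]


d = {'x': 1, 'y': 2, 'z': 3}
swap(d, 'x')
print(d)  # {'x': 1, 'y': 3, 'z': 2}
```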
I'd do it this way, to avoid using a loop:
def swap(dictS, key):
    keys = list(dictS.keys())
    keys.remove(key)
    dictS[keys[0]], dictS[keys[1]] = dictS[keys[1]], dictS[keys[0]]
To solve this, we first need to find the two keys that are not the same as the input key. We then just swap the values for those keys.
def swap(d, key):
    keys_to_swap = []
    for k in d:
        if k != key:
            keys_to_swap.append(k)
    # keys_to_swap are the keys that need to be swapped
    # Swap now
    temp = d.get(keys_to_swap[0])
    d[keys_to_swap[0]] = d.get(keys_to_swap[1])
    d[keys_to_swap[1]] = temp
    return d
Your original answer is correct, but you are not returning d.
So to correct your solution:
def swap2(d, key):
    a, b = [_ for _ in d if _ != key]
    d[a], d[b] = d[b], d[a]
    return d #Added this line

dictionary or map with either string or integer as key in python?

This might be a silly question, but for some reason the solution escapes me at the moment.
I would like to have fast and efficient access to data that is in a list format. So for example a list of questions:
q = {}
q[1] = "my first string"
q[2] = "my second string"
q[3] = "my third string"
I can easily find what question 2's string is by doing q[2]. But I would also like to retrieve the question number by indexing q with the string:
q["my second string"] -> gives 2 as answer
I would like to do this without iterating over the keys (defeats the purpose of a dictionary) and like to avoid defining a second dictionary using the string as the key to avoid wasted memory. Is this possible?
Ultimately, the reason for this is that I would like to access, say, q[2] or q["my second string"] and get the data associated with question 2, whether I use the number or the string as the key to that data. Is this possible without having to iterate over all the keys, while avoiding data duplication?
There's no problem having a mixture of int and str as keys
>>> q = {}
>>> q[1] = "my first string"
>>> q[2] = "my second string"
>>> q[3] = "my third string"
>>> q.update({v:k for k,v in q.items()})
>>> q["my second string"]
2
You can use an OrderedDict, but for one of the directions it's not going to be as efficient as a normal dictionary lookup.
from collections import OrderedDict
q = OrderedDict()
q["my first string"] = 1
q["my second string"] = 2
q["my third string"] = 3
# Now you have normal key lookups on your strings, as with a regular dict. To look up by position:
list(q.values())[1] # To get the second value out (on Python 2, q.values()[1] also works)
# To get the key, value pair of the second entry
list(q.items())[1]
# Would return `('my second string', 2)`
This was asked at Efficient bidirectional hash table in Python?
The answer remains the same - use bidict from http://pypi.python.org/pypi/bidict
class MyDict(dict):
    def __init__(self, **kwargs):
        super(MyDict, self).__init__(**kwargs)
        # Add the reverse mapping for every initial pair
        for k, v in kwargs.items():
            self[v] = k
    def __setitem__(self, key, val):
        super(MyDict, self).__setitem__(key, val)
        super(MyDict, self).__setitem__(val, key)
d = MyDict(a=1, b=2)
print(d[1]) # "a"
print(d[2]) # "b"
d['c'] = 3
print(d[3]) # "c"
