What makes lists unhashable? - python

So lists are unhashable:
>>> { [1,2]:3 }
TypeError: unhashable type: 'list'
The following page gives an explanation:
A list is a mutable type, and cannot be used as a key in a dictionary
(it could change in-place making the key no longer locatable in the
internal hash table of the dictionary).
I understand why it is undesirable to use mutable objects as dictionary keys. However, Python raises the same exception even when I am simply trying to hash a list (independently of dictionary creation)
>>> hash( [1,2] )
TypeError: unhashable type: 'list'
Does Python do this as a guarantee that mutable types will never be used as dictionary keys? Or is there another reason that makes mutable objects impossible to hash, regardless of how I plan to use them?

Dictionaries and sets use a hash of each key to decide where to store it and where to look for it later. The hash is computed from the key's value. Since lists are mutable, the contents of a list can change. If a list were allowed as a dictionary key and its contents changed afterwards, its hash value would change too. If the hash value changes after the key has been stored in a particular slot, the dictionary becomes inconsistent. For example, the list would initially have been stored at location A, which was determined from the hash value. If the hash value changes and we then look for the list, we might not find it at location A, or, following the new hash value, we might find some other object.
Because a list cannot have a hash value that stays valid, no hashing function is defined for lists internally. In the CPython source, the list type's tp_hash slot is:
PyObject_HashNotImplemented, /* tp_hash */
Because the hashing function is not implemented, using a list as a dictionary key, or calling hash() on it directly, fails with the unhashable type error:
TypeError: unhashable type: 'list'
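To make the failure mode above concrete, here is a minimal sketch. ListKey is a hypothetical wrapper class (it is not part of Python) that hashes a list by its current contents, which is exactly what the built-in list refuses to do. After the key is mutated in place, the entry is still in the dictionary, but lookups by either the old or the new value no longer find it.

```python
class ListKey:
    """Hypothetical wrapper (not a real Python type): hashes by current contents."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        return isinstance(other, ListKey) and self.items == other.items

    def __hash__(self):
        # Depends on mutable state, which is what makes this unsafe.
        return hash(tuple(self.items))


key = ListKey([1, 2])
table = {key: "value"}
found_before = ListKey([1, 2]) in table      # True: same hash, equal contents

key.items.append(3)                          # mutate the stored key in place

# Lookup by the old value: the hash matches the stored one, but the stored
# key now compares unequal, so the entry is not found.
lookup_old = ListKey([1, 2]) in table        # False

# Lookup by the new value: the hash differs from the one recorded at
# insertion time, so the entry is not found either.
lookup_new = ListKey([1, 2, 3]) in table     # False

print(found_before, lookup_old, lookup_new, len(table))
```

The entry is still counted by len(table), yet it cannot be reached by value any more, which is the inconsistency the answer describes.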

Related

Why can't a set store mutable data types?

I read in a book that, due to "algorithmic underpinnings", only instances of immutable types can be added to a Python set, but it did not explain what those "algorithmic underpinnings" are.
After trying to add a list (a mutable data type) to a set, I got the error TypeError: unhashable type: 'list', but again, why does the item to be added have to be hashable?
>>> my_set = set()
>>> my_set.add('a')
>>> my_set.add(1)
>>> my_set.add((1,2,3))
>>> my_set.add([1,2,3])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
From the Python documentation:
A set object is an unordered collection of distinct hashable objects
and
An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.
To explain what that means: sets and dictionaries are based on hash tables, where each key is stored at a position determined by its hash rather than by its actual value. The hash is an integer calculated from the key's value.
For example, if you want to insert the key k="Hello" into a set with the hashing function f(k) = sum of all ASCII values, then you get hash of k = f(k) = 72+101+108+108+111 = 500, and you store k at position 500.
(Of course it's much more complicated than this, it's just a stupid example).
This way when you want to check if k is in the set, you don't have to do a linear search, but you can calculate its hash and check at index 500 if there is something, so it's much faster. (That's why sets are much faster than lists when they have many elements).
Little problem: if your key is a list, you can modify it, and that would change its hash, so each time you modified the list you would need to update all the sets/dictionaries that contain it. That would be a huge mess, so it's better to avoid mutable types, and that's why lists are not hashable.
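The toy example above can be sketched in a few lines. toy_hash and ToySet are hypothetical names used only for illustration; real Python sets use a different hash function and handle collisions, which this sketch deliberately ignores.

```python
def toy_hash(key):
    """Toy hash from the example: the sum of the character codes."""
    return sum(ord(ch) for ch in key)


class ToySet:
    """Illustrative only: one key per slot, no collision handling."""

    def __init__(self, size=1024):
        self.slots = [None] * size

    def add(self, key):
        self.slots[toy_hash(key) % len(self.slots)] = key

    def __contains__(self, key):
        # A single index computation replaces a linear search.
        return self.slots[toy_hash(key) % len(self.slots)] == key


s = ToySet()
s.add("Hello")

hello_hash = toy_hash("Hello")   # 72+101+108+108+111 = 500
has_hello = "Hello" in s         # True: found directly at slot 500
has_world = "World" in s         # False: slot 520 is empty
print(hello_hash, has_hello, has_world)
```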

The addresses of dictionary keys are very far from each other

I'd like to explore the hash table,
In [1]: book = {"apple":0.67, "milk":1.49, "avocado":1.49, "python":2}
In [5]: [hex(id(key)) for key in book]
Out[5]: ['0x10ffffc70', '0x10ffffab0', '0x10ffffe68', '0x10ee1cca8']
The addresses tell that the keys are far away from each other, especially key "python",
I assumed that they are adjacent to one another.
How can this be? Does the dictionary still perform well?
There are two ways we can interpret your confusion: either you expected the id() to be the hash function for the keys, or you expected keys to be relocated to the hash table and, since in CPython the id() value is a memory location, that the id() values would say something about the hash table size. We can address both by talking about Python's dictionary implementation and how Python deals with objects in general.
Python dictionaries are implemented as a hash table, which is a table of limited size. To store keys, a hash function generates an integer (same integer for equal values), and the key is stored in a slot based on that number using a modulo function:
slot = hash(key) % len(table)
This can lead to collisions, so having a large range of numbers for the hash function to pick from is going to help reduce the chances there are such collisions. You still have to deal with collisions anyway, but you want to minimise that.
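As a small illustration of the slot formula and of a collision, the sketch below uses small integers as keys, because in CPython a small non-negative integer hashes to itself, which keeps the numbers easy to follow. The table size of 8 is an assumption chosen for the example.

```python
table_size = 8          # assumed size, for illustration only
keys = [10, 18, 3]

# slot = hash(key) % len(table), as described above
slots = {key: hash(key) % table_size for key in keys}
print(slots)            # in CPython: {10: 2, 18: 2, 3: 3}

# 10 and 18 are different keys, yet both map to slot 2: a collision.
collision = slots[10] == slots[18]
```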
Python does not use the id() function as a hash function here, because that would not produce the same hash for equal values! If equal values did not produce the same hash, you couldn't use multiple "hello world" strings to find the right slot again: after dictionary["hello world"] = "value", the test "hello world" in dictionary could involve two string objects with different id() values, which would hash to different slots, and you would not know that the specific string value has already been used as a key.
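A quick way to see the difference between identity and hash is to build the same string value twice. In the sketch below, the second string is assembled at run time, so in CPython it is typically a separate object with its own id(); that part is an implementation detail, which is why the code does not rely on it.

```python
a = "hello world"
b = " ".join(["hello", "world"])   # same value, built separately at run time

same_value = a == b                # True
same_hash = hash(a) == hash(b)     # True: equal values must hash equally
same_object = a is b               # implementation detail; usually False in CPython

dictionary = {a: "value"}
found = b in dictionary            # True: lookup goes by hash and equality, not id()
print(same_value, same_hash, found)
```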
Instead, objects are expected to implement a __hash__ method, and you can see what that method produces for various objects with the hash() function.
Because keys stored in a dictionary must keep the same hash and equality, Python's built-in mutable types are not hashable and cannot be used as dictionary keys. Otherwise, if you could change their value, they would no longer be equal to another object with the old value and the same hash, and you wouldn't find them in the slot that their new hash maps to.
Note that Python puts all objects in a dynamic heap, and uses references everywhere to relate the objects. Dictionaries hold references to keys and values; putting a key into a dictionary does not relocate the key in memory, and the id() of the key won't change. If keys were relocated, a requirement for the id() function would be violated. The documentation states: This is an integer which is guaranteed to be unique and constant for this object during its lifetime.
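The point about references can be checked directly: the key object found inside the dictionary is the very same object that was inserted. A minimal sketch, with a made-up key:

```python
key = ("apple", 1)                 # any hashable object will do
id_before = id(key)

book = {key: 0.67}                 # the dict stores a reference to `key`

stored_key = next(iter(book))      # fetch the key object back out of the dict
same_object = stored_key is key    # True: nothing was copied or moved
id_unchanged = id(stored_key) == id_before
print(same_object, id_unchanged)
```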
As for those collisions: Python deals with collisions by looking for a new slot with a fixed formula, finding an empty slot in a predictable but pseudorandom series of slot numbers; see the dictobject.c source code comments if you want to know the details. As the table fills up, Python will dynamically grow the table to fit more elements, so there will always be empty slots.
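The probing idea can be sketched roughly as follows. This is modeled on the recurrence described in the dictobject.c comments as I understand them (shift a "perturb" value by 5 bits, then compute i = i*5 + perturb + 1 modulo the table size); probe_sequence is a hypothetical helper, and the constants should be treated as assumptions rather than a faithful copy of the current CPython source.

```python
def probe_sequence(hash_value, table_size, count):
    """Return the first `count` slots probed for a hash (table_size a power of 2)."""
    mask = table_size - 1
    perturb = hash_value & (2**64 - 1)   # treat the hash as an unsigned integer
    i = perturb & mask                   # first slot: low bits of the hash
    slots = []
    for _ in range(count):
        slots.append(i)
        perturb >>= 5                    # gradually mix in the higher hash bits
        i = (i * 5 + perturb + 1) & mask
    return slots


probes = probe_sequence(10, 8, 8)
print(probes)   # [2, 3, 0, 1, 6, 7, 4, 5]
```

The sequence is fully determined by the hash, so a later lookup retraces the same slots, and once perturb reaches zero the recurrence visits every slot of the table.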

Python Hash function and Hash Object

What is the difference between hashable and hashobject in Python?
Hashable
In general, this means an object has a hash value that never changes during its lifetime and can be compared to other objects. Thanks to those two features, a hashable object can be used as a key in a generic hash map.
In Python, immutable built-in objects are hashable, while mutable containers (such as lists or dictionaries) are not. User-defined objects are hashable by default.
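The remark about user-defined objects has one caveat worth seeing in code: in Python 3, a class that defines __eq__ without also defining __hash__ becomes unhashable. The class names below (Plain, EqOnly, EqAndHash) and the helper is_hashable are hypothetical, chosen only for this sketch.

```python
class Plain:
    pass                                   # inherits identity-based hash from object


class EqOnly:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):               # defining __eq__ alone sets __hash__ to None
        return isinstance(other, EqOnly) and self.value == other.value


class EqAndHash:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, EqAndHash) and self.value == other.value

    def __hash__(self):                    # consistent with __eq__
        return hash(self.value)


def is_hashable(obj):
    try:
        hash(obj)
    except TypeError:
        return False
    return True


plain_ok = is_hashable(Plain())            # True
eq_only_ok = is_hashable(EqOnly(1))        # False
eq_and_hash_ok = is_hashable(EqAndHash(1)) # True
merged = {EqAndHash(1), EqAndHash(1)}      # equal objects collapse to one element
print(plain_ok, eq_only_ok, eq_and_hash_ok, len(merged))
```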
Hashtable
In general, a hash table (hash map) is a data structure used to implement an associative array, a structure that maps keys to values. Each key is given a hash value by a hash function, and that value is used for lookup.
In Python, the dictionary is an implementation of a hash table.
hash() in python
hash() is a function that returns the hash value of the key passed in.
In [1]: hash ('seed_of_wind')
Out[1]: 8762898084756078118
As mentioned already, this 'id'-like value is very useful for lookup.
Ideally, distinct keys produce distinct hash values; in practice, two different keys can share a hash value (a collision), and the hash table has to handle that.
By hash object, do you mean hashable object? If so, it is covered above.
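On the question of whether different keys always get different hash values, a short check shows that they do not have to. The -1 / -2 case below is a CPython implementation detail (other interpreters may differ), while equal numbers of different types sharing a hash is required by the language.

```python
# CPython detail: -1 is reserved internally, so hash(-1) is reported as -2.
h_minus_one = hash(-1)
h_minus_two = hash(-2)
cpython_collision = h_minus_one == h_minus_two   # True in CPython

# Despite sharing a hash, the two keys stay separate because they are not equal.
d = {-1: "minus one", -2: "minus two"}

# Equal values must share a hash, so 1 and 1.0 are the same dictionary key.
e = {1: "int", 1.0: "float"}                     # second assignment overwrites the first
print(len(d), len(e), e[1])
```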

How to accept a database and print out a list in python

So I have to create a simple program that accepts a database with a bunch of artists and their works of art with the details of the art. There is a given artist and I have to search through the database to find all the works that have the same artist and return them. I'm not allowed to use other built-in functions or import anything. Can someone tell me why it's raising the error and what it means?
def works_by_artists(db,artists):
    newlist = {}
    for a in db.keys():
        for b in db[artists]:
            if a == b:
                newlist.append(a);
    return newlist
This prints out an error:
for b in db[artists]:
TypeError: unhashable type: 'list'
A dictionary can accept only some kinds of values as keys. In particular, they must be hashable, which means that their value must not change while they are in the dictionary (in practice, they are immutable), and there must be a function known to Python that takes the value and returns an integer that represents the value. Integers, floats, strings, and tuples (as long as every element is itself hashable) are immutable and hashable, while standard lists, dictionaries, and sets are not. The hash value of the key is used to store the key-value pair in the standard Python dictionary. This method was chosen for the sake of speed of looking up a key, but the speed comes at the cost of limiting the possible key values. Other ways could have been chosen, but this works well for the dictionary's intended purpose in Python.
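Which built-in values can serve as keys is easy to check by simply trying to hash them. The helper is_hashable below is a hypothetical convenience function written for this sketch, not something from the standard library.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


samples = [1, 1.5, "s", (1, 2), frozenset({1}), [1], {1: 2}, {1}]
results = {type(x).__name__: is_hashable(x) for x in samples}
print(results)

# A tuple is only hashable if everything inside it is hashable too.
tuple_with_list = is_hashable((1, [2]))   # False
```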
You tried to execute the line for b in db[artists]: while the current value of artists was not hashable. This is an error. I cannot say what the value of artists was, since the value was a parameter to the function and you did not show the calling code.
So check which value of artists was given to function works_by_artists() that caused the displayed error. Whatever created that value is the actual error in your code.
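Since the calling code is not shown, the following is only a sketch under stated assumptions: db is assumed to map an artist name (a string) to a list of that artist's works, and artists is assumed to be a list of names. Under those assumptions, indexing db with the whole list reproduces the error, and looping over the names one at a time avoids it. The sample data is made up.

```python
def works_by_artists(db, artists):
    """Collect the works of every artist in `artists` (assumed: list of names)."""
    result = []                        # a list, so that append() is available
    for artist in artists:             # index the dict with one name at a time
        if artist in db:
            for work in db[artist]:
                result.append(work)
    return result


# Made-up data, only to exercise the function.
db = {"Monet": ["Water Lilies", "Impression, Sunrise"], "Klimt": ["The Kiss"]}

try:
    db[["Monet"]]                      # a list used as a key
    error_message = None
except TypeError as exc:
    error_message = str(exc)           # "unhashable type: 'list'"

works = works_by_artists(db, ["Monet", "Klimt"])
print(error_message, works)
```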

TypeError: unhashable type python work around?

I want to update DiseaseScenario.conn[newKey], which is a set, but I keep getting an "unhashable type" error. Is there a way around this?
DiseaseScenario.conn={}
for items in dictList:
    for key,value in items.iteritems():
        flag=1
        for newKey,newValue in DiseaseScenario.conn.iteritems():
            if key==newKey:
                # getting the error "unhashable type" in this block
                tempValue=[value,newValue]
                DiseaseScenario.conn[newKey].remove(value)
                DiseaseScenario.conn[newKey].add(tempValue)
                flag=0
        if flag==1:
            DiseaseScenario.conn[key]=value
print DiseaseScenario.conn
You are trying to put a list in a set. You can't do that, because set items need to have a fixed hash, which mutable (changeable) builtin python types do not have.
The simplest solution is to just change your list to a tuple (a tuple is kind of like a list that can't be changed in-place). So change:
tempValue=[value,newValue]
to:
tempValue=(value,newValue)
That, of course, assumes value and newValue are not lists or other mutable types.
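A small, self-contained illustration of the tuple workaround and its limit follows. The dictionary conn and its contents are made up for the example and only stand in for DiseaseScenario.conn.

```python
conn = {"flu": {"fever"}}              # hypothetical stand-in: key -> set of values
value, new_value = "fever", "cough"

pair = (value, new_value)              # a tuple instead of a list
conn["flu"].remove(value)
conn["flu"].add(pair)                  # works: a tuple of strings is hashable

# The caveat above: a tuple that contains a list is still unhashable.
try:
    conn["flu"].add((["fever"], "cough"))
    nested_error = None
except TypeError as exc:
    nested_error = str(exc)

# For set-like members, frozenset is the hashable counterpart of set.
conn["flu"].add(frozenset({"chills", "fatigue"}))
print(conn, nested_error)
```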
