How to support the `in` operator in Python with a class

What magic method do I have to implement to support the in operator? Here's an example of what I'm trying to do:
class DailyPriceObj:
    def __init__(self, date, product_id=None):
        self.date = date
        self.product_id = product_id
        self.sd_buy = None

l = list()
l.append(DailyPriceObj(date="2014-01-01"))
DailyPriceObj(date="2014-01-01") in l # how to get this to return True?
In other words, I want my object to "act like" its date property, so I can use that to check whether the object is in an iterable (date should be a unique field here).

You need to implement __eq__, which list membership relies on, and __hash__ as well so that the object also works in sets and as a dict key:
class DailyPriceObj:
    def __init__(self, date, product_id=None):
        self.date = date
        self.product_id = product_id
        self.sd_buy = None

    def __eq__(self, other):
        return isinstance(other, self.__class__) and self.date == other.date

    def __hash__(self):
        return hash(self.date)

l = [DailyPriceObj(date="2014-01-01")]
s = {DailyPriceObj(date="2014-01-01")}
print(DailyPriceObj(date="2014-01-01") in l)
print(DailyPriceObj(date="2014-01-01") in s)
Output
True
True
From the documentation on __hash__:
Called by built-in function hash() and for operations on members of
hashed collections including set, frozenset, and dict. __hash__()
should return an integer. The only required property is that objects
which compare equal have the same hash value; it is advised to mix
together the hash values of the components of the object that also
play a part in comparison of objects by packing them into a tuple and
hashing the tuple.
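The tuple-packing advice in the quoted documentation can be sketched as follows. This is a minimal illustration rather than the asker's class: the name PricePoint, the helper _key, and the choice to treat (date, product_id) as the identity are all assumptions made for the example.

```python
class PricePoint:
    """Hypothetical class whose identity is the (date, product_id) pair."""

    def __init__(self, date, product_id=None):
        self.date = date
        self.product_id = product_id

    def _key(self):
        # Every field that takes part in equality goes into one tuple.
        return (self.date, self.product_id)

    def __eq__(self, other):
        if not isinstance(other, PricePoint):
            # Let Python try the other operand instead of guessing.
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # Hash the same tuple that __eq__ compares.
        return hash(self._key())


prices = [PricePoint("2014-01-01", 7)]
in_list = PricePoint("2014-01-01", 7) in prices
in_set = PricePoint("2014-01-01", 7) in set(prices)
other_product = PricePoint("2014-01-01", 8) in prices
print(in_list, in_set, other_product)
```

Routing both methods through the same tuple makes it hard for equality and hashing to drift apart when a field is added later.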

You can implement __eq__ so that the check works with either an object or a bare date string:
class DailyPriceObj:
    def __init__(self, date, product_id=None):
        self.date = date
        self.product_id = product_id
        self.sd_buy = None

    def __eq__(self, other):
        return self.date == other

l = list()
l.append(DailyPriceObj(date="2014-01-01"))
# both ways work:
print(DailyPriceObj(date="2014-01-01") in l) # True
print("2014-01-01" in l) # True
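One caveat with this version on Python 3: a class that defines __eq__ without __hash__ becomes unhashable, so its instances cannot go into a set or serve as dict keys. Below is a minimal sketch of the effect and one possible fix, assuming the date string alone identifies the object. The class names EqOnly and EqAndHash are made up for the example.

```python
class EqOnly:
    """Defines __eq__ only, so Python 3 sets its __hash__ to None."""

    def __init__(self, date):
        self.date = date

    def __eq__(self, other):
        return self.date == other


class EqAndHash:
    """Same comparison, plus a hash that matches the bare date string."""

    def __init__(self, date):
        self.date = date

    def __eq__(self, other):
        return self.date == other

    def __hash__(self):
        return hash(self.date)


try:
    {EqOnly("2014-01-01")}
    unhashable = False
except TypeError:
    # Raised because EqOnly instances have no usable __hash__.
    unhashable = True

prices = {EqAndHash("2014-01-01")}
found_by_object = EqAndHash("2014-01-01") in prices
found_by_string = "2014-01-01" in prices
print(unhashable, found_by_object, found_by_string)
```

Because the hash is taken from the date string itself, a plain string and an object carrying the same date land in the same bucket, which is what lets the string lookup succeed.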

How to perform Intersection with custom property in python

class Random:
    def __init__(self, id):
        self.id = id
        self.prop = None

list_1 = {Random(12), Random(15), Random(22)}
list_2 = {Random(22), Random(9), Random(88)}
list_3 = {Random(88), Random(22), Random(12)}
result = list_1.intersection(list_2).intersection(list_3)
print(list(result))
# expected result = Random object containing id=22
# returned result = []
How can I intersect these sets based on a custom field (id in the above case)?
set() documentation says:
A set object is an unordered collection of distinct hashable objects.
And hashable documentation says:
An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.
So you need to implement __hash__() and __eq__() for your class.
class Random:
    def __init__(self, id):
        self.id = id
        self.prop = None

    def __hash__(self):
        return hash((self.id, self.prop))

    def __eq__(self, other):
        return self.id == other.id and self.prop == other.prop

class Random:
    def __new__(cls, id):
        return id

list_1 = {Random(12), Random(15), Random(22)}
list_2 = {Random(22), Random(9), Random(88)}
list_3 = {Random(88), Random(22), Random(12)}
result = list_1.intersection(list_2).intersection(list_3)
print(list(result))
If all you need is the id, you can override __new__ so that calling the class returns the id itself.
Without that, each call creates a distinct object that compares unequal to all the others, which is why your result is empty.
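If the objects themselves should survive the intersection (rather than being replaced by bare ids), a sketch that keys both equality and hashing on id alone might look like this. It assumes id never changes after construction and that prop should not affect identity. The class is named Item here purely for illustration.

```python
class Item:
    def __init__(self, id):
        self.id = id
        self.prop = None

    def __eq__(self, other):
        if not isinstance(other, Item):
            return NotImplemented
        return self.id == other.id

    def __hash__(self):
        # Must agree with __eq__: equal ids give equal hashes.
        return hash(self.id)

    def __repr__(self):
        return "Item(id={})".format(self.id)


set_1 = {Item(12), Item(15), Item(22)}
set_2 = {Item(22), Item(9), Item(88)}
set_3 = {Item(88), Item(22), Item(12)}

# & is the operator form of set.intersection.
result = set_1 & set_2 & set_3
print(list(result))
```

Only the object with id 22 appears in all three sets, so the result holds a single Item.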

Why isn't the hash function deterministic?

I'm developing a program using Python 3.6.
I have a problem: if I use the built-in hash function (which I expected to be deterministic) on the same object, the value it returns differs from one run to the next!
For example:
class Generic:
    def __init__(self, id, name, property):
        self.id = id
        self.name = name
        self.property = property

def main():
    my_object = Generic(3,'ddkdjsdk','casualstring')
    print(hash(my_object))
I would like the output to always be the same (deterministic), but unfortunately different values appear on the console:
8765256330262, -9223363264515786864, -9223363262437648366 and others...
Why does this happen? I would like to guarantee determinism with this function throughout my application! How do I solve the problem?
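In this case it's probably easiest to define your own __eq__ and __hash__ methods, so that the hash depends on the object's contents rather than on its identity. Note that Python 3 salts str hashes per process by default, so with string fields the value is the same across runs only if PYTHONHASHSEED is set to a fixed value: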
class Generic:
    def __init__(self, id, name, property):
        self.id = id
        self.name = name
        self.property = property

    def __eq__(self, other):
        assert self.__class__ == other.__class__, "Types do not match"
        return self.id == other.id and self.name == other.name and self.property == other.property

    def __hash__(self):
        return hash((self.id, self.name, self.property))
This also makes the hashes of equal objects equal:
>>> obj = Generic(1, 'blah', 'blah')
>>> obj2 = Generic(1, 'blah', 'blah')
>>> obj == obj2
True
>>> hash(obj) == hash(obj2)
True
hope that helps!
For those looking to get hashes of built-in types, Python's built-in hashlib might be easier than subclassing to redefine __hash__. Here's an example for strings:
from hashlib import md5

def string_hash(string):
    return md5(string.encode()).hexdigest()
This returns the same hash for different string objects as long as their content is the same, and the result does not change between runs. Not all objects will work, but it could save you time depending on your use case.
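The same hashlib idea can be extended to the Generic class from the question. This is only a sketch: the method name stable_digest is invented here, it assumes the three fields are ints or strings whose repr() is stable, and it is a separate digest rather than a replacement for __hash__ (which must return an integer).

```python
import hashlib


class Generic:
    def __init__(self, id, name, property):
        self.id = id
        self.name = name
        self.property = property

    def stable_digest(self):
        # repr() of a tuple of ints and strs is the same text in every run,
        # unlike hash(), whose str results are salted per process by default.
        payload = repr((self.id, self.name, self.property)).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


a = Generic(3, "ddkdjsdk", "casualstring")
b = Generic(3, "ddkdjsdk", "casualstring")
c = Generic(4, "ddkdjsdk", "casualstring")
print(a.stable_digest() == b.stable_digest())
print(a.stable_digest() == c.stable_digest())
```

The first comparison is True and the second is False, and the digests themselves are identical from one interpreter run to the next.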

How to check if an object exists inside a list of objects?

I am using Python to implement an Earley Parser that has Context Free rules defined as follows:
class Rule:
    def __init__(self,string,i,j,dot):
        self.i = 0
        self.j = 0
        self.dot = 0
        string = string.split('->')
        self.lhs = string[0].strip()
        self.rhs1 = string[1].strip()
        self.rhs = []
        self.rhs1 = self.rhs1.split(' ')
        for word in self.rhs1:
            if word.strip()!= '':
                self.rhs.append(word)

    def __eq__(self, other):
        if self.i == other.i:
            if self.j == other.j:
                if self.dot == other.dot:
                    if self.lhs == other.lhs:
                        if self.rhs == other.rhs:
                            return True
        return False
To check whether an object of class Rule exists within a chart array or not, I have used the following:
def enqueue(self, entry, state):
    if state in self.chart[entry]:
        return None
    else:
        self.chart[entry].append(state)
where chart is an array that is supposed to contain lists of objects of class Rule:
def __init__(self, words):
    self.chart = [[] for i in range(len(words))]
Further, I check whether a rule already exists in chart[entry] as follows (and if it does not, simply append it):
def enqueue(self, entry, state):
    if state in self.chart[entry]:
        return None
    else:
        self.chart[entry].append(state)
However, this gives me the following error:
TypeError: 'in <string>' requires string as left operand, not classobj
To circumvent this, I even declared an __eq__ function in the class itself, but it doesn't seem to work. Can anyone help me with this?
Assuming that your object has only a title attribute which is relevant for equality, you have to implement the __eq__ method as follows:
class YourObject:
    [...]
    def __eq__(self, other):
        return self.title == other.title
Of course, if more attributes are relevant for equality, you must include those as well. You might also consider implementing __ne__ (and, on Python 2 only, __cmp__) for consistent behaviour; Python 3 removed __cmp__ and derives != from __eq__.
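Applied to a class like the question's Rule, one way to keep __eq__ short is to compare a tuple of the relevant fields. Here is a sketch, assuming lhs, rhs, i, j and dot together define equality. State, _key and the free function enqueue are names invented for this example, not the asker's code.

```python
class State(object):
    def __init__(self, lhs, rhs, i=0, j=0, dot=0):
        self.lhs = lhs
        self.rhs = list(rhs)
        self.i = i
        self.j = j
        self.dot = dot

    def _key(self):
        # All fields relevant for equality, gathered in one tuple.
        return (self.lhs, tuple(self.rhs), self.i, self.j, self.dot)

    def __eq__(self, other):
        if not isinstance(other, State):
            return NotImplemented
        return self._key() == other._key()

    def __ne__(self, other):
        # Needed on Python 2; Python 3 derives != from __eq__ by default.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result


def enqueue(chart, entry, state):
    """Append state to chart[entry] unless an equal state is already there."""
    if state not in chart[entry]:
        chart[entry].append(state)


chart = [[] for _ in range(3)]
enqueue(chart, 0, State("S", ["NP", "VP"]))
enqueue(chart, 0, State("S", ["NP", "VP"]))         # equal, so skipped
enqueue(chart, 0, State("S", ["NP", "VP"], dot=1))  # differs in dot, so added
print(len(chart[0]))
```

The list membership test calls __eq__ on each stored element in turn, so no __hash__ is required as long as the states only live in lists.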

typecast classes in python: how?

Here, I am attempting to mock up a social media profile as a class "Profile", in which you have name, a group of friends, and the ability to add and remove friends. There is a method that I would like to make, that when invoked, will print the list of friends in alphabetical order.
The issue: I get a warning that I cannot sort an unsortable type. Python is seeing my instance variable as a "Profile object", rather than a list that I can sort and print.
Here is my code:
class Profile(object):
    """
    Represent a person's social profile
    Argument:
        name (string): a person's name - assumed to uniquely identify a person
    Attributes:
        name (string): a person's name - assumed to uniquely identify a person
        statuses (list): a list containing a person's statuses - initialized to []
        friends (set): set of friends for the given person.
            it is the set of profile objects representing these friends.
    """
    def __init__(self, name):
        self.name = name
        self.friends = set()
        self.statuses = []

    def __str__(self):
        return self.name + " is " + self.get_last_status()

    def update_status(self, status):
        self.statuses.append(status)
        return self

    def get_last_status(self):
        if len(self.statuses) == 0:
            return "None"
        else:
            return self.statuses[-1]

    def add_friend(self, friend_profile):
        self.friends.add(friend_profile)
        friend_profile.friends.add(self)
        return self

    def get_friends(self):
        if len(self.friends) == 0:
            return "None"
        else:
            friends_lst = list(self.friends)
            return sorted(friends_lst)
After I fill out a list of friends (from a test module) and invoke the get_friends method, python tells me:
  File "/home/tjm/Documents/CS021/social.py", line 84, in get_friends
    return sorted(friends_lst)
TypeError: unorderable types: Profile() < Profile()
Why can't I simply typecast the object to get it in list form? What should I be doing instead so that get_friends will return an alphabetically sorted list of friends?
Instances are compared through the rich comparison methods (__eq__, __ne__, __lt__, __le__, __gt__, __ge__) defined on their class; sorted() and list.sort() specifically rely on __lt__. You need to define those methods to make your instances orderable.
For performance reasons, I'd recommend defining an integer property on your class, such as id, and comparing on it instead of name, which has string comparison overhead. (Compare on name instead if you need the alphabetical order the question asks for.)
class Profile(object):
    def __eq__(self, profile):
        return self.id == profile.id # I made up the id property.
    def __lt__(self, profile):
        return self.id < profile.id
    def __hash__(self):
        return hash(self.id)
    ...
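If writing every comparison method by hand feels repetitive, the standard library's functools.total_ordering can derive the remaining ordering methods from __eq__ plus one of __lt__, __le__, __gt__ or __ge__. The sketch below compares on name, which is what an alphabetical listing needs. The trimmed-down Profile is an assumption for the example, not the asker's full class.

```python
from functools import total_ordering


@total_ordering
class Profile(object):
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, Profile):
            return NotImplemented
        return self.name == other.name

    def __lt__(self, other):
        if not isinstance(other, Profile):
            return NotImplemented
        return self.name < other.name

    def __hash__(self):
        # Keeps instances usable in the friends set.
        return hash(self.name)


friends = {Profile("Carol"), Profile("Alice"), Profile("Bob")}
ordered = [p.name for p in sorted(friends)]
print(ordered)
```

This prints ['Alice', 'Bob', 'Carol']. Plain string comparison is case-sensitive, so names with mixed capitalisation may need a normalised key such as name.lower().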
Alternatively, you can pass a key function to the sort call if you don't want to bother overriding those methods:
>>> friend_list = [<Profile: id=120>, <Profile: id=121>, <Profile: id=115>]
>>> friend_list.sort(key=lambda p: p.id, reverse=True)
Using operator.attrgetter:
>>> import operator
>>> new_friend_list = sorted(friend_list, key=operator.attrgetter('id'))
I think I'll take a crack at this. First, here's the code:
from collections import namedtuple

class Profile(namedtuple("Profile", "name")):
    def __init__(self, name):
        # don't set self.name, it's already set!
        self.friends = set()
        self.statuses = []
    # ... and all the rest the same. Only the base class changes.
What we've done here is create a class with the shape of a tuple. As such, it's orderable, hashable, and all of the things. You could even drop your __str__() method, since namedtuple provides a readable __repr__ that str() falls back to.
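Coming back to the key-function approach mentioned earlier: for the alphabetical listing the question asks for, sorting with a key on name needs no comparison methods at all. This sketch uses a trimmed-down Profile containing only the parts needed here, and it returns an empty list rather than the string "None" when there are no friends, which is a design choice of the example.

```python
from operator import attrgetter


class Profile(object):
    def __init__(self, name):
        self.name = name
        self.friends = set()

    def add_friend(self, friend_profile):
        self.friends.add(friend_profile)
        friend_profile.friends.add(self)
        return self

    def get_friends(self):
        # sorted() accepts any iterable, so the set needs no list() call;
        # an empty set simply yields an empty list.
        return sorted(self.friends, key=attrgetter("name"))


me = Profile("Tim")
for friend_name in ["Zoe", "Adam", "Maya"]:
    me.add_friend(Profile(friend_name))

names = [p.name for p in me.get_friends()]
print(names)
```

This prints ['Adam', 'Maya', 'Zoe'].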

Unexpected behavior for python set.__contains__

Quoting the docstring of set.__contains__:
print(set.__contains__.__doc__)
x.__contains__(y) <==> y in x.
This seems to work fine for primitive objects such as int, basestring, etc. But for user-defined objects that define the __ne__ and __eq__ methods, I get unexpected behavior. Here is some sample code:
class CA(object):
    def __init__(self,name):
        self.name = name
    def __eq__(self,other):
        if self.name == other.name:
            return True
        return False
    def __ne__(self,other):
        return not self.__eq__(other)

obj1 = CA('hello')
obj2 = CA('hello')
theList = [obj1,]
theSet = set(theList)
# Test 1: list
print (obj2 in theList) # return True
# Test 2: set weird
print (obj2 in theSet) # return False unexpected
# Test 3: iterating over the set
found = False
for x in theSet:
    if x == obj2:
        found = True
print(found) # return True
# Test 4: Typecasting the set to a list
print (obj2 in list(theSet)) # return True
So is this a bug or a feature?
For sets and dicts, you need to define __hash__. Any two objects that are equal should hash the same in order to get consistent / expected behavior in sets and dicts.
I would recommend using a _key method and referencing it anywhere you need the part of the item to compare, just as you call __eq__ from __ne__ instead of reimplementing it:
class CA(object):
    def __init__(self,name):
        self.name = name
    def _key(self):
        return type(self), self.name
    def __hash__(self):
        return hash(self._key())
    def __eq__(self,other):
        if self._key() == other._key():
            return True
        return False
    def __ne__(self,other):
        return not self.__eq__(other)
This is because CA doesn't implement __hash__.
A sensible implementation would be:
def __hash__(self):
    return hash(self.name)
A set hashes its elements to allow a fast lookup. You have to override the __hash__ method so that an element can be found:
class CA(object):
    def __hash__(self):
        return hash(self.name)
Lists don't use hashing, but compare each element like your for loop does.
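The difference between the two lookups can be reproduced in a few lines. This sketch targets CPython 3 and deliberately reuses the identity-based object.__hash__ to mimic the question's situation. Python 3 would otherwise make a class with __eq__ but no __hash__ unhashable, so the original code fails at set(theList) there. The class names and the lookups helper are made up for the example.

```python
class NameEqIdentityHash(object):
    """Equal by name, but hashed by identity (the question's situation)."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return self.name == other.name

    # Explicitly keep the default identity-based hash.
    __hash__ = object.__hash__


class NameEqNameHash(NameEqIdentityHash):
    """Same equality, with a hash that is consistent with it."""

    def __hash__(self):
        return hash(self.name)


def lookups(cls):
    stored, probe = cls("hello"), cls("hello")
    # The list scans elements with ==; the set goes through the hash first.
    return probe in [stored], probe in {stored}


broken = lookups(NameEqIdentityHash)
fixed = lookups(NameEqNameHash)
print(broken, fixed)
```

With the identity hash, the list finds the equal object but the set does not, because the set only calls __eq__ on entries whose stored hash matches the probe's. Once __hash__ follows the name, both lookups succeed.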
