I'd like to compare two objects of the same type for equality with the dunder method __eq__. Every object stores values for "word", "pronunciation", "weight", and "source", and two objects are equal when all of these values are the same.
My solution looks like the following and works, but it feels clunky and I am sure that there is a better way.
def __eq__(self, other):
    if self.check_other(other):  # checks that both objects are instances of LexicalEntity
        return_bool = True
        if self.word != other.get_word():
            return_bool = False
        if self.weight != other.get_weight():
            return_bool = False
        if self.source != other.get_source():
            return_bool = False
        if self.pron != other.get_pron():
            return_bool = False
        return return_bool
Thanks for your help.
For starters, dispense with getters and setters in Python. That will make your code much less clunky and more idiomatic, i.e., you don't need other.get_word(), you just need other.word, and remove your definition of get_word, it is useless. Python != Java.
So, then for something like this, a typical implementation would be:
def __eq__(self, other):
    if isinstance(other, LexicalEntity):
        these_values = self.word, self.weight, self.source, self.pron
        other_values = other.word, other.weight, other.source, other.pron
        return these_values == other_values
    return NotImplemented  # important, you don't want to return None
Alternatively, you might also just use one long boolean expression:
def __eq__(self, other):
    if isinstance(other, LexicalEntity):
        return (
            self.word == other.word and self.weight == other.weight
            and self.source == other.source and self.pron == other.pron
        )
    return NotImplemented
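For completeness, here is a minimal runnable sketch of the tuple-comparison idea. The constructor signature and the `_key` helper are assumptions made for illustration; the question does not show them.

```python
class LexicalEntity:
    """Hypothetical stand-in for the class in the question."""

    def __init__(self, word, pron, weight, source):
        self.word = word
        self.pron = pron
        self.weight = weight
        self.source = source

    def _key(self):
        # Hypothetical helper: the tuple of values that defines equality.
        return (self.word, self.pron, self.weight, self.source)

    def __eq__(self, other):
        if isinstance(other, LexicalEntity):
            return self._key() == other._key()
        return NotImplemented  # let Python try the other operand


a = LexicalEntity("tomato", "t@meItoU", 1.0, "dict")
b = LexicalEntity("tomato", "t@meItoU", 1.0, "dict")
c = LexicalEntity("tomato", "t@mA:toU", 1.0, "dict")

print(a == b)         # True
print(a == c)         # False
print(a == "tomato")  # False
```

Keeping the compared fields in one place means adding a fifth attribute later only touches `_key`.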
I think this is maybe a little more readable:
def __eq__(self, other):
    if self.check_other(other):
        attrs = ["word", "weight", "source", "pron"]
        return all([getattr(self, attr) == getattr(other, attr) for attr in attrs])
But I guess it's a matter of preference whether we want a more readable or a more clever solution.
Getters and setters don't make much sense in Python. If you do have important validations, you should start using the @property decorator instead; if you're just doing this for data encapsulation, Python's principles are much looser in that respect, so just ditch the getters/setters.
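A small sketch of what that could look like. The rule that `weight` must be non-negative is purely an assumption for illustration, as is the two-argument constructor.

```python
class LexicalEntity:
    """Hypothetical example: only `weight` needs validation."""

    def __init__(self, word, weight):
        self.word = word      # plain attribute, no getter/setter needed
        self.weight = weight  # goes through the property setter below

    @property
    def weight(self):
        return self._weight

    @weight.setter
    def weight(self, value):
        # Assumed validation rule, not taken from the question.
        if value < 0:
            raise ValueError("weight must be non-negative")
        self._weight = value


entity = LexicalEntity("tomato", 1.0)
entity.weight = 2.5    # validated assignment, still looks like an attribute
print(entity.weight)   # 2.5
```

Callers keep writing `entity.weight`, so a plain attribute can be turned into a property later without changing any calling code.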
As for asserting equality, if you want to avoid manually referring to each attribute, the reflection below is applicable to virtually any case:
def __eq__(self, other):
    if isinstance(other, self.__class__):
        attrs = [
            a for a in dir(self) if not a.startswith('_') and not callable(getattr(self, a))
        ]
        return all([getattr(self, attr) == getattr(other, attr) for attr in attrs])
    return NotImplemented
As @juanpa.arrivillaga already mentioned, returning NotImplemented (not the same as raising NotImplementedError, as noted in the comments below) is important because if other is from a different class this stops you from returning None in the equality check. A better explanation of why return NotImplemented is the fallback in these cases is found in this answer.
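To illustrate the fallback, here is a sketch with two made-up classes (`Meters` and `Feet` are hypothetical and not from the question). `Meters` knows nothing about `Feet`, so it returns NotImplemented and Python then asks the right-hand operand.

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Meters):
            return self.value == other.value
        return NotImplemented  # "I don't know", let the other operand try


class Feet:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Feet):
            return self.value == other.value
        if isinstance(other, Meters):
            # Compare with a small tolerance because of float rounding.
            return abs(self.value * 0.3048 - other.value) < 1e-9
        return NotImplemented


m = Meters(3.048)
f = Feet(10)

print(m == f)     # True: Meters gives up, Feet.__eq__ answers
print(m == None)  # False: both sides give up, identity is used
```

Had `Meters.__eq__` returned False instead, `m == f` would be False while `f == m` would be True.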
Related
I want to replace string keys in dictionaries in my code with a dataclass so that I can provide meta data to the keys for debugging. However, I still want to be able to use a string to lookup dictionaries. I tried implementing a data-class with a replaced __hash__ function, however my code is not working as expected:
from dataclasses import dataclass
@dataclass(eq=True, frozen=True)
class Key:
    name: str
    def __hash__(self):
        return hash(self.name)
k = "foo"
foo = Key(name=k)
d = {}
d[foo] = 1
print(d[k])  # KeyError
The two hash functions are the same:
print(hash(k) == hash(foo)) # True
So I don't understand why this doesn't work.
Two objects having different hashes guarantees that they're different, but two objects having the same hash doesn't in itself guarantee that they're the same (because hash collisions exist). If you want the Key to be considered equal to a corresponding str, implement that in __eq__:
def __eq__(self, other):
    if isinstance(other, Key):
        return self.name == other.name
    if isinstance(other, str):
        return self.name == other
    return False
This fixes the KeyError you're encountering.
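Putting the pieces together, a self-contained sketch of the fixed class could look like this. Passing `eq=False` and returning NotImplemented for unknown types are my own choices here, not part of the answer above.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)  # eq=False: we supply __eq__ ourselves
class Key:
    name: str

    def __hash__(self):
        # Same hash as the plain string, so both land in the same bucket.
        return hash(self.name)

    def __eq__(self, other):
        if isinstance(other, Key):
            return self.name == other.name
        if isinstance(other, str):
            return self.name == other
        return NotImplemented


d = {Key(name="foo"): 1}
print(d["foo"])            # 1
print(d[Key(name="foo")])  # 1
```

The dictionary first matches on the hash and then confirms the match with `==`, which is why both methods have to agree.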
Adding my notes here from the comments on the answer above, as no one looks at those in any case, so those are likely to get swept under the rug at some point.
PyCharm also produces a helpful warning:
'eq' is ignored if the class already defines '__eq__' method.
I think this means that the eq=True argument should be removed from the @dataclass(...) decorator as well.
Technically, you could also remove the last if isinstance(..., str): check as well as the last return statement. I'm not entirely sure what the implications of that would be, however.
Here then, is a slightly more optimized approach (timings with timeit module below):
class Key:
    name: str
    def __hash__(self):
        return hash(self.name)
    def __eq__(self, other):
        return self.name == getattr(other, 'name', other)
Timings with timeit
from dataclasses import dataclass
from timeit import timeit
@dataclass(frozen=True)
class Key:
    name: str
    def __hash__(self):
        return hash(self.name)
    def __eq__(self, other):
        if isinstance(other, Key):
            return self.name == other.name
        if isinstance(other, str):
            return self.name == other
        return False
class KeyTwo(Key):
    def __eq__(self, other):
        return self.name == getattr(other, 'name', other)
k = "foo"
foo = Key(name=k)
foo_two = KeyTwo(name=k)
print('__eq__() Timings --')
print('isinstance(): ', timeit("foo == k", globals=globals()))
print('getattr(): ', timeit("foo_two == k", globals=globals()))
assert foo == foo_two == k
Results on my M1 Mac:
__eq__() Timings --
isinstance(): 0.10553250007797033
getattr(): 0.08371329202782363
I have an int-derived class with overloaded comparison operator.
In the body of the overloaded methods I need to use the original operator.
The toy example:
>>> class Derived(int):
...     def __eq__(self, other):
...         return super(Derived, self).__eq__(other)
works fine with Python 3.3+, but fails with Python 2.7 with exception AttributeError: 'super' object has no attribute '__eq__'.
I can think of several workarounds, none of which I find very clean:
return int(self) == other
requires creation of a new int object just to compare it, while
try:
    return super(Derived, self).__eq__(other)
except AttributeError:
    return super(Derived, self).__cmp__(other) == 0
splits the control flow based on the Python version, which I find terribly messy (so is inspecting the Python version explicitly).
How can I access the original integer comparison in an elegant way working with Python 2.7 and 3.3+?
Python 2 and 3 are significantly different from each other so I think you should bite the bullet and check versions. That is only to be expected if you're trying to write code that works on both (sooner or later in my experience you find something you have to patch). To avoid any performance impact you could do something like:
from six import PY2
class Derived(int):
    if PY2:
        def __eq__(self, other):
            return super(Derived, self).__cmp__(other) == 0
    else:
        def __eq__(self, other):
            return super(Derived, self).__eq__(other)
That's what I'd do. If I really wanted to subclass int...
If you really don't want to, perhaps you could try:
class Derived(int):
    def __eq__(self, other):
        return (self ^ other) == 0
Obviously if you care about performance you'll have to do some profiling with the rest of your code and find out if either of them is significantly worse...
Both versions implement an __xor__ method, you could try this:
class Derived(int):
    def __eq__(self, other):
        return not super(Derived, self).__xor__(other)
I believe that you should define the __eq__ in the int before defining the class. For example:
int = 5
def int.__eq__(self, other):
return self.real == other
IntDerived = Derived(int)
This should give the super class an __eq__ attribute.
EDITED
The main idea worked, but it has been brought to my attention that the code isn't working. So: improved code:
class Derived(int):
    def __eq__(self, other):
        return self.real == other
Int = 5
D = Derived(Int)
D.__eq__(4)  # Output: False
D.__eq__(5)  # Output: True
Using hasattr avoids creating a new int object, catching an exception or explicitly checking for the Python version.
The below code works on both Python 2.7 and 3.3+:
class Derived(int):
    def __eq__(self, other):
        return super(Derived, self).__cmp__(other) == 0 if hasattr(Derived, "__cmp__") else super(Derived, self).__eq__(other)
I am writing a Queue data structure for Python purely for learning purposes. Here is my class. When I compare two Queue objects for equality, I get an error. I think the error pops up because I don't check for None in my __eq__, but how can I check for None and return accordingly? In fact, I am using a list under the hood and calling its __eq__, thinking it should take care of this as shown here, but it does not:
>>> l=[1,2,3]
>>> l2=None
>>> l==l2
False
Here is my class:
@functools.total_ordering
class Queue(Abstractstruc,Iterator):
    def __init__(self,value=[],**kwargs):
        objecttype = kwargs.get("objecttype",object)
        self.container=[]
        self.__klass=objecttype().__class__.__name__
        self.concat(value)
    def add(self, data):
        if (data.__class__.__name__==self.__klass or self.__klass=="object"):
            self.container.append(data)
        else:
            raise Exception("wrong type being added")
    def __add__(self,other):
        return Queue(self.container + other.container)
    def __iadd__(self,other):
        for i in other.container:
            self.add(i)
        return self
    def remove(self):
        return self.container.pop(0)
    def peek(self):
        return self.container[0]
    def __getitem__(self,index):
        return self.container[index]
    def __iter__(self):
        return Iterator(self.container)
    def concat(self,value):
        for i in value:
            self.add(i)
    def __bool__(self):
        return len(self.container)>0
    def __len__(self):
        return len(self.container)
    def __deepcopy__(self,memo):
        return Queue(copy.deepcopy(self.container,memo))
    def __lt__(self,other):
        return self.container.__lt__(other.container)
    def __eq__(self, other):
        return self.container.__eq__(other.container)
But when I try compare using the above class I get:
>>> from queue import Queue
>>> q = Queue([1,2,3])
>>> q
>>> print q
<Queue: [1, 2, 3]>
>>> q1 = None
>>> q==q1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "queue.py", line 65, in __eq__
    return self.container.__eq__(other.container)
AttributeError: 'NoneType' object has no attribute 'container'
>>>
Your problem is how you are implementing __eq__.
Look at this code:
q = Queue([1,2,3])
q1 = None
q==q1
And let's rewrite it as the equivalent:
q = Queue([1,2,3])
q == None
Now, in Queue.__eq__ we have:
def __eq__(self, other):
    return self.container.__eq__(other.container)
But other is None, which means the return statement is calling:
self.container.__eq__(None.container)
As your error rightly states:
'NoneType' object has no attribute 'container'
Because it doesn't! None doesn't have a container attribute.
So, the way to do it depends on how you want to treat it. Now, obviously, a Queue object can't be None if it's been defined, so:
return other is not None and self.container.__eq__(other.container)
This will short-circuit if other is None and return False before evaluating the part of the expression after the and. Otherwise, it will perform the evaluation. However, you will run into other issues if other is not of type Queue (or, more correctly, if the other object doesn't have a container attribute), such as:
q = Queue([1,2,3])
q == 1
>>> AttributeError: 'int' object has no attribute 'container'
So... depending on your logic, and if a Queue can't be "equal" to other types (which is something only you can say), you can check for the correct type like so:
return other is not None and type(self) == type(other) and self.container.__eq__(other.container)
But... None is a NoneType, so it can never be of the same type as a Queue. So we can shorten it again to just:
return type(self) == type(other) and self.container.__eq__(other.container)
edit: As per mglison's comments:
This could be made more pythonic by using the regular equality operator:
return type(self) == type(other) and self.container == other.container
They have also raised a good point regarding the use of type in checking equality. The strict type check is only appropriate if you are certain that Queue will never be subclassed (which is difficult to guarantee). Alternatively, you could use exception handling to catch the AttributeError, like so:
def __eq__(self, other):
    try:
        return self.container == other.container
    except AttributeError:
        return False  # There is no 'container' attribute, so can't be equal
    except:
        raise  # Another error occurred, better pay it forward
The above may be considered a little overengineered, but is probably one of the better ways to approach this from a safety and reusability perspective.
Or a better, shorter approach (which I should have thought of initially) using hasattr is:
return hasattr(other, 'container') and self.container == other.container
Tell Python that you don't know how to compare against other types:
def __eq__(self, other):
    if not isinstance(other, Queue):
        return NotImplemented
    return self.container.__eq__(other.container)
You might consider checking hasattr(other, 'container') instead of the isinstance, or catching the AttributeError.
But the important thing is that, unlike other answers recommend, you do not want to return False when other isn't a Queue. If you return NotImplemented, Python will give other a chance to check the equality; if you return False, it won't. Differentiate between the three possible answers to the question "are these objects equal": yes, no, and I don't know.
You'll want to do something similar in your __lt__, where the difference is even more apparent: if you return False from both __lt__ and __eq__, then the __gt__ inserted by total_ordering will return True - even though you can't do the comparison. If you return NotImplemented from both of them, it will also be NotImplemented.
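A sketch of how this plays out on Python 3, using a stripped-down, hypothetical `Queue` that keeps only the container and the two comparison methods:

```python
import functools


@functools.total_ordering
class Queue:
    """Stripped-down, hypothetical version of the class in the question."""

    def __init__(self, values=()):
        self.container = list(values)

    def __eq__(self, other):
        if not isinstance(other, Queue):
            return NotImplemented
        return self.container == other.container

    def __lt__(self, other):
        if not isinstance(other, Queue):
            return NotImplemented
        return self.container < other.container


q = Queue([1, 2, 3])

print(q == None)  # False: nobody knows how to compare, identity is used
print(q != None)  # True

try:
    q > None      # __gt__ is generated by total_ordering
except TypeError as exc:
    print("TypeError:", exc)
```

The ordering comparison against None fails loudly with a TypeError instead of silently returning a misleading True or False.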
You can do something like
def __eq__(self,other):
    if other is None: return False
    return self.container.__eq__(other.container)
You may also want to do something like
    if not isinstance(other,Queue): return False
I am working on a graph library in Python and I am defining my vertex this way:
class Vertex:
    def __init__(self,key,value):
        self._key = key
        self._value = value
    @property
    def key(self):
        return self._key
    @key.setter
    def key(self,newKey):
        self._key = newKey
    @property
    def value(self):
        return self._value
    @value.setter
    def value(self,newValue):
        self._value = newValue
    def _testConsistency(self,other):
        if type(self) != type(other):
            raise Exception("Need two vertexes here!")
    def __lt__(self,other):
        self._testConsistency(other)
        if self.index <= other.index:
            return True
        return False
    ......
Do I really have to define __lt__, __eq__, __ne__, ... all by myself? It is so verbose. Is there a simpler way to get around this?
Cheers.
Please don't use __cmp__, since it will go away in Python 3.
functools.total_ordering can help you out here. It's meant to be a class decorator. You define one of __lt__(), __le__(), __gt__(), or __ge__() AND __eq__ and it fills in the rest.
As a side note:
Instead of writing this
if self.index <= other.index:
    return True
return False
write this:
return self.index <= other.index
It's cleaner that way. :-)
Using functools.total_ordering, you only need to define __eq__ and one of the ordering operators. It is available from Python 2.7 and 3.2 onwards; in older versions you're out of luck, and something has to define these operators as individual methods. You may still be able to save some code by writing a simpler version of total_ordering yourself, if you need it in several places.
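As a rough sketch of how that could look for the vertex class: ordering by `key` is an assumption here (the question compares an `index` attribute that is never defined), and the properties are dropped for brevity.

```python
from functools import total_ordering


@total_ordering
class Vertex:
    def __init__(self, key, value):
        self.key = key
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Vertex):
            return NotImplemented
        return self.key == other.key  # assumed: vertices compare by key

    def __lt__(self, other):
        if not isinstance(other, Vertex):
            return NotImplemented
        return self.key < other.key


a = Vertex(1, "a")
b = Vertex(2, "b")

# __le__, __gt__ and __ge__ are filled in by the decorator.
print(a < b, a <= b, a > b, a >= b)  # True True False False
```

Only two methods are written by hand; `!=` comes for free in Python 3 and the remaining three come from the decorator.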
When writing custom classes it is often important to allow equivalence by means of the == and != operators. In Python, this is made possible by implementing the __eq__ and __ne__ special methods, respectively. The easiest way I've found to do this is the following method:
class Foo:
    def __init__(self, item):
        self.item = item
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.__dict__ == other.__dict__
        else:
            return False
    def __ne__(self, other):
        return not self.__eq__(other)
Do you know of more elegant means of doing this? Do you know of any particular disadvantages to using the above method of comparing __dict__s?
Note: A bit of clarification--when __eq__ and __ne__ are undefined, you'll find this behavior:
>>> a = Foo(1)
>>> b = Foo(1)
>>> a is b
False
>>> a == b
False
That is, a == b evaluates to False because it really runs a is b, a test of identity (i.e., "Is a the same object as b?").
When __eq__ and __ne__ are defined, you'll find this behavior (which is the one we're after):
>>> a = Foo(1)
>>> b = Foo(1)
>>> a is b
False
>>> a == b
True
Consider this simple problem:
class Number:
    def __init__(self, number):
        self.number = number
n1 = Number(1)
n2 = Number(1)
n1 == n2  # False -- oops
So, Python by default uses the object identifiers for comparison operations:
id(n1) # 140400634555856
id(n2) # 140400634555920
Overriding the __eq__ function seems to solve the problem:
def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, Number):
        return self.number == other.number
    return False
n1 == n2 # True
n1 != n2 # True in Python 2 -- oops, False in Python 3
In Python 2, always remember to override the __ne__ function as well, as the documentation states:
There are no implied relationships among the comparison operators. The
truth of x==y does not imply that x!=y is false. Accordingly, when
defining __eq__(), one should also define __ne__() so that the
operators will behave as expected.
def __ne__(self, other):
    """Overrides the default implementation (unnecessary in Python 3)"""
    return not self.__eq__(other)
n1 == n2 # True
n1 != n2 # False
In Python 3, this is no longer necessary, as the documentation states:
By default, __ne__() delegates to __eq__() and inverts the result
unless it is NotImplemented. There are no other implied
relationships among the comparison operators, for example, the truth
of (x<y or x==y) does not imply x<=y.
But that does not solve all our problems. Let’s add a subclass:
class SubNumber(Number):
    pass
n3 = SubNumber(1)
n1 == n3 # False for classic-style classes -- oops, True for new-style classes
n3 == n1 # True
n1 != n3 # True for classic-style classes -- oops, False for new-style classes
n3 != n1 # False
Note: Python 2 has two kinds of classes:
classic-style (or old-style) classes, that do not inherit from object and that are declared as class A:, class A(): or class A(B): where B is a classic-style class;
new-style classes, that do inherit from object and that are declared as class A(object) or class A(B): where B is a new-style class.
Python 3 has only new-style classes that are declared as class A:, class A(object): or class A(B):.
For classic-style classes, a comparison operation always calls the method of the first operand, while for new-style classes, it always calls the method of the subclass operand, regardless of the order of the operands.
So here, if Number is a classic-style class:
n1 == n3 calls n1.__eq__;
n3 == n1 calls n3.__eq__;
n1 != n3 calls n1.__ne__;
n3 != n1 calls n3.__ne__.
And if Number is a new-style class:
both n1 == n3 and n3 == n1 call n3.__eq__;
both n1 != n3 and n3 != n1 call n3.__ne__.
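The new-style behaviour can be observed directly on Python 3 with a small sketch. The `Base` and `Sub` classes and the `calls` list are made up for this demonstration.

```python
calls = []  # records which __eq__ implementation Python invoked


class Base:
    def __eq__(self, other):
        calls.append("Base.__eq__")
        return isinstance(other, Base)


class Sub(Base):
    def __eq__(self, other):
        calls.append("Sub.__eq__")
        return isinstance(other, Base)


result = Base() == Sub()
print(result)  # True
print(calls)   # ['Sub.__eq__']
```

Even though the `Base` instance is the left operand, the subclass instance on the right gets the first chance to answer.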
To fix the non-commutativity issue of the == and != operators for Python 2 classic-style classes, the __eq__ and __ne__ methods should return the NotImplemented value when an operand type is not supported. The documentation defines the NotImplemented value as:
Numeric methods and rich comparison methods may return this value if
they do not implement the operation for the operands provided. (The
interpreter will then try the reflected operation, or some other
fallback, depending on the operator.) Its truth value is true.
In this case the operator delegates the comparison operation to the reflected method of the other operand. The documentation defines reflected methods as:
There are no swapped-argument versions of these methods (to be used
when the left argument does not support the operation but the right
argument does); rather, __lt__() and __gt__() are each other’s
reflection, __le__() and __ge__() are each other’s reflection, and
__eq__() and __ne__() are their own reflection.
The result looks like this:
def __eq__(self, other):
    """Overrides the default implementation"""
    if isinstance(other, Number):
        return self.number == other.number
    return NotImplemented
def __ne__(self, other):
    """Overrides the default implementation (unnecessary in Python 3)"""
    x = self.__eq__(other)
    if x is NotImplemented:
        return NotImplemented
    return not x
Returning the NotImplemented value instead of False is the right thing to do even for new-style classes if commutativity of the == and != operators is desired when the operands are of unrelated types (no inheritance).
Are we there yet? Not quite. How many unique numbers do we have?
len(set([n1, n2, n3])) # 3 -- oops
Sets use the hashes of objects, and by default Python returns the hash of the identifier of the object. Let’s try to override it:
def __hash__(self):
    """Overrides the default implementation"""
    return hash(tuple(sorted(self.__dict__.items())))
len(set([n1, n2, n3])) # 1
The end result looks like this (I added some assertions at the end for validation):
class Number:
    def __init__(self, number):
        self.number = number
    def __eq__(self, other):
        """Overrides the default implementation"""
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented
    def __ne__(self, other):
        """Overrides the default implementation (unnecessary in Python 3)"""
        x = self.__eq__(other)
        if x is not NotImplemented:
            return not x
        return NotImplemented
    def __hash__(self):
        """Overrides the default implementation"""
        return hash(tuple(sorted(self.__dict__.items())))
class SubNumber(Number):
    pass
n1 = Number(1)
n2 = Number(1)
n3 = SubNumber(1)
n4 = SubNumber(4)
assert n1 == n2
assert n2 == n1
assert not n1 != n2
assert not n2 != n1
assert n1 == n3
assert n3 == n1
assert not n1 != n3
assert not n3 != n1
assert not n1 == n4
assert not n4 == n1
assert n1 != n4
assert n4 != n1
assert len(set([n1, n2, n3, ])) == 1
assert len(set([n1, n2, n3, n4])) == 2
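One more detail that is specific to Python 3: a class that defines __eq__ without defining __hash__ gets its __hash__ set to None, so its instances cannot be put into a set at all. A minimal sketch with two hypothetical classes:

```python
class EqOnly:
    """Defines __eq__ but not __hash__."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, EqOnly):
            return self.number == other.number
        return NotImplemented


class EqAndHash(EqOnly):
    """Adds a hash that is consistent with __eq__."""

    def __hash__(self):
        return hash(self.number)


print(EqOnly.__hash__)  # None

try:
    {EqOnly(1)}
except TypeError as exc:
    print("TypeError:", exc)

print(len({EqAndHash(1), EqAndHash(1)}))  # 1
```

Objects that compare equal must hash equal, so it is safest to compute the hash from the same attributes that __eq__ compares and to avoid mutating them while the object sits in a set or dict.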
You need to be careful with inheritance:
>>> class Foo:
...     def __eq__(self, other):
...         if isinstance(other, self.__class__):
...             return self.__dict__ == other.__dict__
...         else:
...             return False
...
>>> class Bar(Foo): pass
...
>>> b = Bar()
>>> f = Foo()
>>> f == b
True
>>> b == f
False
Check types more strictly, like this:
def __eq__(self, other):
    if type(other) is type(self):
        return self.__dict__ == other.__dict__
    return False
Besides that, your approach will work fine, that's what special methods are there for.
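A quick sketch to confirm that the stricter check gives the same answer in both directions, reusing the `Foo` and `Bar` names from the session above:

```python
class Foo:
    def __eq__(self, other):
        # Exact type match: a subclass instance is never equal to a Foo.
        if type(other) is type(self):
            return self.__dict__ == other.__dict__
        return False


class Bar(Foo):
    pass


f = Foo()
b = Bar()

print(f == b, b == f)  # False False
print(f == Foo())      # True
```

Whether a `Bar` should ever equal a `Foo` is a design decision; the exact-type check simply says no in both directions.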
The way you describe is the way I've always done it. Since it's totally generic, you can always break that functionality out into a mixin class and inherit it in classes where you want that functionality.
class CommonEqualityMixin(object):
    def __eq__(self, other):
        return (isinstance(other, self.__class__)
                and self.__dict__ == other.__dict__)
    def __ne__(self, other):
        return not self.__eq__(other)
class Foo(CommonEqualityMixin):
    def __init__(self, item):
        self.item = item
Not a direct answer but seemed relevant enough to be tacked on as it saves a bit of verbose tedium on occasion. Cut straight from the docs...
functools.total_ordering(cls)
Given a class defining one or more rich comparison ordering methods, this class decorator supplies the rest. This simplifies the effort involved in specifying all of the possible rich comparison operations:
The class must define one of __lt__(), __le__(), __gt__(), or __ge__(). In addition, the class should supply an __eq__() method.
New in version 2.7
@total_ordering
class Student:
    def __eq__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) ==
                (other.lastname.lower(), other.firstname.lower()))
    def __lt__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) <
                (other.lastname.lower(), other.firstname.lower()))
You don't have to override both __eq__ and __ne__; you can override only __cmp__, but this will affect the results of ==, !=, <, > and so on.
is tests for object identity. This means a is b will be True in the case when a and b both hold the reference to the same object. In python you always hold a reference to an object in a variable not the actual object, so essentially for a is b to be true the objects in them should be located in the same memory location. How and most importantly why would you go about overriding this behaviour?
Edit: I didn't know __cmp__ was removed from python 3 so avoid it.
From this answer: https://stackoverflow.com/a/30676267/541136 I have demonstrated that, while it's correct to define __ne__ in terms of __eq__, instead of
def __ne__(self, other):
    return not self.__eq__(other)
you should use:
def __ne__(self, other):
    return not self == other
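A short sketch of why going through the == operator is the safer spelling. The `Number` class is a hypothetical example whose __eq__ returns NotImplemented for foreign types.

```python
class Number:
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, Number):
            return self.number == other.number
        return NotImplemented

    def __ne__(self, other):
        # `self == other` runs the full operator machinery, so a
        # NotImplemented result is resolved before `not` is applied.
        return not self == other


n = Number(1)

print(n.__eq__(5) is NotImplemented)  # True: the raw method call leaks the sentinel
print(n != 5)                         # True
print(n != Number(1))                 # False
```

Calling `self.__eq__(other)` directly can hand the NotImplemented sentinel to `not`, which is not a meaningful answer, whereas `self == other` always yields a real boolean here.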
I think that the two terms you're looking for are equality (==) and identity (is). For example:
>>> a = [1,2,3]
>>> b = [1,2,3]
>>> a == b
True <-- a and b have values which are equal
>>> a is b
False <-- a and b are not the same list object
The 'is' test will test for identity using the builtin 'id()' function which essentially returns the memory address of the object and therefore isn't overloadable.
However in the case of testing the equality of a class you probably want to be a little bit more strict about your tests and only compare the data attributes in your class:
import types
class ComparesNicely(object):
    def __eq__(self, other):
        # items() works on both Python 2 and 3 (iteritems() is Python 2 only)
        for key, value in self.__dict__.items():
            if (isinstance(value, types.FunctionType) or
                    key.startswith("__")):
                continue
            if key not in other.__dict__:
                return False
            if other.__dict__[key] != value:
                return False
        return True
This code will only compare non function data members of your class as well as skipping anything private which is generally what you want. In the case of Plain Old Python Objects I have a base class which implements __init__, __str__, __repr__ and __eq__ so my POPO objects don't carry the burden of all that extra (and in most cases identical) logic.
Instead of using subclassing/mixins, I like to use a generic class decorator
def comparable(cls):
    """ Class decorator providing generic comparison functionality """
    def __eq__(self, other):
        return isinstance(other, self.__class__) and self.__dict__ == other.__dict__
    def __ne__(self, other):
        return not self.__eq__(other)
    cls.__eq__ = __eq__
    cls.__ne__ = __ne__
    return cls
Usage:
@comparable
class Number(object):
    def __init__(self, x):
        self.x = x
a = Number(1)
b = Number(1)
assert a == b
This incorporates the comments on Algorias' answer, and compares objects by a single attribute because I don't care about the whole dict. hasattr(other, "id") must be true, but I know it is because I set it in the constructor.
def __eq__(self, other):
    if other is self:
        return True
    if type(other) is not type(self):
        # delegate to superclass
        return NotImplemented
    return other.id == self.id
I wrote a custom base with a default implementation of __ne__ that simply negates __eq__:
class HasEq(object):
    """
    Mixin that provides a default implementation of ``object.__ne__`` using the subclass's implementation of ``object.__eq__``.
    This overcomes Python's deficiency of ``==`` and ``!=`` not being symmetric when overloading comparison operators
    (i.e. ``not x == y`` *does not* imply that ``x != y``), so whenever you implement
    `object.__eq__ <https://docs.python.org/2/reference/datamodel.html#object.__eq__>`_, it is expected that you
    also implement `object.__ne__ <https://docs.python.org/2/reference/datamodel.html#object.__ne__>`_
    NOTE: in Python 3+ this is no longer necessary (see https://docs.python.org/3/reference/datamodel.html#object.__ne__)
    """
    def __ne__(self, other):
        """
        Default implementation of ``object.__ne__(self, other)``, delegating to ``self.__eq__(other)``.
        When overriding ``object.__eq__`` in Python, one should also override ``object.__ne__`` to ensure that
        ``not x == y`` is the same as ``x != y``
        (see `object.__eq__ <https://docs.python.org/2/reference/datamodel.html#object.__eq__>`_ spec)
        :return: ``NotImplemented`` if ``self.__eq__(other)`` returns ``NotImplemented``, otherwise ``not self.__eq__(other)``
        """
        equal = self.__eq__(other)
        # the above result could be either True, False, or NotImplemented
        if equal is NotImplemented:
            return NotImplemented
        return not equal
If you inherit from this base class, you only have to implement __eq__ and the base class provides __ne__.
In retrospect, a better approach might have been to implement it as a decorator instead, something like @functools.total_ordering.