specify comparator function in python dictionary

Is there a way to pass in a custom equality comparison function when creating a python dictionary so that it doesn't use the default __eq__ or hash comparators? I'm hoping there is a way to do so but wasn't able to find it so far.
edit: I am looking for a way to provide different definitions of object equality for classes that I defined. Like:
class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b
def c1(a1, a2):  # assume both are A objects
    return a1.a == a2.a
def c2(a1, a2):
    return a1.b == a2.b
# is this possible?
d1 = dict(cmp=c1)
d2 = dict(cmp=c2)
I know I can override __eq__ etc in my class definition but I can only override it once. In Java I can use TreeMap and I am looking for the equivalent in Python.
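The built-in dict takes no comparator argument (dict(cmp=c1) simply creates a dictionary with the string key 'cmp'). One possible workaround, sketched here under assumptions, is a small wrapper that stores entries under key_func(key) instead of the key itself. The names KeyedDict and key_func are hypothetical, and unlike Java's TreeMap this sketch does not keep keys sorted.

```python
class KeyedDict:
    """Dict-like mapping that treats two keys as equal when key_func gives equal results."""

    def __init__(self, key_func):
        self._key_func = key_func
        self._data = {}  # maps key_func(key) -> (original key, value)

    def __setitem__(self, key, value):
        self._data[self._key_func(key)] = (key, value)

    def __getitem__(self, key):
        return self._data[self._key_func(key)][1]

    def __contains__(self, key):
        return self._key_func(key) in self._data

    def __len__(self):
        return len(self._data)


class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b


d1 = KeyedDict(lambda obj: obj.a)  # keys match when their .a values match
d2 = KeyedDict(lambda obj: obj.b)  # keys match when their .b values match

d1[A(1, 2)] = "first"
d2[A(1, 2)] = "first"

print(d1[A(1, 99)])    # same .a, so the stored value is found
print(A(1, 99) in d2)  # different .b, so the key is not found
```

The value returned by key_func must itself be hashable, since it is what the underlying dict actually stores.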

Related

Python: Accessing dict with hashable object fails

I am using a hashable object as a key to a dictionary. The objects are hashable and I can store key-value-pairs in the dict, but when I create a copy of the same object (that gives me the same hash), I get a KeyError.
Here is some small example code:
class Object:
    def __init__(self, x): self.x = x
    def __hash__(self): return hash(self.x)
o1 = Object(1.)
o2 = Object(1.)
hash(o1) == hash(o2)  # This is True
data = {}
data[o1] = 2.
data[o2]  # Desired: This should output 2.
In my scenario above, how can I achieve that data[o2] also returns 2.?

You need to implement both __hash__ and __eq__:
class Object:
    def __init__(self, x): self.x = x
    def __hash__(self): return hash(self.x)
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.x == other.x
        return NotImplemented
Per Python documentation:
if a class does not define an __eq__() method it should not define a __hash__() operation either
After finding a matching hash, Python's dictionary compares the keys using __eq__. Without a custom __eq__, that comparison falls back to object identity, so o1 and o2 count as different keys; that's why the lookup fails.
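A minimal sketch of the difference, using two hypothetical classes (HashOnly and HashAndEq are illustrative names, not from the question):

```python
class HashOnly:
    """Defines __hash__ but inherits the default identity-based __eq__."""
    def __init__(self, x):
        self.x = x
    def __hash__(self):
        return hash(self.x)


class HashAndEq:
    """Defines both __hash__ and __eq__, based on the same attribute."""
    def __init__(self, x):
        self.x = x
    def __hash__(self):
        return hash(self.x)
    def __eq__(self, other):
        if isinstance(other, HashAndEq):
            return self.x == other.x
        return NotImplemented


broken = {HashOnly(1.0): 2.0}
working = {HashAndEq(1.0): 2.0}

print(HashOnly(1.0) in broken)  # False: same hash, but the keys are not equal
print(working[HashAndEq(1.0)])  # 2.0: same hash and equal keys
```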

You can use the __eq__ magic method to implement an equality check on your object.
def __eq__(self, other):
    # C is the class that defines this method
    if isinstance(other, C):
        return self.x == other.x
    return NotImplemented
You can learn more about magic methods from this link.

As stated above, your object needs to implement __eq__ (the equality operator ==). If you want to understand why:
Sometimes different objects have the same hash; this is called a collision.
A dictionary handles this by testing whether the keys are equal. If they are not, it has to resolve the collision. How it does so is an implementation detail and can vary a lot. A naive implementation would keep a list of (key, value) tuples per hash.
Under the hood, such a naive implementation might look like this:
dico[hash(key)] = [(object1, value1), (object2, value2)]
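The naive scheme described above can be sketched as a toy hash table. ToyDict and its methods are hypothetical names for illustration only; CPython's real dict resolves collisions differently (it uses open addressing rather than per-bucket lists), but the role of __eq__ is the same.

```python
class ToyDict:
    """Toy hash table: each bucket is a list of (key, value) pairs."""

    def __init__(self, n_buckets=8):
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # The hash only selects the bucket; it does not identify the key.
        return self._buckets[hash(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:        # equality decides whether to overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))        # collision with a different key: keep both

    def __getitem__(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:        # equality decides which entry matches
                return value
        raise KeyError(key)


d = ToyDict(n_buckets=1)  # a single bucket forces every key to collide
d["a"] = 1
d["b"] = 2
d["a"] = 3
print(d["a"], d["b"])  # 3 2
```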

How to test for approximate equality of generic classes

I'm trying to find out if two classes are equivalent, ignoring types parameters. Say I have
from typing import Generic, TypeVar
T = TypeVar('T')
class A(Generic[T]):
    pass
class B(Generic[T], A[T]):
    pass
class X:
    pass
I'd like each following row to be equivalent
Generic, Generic[T]
A, A[T], A[str], A[int]
B, B[T], B[str], B[int]
X
None of is, ==, isinstance, type, or __class__ work. Comparing __name__ is fragile to someone defining another class with the same name.
For bonus points*, I'd also be interested in an additional way to test equivalence of
A, A[T], A[str], A[int], B, B[T], B[str], B[int]
*not a bounty :p
(The context is that I'd like to find all the subclasses of a class other than Generic)

To recover A from A[T], you can use the __origin__ attribute. On the unparameterized class A it is None in Python 3.6 and missing altogether in Python 3.7 and later, so fall back to the class itself in both cases.
def compare(a, b):
    # __origin__ is None or absent on a plain class: use the object itself
    a_origin = getattr(a, "__origin__", None) or a
    b_origin = getattr(b, "__origin__", None) or b
    return a_origin == b_origin
compare(A, A[int]) # True
compare(A, B[int]) # False
compare(A, A) # True
compare(X, X) # True
According to the linked comment, __origin__ should be available for Union, Optional, Generic, Callable, and Tuple.
It's worth noting that this is an implementation detail. Using this is opening yourself up to the risk of the implementation changing without warning.
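On Python 3.8 and later, typing.get_origin offers a documented way to read the same information. The sketch below assumes that version; the helper names unparameterized and same_generic are hypothetical, and B is declared as B(A[T]) here for brevity.

```python
from typing import Generic, TypeVar, get_origin

T = TypeVar("T")


class A(Generic[T]):
    pass


class B(A[T]):
    pass


class X:
    pass


def unparameterized(cls):
    """Return the class behind a parameterized generic such as A[int]; plain classes pass through."""
    return get_origin(cls) or cls


def same_generic(a, b):
    """True when a and b are the same class once type parameters are ignored."""
    return unparameterized(a) is unparameterized(b)


print(same_generic(A, A[int]))      # True
print(same_generic(A[str], A[T]))   # True
print(same_generic(A, B[int]))      # False
print(same_generic(X, X))           # True

# For the "bonus" grouping, issubclass on the unparameterized classes may be enough:
print(issubclass(unparameterized(B[int]), unparameterized(A[str])))  # True
```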

Initiate subclasses from parent class

Suppose I have a list of inputs that will generate O objects, of the following form:
inps = [['A', 5], ['B', 2]]
and O has subclasses A and B. A and B are each initialized with a single integer (5 or 2 in the example above) and have a method update(self, t), so I believe it makes sense to group them under an O superclass. I could complete the program with a loop:
Os = []
for inp in inps:
    if inp[0] == 'A':
        Os.append(A(inp[1]))
    elif inp[0] == 'B':
        Os.append(B(inp[1]))
and then at runtime,
for O in Os: O.update(t)
I'm wondering, however, if there is a more object oriented way to accomplish this. One way, I suppose, might be to make a fake "O constructor" outside of the O class:
def initO(inp):
    if inp[0] == 'A':
        return A(inp[1])
    elif inp[0] == 'B':
        return B(inp[1])
Os = [initO(inp) for inp in inps]
This is more elegant, in my opinion, and for all intents and purposes gives me the result I want; but it feels like a complete abuse of the class system in Python. Is there a better way to do this, perhaps by initializing A and B from the O constructor?
EDIT: The ideal would be to be able to use
Os = [O(inp) for inp in inps]
while maintaining O as a superclass of A and B.

You could use a dict to map the names to the actual classes:
dct = {'A': A, 'B': B}
[dct[name](argument) for name, argument in inps]
Or if you don't want the list-comprehension:
dct = {'A': A, 'B': B}
Os = []
for inp in inps:
    cls = dct[inp[0]]
    Os.append(cls(inp[1]))

Although it is technically possible to perform call by name in Python, I strongly advise against it. The cleanest way is probably to use a dictionary:
trans = { 'A' : A, 'B' : B }
def initO(inp):
    cons = trans.get(inp[0])
    if cons is not None:
        return cons(*inp[1:])
So here trans is a dictionary that maps names to classes (and thus to the corresponding constructors).
In initO we perform a lookup; if the lookup succeeds, we call the constructor cons with the remaining arguments of inp. If it fails, initO implicitly returns None.
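A self-contained sketch of the dictionary approach, with an explicit error for unknown names instead of a silent None. The update bodies, REGISTRY and make_o are hypothetical stand-ins, since the question does not show what A and B actually do.

```python
class O:
    def __init__(self, value):
        self.value = value

    def update(self, t):
        raise NotImplementedError


class A(O):
    def update(self, t):
        self.value += t  # placeholder behaviour, assumed for illustration


class B(O):
    def update(self, t):
        self.value *= t  # placeholder behaviour, assumed for illustration


REGISTRY = {"A": A, "B": B}  # maps input names to classes


def make_o(inp):
    """Build an O subclass instance from a [name, arg, ...] list."""
    name, *args = inp
    try:
        cls = REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown class name: {name!r}") from None
    return cls(*args)


inps = [["A", 5], ["B", 2]]
Os = [make_o(inp) for inp in inps]
for o in Os:
    o.update(3)
print([o.value for o in Os])  # [8, 6]
```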

In case you really want to create a (direct) subclass from within a parent class you could use the special __subclasses__ method:
class O(object):
    def __init__(self, integer):
        self.value = integer
    @classmethod
    def get_subclass(cls, subclassname, value):
        # probably not a really good name for that method - I'm out of creativity...
        subcls = next(sub for sub in cls.__subclasses__() if sub.__name__ == subclassname)
        return subcls(value)
    def __repr__(self):
        return '{self.__class__.__name__}({self.value})'.format(self=self)
class A(O):
    pass
class B(O):
    pass
This acts like a factory:
>>> O.get_subclass('A', 1)
A(1)
Or as list-comprehension:
>>> [O.get_subclass(*inp) for inp in inps]
In case you want to optimize it and you know that you won't add subclasses while the program runs, you could put the subclasses in a dictionary that maps from __name__ to the subclass:
class O(object):
    __subs = {}
    def __init__(self, integer):
        self.value = integer
    @classmethod
    def get_subclass(cls, subclassname, value):
        if not cls.__subs:
            cls.__subs = {sub.__name__: sub for sub in cls.__subclasses__()}
        return cls.__subs[subclassname](value)
You could probably also use __new__ to implement that behavior or a metaclass but I think a classmethod may be more appropriate here because it's easy to understand and allows for more flexibility.
If you need more than the direct subclasses, you might want to check this recipe to find subclasses of your subclasses as well (I also implemented it in a 3rd party extension package of mine: iteration_utilities.itersubclasses).
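The idea of walking indirect subclasses can be sketched as a short recursive generator. This is an illustrative version, not the linked recipe or the iteration_utilities implementation; iter_subclasses and the class C are hypothetical names.

```python
def iter_subclasses(cls, _seen=None):
    """Yield every direct and indirect subclass of cls exactly once."""
    if _seen is None:
        _seen = set()
    for sub in cls.__subclasses__():
        if sub not in _seen:
            _seen.add(sub)
            yield sub
            # Recurse so that subclasses of subclasses are found too.
            yield from iter_subclasses(sub, _seen)


class O:
    pass


class A(O):
    pass


class B(O):
    pass


class C(A):  # indirect subclass of O
    pass


print(sorted(sub.__name__ for sub in iter_subclasses(O)))  # ['A', 'B', 'C']
```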

Without knowing more about your A and B, it's hard to say. But this looks like a classic case for a switch in a language like C. Python has no switch statement (structural pattern matching with match only arrived in Python 3.10), so a dict or dict-like construct is used instead.
If you're sure your inputs are clean, you can directly get your classes using the globals() function:
Os = [globals()[f](x) for (f, x) in inps]
If you want to sanitize, you can do something like this:
allowed = {'A', 'B'}
Os = [globals()[f](x) for (f, x) in inps if f in allowed]
This solution can also be changed if you prefer to have a fixed dictionary and sanitized inputs:
allowed = {'A', 'B'}
classname_to_class = {k: v for (k, v) in globals().items() if k in allowed}
# Now, you can have a dict mapping class names to classes without writing 'A': A, 'B': B ...
Alternately, if you can prefix all your class definitions, you could even do something like this:
classname_to_class = {k[13:]: v for (k, v) in globals().items() if k.startswith('SpecialPrefix')} # 13 is the length of 'SpecialPrefix'
This solution allows you to just name your classes with a prefix and have the dictionary automatically populate (after stripping out the special prefix if you so choose). These dictionaries are equivalent to trans and dct in the other solutions posted here, except without having to manually generate the dictionary.
Unlike the other solutions posted so far, these reduce the likelihood of a transcription error (and the amount of boilerplate code required) in cases where you have a lot more classes than A and B.

At the risk of drawing more negative fire... we can use metaclasses. This may or may not be suitable for your particular application. Every time you define a subclass of class O, you always have an up-to-date list (well, dict) of O's subclasses. Oh, and this is written for Python 2 (but can be ported to Python 3).
class OMetaclass(type):
    '''This metaclass adds a 'subclasses' attribute to its classes that
    maps subclass name to the class object.'''
    def __init__(cls, name, bases, dct):
        if not hasattr(cls, 'subclasses'):
            cls.subclasses = {}
        else:
            cls.subclasses[name] = cls
        super(OMetaclass, cls).__init__(name, bases, dct)
class O(object):
    __metaclass__ = OMetaclass
### Now, define the rest of your subclasses of O as usual.
class A(O):
    def __init__(self, x): pass
class B(O):
    def __init__(self, x): pass
Now, you have a dictionary, O.subclasses, that contains all the subclasses of O. You can now just do this:
Os = [O.subclasses[cls](arg) for (cls, arg) in inps]
Now, you don't have to worry about weird prefixes for your classes and you won't need to change your code if you're subclassing O already, but you've introduced magic (metaclasses) that may make your program harder to grok.
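Since the answer says the code targets Python 2, here is a sketch of what a Python 3 port might look like. In Python 3 the __metaclass__ attribute is ignored; the metaclass is passed as a keyword in the class statement instead. Storing x on the instances is an assumption added so the result can be inspected.

```python
class OMetaclass(type):
    """Adds a 'subclasses' dict to the base class, mapping subclass name to class."""

    def __init__(cls, name, bases, dct):
        if not hasattr(cls, "subclasses"):
            cls.subclasses = {}         # first class created with this metaclass: the base O
        else:
            cls.subclasses[name] = cls  # every later class: register it by name
        super().__init__(name, bases, dct)


class O(metaclass=OMetaclass):
    pass


class A(O):
    def __init__(self, x):
        self.x = x


class B(O):
    def __init__(self, x):
        self.x = x


inps = [["A", 5], ["B", 2]]
Os = [O.subclasses[name](arg) for name, arg in inps]
print([(type(o).__name__, o.x) for o in Os])  # [('A', 5), ('B', 2)]
```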

Mutable objects in python and constants

I have a class which contains data as attributes and which has a method to return a tuple containing these attributes:
class myclass(object):
    def __init__(self,a,b,c):
        self.a = a
        self.b = b
        self.c = c
    def tuple(self):
        return (self.a, self.b, self.c)
I use this class essentially as a tuple where the items (attributes) can be modified/read through their attribute name. Now I would like to create objects of this class, which would be constants and have pre-defined attribute values, which I could then assign to a variable/mutable object, thereby initializing this variable object's attributes to match the constant object, while at the same time retaining the ability to modify the attributes' values. For example I would like to do this:
constant_object = myclass(1,2,3)
variable_object = constant_object
variable_object.a = 999
Now of course this doesn't work in python, so I am wondering what is the best way to get this kind of functionality?

Now I would like to create objects of this class, which would be constants and have pre-defined attribute values, which I could then assign to a variable/mutable object, thereby initializing this variable object's attributes to match the constant object,
Well, you can't have that. Assignment in Python doesn't initialize anything. It doesn't copy or create anything. All it does is give a new name to the existing value.
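A short sketch of what that means in practice (MyClass is a stand-in for the class in the question):

```python
class MyClass:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c


constant_object = MyClass(1, 2, 3)
variable_object = constant_object  # binds a second name; no copy is made
variable_object.a = 999

print(variable_object is constant_object)  # True: both names refer to one object
print(constant_object.a)                   # 999: the "constant" changed too
```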
If you want to initialize an object, the way to do that in Python is to call the constructor.
So, with your existing code:
new_object = myclass(old_object.a, old_object.b, old_object.c)
If you look at most built-in and stdlib classes, it's a lot more convenient. For example:
a = set([1, 2, 3])
b = set(a)
How do they do that? Simple. Just define an __init__ method that can be called with an existing instance. (In the case of set, this comes for free, because a set can be initialized with any iterable, and sets are iterable.)
If you don't want to give up your existing design, you're going to need a pretty clumsy __init__, but it's at least doable. Maybe this:
_sentinel = object()
def __init__(self, myclass_or_a, b=_sentinel, c=_sentinel):
    if isinstance(myclass_or_a, myclass):
        self.a, self.b, self.c = myclass_or_a.a, myclass_or_a.b, myclass_or_a.c
    else:
        self.a, self.b, self.c = myclass_or_a, b, c
… plus some error handling to check that b is _sentinel in the first case and that it isn't in the other case.
So, however you do it:
constant_object = myclass(1,2,3)
variable_object = myclass(constant_object)
variable_object.a = 999
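Put together with the error handling mentioned above, a complete version might look like the following sketch. The class name MyClass, the parameter name source_or_a and the error messages are illustrative choices, not part of the original answer.

```python
_sentinel = object()  # marks "argument not supplied"


class MyClass:
    def __init__(self, source_or_a, b=_sentinel, c=_sentinel):
        if isinstance(source_or_a, MyClass):
            # Copy-constructor form: MyClass(existing_instance)
            if b is not _sentinel or c is not _sentinel:
                raise TypeError("pass either one MyClass instance or a, b, c")
            self.a, self.b, self.c = source_or_a.a, source_or_a.b, source_or_a.c
        else:
            # Regular form: MyClass(a, b, c)
            if b is _sentinel or c is _sentinel:
                raise TypeError("a, b and c are all required")
            self.a, self.b, self.c = source_or_a, b, c


constant_object = MyClass(1, 2, 3)
variable_object = MyClass(constant_object)
variable_object.a = 999
print(constant_object.a, variable_object.a)  # 1 999
```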

import copy
class myclass(object):
    def __init__(self,a,b,c):
        self.a = a
        self.b = b
        self.c = c
    def tuple(self):
        return (self.a, self.b, self.c)
constant_object = myclass(1,2,3)
variable_object = copy.deepcopy(constant_object)
variable_object.a = 999
print(constant_object.a)
print(variable_object.a)
Output:
1
999

Deepcopying is not entirely necessary in this case, because of the way you've set up your tuple method:
class myclass(object):
    def __init__(self,a,b,c):
        self.a = a
        self.b = b
        self.c = c
    def tuple(self):
        return (self.a, self.b, self.c)
constant_object = myclass(1,2,3)
variable_object = myclass(*constant_object.tuple())
variable_object.a = 999
>>> constant_object.a
1
>>> variable_object.a
999
Usually (as others have suggested), you'd want to deepcopy. This creates a brand new object, with no ties to the object being copied. However, given that you are using only ints, deepcopy is overkill. You're better off doing a shallow copy. As a matter of fact, it might even be faster to call the class constructor on the parameters of the object you already have, seeing as these parameters are ints. This is why I suggested the above code.
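To illustrate when the choice between shallow and deep copying matters, here is a small sketch in which one attribute is a mutable list (an assumption; the question only uses ints):

```python
import copy


class MyClass:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c


original = MyClass(1, 2, [3])
shallow = copy.copy(original)   # new object, but attributes refer to the same values
deep = copy.deepcopy(original)  # new object with recursively copied attributes

shallow.a = 999      # rebinding an attribute never affects the original
shallow.c.append(4)  # mutating the shared list does affect the original

print(original.a)  # 1
print(original.c)  # [3, 4]
print(deep.c)      # [3]
```

With only immutable attributes such as ints, the shallow copy and the deep copy behave the same.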

Comparing two containers for identity of their contents

I have a method that returns a set of objects, and I'm writing a unit test for this method. Is there a generic, tidy and idiomatic way of comparing these for identity (rather than equality)? Or do I need to write a suitable implementation myself?
An example (somewhat contrived to keep it simple):
class Foo(object):
    def has_some_property(self):
        ...
class Container(object):
    def __init__(self):
        self.foo_set = set()
    def add_foo(self, foo):
        self.foo_set.add(foo)
    def foo_objects_that_have_property(self):
        return set([foo for foo in self.foo_set if foo.has_some_property()])
import unittest
class TestCase(unittest.TestCase):
    def testFoo(self):
        c = Container()
        x, y, z = Foo(), Foo(), Foo()
        ...
        self.assertContentIdentity(c.foo_objects_that_have_property(), set([x, y]))
Importantly, testing here for equality won't do, since mutating the objects returned by foo_objects_that_have_property() may lead to inconsistent results depending on how those objects are used differently in Container even if they are "equal" at the time of the test.

The best I can come up with is:
# both are methods of the TestCase subclass
@staticmethod
def set_id(c):
    return set([id(e) for e in c])
def assertContentIdentity(self, a, b):
    self.assertEqual(self.set_id(a), self.set_id(b))
However, this is specialised for sets and can't deal with nested containers.

A simple, albeit not the most efficient, way to do it:
def assertContentIdentity(set1, set2):
    set1 = set([id(a) for a in set1])
    set2 = set([id(a) for a in set2])
    assert set1 == set2
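A small sketch of why comparing ids differs from comparing with ==. The Foo class here defines value-based equality purely for illustration, and same_objects is a hypothetical helper name.

```python
def same_objects(a, b):
    """True when both iterables hold exactly the same objects (compared by id, not ==)."""
    return {id(x) for x in a} == {id(x) for x in b}


class Foo:
    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        return isinstance(other, Foo) and self.n == other.n

    def __hash__(self):
        return hash(self.n)


x, y = Foo(1), Foo(2)
twin = Foo(1)  # equal to x, but a distinct object

print({x, y} == {twin, y})              # True: equality cannot tell x and twin apart
print(same_objects({x, y}, {twin, y}))  # False: twin is not the same object as x
print(same_objects({x, y}, [y, x]))     # True: same objects, container type and order ignored
```

Comparing ids is only meaningful while the objects are alive, because an id can be reused after an object is garbage collected.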

x is y won't work here since that would tell me that the sets are different, which I know already. I want to know if the objects that they contain are the same objects or different objects.
Then you need to write your own function, like
set([id(x) for x in X]) == set([id(y) for y in Y])
