Object membership in sets

This is really two questions:
Why isn't the membership operator (__contains__) ever being called?
Why is D in nodeList, but not in nodeSet?
My goal is for D to be "in" both nodeList and nodeSet, because it has the same loc as A.
class Node(object):
    def __init__(self, loc):
        self.loc = loc
    def __eq__(self, other):
        print "eq: self.getLoc(): {}, other.getLoc(): {}".format(self.getLoc(), other.getLoc())
        if self.getLoc() == other.getLoc():
            return True
        return False
    def __contains__(self, other):
        print "contains: self.getLoc(): {}, other.getLoc(): {}".format(self.getLoc(), other.getLoc())
        if self.getLoc() == other.getLoc():
            return True
        return False
    def setLoc(self, loc):
        self.loc = loc
    def getLoc(self):
        return self.loc
if __name__ == "__main__":
    A = Node((1,1))
    B = Node((2,2))
    C = Node((3,3))
    D = Node((1,1))
    nodeList = [A, B, C]
    nodeSet = set()
    nodeSet.add(A)
    nodeSet.add(B)
    nodeSet.add(C)
    print "A in nodeList: {}".format(A in nodeList)
    print "A in nodeSet: {}".format(A in nodeSet)
    print "D in nodeList: {}".format(D in nodeList)
    print "D in nodeSet: {}".format(D in nodeSet)
This returns True, True, True, False. Apparently, the __contains__ operator is never called. I would like it to return True, True, True, True.
Any other critiques of my code are of course welcome, as I am a python beginner.

Why would Node.__contains__ ever be called? You never have a Node as the right-hand-side of an in expression.

See the documentation re __hash__() - in short:
[I]f [a class] defines __cmp__() or __eq__() but not __hash__(),
its instances will not be usable in hashed collections.
A set is a hashed collection. You'll want to make sure you implement Node.__hash__()
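A minimal sketch of that advice, assuming loc is a hashable tuple that is not changed while the node sits in a set. It uses a trimmed-down Node (no getters or print calls) rather than the asker's exact class, and the names node_list, node_set, d_in_list and d_in_set are only for illustration:

```python
class Node(object):
    def __init__(self, loc):
        self.loc = loc  # assumed hashable, e.g. a tuple

    def __eq__(self, other):
        if not isinstance(other, Node):
            return NotImplemented
        return self.loc == other.loc

    def __ne__(self, other):
        # Python 2 does not derive != from ==; this is harmless on Python 3.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Equal nodes must hash equally, so hash the same data __eq__ compares.
        return hash(self.loc)


A, B, C, D = Node((1, 1)), Node((2, 2)), Node((3, 3)), Node((1, 1))
node_list = [A, B, C]
node_set = {A, B, C}

d_in_list = D in node_list  # list membership only needs __eq__
d_in_set = D in node_set    # set membership needs __hash__ and __eq__
```

With the hash derived from loc, D lands in the same bucket as A and the set then falls back to __eq__ to confirm the match. Calling something like setLoc on a node that is already in a set would change its hash and make it unfindable, so treat loc as read-only once the node has been added.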

Related

Override eq with more logic

I have the following python classes
from datetime import datetime
class Message:
    def __init__(self, start_date, attributes):
        self.start_date = start_date
        self.attributes = attributes
    def __eq__(self, other):
        if not isinstance(other, Message):
            return False
        if not self.attributes == other.attributes:
            return False
        self_start_date = datetime.strptime(self.start_date, '%Y-%m-%d')
        other_start_date = datetime.strptime(other.start_date, '%Y-%m-%d')
        if not abs(self_start_date.toordinal() - other_start_date.toordinal()) == 1:
            return False
        return True
class Attribute:
    def __init__(self, attribute_a, attribute_b):
        self.attribute_a = attribute_a
        self.attribute_b = attribute_b
    def __eq__(self, other):
        if not isinstance(other, Attribute):
            return False
        if not self.attribute_a == other.attribute_a:
            return False
        if not self.attribute_b == other.attribute_b:
            return False
        return True
From a business perspective, two messages are equal if they have the same attributes and consecutive dates.
I have two questions:
Is it valid to put business logic inside __eq__ (such as the check that the dates are consecutive)?
If so, I would like to build a set from Message instances and discard the ones that are equal by the definition above. How do I need to override __hash__?

Messages are equal if they contain consecutive dates? That doesn't seem right. Equality normally has three properties:
Reflexive: a == a. This relation isn't reflexive as messages aren't equal to themselves.
Symmetric: if a == b then b == a. Your relation is symmetric since you use abs.
Transitive: if a == b and b == c then a == c. It's not transitive. Jan 1 and Jan 3 are not consecutive even though both are consecutive with Jan 2.
Because this relation violates these properties, you can't use your objects in sets or as dictionary keys. You can't usefully implement __hash__ to match this definition of __eq__: a message isn't equal to itself, but its hash will be equal to itself. This will confuse any data structure that uses either method.
Don't use __eq__ for this. It's not the right relation. Make a new method are_consecutive.
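A rough sketch of what such a method could look like, reusing the date logic from the question. The method name are_consecutive comes from the suggestion above; the sample messages and the use of a plain tuple in place of the Attribute class are assumptions made only for this illustration:

```python
from datetime import datetime


class Message(object):
    def __init__(self, start_date, attributes):
        self.start_date = start_date  # 'YYYY-MM-DD' string, as in the question
        self.attributes = attributes

    def are_consecutive(self, other):
        """Return True if both messages share attributes and are one day apart."""
        if not isinstance(other, Message):
            return False
        if self.attributes != other.attributes:
            return False
        first = datetime.strptime(self.start_date, '%Y-%m-%d')
        second = datetime.strptime(other.start_date, '%Y-%m-%d')
        return abs(first.toordinal() - second.toordinal()) == 1


# A tuple stands in for the question's Attribute class (assumption).
jan1 = Message('2020-01-01', ('a', 'b'))
jan2 = Message('2020-01-02', ('a', 'b'))
jan3 = Message('2020-01-03', ('a', 'b'))

reflexive = jan1.are_consecutive(jan1)    # a message vs. itself
first_pair = jan1.are_consecutive(jan2)   # Jan 1 vs. Jan 2
second_pair = jan2.are_consecutive(jan3)  # Jan 2 vs. Jan 3
transitive = jan1.are_consecutive(jan3)   # Jan 1 vs. Jan 3
```

Because this Message leaves __eq__ and __hash__ alone, instances keep the default identity-based behaviour and remain safe to put in sets and dictionaries. The checks at the bottom show why the relation cannot serve as equality: it is neither reflexive nor transitive.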

two sets were not equal, but are equal after casting to list and cast back to set

Has anyone seen this weird outcome before?
>>> a == b
False
>>> list(sorted(a)) == list(sorted(b))
True
>>> set(list(a)) == set(list(b))
True
Here a and b are sets containing instances of a custom class.
This custom class inherits from MutableMapping and implements both __eq__() and __hash__() as follows:
def to_json(self) -> dict:
    # Code returning the data of this class
    ...
def __eq__(self, other):
    if isinstance(other, Model):
        return self.to_json() == other.to_json() and type(self) == type(other)
    elif isinstance(other, MutableMapping):
        return self.to_json() == other
    else:
        return False
def __hash__(self):
    d = self.to_json()
    hash_list = []
    for k, v in d.items():
        if isinstance(v, list):
            v = tuple(v)
        elif isinstance(v, dict):
            v = tuple(v.items())
        elif isinstance(v, Model):
            v = hash(v)
        hash_list.append((k, v))
    return hash(tuple(hash_list))
I also checked the hashes of these elements, and they turn out to be the same. Here is my test:
>>> [hash(m) for m in a]
[-1696378346402890742, 3465342798672228497, 5576155172607749152]
>>> [hash(m) for m in b]
[-1696378346402890742, 3465342798672228497, 5576155172607749152]
I've found that the problem has something to do with the in operator, but I don't know what I should implement, or why it behaves this way.
>>> [(m in b) for m in a]
[False, False, False]
>>> [(m in b) for m in set(list(a))]
[False, False, False]
>>> [(m in set(list(b))) for m in a]
[True, True, True]
>>> [(m in set(list(b))) for m in set(list(b))]
[True, True, True]
Any fix that could potentially avoid/correct this weird behavior and the reason why this is how it works would be appreciated. Thanks!

I think that for a list the outcome depends on the result of __eq__ only, whereas for sets it depends on __hash__ as well.
In a minimal example like the following, you can see that lists of different objects that compare equal via __eq__ are evaluated as equal, whereas sets are not unless the hashes match too. If you define a __hash__ function that depends on the value only (not on the instance), the corresponding sets also compare equal.
class EA2:
    def __init__(self, id):
        self.id = id
    def __eq__(self, other):
        if type(self) == type(other):
            return self.id == other.id
        else:
            return False
    def __hash__(self):
        #a = hash((id(self), self.id))
        a = self.id
        #a = randint(1,1000)
        print("hash= ", a)
        return a
print({EA2(1), EA2(2)} == {EA2(1), EA2(2)})
print(list({EA2(1), EA2(2)}) == list({EA2(1), EA2(2)}))
print(EA2(1) == EA2(1))
If your comparisons evaluate to False despite equal hashes, maybe there is a different implementation of the __hash__-function around (due to subclassing)?
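Another scenario that reproduces the same symptoms (m in b is False, yet m in set(list(b)) is True) is an element being changed after it was added to the set. This is offered as a possibility to check, not as a diagnosis of the code in the question. In the sketch below, Model is a hypothetical stand-in whose hash depends on mutable data:

```python
class Model(object):
    """Hypothetical stand-in: equality and hash both depend on mutable data."""

    def __init__(self, data):
        self.data = dict(data)

    def __eq__(self, other):
        return isinstance(other, Model) and self.data == other.data

    def __hash__(self):
        return hash(tuple(sorted(self.data.items())))


m = Model({"x": 1})
b = {m}              # the set records the hash m had at insertion time

m.data["x"] = 2      # mutate afterwards: hash(m) changes, the stored hash does not

in_original = m in b            # looks up the new hash, misses the stale entry
in_rebuilt = m in set(list(b))  # rebuilding re-hashes every element
```

If this is what happened, the usual remedies are to avoid mutating objects while they are in a set, or to base __hash__ only on fields that never change.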

Context aware function

I have a piece of code below:
# The only difference is grad
class TestOne(...):
    def init(self):
        self.input_one = tr.allocate( ..., grad = False)
        self.input_two = tr.allocate( ..., grad = False)
class TestTwo(...):
    def init(self):
        self.input_one = tr.allocate( ..., grad = True)
        self.input_two = tr.allocate( ..., grad = False)
class TestThree(...):
    def init(self):
        self.input_one = tr.allocate( ..., grad = False)
        self.input_two = tr.allocate( ..., grad = True)
Test1 = TestOne()
Test2 = TestTwo()
Test3 = TestThree()
# definition of allocate. It is a wrapper of the PyTorch randn function
# https://pytorch.org/docs/stable/torch.html#torch.randn
def allocate(..., grad):
    ...
    return torch.randn(..., requires_grad=grad)
I want to reduce the duplicated code by implementing just one class that can still generate the same objects as the code above.
class Test(...):
    # how to make it return different values?
    def auto_set(self):
        return False
    def init(self):
        self.input_one = tr.allocate( ..., grad = self.auto_set())
        self.input_two = tr.allocate( ..., grad = self.auto_set())
Test1 = Test()
# grad of input_one and input_two will be `False, False`
Test2 = Test()
# grad of input_one and input_two will be `True, False`
Test3 = Test()
# grad of input_one and input_two will be `False, True`
This is part of a big project, so I can't change the interface of the init function. There could be N inputs, which would require N + 1 different classes. That does not scale, so I want to find a better solution.
PS: My previous question caused too much confusion, so I changed it, hoping to clarify what I really want.
Just posting my solution here:
import copy
class Test(object):
    init_counter = 0
    num_variable = 0
    def increase_init_counter(self):
        Test.init_counter += 1
        Test.auto_set_counter = 0
    def auto_set(self):
        if Test.init_counter == 0:
            # first pass: just count how many variables call auto_set
            Test.num_variable += 1
            return False
        else:
            print ("init_counter: {}, auto_set_counter: {}".format(Test.init_counter, Test.auto_set_counter))
            Test.auto_set_counter += 1
            if Test.init_counter == Test.auto_set_counter:
                return True
            else:
                return False
    def init(self):
        self.A = self.auto_set()
        self.B = False
        self.C = self.auto_set()
        print ("A: {}, B: {}, C: {}".format(self.A, self.B, self.C))
# === Test
TestA = Test()
TestA.init()
for _ in range(TestA.num_variable):
    TestB = copy.deepcopy(TestA)
    TestB.increase_init_counter()
    TestB.init()

If you find yourself using numbered variable names (e.g. v1, v2, v3) you need to stop immediately and think "Should I use a list instead?" - and the answer is "yes" in almost all cases.
Other notes:
To pick random values, make a list of possible values (in this case, [True, False]) and use random.choice()
range() can produce N values, which we can use to build another list of random choices (look up "list comprehension" if you don't understand the [x for x in iterable] syntax).
Classes have __init__ as the constructor, you don't need a manual init function.
Classes should use a capital letter at the start of their name.
Code:
from random import choice
class Test(object):
    def __init__(self, num_values):
        self.values = [choice([True, False]) for _ in range(num_values)]
    def see(self):
        print(self.values)
for _ in range(3):
    test1 = Test(3)
    test1.see()
prints something like:
[False, False, False]
[True, False, True]
[True, True, False]

Let's see if I understand you correctly:
What you can do is add a global, or better, a common (class-level) variable to the class definition, which is incremented when new objects of that class are instantiated (and perhaps also decremented when they are deleted).
This gives you the opportunity to implement different behaviours of __init__() depending on the number of objects created so far.
Imagine a test class like
class Test():
    i = 0
    def __init__(self):
        Test.i += 1
    def __del__(self):
        Test.i -= 1
After creating a first object, the common counter is 1:
t1 = Test()
t1.i
Out: 1
After creating a second object, the common counter is 2:
t2 = Test()
t2.i
Out: 2
... in all existing objects, because it's a common counter:
t1.i
Out: 2
Some sample implementation of what I think you want to achieve:
class Test():
    i = 0
    def __init__(self):
        self.A = bin(Test.i)[-1] == '1'
        self.B = bin(Test.i)[-2] == '1'
        Test.i += 1
    def __del__(self):
        Test.i -= 1
t1 = Test()
print(t1.i, t1.A, t1.B)
# 1 False False
t2 = Test()
print(t2.i, t2.A, t2.B)
# 2 True False
t3 = Test()
print(t3.i, t3.A, t3.B)
# 3 False True

First of all, I suspect that what you need is a list of instance attributes (the variables in each object of the type).
class test(object):
    def __init__(self):
        self.v = []
    # how to make it return different values?
    def auto_set(self):
        return False
    def init(self):
        self.v.append(self.auto_set())
    def see(self):
        print (self.v)
for _ in range(3):
    test1 = test()
    test1.init()
    test1.see()
This will allow you to add to the attribute list. Is that enough to get you moving? We can't suggest a more thorough solution until you explain your system better.
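If the goal is exactly the flag combinations listed in the question (all False, then a single True in each position), one option is to generate those patterns up front instead of tracking counters. This is only a sketch: grad_patterns is a hypothetical helper, and plain booleans stand in for the grad arguments passed to tr.allocate.

```python
def grad_patterns(num_inputs):
    """Yield the all-False pattern, then one pattern per input with only that flag set."""
    yield [False] * num_inputs
    for hot in range(num_inputs):
        yield [index == hot for index in range(num_inputs)]


# Two inputs give N + 1 = 3 patterns, matching TestOne, TestTwo and TestThree.
patterns = list(grad_patterns(2))
for flags in patterns:
    print(flags)
```

Each pattern could then be stored on an instance before its init method runs, so that init reads the flags from the instance rather than hard-coding them. Whether that fits the project's fixed init interface is something only the asker can judge.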

How to modify argument to function cleanly in Python?

I have the following code which I use to modify the node that is passed into the function recursively (the node is wrapped in an array so that the modification is persistent after the function returns):
Is there a better or cleaner way to modify the argument?
class node(object):
    def __init__(self, value, next=None):
        self._value = value
        self._next = next
    def __repr__(self):
        return "node({0})".format(self._value)
    @property
    def next(self):
        return self._next
def foo(a, b):
    if b == 0:
        print "setting to next node",
        a[0] = a[0].next
        print a
        return
    print a
    foo(a, b-1)
    print a
n = node(5, node(8))
foo([n], 2)
Question was answered in: How do I pass a variable by reference?

To modify something, that thing has to be mutable. Your node instances are mutable:
n = node(3)
assert n._value == 3
n._value = 5
assert n._value == 5 # it was modified!
Also, your function never returns a value, which may be the wrong approach in your case. I frankly don't see why you would use numbers (0, n - 1) where the .next value is referenced; these must be node instances, not numbers.
Apparently you're making a linked list implementation, and your foo function tries to remove n-th node by traversing a list. (Please take care to name your functions descriptively; it helps both you and people answering your question.)
That's how I'd do it:
class Node(object): # class names are usually TitleCase
    def __init__(self, value, next=None):
        self.value = value
        self.next = next # no properties for simplicity
    def __repr__(self):
        return "node({0})".format(self.value)
def asList(node): # nice for printing
    if not node:
        return []
    return [node.value] + asList(node.next)
def removeNodeAt(head_node, index):
    """Removes a node from a list. Returns the (new) head node of the list."""
    if index == 0: # changing head
        return head_node.next
    i = 1 # we have handled the index == 0 above
    scan_node = head_node
    while i < index and scan_node.next:
        scan_node = scan_node.next
        i += 1
    # here scan_node.next is the node we want removed, or None
    if scan_node.next:
        scan_node.next = scan_node.next.next # jump over the removed node
    return head_node
It works:
>>> n3 = Node(0, Node(1, Node(2)))
>>> asList(removeNodeAt(n3, 2))
[0, 1]
>>> n3 = Node(0, Node(1, Node(2)))
>>> asList(removeNodeAt(n3, 1))
[0, 2]
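The reason the question needed the one-element list wrapper, and the reason removeNodeAt returns the head instead, is the difference between rebinding a parameter and mutating the object it refers to. A small sketch with hypothetical function names, using plain lists rather than nodes:

```python
def rebind(items):
    # Assigning to the parameter creates a new local binding;
    # the caller's list is left untouched.
    items = items + [99]
    return items


def mutate(items):
    # Calling a mutating method changes the object the caller also sees.
    items.append(99)


original = [1, 2]
rebound = rebind(original)
after_rebind = list(original)  # snapshot after rebind()

mutate(original)
after_mutate = list(original)  # snapshot after mutate()
```

Returning the new value and letting the caller reassign it, as removeNodeAt does with the head node, avoids both the wrapper list and any surprise about which of the two behaviours applies.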

Because the parameter you are operating on is an object, you could use __dict__ to replace all of the object's attributes at once. It is equivalent to changing every attribute of the object. You could try the following code:
class node(object):
    def __init__(self, value, next=None):
        self._value = value
        self._next = next
    def __repr__(self):
        return "node({0})".format(self._value)
    @property
    def next(self):
        return self._next
def foo(a):
    print "setting to next node\n",
    a.__dict__ = getattr(a.next, '__dict__', None)
    return
n = node(5, node(8, node(7)))
print n._value, '->' ,n._next._value
foo(n)
print n._value, '->' ,n._next._value
Hope this could help you.

Method to compare Python dictionaries fails with certain value types?

I can't figure this out. I have two dictionaries which are identical. I use a standard method to determine the differences, of which there should be none. But certain value types are always returned as differences, even when they are not. For example, if a value is a pymongo.bson.ObjectId, the method fails to evaluate it as the same.
d1 = {'Name':'foo','ref1':ObjectId('502e232ca7919d27990001e4')}
d2 = {'Name':'foo','ref1':ObjectId('502e232ca7919d27990001e4')}
d1 == d2
returns:
True
But:
set((k,d1[k]) for k in set(d1) & set(d2) if d1[k] != d2[k])
returns:
set([('ref1',ObjectId('502e232ca7919d27990001e4'))])
So I've figured out that this is weird, no?
d1['ref1'] == d2['ref1'] # True
d1['ref1'] != d2['ref1'] # False
What the?????!?!??!!?

ObjectId('502e232ca7919d27990001e4') creates a new object and by default != compares references. Try for example:
class Obj:
    def __init__(self, value):
        self.value = value
print Obj(1234) == Obj(1234) # False
This will evaluate to False, because they are different instances, even though they hold the same value. To make this work, the class must implement the __eq__ method:
class Obj:
    def __init__(self, value):
        self.value = value
    def __eq__(self, other):
        return self.value == other.value
print Obj(1234) == Obj(1234) # True
To fix this, you can "monkey-patch" the class:
class Obj:
    def __init__(self, value):
        self.value = value
print Obj(1234) == Obj(1234) # False
Obj.__eq__ = lambda a, b: a.value == b.value
print Obj(1234) == Obj(1234) # True
Or compare them by their values directly.
print Obj(1234).value == Obj(1234).value
Compare the values when possible because monkey-patching may break seemingly unrelated code.
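One detail worth keeping in mind: on Python 2, defining __eq__ does not change how != behaves, so a class that implements only __eq__ can report both == and != as True for two equal-valued instances. On Python 3, != is derived from __eq__ by default. Whether this is what happened with the ObjectId version in the question is an assumption. The sketch below uses the Obj stand-in from above, not pymongo's actual class:

```python
class Obj(object):
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Obj):
            return NotImplemented
        return self.value == other.value

    def __ne__(self, other):
        # Needed on Python 2; Python 3 would derive this from __eq__.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Keep instances usable in sets and as dict keys.
        return hash(self.value)


d1 = {'Name': 'foo', 'ref1': Obj(1234)}
d2 = {'Name': 'foo', 'ref1': Obj(1234)}

# Same comparison as in the question: collect keys whose values differ.
differences = set((k, d1[k]) for k in set(d1) & set(d2) if d1[k] != d2[k])
```

With __ne__ consistent with __eq__, the comparison from the question finds no differences between the two dictionaries.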
