Defining when an object is contained in a class - python

From what I have researched, most operators and methods can be overridden when creating a class in Python, for example by defining __add__(self, other).
My "problem" (more like I don't quite understand how it is done) is that to verify whether something is in my class I obviously have to use __contains__(self, theThing).
I thought this should return a boolean value in the code itself, but in example code I have seen, like this:
def __contains__(self, posORname):
    return [node.getId() for node in self.tree if node.getId() == posORname or node.getName() == posORname]
What it returns is therefore a list containing the Ids of the matching items.
Could someone explain why this is done instead of returning True or False? And if so, shouldn't it be implicitly possible to get the index of an item in a structure by just using in?
Thanks :D

For both Python 2 and Python 3, __contains__ should return true or false. The question then is what counts as true and what counts as false, which is covered by truth value testing.
The following values are considered false:
None
False
zero of any numeric type, for example, 0, 0.0, 0j.
any empty sequence, for example, '', (), [].
any empty mapping, for example, {}.
instances of user-defined classes, if the class defines a __bool__() or __len__() method, when that method returns the integer zero or bool value False.

__contains__() is expected to return a boolean value. In your case, __contains__() is returning a list with any tree nodes that match posORname. Thus, it's basically just a lazy way of implementing __contains__(), because in a boolean context:
An empty list is equivalent to False.
A non-empty list is equivalent to True.
While you could potentially have __contains__() return a data structure with the index or ID of the matching tree node, the in operator doesn't care; it only exists to check whether or not an item is contained by an object. Also, __contains__() is not intended to be called directly, so relying on it to return such information would be an abuse of the __contains__() method.
Instead, you would be better off implementing a separate method for getting the index/id of a node.
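As a minimal sketch of that suggestion, assuming a hypothetical Tree class whose nodes expose getId() and getName() as in the question (the Node class, the find_ids name, and the sample data are made up for illustration):

```python
class Node:
    """Hypothetical node type with the accessors used in the question."""

    def __init__(self, node_id, name):
        self._id = node_id
        self._name = name

    def getId(self):
        return self._id

    def getName(self):
        return self._name


class Tree:
    def __init__(self, nodes):
        self.tree = list(nodes)

    def _matches(self, node, posORname):
        return node.getId() == posORname or node.getName() == posORname

    def __contains__(self, posORname):
        # Return a real bool and stop at the first match.
        return any(self._matches(node, posORname) for node in self.tree)

    def find_ids(self, posORname):
        # Separate, explicitly named method for callers that need the ids.
        return [node.getId() for node in self.tree
                if self._matches(node, posORname)]


tree = Tree([Node(1, "root"), Node(2, "leaf")])
print("leaf" in tree)         # True
print("missing" in tree)      # False
print(tree.find_ids("leaf"))  # [2]
```

Keeping the lookup in its own method means the in operator only answers the membership question, and callers who need the ids ask for them by name.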

The result of __contains__ is converted to a boolean according to the usual rules, so that, e.g., empty lists count as False while lists with something in them count as True. So for your example, if the list has anything in it --- that is, any items match the criteria in that list comprehension --- then the in test will be True.
This auto-conversion-to-bool behavior does not appear to be explicitly documented, and is different from other operators (like < and >), which return what they return without converting to bool. There is some discussion of the behavior here.
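The difference can be seen with a small experiment. This is only an illustration, and the Probe class is hypothetical: both special methods return a list, but only the in test converts it to a bool.

```python
class Probe:
    """Hypothetical class whose special methods return non-bool values."""

    def __contains__(self, item):
        return [item]  # a non-empty (truthy) list

    def __lt__(self, other):
        return [other]  # also a list


p = Probe()
in_result = 3 in p   # result of __contains__ is converted to bool
lt_result = p < 3    # result of __lt__ is returned unchanged

print(in_result)  # True
print(lt_result)  # [3]
```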

Related

type(x) is list vs type(x) == list

In Python, suppose one wants to test whether the variable x is a reference to a list object. Is there a difference between if type(x) is list: and if type(x) == list:? This is how I understand it. (Please correct me if I am wrong)
type(x) is list tests whether the expressions type(x) and list evaluate to the same object and type(x) == list tests whether the two objects have equivalent (in some sense) values.
type(x) == list should evaluate to True as long as x is a list. But can type(x) evaluate to a different object from what list refers to?
What exactly does the expression list evaluate to? (I am new to Python, coming from C++, and still can't quite wrap my head around the notion that types are also objects.) Does list point to somewhere in memory? What data live there?
The "one obvious way" to do it, which preserves the spirit of "duck typing", is isinstance(x, list). Better yet, in most cases one's code won't be specific to a list but could work with any sequence (or maybe it needs a mutable sequence). So the recommendation is actually:
from collections.abc import MutableSequence
...
if isinstance(x, MutableSequence):
    ...
Now, going into your specific questions:
What exactly does the expression list evaluate to? Does list point to somewhere in memory? What data live there?
list in Python points to a class. A class that can be inherited, extended, etc...and thanks to a design choice of Python, the syntax for creating an instance of a class is indistinguishable from calling a function.
So, when teaching Python to novices, one could just casually mention that list is a "function" (I prefer not to, since it is outright false; the generic term for both functions and classes, in that they can be "called" and will return a result, is callable).
Being a class, list does live in a specific place in memory - the "where" does not make any difference when coding in Python - but yes, there is one single place in memory where a class, which in Python is also an object, an instance of type, exists as a data structure with pointers to the various methods that one can use in a Python list.
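A few checks in the interpreter illustrate the point that list is itself an object, namely an instance of type. This is just a sketch for exploration; the variable names are arbitrary.

```python
x = [1, 2, 3]

# type(x) and the name `list` refer to one and the same class object.
print(type(x) is list)          # True
print(id(type(x)) == id(list))  # True

# The class is itself an object: an instance of `type`, and callable.
print(isinstance(list, type))   # True
print(callable(list))           # True

# Being an ordinary object, it can be bound to another name and called.
make_list = list
print(make_list("ab"))          # ['a', 'b']
```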
As for:
type(x) is list tests whether the expressions type(x) and list evaluate to the same object and type(x) == list tests whether the two objects have equivalent (in some sense) values.
That is correct: is is a special operator that, unlike others, cannot be overridden for any class and checks for object identity. In the CPython implementation, it checks whether both operands are at the same memory address (but keep in mind that that address, though visible through the built-in function id, behaves as if it is opaque from Python code). As for the "sense" in which objects are "equal" in Python: one can always override the behavior of the == operator for a given object by creating the specially named method __eq__ in its class. (The same is true for each other operator; the language data model lists all available "magic" methods.)
For lists, the implemented default comparison automatically compares each element recursively (calling the .__eq__ method for each item pair in both lists, if they have the same size to start with, of course)
type(x) == list should evaluate to True as long as x is a list. But can type(x) evaluate to a different object from what list refers to?
Not if "x" is a list proper: type(x) will always evaluate to list. But == would fail if x were an instance of a subclass of list, or another Sequence implementation: that is why it is always better to compare classes using the builtins isinstance and issubclass.
is checks exact identity, and only works if there is exactly one and only one of the list type. Fortunately, that's true for the list type (unless you've done something silly), so it usually works.
== tests standard equality, which is equivalent to is for most types, including list.
Taking those together, there is no effective difference between type(x) is list and type(x) == list, but the former construction better describes what's happening under the hood.
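The two checks diverge only in contrived situations, because == on classes can be customized through a metaclass while is cannot. A minimal sketch, assuming a hypothetical metaclass AlwaysEqualMeta that is made up purely for illustration:

```python
class AlwaysEqualMeta(type):
    """Hypothetical metaclass whose classes claim equality with anything."""

    def __eq__(cls, other):
        return True

    # Defining __eq__ removes the inherited hash, so restore it explicitly.
    __hash__ = type.__hash__


class Odd(metaclass=AlwaysEqualMeta):
    pass


x = Odd()
print(type(x) == list)  # True, because the metaclass overrides ==
print(type(x) is list)  # False, identity cannot be overridden
```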
Consider avoiding use of the type(x) is sometype construction in favor of the isinstance function instead, because it will work for inherited classes too:
x = [1, 2, 3]
isinstance(x, list)  # True
class Y(list):
    '''A class that inherits from list'''
    ...
y = Y([1, 2, 3])
isinstance(y, list)  # True
type(y) is list  # False
Better yet, if you really just want to see if something is list-like, then use isinstance with either typing or collections.abc like so:
import collections.abc
x = [1, 2, 3]
isinstance(x, collections.abc.Iterable) # True
isinstance(x, collections.abc.Sequence) # True
x = set([1, 2, 3])
isinstance(x, collections.abc.Iterable) # True
isinstance(x, collections.abc.Sequence) # False
Note that a list is both an Iterable (usable in a for loop) and a Sequence (ordered). A set is Iterable but not a Sequence, because you can use it in a for loop, but it isn't in any particular order. Use Iterable when you only need to loop over the object, or Sequence if it's important that the items be in a certain order.
Final note: The dict type is a Mapping, but also counts as Iterable since you can loop over its keys in a for loop.
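The same kind of check applied to a dict, as a quick sketch (the sample dictionary is arbitrary):

```python
from collections.abc import Iterable, Mapping, Sequence

d = {"a": 1, "b": 2}

print(isinstance(d, Mapping))   # True
print(isinstance(d, Iterable))  # True
print(isinstance(d, Sequence))  # False

# Iterating over a dict yields its keys.
print(list(d))                  # ['a', 'b']
```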

Where does the current documentation specify that [5] is true?

Background: I was going to answer this question, starting with something like "The documentation specifies that non-empty lists are true and [...]". But then I realized that it doesn't specify that anymore. At least not obviously, which it used to.
Up to Python 3.5, the documentation still said (emphasis mine):
4.1. Truth Value Testing
Any object can be tested for truth value, for use in an if or while
condition or as operand of the Boolean operations below. The following
values are considered false:
None
False
zero of any numeric type, for example, 0, 0.0, 0j.
any empty sequence, for example, '', (), [].
any empty mapping, for example, {}.
instances of user-defined classes, if the class defines a __bool__() or __len__() method, when that method returns the integer zero or bool
value False. [1]
All other values are considered true — so objects of many types are always true.
Operations and built-in functions that have a Boolean result always return 0 or False for false and 1 or True for true, unless otherwise stated. (Important exception: the Boolean operations or and and always return one of their operands.)
A non-empty list like [5] doesn't fall under anything in the above list, so the "All other" specifies that it's true.
But since Python 3.6, that is gone. That section now says:
Truth Value Testing
Any object can be tested for truth value, for use in an if or while
condition or as operand of the Boolean operations below.
By default, an object is considered true unless its class defines
either a __bool__() method that returns False or a __len__() method
that returns zero, when called with the object. [1] Here are most of the
built-in objects considered false:
constants defined to be false: None and False.
zero of any numeric type: 0, 0.0, 0j, Decimal(0), Fraction(0, 1)
empty sequences and collections: '', (), [], {}, set(), range(0)
Operations and built-in functions that have a Boolean result always return 0 or False for false and 1 or True for true, unless otherwise stated. (Important exception: the Boolean operations or and and always return one of their operands.)
Now [5] could have a __bool__() method that returns False, and thus it would be false. Is there a new place in the current documentation that somehow specifies that non-empty lists are true?
The documentation of all the built-in classes lists all the special methods that they implement. If a method isn't listed, you can assume it isn't implemented.
Since the documentation of list doesn't say anything about overriding the __bool__ method, it inherits the default behavior.
To find all the list operations, start here. However, it also points out that lists implement all the common and mutable sequence operations, so you'll need to read that documentation for the complete list.
Just found a place, in the reference (emphasis mine):
6.11. Boolean operations
[...]
In the context of Boolean operations, and also when expressions are
used by control flow statements, the following values are interpreted
as false: False, None, numeric zero of all types, and empty strings
and containers (including strings, tuples, lists, dictionaries, sets
and frozensets). All other values are interpreted as true.
User-defined objects can customize their truth value by providing a
__bool__() method.
It bothers me a little that that's right away contradicted by the very next sentence, about user-defined objects, but I'll take it.
[5] is a list object. Unless you have specifically overridden the built-in __bool__ method, you get the default method. As the documentation already implied, this is Truthy.
The update doesn't change things so much as widen the explanation to cover derived types and other augmentations of the built-in types.
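This can be checked directly. The sketch below relies on CPython's built-in list, which defines __len__ but not __bool__; the NeverTrue subclass is hypothetical and only shows how a derived type could change the result.

```python
# The built-in list relies on __len__ for its truth value.
print(hasattr(list, "__bool__"))  # False
print(hasattr(list, "__len__"))   # True
print(bool([5]), bool([]))        # True False


class NeverTrue(list):
    """Hypothetical subclass that overrides truth testing."""

    def __bool__(self):
        return False


weird = NeverTrue([5])
print(len(weird), bool(weird))    # 1 False
```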

In python, is there some kind of mapping to return the "False value" of a type?

I am looking for some kind of a mapping function f() that does something similar to this:
f(str) = ''
f(complex) = 0j
f(list) = []
Meaning that it returns an object of type that evaluates to False when cast to bool.
Does such a function exist?
No, there is no such mapping. Not every type of object has a falsy value, and others have more than one. Since the truth value of a class can be customized with the __bool__ method, a class could theoretically have an infinite number of (different) falsy instances.
That said, most builtin types return their falsy value when their constructor is called without arguments:
>>> str()
''
>>> complex()
0j
>>> list()
[]
Nope, and in general, there may be no such value. The Python data model is pretty loose about how the truth-value of a type may be implemented:
object.__bool__(self)
Called to implement truth value testing and the built-in operation
bool(); should return False or True. When this method is not defined,
__len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__()
nor __bool__(), all its instances are considered true.
So consider:
import random
class Wacky:
    def __bool__(self):
        return bool(random.randint(0,1))
What should f(Wacky) return?
This is actually called an identity element, and in programming is most often seen as part of the definition of a monoid. In python, you can get it for a type using the mzero function in the PyMonad package. Haskell calls it mempty.
Not all types have such a value to begin with. Others may have many such values. The most correct way of doing this would be to create a type-to-value dict, because then you could check if a given type was in the dict at all, and you could choose which value is the correct one if there are multiple options. The drawback is of course that you would have to somehow register every type you were interested in.
Alternatively, you could write a function using some heuristics. If you were very careful about what you passed into the function, it would probably be of some limited use. For example, all the cases you show except complex are containers that generalize with cls().
complex actually works like that too, as do int and float (int() is 0 and float() is 0.0), but I mention it separately because it is not a container. If your attempt with the empty constructor fails by returning a truthy object or raising a TypeError, you can try cls(0). And so on and so forth...
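Both ideas can be combined in a rough sketch. Everything here is an assumption made for illustration: the FALSY_REGISTRY contents, the falsy_value name, and the choice to try only cls() and cls(0) before giving up.

```python
# Explicitly registered answers take priority over guessing.
FALSY_REGISTRY = {str: "", complex: 0j}


def falsy_value(cls):
    """Return a falsy instance of cls, or raise LookupError if none is found."""
    if cls in FALSY_REGISTRY:
        return FALSY_REGISTRY[cls]
    # Heuristic: try the no-argument constructor, then cls(0).
    for args in ((), (0,)):
        try:
            candidate = cls(*args)
        except (TypeError, ValueError):
            continue
        if not candidate:
            return candidate
    raise LookupError(f"no falsy value found for {cls.__name__}")


print(repr(falsy_value(str)))   # ''
print(falsy_value(list))        # []
print(falsy_value(range))       # range(0, 0)

try:
    falsy_value(object)
except LookupError as exc:
    print(exc)                  # no falsy value found for object
```

The heuristic is deliberately conservative: it gives up with an error rather than returning a truthy object.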
Update
@juanpa.arrivillaga's answer actually suggests a clever workaround that will work for most classes. You can extend the class and forcibly create an instance that will be falsy but otherwise identical to the original class. You have to do this by extending because dunder methods like __bool__ are only ever looked up on the class, never on an instance. There are also many types where such methods can not be replaced on the instance to begin with. As @Aran-Fey's now-deleted comment points out, you can selectively call object.__new__ or t.__new__, depending on whether you are dealing with a very special case (like int) or not:
def f(t):
    class tx(t):
        def __bool__(self):
            return False
    try:
        return object.__new__(tx)
    except TypeError:
        return tx.__new__(tx)
This will only work for 99.9% of classes you ever encounter. It is possible to create a contrived case that raises a TypeError when passed to object.__new__ as int does, and does not allow for a no-arg version of t.__new__, but I doubt you will ever find such a thing in nature. See the gist @Aran-Fey made to demonstrate this.
No such function exists because it's not possible in general. A class may have no falsy value or it may require reversing an arbitrarily complex implementation of __bool__.
What you could do by breaking everything else is to construct a new object of that class and forcibly assign its __bool__ function to one that returns False. Though I suspect that you are looking for an object that would otherwise be a valid member of the class.
In any case, this is a Very Bad Idea in classic style.

How does Python establish equality between objects?

I tracked down an error in my program to a line where I was testing for the existence of an object in a list of objects. The line always returned False, which meant that the object wasn't in the list. In fact, it kept happening, even when I did the following:
class myObject(object):
    __slots__=('mySlot')
    def __init__(self,myArgument):
        self.mySlot=myArgument
print(myObject(0)==myObject(0)) # prints False
a=0
print(myObject(a)==myObject(a)) # prints False
a=myObject(a)
print(a==a) # prints True
I've used deepcopy before, but I'm not experienced enough with Python to know when it is and isn't necessary, or mechanically what the difference is. I've also heard of pickling, but never used it. Can someone explain to me what's going on here?
Oh, and another thing. The line
if x in myIterable:
probably tests equality between x and each element in myIterable, right? So if I can change the perceived equality between two objects, I can modify the output of that line? Is there a built-in for that and all of the other inline operators?
The == operator passes the second operand to the __eq__() method of the first.
As for in: not quite. It passes the first operand to the __contains__() method of the second, or iterates over the second performing equality comparisons if no such method exists.
Perhaps you meant to use is, which compares identity instead of equality.
The line myObject(0)==myObject(0) in your code is creating two different instances of myObject, and since you haven't defined __eq__ they are being compared for identity (i.e. memory location).
x.__eq__(y) <==> x==y and your line about if x in myIterable: using "comparing equal" for the in keyword is correct unless the iterable defines __contains__.
print(myObject(0)==myObject(0)) # prints False
#because id(left_hand_side_myObject) != id(right_hand_side_myObject)
a=0
print(myObject(a)==myObject(a)) # prints False
#again because id(left_hand_side_myObject) != id(right_hand_side_myObject)
a=myObject(a)
print(a==a) # prints True
#because id(a) == id(a)
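How in chooses between __contains__ and plain iteration can be sketched with two small classes. Both are hypothetical and exist only for illustration:

```python
class OnlyIter:
    """No __contains__: `in` falls back to iterating and comparing items."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)


class WithContains(OnlyIter):
    """Defines __contains__, so `in` never looks at the iteration order."""

    def __contains__(self, item):
        return False  # deliberately ignores the stored items


print(2 in OnlyIter([1, 2, 3]))      # True
print(2 in WithContains([1, 2, 3]))  # False
```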
myObject(a) == myObject(a) evaluates to False because you are creating two separate instances of myObject (with the same attribute value a). So the two objects have the same attributes, but they are different instances of your class, and without an __eq__ method they are not considered equal.
If you want to check whether an object is in a list, then yeah,
if x in myIterable
would probably be the easiest way to do that.
If you want to check whether an object has the exact same attributes as another object in a list, maybe try something like this:
x = myObject(a)
for y in myIterable:
    if x.mySlot == y.mySlot:
        print("Exists")
        break
Or, you could define __eq__(self, other) in your class definition to set the conditions for equality.
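A minimal sketch of the __eq__ approach, reusing the myObject class from the question. Comparing on mySlot and adding a matching __hash__ are choices made here for illustration, not something the question specifies.

```python
class myObject(object):
    __slots__ = ('mySlot',)

    def __init__(self, myArgument):
        self.mySlot = myArgument

    def __eq__(self, other):
        # Two myObject instances are equal when their slots hold equal values.
        if not isinstance(other, myObject):
            return NotImplemented
        return self.mySlot == other.mySlot

    def __hash__(self):
        # Defining __eq__ disables the inherited hash; keep it consistent.
        return hash(self.mySlot)


print(myObject(0) == myObject(0))                 # True
print(myObject(0) == myObject(1))                 # False
print(myObject(0) in [myObject(1), myObject(0)])  # True
```

With __eq__ in place, the in test on a list finds an equal element even though it is a different instance.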

Should these expressions evaluate differently?

I was somewhat confused until I found the bug in my code. I had to change
a.matched_images.count #True when variable is 0
to
a.matched_images.count > 0 #False when variable is 0
Since I quickly wanted to know whether an object had any images, the first code makes it look like the photo has images, because the expression evaluates to True when the real meaning is false ("no images" / 0 images).
Did I understand this correctly? Can you please answer or comment on whether these expressions should evaluate to different values?
What is the nature of count? If it's a basic Python number, then if count is the same as if count != 0. On the other hand, if count is a custom class then it needs to implement either __nonzero__ or __len__ for Python 2.x, or __bool__ or __len__ for Python 3.x. If those methods are not defined, then every instance of that class is considered True.
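One way this can happen, and it is only a guess since the question does not show what count is, is that count is a method rather than a number. The Album class below is hypothetical:

```python
class Album:
    """Hypothetical container where `count` is a method, not a number."""

    def __init__(self, images):
        self._images = list(images)

    def count(self):
        return len(self._images)


empty = Album([])

# The bound method object defines neither __bool__ nor __len__,
# so it is always considered true, whatever it would return.
print(bool(empty.count))    # True

# Calling the method gives the number, which is tested as expected.
print(bool(empty.count()))  # False
print(empty.count() > 0)    # False
```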
Without knowing what count is, it's hard to answer, but this excerpt may be of use to you:
The following values are considered false:
None
False
zero of any numeric type, for example, 0, 0L, 0.0, 0j.
any empty sequence, for example, '', (), [].
any empty mapping, for example, {}.
instances of user-defined classes, if the class defines a
__nonzero__() or __len__() method, when that method returns the
integer zero or bool value False. [1]
All other values are considered true — so objects of many types are
always true.
>>> bool(0)
False
So, no: if it were an int, the two expressions would agree. Please do some tracing and print out what count actually is.
