Query on Python control flow statements and lambda expressions - python

In Python, we say everything is an object.
For instance, the expression x < y internally calls x.__lt__(y), where __lt__ is a method on the class of x ('int', say, if the values are 2 and 3), and x and y are references to objects.
Likewise, calling a user-defined function square(3) internally calls square.__call__(3), where square is a name bound to an object of class 'function'.
So:
How are if-elif-else, for, break, continue, pass, and lambda interpreted internally? As objects?
If yes, can you give some examples to visualise them in the same manner as above?

Statements are not represented as objects in Python (including the examples you give). Rather, the way things work is that language syntax often maps to "hooks" that you can define to influence what the syntax does in a particular situation. For instance, the < syntax maps to the __lt__ hook. It doesn't mean that the < symbol "is an object"; it just means that you can define methods on your objects to customize how they work with that syntax.
Some of the syntax you ask about can be influenced in a similar way via magic methods.
if checks whether its condition is boolean true. That is, if x: internally calls bool(x) to determine whether x "counts" as True or False. You can influence this decision by defining a __bool__ method (__nonzero__ in Python 2). elif works the same way. (You cannot change what else does.) Also, of course, a lot of if conditions involve comparisons, and you can customize those comparisons in the way you already mentioned. For instance, if you want to customize how if myObject < 2 will work, you can write a __lt__ magic method.
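For instance, here is a minimal sketch (the Fridge class is hypothetical, not from the question) of steering if through __bool__:

class Fridge:
    def __init__(self, items):
        self.items = items

    def __bool__(self):              # Python 2 would look for __nonzero__ instead
        return len(self.items) > 0

fridge = Fridge([])
if fridge:                           # internally calls bool(fridge) -> Fridge.__bool__()
    print("something to eat")
else:
    print("empty")                   # this branch runs: an empty Fridge is falsy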
for engages the iterator protocol. You can make your own objects iterable by defining the __iter__ and/or __next__ methods (next in Python 2). You can't influence break and continue exactly, but you can write iterators that "fake" continues by skipping elements, or "fake" breaks by raising StopIteration.
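As a minimal sketch (the Countdown class is hypothetical), here is an iterator that a for loop drives through those hooks, ending itself with StopIteration:

class Countdown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):              # for calls iter(obj), which calls this
        return self

    def __next__(self):              # each loop step calls next(obj); Python 2 looks for next()
        if self.current <= 0:
            raise StopIteration      # this is what actually ends the loop
        value = self.current
        self.current -= 1
        return value

for n in Countdown(3):
    print(n)                         # prints 3, 2, 1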
lambda is just another syntax for defining a function. You can't change what lambda does, and it's not clear what the point would be anyway.
pass is a do-nothing statement. You can't change what it does, and there really would be no point: it exists specifically to do nothing.

Related

Where is the default behavior for object equality (`==`) defined?

According to the object.__eq__() documentation, the default (that is, in the object class) implementation for == is as follows:
True if x is y else NotImplemented
Still following the documentation for NotImplemented, I inferred that NotImplemented implies that the Python runtime will try the comparison the other way around. That is try y.__eq__(x) if x.__eq__(y) returns NotImplemented (in the case of the == operator).
Now, the following code prints False and True in python 3.9:
class A:
    pass

print(A() == A())
print(bool(NotImplemented))
So my question is the following: where does the documentation mention the special behavior of NotImplemented in the context of __eq__ ?
PS : I found an answer in CPython source code but I guess that this must/should be somewhere in the documentation.
According to the object.__eq__() documentation, the default (that is, in the object class) implementation for == is as follows
No; that is the default implementation of __eq__. ==, being an operator, cannot be implemented in classes.
Python's implementation of operators is cooperative. There is hard-coded logic that uses the dunder methods to figure out what should happen, and possibly falls back on a default. This logic is outside of any class.
You can see another example with the built-in len: a class can return whatever it likes from its __len__ method, and you can in principle call that method directly and get a value of any type. However, this does not properly implement the protocol, and len will complain when it doesn't get back a non-negative integer. No class contains that type-checking and value-checking logic; it is external.
Still following the documentation for NotImplemented, I inferred that NotImplemented implies that the Python runtime will try the comparison the other way around. That is try y.__eq__(x) if x.__eq__(y) returns NotImplemented (in the case of the == operator).
NotImplemented is just an object. It is not syntax. It does not have any special behavior, and in Python, simply returning a value does not trigger special behavior besides that the value is returned.
The external code for binary operators will try to look for the matching __op__, and try to look for the matching __rop__ on the other operand if __op__ didn't work. At this point, NotImplemented is not an acceptable answer (it is a sentinel that exists specifically for this purpose, because None is an acceptable answer). In general, if the answer so far is still NotImplemented, the external code raises TypeError (not NotImplementedError, despite the similar name).
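As a rough sketch only (this is not the actual CPython code; it ignores the subclass-priority rule discussed later and assumes both dunders exist), the external logic for x + y behaves something like:

def add(x, y):
    result = type(x).__add__(x, y)           # try the left operand first
    if result is NotImplemented:
        result = type(y).__radd__(y, x)      # then the reflected method
    if result is NotImplemented:             # both sides punted: give up
        raise TypeError("unsupported operand type(s) for +: '%s' and '%s'"
                        % (type(x).__name__, type(y).__name__))
    return result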
As a special case, objects that don't provide their own comparison (i.e., the default from object is used for __eq__ or __ne__) will compare as "not equal" unless they are identical. The C implementation repeats the identity check (in case a class explicitly defines __eq__ or __ne__ to return NotImplemented directly, I guess). This is because it is considered sensible to give this result, and obnoxious to make == fail all the time when there is a sensible default.
However, the two objects are still not orderable without explicit logic, since there isn't a reasonable default. (You could compare the pointer values, but they're arbitrary and don't have anything to do with the Python logic that got you to that point; so ordering things that way isn't realistically useful for writing Python code.) So, for example, x < y will raise a TypeError if the comparison logic isn't provided. (It does this even if x is y; you could reasonably say that <= and >= should be true in this case, and < and > should be false, but it makes things too complicated and is not very useful.)
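A quick demonstration of both defaults (using a hypothetical empty class A):

class A:
    pass

x, y = A(), A()
print(x == y)    # False: default __eq__ falls back to identity
print(x == x)    # True: identical objects compare equal
x < y            # raises TypeError: '<' not supported between instances of 'A' and 'A'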
[Observation: print(bool(NotImplemented)) prints True]
Well, yes; NotImplemented is an object, so it's truthy by default; and it doesn't represent a numeric value, and isn't a container, so there's no reason for it to be falsy.
However, that also doesn't tell us anything useful. We don't care about the truthiness of NotImplemented here, and it isn't used that way in the Python implementation. It is just a sentinel value.
where does the documentation mention the special behavior of NotImplemented in the context of __eq__ ?
Nowhere, because it isn't a behavior of NotImplemented, as explained above.
Okay, but that leaves underlying question: where does the documentation explain what the == operator does by default?
Answer: because we are talking about an operator, and not about a method, it's not in the section about dunder methods. It's in section 6, which talks about expressions. Specifically, 6.10.1. Value comparisons:
The default behavior for equality comparison (== and !=) is based on the identity of the objects. Hence, equality comparison of instances with the same identity results in equality, and equality comparison of instances with different identities results in inequality. A motivation for this default behavior is the desire that all objects should be reflexive (i.e. x is y implies x == y).

Why and when do literal comparison operators like `==` in Python use the magic method of a custom type over a builtin?

The docs.python.org page on the Python "Data Model" states that when both sides in a literal comparison operation implement magic methods for the operation, the method of the left operand gets used with the right operand as its argument:
x<y calls x.__lt__(y), x<=y calls x.__le__(y), x==y calls x.__eq__(y), x!=y calls x.__ne__(y), x>y calls x.__gt__(y), and x>=y calls x.__ge__(y).
The following class wraps the builtin tuple and implements a magic method for one of these comparison operators to demonstrate this:
class eqtest(tuple):
    def __eq__(self, other):
        print('Equivalence!')
When using instances of this class on the left side of a comparison operator, it behaves as expected:
>>> eqtest((1,2,3)) == (1,2,3)
Equivalence!
However, the comparison operator of the custom class seems to get called even when only using its instance on the right:
>>> (1,2,3) == eqtest((1,2,3))
Equivalence!
The result is also demonstrably different when the magic method of the left operand is explicitly called:
>>> (1,2,3).__eq__(eqtest((1,2,3)))
True
It's easy to understand why this might be a deliberate design choice, especially with subclasses, in order to return the result most likely to be useful from the type that was defined later. However, since it deviates quite explicitly from the basic documented behaviour, it's quite hard to know how and why it works this way confidently enough to account for and use it in production.
In what cases do the Python language and the CPython reference implementation reverse the order of comparison operators even if both sides provide valid results, and where is this documented?
The rules on comparisons state that tuples don't know how to compare to other types: the tuple rich-comparison implementation in CPython does Py_RETURN_NOTIMPLEMENTED for them. However, do_richcompare (the helper behind PyObject_RichCompare) checks for subtypes, such as your inherited class, and swaps the comparison order (applying the symmetry rule).
This is documented in the page you linked as well:
If the operands are of different types, and right operand’s type is a direct or indirect subclass of the left operand’s type, the reflected method of the right operand has priority, otherwise the left operand’s method has priority. Virtual subclassing is not considered.
This enables subclasses to implement more specific behaviours that work with the comparison written either way.
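A minimal sketch of that rule in action (Base and Sub are hypothetical classes; both punt with NotImplemented so you can watch the dispatch order):

class Base:
    def __eq__(self, other):
        print('Base.__eq__')
        return NotImplemented

class Sub(Base):
    def __eq__(self, other):
        print('Sub.__eq__')
        return NotImplemented

print(Base() == Sub())
# Sub.__eq__     <- the right operand is a subclass, so its method goes first
# Base.__eq__    <- then the left operand gets its turn
# False          <- both returned NotImplemented, so == falls back to identity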

In python, is there some kind of mapping to return the "False value" of a type?

I am looking for some kind of a mapping function f() that does something similar to this:
f(str) = ''
f(complex) = 0j
f(list) = []
Meaning that it returns an object of the given type that evaluates to False when converted to bool.
Does such a function exist?
No, there is no such mapping. Not every type of object has a falsy value, and others have more than one. Since the truth value of a class can be customized with the __bool__ method, a class could theoretically have an infinite number of (different) falsy instances.
That said, most builtin types return their falsy value when their constructor is called without arguments:
>>> str()
''
>>> complex()
0j
>>> list()
[]
Nope, and in general, there may be no such value. The Python data model is pretty loose about how the truth-value of a type may be implemented:
object.__bool__(self)
Called to implement truth value testing and the built-in operation bool(); should return False or True. When this method is not defined, __len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__() nor __bool__(), all its instances are considered true.
So consider:
import random

class Wacky:
    def __bool__(self):
        return bool(random.randint(0, 1))
What should f(Wacky) return?
This is actually called an identity element, and in programming it is most often seen as part of the definition of a monoid. In Python, you can get it for a type using the mzero function in the PyMonad package. Haskell calls it mempty.
Not all types have such a value to begin with. Others may have many such values. The most correct way of doing this would be to create a type-to-value dict, because then you could check if a given type was in the dict at all, and you could chose which value is the correct one if there are multiple options. The drawback is of course that you would have to somehow register every type you were interested in.
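For instance, a minimal sketch of such a registry (the values chosen here are judgment calls, and the mutable defaults are shared, so treat them as read-only):

FALSY_VALUES = {
    str: '',
    bytes: b'',
    int: 0,
    float: 0.0,
    complex: 0j,
    list: [],
    tuple: (),
    dict: {},
    set: set(),
}

def f(cls):
    try:
        return FALSY_VALUES[cls]
    except KeyError:
        raise TypeError('no falsy value registered for %s' % cls.__name__)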
Alternatively, you could write a function using some heuristics. If you were careful about what you passed into the function, it would probably be of some limited use. For example, every case you show is a type whose no-argument constructor returns a falsy instance; that works for the built-in containers, and complex(), int() and float() happen to behave the same way even though they are not containers. So if the attempt with the empty constructor fails by returning a truthy object or raising a TypeError, you can try cls(0). And so on and so forth...
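That heuristic might look something like this sketch (falsy_instance is a hypothetical name, not a standard function):

def falsy_instance(cls):
    for args in ((), (0,)):          # try cls() first, then cls(0)
        try:
            obj = cls(*args)
        except TypeError:
            continue
        if not obj:
            return obj
    raise TypeError('no obvious falsy value for %s' % cls.__name__)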
Update
@juanpa.arrivillaga's answer actually suggests a clever workaround that will work for most classes. You can extend the class and forcibly create an instance that will be falsy but otherwise identical to the original class. You have to do this by extending because dunder methods like __bool__ are only ever looked up on the class, never on the instance. There are also many types whose instances do not allow such methods to be replaced at all. As @Aran-Fey's now-deleted comment points out, you can selectively call object.__new__ or t.__new__, depending on whether you are dealing with a very special case (like int) or not:
def f(t):
    class tx(t):
        def __bool__(self):
            return False
    try:
        return object.__new__(tx)
    except TypeError:
        return tx.__new__(tx)
This will only work for 99.9% of classes you ever encounter. It is possible to create a contrived case that raises a TypeError when passed to object.__new__ (as int does) and does not allow for a no-arg version of t.__new__, but I doubt you will ever find such a thing in nature. See the gist @Aran-Fey made to demonstrate this.
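For instance, assuming the f defined above, you would expect a session roughly like:

>>> falsy_int = f(int)       # int refuses object.__new__, so tx.__new__ is used
>>> bool(falsy_int)
False
>>> isinstance(falsy_int, int)
True
>>> falsy_int + 1            # otherwise it behaves like the int 0
1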
No such function exists because it's not possible in general. A class may have no falsy value, or it may require reversing an arbitrarily complex implementation of __bool__.
What you could do, at the cost of breaking everything else, is to construct a new object of that class and forcibly replace its __bool__ with a function that returns False. Though I suspect that you are looking for an object that would otherwise be a valid member of the class.
In any case, this is a classic Very Bad Idea.

Why does Python use 'magic methods'?

I'm a bit surprised by Python's extensive use of 'magic methods'.
For example, in order for a class to declare that instances have a "length", it implements a __len__ method, which is called when you write len(obj). Why not just define a len method which is called directly as a member of the object, e.g. obj.len()?
See also: Why does Python code use len() function instead of a length method?
AFAIK, len is special in this respect and has historical roots.
Here's a quote from the FAQ:
Why does Python use methods for some functionality (e.g. list.index()) but functions for other (e.g. len(list))?
The major reason is history. Functions were used for those operations that were generic for a group of types and which were intended to work even for objects that didn't have methods at all (e.g. tuples). It is also convenient to have a function that can readily be applied to an amorphous collection of objects when you use the functional features of Python (map(), apply() et al).
In fact, implementing len(), max(), min() as a built-in function is actually less code than implementing them as methods for each type. One can quibble about individual cases but it's a part of Python, and it's too late to make such fundamental changes now. The functions have to remain to avoid massive code breakage.
The other "magical methods" (actually called special method in the Python folklore) make lots of sense, and similar functionality exists in other languages. They're mostly used for code that gets called implicitly when special syntax is used.
For example:
overloaded operators (exist in C++ and others)
constructor/destructor
hooks for accessing attributes
tools for metaprogramming
and so on...
From the Zen of Python:
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
This is one of the reasons - with custom methods, developers would be free to choose a different method name, like getLength(), length(), getlength() or whatever. Python enforces strict naming so that the common function len() can be used.
All operations that are common for many types of objects are put into magic methods, like __nonzero__, __len__ or __repr__. They are mostly optional, though.
Operator overloading is also done with magic methods (e.g. __le__), so it makes sense to use them for other common operations, too.
Python uses the term "magic methods" because those methods really perform magic for your program. One of the biggest advantages of using Python's magic methods is that they provide a simple way to make objects behave like built-in types. That means you can avoid ugly, counter-intuitive, and nonstandard ways of performing basic operations.
Consider the following example:

dict1 = {1: "ABC"}
dict2 = {2: "EFG"}
dict1 + dict2

Traceback (most recent call last):
  File "python", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict' and 'dict'
This gives an error, because the dictionary type doesn't support addition. Now, let's extend the dictionary class and add the "__add__" magic method:
class AddableDict(dict):
    def __add__(self, otherObj):
        self.update(otherObj)        # note: updates self in place as a side effect
        return AddableDict(self)

dict1 = AddableDict({1: "ABC"})
dict2 = AddableDict({2: "EFG"})
print(dict1 + dict2)
Now it gives the following output:
{1: 'ABC', 2: 'EFG'}
Thus, by adding this method, the magic suddenly happens and the error you were getting earlier goes away.
I hope, it makes things clear to you. For more information, refer to:
A Guide to Python's Magic Methods (Rafe Kettler, 2012)
Some of these functions do more than a single method would be able to implement (without abstract methods on a superclass). For instance bool() acts kind of like this:
def bool(obj):
    # Python 2 names; Python 3 looks for __bool__ instead of __nonzero__
    if hasattr(obj, '__nonzero__'):
        return bool(obj.__nonzero__())
    elif hasattr(obj, '__len__'):
        if obj.__len__():
            return True
        else:
            return False
    return True
You can also be 100% sure that bool() will always return True or False; if you relied on a method you couldn't be entirely sure what you'd get back.
Some other functions that have relatively complicated implementations (more complicated than the underlying magic methods are likely to be) are iter() and cmp(), and all the attribute methods (getattr, setattr and delattr). Things like int also access magic methods when doing coercion (you can implement __int__), but do double duty as types. len(obj) is actually the one case where I don't believe it's ever different from obj.__len__().
They are not really "magic names". It's just the interface an object has to implement to provide a given service. In this sense, they are not more magic than any predefined interface definition you have to reimplement.
While the reason is mostly historic, there are some peculiarities in Python's len that make the use of a function instead of a method appropriate.
Some operations in Python are implemented as methods, for example list.index and list.append, while others are implemented as callables and magic methods, for example str and iter and reversed. The two groups differ enough that the different approach is justified:
They are common.
str, int and friends are types. It makes more sense to call the constructor.
The implementation differs from the function call. For example, iter might call __getitem__ if __iter__ isn't available, and supports additional arguments that don't fit in a method call. For the same reason it.next() has been changed to next(it) in recent versions of Python - it makes more sense.
Some of these are close relatives of operators. There's syntax for calling __iter__ and __next__ - it's called the for loop. For consistency, a function is better. And it makes it better for certain optimisations.
Some of the functions are simply way too similar to the rest in some way - repr acts like str does. Having str(x) versus x.repr() would be confusing.
Some of them rarely use the actual implementation method, for example isinstance.
Some of them are actual operators, getattr(x, 'a') is another way of doing x.a and getattr shares many of the aforementioned qualities.
I personally call the first group method-like and the second group operator-like. It's not a very good distinction, but I hope it helps somehow.
Having said this, len doesn't exactly fit in the second group. It's closer to the operations in the first one, with the only difference that it's far more common than almost any of them. But the only thing it does is call __len__, and it's very close to list.index. However, there are some differences. For example, __len__ might be called for the implementation of other features, such as bool; if the method were called len, you might break bool(x) with a custom len method that does something completely different.
In short, you have a set of very common features that classes might implement that might be accessed through an operator, through a special function (that usually does more than the implementation, as an operator would), during object construction, and all of them share some common traits. All the rest is a method. And len is somewhat of an exception to that rule.
There is not a lot to add to the above two posts, but the "magic" functions are not really magic at all. The built-in functions that invoke them (len, iter and so on) live in the builtins module (__builtin__ in Python 2), which is made available automatically when the interpreter starts; roughly as if
from builtins import *
happened every time before your program starts.
I always thought it would be more correct if Python only did this for the interactive shell, and required scripts to import the various parts from builtins that they needed. Also, different __main__ handling would probably be nice in shells vs. scripts. Anyway, check out all the functions, and see what it is like without them:
dir(__builtins__)
...
del __builtins__
Perhaps you have noticed that it is possible to use certain built-in functions (e.g. len(my_list_or_my_string)) and syntaxes (e.g. my_list_or_my_string[:3], my_fancy_dict['some_key']) on some native types such as list and dict. Maybe you have been curious as to why it is not possible (yet) to use these same syntaxes on some of the classes you have written.
Variables of native types (list, dict, int, str) have unique behaviours and respond to certain syntaxes because they have some special methods defined in their respective classes — these methods are called Magic Methods.
A few magic methods include: __len__, __gt__, __eq__, etc.
Read more here: https://tomisin.dev/blog/supercharging-python-classes-with-magic-methods

The advantages of having static function like len(), max(), and min() over inherited method calls

I am a Python newbie, and I am not sure why Python implements len(obj), max(obj), and min(obj) as static-like functions (I am coming from Java) rather than obj.len(), obj.max(), and obj.min().
What are the advantages and disadvantages (other than the obvious inconsistency) of having len() etc. over the method calls?
Why did Guido choose this over method calls? (This could have been solved in Python 3 if needed, but it wasn't changed in Python 3, so there must be good reasons... I hope.)
Thanks!
The big advantage is that built-in functions (and operators) can apply extra logic when appropriate, beyond simply calling the special methods. For example, min can look at several arguments and apply the appropriate inequality checks, or it can accept a single iterable argument and proceed similarly; abs, when called on an object without a special method __abs__, could try comparing said object with 0 and using the object's change-sign method if needed (though it currently doesn't); and so forth.
So, for consistency, all operations with wide applicability must always go through built-ins and/or operators, and it's those built-ins responsibility to look up and apply the appropriate special methods (on one or more of the arguments), use alternate logic where applicable, and so forth.
An example where this principle wasn't correctly applied (but the inconsistency was fixed in Python 3) is "step an iterator forward": in 2.5 and earlier, you needed to define and call the non-specially-named next method on the iterator. In 2.6 and later you can do it the right way: the new next built-in calls the iterator's hook method (named next in 2.6, __next__ in 3.*) and can apply extra logic, for example to supply a default value (in 2.6 you can still do it the bad old way, for backwards compatibility, though in 3.* you can't any more).
Another example: consider the expression x + y. In a traditional object-oriented language (able to dispatch only on the type of the leftmost argument -- like Python, Ruby, Java, C++, C#, &c) if x is of some built-in type and y is of your own fancy new type, you're sadly out of luck if the language insists on delegating all the logic to the method of type(x) that implements addition (assuming the language allows operator overloading;-).
In Python, the + operator (and similarly of course the builtin operator.add, if that's what you prefer) tries x's type's __add__, and if that one doesn't know what to do with y, then tries y's type's __radd__. So you can define your types that know how to add themselves to integers, floats, complex, etc etc, as well as ones that know how to add such built-in numeric types to themselves (i.e., you can code it so that x + y and y + x both work fine, when y is an instance of your fancy new type and x is an instance of some builtin numeric type).
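A minimal sketch of that cooperation (Money is a hypothetical type, not from the answer):

class Money:
    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):        # handles Money + number
        if isinstance(other, (int, float)):
            return Money(self.amount + other)
        return NotImplemented        # let the other operand have a go

    __radd__ = __add__               # handles number + Money the same way

    def __repr__(self):
        return 'Money(%s)' % self.amount

print(Money(10) + 5)   # Money(15): uses Money.__add__
print(5 + Money(10))   # Money(15): int.__add__ punts, then Money.__radd__ runs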
"Generic functions" (as in PEAK) are a more elegant approach (allowing any overriding based on a combination of types, never with the crazy monomaniac focus on the leftmost arguments that OOP encourages!-), but (a) they were unfortunately not accepted for Python 3, and (b) they do of course require the generic function to be expressed as free-standing (it would be absolutely crazy to have to consider the function as "belonging" to any single type, where the whole POINT is that can be differently overridden/overloaded based on arbitrary combination of its several arguments' types!-). Anybody who's ever programmed in Common Lisp, Dylan, or PEAK, knows what I'm talking about;-).
So, free-standing functions and operators are just THE right, consistent way to go (even though the lack of generic functions, in bare-bones Python, does remove some fraction of the inherent elegance, it's still a reasonable mix of elegance and practicality!-).
It emphasizes the capabilities of an object, not its methods or type. Capabilities are declared by "helper" functions such as __iter__ and __len__, but they don't make up the interface. The interface is in the built-in functions, and besides this also in the built-in operators like + and [] for indexing and slicing.
Sometimes, it is not a one-to-one correspondance: For example, iter(obj) returns an iterator for an object, and will work even if __iter__ is not defined. If not defined, it goes on to look if the object defines __getitem__ and will return an iterator accessing the object index-wise (like an array).
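A small sketch of that fallback (Squares is a hypothetical class): it defines only __getitem__, yet iter() and the for loop still work with it:

class Squares:
    def __getitem__(self, index):
        if index >= 5:
            raise IndexError         # signals the end, the way StopIteration does
        return index * index

for sq in Squares():                 # no __iter__ anywhere, yet this iterates
    print(sq)                        # prints 0, 1, 4, 9, 16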
This goes together with Python's Duck Typing, we care only about what we can do with an object, not that it is of a particular type.
Actually, those aren't "static" methods in the way you are thinking about them. They are built-in functions that really just alias to certain methods on python objects that implement them.
>>> class Foo(object):
...     def __len__(self):
...         return 42
...
>>> f = Foo()
>>> len(f)
42
The built-in name is always available to call, whether or not the object implements the corresponding method (you simply get a TypeError if it doesn't). The point is to have some consistency. Instead of one class having a method called length() and another called size(), the convention is to implement __len__ and let callers always use the more readable len(obj) instead of obj.methodThatDoesSomethingCommon.
I thought the reason was so these basic operations could be done on iterators with the same interface as containers. However, it actually doesn't work with len:
def foo():
    for i in range(10):
        yield i

print len(foo())
... fails with TypeError. len() won't consume and count an iterator; it only works with objects that have a __len__ method.
So, as far as I'm concerned, len() shouldn't exist. It's much more natural to say obj.len than len(obj), and much more consistent with the rest of the language and the standard library. We don't say append(lst, 1); we say lst.append(1). Having a separate global method for length is an odd, inconsistent special case, and eats a very obvious name in the global namespace, which is a very bad habit of Python.
This is unrelated to duck typing; you can say getattr(obj, "len") to decide whether you can use len on an object just as easily--and much more consistently--than you can use getattr(obj, "__len__").
All that said, as language warts go--for those who consider this a wart--this is a very easy one to live with.
On the other hand, min and max do work on iterators, which gives them a use apart from any particular object. This is straightforward, so I'll just give an example:
import random

def foo():
    for i in range(10):
        yield random.randint(0, 100)

print max(foo())
However, there are no __min__ or __max__ methods to override its behavior, so there's no consistent way to provide efficient searching for sorted containers. If a container is sorted on the same key that you're searching, min/max are O(1) operations instead of O(n), and the only way to expose that is by a different, inconsistent method. (This could be fixed in the language relatively easily, of course.)
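A sketch of that inconsistency (SortedList is a hypothetical class):

class SortedList:
    def __init__(self, items):
        self._items = sorted(items)

    def __iter__(self):
        return iter(self._items)

    def min(self):                   # O(1): the data is already sorted
        return self._items[0]

s = SortedList([3, 1, 2])
print(min(s))     # 1, found in O(n): the built-in iterates, unaware of the ordering
print(s.min())    # 1, found in O(1), but callers must know this non-standard method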
To follow up with another issue with this: it prevents use of Python's method binding. As a simple, contrived example, you can do this to supply a function to add values to a list:
def add(f):
    f(1)
    f(2)
    f(3)

lst = []
add(lst.append)
print lst
and this works on all member functions. You can't do that with min, max or len, though, since they're not methods of the object they operate on. Instead, you have to resort to functools.partial, a clumsy second-class workaround common in other languages.
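For example, the partial-based workaround for the same contrived setup might look like this:

import functools

lst = [1, 2, 3]

# lst.append can be passed around because it is a bound method, but there is
# no lst.len to bind; a partial (or a lambda) has to stand in for it:
get_length = functools.partial(len, lst)
print(get_length())   # 3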
Of course, this is an uncommon case; but it's the uncommon cases that tell us about a language's consistency.
