The advantages of having static function like len(), max(), and min() over inherited method calls - python

i am a python newbie, and i am not sure why python implemented len(obj), max(obj), and min(obj) as a static like functions (i am from the java language) over obj.len(), obj.max(), and obj.min()
what are the advantages and disadvantages (other than obvious inconsistency) of having len()... over the method calls?
why guido chose this over the method calls? (this could have been solved in python3 if needed, but it wasn't changed in python3, so there gotta be good reasons...i hope)
thanks!!

The big advantage is that built-in functions (and operators) can apply extra logic when appropriate, beyond simply calling the special methods. For example, min can look at several arguments and apply the appropriate inequality checks, or it can accept a single iterable argument and proceed similarly; abs when called on an object without a special method __abs__ could try comparing said object with 0 and using the object change sign method if needed (though it currently doesn't); and so forth.
So, for consistency, all operations with wide applicability must always go through built-ins and/or operators, and it's those built-ins responsibility to look up and apply the appropriate special methods (on one or more of the arguments), use alternate logic where applicable, and so forth.
An example where this principle wasn't correctly applied (but the inconsistency was fixed in Python 3) is "step an iterator forward": in 2.5 and earlier, you needed to define and call the non-specially-named next method on the iterator. In 2.6 and later you can do it the right way: the iterator object defines __next__, the new next built-in can call it and apply extra logic, for example to supply a default value (in 2.6 you can still do it the bad old way, for backwards compatibility, though in 3.* you can't any more).
Another example: consider the expression x + y. In a traditional object-oriented language (able to dispatch only on the type of the leftmost argument -- like Python, Ruby, Java, C++, C#, &c) if x is of some built-in type and y is of your own fancy new type, you're sadly out of luck if the language insists on delegating all the logic to the method of type(x) that implements addition (assuming the language allows operator overloading;-).
In Python, the + operator (and similarly of course the builtin operator.add, if that's what you prefer) tries x's type's __add__, and if that one doesn't know what to do with y, then tries y's type's __radd__. So you can define your types that know how to add themselves to integers, floats, complex, etc etc, as well as ones that know how to add such built-in numeric types to themselves (i.e., you can code it so that x + y and y + x both work fine, when y is an instance of your fancy new type and x is an instance of some builtin numeric type).
"Generic functions" (as in PEAK) are a more elegant approach (allowing any overriding based on a combination of types, never with the crazy monomaniac focus on the leftmost arguments that OOP encourages!-), but (a) they were unfortunately not accepted for Python 3, and (b) they do of course require the generic function to be expressed as free-standing (it would be absolutely crazy to have to consider the function as "belonging" to any single type, where the whole POINT is that can be differently overridden/overloaded based on arbitrary combination of its several arguments' types!-). Anybody who's ever programmed in Common Lisp, Dylan, or PEAK, knows what I'm talking about;-).
So, free-standing functions and operators are just THE right, consistent way to go (even though the lack of generic functions, in bare-bones Python, does remove some fraction of the inherent elegance, it's still a reasonable mix of elegance and practicality!-).

It emphasizes the capabilities of an object, not its methods or type. Capabilites are declared by "helper" functions such as __iter__ and __len__ but they don't make up the interface. The interface is in the builtin functions, and beside this also in the buit-in operators like + and [] for indexing and slicing.
Sometimes, it is not a one-to-one correspondance: For example, iter(obj) returns an iterator for an object, and will work even if __iter__ is not defined. If not defined, it goes on to look if the object defines __getitem__ and will return an iterator accessing the object index-wise (like an array).
This goes together with Python's Duck Typing, we care only about what we can do with an object, not that it is of a particular type.

Actually, those aren't "static" methods in the way you are thinking about them. They are built-in functions that really just alias to certain methods on python objects that implement them.
>>> class Foo(object):
... def __len__(self):
... return 42
...
>>> f = Foo()
>>> len(f)
42
These are always available to be called whether or not the object implements them or not. The point is to have some consistency. Instead of some class having a method called length() and another called size(), the convention is to implement len and let the callers always access it by the more readable len(obj) instead of obj.methodThatDoesSomethingCommon

I thought the reason was so these basic operations could be done on iterators with the same interface as containers. However, it actually doesn't work with len:
def foo():
for i in range(10):
yield i
print len(foo())
... fails with TypeError. len() won't consume and count an iterator; it only works with objects that have a __len__ call.
So, as far as I'm concerned, len() shouldn't exist. It's much more natural to say obj.len than len(obj), and much more consistent with the rest of the language and the standard library. We don't say append(lst, 1); we say lst.append(1). Having a separate global method for length is an odd, inconsistent special case, and eats a very obvious name in the global namespace, which is a very bad habit of Python.
This is unrelated to duck typing; you can say getattr(obj, "len") to decide whether you can use len on an object just as easily--and much more consistently--than you can use getattr(obj, "__len__").
All that said, as language warts go--for those who consider this a wart--this is a very easy one to live with.
On the other hand, min and max do work on iterators, which gives them a use apart from any particular object. This is straightforward, so I'll just give an example:
import random
def foo():
for i in range(10):
yield random.randint(0, 100)
print max(foo())
However, there are no __min__ or __max__ methods to override its behavior, so there's no consistent way to provide efficient searching for sorted containers. If a container is sorted on the same key that you're searching, min/max are O(1) operations instead of O(n), and the only way to expose that is by a different, inconsistent method. (This could be fixed in the language relatively easily, of course.)
To follow up with another issue with this: it prevents use of Python's method binding. As a simple, contrived example, you can do this to supply a function to add values to a list:
def add(f):
f(1)
f(2)
f(3)
lst = []
add(lst.append)
print lst
and this works on all member functions. You can't do that with min, max or len, though, since they're not methods of the object they operate on. Instead, you have to resort to functools.partial, a clumsy second-class workaround common in other languages.
Of course, this is an uncommon case; but it's the uncommon cases that tell us about a language's consistency.

Related

what does it mean by 'passed by assignment'?

As follow is my understanding of types & parameters passing in java and python:
In java, there are primitive types and non-primitive types. Former are not object, latter are objects.
In python, they are all objects.
In java, arguments are passed by value because:
primitive types are copied and then passed, so they are passed by value for sure. non-primitive types are passed by reference but reference(pointer) is also value, so they are also passed by value.
In python, the only difference is that 'primitive types'(for example, numbers) are not copied, but simply taken as objects.
Based on official doc, arguments are passed by assignment. What does it mean by 'passed by assignment'? Is objects in java work the same way as python? What result in the difference (passed by value in java and passed by argument in python)?
And is there any wrong understanding above?
tl;dr: You're right that Python's semantics are essentially Java's semantics, without any primitive types.
"Passed by assignment" is actually making a different distinction than the one you're asking about.1 The idea is that argument passing to functions (and other callables) works exactly the same way assignment works.
Consider:
def f(x):
pass
a = 3
b = a
f(a)
b = a means that the target b, in this case a name in the global namespace, becomes a reference to whatever value a references.
f(a) means that the target x, in this case a name in the local namespace of the frame built to execute f, becomes a reference to whatever value a references.
The semantics are identical. Whenever a value gets assigned to a target (which isn't always a simple name—e.g., think lst[0] = a or spam.eggs = a), it follows the same set of assignment rules—whether it's an assignment statement, a function call, an as clause, or a loop iteration variable, there's just one set of rules.
But overall, your intuitive idea that Python is like Java but with only reference types is accurate: You always "pass a reference by value".
Arguing over whether that counts as "pass by reference" or "pass by value" is pointless. Trying to come up with a new unambiguous name for it that nobody will argue about is even more pointless. Liskov invented the term "call by object" three decades ago, and if that never caught on, anything someone comes up with today isn't likely to do any better.
You understand the actual semantics, and that's what matters.
And yes, this means there is no copying. In Java, only primitive values are copied, and Python doesn't have primitive values, so nothing is copied.
the only difference is that 'primitive types'(for example, numbers) are not copied, but simply taken as objects
It's much better to see this as "the only difference is that there are no 'primitive types' (not even simple numbers)", just as you said at the start.
It's also worth asking why Python has no primitive types—or why Java does.2
Making everything "boxed" can be very slow. Adding 2 + 3 in Python means dereferencing the 2 and 3 objects, getting the native values out of them, adding them together, and wrapping the result up in a new 5 object (or looking it up in a table because you already have an existing 5 object). That's a lot more work than just adding two ints.3
While a good JIT like Hotspot—or like PyPy for Python—can often automatically do those optimizations, sometimes "often" isn't good enough. That's why Java has native types: to let you manually optimize things in those cases.
Python, instead, relies on third-party libraries like Numpy, which let you pay the boxing costs just once for a whole array, instead of once per element. Which keeps the language simpler, but at the cost of needing Numpy.4
1. As far as I know, "passed by assignment" appears a couple times in the FAQs, but is not actually used in the reference docs or glossary. The reference docs already lean toward intuitive over rigorous, but the FAQ, like the tutorial, goes much further in that direction. So, asking what a term in the FAQ means, beyond the intuitive idea it's trying to get across, may not be a meaningful question in the first place.
2. I'm going to ignore the issue of Java's lack of operator overloading here. There's no reason they couldn't include special language rules for a handful of core classes, even if they didn't let you do the same thing with your own classes—e.g., Go does exactly that for things like range, and people rarely complain.
3. … or even than looping over two arrays of 30-bit digits, which is what Python actually does. The cost of working on unlimited-size "bigints" is tiny compared to the cost of boxing, so Python just always pays that extra, barely-noticeable cost. Python 2 did, like Java, have separate fixed and bigint types, but a couple decades of experience showed that it wasn't getting any performance benefits out of the extra complexity.
4. The implementation of Numpy is of course far from simple. But using it is pretty simple, and a lot more people need to use Numpy than need to write Numpy, so that turns out to be a pretty decent tradeoff.
Similar to passing reference types by value in C#.
Docs: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/passing-reference-type-parameters#passing-reference-types-by-value
Code demo:
# mutable object
l = [9, 8, 7]
def createNewList(l1: list):
# l1+[0] will create a new list object, the reference address of the local variable l1 is changed without affecting the variable l
l1 = l1+[0]
def changeList(l1: list):
# Add an element to the end of the list, because l1 and l refer to the same object, so l will also change
l1.append(0)
print(l)
createNewList(l)
print(l)
changeList(l)
print(l)
# immutable object
num = 9
def changeValue(val: int):
# int is an immutable type, and changing the val makes the val point to the new object 8,
# it's not change the num value
value = 8
print(num)
changeValue(num)
print(num)

Is there a built-in way to use CPython built-ins to make an arbitrary callable behave as an unbound class method?

In Python 2, it was possible to convert arbitrary callables to methods of a class. Importantly, if the callable was a CPython built-in implemented in C, you could use this to make methods of user-defined classes that were C layer themselves, invoking no byte code when called.
This is occasionally useful if you're relying on the GIL to provide "lock-free" synchronization; since the GIL can only be swapped out between op codes, if all the steps in a particular part of your code can be pushed to C, you can make it behave atomically.
In Python 2, you could do something like this:
import types
from operator import attrgetter
class Foo(object):
... This class maintains a member named length storing the length...
def __len__(self):
return self.length # We don't want this, because we're trying to push all work to C
# Instead, we explicitly make an unbound method that uses attrgetter to achieve
# the same result as above __len__, but without no byte code invoked to satisfy it
Foo.__len__ = types.MethodType(attrgetter('length'), None, Foo)
In Python 3, there is no longer an unbound method type, and types.MethodType only takes two arguments and creates only bound methods (which is not useful for Python special methods like __len__, __hash__, etc., since special methods are often looked up directly on the type, not the instance).
Is there some way of accomplishing this in Py3 that I'm missing?
Things I've looked at:
functools.partialmethod (appears to not have a C implementation, so it fails the requirements, and between the Python implementation and being much more general purpose than I need, it's slow, taking about 5 us in my tests, vs. ~200-300 ns for direct Python definitions or attrgetter in Py2, a roughly 20x increase in overhead)
Trying to make attrgetter or the like follow the non-data descriptor protocol (not possible AFAICT, can't monkey-patch in a __get__ or the like)
Trying to find a way to subclass attrgetter to give it a __get__, but of course, the __get__ needs to be delegated to C layer somehow, and now we're back where we started
(Specific to attrgetter use case) Using __slots__ to make the member a descriptor in the first place, then trying to somehow convert from the resulting descriptor for the data into something that skips the final step of binding and acquiring the real value to something that makes it callable so the real value retrieval is deferred
I can't swear I didn't miss something for any of those options though. Anyone have any solutions? Total hackery is allowed; I recognize I'm doing pathological things here. Ideally it would be flexible (to let you make something that behaves like an unbound method out of a class, a Python built-in function like hex, len, etc., or any other callable object not defined at the Python layer). Importantly, it needs to attach to the class, not each instance (both to reduce per-instance overhead, and to work correctly for dunder special methods, which bypass instance lookup in most cases).
Found a (probably CPython only) solution to this recently. It's a little ugly, being a ctypes hack to directly invoke CPython APIs, but it works, and gets the desired performance:
import ctypes
from operator import attrgetter
make_instance_method = ctypes.pythonapi.PyInstanceMethod_New
make_instance_method.argtypes = (ctypes.py_object,)
make_instance_method.restype = ctypes.py_object
class Foo:
# ... This class maintains a member named length storing the length...
# Defines a __len__ method that, at the C level, fetches self.length
__len__ = make_instance_method(attrgetter('length'))
It's an improvement over the Python 2 version in one way, since, as it doesn't need the class to be defined to make an unbound method for it, you can define it in the class body by simple assignment (where the Python 2 version must explicitly reference Foo twice in Foo.__len__ = types.MethodType(attrgetter('length'), None, Foo), and only after class Foo has finished being defined).
On the other hand, it doesn't actually provide a performance benefit on CPython 3.7 AFAICT, at least not for the simple case here where it's replacing def __len__(self): return self.length; in fact, for __len__ accessed via len(instance) on an instance of Foo, ipython %%timeit microbenchmarks show len(instance) is ~10% slower when __len__ is defined via __len__ = make_instance_method(attrgetter('length')), . This is likely an artifact of attrgetter itself having slightly higher overhead due to CPython not having moved it to the "FastCall" protocol (called "Vectorcall" in 3.8 when it was made semi-public for provisional third-party use), while user-defined functions already benefit from it in 3.7, as well as having to dynamically choose whether to perform dotted or undotted attribute lookup and single or multiple attribute lookup each time (which Vectorcall might be able to avoid by choosing a __call__ implementation appropriate to the gets being performed at construction time) adds more overhead that the plain method avoids. It should win for more complicated cases (say, if the attribute to be retrieved is a nested attribute like self.contained.length), since attrgetter's overhead is largely fixed, while nested attribute lookup in Python means more byte code, but right now, it's not useful very often.
If they ever get around to optimizing operator.attrgetter for Vectorcall, I'll rebenchmark and update this answer.

Is it idiomatic to take advantage of python's pass-by-sharing of mutable data structures like lists?

I know that in Python, because it's pass-by-sharing, if I pass a mutable object (like a list) to a function, and then use that function to mutate it, I don't need to explicitly pass it back, because the caller can see the changes:
def add_to_list(list_of_nums):
list_of_nums.append(26)
my_list = [12]
add_to_list(my_list)
print my_list # >>>[12, 26]
So this works. But is it a good idea/good python practice? My gut says it's not (the same way global variables are almost always a bad idea), but maybe that's just because I first learned C++ in all its pass-by-value glory.
And yes, I know that I can code my way around this (say by creating a class), but the question is, should I, or is this generally seen as acceptable practice?
I think whether or not this is "acceptable" will be determined by the context and how well the function is named. keep Pep 20 in mind: "Explicit is better than implicit."
Python fully supports functional programming, and in that case, modifying objects that are passed in is often expected. If the function is named appropriately and documented well, I think it's fine. Your example illustrates this pretty well. The function is called add_to_list, which pretty explicitly says what the function does.
If your program/script takes more of an object-oriented approach, modifying passed-in objects should be replaced by creating the appropriate classes instead, like the native list class in your example - it has an append() method that modifies the list instead of passing the list into a separate function.
The key is to be consistent with your paradigm and well documented. If you cover both of those bases, I think it's acceptable.

Learning Python from Ruby; Differences and Similarities

I know Ruby very well. I believe that I may need to learn Python presently. For those who know both, what concepts are similar between the two, and what are different?
I'm looking for a list similar to a primer I wrote for Learning Lua for JavaScripters: simple things like whitespace significance and looping constructs; the name of nil in Python, and what values are considered "truthy"; is it idiomatic to use the equivalent of map and each, or are mumble somethingaboutlistcomprehensions mumble the norm?
If I get a good variety of answers I'm happy to aggregate them into a community wiki. Or else you all can fight and crib from each other to try to create the one true comprehensive list.
Edit: To be clear, my goal is "proper" and idiomatic Python. If there is a Python equivalent of inject, but nobody uses it because there is a better/different way to achieve the common functionality of iterating a list and accumulating a result along the way, I want to know how you do things. Perhaps I'll update this question with a list of common goals, how you achieve them in Ruby, and ask what the equivalent is in Python.
Here are some key differences to me:
Ruby has blocks; Python does not.
Python has functions; Ruby does not. In Python, you can take any function or method and pass it to another function. In Ruby, everything is a method, and methods can't be directly passed. Instead, you have to wrap them in Proc's to pass them.
Ruby and Python both support closures, but in different ways. In Python, you can define a function inside another function. The inner function has read access to variables from the outer function, but not write access. In Ruby, you define closures using blocks. The closures have full read and write access to variables from the outer scope.
Python has list comprehensions, which are pretty expressive. For example, if you have a list of numbers, you can write
[x*x for x in values if x > 15]
to get a new list of the squares of all values greater than 15. In Ruby, you'd have to write the following:
values.select {|v| v > 15}.map {|v| v * v}
The Ruby code doesn't feel as compact. It's also not as efficient since it first converts the values array into a shorter intermediate array containing the values greater than 15. Then, it takes the intermediate array and generates a final array containing the squares of the intermediates. The intermediate array is then thrown out. So, Ruby ends up with 3 arrays in memory during the computation; Python only needs the input list and the resulting list.
Python also supplies similar map comprehensions.
Python supports tuples; Ruby doesn't. In Ruby, you have to use arrays to simulate tuples.
Ruby supports switch/case statements; Python does not.
Ruby supports the standard expr ? val1 : val2 ternary operator; Python does not.
Ruby supports only single inheritance. If you need to mimic multiple inheritance, you can define modules and use mix-ins to pull the module methods into classes. Python supports multiple inheritance rather than module mix-ins.
Python supports only single-line lambda functions. Ruby blocks, which are kind of/sort of lambda functions, can be arbitrarily big. Because of this, Ruby code is typically written in a more functional style than Python code. For example, to loop over a list in Ruby, you typically do
collection.each do |value|
...
end
The block works very much like a function being passed to collection.each. If you were to do the same thing in Python, you'd have to define a named inner function and then pass that to the collection each method (if list supported this method):
def some_operation(value):
...
collection.each(some_operation)
That doesn't flow very nicely. So, typically the following non-functional approach would be used in Python:
for value in collection:
...
Using resources in a safe way is quite different between the two languages. Here, the problem is that you want to allocate some resource (open a file, obtain a database cursor, etc), perform some arbitrary operation on it, and then close it in a safe manner even if an exception occurs.
In Ruby, because blocks are so easy to use (see #9), you would typically code this pattern as a method that takes a block for the arbitrary operation to perform on the resource.
In Python, passing in a function for the arbitrary action is a little clunkier since you have to write a named, inner function (see #9). Instead, Python uses a with statement for safe resource handling. See How do I correctly clean up a Python object? for more details.
I, like you, looked for inject and other functional methods when learning Python. I was disappointed to find that they weren't all there, or that Python favored an imperative approach. That said, most of the constructs are there if you look. In some cases, a library will make things nicer.
A couple of highlights for me:
The functional programming patterns you know from Ruby are available in Python. They just look a little different. For example, there's a map function:
def f(x):
return x + 1
map(f, [1, 2, 3]) # => [2, 3, 4]
Similarly, there is a reduce function to fold over lists, etc.
That said, Python lacks blocks and doesn't have a streamlined syntax for chaining or composing functions. (For a nice way of doing this without blocks, check out Haskell's rich syntax.)
For one reason or another, the Python community seems to prefer imperative iteration for things that would, in Ruby, be done without mutation. For example, folds (i.e., inject), are often done with an imperative for loop instead of reduce:
running_total = 0
for n in [1, 2, 3]:
running_total = running_total + n
This isn't just a convention, it's also reinforced by the Python maintainers. For example, the Python 3 release notes explicitly favor for loops over reduce:
Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.
List comprehensions are a terse way to express complex functional operations (similar to Haskell's list monad). These aren't available in Ruby and may help in some scenarios. For example, a brute-force one-liner to find all the palindromes in a string (assuming you have a function p() that returns true for palindromes) looks like this:
s = 'string-with-palindromes-like-abbalabba'
l = len(s)
[s[x:y] for x in range(l) for y in range(x,l+1) if p(s[x:y])]
Methods in Python can be treated as context-free functions in many cases, which is something you'll have to get used to from Ruby but can be quite powerful.
In case this helps, I wrote up more thoughts here in 2011: The 'ugliness' of Python. They may need updating in light of today's focus on ML.
My suggestion: Don't try to learn the differences. Learn how to approach the problem in Python. Just like there's a Ruby approach to each problem (that works very well givin the limitations and strengths of the language), there's a Python approach to the problem. they are both different. To get the best out of each language, you really should learn the language itself, and not just the "translation" from one to the other.
Now, with that said, the difference will help you adapt faster and make 1 off modifications to a Python program. And that's fine for a start to get writing. But try to learn from other projects the why behind the architecture and design decisions rather than the how behind the semantics of the language...
I know little Ruby, but here are a few bullet points about the things you mentioned:
nil, the value indicating lack of a value, would be None (note that you check for it like x is None or x is not None, not with == - or by coercion to boolean, see next point).
None, zero-esque numbers (0, 0.0, 0j (complex number)) and empty collections ([], {}, set(), the empty string "", etc.) are considered falsy, everything else is considered truthy.
For side effects, (for-)loop explicitly. For generating a new bunch of stuff without side-effects, use list comprehensions (or their relatives - generator expressions for lazy one-time iterators, dict/set comprehensions for the said collections).
Concerning looping: You have for, which operates on an iterable(! no counting), and while, which does what you would expect. The fromer is far more powerful, thanks to the extensive support for iterators. Not only nearly everything that can be an iterator instead of a list is an iterator (at least in Python 3 - in Python 2, you have both and the default is a list, sadly). The are numerous tools for working with iterators - zip iterates any number of iterables in parallel, enumerate gives you (index, item) (on any iterable, not just on lists), even slicing abritary (possibly large or infinite) iterables! I found that these make many many looping tasks much simpler. Needless to say, they integrate just fine with list comprehensions, generator expressions, etc.
In Ruby, instance variables and methods are completely unrelated, except when you explicitly relate them with attr_accessor or something like that.
In Python, methods are just a special class of attribute: one that is executable.
So for example:
>>> class foo:
... x = 5
... def y(): pass
...
>>> f = foo()
>>> type(f.x)
<type 'int'>
>>> type(f.y)
<type 'instancemethod'>
That difference has a lot of implications, like for example that referring to f.x refers to the method object, rather than calling it. Also, as you can see, f.x is public by default, whereas in Ruby, instance variables are private by default.

Why does Python use 'magic methods'?

I'm a bit surprised by Python's extensive use of 'magic methods'.
For example, in order for a class to declare that instances have a "length", it implements a __len__ method, which it is called when you write len(obj). Why not just define a len method which is called directly as a member of the object, e.g. obj.len()?
See also: Why does Python code use len() function instead of a length method?
AFAIK, len is special in this respect and has historical roots.
Here's a quote from the FAQ:
Why does Python use methods for some
functionality (e.g. list.index()) but
functions for other (e.g. len(list))?
The major reason is history. Functions
were used for those operations that
were generic for a group of types and
which were intended to work even for
objects that didn’t have methods at
all (e.g. tuples). It is also
convenient to have a function that can
readily be applied to an amorphous
collection of objects when you use the
functional features of Python (map(),
apply() et al).
In fact, implementing len(), max(),
min() as a built-in function is
actually less code than implementing
them as methods for each type. One can
quibble about individual cases but
it’s a part of Python, and it’s too
late to make such fundamental changes
now. The functions have to remain to
avoid massive code breakage.
The other "magical methods" (actually called special method in the Python folklore) make lots of sense, and similar functionality exists in other languages. They're mostly used for code that gets called implicitly when special syntax is used.
For example:
overloaded operators (exist in C++ and others)
constructor/destructor
hooks for accessing attributes
tools for metaprogramming
and so on...
From the Zen of Python:
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
This is one of the reasons - with custom methods, developers would be free to choose a different method name, like getLength(), length(), getlength() or whatsoever. Python enforces strict naming so that the common function len() can be used.
All operations that are common for many types of objects are put into magic methods, like __nonzero__, __len__ or __repr__. They are mostly optional, though.
Operator overloading is also done with magic methods (e.g. __le__), so it makes sense to use them for other common operations, too.
Python uses the word "magic methods", because those methods really performs magic for you program. One of the biggest advantages of using Python's magic methods is that they provide a simple way to make objects behave like built-in types. That means you can avoid ugly, counter-intuitive, and nonstandard ways of performing basic operators.
Consider a following example:
dict1 = {1 : "ABC"}
dict2 = {2 : "EFG"}
dict1 + dict2
Traceback (most recent call last):
File "python", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict' and 'dict'
This gives an error, because the dictionary type doesn't support addition. Now, let's extend dictionary class and add "__add__" magic method:
class AddableDict(dict):
def __add__(self, otherObj):
self.update(otherObj)
return AddableDict(self)
dict1 = AddableDict({1 : "ABC"})
dict2 = AddableDict({2 : "EFG"})
print (dict1 + dict2)
Now, it gives following output.
{1: 'ABC', 2: 'EFG'}
Thus, by adding this method, suddenly magic has happened and the error you were getting earlier, has gone away.
I hope, it makes things clear to you. For more information, refer to:
A Guide to Python's Magic Methods (Rafe Kettler, 2012)
Some of these functions do more than a single method would be able to implement (without abstract methods on a superclass). For instance bool() acts kind of like this:
def bool(obj):
if hasattr(obj, '__nonzero__'):
return bool(obj.__nonzero__())
elif hasattr(obj, '__len__'):
if obj.__len__():
return True
else:
return False
return True
You can also be 100% sure that bool() will always return True or False; if you relied on a method you couldn't be entirely sure what you'd get back.
Some other functions that have relatively complicated implementations (more complicated than the underlying magic methods are likely to be) are iter() and cmp(), and all the attribute methods (getattr, setattr and delattr). Things like int also access magic methods when doing coercion (you can implement __int__), but do double duty as types. len(obj) is actually the one case where I don't believe it's ever different from obj.__len__().
They are not really "magic names". It's just the interface an object has to implement to provide a given service. In this sense, they are not more magic than any predefined interface definition you have to reimplement.
While the reason is mostly historic, there are some peculiarities in Python's len that make the use of a function instead of a method appropriate.
Some operations in Python are implemented as methods, for example list.index and dict.append, while others are implemented as callables and magic methods, for example str and iter and reversed. The two groups differ enough so the different approach is justified:
They are common.
str, int and friends are types. It makes more sense to call the constructor.
The implementation differs from the function call. For example, iter might call __getitem__ if __iter__ isn't available, and supports additional arguments that don't fit in a method call. For the same reason it.next() has been changed to next(it) in recent versions of Python - it makes more sense.
Some of these are close relatives of operators. There's syntax for calling __iter__ and __next__ - it's called the for loop. For consistency, a function is better. And it makes it better for certain optimisations.
Some of the functions are simply way too similar to the rest in some way - repr acts like str does. Having str(x) versus x.repr() would be confusing.
Some of them rarely use the actual implementation method, for example isinstance.
Some of them are actual operators, getattr(x, 'a') is another way of doing x.a and getattr shares many of the aforementioned qualities.
I personally call the first group method-like and the second group operator-like. It's not a very good distinction, but I hope it helps somehow.
Having said this, len doesn't exactly fit in the second group. It's more close to the operations in the first one, with the only difference that it's way more common than almost any of them. But the only thing that it does is calling __len__, and it's very close to L.index. However, there are some differences. For example, __len__ might be called for the implementation of other features, such as bool, if the method was called len you might break bool(x) with custom len method that does completely different thing.
In short, you have a set of very common features that classes might implement that might be accessed through an operator, through a special function (that usually does more than the implementation, as an operator would), during object construction, and all of them share some common traits. All the rest is a method. And len is somewhat of an exception to that rule.
There is not a lot to add to the above two posts, but all the "magic" functions are not really magic at all. They are part of the __ builtins__ module which is implicitly/automatically imported when the interpreter starts. I.e.:
from __builtins__ import *
happens every time before your program starts.
I always thought it would be more correct if Python only did this for the interactive shell, and required scripts to import the various parts from builtins they needed. Also probably different __ main__ handling would be nice in shells vs interactive. Anyway, check out all the functions, and see what it is like without them:
dir (__builtins__)
...
del __builtins__
Perhaps, you have noticed it is possible to use certain built-in methods (ex. len(my_list_or_my_string)), and syntaxes (ex. my_list_or_my_string[:3], my_fancy_dict['some_key']) on some native types such as list, dict. Maybe you have been curious as to why it is not possible (yet) to use these same syntaxes on some of the classes you have written.
Variables of native types (list, dict, int, str) have unique behaviours and respond to certain syntaxes because they have some special methods defined in their respective classes — these methods are called Magic Methods.
A few magic methods include: __len__, __gt__, __eq__, etc.
Read more here: https://tomisin.dev/blog/supercharging-python-classes-with-magic-methods

Categories

Resources