How to inherit mathematical operations? - Python

For a group of objects that are number like (called an ordered field) you only need the following things:
Addition
Multiplication
Negation
Reciprocal
LessThanEqual
And the rest (like subtraction and equality) follow. Obviously, I will also need to add things like __init__ and __str__, but what type of object can I inherit from that will supply the other operators? Some other operators that I wish would be inferred from the above include:
Subtraction
Division
Absolute Value
All other comparison operators
Etc...

Take a look at the numbers module. It has abstract base classes for numeric types.
Also take a look at the list of magic methods related to numerical types:
http://www.rafekettler.com/magicmethods.html#numeric

While not a complete answer, for comparisons, there is functools.total_ordering.
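As a sketch of how functools.total_ordering helps with the comparison half of the question (hypothetical Rational class, assuming positive denominators for simplicity):

```python
from functools import total_ordering

@total_ordering
class Rational:
    """Hypothetical number-like class: define only __eq__ and __le__,
    and total_ordering derives <, >, and >= for you.
    Assumes positive denominators for simplicity."""
    def __init__(self, num, den=1):
        self.num, self.den = num, den

    def __eq__(self, other):
        return self.num * other.den == other.num * self.den

    def __le__(self, other):
        return self.num * other.den <= other.num * self.den

assert Rational(1, 2) < Rational(2, 3)    # < derived by total_ordering
assert Rational(2, 4) == Rational(1, 2)
```

total_ordering only covers the comparison operators; the arithmetic operators still have to be supplied separately.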

You need to override operators for that.
The complete method is well documented here: http://docs.python.org/2/reference/datamodel.html
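No standard base class fills in all of these from exactly the five primitives in the question, but a small mixin can derive the rest. A minimal sketch (hypothetical class names; the reciprocal() method is an assumption, since Python has no special method for reciprocals):

```python
class FieldElement:
    """Hypothetical mixin: subclasses supply __add__, __mul__,
    __neg__, reciprocal() and __le__; the rest is derived."""
    def __sub__(self, other):
        return self + (-other)              # a - b == a + (-b)

    def __truediv__(self, other):
        return self * other.reciprocal()    # a / b == a * (1/b)

    def __lt__(self, other):
        return self <= other and not other <= self

    def __eq__(self, other):
        return self <= other and other <= self

    def __abs__(self):
        return self if -self <= self else -self


class F(FieldElement):
    """Minimal concrete subclass wrapping a float, for demonstration."""
    def __init__(self, x):
        self.x = x

    def __add__(self, other):
        return F(self.x + other.x)

    def __mul__(self, other):
        return F(self.x * other.x)

    def __neg__(self):
        return F(-self.x)

    def reciprocal(self):
        return F(1 / self.x)

    def __le__(self, other):
        return self.x <= other.x


assert (F(5) - F(2)).x == 3    # subtraction derived from + and negation
assert (F(6) / F(3)).x == 2    # division derived from * and reciprocal
assert abs(F(-3)).x == 3       # absolute value derived from negation and <=
```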

Related

Why are set methods like .intersection() not supported on set-like objects?

In Python 3.7, I'd like to calculate the intersection of two dictionaries' keys. To do this, I'd like to call the .intersection() method on their keys(), however it does not work.
.keys() produces a set-like object, but most set methods don't work on it. What does work, however, are the little-known operator overloads for set-like objects, like &.
m = {'a':1, 'b':2}
n = {'b':3, 'c':4}
m.keys().intersection(n.keys()) # Pythonic, but doesn't work
m.keys() & n.keys() # works but not readable
set(m.keys()).intersection(set(n.keys())) # works, readable, but too verbose
I find that the & overload on a set-like object is extremely rarely used and is not known by most programmers. Method names like .intersection() or .union() are self-documenting and definitely more Pythonic by this definition.
Why isn't it supported then? Even the documentation lists the & and .intersection() methods as aliases, not mentioning that only & is supported on set-like objects.
note: For some reason, in IPython the autocomplete lists .isdisjoint() as a method which is available on dict.keys(). Out of the 17 set methods, 1 is present.
dict views only guarantee the API of collections.abc.Set, not to be equivalent to set itself:
For set-like views, all of the operations defined for the abstract base class collections.abc.Set are available (for example, ==, <, or ^).
So they're not claiming to match set, or even frozenset, just collections.abc.Set. And collections.abc.Set doesn't require any of the named methods aside from isdisjoint (which has no operator equivalent).
As for why collections.abc.Set doesn't require the named methods, it may be because they are somewhat more complex (most take varargs, and the varargs can be any iterable, not just other set-like things, so you can, for example, intersection with many iterables at once), and they may have wanted to limit the complexity required to implement new subclasses (especially virtual subclasses, that wouldn't inherit any of the methods collections.abc.Set might choose to provide).
All that said, I generally agree with your point that it seems needlessly inconsistent to omit the method forms. I'd recommend you open a bug on the Python bug tracker requesting the change; just because it only guarantees Set compatibility doesn't mean it can't do more.
The signature is
set.intersection(*others)
where each of the others can be any iterable.
m.keys() is a dict_keys object, not a set, so that won't work.
set(m.keys()).intersection(n.keys())
will work :)
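A quick demonstration of the points above:

```python
m = {'a': 1, 'b': 2}
n = {'b': 3, 'c': 4}

# dict_keys supports the Set operators (&, |, -, ^)...
common = m.keys() & n.keys()
assert common == {'b'}

# ...but not the named methods, unless you convert one side
# to a real set first, which restores them:
assert set(m.keys()).intersection(n.keys()) == {'b'}
```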

Is "Changing the behavior of an operator like + so it works with a programmer-defined type," a good definition of operator overloading?

In "Think Python: How to Think Like a Computer Scientist", the author defines operator overloading as:
Changing the behavior of an operator like + so it works with a
programmer-defined type.
Is this an accurate definition of it (in programming in general, and in Python specifically)? Isn't it: "The ability to use the same operator for different operations?" For example, in Python, we can use + to add two numbers, or to concatenate two sequences. Isn't this operator overloading? Isn't the + operator overloaded here?
Does the author mean by "the behavior of an operator", raising a TypeError because it's not implemented for the given class? Because the operator still has its behavior with other types (e.g. strings).
Is the definition the author wrote, a specific type of operator overloading?
The definition given in "How to think..." is correct.
It isn't specific to Python; C++ has the same concept.
The programmer can give an operator a new meaning, e.g. adding two vectors with a + instead of two scalar numbers.
The mere fact that an operator can be used on multiple datatypes natively doesn't have a specific name. In almost any language + can be used to add integers or floats. There's no special word for this, and many programmers aren't even aware of the difference.
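A minimal sketch of giving + a new meaning for a programmer-defined type, as in the vector example above (hypothetical Vector class):

```python
class Vector:
    """Hypothetical 2-D vector: __add__ overloads + for this type."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # element-wise addition instead of numeric addition
        return Vector(self.x + other.x, self.y + other.y)

v = Vector(1, 2) + Vector(3, 4)   # calls Vector.__add__
assert (v.x, v.y) == (4, 6)
```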

Comparison of objects of distinct types

What's the suggested Python semantics for ordering of objects of distinct types? In other words, what behavior should one implement when a custom-written comparison method (using rich comparisons like e.g. __lt__, but perhaps also when using the ‘poor’ comparison __cmp__ in Python 2) encounters an object of a different type than self?
Should one invent an order, e.g. “all unexpected objects compare as less than my own type”?
Should one throw a TypeError?
Is there some easy way to let the other object have a try, i.e. if one does foo < bar and foo.__lt__ doesn't know about bar's type, can it fall back to bar.__gt__?
Are there any guidelines at all about how to achieve sane ordering of objects of distinct types, preferably a total order but perhaps less?
Is there any part in the documentation which explains why 3 < "3"?
PEP 207 apparently leaves a lot of freedom of how things can be implemented, but nevertheless I expect there might be some guidelines how things should be implemented to help interoperability.
While writing the question, the “similar questions” list made me aware of this post. It mentions NotImplemented as a return value (not exception) of rich comparisons. Searching for this keyword in the docs turned up relevant parts as well:
Python 2 data model:
NotImplemented – This type has a single value. There is a single object with this value. This object is accessed through the built-in name NotImplemented. Numeric methods and rich comparison methods may return this value if they do not implement the operation for the operands provided. (The interpreter will then try the reflected operation, or some other fallback, depending on the operator.) Its truth value is true.
And later on:
A rich comparison method may return the singleton NotImplemented if it does not implement the operation for a given pair of arguments. By convention, False and True are returned for a successful comparison. However, these methods can return any value, so if the comparison operator is used in a Boolean context (e.g., in the condition of an if statement), Python will call bool() on the value to determine if the result is true or false.
PEP 207:
If the function cannot compare the particular combination of objects, it should return a new reference to Py_NotImplemented.
This answers points 2. and 3. of my original question, at least for the rich comparison scenario. The docs for __cmp__ don't mention NotImplemented at all, which might be the reason why I missed that at first. So this is one more reason to switch to rich comparisons.
I'm still not sure whether returning that value is to be preferred to inventing an order. I guess a lot depends on what ideas the other object might have. And I fear that for a mix of types to achieve any kind of sane orderings, a lot of cooperation might be needed. But perhaps someone else can shed more light on the other parts of my question.
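The NotImplemented pattern discussed above can be sketched like this (hypothetical Meters class):

```python
class Meters:
    """Hypothetical class: return NotImplemented for unknown types,
    so Python can try the other operand's reflected method
    before raising TypeError."""
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        if isinstance(other, Meters):
            return self.value < other.value
        return NotImplemented   # let the other operand have a try

assert Meters(1) < Meters(2)
try:
    Meters(1) < "3"   # neither side handles it, so Python raises TypeError
except TypeError:
    pass
```

In Python 3 this yields the arguably saner behavior that mixed-type comparisons fail loudly instead of producing an arbitrary order like 3 < "3".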

Difference between operators and methods

Is there any substantial difference between operators and methods?
The only difference I see is the way the are called, do they have other differences?
For example in Python concatenation, slicing, indexing are defined as operators, while (referring to strings) upper(), replace(), strip() and so on are methods.
If I understand the question correctly...
In a nutshell, everything is a method of an object. You can find the "expression operator" methods among Python's magic class methods, in the operators.
So why does Python have "sexy" things like [x:y], [x], +, -? Because these notations are familiar to most developers, even to people unfamiliar with development: math symbols like + and - catch the eye, and the reader knows what happens. Indexing is similar; it is common syntax in many languages.
But there is no special way to express the upper, replace, or strip methods, so there are no "expression operators" for them.
So what is different between "expression operators" and methods? I'd say just the way they look.
Your question is rather broad. For your examples, concatenation, slicing, and indexing are defined on strings and lists using special syntax (e.g., []). But other types may do things differently.
In fact, the behavior of most (I think all) of the operators is controlled by magic methods, so really when you write something like x + y a method is called under the hood.
From a practical perspective, one of the main differences is that the set of available syntactic operators is fixed and new ones cannot be added by your Python code. You can't write your own code to define a new operator called $ and then have x $ y work. On the other hand, you can define as many methods as you want. This means that you should choose carefully what behavior (if any) you assign to operators; since there are only a limited number of operators, you want to be sure that you don't "waste" them on uncommon operations.
Is there any substantial difference between operators and
methods?
Practically speaking, there is no difference because each operator is mapped to a specific Python special method. Moreover, whenever Python encounters the use of an operator, it calls its associated special method implicitly. For example:
1 + 2
implicitly calls int.__add__, which makes the above expression equivalent1 to:
(1).__add__(2)
Below is a demonstration:
>>> class Foo:
...     def __add__(self, other):
...         print("Foo.__add__ was called")
...         return other + 10
...
>>> f = Foo()
>>> f + 1
Foo.__add__ was called
11
>>> f.__add__(1)
Foo.__add__ was called
11
>>>
Of course, actually using (1).__add__(2) in place of 1 + 2 would be inefficient (and ugly!) because it involves an unnecessary name lookup with the . operator.
That said, I do not see a problem with generally regarding the operator symbols (+, -, *, etc.) as simply shorthands for their associated method names (__add__, __sub__, __mul__, etc.). After all, they each end up doing the same thing by calling the same method.
1Well, roughly equivalent. As documented here, there is a set of special methods prefixed with the letter r that handle reflected operands. For example, the following expression:
A + B
may actually be equivalent to:
B.__radd__(A)
if A does not implement __add__ but B implements __radd__.
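A small demonstration of that reflected fallback (hypothetical class name):

```python
class Ten:
    """Hypothetical class: only the reflected __radd__ is defined."""
    def __radd__(self, other):
        return other + 10

# int.__add__ returns NotImplemented for a Ten operand,
# so Python falls back to Ten.__radd__(1):
assert 1 + Ten() == 11
```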

Why does Python use 'magic methods'?

I'm a bit surprised by Python's extensive use of 'magic methods'.
For example, in order for a class to declare that instances have a "length", it implements a __len__ method, which is called when you write len(obj). Why not just define a len method which is called directly as a member of the object, e.g. obj.len()?
See also: Why does Python code use len() function instead of a length method?
AFAIK, len is special in this respect and has historical roots.
Here's a quote from the FAQ:
Why does Python use methods for some functionality (e.g. list.index()) but functions for other (e.g. len(list))?
The major reason is history. Functions were used for those operations that were generic for a group of types and which were intended to work even for objects that didn't have methods at all (e.g. tuples). It is also convenient to have a function that can readily be applied to an amorphous collection of objects when you use the functional features of Python (map(), apply() et al).
In fact, implementing len(), max(), min() as a built-in function is actually less code than implementing them as methods for each type. One can quibble about individual cases but it's a part of Python, and it's too late to make such fundamental changes now. The functions have to remain to avoid massive code breakage.
The other "magical methods" (actually called special methods in the Python documentation) make lots of sense, and similar functionality exists in other languages. They're mostly used for code that gets called implicitly when special syntax is used.
For example:
overloaded operators (exist in C++ and others)
constructor/destructor
hooks for accessing attributes
tools for metaprogramming
and so on...
From the Zen of Python:
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
This is one of the reasons - with custom methods, developers would be free to choose a different method name, like getLength(), length(), getlength() or whatsoever. Python enforces strict naming so that the common function len() can be used.
All operations that are common for many types of objects are put into magic methods, like __nonzero__, __len__ or __repr__. They are mostly optional, though.
Operator overloading is also done with magic methods (e.g. __le__), so it makes sense to use them for other common operations, too.
Python uses the term "magic methods" because those methods really perform magic for your program. One of the biggest advantages of using Python's magic methods is that they provide a simple way to make objects behave like built-in types. That means you can avoid ugly, counter-intuitive, and nonstandard ways of performing basic operations.
Consider the following example:
dict1 = {1 : "ABC"}
dict2 = {2 : "EFG"}
dict1 + dict2
Traceback (most recent call last):
File "python", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict' and 'dict'
This gives an error, because the dictionary type doesn't support addition. Now, let's extend the dictionary class and add an "__add__" magic method:
class AddableDict(dict):
    def __add__(self, otherObj):
        self.update(otherObj)
        return AddableDict(self)
dict1 = AddableDict({1 : "ABC"})
dict2 = AddableDict({2 : "EFG"})
print (dict1 + dict2)
Now, it gives the following output.
{1: 'ABC', 2: 'EFG'}
Thus, by adding this method, the magic suddenly happens and the error you were getting earlier goes away.
I hope, it makes things clear to you. For more information, refer to:
A Guide to Python's Magic Methods (Rafe Kettler, 2012)
Some of these functions do more than a single method would be able to implement (without abstract methods on a superclass). For instance bool() acts kind of like this:
def bool(obj):
    if hasattr(obj, '__nonzero__'):
        return bool(obj.__nonzero__())
    elif hasattr(obj, '__len__'):
        if obj.__len__():
            return True
        else:
            return False
    return True
You can also be 100% sure that bool() will always return True or False; if you relied on a method you couldn't be entirely sure what you'd get back.
Some other functions that have relatively complicated implementations (more complicated than the underlying magic methods are likely to be) are iter() and cmp(), and all the attribute methods (getattr, setattr and delattr). Things like int also access magic methods when doing coercion (you can implement __int__), but do double duty as types. len(obj) is actually the one case where I don't believe it's ever different from obj.__len__().
They are not really "magic names". It's just the interface an object has to implement to provide a given service. In this sense, they are not more magic than any predefined interface definition you have to reimplement.
While the reason is mostly historic, there are some peculiarities in Python's len that make the use of a function instead of a method appropriate.
Some operations in Python are implemented as methods, for example list.index and dict.update, while others are implemented as callables and magic methods, for example str and iter and reversed. The two groups differ enough that the different approach is justified:
They are common.
str, int and friends are types. It makes more sense to call the constructor.
The implementation differs from the function call. For example, iter might call __getitem__ if __iter__ isn't available, and supports additional arguments that don't fit in a method call. For the same reason it.next() has been changed to next(it) in recent versions of Python - it makes more sense.
Some of these are close relatives of operators. There's syntax for calling __iter__ and __next__ - it's called the for loop. For consistency, a function is better. And it makes it better for certain optimisations.
Some of the functions are simply way too similar to the rest in some way - repr acts like str does. Having str(x) versus x.repr() would be confusing.
Some of them rarely use the actual implementation method, for example isinstance.
Some of them are actual operators, getattr(x, 'a') is another way of doing x.a and getattr shares many of the aforementioned qualities.
I personally call the first group method-like and the second group operator-like. It's not a very good distinction, but I hope it helps somehow.
Having said this, len doesn't exactly fit in the second group. It's closer to the operations in the first one, with the only difference that it's way more common than almost any of them. But the only thing it does is call __len__, and it's very close to L.index. However, there are some differences. For example, __len__ might be called for the implementation of other features, such as bool; if the method were called len, you might break bool(x) with a custom len method that does a completely different thing.
In short, you have a set of very common features that classes might implement that might be accessed through an operator, through a special function (that usually does more than the implementation, as an operator would), during object construction, and all of them share some common traits. All the rest is a method. And len is somewhat of an exception to that rule.
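The point above about __len__ feeding other features like bool can be sketched like this (hypothetical Playlist class):

```python
class Playlist:
    """Hypothetical container: defining __len__ gives you both
    len(p) and truth-testing for free."""
    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        return len(self._songs)

p = Playlist(['intro', 'outro'])
assert len(p) == 2
assert bool(p) is True                 # bool() falls back to __len__
assert bool(Playlist([])) is False     # empty container is falsy
```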
There is not a lot to add to the above two posts, but all the "magic" functions are not really magic at all. They are part of the __builtins__ module which is implicitly/automatically imported when the interpreter starts. I.e.:
from __builtins__ import *
happens every time before your program starts.
I always thought it would be more correct if Python only did this for the interactive shell, and required scripts to import the various parts from builtins they needed. Also, different __main__ handling would probably be nice in scripts vs. interactive sessions. Anyway, check out all the functions, and see what it is like without them:
dir(__builtins__)
...
del __builtins__
Perhaps, you have noticed it is possible to use certain built-in methods (ex. len(my_list_or_my_string)), and syntaxes (ex. my_list_or_my_string[:3], my_fancy_dict['some_key']) on some native types such as list, dict. Maybe you have been curious as to why it is not possible (yet) to use these same syntaxes on some of the classes you have written.
Variables of native types (list, dict, int, str) have unique behaviours and respond to certain syntaxes because they have some special methods defined in their respective classes — these methods are called Magic Methods.
A few magic methods include: __len__, __gt__, __eq__, etc.
Read more here: https://tomisin.dev/blog/supercharging-python-classes-with-magic-methods
