I am trying to create a descriptor class for model fields that saves its modification history.
I can detect when a method is called on the field value by overriding __getattr__:
    def __getattr__(self, attr):
        print(attr)
        return super().__getattr__(attr)
And I can see the arguments of overridden methods:
    def __add__(self, other):
        print(self, other)
        return super().__add__(other)
The problem is that the += operator is just syntactic sugar for:
    foo = foo + other
So I cannot handle += as a single method call; it triggers __add__ and then __set__. Can I determine that the value was not completely replaced with a new one, but was added to, multiplied, divided, etc.?
Use __iadd__
For instance, if x is an instance of a class with an __iadd__() method, x += y is equivalent to x = x.__iadd__(y) . Otherwise, x.__add__(y) and y.__radd__(x) are considered, as with the evaluation of x + y.
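As a minimal sketch of the idea, the class below records in-place additions in `__iadd__` while leaving plain `+` alone. The names `Tracked` and `history` are hypothetical and not part of the question's code.

```python
class Tracked:
    """Hypothetical value wrapper that logs in-place modifications."""

    def __init__(self, value, history=None):
        self.value = value
        self.history = list(history or [])

    def __add__(self, other):
        # Plain addition builds a new object and leaves self untouched.
        return Tracked(self.value + other, self.history)

    def __iadd__(self, other):
        # Called for `x += other`: record the operation, mutate, return self.
        self.history.append(("iadd", other))
        self.value += other
        return self


x = Tracked(10)
same_object = x
x += 5          # goes through __iadd__
y = x + 1       # goes through __add__

print(x.value, x.history)   # 15 [('iadd', 5)]
print(x is same_object)     # True
print(y.value, y is x)      # 16 False
```

On a descriptor-managed attribute, `obj.field += 5` still ends with a call to `__set__`, but the value passed in is the same object that `__iadd__` returned, which is one way to tell an in-place update from a full replacement.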
For example, I have a class with a field __x, which is a list:
    class C():
        def __init__(self, xx):
            self.__x = xx
        @property
        def x(self):
            return self.__x
        @x.setter
        def x(self, xx):
            raise Exception("Attempt to change an immutable field")
I can prevent changes such as these:
c = C([1,2,3])
c.x = [3,2,1]
But how can I prevent a change such as this?
c.x.append(4)
In the final analysis, you cannot protect your objects from inspection and manipulation.
Also, always ask yourself "from whom, exactly?" when you want to "protect" data.
Sometimes it's just not worth the effort to code around users not reading the documentation.
That being said, you could consider return tuple(self.__x) in the getter.
On the other hand, if __x contains other mutable objects, that would not prevent a user from manipulating those inner objects. (return list(self.__x) would also return a shallow copy of the data, but with less implicit "hey, I'm supposed to be immutable!" signaling.)
Something you should definitely consider is to change self.__x = xx to self.__x = list(xx) in the __init__ method, such that users doing
var = []
c = C(var)
can't "easily" (or by mistake, and again, there could be mutable inner objects) change the state of c by mutating var.
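The two suggestions above (copy on the way in, return a tuple on the way out) can be sketched together as follows. This is an illustration only; it uses `_x` instead of the question's `__x` for brevity, and the property has no setter, so assignment fails with `AttributeError`.

```python
class C:
    def __init__(self, xx):
        # Copy on the way in, so later changes to the caller's list do not leak in.
        self._x = list(xx)

    @property
    def x(self):
        # Return a tuple on the way out; tuples have no append() or item assignment.
        return tuple(self._x)


var = [1, 2, 3]
c = C(var)
var.append(4)            # does not affect c
print(c.x)               # (1, 2, 3)

try:
    c.x.append(4)
except AttributeError as exc:
    print("blocked:", exc)

# Caveat from the answer: both copies are shallow, so inner mutable objects are shared.
nested = C([[1], [2]])
nested.x[0].append(9)
print(nested.x)          # ([1, 9], [2])
```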
The simplest approach would be to accept an iterable on __init__ and turn it to a tuple internally:
    class C(object):
        def __init__(self, iterable):
            self._tuple = tuple(iterable)
        @property
        def x(self):
            return self._tuple
        @x.setter
        def x(self, value):
            raise RuntimeError('can\'t reset the x attribute.')

    c = C([1, 2, 3])
    # c.x = 'a'  # would raise "RuntimeError: can't reset the x attribute."
    print(c.x)
A design like this one makes any object instantiated from the class immutable, so that mutating operations should return new objects instead of changing the state of the current one.
Let's say, for instance, that you want to implement a function that increments each item in self.x by one. With this approach you need to write something similar to:
    def increment_by_one(c):
        return C(t+1 for t in c.x)
As there is a cost associated with creating and destroying objects, the trade-offs between this approach (which prevents mutation of the x attribute) and the one suggested by @timgeb should be evaluated for your use case.
I am trying to understand how operator overriding works for two operands of a custom class.
For instance, suppose I have the following:
    class Adder:
        def __init__(self, value=1):
            self.data = value
        def __add__(self,other):
            print('using __add__()')
            return self.data + other
        def __radd__(self,other):
            print('using __radd__()')
            return other + self.data
I initialize the following variables:
x = Adder(5)
y = Adder(4)
And then proceed to do the following operations:
1 + x
using __radd__()
Out[108]: 6
x + 2
using __add__()
Out[109]: 7
The two operations above seem straightforward. If an instance of my custom class is to the right of the "+" in the addition expression, then __radd__ is used. If it is on the left, then __add__ is used. This works for expressions where one operand is of the Adder type and the other is something else.
When I do this, however, I get the following result:
x + y
using __add__()
using __radd__()
Out[110]: 9
As you can see, if both operands are of the custom class, then both __add__ and __radd__ are called.
My question is how does Python unravel this situation and how is it able to call both the right-hand-addition function, as well as the left-hand-addition function.
It's because inside your methods you add the data to other. This is itself an instance of Adder. So the logic goes:
call __add__ on x;
add x.data (an int) to y (an Adder instance)
ah, right-hand operand is an instance with a __radd__ method, so
call __radd__ on y;
add int to y.data (another int).
Usually you would check to see if other was an instance of your class, and if so add other.data rather than just other.
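The steps above can be reproduced by recording the calls instead of printing them. This is a sketch using the question's `Adder` unchanged, apart from a hypothetical `calls` list that logs each method name together with the type of `other`.

```python
calls = []  # hypothetical log of (method name, type of `other`)


class Adder:
    def __init__(self, value=1):
        self.data = value

    def __add__(self, other):
        calls.append(("__add__", type(other).__name__))
        return self.data + other      # int + Adder when other is an Adder

    def __radd__(self, other):
        calls.append(("__radd__", type(other).__name__))
        return other + self.data      # int + int


x, y = Adder(5), Adder(4)
total = x + y

print(total)   # 9
print(calls)   # [('__add__', 'Adder'), ('__radd__', 'int')]
```

The second entry shows that `__radd__` received a plain `int`: it was triggered by the `self.data + other` expression inside `__add__`, not by the original `x + y`.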
That's because the implementations of your __add__ and __radd__ methods do not give any special treatment to instances of the Adder class. Therefore, each __add__ call on two Adder instances leads to an integer-plus-Adder operation, which in turn requires __radd__ because of the Adder instance on the right side.
You can resolve this by doing:
    def __add__(self, other):
        print('using __add__()')
        if isinstance(other, Adder):
            other = other.data
        return self.data + other
    def __radd__(self, other):
        print('using __radd__()')
        return self.__add__(other)
What gets returned when you return self inside a Python class? Where exactly do we use return self? In the example below, what exactly does self return?
    class Fib:
        '''iterator that yields numbers in the Fibonacci sequence'''
        def __init__(self, max):
            self.max = max
        def __iter__(self):
            self.a = 0
            self.b = 1
            return self
        def __next__(self):
            fib = self.a
            if fib > self.max:
                raise StopIteration
            self.a, self.b = self.b, self.a + self.b
            print(self.a, self.b)  # self.c was never defined, so it is not printed
            return fib
Python treats method calls like object.method() approximately like method(object). The docs say that "call x.f() is exactly equivalent to MyClass.f(x)". This means that a method will receive the object as the first argument. By convention in the definition of methods, this first argument is called self.
So self is the conventional name of the object owning the method.
Now, why would we want to return self? In your particular example, it is because the object implements the iterator protocol, which basically means it has __iter__ and __next__ methods. The __iter__ method must (according to the docs) "Return the iterator object itself", which is exactly what is happening here.
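To see the protocol in action, here is the question's `Fib` class (minus the debug print) driven by `iter()` and `list()`. A short sketch; the limit of 20 is an arbitrary choice for illustration.

```python
class Fib:
    """Iterator that yields Fibonacci numbers up to max."""

    def __init__(self, max):
        self.max = max

    def __iter__(self):
        self.a, self.b = 0, 1
        return self              # the object acts as its own iterator

    def __next__(self):
        fib = self.a
        if fib > self.max:
            raise StopIteration
        self.a, self.b = self.b, self.a + self.b
        return fib


f = Fib(20)
print(iter(f) is f)   # True
print(list(f))        # [0, 1, 1, 2, 3, 5, 8, 13]
```

`list()` and `for` loops call `iter()` first and then call `__next__()` on whatever comes back, which is why `__iter__` has to return an object that has a `__next__` method.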
As an aside, another common reason for returning self is to support method chaining, where you would want to do object.method1().method2().method3() where all those methods are defined in the same class. This pattern is quite common in libraries like pandas.
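A minimal sketch of method chaining follows. The `Pipeline` class and its methods are hypothetical and only illustrate the pattern.

```python
class Pipeline:
    """Hypothetical class whose configuring methods return self to allow chaining."""

    def __init__(self):
        self.steps = []

    def strip(self):
        self.steps.append(str.strip)
        return self              # returning self is what makes chaining possible

    def upper(self):
        self.steps.append(str.upper)
        return self

    def run(self, text):
        # Apply the collected steps in the order they were added.
        for step in self.steps:
            text = step(text)
        return text


result = Pipeline().strip().upper().run("  hello ")
print(result)   # HELLO
```

Each call in the chain operates on the same object, because every configuring method hands back the instance it was called on.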
The name self (a convention, not a keyword) refers to the instance on which the method is called.
Returning it is particularly useful for chaining. In your example, let's say we want to call __next__() on an initialized Fib instance. Since __iter__() returns self, the following are equivalent:
obj = Fib(5)
obj.__iter__() # Initialize obj
obj.__next__()
And
obj = Fib(5).__iter__() # Create AND initialize obj
obj.__next__()
In your particular example, return self returns the instance of the Fib class on which you are calling __iter__() (called obj in my small snippet).
Hope it'll be helpful.
Partial Answer:
When you return self, you return the class instance. For example:
    class Foo:
        def __init__(self, a):
            self.a = a
        def ret_self(self):
            return self
If I create an instance and run ret_self, you will see that they both refer to the same instance:
>>> x = Foo("a")
>>> x
<__main__.Foo instance at 0x0000000002823D48>
>>> x.ret_self()
<__main__.Foo instance at 0x0000000002823D48>
In other words, both x and x.ret_self() return the same reference to that class instance.
self is actually another way of saying "this instance of Foo". Hence, instance variables are self.a in the class.
When will you need this? I don't have the experience to tell you and I do not want to give possibly misleading information that I am unsure of. I will leave it to someone else to expound on this answer.
Please do not accept this answer.
What is the difference between using a special method and just defining a normal class method? I was reading this site which lists a lot of them.
For example it gives a class like this.
    class Word(str):
        '''Class for words, defining comparison based on word length.'''
        def __new__(cls, word):
            # Note that we have to use __new__. This is because str is an immutable
            # type, so we have to initialize it early (at creation)
            if ' ' in word:
                print("Value contains spaces. Truncating to first space.")
                word = word[:word.index(' ')] # Word is now all chars before first space
            return str.__new__(cls, word)
        def __gt__(self, other):
            return len(self) > len(other)
        def __lt__(self, other):
            return len(self) < len(other)
        def __ge__(self, other):
            return len(self) >= len(other)
        def __le__(self, other):
            return len(self) <= len(other)
For each of those special methods why can't I just make a normal method instead, what are they doing different? I think I just need a fundamental explanation that I can't find, thanks.
It is the Pythonic way to write this:
    word1 = Word('first')
    word2 = Word('second')
    if word1 > word2:
        pass
instead of calling a comparator method directly:
    class NotMagicWord(str):
        def is_greater(self, other):
            return len(self) > len(other)

    word1 = NotMagicWord('first')
    word2 = NotMagicWord('second')
    if word1.is_greater(word2):
        pass
The same goes for all the other magic methods. For example, you define a __len__ method to tell Python an object's length when the built-in len function is called on it. Magic methods are called implicitly during standard operations such as binary operators, object calls, comparisons, and many others. A Guide to Python's Magic Methods is really good; read it to see what behavior you can give your objects. It is similar to operator overloading in C++, if you are familiar with that.
A method like __gt__ is called when you use comparison operators in your code. Writing something like
value1 > value2
is roughly equivalent to writing
value1.__gt__(value2)
"Magic methods" are used by Python to implement a lot of its underlying structure.
For example, let's say I have a simple class to represent an (x, y) coordinate pair:
    class Point(object):
        def __init__(self, x, y):
            self.x = x
            self.y = y
So, __init__ would be an example of one of these "magic methods" -- it allows me to automatically initialize the class by simply doing Point(3, 2). I could write this without using magic methods by creating my own "init" function, but then I would need to make an explicit method call to initialize my class:
    class Point(object):
        def init(self, x, y):
            self.x = x
            self.y = y
            return self

    p = Point().init(3, 2)
Let's take another example -- if I wanted to compare two point variables, I could do:
    class Point(object):
        def __init__(self, x, y):
            self.x = x
            self.y = y
        def __eq__(self, other):
            return self.x == other.x and self.y == other.y
This lets me compare two points by doing p1 == p2. In contrast, if I made this a normal eq method, I would have to be more explicit by doing p1.eq(p2).
Basically, magic methods are Python's way of implementing a lot of its syntactic sugar in a way that allows it to be easily customizable by programmers.
For example, I could construct a class that pretends to be a function by implementing __call__:
    class Foobar(object):
        def __init__(self, a):
            self.a = a
        def __call__(self, b):
            return self.a + b  # the stored value must be read from self

    f = Foobar(3)
    print(f(4))  # prints 7
Without the magic method, I would have to manually do f.call(4), which means I can no longer pretend the object is a function.
Special methods are handled specially by the rest of the Python language. For example, if you try to compare two Word instances with <, the __lt__ method of Word will be called to determine the result.
The magic methods are called when you use <, ==, > to compare the objects. functools has a helper called total_ordering that will fill in the missing comparison methods if you just define __eq__ and __gt__.
Because str already has all the comparison operations defined, it's necessary to add them as a mixin if you want to take advantage of total_ordering
    from functools import total_ordering

    @total_ordering
    class OrderByLen(object):
        def __eq__(self, other):
            return len(self) == len(other)
        def __ne__(self, other):
            # Defined explicitly so that Word does not inherit str.__ne__,
            # which compares the text rather than the length.
            return not self == other
        def __gt__(self, other):
            return len(self) > len(other)

    class Word(OrderByLen, str):
        '''Class for words, defining comparison based on word length.'''
        def __new__(cls, word):
            # Note that we have to use __new__. This is because str is an immutable
            # type, so we have to initialize it early (at creation)
            if ' ' in word:
                print("Value contains spaces. Truncating to first space.")
                word = word[:word.index(' ')] # Word is now all chars before first space
            return str.__new__(cls, word)

    print(Word('cat') < Word('dog'))        # False
    print(Word('cat') > Word('dog'))        # False
    print(Word('cat') == Word('dog'))       # True
    print(Word('cat') <= Word('elephant'))  # True
    print(Word('cat') >= Word('elephant'))  # False
I have a class that need to make some magic with every operator, like __add__, __sub__ and so on.
Instead of creating each function in the class, I have a metaclass which defines every operator in the operator module.
    import operator

    class MetaFuncBuilder(type):
        def __init__(self, *args, **kw):
            super().__init__(*args, **kw)
            attr = '__{0}{1}__'
            for op in (x for x in dir(operator) if not x.startswith('__')):
                oper = getattr(operator, op)
                # ... I have my magic replacement functions here
                # `func` for `__operators__` and `__ioperators__`
                # and `rfunc` for `__roperators__`
                setattr(self, attr.format('', op), func)
                setattr(self, attr.format('r', op), rfunc)
The approach works fine, but I think it would be better to generate the replacement operator only when needed.
Operator lookup should go through the metaclass, because x + 1 is evaluated as type(x).__add__(x, 1) rather than x.__add__(1), but it is not caught by the __getattr__ or __getattribute__ methods.
That doesn't work:
    class Meta(type):
        def __getattr__(self, name):
            if name in ['__add__', '__sub__', '__mul__', ...]:
                func = lambda:... #generate magic function
                return func
Also, the resulting "function" must be a method bound to the instance used.
Any ideas on how can I intercept this lookup? I don't know if it's clear what I want to do.
For those questioning why do I need to this kind of thing, check the full code here.
That's a tool to generate functions (just for fun) that could work as replacement for lambdas.
Example:
>>> f = FuncBuilder()
>>> g = f ** 2
>>> g(10)
100
>>> g
<var [('pow', 2)]>
Just for the record, I don't want to know another way to do the same thing (I won't declare every single operator on the class... that will be boring and the approach I have works pretty fine :). I want to know how to intercept attribute lookup from an operator.
Some black magic lets you achieve your goal:
    operators = ["add", "mul"]

    class OperatorHackiness(object):
        """
        Use this base class if you want your object
        to intercept __add__, __iadd__, __radd__, __mul__ etc.
        using __getattr__.
        __getattr__ will be called at most _once_ during the
        lifetime of the object, as the result is cached!
        """
        def __init__(self):
            # create an instance-local base class which we can
            # manipulate to our needs
            self.__class__ = self.meta = type('tmp', (self.__class__,), {})

    # add operator methods dynamically, because we are damn lazy.
    # This loop is however only called once in the whole program
    # (when the module is loaded)
    def create_operator(name):
        def dynamic_operator(self, *args):
            # call getattr to allow interception
            # by user
            func = self.__getattr__(name)
            # save the result in the temporary
            # base class to avoid calling getattr twice
            setattr(self.meta, name, func)
            # use provided function to calculate result
            return func(self, *args)
        return dynamic_operator

    for op in operators:
        for name in ["__%s__" % op, "__r%s__" % op, "__i%s__" % op]:
            setattr(OperatorHackiness, name, create_operator(name))

    # Example user class
    class Test(OperatorHackiness):
        def __init__(self, x):
            super(Test, self).__init__()
            self.x = x
        def __getattr__(self, attr):
            print("__getattr__(%s)" % attr)
            if attr == "__add__":
                return lambda a, b: a.x + b.x
            elif attr == "__iadd__":
                def iadd(self, other):
                    self.x += other.x
                    return self
                return iadd
            elif attr == "__mul__":
                return lambda a, b: a.x * b.x
            else:
                raise AttributeError

    ## Some test code:
    a = Test(3)
    b = Test(4)
    # let's test addition
    print(a + b) # this first call to __add__ will trigger
                 # a __getattr__ call
    print(a + b) # this second call will not!
    # same for multiplication
    print(a * b)
    print(a * b)
    # inplace addition (getattr is also only called once)
    a += b
    a += b
    print(a.x) # yay!
Output
__getattr__(__add__)
7
7
__getattr__(__mul__)
12
12
__getattr__(__iadd__)
11
Now you can use your second code sample literally by inheriting from my OperatorHackiness base class. You even get an additional benefit: __getattr__ will only be called once per instance and operator and there is no additional layer of recursion involved for the caching. We hereby circumvent the problem of method calls being slow compared to method lookup (as Paul Hankin noticed correctly).
NOTE: The loop to add the operator methods is only executed once in your whole program, so the preparation takes constant overhead in the range of milliseconds.
The issue at hand is that Python looks up __xxx__ methods on the object's class, not on the object itself -- and if it is not found, it does not fall back to __getattr__ nor __getattribute__.
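A short sketch of that lookup rule follows. The class names `Plain` and `WithMethod` are hypothetical and used only for illustration.

```python
class Plain:
    def __getattr__(self, name):
        # Only reached when an *explicit* attribute access fails.
        if name == "__add__":
            return lambda other: "from __getattr__"
        raise AttributeError(name)


p = Plain()
print(p.__add__(1))          # from __getattr__

try:
    p + 1                    # the operator looks on type(p) only
except TypeError as exc:
    print("operator ignored __getattr__:", exc)


class WithMethod:
    def __add__(self, other):
        return "from the class"


w = WithMethod()
w.__add__ = lambda other: "from the instance"   # instance attribute
print(w.__add__(1))          # from the instance
print(w + 1)                 # from the class
```

Explicit attribute access honors both `__getattr__` and instance attributes, while the `+` operator consults only the class, which is why a stub has to exist there.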
The only way to intercept such calls is to have a method already there. It can be a stub function, as in Niklas Baumstark's answer, or it can be the full-fledged replacement function; either way, however, there must be something already there or you will not be able to intercept such calls.
If you are reading closely, you will have noticed that your requirement for having the final method be bound to the instance is not a possible solution -- you can do it, but Python will never call it as Python is looking at the class of the instance, not the instance, for __xxx__ methods. Niklas Baumstark's solution of making a unique temp class for each instance is as close as you can get to that requirement.
It looks like you are making things too complicated. You can define a mixin class and inherit from it. This is both simpler than using metaclasses and will run faster than using __getattr__.
    class OperatorMixin(object):
        def __add__(self, other):
            return func(self, other)
        def __radd__(self, other):
            return rfunc(self, other)
        # ... other operators defined too
Then have every class that needs these operators inherit from OperatorMixin.
    class Expression(OperatorMixin):
        # ... the regular methods for your class
Generating the operator methods when they're needed isn't a good idea: __getattr__ is slow compared to regular method lookup, and since the methods are stored once (on the mixin class), it saves almost nothing.
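If writing each operator by hand is the objection, one possible middle ground is to fill the mixin once, at import time, with a loop over the `operator` module. This is a sketch under assumptions: `_make_op`, `Number`, and the `value` attribute are hypothetical stand-ins for the question's `func` and `rfunc`, and only three operators are shown.

```python
import operator


def _make_op(op_name, reflected=False):
    """Build a method that applies operator.<op_name> to the wrapped value."""
    fn = getattr(operator, op_name)

    def method(self, other):
        # For reflected operators (__radd__ etc.) the operands are swapped.
        a, b = (other, self.value) if reflected else (self.value, other)
        return type(self)(fn(a, b))

    return method


class OperatorMixin:
    pass


# Runs once, when the module is loaded; the methods then live on the class.
for _name in ("add", "sub", "mul"):
    setattr(OperatorMixin, "__%s__" % _name, _make_op(_name))
    setattr(OperatorMixin, "__r%s__" % _name, _make_op(_name, reflected=True))


class Number(OperatorMixin):
    def __init__(self, value):
        self.value = value


print((Number(2) + 3).value)    # 5
print((10 - Number(4)).value)   # 6
```

Because the methods are ordinary class attributes, the operators find them through normal type-level lookup and no `__getattr__` is involved.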
If you want to achieve your goal without metaclasses, you can append the following to your code:
    def get_magic_wrapper(name):
        def wrapper(self, *a, **kw):
            print('Wrapping')
            res = getattr(self._data, name)(*a, **kw)
            return res
        return wrapper

    _magic_methods = ['__str__', '__len__', '__repr__']
    for _mm in _magic_methods:
        setattr(ShowMeList, _mm, get_magic_wrapper(_mm))
It reroutes the methods in _magic_methods to the self._data object, by adding these attributes to the class iteratively. To check if it works:
>>> l = ShowMeList(range(8))
>>> len(l)
Wrapping
8
>>> l
Wrapping
[0, 1, 2, 3, 4, 5, 6, 7]
>>> print(l)
Wrapping
[0, 1, 2, 3, 4, 5, 6, 7]