Trying to understand super and __new__
Here goes my code:
class Base(object):
    def __new__(cls, foo):
        if cls is Base:
            if foo == 1:
                # return Base.__new__(Child) complains not enough arguments
                return Base.__new__(Child, foo)
            if foo == 2:
                # how does this work without giving foo?
                return super(Base, cls).__new__(Child)
        else:
            return super(Base, cls).__new__(cls, foo)

    def __init__(self, foo):
        pass

class Child(Base):
    def __init__(self, foo):
        Base.__init__(self, foo)

a = Base(1)  # returns instance of class Child
b = Base(2)  # returns instance of class Child
c = Base(3)  # returns instance of class Base
d = Child(1)  # returns instance of class Child
Why doesn't super.__new__ need an argument while __new__ needs it?
Python: 2.7.11
super().__new__ is not the same function as Base.__new__. super().__new__ is object.__new__. object.__new__ doesn't require a foo argument, but Base.__new__ does.
>>> Base.__new__
<function Base.__new__ at 0x000002243340A730>
>>> super(Base, Base).__new__
<built-in method __new__ of type object at 0x00007FF87AD89EC0>
>>> object.__new__
<built-in method __new__ of type object at 0x00007FF87AD89EC0>
What may be confusing you is this line:
return super(Base,cls).__new__(cls, foo)
This calls object.__new__(cls, foo). That's right: it passes a foo argument to object.__new__ even though object.__new__ doesn't need it. This is allowed in Python 2, but raises a TypeError in Python 3. It would be best to remove the foo argument from that call.
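As a rough sketch of that advice, here is a variant of the factory that should also run on Python 3. It assumes the only goal is to hand back a Child when Base is called with 1 or 2; storing foo in __init__ is an addition for illustration and is not in the original code. The foo argument is no longer forwarded to object.__new__:

```python
class Base(object):
    def __new__(cls, foo):
        # Redirect to Child only when Base itself is being instantiated.
        if cls is Base and foo in (1, 2):
            cls = Child
        # super(Base, cls).__new__ is object.__new__, which only needs the class.
        return super(Base, cls).__new__(cls)

    def __init__(self, foo):
        # Illustrative addition: keep foo so the instance can be inspected.
        self.foo = foo


class Child(Base):
    pass


a = Base(1)   # Child instance
c = Base(3)   # Base instance
d = Child(1)  # Child instance
```

Because the object returned by __new__ is still an instance of Base, __init__ is called on it with the original arguments, so foo is consumed there rather than in object.__new__.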
##### Case 1, use property #####
class Case1:
    # ignore getter and setter for property
    var = property(getter, setter, None, None)

##### Case 2, use equivalent methods ####
class Descriptor:
    def __get__(self, obj, type=None):
        return None

    def __set__(self, obj, val):
        pass

class Case2:
    var = Descriptor()
My question is:
When I use 'property' to control the access of one variable,
instance.var will correctly return the real value,
while Class.var will return the property object itself (e.g. property object at 0x7fc5215ac788)
But when I use equivalent methods (e.g. descriptor) and override __get__ and __set__ methods,
both instance.var and Class.var can return the real value instead of the object itself.
So why do they behave so differently?
I guess it is because some of the default functions implemented in my descriptor make the magic happen, so what are they?
update:
The reason for the above behaviour is that the __get__ method implemented by property checks whether it is called through an instance or through the class, and when it is called through the class it returns the property object itself (i.e. self).
However, __set__ has no type or cls parameter and, based on my test, Class.var = 5 is not intercepted by __set__.
Therefore, I wonder what hooks we can use to customize class-level assignment, i.e. Class.var = value?
When you do MyClass.some_descriptor, there's (obviously) no instance to be passed to the descriptor, so it is invoked with obj=None:
>>> class Desc(object):
...     def __get__(self, obj, cls=None):
...         print "obj : {} - cls : {}".format(obj, cls)
...         return 42
...
>>> class Foo(object):
...     bar = Desc()
...
>>> Foo.bar
obj : None - cls : <class '__main__.Foo'>
42
>>> Foo().bar
obj : <__main__.Foo object at 0x7fd285cf4a90> - cls : <class '__main__.Foo'>
42
>>>
In most cases (and especially with the generic property descriptor) the goal is to compute the return value based on instance attributes, so there's not much you can return without the instance. In this case, most authors choose to return the descriptor instance itself so it can be correctly identified for what it is when inspecting the class.
If you want this behaviour (which makes sense for most descriptors), you just have to test obj against None and return self:
>>> class Desc2(object):
...     def __get__(self, obj, cls=None):
...         if obj is None:
...             return self
...         return 42
...
>>> Foo.baaz = Desc2()
>>> Foo.baaz
<__main__.Desc2 object at 0x7fd285cf4b10>
>>> Foo().baaz
42
>>>
And that's all the "magic" involved.
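To put the two behaviours side by side, here is a small sketch that should run on both Python 2 and Python 3. The names Constant, Foo, bar and prop are hypothetical and chosen only for illustration; the point is that a hand-written descriptor can mimic property by returning itself on class access:

```python
class Constant(object):
    """Hypothetical non-data descriptor that mimics property on class access."""

    def __init__(self, value):
        self.value = value

    def __get__(self, obj, cls=None):
        if obj is None:
            # Looked up on the class: no instance available, return the descriptor.
            return self
        # Looked up on an instance: return the "real" value.
        return self.value


class Foo(object):
    bar = Constant(42)
    prop = property(lambda self: 42)


class_level = Foo.bar        # the Constant object itself
instance_level = Foo().bar   # 42
```

With this check in place, Foo.bar and Foo.prop behave alike: both give back the descriptor object, while access through an instance gives the value.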
Now if you wonder why this is not the default: there are use cases for returning something else for a descriptor looked up on a class - methods for example (yes, Python functions are descriptors - their __get__ method returns a method object, which is actually a callable wrapper around the instance (if any), class and function):
>>> Foo.meth = lambda self: 42
>>> Foo.meth
<unbound method Foo.<lambda>>
>>> Foo().meth
<bound method Foo.<lambda> of <__main__.Foo object at 0x7fd285cf4bd0>>
>>> Foo.meth(Foo())
42
I am in the process of learning Python 3 and just ran into the __getattr__ method. From what I can tell, it is invoked when the attribute being accessed is not found in the class definition as a function or a variable.
In order to understand the behaviour, I wrote the following test class (based on what I've read):
class Test(object):
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    def __getattr__(self, itm):
        if itm is 'test':
            return lambda x: "%s%s" % (x.foo, x.bar)
        raise AttributeError(itm)
I then instantiate my object and access the non-existent function test, which, as expected, returns a reference to the function:
t = Test("Foo", "Bar")
print(t.test)
<function Test.__getattr__.<locals>.<lambda> at 0x01A138E8>
However, if I call the function, the result is not the expected "FooBar", but an error:
print(t.test())
TypeError: <lambda>() missing 1 required positional argument: 'x'
In order to get my expected results, I need to call the function with the same object as the first parameter, like this:
print(t.test(t))
FooBar
I find this behaviour rather strange, as calling p.some_function() is said to add p as the first argument.
I would be grateful if someone could shine some light over this headache of mine. I am using PyDev in Eclipse.
Values returned by __getattr__ are "raw": unlike class attributes, they do not go through the descriptor protocol, which is what turns plain functions into bound methods (where self is passed implicitly). To bind the function as a method, you need to perform the binding manually:
import types
...
    def __getattr__(self, itm):
        if itm is 'test':  # Note: This should really be == 'test', not is 'test'
            # Explicitly bind function to self
            return types.MethodType(lambda x: "%s%s" % (x.foo, x.bar), self)
        raise AttributeError(itm)
types.MethodType is poorly documented (the interactive help is more helpful), but basically, you pass it a user-defined function and an instance of a class and it returns a bound method that, when called, implicitly passes that instance as the first positional argument (the self argument).
Note that in your specific case, you could just rely on closure scope to make a zero-argument function continue to work:
    def __getattr__(self, itm):
        if itm is 'test':  # Note: This should really be == 'test', not is 'test'
            # No binding, but referring to self captures it in closure scope
            return lambda: "%s%s" % (self.foo, self.bar)
        raise AttributeError(itm)
Now it's not a bound method at all, just a function that happens to have captured self from the scope in which it was defined (the __getattr__ call). Which solution is best depends on your needs; creating a bound method is slightly slower, but gets a true bound method, while relying on closure scope is (trivially, ~10ns out of >400ns) faster, but returns a plain function (which may be a problem if, for example, it's passed as a callback to code that assumes it's a bound method and can have __self__ and __func__ extracted separately for instance).
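The difference between the two approaches can be seen in a combined sketch. The attribute names bound and closure are made up for this illustration (the question only uses test), and == is used for the string comparison as recommended above:

```python
import types


class Test(object):
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    def __getattr__(self, itm):
        if itm == 'bound':
            # Explicit binding: the result is a real bound method.
            return types.MethodType(lambda x: "%s%s" % (x.foo, x.bar), self)
        if itm == 'closure':
            # No binding: the lambda simply captures self from this call.
            return lambda: "%s%s" % (self.foo, self.bar)
        raise AttributeError(itm)


t = Test("Foo", "Bar")
bound = t.bound
closure = t.closure
```

Both bound() and closure() return "FooBar", but only bound exposes __self__ and __func__; closure is a plain function and has neither.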
To get what you want, you need a lambda that doesn't take arguments:
return lambda: "%s%s" % (self.foo, self.bar)
But you should really use a property for this, instead.
class Test(object):
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    @property
    def test(self):
        return "{}{}".format(self.foo, self.bar)

t = Test("Foo", "Bar")
print(t.test)
# FooBar
Note the lack of parentheses.
If you're absolutely determined that it must be a function, do this:
class Test(object):
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    @property
    def test(self):
        return lambda: "{}{}".format(self.foo, self.bar)

t = Test("Foo", "Bar")
print(t.test())
# FooBar
You need to create something that behaves like a bound method. You could simply use functools.partial to bind the instance to the function:
from functools import partial

class Test(object):
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    def __getattr__(self, itm):
        if itm == 'test':  # you shouldn't use "is" for comparisons!
            return partial(lambda x: "%s%s" % (x.foo, x.bar), self)
        raise AttributeError(itm)
The test:
t = Test("Foo", "Bar")
print(t.test)
# functools.partial(<function Test.__getattr__.<locals>.<lambda> at 0x0000020C70CA6510>, <__main__.Test object at 0x0000020C7217F8D0>)
print(t.test())
# FooBar
"I find this behaviour rather strange, as calling
p.some_function() is said to add p as the first argument."
some_function is actually a method, which is why it is passed the instance implicitly once the method is "bound to an object". Plain functions don't work that way; only functions stored on the class (typically defined in the class body) get this treatment automatically. In fact, methods accessed directly via the class behave exactly like normal functions. The "bound versus unbound" terminology no longer applies, because Python 3 only has methods and functions (it got rid of the distinction between unbound methods and plain functions). Accessing the attribute on an instance returns a method, which implicitly passes the instance as the first argument when called.
>>> class A:
...     def method(self, x):
...         return x
...
>>> a = A()
>>> a.method
<bound method A.method of <__main__.A object at 0x101a5b3c8>>
>>> type(a.method)
<class 'method'>
However, if you access the attribute of the class you'll see it's just a function:
>>> A.method
<function A.method at 0x101a64950>
>>> type(A.method)
<class 'function'>
Now, observe:
>>> bound = a.method
>>> bound(42)
42
>>> unbound = A.method
>>> unbound(42)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: method() missing 1 required positional argument: 'x'
But this is the magic of classes. Note, you can even add functions to classes dynamically, and they get magically turned into methods when you invoke them on an instance:
>>> A.method2 = lambda self, x: x*2
>>> a2 = A()
>>> a2.method2(4)
8
And, as one would hope, the behavior still applies to objects already created!
>>> a.method2(2)
4
Note, this doesn't work if you dynamically add to an instance:
>>> a.method3 = lambda self, x: x*3
>>> a.method3(3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: <lambda>() missing 1 required positional argument: 'x'
You have to do the magic yourself:
>>> from types import MethodType
>>> a.method4 = MethodType((lambda self, x: x*4), a)
>>> a.method4(4)
16
>>>
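The "magic" is the descriptor protocol: functions have a __get__ method, and looking one up through an instance calls it to produce the bound method. As a sketch (the names triple and a.triple are hypothetical), the same binding can be triggered by hand instead of going through MethodType:

```python
class A(object):
    def method(self, x):
        return x


def triple(self, x):
    return x * 3


a = A()

# Roughly what attribute lookup does for a function stored on the class.
bound_by_hand = A.__dict__['method'].__get__(a, A)

# Binding a free function to one instance, equivalent in effect to MethodType(triple, a).
a.triple = triple.__get__(a, A)
```

Here bound_by_hand(5) returns 5 and a.triple(3) returns 9, without passing the instance explicitly in either call.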
Notice that if you do print(t.__getattr__) you get something like <bound method Test.__getattr__ of <__main__.Test object at 0x00000123FBAE4DA0>>. The key point is that methods defined on an object are said to be 'bound' and so always take the object as the first parameter. Your lambda function is just an anonymous function not 'bound' to anything, so for it to access the object it needs to be explicitly passed in.
I presume you are only doing this to experiment with __getattr__, as what you are doing could be achieved much more easily by making your lambda a method on the object.
Could you explain why the following code snippet doesn't work?
class A:
    @staticmethod
    def f():
        print('A.f')

    dict = {'f': f}

def callMe(g):
    g()

callMe(A.dict['f'])
It yields
TypeError: 'staticmethod' object is not callable
Interestingly, changing it to
class A:
    @staticmethod
    def f():
        print('A.f')

    dict = {'f': f}

def callMe(g):
    g()

callMe(A.f)
or to
class A:
    @staticmethod
    def f():
        print('A.f')

    dict = {'f': lambda: A.f()}

def callMe(g):
    g()

callMe(A.dict['f'])
gives the expected result
A.f
As far as I see the behaviour is the same in Python 2 and 3.
The f object inside A is a descriptor, not the static method itself: its __get__ returns the underlying function when f is looked up as an attribute of A or of an instance of A. Look up the "descriptor protocol" for more info on how this works. The function itself is stored as the __func__ attribute of the descriptor.
You can see this for yourself:
>>> A.f
<function A.f at 0x7fa8acc7ca60>
>>> A.__dict__['f']
<staticmethod object at 0x7fa8acc990b8>
>>> A.__dict__['f'].__func__ # The stored method
<function A.f at 0x7fa8acc7ca60>
>>> A.__dict__['f'].__get__(A) # This is (kinda) what happens when you run A.f
<function A.f at 0x7fa8acc7ca60>
Also note that you can use A.__dict__ to access the f descriptor object, you don't need to make your own dictionary to store it.
The staticmethod object is a descriptor, and you need to access it as an attribute (of the class) for the descriptor mechanism to take effect. The staticmethod object itself is not callable, but the result of its __get__ is callable. See also this Python bug discussion.
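One way to keep the dictionary inside the class body, sketched here with the hypothetical names dispatch and call_me, is to store the underlying function via __func__ rather than the staticmethod object. Note that from Python 3.10 onwards staticmethod objects are themselves callable, so the original snippet only fails on older versions:

```python
class A(object):
    @staticmethod
    def f():
        return 'A.f'

    # Inside the class body, f is still the raw staticmethod object,
    # so unwrap it to get a plain callable function.
    dispatch = {'f': f.__func__}


def call_me(g):
    return g()


via_dict = call_me(A.dispatch['f'])
via_attr = call_me(A.f)
```

Both calls return 'A.f'; the first avoids the descriptor lookup entirely, while the second relies on it.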
I just realized that all user-defined functions have a __get__ that allows them to operate as descriptors when they are used in a class. This __get__ returns a <bound method object> when invoked in the context of an instance (myinstance.method) and returns the original function when invoked in the context of a class (MyClass.method).
I was trying to get methods to behave like attributes (in the same way as with @property, but without the side effect of a data descriptor, i.e. not being overridable from instances). I succeeded by creating a non-data descriptor that just invokes the original method on __get__, but when I discovered that functions are already descriptors, I tried to change the function's __get__ to do the invocation directly instead of returning a <bound method ...>, without luck.
Here is my attempt:
class A(object):
    def m(self):
        self.m = 20  # to show that .m is overridable at instance level
        return 10

def alternative__get(self, instance, owner):
    if instance is None:
        return self
    else:
        return self.__func__(instance)

print(A.__dict__['m'].__get__)  # => <method-wrapper '__get__' of function object ...>
A.__dict__['m'].__get__ = alternative__get
print(A.__dict__['m'])  # <function m ..>
print(A.__dict__['m'].__get__)  # <function alternative__get ...>
print(A.m)  # <unbound method A.m>
a = A()
print(a.m)  # <bound method A.m of ...>
print(a.m)  # <bound method A.m of ...>
This doesn't have the desired effect: a.m still resolves to <bound method...> instead of directly returning the result of invoking A.m(a).
My current approach is to define a descriptor:
class attribute(object):
    def __init__(self, fget):
        self.fget = fget

    def __get__(self, instance, owner):
        if instance is None:
            return self
        else:
            return self.fget(instance)

class A(object):
    def __init__(self):
        pass

    @attribute
    def m(self):
        self.m = 20
        return 10

a = A()
print(a.m)  # => 10
# a.m = 30
print(a.m)  # => 20 because it's overridden at instance level
This approach works but I would like to know if it's possible to change A.m.__get__ to achieve the same effect, or why it can't work.
You can't do this by setting A.m.__get__, because the Python language internals skip the instance dict when looking up special methods like __get__. (This is so, for example, a class Foo that defines a __repr__ method uses type.__repr__ instead of Foo.__repr__ when you do repr(Foo).)
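A small sketch of this rule, using hypothetical names (Greeter, patched_get): implicit special-method lookups go to the type, so an attribute of the same name stored on the object itself is ignored by the machinery even though an explicit attribute access still finds it.

```python
class Greeter(object):
    def __repr__(self):
        return "class-level repr"


g = Greeter()
# Stored in the instance dict; repr() never looks here.
g.__repr__ = lambda: "instance-level repr"


def m(self):
    return 10


def patched_get(instance, owner):
    return "patched"


# Stored in the function's own __dict__; attribute lookup never uses it.
m.__get__ = patched_get


class A(object):
    pass


A.m = m
a = A()
```

repr(g) still gives "class-level repr", and a.m is still an ordinary bound method returning 10; only an explicit call such as m.__get__(a, A) reaches the patched function. This is why a separate descriptor class, like attribute in the question, is needed.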