I am new to Python. I think non-class objects do not have a __bases__ attribute whereas class objects do, but I am not sure. How does Python/CPython check whether an object is a class or a non-class, and pass the correct arguments to the descriptor's __get__ accordingly during attribute access?
============================================
updated:
I was learning how __getattribute__ and descriptors cooperate to make bound methods. I was wondering how class objects and non-class objects invoke the descriptor's __get__ differently. I thought those two kinds of objects shared the same __getattribute__ CPython function, and that this function would have to know whether the invoking object was a class or a non-class. But I was wrong. This article explains it well:
http://docs.python.org/dev/howto/descriptor.html#functions-and-methods
So class objects use type.__getattribute__ whereas non-class objects use object.__getattribute__. They are different CPython functions. And super has a third __getattribute__ CPython implementation as well.
However, about the super one, the above article states that:
quote and quote
The object returned by super() also has a custom __getattribute__() method for invoking descriptors. The call super(B, obj).m() searches obj.__class__.__mro__ for the base class A immediately following B and then returns A.__dict__['m'].__get__(obj, A). If not a descriptor, m is returned unchanged. If not in the dictionary, m reverts to a search using object.__getattribute__().
The statement above didn't seem to match my experiment with Python 3.1. What I saw, which seems reasonable to me, was:
super(B, obj).m ---> A.__dict__['m'].__get__(obj, type(obj))
objclass = type(obj)
super(B, objclass).m ---> A.__dict__['m'].__get__(None, objclass)
A was never passed to __get__.
This seems reasonable to me because I believe the MRO chain of objclass (rather than that of A) is the one needed within m, especially in the second case.
Was I doing something wrong? Or did I misunderstand the article?
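The experiment described above can be reproduced with a small recording descriptor. This is only a sketch: Recorder and the classes A, B, C are hypothetical names introduced for illustration, and the behaviour shown is what CPython 3 does in practice rather than something quoted from the article.

```python
class Recorder:
    """Hypothetical descriptor that returns the arguments __get__ received."""

    def __get__(self, instance, owner=None):
        return (instance, owner)


class A:
    m = Recorder()


class B(A):
    pass


class C(B):
    pass


obj = C()

# Lookup through an instance: __get__ receives (obj, type(obj)).
via_instance = super(B, obj).m

# Lookup through a class: __get__ receives (None, that class).
via_class = super(B, C).m

print(via_instance[0] is obj, via_instance[1] is C)
print(via_class[0] is None, via_class[1] is C)
```

In both cases the owner argument is the starting type (C here), not A, even though the descriptor itself was found in A.__dict__.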
As the commenters asked: Why do you care? Usually that's a sign of not using Python the way it was meant to be used.
A very powerful concept of Python is duck typing. You don't care about the type or class of an object as long as it exposes the attributes you need.
how about inspect.isclass(objectname)?
more info here: http://docs.python.org/library/inspect.html
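A minimal sketch of that check, assuming a throwaway class named Foo (not from the question). inspect.isclass is part of the standard library; for ordinary Python 3 classes it agrees with isinstance(obj, type).

```python
import inspect


class Foo:
    pass


foo = Foo()

is_class_for_class = inspect.isclass(Foo)     # Foo is a class object
is_class_for_instance = inspect.isclass(foo)  # foo is an ordinary instance

print(is_class_for_class, is_class_for_instance)
print(isinstance(Foo, type), isinstance(foo, type))
```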
I stumbled upon this nice trick to dynamically assign a bound method to a class instance in Python:
class X: pass
def f(self): pass
x = X()
x.f = f.__get__(x)
What I want to know is where this behavior is specified in the reference. Here's the closest I've found:
PEP 590
Descriptor HowTo Guide
I'd like to know if this behavior is in fact specified in the language reference somewhere.
It seems like an important enough use case to be guaranteed by the documentation (i.e. it's not clear if what appears in a HowTo demonstrates a guaranteed language feature or makes use of an implementation detail, and I'd like to think that, in principle, all guaranteed functionality can be deduced from the spec without referring to PEPs).
You're probably looking for this bit:
object.__get__(self, instance, owner=None)
Called to get the attribute of the owner class (class attribute access) or of an instance of that class (instance attribute access). The optional owner argument is the owner class.
You're essentially calling function.__get__, whose rather simple implementation (in CPython anyway) is here; it calls PyMethod_New, which just binds a function to a self.
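As a small sketch of what that binding produces, the trick from the question can be inspected with the standard types module. The function body here returns self purely so the result can be checked; that detail is an addition for illustration.

```python
import types


class X:
    pass


def f(self):
    return self


x = X()
x.f = f.__get__(x)  # bind f to x by calling the function's __get__ directly

# The result is an ordinary bound method wrapping f and x.
print(isinstance(x.f, types.MethodType))
print(x.f.__func__ is f, x.f.__self__ is x)
print(x.f() is x)
```

types.MethodType(f, x) builds the same kind of object explicitly, which some readers may find clearer than calling __get__ by hand.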
Lately, I've been studying Python's class instantiation process to really understand what happens under the hood when creating a class instance. But, while playing around with test code, I came across something I don't understand.
Consider this dummy class
class Foo():
    def test(self):
        print("I'm using test()")
Normally, if I wanted to use Foo.test instance method, I would go and create an instance of Foo and call it explicitly like so,
foo_inst = Foo()
foo_inst.test()
>>>> I'm using test()
But, I found that calling it that way ends up with the same result,
Foo.test(Foo)
>>>> I'm using test()
Here I don't actually create an instance, but I'm still accessing Foo's instance method. Why and how is this working in the context of Python ? I mean self normally refers to the current instance of the class, but I'm not technically creating a class instance in this case.
print(Foo()) #This is a Foo object
>>>><__main__.Foo object at ...>
print(Foo) #This is not
>>>> <class '__main__.Foo'>
Props to everyone that led me there in the comments section.
The answer to this question relies on two fundamentals of Python:
Duck-typing
Everything is an object
Indeed, even if self is Python's idiom for referencing the current class instance, you can technically pass whatever object you want because of how Python handles typing.
Now, the other confusion that brought me here is that I wasn't creating an object in my second example. But, the thing is, Foo is already an object internally.
This can be tested empirically like so,
print(type(Foo))
<class 'type'>
So, we now know that Foo is an instance of class type and therefore can be passed as self even though it is not an instance of itself.
Basically, if I were to manipulate self as if it were a Foo instance inside my test method, I would run into problems when calling it as in my second example.
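A minimal sketch of that failure mode, assuming a variant of Foo with an extra __init__ and a describe method (both hypothetical additions, not part of the question's class):

```python
class Foo:
    def __init__(self):
        self.name = "an instance"  # exists only on instances

    def test(self):
        return "I'm using test()"

    def describe(self):
        return self.name  # relies on instance state


# test() never touches self, so any object works as the first argument.
result = Foo.test(Foo)

# describe() needs self.name, which the class object Foo does not have.
error = None
try:
    Foo.describe(Foo)
except AttributeError as exc:
    error = exc

print(result)
print(type(error).__name__)
```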
A few notes on your question (and answer). First, everything really is an object. Even a class is an object, so there is a class of the class (called the metaclass), which is type in this case.
Second, and more relevant to your case: methods are, more or less, class attributes, not instance attributes. In Python, when you have an object obj, an instance of Class, and you access obj.x, Python first looks in obj and then in Class. That is what happens when you access a method from an instance: methods are just special class attributes, so they can be accessed from both the instance and the class. And since you are not using any instance attributes of the self that should be passed to test(self), it does not matter which object is passed.
To understand this in depth, you should read about the descriptor protocol if you are not familiar with it. It explains a lot about how things work in Python. It allows Python classes and objects to be essentially dictionaries with some special attributes (very similar to JavaScript objects and methods).
Regarding the class instantiation, see about __new__ and metaclasses.
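The point that methods live on the class rather than on the instance can be checked directly. This is a sketch assuming Python 3 and a Foo class like the one in the question, changed to return its message instead of printing it.

```python
class Foo:
    def test(self):
        return "I'm using test()"


foo = Foo()

# The function is stored once, in the class dictionary.
in_class_dict = "test" in Foo.__dict__
in_instance_dict = "test" in foo.__dict__

# Accessing it through the instance wraps that same function in a bound method.
same_function = foo.test.__func__ is Foo.__dict__["test"]

print(in_class_dict, in_instance_dict, same_function)
print(type(Foo))  # the metaclass of Foo
```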
This question relates to the posts What does 'super' do in Python?, How do I initialize the base (super) class?, and Python: How do I make a subclass from a superclass?, which describe two ways to initialize a SuperClass from within a SubClass:
class SuperClass:
    def __init__(self):
        return

    def superMethod(self):
        return

## One version of Initiation
class SubClass(SuperClass):
    def __init__(self):
        SuperClass.__init__(self)

    def subMethod(self):
        return
or
class SuperClass:
    def __init__(self):
        return

    def superMethod(self):
        return

## Another version of Initiation
class SubClass(SuperClass):
    def __init__(self):
        super(SubClass, self).__init__()

    def subMethod(self):
        return
So I'm a little confused about needing to explicitly pass self as a parameter in
SuperClass.__init__(self)
and
super(SubClass, self).__init__().
(In fact if I call SuperClass.__init__() I get the error
TypeError: __init__() missing 1 required positional argument: 'self'
). But when calling constructors or any other method of the class, i.e.:
## Calling class constructor / initiation
c = SuperClass()
k = SubClass()
## Calling class methods
c.superMethod()
k.superMethod()
k.subMethod()
), the self parameter is passed implicitly.
My understanding of self is that it is not unlike the this pointer in C++, in that it provides a reference to the class instance. Is this correct?
If there is always a current instance (in this case of SubClass), then why does self need to be explicitly included in the call to SuperClass.__init__(self)?
Thanks
This is simply method binding, and has very little to do with super. When you call x.method(*args), Python checks the type of x for a method named method. If it finds one, it "binds" the function to x, so that when you call it, x will be passed as the first parameter, before the rest of the arguments.
When you call a (normal) method via its class, no such binding occurs. If the method expects its first argument to be an instance (e.g. self), you need to pass it in yourself.
The actual implementation of this binding behavior is pretty neat. Python objects are "descriptors" if they have a __get__ method (and/or __set__ or __delete__ methods, but those don't matter for methods). When you look up an attribute like a.b, Python checks the class of a to see if it has an attribute b that is a descriptor. If it does, it translates a.b into type(a).b.__get__(a, type(a)). If b is a function, it will have a __get__ method that implements the binding behavior I described above. Other kinds of descriptors can have different behaviors. For instance, the classmethod decorator replaces a method with a special descriptor that binds the function to the class, rather than the instance.
Python's super creates special objects that handle attribute lookups differently than normal objects, but the details don't matter too much for this issue. The binding behavior of methods called through super is just like what I described in the first paragraph, so self gets passed automatically to the bound method when it is called. The only thing special about super is that it may bind a different function than you'd get by looking up the same method name on self (that's the whole point of using it).
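The translation described above can be spelled out by hand. The following is a sketch on Python 3; the class and method names are hypothetical, and the raw descriptors are read from the class __dict__ so that no automatic binding happens first.

```python
class Example:
    def method(self, x):
        return (self, x)

    @classmethod
    def make(cls):
        return cls


a = Example()

# Normal attribute access versus calling the function's __get__ explicitly.
automatic = a.method
manual = Example.__dict__["method"].__get__(a, type(a))

# A classmethod descriptor binds the class instead of the instance.
class_bound = Example.__dict__["make"].__get__(a, type(a))

print(automatic == manual)
print(manual(42) == (a, 42))
print(class_bound() is Example)
```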
The following example might elucidate things:
class Example:
    def method(self):
        pass

>>> print(Example.method)
<unbound method Example.method>
>>> print(Example().method)
<bound method Example.method of <__main__.Example instance at 0x01EDCDF0>>
When a method is bound, the instance is passed implicitly. When a method is unbound, the instance needs to be passed explicitly.
The other answers will definitely offer some more detail on the binding process, but I think it's worth showing the above snippet.
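The output shown above comes from Python 2. As a sketch of the same experiment on Python 3, where the separate unbound method type no longer exists, accessing the method through the class gives back the plain function, and the rule about passing the instance explicitly still holds. The method returns self here only so the calls can be compared.

```python
import types


class Example:
    def method(self):
        return self


inst = Example()

# Through the class: a plain function, so the instance is passed by hand.
from_class = Example.method
# Through the instance: a bound method, so the instance is passed implicitly.
from_instance = inst.method

print(isinstance(from_class, types.FunctionType))
print(isinstance(from_instance, types.MethodType))
print(from_class(inst) is inst, from_instance() is inst)
```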
The answer is non-trivial and would probably warrant a good article. A very good explanation of how super() works is given by Raymond Hettinger in a PyCon 2015 talk, available here, and in a related article.
I will attempt a short answer and if it is not sufficient I (and hopefully the community) will expand on it.
The answer has two key pieces:
Python's super() needs to have an object on which the method being overridden is called, so it is explicitly passed with self. This is not the only possible implementation and in fact, in Python 3, it is no longer required that you pass the self instance.
Python's super() is not like super in Java or other compiled languages. Python's implementation is designed to support the multiple collaborative inheritance paradigm, as explained in Hettinger's talk.
This has an interesting consequence in Python: the method resolution in super() depends not only on the parent class, but on the child classes as well (a consequence of multiple inheritance). Note that Hettinger is using Python 3.
The official Python 2.7 documentation on super is also a good source of information (better understood after watching the talk, in my opinion).
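A small diamond hierarchy illustrates that dependence on the child class. This is a sketch using Python 3's zero-argument super(); the class names are made up for the example.

```python
class Base:
    def who(self):
        return ["Base"]


class Left(Base):
    def who(self):
        return ["Left"] + super().who()


class Right(Base):
    def who(self):
        return ["Right"] + super().who()


class Child(Left, Right):
    def who(self):
        return ["Child"] + super().who()


# On a Left instance, super() inside Left.who goes straight to Base.
left_chain = Left().who()

# On a Child instance, the same super() call inside Left.who goes to Right,
# because the search follows the MRO of type(self), not Left's parents.
child_chain = Child().who()

print(left_chain)
print(child_chain)
print([cls.__name__ for cls in Child.__mro__])
```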
Because in SuperClass.__init__(self) you're calling the method on the class, not the instance, so the instance cannot be passed implicitly. Similarly, you cannot just call SubClass.subMethod(), but you can call SubClass.subMethod(k), and it will be equivalent to k.subMethod(). Likewise, if self refers to a SubClass instance, then self.__init__() means SubClass.__init__(self), so if you want to call SuperClass.__init__ you have to call it directly.
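A short sketch of that equivalence on Python 3, reusing the class names from the question. The method bodies and the initialized_by attribute are illustrative additions.

```python
class SuperClass:
    def __init__(self):
        self.initialized_by = ["SuperClass"]


class SubClass(SuperClass):
    def __init__(self):
        SuperClass.__init__(self)  # called on the class, so self is explicit
        self.initialized_by.append("SubClass")

    def subMethod(self):
        return "sub"


k = SubClass()

# Calling through the instance and through the class are equivalent.
same_result = k.subMethod() == SubClass.subMethod(k)

# Calling through the class without an instance fails.
error = None
try:
    SubClass.subMethod()
except TypeError as exc:
    error = exc

print(k.initialized_by, same_result, type(error).__name__)
```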
Recently I faced a problem in a C-based python extension while trying to instantiate objects without calling its constructor -- which is a requirement of the extension.
The class to be used to create instances is obtained dynamically: at some point, I have an instance x whose class I wish to use to create other instances, so I store x.__class__ for later use -- let this value be klass.
At a later point, I invoke PyInstance_NewRaw(klass, PyDict_New()) and then, the problem arises. It seems that if klass is an old-style class, the result of that call is the desired new instance. However, if it is a new-style class, the result is NULL and the exception raised is:
SystemError: ../Objects/classobject.c:521: bad argument to internal function
For the record, I'm using Python version 2.7.5. Googling around, I found no more than one other person looking for a solution (and it seemed to me he was using a workaround, but he didn't detail it).
For the record #2: the instances the extension is creating are proxies for these same x instances. The x.__class__ and x.__dict__ values are known, so the extension spawns new instances based on __class__ (using the aforementioned C function) and assigns the respective __dict__ to the new instance (those __dict__ objects hold inter-process shared-memory data). Not only is it conceptually problematic to call an instance's __init__ a second time (first, its state is already known; second, the expected behavior for constructors is that they are called exactly once per instance), it is also impractical, since the extension cannot figure out the arguments, and their order, needed to call __init__() for each instance in the system. Also, changing the __init__ of each class in the system whose instances may be proxied, and making those classes aware of the proxy mechanism they will be subjected to, is conceptually problematic (they shouldn't know about it) and impractical.
So, my question is: how to perform the same behavior of PyInstance_NewRaw regardless of the instance's class style?
The type of new-style classes isn't instance, it's the class itself. So, the PyInstance_* methods aren't even meaningful for new-style classes.
In fact, the documentation explicitly explains this:
Note that the class objects described here represent old-style classes, which will go away in Python 3. When creating new types for extension modules, you will want to work with type objects (section Type Objects).
So, you will have to write code that checks whether klass is an old-style or new-style class and does the appropriate thing for each case. An old-style class's type is PyClass_Type, while a new-style class's type is either PyType_Type, or a custom metaclass.
Meanwhile, there is no direct equivalent of PyInstance_NewRaw for new-style classes. Or, rather, the direct equivalent—calling its tp_alloc slot and then adding a dict—will give you a non-functional class. You could try to duplicate all the other appropriate work, but that's going to be tricky. Alternatively, you could use tp_new, but that will do the wrong thing if there's a custom __new__ function in the class (or any of its bases). See the rejected patches from #5180 for some ideas.
But really, what you're trying to do is probably not a good idea in the first place. Maybe if you explained why this is a requirement, and what you're trying to do, there would be a better way to do it.
If the goal is to build objects by creating a new uninitialized instance of the class, then copying over its __dict__ from an initialized prototype, there's a much easier solution that I think will work for you:
__class__ is a writeable attribute. So (showing it in Python; the C API is basically the same, just a lot more verbose, and I'd probably screw up the refcounting somewhere):
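import types  # Python 2: types.ClassType is the type of old-style classes

class NewStyleDummy(object):
    pass

def make_instance(cls, instance_dict):
    if isinstance(cls, types.ClassType):
        # Placeholder for the old-style path (PyInstance_NewRaw in the C API)
        obj = do_old_style_thing(cls)
    else:
        # Create a bare new-style object, then swap in the real class
        obj = NewStyleDummy()
        obj.__class__ = cls
    obj.__dict__ = instance_dict
    return obj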
The new object will be an instance of cls—in particular, it will have the same class dictionary, including the MRO, metaclass, etc.
This won't work if cls has a metaclass that's required for its construction, or a custom __new__ method, or __slots__… but then your design of copying over the __dict__ doesn't make any sense in those cases anyway. I believe that in any case where anything could possibly work, this simple solution will work.
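A minimal sketch of the same class-swapping idea on Python 3, where every class is new-style and the old-style branch disappears. Point is a hypothetical class whose __init__ raises, purely to show that it is never called; the approach still assumes plain classes without __slots__ or a custom __new__, as noted above.

```python
class NewStyleDummy:
    pass


def make_instance(cls, instance_dict):
    """Build an instance of cls without running cls.__init__ (sketch only)."""
    obj = NewStyleDummy()
    obj.__class__ = cls           # works when both classes share a compatible layout
    obj.__dict__ = instance_dict  # reuse the prototype's attribute dictionary
    return obj


class Point:
    def __init__(self, x, y):
        raise RuntimeError("__init__ should not run for proxies")

    def norm1(self):
        return abs(self.x) + abs(self.y)


shared_state = {"x": 3, "y": -4}
proxy = make_instance(Point, shared_state)

print(type(proxy) is Point, proxy.norm1())
print(proxy.__dict__ is shared_state)
```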
Calling cls.__new__ seems like a good solution at first, but it actually isn't. Let me explain the background.
When you do this:
foo = Foo(1, 2)
(where Foo is a new-style class), it gets converted into something like this pseudocode:
foo = Foo.__new__(Foo, 1, 2)
if isinstance(foo, Foo):
    foo.__init__(1, 2)
The problem is that, if Foo or one of its bases has defined a __new__ method, it will expect to get the arguments from the constructor call, just like an __init__ method will.
As you explained in your question, you don't know the constructor call arguments—in fact, that's the main reason you can't call the normal __init__ method in the first place. So, you can't call __new__ either.
The base implementation of __new__ accepts and ignores any arguments it's given. So, if none of your classes has a __new__ override or a __metaclass__, you will happen to get away with this, because of a quirk in object.__new__ (a quirk which works differently in Python 3.x, by the way). But those are exactly the cases the previous solution can handle, and that solution works for a much more obvious reason.
Put another way: The previous solution depends on nobody defining __new__ because it never calls __new__. This solution depends on nobody defining __new__ because it calls __new__ with the wrong arguments.
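The problem with calling __new__ blindly can be sketched on Python 3 with two hypothetical classes: Plain, which only defines __init__, and Strict, which overrides __new__ and therefore expects the constructor arguments.

```python
class Plain:
    def __init__(self, a, b):
        self.pair = (a, b)


class Strict:
    def __new__(cls, a, b):
        inst = super().__new__(cls)
        inst.pair = (a, b)
        return inst


# No __new__ override: an uninitialized instance can be created directly.
raw = object.__new__(Plain)

# With a __new__ override, calling it without the constructor arguments fails.
error = None
try:
    Strict.__new__(Strict)
except TypeError as exc:
    error = exc

print(isinstance(raw, Plain), hasattr(raw, "pair"))
print(type(error).__name__)
```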
class Foo(object):
    pass

foo = Foo()

def bar(self):
    print('bar')

Foo.bar = bar
foo.bar()  # bar
Coming from JavaScript: if a "class" prototype is augmented with a certain attribute, all instances of that "class" have that attribute in their prototype chain, so no modifications have to be made to any of its instances or "sub-classes".
In that sense, how can a class-based language like Python achieve monkey patching?
The real question is, how can it not? In Python, classes are first-class objects in their own right. Attribute access on instances of a class is resolved by looking up attributes on the instance, and then the class, and then the parent classes (in the method resolution order.) These lookups are all done at runtime (as is everything in Python.) If you add an attribute to a class after you create an instance, the instance will still "see" the new attribute, simply because nothing prevents it.
In other words, it works because Python doesn't cache attributes (unless your code does), because it doesn't use negative caching or shadowclasses or any of the optimization techniques that would inhibit it (or, when Python implementations do, they take into account the class might change) and because everything is runtime.
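A short sketch of that runtime lookup, assuming a hypothetical subclass Sub alongside Foo: instances created before the patch, including subclass instances, see the new attribute because nothing is copied onto them.

```python
class Foo:
    pass


class Sub(Foo):
    pass


old_instance = Foo()  # created before the class is patched
sub_instance = Sub()


def bar(self):
    return "bar"


Foo.bar = bar  # patch the class after the instances already exist

print(old_instance.bar(), sub_instance.bar())
# The attribute lives only on the class; the instances were not modified.
print("bar" in vars(old_instance), "bar" in vars(Foo))
```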
I just read through a bunch of documentation, and as far as I can tell, the whole story of how foo.bar is resolved, is as follows:
1. Can we find foo.__getattribute__ by the following process? If so, use the result of foo.__getattribute__('bar').
   (Looking up __getattribute__ will not cause infinite recursion, but the implementation of it might.)
   (In reality, we will always find __getattribute__ in new-style objects, as a default implementation is provided in object - but that implementation is of the following process. ;) )
   (If we define a __getattribute__ method in Foo, and access foo.__getattribute__, foo.__getattribute__('__getattribute__') will be called! But this does not imply infinite recursion - if you are careful ;) )
2. Is bar a "special" name for an attribute provided by the Python runtime (e.g. __dict__, __class__, __bases__, __mro__)? If so, use that. (As far as I can tell, __getattribute__ falls into this category, which avoids infinite recursion.)
3. Is bar in the foo.__dict__ dict? If so, use foo.__dict__['bar'].
4. Does foo.__mro__ exist (i.e., is foo actually a class)? If so,
   For each base-class base in foo.__mro__[1:]:
   (Note that the first one will be foo itself, which we already searched.)
      Is bar in base.__dict__? If so:
         Let x be base.__dict__['bar'].
         Can we find (again, recursively, but it won't cause a problem) x.__get__?
            If so, use x.__get__(foo, foo.__class__).
            (Note that the function bar is, itself, an object, and the Python compiler automatically gives functions a __get__ attribute which is designed to be used this way.)
            Otherwise, use x.
5. For each base-class base of foo.__class__.__mro__:
   (Note that this recursion is not a problem: those attributes should always exist, and fall into the "provided by the Python runtime" case. foo.__class__.__mro__[0] will always be foo.__class__, i.e. Foo in our example.)
   (Note that we do this even if foo.__mro__ exists. This is because classes have a class, too: its name is type, and it provides, among other things, the method used to calculate __mro__ attributes in the first place.)
      Is bar in base.__dict__? If so:
         Let x be base.__dict__['bar'].
         Can we find (again, recursively, but it won't cause a problem) x.__get__?
            If so, use x.__get__(foo, foo.__class__).
            (Note that the function bar is, itself, an object, and the Python compiler automatically gives functions a __get__ attribute which is designed to be used this way.)
            Otherwise, use x.
6. If we still haven't found something to use: can we find foo.__getattr__ by the preceding process? If so, use the result of foo.__getattr__('bar').
7. If everything failed, raise AttributeError.
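For comparison, here is a simplified pure-Python sketch of instance attribute lookup, loosely following the emulation in the Descriptor HowTo Guide. It covers ordinary instances only (not lookups on classes), and the names lookup, find_name_in_mro and _MISSING are hypothetical helpers. One detail it makes explicit is that data descriptors found on the class (those defining __set__ or __delete__, such as property) take precedence over the instance __dict__.

```python
_MISSING = object()


def find_name_in_mro(cls, name):
    """Return the raw entry for name from the first class in the MRO that has it."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def lookup(obj, name):
    """Simplified emulation of obj.<name> for ordinary instances (a sketch)."""
    cls = type(obj)
    cls_attr = find_name_in_mro(cls, name)
    descr_get = getattr(type(cls_attr), "__get__", None)
    is_data = hasattr(type(cls_attr), "__set__") or hasattr(type(cls_attr), "__delete__")
    if descr_get is not None and is_data:
        return descr_get(cls_attr, obj, cls)   # data descriptor wins
    if name in getattr(obj, "__dict__", {}):
        return obj.__dict__[name]              # then the instance dictionary
    if descr_get is not None:
        return descr_get(cls_attr, obj, cls)   # non-data descriptor, e.g. a function
    if cls_attr is not _MISSING:
        return cls_attr                        # plain class attribute
    hook = find_name_in_mro(cls, "__getattr__")
    if hook is not _MISSING:
        return hook(obj, name)                 # last resort: __getattr__
    raise AttributeError(name)


class Foo:
    kind = "class attr"

    def bar(self):
        return "bar"

    @property
    def prop(self):
        return "from property"


foo = Foo()
foo.kind = "instance attr"
foo.__dict__["prop"] = "shadow"  # ignored: property is a data descriptor

print(lookup(foo, "bar")(), lookup(foo, "kind"), lookup(foo, "prop"))
```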
bar.__get__ is not really a function - it's a "method-wrapper" - but you can imagine it being implemented vaguely like this:
# Somewhere in the Python internals
class __method_wrapper(object):
    def __init__(self, func):
        self.func = func

    def __call__(self, obj, cls):
        return lambda *args, **kwargs: self.func(obj, *args, **kwargs)
        # Except it actually returns a "bound method" object
        # that uses cls for its __repr__
        # and there is a __repr__ for the method_wrapper that I *think*
        # uses the hashcode of the underlying function, rather than of itself,
        # but I'm not sure.

# Automatically done after compiling bar
bar.__get__ = __method_wrapper(bar)
The "binding" that happens within the __get__ automatically attached to bar (which makes bar a descriptor), by the way, is more or less the reason why you have to specify self parameters explicitly for Python methods. In JavaScript, this itself is magical; in Python, it is merely the process of binding things to self that is magical. ;)
And yes, you can explicitly set a __get__ method on your own objects and have it do special things when you set a class attribute to an instance of the object and then access it from an instance of that other class. Python is extremely reflective. :) But if you want to learn how to do that, and get a really full understanding of the situation, you have a lot of reading to do. ;)
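As a minimal sketch of such a user-defined descriptor, assuming made-up names Shout and Greeter: the object stored as a class attribute decides, in its own __get__, what an instance sees when the attribute is accessed.

```python
class Shout:
    """Hypothetical descriptor that upper-cases another attribute on access."""

    def __init__(self, attr_name):
        self.attr_name = attr_name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class: return the descriptor itself
        return getattr(instance, self.attr_name).upper()


class Greeter:
    loud = Shout("greeting")  # class attribute set to a descriptor instance

    def __init__(self, greeting):
        self.greeting = greeting


g = Greeter("hello")

print(g.loud)
print(isinstance(Greeter.loud, Shout))
```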