How is the __class__ cell value set in class methods? - python

Looking at the documentation of the super type in Python 3.5, it notes that super(…) is the same as super(__class__, «first argument to function»). To my surprise, I wrote a method that returned __class__ – and it actually worked:
>>> class c:
...     def meth(self): return __class__
...
>>> c().meth()
<class '__main__.c'>
Apparently, __class__ is a free variable assigned by the closure of the function:
>>> c.meth.__code__.co_freevars
('__class__',)
>>> c.meth.__closure__
(<cell at 0x7f6346a91048: type object at 0x55823b17f3a8>,)
I'd like to know under what circumstances that free variable is associated in the closure. I know that if I assign a function to a variable as part of creating a class it doesn't happen.
>>> def meth2(self): return __class__
...
>>> meth2.__code__.co_freevars
()
Even if I create a new class and as part of that creation assign some attribute to meth2, meth2 doesn't somehow magically gain a free variable that gets filled in.
That's unsurprising, because part of this appears to depend on the lexical state of the compiler at the time that the code is compiled.
I'd like to confirm that the conditions necessary for __class__ to be treated as a free variable are simply:
A reference to __class__ in the code block; and
The def containing the __class__ reference is lexically within a class declaration block.
I'd further like to understand what the conditions necessary for that variable getting filled in correctly are. It appears – at least from the Python 3.6 documentation – that something like type.__new__(…) is involved somehow. I haven't been able to understand for sure how type comes into play and how this all interacts with metaclasses that do not ultimately call type.__new__(…).
I'm particularly confused because I didn't think the class object existed yet at the time the namespace's __setitem__ method was used to bind the method function to its name (the class object is only constructed afterwards). I know that this namespace object exists because it was either constructed implicitly by the use of the class statement, or explicitly by the metaclass's __prepare__ method – but as best I can tell, the metaclass constructs the class object that populates __class__ after the function object is set as a value within the class namespace.

In the docs for Python’s data model, § 3.3.3.6 – “Creating the class object” – you will find the following:
[The] class object is the one that will be referenced by the
zero-argument form of super(). __class__ is an implicit closure
reference created by the compiler if any methods in a class body refer
to either __class__ or super. This allows the zero argument form
of super() to correctly identify the class being defined based on
lexical scoping, while the class or instance that was used to make
the current call is identified based on the first argument passed to
the method.
…emphasis is mine. This confirms your two putative criteria for a __class__ closure happening: a “__class__” reference in the method def, which itself is defined inside a class statement.
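A quick way to check those two conditions is to look at co_freevars directly. This is only a minimal sketch; the names Inside, uses_class, uses_super, plain and outside are made up for illustration:

```python
class Inside:
    def uses_class(self):
        return __class__            # explicit reference to the cell

    def uses_super(self):
        return super().__repr__()   # zero-argument super() also needs the cell

    def plain(self):
        return 1                    # no reference, so no cell is created


def outside(self):
    return __class__                # compiled outside any class body


# Methods that mention __class__ or super get a '__class__' free variable.
print(Inside.uses_class.__code__.co_freevars)
print(Inside.uses_super.__code__.co_freevars)
print(Inside.plain.__code__.co_freevars)
print(outside.__code__.co_freevars)

# The cell holds the class that lexically contains the method.
print(Inside.uses_class.__closure__[0].cell_contents is Inside)

# Attaching the outside function to a class afterwards does not create a cell:
# __class__ is then looked up as an ordinary global name and fails.
Inside.attached = outside
try:
    Inside().attached()
except NameError as exc:
    print("NameError:", exc)
```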
But then, the next ¶ in “Creating the class object” goes on to say:
CPython implementation detail: In CPython 3.6 and later, the __class__ cell is passed to the metaclass as a __classcell__ entry
in the class namespace. If present, this must be propagated up to the
type.__new__ call in order for the class to be initialized
correctly. Failing to do so will result in a RuntimeError in Python
3.8.
… emphasis is theirs. This means that if you are employing a metaclass with a __new__ method – in order to dictate the terms by which classes so designated are created – for example:
class Meta(type):
    def __new__(metacls, name, bases, attributes, **kwargs):
        # Or whatever:
        if '__slots__' not in attributes:
            attributes['__slots__'] = tuple()
        # Call up, creating and returning the new class:
        return super().__new__(metacls, name,
                                        bases,
                                        attributes,
                                        **kwargs)
… that last super(…).__new__(…) call is effectively calling type.__new__(…). In real life, there might be some other ancestral “__new__(…)” methods that get called between here and there, if your metaclass inherits from other metaclasses (like, e.g. abc.ABCMeta). Effectively, though, inside your Meta.__new__(…) method, between the method entry point, the super(…).__new__(…) call, and return-ing the new class object, you can inspect or set the value of the eventual __class__ cell variable through attributes['__classcell__']†.
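To make that concrete, here is a rough sketch of a metaclass that peeks at the __classcell__ entry, assuming CPython 3.6 or later. The seen dictionary and the class names WithSuper and Plain are hypothetical, used only for the demonstration:

```python
class Meta(type):
    # Hypothetical bookkeeping dict, only for this demonstration.
    seen = {}

    def __new__(metacls, name, bases, attributes, **kwargs):
        # The entry is present only if some method in the class body
        # refers to __class__ or super.
        cell = attributes.get('__classcell__')
        cls = super().__new__(metacls, name, bases, attributes, **kwargs)
        if cell is not None:
            # Once type.__new__ has run, the cell holds the new class.
            metacls.seen[name] = cell.cell_contents is cls
        return cls


class WithSuper(metaclass=Meta):
    def __init__(self):
        super().__init__()


class Plain(metaclass=Meta):
    def method(self):
        return 42


print(Meta.seen)
# The entry is consumed by type.__new__ and does not end up on the class.
print('__classcell__' in WithSuper.__dict__)
```

Note that the namespace is passed through untouched to super().__new__(…), which is what the quoted documentation asks for.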
Now as for whether this is at all useful: I don’t know. I have been programming in python for ten years; I totally use metaclasses‡, like, absolutely all the time (for better or for worse); and in the course of doing so I have never done any of the following things:
reassigned a __class__ attribute;
inspected the __class__ cell variable of anything; nor
messed around with this supposed __classcell__ namespace entry, in like any capacity
… Naturally, your programming experience will be different from mine; who knows what one does. It is not that any one of those aforementioned stratagems is de facto problematic, necessarily. But I am no stranger to bending Python’s type systems and metaprogramming facilities to my whim, and these particular things have never presented themselves as particularly useful, especially once you are working within the general context of metaclasses, and what they do.
By which I suppose I mean, tl;dr: you are on the cusp of figuring out the basics of metaclasses and what they can do – do press on and experiment, but do investigate the topic with depth as well as breadth. Indeed!
† – In reading through code examples of this sort, you’ll often find what my snippet here calls the attributes dictionary referred to as namespace or ns, or similar. It’s all the same stuff.
‡ – …and ABCs and mixins and class decorators and __init_subclass__(…) and the abuse of __mro_entries__(…) for personal gain; et cetera, ad nauseam

Related

Why to put a property variable in the constructor

In other languages I am used to coding a class property and then accessing it without having it in the constructor, like
class MyClass:
    def __init__(self):
        self._value = 0
    @property
    def my_property(self):
        print('I got the value: ' + str(self._value))
In almost every example I worked through the property variable was in the constructor self._value like this
class MyClass:
    def __init__(self, value=0):
        self._value = value
To me this makes no sense since you want to set it in the property. Could anyone explain to me what is the use of placing the value variable in the constructor?
Python objects are not struct-based (like C++ or Java), they are dict-based (like JavaScript). This means that instance attributes are dynamic (you can add new attributes or delete existing ones at runtime), and are not defined at the class level but at the instance level, and they are defined quite simply by assigning to them. While they can technically be defined anywhere in the code (even outside the class), the convention (and good practice) is to define them (possibly with default values) in the initializer (the __init__ method - the real constructor is named __new__ but there are very few reasons to override it) to make clear which attributes an instance of a given class is supposed to have.
Note the use of the term "attribute" here - in Python, we don't talk about "member variables" or "member functions" but about "attributes" and "methods". Actually, since Python classes are objects too (instances of the type class or of a subclass of it), they have attributes too, so we have instance attributes (which are per-instance) and class attributes (which belong to the class object itself, and are shared amongst instances). A class attribute can be looked up on an instance, as long as it's not shadowed by an instance attribute of the same name.
Also, since Python functions are objects too (hint: in Python, everything - everything you can put on the RHS of an assignment that is - is an object), there are no distinct namespaces for "data" attributes and "function" attributes, and Python's "methods" are actually functions defined on the class itself - IOW they are class attributes that happen to be instances of the function type. Since methods need to access the instance to be able to work on it, there's a special mechanism that allows you to "customize" attribute access, so that a given object - if it implements the proper interface - can return something other than itself when it's looked up on an instance but resolved on the class. This mechanism is used by functions so they turn themselves into methods (callable objects that wrap the function and instance together so you don't have to pass the instance to the function), but also more generally as the support for computed attributes.
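As a small illustration of that mechanism (the Greeter class and its greet method are invented for this sketch), you can fetch the raw function from the class __dict__ and trigger the binding by hand through its __get__ method:

```python
class Greeter:
    def greet(self):
        return "hello from " + type(self).__name__


g = Greeter()

# Stored on the class, the attribute is a plain function...
func = Greeter.__dict__['greet']
print(type(func).__name__)

# ...but looked up through an instance it comes back as a bound method.
print(type(g.greet).__name__)

# The binding can be done by hand by calling the function's __get__.
bound = func.__get__(g, Greeter)
print(bound() == g.greet())

# A bound method simply wraps the function and the instance together.
print(g.greet.__self__ is g, g.greet.__func__ is func)
```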
The property class is a generic implementation of computed attributes that wraps a getter function (and optionally a setter and a deleter) - so in Python "property" has a very specific meaning (the property class itself or an instance of it). Also, the @decorator syntax is nothing magical (and isn't specific to properties), it's just syntactic sugar, so given a "decorator" function:
def decorator(func):
    # `something` stands for whatever object should replace `func`
    return something
this:
@decorator
def foo():
    # code here
    pass
is just a shortcut for:
def foo():
    # code here
    pass
foo = decorator(foo)
Here I defined decorator as a function, but any callable object (a "callable" object is an instance of a class that defines the __call__ magic method) can be used instead - and Python classes are callables (in fact, calling a class is how you instantiate it).
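A runnable version of the same idea, as a sketch only: shout and CountCalls are hypothetical decorators, the first written as a function and the second as a class, to show that both kinds of callable work.

```python
def shout(func):
    """Hypothetical decorator: wraps func and upper-cases its result."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


class CountCalls:
    """Hypothetical class-based decorator: instances are callable."""
    def __init__(self, func):
        self.func = func
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.func(*args, **kwargs)


@shout
def greet(name):
    return "hello " + name


# The explicit form of what the decorator syntax does.
def greet_plain(name):
    return "hello " + name
greet_manual = shout(greet_plain)


@CountCalls
def add(a, b):
    return a + b


print(greet("bob"), greet_manual("bob"))
print(add(1, 2), add(3, 4), add.calls)
```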
So back to your code:
# in py2, you want to inherit from `object` for
# descriptors and other fancy things to work.
# this is useless in py3 but doesn't break anything either...
class MyClass(object):
    # the `__init__` function will become an attribute
    # of the `MyClass` class object
    def __init__(self, value=0):
        # defines the instance attribute named `_value`
        # the leading underscore denotes an "implementation attribute"
        # - something that is not part of the class public interface
        # and should not be accessed externally (IOW a protected attribute)
        self._value = value
    # this first defines the `my_property` function, then
    # passes it to `property()`, and rebinds the `my_property` name
    # to the newly created `property` instance. The `my_property` function
    # will then become the property's getter (its `fget` instance attribute)
    # and will be called when the `my_property` name is resolved on a `MyClass` instance
    @property
    def my_property(self):
        print('I got the value: {}'.format(self._value))
        # let's at least return something
        return self._value
You may then want to inspect both the class and an instance of it:
>>> print(MyClass.__dict__)
{'__module__': 'oop', '__init__': <function MyClass.__init__ at 0x7f477fc4a158>, 'my_property': <property object at 0x7f477fc639a8>, '__dict__': <attribute '__dict__' of 'MyClass' objects>, '__weakref__': <attribute '__weakref__' of 'MyClass' objects>, '__doc__': None}
>>> print(MyClass.my_property)
<property object at 0x7f477fc639a8>
>>> print(MyClass.my_property.fget)
<function MyClass.my_property at 0x7f477fc4a1e0>
>>> m = MyClass(42)
>>> print(m.__dict__)
{'_value': 42}
>>> print(m.my_property)
I got the value: 42
42
>>>
As a conclusion: if you hope to do anything useful with a language, you have to learn this language - you cannot just expect it to work like other languages you know. While some features are based on common concepts (i.e. functions, classes etc), they can actually be implemented in a totally different way (Python's object model has almost nothing in common with Java's one), so just trying to write Java (or C or C++ etc) in Python will not work (just like trying to write Python in Java FWIW).
NB: just for the sake of completeness: Python objects can actually be made "struct-based" by using __slots__ - but the aim here is not to prevent dynamically adding attributes (that's only a side effect) but to make instances of those classes "lighter" in size (which is useful when you know you're going to have thousands or more instances of them at a given time).
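A tiny sketch of that side effect, using a made-up Point class: with __slots__ declared, instances get no per-instance __dict__, so assigning an undeclared attribute fails.

```python
class Point:
    # Only these two attribute names get storage on each instance.
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(hasattr(p, '__dict__'))

try:
    p.z = 3   # not listed in __slots__
except AttributeError as exc:
    print("AttributeError:", exc)
```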
Because @property is not a decorator for a variable, it is a decorator for a function that allows it to behave like a property. You still need to create an underlying attribute for a function decorated by @property to use:
The @property decorator turns the voltage() method into a “getter” for a read-only attribute with the same name, and it sets the docstring for voltage to “Get the current voltage.”
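The example this passage describes is missing from the text above. The quoted sentence matches the property entry in the Python documentation, whose example looks roughly like this (reproduced from memory, so treat it as a sketch rather than an exact quotation):

```python
class Parrot:
    def __init__(self):
        # The underlying instance attribute the property reads from.
        self._voltage = 100000

    @property
    def voltage(self):
        """Get the current voltage."""
        return self._voltage


p = Parrot()
print(p.voltage)
print(Parrot.voltage.__doc__)

# No setter was defined, so assignment is rejected.
try:
    p.voltage = 5
except AttributeError:
    print("voltage is read-only")
```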
A property object has getter, setter, and deleter methods usable as decorators that create a copy of the property with the corresponding accessor function set to the decorated function. This is best explained with an example:
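The example announced here did not survive either. A sketch in the spirit of the one in the Python documentation, with a placeholder class C and attribute x, could look like this:

```python
class C:
    def __init__(self):
        self._x = None

    @property
    def x(self):
        """I'm the 'x' property."""
        return self._x

    @x.setter
    def x(self, value):
        # Each decorator returns a copy of the property with one more accessor.
        self._x = value

    @x.deleter
    def x(self):
        del self._x


c = C()
c.x = 10          # goes through the setter
print(c.x)        # goes through the getter
del c.x           # goes through the deleter
print(hasattr(c, '_x'))
```

All three functions share the name x so that the final property object ends up bound to that single name in the class.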
I'm guessing you're coming from a language like C++ or Java where it is common to make attributes private and then write explicit getters and setters for them? In Python there is no such thing as private other than by convention and there is no need to write getters and setters for a variable that you only need to write and read as is. @property and the corresponding setter decorators can be used if you want to add additional behaviour (e.g. logging access) or you want to have pseudo-properties that you can access just like real ones, e.g. you might have a Circle class that is defined by its radius but you could define a @property for the diameter so you can still write circle.diameter.
More specifically to your question: You want to have the property as an argument of the initializer if you want to set the property at the time when you create the object. You wouldn't want to create an empty object and then immediately fill it with properties as that would create a lot of noise and make the code less readable.
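Putting both points together, a minimal sketch of the Circle idea: the radius is a plain attribute set in the initializer, while diameter is a computed pseudo-property. The setter is an optional extra added here for illustration.

```python
class Circle:
    def __init__(self, radius=1.0):
        # Plain attribute, set when the object is created; no getter needed.
        self.radius = radius

    @property
    def diameter(self):
        # Computed on access from the stored radius.
        return 2 * self.radius

    @diameter.setter
    def diameter(self, value):
        self.radius = value / 2


circle = Circle(2)
print(circle.radius, circle.diameter)

circle.diameter = 10
print(circle.radius)
```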
Just an aside: __init__ isn't actually a constructor. The constructor for Python objects is __new__ and you almost never override it.

Why isn't __bases__ accessible in the class body?

Class objects have a __bases__ (and a __base__) attribute:
>>> class Foo(object):
...     pass
...
>>> Foo.__bases__
(<class 'object'>,)
Sadly, these attributes aren't accessible in the class body, which would be very convenient for accessing parent class attributes without having to hard-code the name:
class Foo:
    cls_attr = 3
class Bar(Foo):
    cls_attr = __base__.cls_attr + 2
    # throws NameError: name '__base__' is not defined
Is there a reason why __bases__ and __base__ can't be accessed in the class body?
(To be clear, I'm asking if this is a conscious design decision. I'm not asking about the implementation; I know that __bases__ is a descriptor in type and that this descriptor can't be accessed until a class object has been created. I want to know why python doesn't create __bases__ as a local variable in the class body.)
I want to know why python doesn't create __bases__ as a local variable in the class body
As you know, class is mostly a shortcut for type.__new__() - when the runtime hits a class statement, it executes all statements at the top-level of the class body, collects all resulting bindings in a dedicated namespace dict, calls the concrete metaclass (type() by default) with the class name, the base classes and the namespace dict, and binds the resulting class object to the class name in the enclosing scope (usually but not necessarily the module's top-level namespace).
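Roughly speaking, and leaving aside details such as __prepare__ and __qualname__, the two definitions below are equivalent. Base, Bar and Bar2 are names chosen for this sketch:

```python
class Base:
    cls_attr = 3


# The class statement form...
class Bar(Base):
    cls_attr = 5

    def hello(self):
        return "hi"


# ...and approximately what it does: build the namespace first,
# then call the metaclass with the name, the bases and the namespace.
def hello(self):
    return "hi"

namespace = {'cls_attr': 5, 'hello': hello}
Bar2 = type('Bar2', (Base,), namespace)

print(Bar.__bases__ == Bar2.__bases__)
print(Bar2().hello(), Bar2.cls_attr)
print(type(Bar) is type(Bar2) is type)
```

The bases only become an attribute of the class inside that final call, after the body has already been executed.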
The important point here is that it's the metaclass's responsibility to build the class object, and to allow for customisation of class object creation, the metaclass must be free to do whatever it wants with its arguments. Most often a custom metaclass will mainly work on the attrs dict, but it must also be able to mess with the bases argument. Now since the metaclass is only invoked AFTER the class body statements have been executed, there's no way the runtime can reliably expose the bases in the class body scope since those bases could be modified afterward by the metaclass.
There are also some more philosophical considerations here, notably with respect to explicit vs implicit, and as shx2 mentions, Python designers try to avoid magic variables popping out of the blue. There are indeed a couple of implementation variables (__module__ and, in py3, __qualname__) that are "automagically" defined in the class body namespace, but those are just names (mostly intended as additional debugging / inspection information for developers) and have absolutely no impact on the class object creation nor on its properties, behaviour and whatnot.
As always with Python, you have to consider the whole context (the execution model, the object model, how the different parts work together etc) to really understand the design choices. Whether you agree with the whole design and philosophy is another debate (and one that doesn't belong here), but you can be sure that yes, those choices are "conscious design decisions".
I am not answering as to why it was decided to be implemented the way it was, I'm answering why it wasn't implemented as a "local variable in the class body":
Simply because nothing in python is a local variable magically defined in the class body. Python doesn't like names magically appearing out of nowhere.
It's simply because it has not been created yet.
Consider the following:
>>> class baselessmeta(type):
...     def __new__(metaclass, class_name, bases, classdict):
...         return type.__new__(
...             metaclass,
...             class_name,
...             (),  # I can just ignore all the bases
...             {}
...         )
...
>>> class Baseless(int, metaclass=baselessmeta):
...     # imaginary print(__bases__, __base__)
...     ...
...
>>> Baseless.__bases__
(<class 'object'>,)
>>> Baseless.__base__
<class 'object'>
>>>
What should the imaginary print result in?
Every python class is created via the type metaclass one way or another.
You pass int in the bases argument to type(), yet you cannot know what the resulting bases are going to be. The metaclass may use that directly as a base, or it may substitute another base in its own lines of code.
Just realized your to be clear part and now my answer is useless haha. Oh welp.

Calling instance method using class definition in Python

Lately, I've been studying Python's class instantiation process to really understand what happens under the hood when creating a class instance. But, while playing around with test code, I came across something I don't understand.
Consider this dummy class
class Foo():
    def test(self):
        print("I'm using test()")
Normally, if I wanted to use Foo.test instance method, I would go and create an instance of Foo and call it explicitly like so,
foo_inst = Foo()
foo_inst.test()
>>>> I'm using test()
But, I found that calling it this way ends up with the same result,
Foo.test(Foo)
>>>> I'm using test()
Here I don't actually create an instance, but I'm still accessing Foo's instance method. Why and how is this working in the context of Python ? I mean self normally refers to the current instance of the class, but I'm not technically creating a class instance in this case.
print(Foo()) #This is a Foo object
>>>><__main__.Foo object at ...>
print(Foo) #This is not
>>>> <class '__main__.Foo'>
Props to everyone that led me there in the comments section.
The answer to this question relies on two fundamentals of Python:
Duck-typing
Everything is an object
Indeed, even if self is Python's idiom to reference the current class instance, you technically can pass whatever object you want because of how Python handles typing.
Now, the other confusion that brought me here is that I wasn't creating an object in my second example. But, the thing is, Foo is already an object internally.
This can be tested empirically like so,
print(type(Foo))
<class 'type'>
So, we now know that Foo is an instance of class type and therefore can be passed as self even though it is not an instance of itself.
Basically, if I were to manipulate self as if it were a Foo object in my test method, I would run into problems when calling it as in my second example.
A few notes on your question (and answer). First, everything really is an object. Even a class is an object, so there is the class of the class (called the metaclass), which is type in this case.
Second, and more relevant to your case: methods are, more or less, class attributes, not instance attributes. In Python, when you have an object obj, an instance of Class, and you access obj.x, Python first looks into obj, and then into Class. That's what happens when you access a method from an instance: methods are just special class attributes, so they can be accessed from both the instance and the class. And, since you are not using any instance attributes of the self that should be passed to the test(self) function, the object that is passed is irrelevant.
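A short sketch of both situations in Python 3. The describe method and the label attribute are additions made up for this illustration; test stands in for the method from the question, returning its text instead of printing it.

```python
class Foo:
    def __init__(self):
        self.label = "an instance"

    def test(self):
        return "I'm using test()"

    def describe(self):
        # Unlike test(), this one relies on an instance attribute.
        return "label is " + self.label


# Accessed on the class, test is a plain function; on an instance, a bound method.
print(type(Foo.test).__name__)
print(type(Foo().test).__name__)

# test() never touches self, so any object will do as the argument.
print(Foo.test(Foo))
print(Foo.test(None))

# describe() needs a real instance: the class object has no `label`.
try:
    Foo.describe(Foo)
except AttributeError as exc:
    print("AttributeError:", exc)
```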
To understand that in depth, you should read about the descriptor protocol, if you are not familiar with it. It explains a lot about how things work in Python. It allows Python classes and objects to be essentially dictionaries, with some special attributes (very similar to JavaScript objects and methods).
Regarding class instantiation, read about __new__ and metaclasses.

How does extending classes (Monkey Patching) work in Python?

class Foo(object):
    pass
foo = Foo()
def bar(self):
    print('bar')
Foo.bar = bar
foo.bar()  # bar
In JavaScript, if a "class" prototype is augmented with a certain attribute, all instances of that "class" have that attribute in their prototype chain, hence no modification has to be made to any of its instances or "sub-classes".
In that sense, how can a class-based language like Python achieve monkey patching?
The real question is, how can it not? In Python, classes are first-class objects in their own right. Attribute access on instances of a class is resolved by looking up attributes on the instance, and then the class, and then the parent classes (in the method resolution order.) These lookups are all done at runtime (as is everything in Python.) If you add an attribute to a class after you create an instance, the instance will still "see" the new attribute, simply because nothing prevents it.
In other words, it works because Python doesn't cache attributes (unless your code does), because it doesn't use negative caching or shadowclasses or any of the optimization techniques that would inhibit it (or, when Python implementations do, they take into account the class might change) and because everything is runtime.
I just read through a bunch of documentation, and as far as I can tell, the whole story of how foo.bar is resolved, is as follows:
Can we find foo.__getattribute__ by the following process? If so, use the result of foo.__getattribute__('bar').
    (Looking up __getattribute__ will not cause infinite recursion, but the implementation of it might.)
    (In reality, we will always find __getattribute__ in new-style objects, as a default implementation is provided in object - but that implementation is of the following process. ;) )
    (If we define a __getattribute__ method in Foo, and access foo.__getattribute__, foo.__getattribute__('__getattribute__') will be called! But this does not imply infinite recursion - if you are careful ;) )
Is bar a "special" name for an attribute provided by the Python runtime (e.g. __dict__, __class__, __bases__, __mro__)? If so, use that. (As far as I can tell, __getattribute__ falls into this category, which avoids infinite recursion.)
Is bar in the foo.__dict__ dict? If so, use foo.__dict__['bar'].
Does foo.__mro__ exist (i.e., is foo actually a class)? If so,
    For each base-class base in foo.__mro__[1:]:
        (Note that the first one will be foo itself, which we already searched.)
        Is bar in base.__dict__? If so:
            Let x be base.__dict__['bar'].
            Can we find (again, recursively, but it won't cause a problem) x.__get__?
                If so, use x.__get__(foo, foo.__class__).
                (Note that the function bar is, itself, an object, and the Python compiler automatically gives functions a __get__ attribute which is designed to be used this way.)
                Otherwise, use x.
For each base-class base of foo.__class__.__mro__:
    (Note that this recursion is not a problem: those attributes should always exist, and fall into the "provided by the Python runtime" case. foo.__class__.__mro__[0] will always be foo.__class__, i.e. Foo in our example.)
    (Note that we do this even if foo.__mro__ exists. This is because classes have a class, too: its name is type, and it provides, among other things, the method used to calculate __mro__ attributes in the first place.)
    Is bar in base.__dict__? If so:
        Let x be base.__dict__['bar'].
        Can we find (again, recursively, but it won't cause a problem) x.__get__?
            If so, use x.__get__(foo, foo.__class__).
            (Note that the function bar is, itself, an object, and the Python compiler automatically gives functions a __get__ attribute which is designed to be used this way.)
            Otherwise, use x.
If we still haven't found something to use: can we find foo.__getattr__ by the preceding process? If so, use the result of foo.__getattr__('bar').
If everything failed, raise AttributeError.
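For an ordinary instance (not a class), the core of this process can be sketched in pure Python. This is a simplified model along the lines of the Descriptor HowTo in the Python documentation, not the actual implementation; simple_getattr and find_in_mro are hypothetical helper names. One detail the outline above glosses over: in CPython, data descriptors found on the class (such as properties) are consulted before the instance __dict__.

```python
_MISSING = object()


def find_in_mro(cls, name):
    """Return the first entry for `name` in the __dict__ of cls or its bases."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def simple_getattr(obj, name):
    """Simplified model of attribute lookup on an ordinary instance."""
    cls = type(obj)
    cls_attr = find_in_mro(cls, name)
    attr_type = type(cls_attr)
    has_get = cls_attr is not _MISSING and hasattr(attr_type, '__get__')
    # 1. Data descriptors on the class win over the instance __dict__.
    if has_get and (hasattr(attr_type, '__set__')
                    or hasattr(attr_type, '__delete__')):
        return attr_type.__get__(cls_attr, obj, cls)
    # 2. Then the instance's own __dict__.
    inst_dict = getattr(obj, '__dict__', {})
    if name in inst_dict:
        return inst_dict[name]
    # 3. Then non-data descriptors, e.g. functions becoming bound methods.
    if has_get:
        return attr_type.__get__(cls_attr, obj, cls)
    # 4. Then plain class attributes.
    if cls_attr is not _MISSING:
        return cls_attr
    # 5. Finally the __getattr__ hook, if the class defines one.
    if hasattr(cls, '__getattr__'):
        return cls.__getattr__(obj, name)
    raise AttributeError(name)


class Foo:
    kind = "class attribute"

    def __init__(self):
        self.x = 1

    def bar(self):
        return "bar"

    @property
    def double(self):
        return self.x * 2


foo = Foo()
# Try to shadow the property through the instance dict; it has no effect.
foo.__dict__['double'] = 'shadow attempt'

print(simple_getattr(foo, 'x'), simple_getattr(foo, 'kind'))
print(simple_getattr(foo, 'bar')())
print(simple_getattr(foo, 'double') == foo.double)
```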
bar.__get__ is not really a function - it's a "method-wrapper" - but you can imagine it being implemented vaguely like this:
# Somewhere in the Python internals
class __method_wrapper(object):
    def __init__(self, func):
        self.func = func
    def __call__(self, obj, cls):
        return lambda *args, **kwargs: self.func(obj, *args, **kwargs)
        # Except it actually returns a "bound method" object
        # that uses cls for its __repr__
        # and there is a __repr__ for the method_wrapper that I *think*
        # uses the hashcode of the underlying function, rather than of itself,
        # but I'm not sure.
# Automatically done after compiling bar
bar.__get__ = __method_wrapper(bar)
The "binding" that happens within the __get__ automatically attached to bar (called a descriptor), by the way, is more or less the reason why you have to specify self parameters explicitly for Python methods. In Javascript, this itself is magical; in Python, it is merely the process of binding things to self that is magical. ;)
And yes, you can explicitly set a __get__ method on your own objects and have it do special things when you set a class attribute to an instance of the object and then access it from an instance of that other class. Python is extremely reflective. :) But if you want to learn how to do that, and get a really full understanding of the situation, you have a lot of reading to do. ;)
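For instance, a minimal sketch of an object with its own __get__ method (Upper and Person are invented names for this illustration):

```python
class Upper:
    """Hypothetical descriptor: computes a value from the owning instance."""

    def __init__(self, attr_name):
        self.attr_name = attr_name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself: return the descriptor object.
            return self
        return getattr(instance, self.attr_name).upper()


class Person:
    # A class attribute that is an instance of the descriptor.
    shouted_name = Upper('name')

    def __init__(self, name):
        self.name = name


p = Person("ada")
print(p.shouted_name)                      # computed through Upper.__get__
print(type(Person.shouted_name).__name__)  # the descriptor itself
```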

Python metaclasses

I've been hacking classes in Python like this:
def hack(f, aClass):
    class MyClass(aClass):
        def f(self):
            f()
    return MyClass
A = hack(afunc, A)
Which looks pretty clean to me. It takes a class, A, creates a new class derived from it that has an extra method, calling f, and then reassigns the new class to A.
How does this differ from metaclass hacking in Python? What are the advantages of using a metaclass over this?
The definition of a class in Python is an instance of type (or an instance of a subclass of type). In other words, the class definition itself is an object. With metaclasses, you have the ability to control the type instance that becomes the class definition.
When a metaclass is invoked, you have the ability to completely re-write the class definition. You have access to all the proposed attributes of the class, its ancestors, etc. More than just injecting a method or removing a method, you can radically alter the inheritance tree, the type, and pretty much any other aspect. You can also chain metaclasses together for a very dynamic and totally convoluted experience.
I suppose the real benefit, though is that the class's type remains the class's type. In your example, typing:
a_inst = A()
type(a_inst)
will show that it is an instance of MyClass. Yes, isinstance(a_inst, aClass) would return True, but you've introduced a subclass, rather than a dynamically re-defined class. The distinction there is probably the key.
As rjh points out, the anonymous inner class also has performance and extensibility implications. A metaclass is processed only once, at the moment that the class is defined, and never again. Users of your API can also extend your metaclass because it is not enclosed within a function, so you gain a certain degree of extensibility.
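To see the difference in what type() reports, here is a rough side-by-side sketch. The afunc body, the AddF metaclass and the class B are assumptions made for the illustration; hack is the function from the question, changed only to return the result of f().

```python
def afunc():
    return "afunc called"


def hack(f, aClass):
    class MyClass(aClass):
        def f(self):
            return f()
    return MyClass


class A:
    pass

A = hack(afunc, A)
# Instances now report the generated subclass, not the original class.
print(type(A()).__name__)


class AddF(type):
    """Hypothetical metaclass that injects a method f when the class is created."""

    def __new__(metacls, name, bases, namespace):
        namespace['f'] = lambda self: afunc()
        return super().__new__(metacls, name, bases, namespace)


class B(metaclass=AddF):
    pass

# The class keeps its own identity; no extra subclass is introduced.
print(type(B()).__name__)
print(B().f())
```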
This slightly old article actually has a good explanation that compares exactly the "function decoration" approach you used in the example with metaclasses, and shows the history of the Python metaclass evolution in that context: http://www.ibm.com/developerworks/linux/library/l-pymeta.html
You can use the type callable as well.
def hack(f, aClass):
    newfunc = lambda self: f()
    return type('MyClass', (aClass,), {'f': newfunc})
I find using type the easiest way to get into the metaclass world.
A metaclass is the class of a class. IMO, the bloke here covered it quite serviceably, including some use-cases. See Stack Overflow question "MetaClass", "new", "cls" and "super" - what is the mechanism exactly?.
