Python Descriptors - Documentation unclear - python

I was looking at Python's descriptor documentation here, and the statement that got me thinking is:
For objects, the machinery is in object.__getattribute__() which transforms b.x into type(b).__dict__['x'].__get__(b, type(b))
under a section named Invoking Descriptors.
The last part of the statement, b.x into type(b).__dict__['x'].__get__(b, type(b)), is what confuses me. As I understand it, when we look up an attribute on an instance, instance.__dict__ is searched first, and only if nothing is found there is type(instance).__dict__ consulted.
In our example, b.x should then be evaluated as:
b.__dict__["x"].__get__(b, type(b)) instead of
type(b).__dict__['x'].__get__(b, type(b))
Is this understanding correct? Or am I going wrong somewhere in interpretation?
Any explanation would be helpful.
Thanks.
I am adding the second part as well:
Why do instance attributes not respect the descriptor protocol? For example, referring to the code below:
>>> class Desc(object):
...     def __get__(self, obj, type):
...         return 1000
...     def __set__(self, obj, value):
...         raise AttributeError
...
>>>
>>> class Test(object):
...     def __init__(self, num):
...         self.num = num
...         self.desc = Desc()
...
>>>
>>> t = Test(10)
>>> print "Desc details are ", t.desc
Desc details are <__main__.Desc object at 0x7f746d647890>
Thanks for helping me out.

Your understanding is incorrect. x most likely does not appear in the instance's dict at all; the descriptor object appears in the class's dict or the dict of one of the superclasses.
Let's use an example:
class Foo(object):
    @property
    def x(self):
        return 0
    def y(self):
        return 1
x = Foo()
x.__dict__['x'] = 2
x.__dict__['y'] = 3
Foo.x and Foo.y are both descriptors. (Properties and functions both implement the descriptor protocol.)
When we access x.x:
>>> x.x
0
We do not get the value from x's dict. Instead, since Python finds a data descriptor by the name of x in Foo.__dict__, it calls
Foo.__dict__['x'].__get__(x, Foo)
and returns the result. The data descriptor wins over the instance dict.
On the other hand, if we try x.y:
>>> x.y
3
we get 3, rather than a bound method object. Functions don't have __set__ or __delete__, so the instance dict overrides them.
As for the new Part 2 to your question, descriptors don't function in the instance dict. Consider what would happen if they did:
class Foo(object):
    @property
    def bar(self):
        return 4
Foo.bar = 3
If descriptors functioned in the instance dict, then the assignment to Foo.bar would find a descriptor in Foo's dict and call Foo.__dict__['bar'].__set__. The __set__ method of the descriptor would have to handle setting the attribute on both the class and the instance, and it would have to tell the difference somehow, even in the face of metaclasses. There just isn't a compelling reason to complicate the protocol this way.
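To make the Part 2 point concrete, here is a minimal sketch written for Python 3. The names Thousand, OnClass and OnInstance are made up for illustration. It shows that a descriptor is only invoked when it is found on the type, and that the attribute access then matches the explicit __get__ call quoted from the documentation.

```python
class Thousand:
    """A data descriptor: it defines both __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        return 1000

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class OnClass:
    desc = Thousand()  # stored in OnClass.__dict__, so the protocol applies


class OnInstance:
    def __init__(self):
        # Stored in the instance __dict__, so it is just an ordinary value.
        self.desc = Thousand()


c = OnClass()
i = OnInstance()

print(c.desc)                                         # 1000
print(type(c).__dict__["desc"].__get__(c, type(c)))   # 1000
print(isinstance(i.desc, Thousand))                   # True
```

In the second class the lookup finds nothing named desc on the type, falls back to the instance dict, and returns the Thousand object itself without calling its __get__.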

Related

Is it possible to make the output of `type` return a different class?

So disclaimer: this question has piqued my curiosity a bit, and I'm asking this for purely educational purposes. More of a challenge for the Python gurus here I suppose!
Is it possible to make the output of type(foo) return a different value than the actual instance class? i.e. can it pose as an imposter and pass a check such as type(Foo()) is Bar?
@juanpa.arrivillaga made a suggestion of manually re-assigning __class__ on the instance, but that changes how all other methods are called, e.g.
class Foo:
    def test(self):
        return 1
class Bar:
    def test(self):
        return 2
foo = Foo()
foo.__class__ = Bar
print(type(foo) is Bar)
print(foo.test())
>>> True
>>> 2
The desired outputs would be True, 1, i.e. the class returned by type differs from the instance's real class, and the instance methods defined in the real class still get invoked.
No - the __class__ attribute is a fundamental piece of information in the layout of all Python objects as "seen" at the C API level itself. And that is what is checked by the call to type.
That means: every Python object has a slot in its in-memory layout with space for a single pointer to the Python object that is that object's class.
Even if you use ctypes or other means to override the protection on that slot and change it from Python code (since modifying obj.__class__ with = is guarded at the C level), changing it effectively changes the object's type: the value in the __class__ slot IS the object's class, and the test method would be picked from the class stored there (Bar in your example).
However, there is more to it: in all documentation, type(obj) is regarded as equivalent to obj.__class__. But if the object's class defines a descriptor named __class__, that descriptor is used when one writes obj.__class__. type(obj), on the other hand, checks the instance's __class__ slot directly and returns the true class.
So, this can "lie" to code using obj.__class__, but not type(obj):
class Bar:
    def test(self):
        return 2
class Foo:
    def test(self):
        return 1
    @property
    def __class__(self):
        return Bar
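The snippet above can be completed into a runnable check. This is a sketch assuming Python 3; the isinstance line relies on the fact that isinstance also consults obj.__class__ when the real type does not match.

```python
class Bar:
    def test(self):
        return 2


class Foo:
    def test(self):
        return 1

    @property
    def __class__(self):
        # Only affects the spelled-out attribute access foo.__class__.
        return Bar


foo = Foo()
print(foo.__class__ is Bar)   # True: the property answers
print(type(foo) is Foo)       # True: type() reads the real slot
print(foo.test())             # 1: methods still come from Foo
print(isinstance(foo, Bar))   # True: isinstance falls back to __class__
```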
Property on the metaclass
Trying to mess with creating a __class__ descriptor on the metaclass of Foo itself will be messy -- both type(Foo()) and repr(Foo()) will report an instance of Bar, but the "real" object class will be Foo. In a sense, yes, it makes type(Foo()) lie, but not in the way you were thinking about - type(Foo()) will output the repr of Bar(), but it is Foo's repr that is messed up, due to implementation details inside type.__call__:
In [73]: class M(type):
    ...:     @property
    ...:     def __class__(cls):
    ...:         return Bar
    ...:
In [74]: class Foo(metaclass=M):
    ...:     def test(self):
    ...:         return 1
    ...:
In [75]: type(Foo())
Out[75]: <__main__.Bar at 0x55665b000578>
In [76]: type(Foo()) is Bar
Out[76]: False
In [77]: type(Foo()) is Foo
Out[77]: True
In [78]: Foo
Out[78]: <__main__.Bar at 0x55665b000578>
In [79]: Foo().test()
Out[79]: 1
In [80]: Bar().test()
Out[80]: 2
In [81]: type(Foo())().test()
Out[81]: 1
Modifying type itself
Since no one "imports" type from anywhere and everyone just uses the built-in type itself, it is possible to monkeypatch the builtin type callable to report a false class, and it will work for all Python code in the same process that relies on the call to type:
# __builtins__ may be a dict or a module, so handle both cases
original_type = __builtins__["type"] if isinstance(__builtins__, dict) else __builtins__.type
def type(obj_or_name, bases=None, attrs=None, **kwargs):
    # Three-argument form: create a class as usual
    if bases is not None:
        return original_type(obj_or_name, bases, attrs, **kwargs)
    # One-argument form: report the fake class if one is declared
    if hasattr(obj_or_name, "__fakeclass__"):
        return getattr(obj_or_name, "__fakeclass__")
    return original_type(obj_or_name)
if isinstance(__builtins__, dict):
    __builtins__["type"] = type
else:
    __builtins__.type = type
del type
There is one trick here I had not found in the docs: when accessing __builtins__ in an imported module, it works as a dictionary. However, in the __main__ module, as in Python's REPL or IPython, it is a module, so retrieving the original type and writing the modified version to __builtins__ has to take that into account. The code above works both ways.
And testing this (I imported the snippet above from a .py file on disk):
>>> class Bar:
...     def test(self):
...         return 2
...
>>> class Foo:
...     def test(self):
...         return 1
...     __fakeclass__ = Bar
...
>>> type(Foo())
<class '__main__.Bar'>
>>>
>>> Foo().__class__
<class '__main__.Foo'>
>>> Foo().test()
1
Although this works for demonstration purposes, replacing the built-in type caused "dissonances" that proved fatal in a more complex environment such as IPython: IPython will crash and terminate immediately if the snippet above is run.

how to make a copy of a class in python?

I have a class A
class A(object):
    a = 1
    def __init__(self):
        self.b = 10
    def foo(self):
        print type(self).a
        print self.b
Then I want to create a class B that is equivalent to A but has a different name and a different value for the class member a.
This is what I have tried:
class A(object):
    a = 1
    def __init__(self):
        self.b = 10
    def foo(self):
        print type(self).a
        print self.b
A_dummy = type('A_dummy',(object,),{})
A_attrs = {attr:getattr(A,attr) for attr in dir(A) if (not attr in dir(A_dummy))}
B = type('B',(object,),A_attrs)
B.a = 2
a = A()
a.foo()
b = B()
b.foo()
However, I got an error:
  File "test.py", line 31, in main
    b.foo()
TypeError: unbound method foo() must be called with A instance as first argument (got nothing instead)
So how can I handle this sort of job (creating a copy of an existing class)? Maybe a metaclass is needed? What I would prefer, though, is just a function FooCopyClass, such that:
B = FooCopyClass('B',A)
A.a = 10
B.a = 100
print A.a # get 10 as output
print B.a # get 100 as output
In this case, modifying a class member of B won't influence A, and vice versa.
The problem you're encountering is that looking up a method attribute on a Python 2 class creates an unbound method; it doesn't return the underlying raw function (on Python 3, unbound methods are abolished, and what you're attempting would work just fine). You need to bypass the descriptor protocol machinery that converts from function to unbound method. The easiest way is to use vars to grab the class's attribute dictionary directly:
# Make copy of A's attributes
Bvars = vars(A).copy()
# Modify the desired attribute
Bvars['a'] = 2
# Construct the new class from it
B = type('B', (object,), Bvars)
Equivalently, you could copy and initialize B in one step, then reassign B.a after:
# Still need to copy; can't initialize from the proxy type vars(SOMECLASS)
# returns to protect the class internals
B = type('B', (object,), vars(A).copy())
B.a = 2
Or for slightly non-idiomatic one-liner fun:
B = type('B', (object,), dict(vars(A), a=2))
Either way, when you're done:
B().foo()
will output:
2
10
as expected.
You may be trying to (1) create copies of classes for some real application: in that case, note that copy.deepcopy will not do it, since the copy module treats classes as atomic and returns the original class object unchanged. Build the new class with type() from a copy of the class __dict__ instead, as shown further down. That works in both Python 2 and Python 3.
(2) learn and understand Python's internal class organization: in that case, there is no reason to fight with Python 2, as some wrinkles there were fixed in Python 3.
In any case, if you use dir to fetch a class's attributes, you will end up with more than you want, as dir also retrieves the methods and attributes of all superclasses. So, even if your method is made to work (in Python 2 that means getting the .im_func attribute of the retrieved unbound methods, to use as raw functions when creating the new class), your class would have more methods than the original one.
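Actually, in both Python 2 and Python 3, copying the class __dict__ will suffice. If you want mutable objects that are class attributes not to be shared, deep-copy the attribute values. In Python 3:
from copy import deepcopy
class A(object):
    b = []
    def foo(self):
        print(self.b)
def copy_class(cls, new_name):
    # vars(cls) is a read-only mappingproxy that deepcopy cannot handle,
    # so copy the values one by one; the per-class __dict__ and
    # __weakref__ descriptors are skipped because type() creates new ones
    attrs = {name: deepcopy(value) for name, value in vars(cls).items()
             if name not in ('__dict__', '__weakref__')}
    return type(new_name, cls.__bases__, attrs)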
In Python 2, it would work almost the same, but there is no convenient way to get the explicit bases of an existing class (i.e. __bases__ is not set). You can use __mro__ for the same effect. The only thing is that all ancestor classes are passed in a hardcoded order as bases of the new class, and in a complex hierarchy you could have differences between the behaviors of B descendants and A descendants if multiple-inheritance is used.

Grandchild inheriting from Parent class - Python

I am learning all about Python classes and I have a lot of ground to cover.
I came across an example that got me a bit confused.
These are the parent classes:
class X: pass
class Y: pass
class Z: pass
Child classes are:
class A(X, Y): pass
class B(Y, Z): pass
Grandchild class is:
class M(A, B, Z): pass
Doesn't class M already inherit class Z through class B? What would be the reason for this type of structure? Class M would just ignore the second time class Z is inherited, wouldn't it, or am I missing something?
Or would class M just inherit class Z's attributes twice (redundantly)?
No, there are no "duplicated" attributes. Python performs a linearization called the Method Resolution Order (MRO), as is, for instance, explained here. You are, however, correct that adding Z to the list here does not change anything.
Python first constructs the MROs of the parents, so:
MRO(X) = (X,object)
MRO(Y) = (Y,object)
MRO(Z) = (Z,object)
MRO(A) = (A,X,Y,object)
MRO(B) = (B,Y,Z,object)
and then it constructs the MRO for M by merging:
MRO(M) = (M,)+merge((A,X,Y,object),(B,Y,Z,object),(Z,object))
= (M,A,X,B,Y,Z,object)
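The linearization above can be checked directly through the __mro__ attribute. In this sketch the empty class bodies and the extra class M2 (the same hierarchy without the explicit Z) are added only for illustration.

```python
class X: pass
class Y: pass
class Z: pass

class A(X, Y): pass
class B(Y, Z): pass

class M(A, B, Z): pass
class M2(A, B): pass   # hypothetical variant without the explicit Z

print([cls.__name__ for cls in M.__mro__])
# ['M', 'A', 'X', 'B', 'Y', 'Z', 'object']

# Apart from the class itself, both variants resolve in the same order.
print(M.__mro__[1:] == M2.__mro__[1:])   # True
```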
Now each time you look up an attribute or call a method, Python will first check whether the attribute is in the internal dictionary self.__dict__ of that object. If not, Python will walk through the MRO and attempt to find an attribute with that name. As soon as it finds one, it stops searching.
Finally, super() is a proxy object that does the same resolution, but starts in the MRO just after the class in which it is used. So in this case if you have:
class B(Y, Z):
    def foo(self):
        super().bar()
and you construct an object m = M() and call m.foo(), then, given that the foo() of B is called, super().bar will first attempt to find a bar in Y; if that fails, it will look for a bar in Z and finally in object.
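A small runnable version of this scenario, as a sketch: the bar method on Z and its return value are invented for the example, and the other class bodies are left empty.

```python
class X: pass
class Y: pass

class Z:
    def bar(self):
        return "Z.bar"

class A(X, Y): pass

class B(Y, Z):
    def foo(self):
        # Continues the search just after B in type(self).__mro__.
        return super().bar()

class M(A, B, Z): pass

m = M()
print(m.foo())   # Z.bar (Y has no bar, so the search moves on to Z)
```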
Attributes are not inherited twice. If you add an attribute like:
self.qux = 1425
then it is simply added to the internal self.__dict__ dictionary of that object.
Stating Z explicitly can, however, be beneficial if the designer of B is not sure whether Z is a real requirement: in that case you know for sure that Z will still be in the MRO if B is altered.
Apart from what @Willem has mentioned, I would like to add that what you are talking about is the multiple inheritance problem. In Python, object instantiation is a bit different from other languages like Java. Here, object instantiation is divided into two parts: object creation (using the __new__ method) and object initialization (using the __init__ method). Moreover, an instance of a child class will not necessarily have the parent class's instance attributes. It gets the attributes set in the parent's __init__ only if the parent class constructor is invoked from the child class (explicitly, when the child defines its own __init__).
>>> class A(object):
...     def __init__(self):
...         self.a = 23
...
>>> class B(A):
...     def __init__(self):
...         self.b = 33
...
>>> class C(A):
...     def __init__(self):
...         self.c = 44
...         super(C, self).__init__()
...
>>> a = A()
>>> b = B()
>>> c = C()
>>> print (a.a)
23
>>> print (b.a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'B' object has no attribute 'a'
>>> print (c.a)
23
In the above code snippet, B does not invoke A's __init__ method, so its instances don't have a as a member variable, despite the fact that B inherits from class A. This is not the case in a language like Java, where a class has a fixed template of attributes. This is how Python differs from other languages.
The attributes that an object has are stored in the __dict__ member of the object, and it is the __getattribute__ magic method of the object class that implements attribute lookup according to the MRO described by Willem. You can use the vars() and dir() functions to introspect an instance.

How to make one member of class to be both field and method?

I have a class A which extends B, and B has a method count(). Now I want to allow the user to call both A.count and A.count(). A.count means count is a field of A, while A.count() means it is the method derived from B.
This is impossible in Python, and here's why:
You can always assign a method (or really any function) to a variable and call it later.
hello = some_function
hello()
is semantically identical to
some_function()
So what would happen if you had an object of your class A called x:
x = A()
foo = x.count
foo()
The only way you could do this is by storing a special object in x.count that is callable and also turns into e.g. an integer when used in that way, but that is horrible and doesn't actually work according to specification.
As I said, it's not exactly impossible, as other answers have shown. Let's see a didactic example:
class A(object):
    class COUNT(object):
        __val = 12345
        def __call__(self, *args, **kwargs):
            return self.__val
        def __getattr__(self, item):
            return self.__val
        def __str__(self):
            return str(self.__val)
    count = COUNT()
if __name__ == '__main__':
    your_inst = A()
    print(your_inst.count)
    # outputs: 12345
    print(your_inst.count())
    # outputs: 12345
As you may notice, you need to implement a series of things to get that kind of behaviour. First, your class has to implement the attribute count not as the value type you intend, but as an instance of another class. That class has to implement, among other things (to make it behave, by duck typing, like the type you intend), the __call__ method, which should return the same value that the object stands for when used as a plain attribute. That way, the public attribute count answers the same way whether it is used as a callable (your_inst.count()) or, as you call it, a field (your_inst.count).
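A shorter way to get a similar effect, offered as a sketch rather than a recommendation: subclass int so the value behaves like a real number, and add __call__ on top. CallableInt is a made-up name, and B.count returning 12345 is an assumption standing in for the real method.

```python
class CallableInt(int):
    """An int that returns its own value when called (illustrative only)."""

    def __call__(self):
        return int(self)


class B:
    def count(self):
        return 12345


class A(B):
    @property
    def count(self):
        # Reuse B's method for the value, then wrap it so it is callable.
        return CallableInt(super().count())


a = A()
print(a.count)       # 12345
print(a.count())     # 12345
print(a.count + 1)   # 12346
```

Because the wrapper is an int subclass, arithmetic and comparisons work without implementing each operator by hand, which is the part the COUNT class above leaves out.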
By the way, I don't know if the following is clear to you or not, but it may help you understand why it isn't as trivial as one may think to make count and count() behave the same way:
class A(object):
    def count(self):
        return 123
if __name__ == '__main__':
    a = A()
    print(type(a.count))
    # outputs: <class 'method'>
    print(type(a.count()))
    # outputs: <class 'int'>
The dot invokes the class's __getattribute__ to look up the item count. a.count returns a reference to the bound method (Python's functions are first-class objects); a.count() does the same, but the parentheses then invoke the __call__ method of a.count.

Best of two ways to declare a class variable in Python

The way I usually declare a class variable to be used in instances in Python is the following:
class MyClass(object):
    def __init__(self):
        self.a_member = 0
my_object = MyClass()
my_object.a_member # evaluates to 0
But the following also works. Is it bad practice? If so, why?
class MyClass(object):
    a_member = 0
my_object = MyClass()
my_object.a_member # also evaluates to 0
The second method is used all over Zope, but I haven't seen it anywhere else. Why is that?
Edit: as a response to sr2222's answer. I understand that the two are essentially different. However, if the class is only ever used to instantiate objects, the two will work the same way. So is it bad to use a class variable as an instance variable? It feels like it would be but I can't explain why.
The question is whether this is an attribute of the class itself or of a particular object. If the whole class of things has a certain attribute (possibly with minor exceptions), then by all means, assign an attribute onto the class. If some strange objects, or subclasses differ in this attribute, they can override it as necessary. Also, this is more memory-efficient than assigning an essentially constant attribute onto every object; only the class's __dict__ has a single entry for that attribute, and the __dict__ of each object may remain empty (at least for that particular attribute).
In short, both of your examples are quite idiomatic code, but they mean somewhat different things, both at the machine level, and at the human semantic level.
Let me explain this:
>>> class MyClass(object):
...     a_member = 'a'
...
>>> o = MyClass()
>>> p = MyClass()
>>> o.a_member
'a'
>>> p.a_member
'a'
>>> o.a_member = 'b'
>>> p.a_member
'a'
On line two, you're setting a "class attribute". This is literally an attribute of the object named "MyClass". It is stored as MyClass.__dict__['a_member'] = 'a'. On later lines, you're setting the object attribute o.a_member to 'b'. This is completely equivalent to o.__dict__['a_member'] = 'b'. You can see that this has nothing to do with the separate dictionary p.__dict__. When accessing a_member of p, it is not found in the object dictionary, and the lookup is deferred up to its class dictionary: MyClass.a_member. This is why modifying the attributes of o does not affect the attributes of p: it doesn't affect the attributes of MyClass.
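The same session can be inspected through the dictionaries themselves. A minimal sketch in Python 3, using vars() to show each instance's own __dict__:

```python
class MyClass:
    a_member = "a"


o = MyClass()
p = MyClass()
o.a_member = "b"   # creates an entry in o's own dictionary

print(vars(o))                          # {'a_member': 'b'}
print(vars(p))                          # {}
print(MyClass.__dict__["a_member"])     # a
print(p.a_member is MyClass.a_member)   # True: p falls back to the class
```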
The first is an instance attribute, the second a class attribute. They are not the same at all. An instance attribute is attached to an actual created object of the type whereas the class variable is attached to the class (the type) itself.
>>> class A(object):
...     cls_attr = 'a'
...     def __init__(self, x):
...         self.ins_attr = x
...
>>> a1 = A(1)
>>> a2 = A(2)
>>> a1.cls_attr
'a'
>>> a2.cls_attr
'a'
>>> a1.ins_attr
1
>>> a2.ins_attr
2
>>> a1.__class__.cls_attr = 'b'
>>> a2.cls_attr
'b'
>>> a1.ins_attr = 3
>>> a2.ins_attr
2
Even if you are never modifying the objects' contents, the two are not interchangeable. The way I understand it, accessing class attributes is slightly slower than accessing instance attributes, because the interpreter essentially has to take an extra step to look up the class attribute.
Instance attribute
"What's a.thing?"
Class attribute
"What's a.thing? Oh, a has no instance attribute thing, I'll check its class..."
I have my answer! I owe it to @mjgpy3's reference in the comment on the original post. The difference comes if the value assigned to the class variable is MUTABLE! THEN, the two will be changed together. The members split when a new value replaces the old one.
>>> class MyClass(object):
...     my_str = 'a'
...     my_list = []
...
>>> a1, a2 = MyClass(), MyClass()
>>> a1.my_str # This is the CLASS variable.
'a'
>>> a2.my_str # This is the exact same class variable.
'a'
>>> a1.my_str = 'b' # This is a completely new instance variable. Strings are not mutable.
>>> a2.my_str # This is still the old, unchanged class variable.
'a'
>>> a1.my_list.append('w') # We're changing the mutable class variable, but not reassigning it.
>>> a2.my_list # This is the same old class variable, but with a new value.
['w']
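When sharing is not wanted, the usual way out is to create the mutable object in __init__ so each instance gets its own. A short sketch for comparison; the class names Shared and PerInstance are made up for the example.

```python
class Shared:
    items = []             # one list object, stored on the class


class PerInstance:
    def __init__(self):
        self.items = []    # a fresh list for every instance


s1, s2 = Shared(), Shared()
s1.items.append("w")
print(s2.items)            # ['w']

p1, p2 = PerInstance(), PerInstance()
p1.items.append("w")
print(p2.items)            # []
```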
Edit: this is pretty much what bukzor wrote. They get the best answer mark.
