How does Python access an objects attribute when __dict__ is overridden?

How does Python access an objects attribute when __dict__ is overridden? - python

The Python docs, specify that attribute access is done via the objects __dict__ attribute.
The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the base classes of type(a) excluding metaclasses. If the looked-up value is an object defining one of the descriptor methods, then Python may override the default behavior and invoke the descriptor method instead. Where this occurs in the precedence chain depends on which descriptor methods were defined.
In this example, I override the object's __dict__, but python can still manage to get its attribute, how does that work?:
In [1]: class Foo:
...: def __init__(self):
...: self.a = 5
...: self.b = 3
...: def __dict__(self):
...: return 'asdasd'
...:
In [2]: obj_foo = Foo()
In [3]: obj_foo.__dict__
Out[3]: <bound method Foo.__dict__ of <__main__.Foo object at 0x111056850>>
In [4]: obj_foo.__dict__()
Out[4]: 'asdasd'
In [5]: obj_foo.b
Out[5]: 3

You are implementing the __dict__ method.
If you override the __dict__ attribute you will succeed in breaking Python:
class Foo:
def __init__(self):
self.a = 1
self.__dict__ = {}
Foo().a
AttributeError: 'Foo' object has no attribute 'a'
Interestingly enough, if both are overridden everything is back to normal:
class Foo:
def __init__(self):
self.a = 1
self.__dict__ = {}
def __dict__(self):
pass
print(Foo().a)
Outputs
1
Making __dict__ a property also does not seem to break anything:
class Foo:
def __init__(self):
self.a = 1
#property
def __dict__(self):
return {}
print(Foo().a)
Also outputs 1.
I suspect the answer to the question *why* lies (probably) deep in the CPython implementation of `object`, more digging needs to be done.

Related

Some problems about Python inherited classmethod

I have this code:
from typing import Callable, Any
class Test(classmethod):
def __init__(self, f: Callable[..., Any]):
super().__init__(f)
def __get__(self,*args,**kwargs):
print(args) # why out put is (None, <class '__main__.A'>) where form none why no parameter 123
# where was it called
return super().__get__(*args,**kwargs)
class A:
#Test
def b(cls,v_b):
print(cls,v_b)
A.b(123)
Why the output is (None, <class '__main__.A'>)? Where did None come form and why is it not the parameter 123, which is the value I called it with?

The __get__ method is called when the method b is retrieved from the A class. It has nothing to do with the actual calling of b.
To illustrate this, separate the access to b from the actual call of b:
print("Getting a reference to method A.b")
method = A.b
print("I have a reference to the method now. Let's call it.")
method()
This results in this output:
Getting a reference to method A.b
(None, <class '__main__.A'>)
I have a reference to the method now. Let's call it.
<class '__main__.A'> 123
So you see, it is normal that the output in __get__ does not show anything about the argument you call b with, because you haven't made the call yet.
The output None, <class '__main__.A'> is in line with the Python documentation on __get__:
object.__get__(self, instance, owner=None)
Called to get the attribute of the owner class (class attribute access) or of an instance of that class (instance attribute access). The optional owner argument is the owner class, while instance is the instance that the attribute was accessed through, or None when the attribute is accessed through the owner.
In your case you are using it for accessing an attribute (b) of a class (A) -- not of an instance of A -- so that explains the instance argument is None and the owner argument is your class A.
The second output, made with print(cls,v_b), will print <class '__main__.A'> for cls, because that is what happens when you call class methods (as opposed to instance methods). Again, from the documentation:
When a class attribute reference (for class C, say) would yield a class method object, it is transformed into an instance method object whose __self__ attribute is C.
Your case is described here, where A is the class, and so the first parameter (which you called cls) will get as value A.

You can apply multiple decorators on the same function, for example,
first (and outer) decorator could be a classmethod
and the second (doing your stuff) could define a wrapper, where you could accept your arguments as usual
In [4]: def test_deco(func):
...: def wrapper(cls, *args, **kwds):
...: print("cls is", cls)
...: print("That's where 123 should appear>>>", args, kwds)
...: return func(cls, *args, **kwds)
...:
...: return wrapper
...:
...:
...: class A:
...: #classmethod
...: #test_deco
...: def b(cls, v_b):
...: print("That's where 123 will appear as well>>>", v_b)
...:
...:
...: A.b(123)
cls is <class '__main__.A'>
That's where 123 should appear>>> (123,) {}
That's where 123 will appear as well>>> 123
In [5]:
It's too much trouble to use two at a time i want only use one like it
It is possible to define a decorator applying a couple of other decorators:
def my_super_decorator_doing_everything_at_once(func):
return classmethod(my_small_decorator_doing_almost_everything(func))
That works because decorators notation
#g
#f
def x(): ...
is a readable way to say
def x(): ...
x = g(f(x))

Why are non-overriding descriptors shadowed but overriding descriptors are not? [duplicate]

I am learning about descriptors in python. I want to write a non-data descriptor but the class having the descriptor as its classmethod doesn't call the __get__ special method when I call the classmethod. This is my example (without the __set__):
class D(object):
"The Descriptor"
def __init__(self, x = 1395):
self.x = x
def __get__(self, instance, owner):
print "getting", self.x
return self.x
class C(object):
d = D()
def __init__(self, d):
self.d = d
And here is how I call it:
>>> c = C(4)
>>> c.d
4
The __get__ of the descriptor class gets no call. But when I also set a __set__ the descriptor seems to get activated:
class D(object):
"The Descriptor"
def __init__(self, x = 1395):
self.x = x
def __get__(self, instance, owner):
print "getting", self.x
return self.x
def __set__(self, instance, value):
print "setting", self.x
self.x = value
class C(object):
d = D()
def __init__(self, d):
self.d = d
Now I create a C instance:
>>> c=C(4)
setting 1395
>>> c.d
getting 4
4
and both of __get__, __set__ are present. It seems that I am missing some basic concepts about descriptors and how they can be used. Can anyone explain this behaviour of __get__, __set__?

You successfully created a proper non-data descriptor, but you then mask the d attribute by setting an instance attribute.
Because it is a non-data descriptor, the instance attribute wins in this case. When you add a __set__ method, you turn your descriptor into a data descriptor, and data descriptors are always applied even if there is an instance attribute. (*)
From the Descriptor Howto:
The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the base classes of type(a) excluding metaclasses. If the looked-up value is an object defining one of the descriptor methods, then Python may override the default behavior and invoke the descriptor method instead. Where this occurs in the precedence chain depends on which descriptor methods were defined.
and
If an object defines both __get__() and __set__(), it is considered a data descriptor. Descriptors that only define __get__() are called non-data descriptors (they are typically used for methods but other uses are possible).
Data and non-data descriptors differ in how overrides are calculated with respect to entries in an instance’s dictionary. If an instance’s dictionary has an entry with the same name as a data descriptor, the data descriptor takes precedence. If an instance’s dictionary has an entry with the same name as a non-data descriptor, the dictionary entry takes precedence.
If you remove the d instance attribute (never set it or delete it from the instance), the descriptor object gets invoked:
>>> class D(object):
... def __init__(self, x = 1395):
... self.x = x
... def __get__(self, instance, owner):
... print "getting", self.x
... return self.x
...
>>> class C(object):
... d = D()
...
>>> c = C()
>>> c.d
getting 1395
1395
Add an instance attribute again and the descriptor is ignored because the instance attribute wins:
>>> c.d = 42 # setting an instance attribute
>>> c.d
42
>>> del c.d # deleting it again
>>> c.d
getting 1395
1395
Also see the Invoking Descriptors documentation in the Python Datamodel reference.
(*) Provided the data descriptor implements the __get__ hook. Accessing such a descriptor via instance.attribute_name will return the descriptor object unless 'attribute_name' exists in instance.__dict__.

Can Python classes have members that are accessible, but not from an instance of the class?

So I don't come from a computer science background and I am having trouble googling/SO searching on the right terms to answer this question. If I have a Python class with a class variable objects like so:
class MyClass(object):
objects = None
pass
MyClass.objects = 'test'
print MyClass.objects # outputs 'test'
a = MyClass()
print a.objects # also outputs 'test'
both the class and instances of the class will have access to the objects variable. I understand that I can change the instance value like so:
a.objects = 'bar'
print a.objects # outputs 'bar'
print MyClass.objects # outputs 'test'
but is it possible to have a class variable in Python that is accessible to users of the class (i.e. not just from within the class) but not accessible to the instances of that class? I think this is called a private member or static member in other languages?

Python is designed to allow instances of a class to access that class's attributes through the instance.
This only goes one level deep, so you can use a metaclass:
class T(type):
x = 5
class A(object):
__metaclass__ = T
Note that the metaclass syntax is different in Python 3. This works:
>>> A.x
5
>>> A().x
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'A' object has no attribute 'x'
It doesn't prevent you setting the attribute on instances of the class, though; to prevent that you'd have to play with __setattr__ magic method:
class A(object):
x = 1
def __getattribute__(self, name):
if name == 'x':
raise AttributeError
return super(A, self).__getattribute__(name)
def __setattr__(self, name, value):
if name == 'x':
raise AttributeError
return super(A, self).__setattr__(name, value)
def __delattr__(self, name):
if name == 'x':
raise AttributeError
return super(A, self).__delattr__(name)

The simplest way of achieving it is to use a descriptor. Descriptors are the thing meant for giving a higher level of control over attribute access. For example:
class ClassOnly(object):
def __init__(self, name, value):
self.name = name
self.value = value
def __get__(self, inst, cls):
if inst is not None:
msg = 'Cannot access class attribute {} from an instance'.format(self.name)
raise AttributeError(msg)
return self.value
class A(object):
objects = ClassOnly('objects', [])
Used as:
In [11]: a = A()
In [12]: a.objects
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-12-24afc67fd0ba> in <module>()
----> 1 a.objects
<ipython-input-9-db6510cd313b> in __get__(self, inst, cls)
5 def __get__(self, inst, cls):
6 if inst is not None:
----> 7 raise AttributeError('Cannot access class attribute {} from an instance'.format(self.name))
8 return self.value
AttributeError: Cannot access class attribute objects from an instance
In [13]: A.objects
Out[13]: []

If you want there to be a "single source of truth" for objects, you could make it a mutable type:
class MyClass(object):
objects = []
With immutable types, the fact that each instance starts out with the same reference from MyClass is irrelevant, as the first time that attribute is changed for the instance, it becomes "disconnected" from the class's value.
However, if the attribute is mutable, changing it in an instance changes it for the class and all other instances of the class:
>>> MyClass.objects.append(1)
>>> MyClass.objects
[1]
>>> a = MyClass()
>>> a.objects
[1]
>>> a.objects.append(2)
>>> a.objects
[1, 2]
>>> MyClass.objects
[1, 2]
In Python, nothing is really "private", so you can't really prevent the instances from accessing or altering objects (in that case, is it an appropriate class attribute?), but it is conventional to prepend names with an underscore if you don't ordinarily want them to be accessed directly: _objects.
One way to actually protect objects from instance access would be to override __getattribute__:
def __getattribute__(self, name):
if name == "objects":
raise AttributeError("Do not access 'objects' though MyClass instances.")
return super(MyClass, self).__getattribute__(name)
>>> MyClass.objects
[1]
>>> a.objects
...
AttributeError: Do not access 'objects' though MyClass instances.

No, you can't (EDIT: you can't in a way that is completely unaccessible, like in Java or C++).
You can do this, if you like:
class MyClass(object):
objects = None
pass
MyClass_objects = 'test'
print MyClass_objects # outputs 'test'
a = MyClass()
print a.objects # outputs 'None'
or this:
in your_module.py:
objects = 'test'
class MyClass(object):
objects = None
pass
in yourapp.py:
import your_module
print your_module.objects # outputs 'test'
a = your_module.MyClass()
print a.objects # outputs 'None'
the reason is:
When you create an instance of some class there is nothing to prevent
you from poking around inside and using various internal, private
methods that are (a) necessary for the class to function, BUT (b) not
intended for direct use/access.
Nothing is really private in python. No class or class instance can
keep you away from all what's inside (this makes introspection
possible and powerful). Python trusts you. It says "hey, if you want
to go poking around in dark places, I'm gonna trust that you've got a
good reason and you're not making trouble."
Karl Fast

Why descriptors (in Python) cannot be inserted (placed) in ALREADY initialized object?

In a process of coding I have been faced with the need to change behavior of object's property (but NOT a class property). And I found that already initialized object cannot be patched with descriptor. So why?
code examples
class A(object):
pass
class D(object):
def __init__(self, fget=None, fset=None, fdel=None, doc=None):
pass
def __get__(self, obj, objtype=None):
return 5
A.x = D()
A.x
Out[12]: 5 # work!
a = A()
a.y = D()
a.y
Out[14]: <__main__.D at 0x13a8d90> # not work!

From the documentation.
For objects, the machinery is in object.__getattribute__() which transforms b.x into type(b).__dict__['x'].__get__(b, type(b)).
That is, the attribute lookup on an instance (b) is converted into a descriptor call on the class (type(b)). Descriptors operate at the class level.
As for why this is true, it's because descriptors are basically a way to do method-like work (i.e., call a method) on attribute lookup. And methods are essentially class-level behavior: you generally define the methods you want on a class, and you don't add extra methods to individual instances. Doing descriptor lookup on an instance would be like defining a method on an instance.
Now, it is possible to assign new methods to instances, and it's also possible to get descriptors to work on instances of a particular class. You just have to do extra work. As the documentation quote above says, the machinery is in object.__getattribute__, so you can override it by defining a custom __getattribute__ on your class:
class Foo(object):
def __getattribute__(self, attr):
myDict = object.__getattribute__(self, '__dict__')
if attr in myDict and hasattr(myDict[attr], '__get__'):
return myDict[attr].__get__(self, type(self))
else:
return super(Foo, self).__getattribute__(attr)
class D(object):
def __init__(self, fget=None, fset=None, fdel=None, doc=None):
pass
def __get__(self, obj, objtype=None):
return 5
And then:
>>> f = Foo()
>>> f.x = D()
>>> f.x
5
So if you feel the need to do this, you can make it happen. It's just not the default behavior, simply because it's not what descriptors were designed to do.

How to keep track of class instances?

Toward the end of a program I'm looking to load a specific variable from all the instances of a class into a dictionary.
For example:
class Foo():
def __init__(self):
self.x = {}
foo1 = Foo()
foo2 = Foo()
...
Let's say the number of instances will vary and I want the x dict from each instance of Foo() loaded into a new dict. How would I do that?
The examples I've seen in SO assume one already has the list of instances.

One way to keep track of instances is with a class variable:
class A(object):
instances = []
def __init__(self, foo):
self.foo = foo
A.instances.append(self)
At the end of the program, you can create your dict like this:
foo_vars = {id(instance): instance.foo for instance in A.instances}
There is only one list:
>>> a = A(1)
>>> b = A(2)
>>> A.instances
[<__main__.A object at 0x1004d44d0>, <__main__.A object at 0x1004d4510>]
>>> id(A.instances)
4299683456
>>> id(a.instances)
4299683456
>>> id(b.instances)
4299683456

#JoelCornett's answer covers the basics perfectly. This is a slightly more complicated version, which might help with a few subtle issues.
If you want to be able to access all the "live" instances of a given class, subclass the following (or include equivalent code in your own base class):
from weakref import WeakSet
class base(object):
def __new__(cls, *args, **kwargs):
instance = object.__new__(cls, *args, **kwargs)
if "instances" not in cls.__dict__:
cls.instances = WeakSet()
cls.instances.add(instance)
return instance
This addresses two possible issues with the simpler implementation that #JoelCornett presented:
Each subclass of base will keep track of its own instances separately. You won't get subclass instances in a parent class's instance list, and one subclass will never stumble over instances of a sibling subclass. This might be undesirable, depending on your use case, but it's probably easier to merge the sets back together than it is to split them apart.
The instances set uses weak references to the class's instances, so if you del or reassign all the other references to an instance elsewhere in your code, the bookkeeping code will not prevent it from being garbage collected. Again, this might not be desirable for some use cases, but it is easy enough to use regular sets (or lists) instead of a weakset if you really want every instance to last forever.
Some handy-dandy test output (with the instances sets always being passed to list only because they don't print out nicely):
>>> b = base()
>>> list(base.instances)
[<__main__.base object at 0x00000000026067F0>]
>>> class foo(base):
... pass
...
>>> f = foo()
>>> list(foo.instances)
[<__main__.foo object at 0x0000000002606898>]
>>> list(base.instances)
[<__main__.base object at 0x00000000026067F0>]
>>> del f
>>> list(foo.instances)
[]

You would probably want to use weak references to your instances. Otherwise the class could likely end up keeping track of instances that were meant to have been deleted. A weakref.WeakSet will automatically remove any dead instances from its set.
One way to keep track of instances is with a class variable:
import weakref
class A(object):
instances = weakref.WeakSet()
def __init__(self, foo):
self.foo = foo
A.instances.add(self)
#classmethod
def get_instances(cls):
return list(A.instances) #Returns list of all current instances
At the end of the program, you can create your dict like this:
foo_vars = {id(instance): instance.foo for instance in A.instances}
There is only one list:
>>> a = A(1)
>>> b = A(2)
>>> A.get_instances()
[<inst.A object at 0x100587290>, <inst.A object at 0x100587250>]
>>> id(A.instances)
4299861712
>>> id(a.instances)
4299861712
>>> id(b.instances)
4299861712
>>> a = A(3) #original a will be dereferenced and replaced with new instance
>>> A.get_instances()
[<inst.A object at 0x100587290>, <inst.A object at 0x1005872d0>]

You can also solve this problem using a metaclass:
When a class is created (__init__ method of metaclass), add a new instance registry
When a new instance of this class is created (__call__ method of metaclass), add it to the instance registry.
The advantage of this approach is that each class has a registry - even if no instance exists. In contrast, when overriding __new__ (as in Blckknght's answer), the registry is added when the first instance is created.
class MetaInstanceRegistry(type):
"""Metaclass providing an instance registry"""
def __init__(cls, name, bases, attrs):
# Create class
super(MetaInstanceRegistry, cls).__init__(name, bases, attrs)
# Initialize fresh instance storage
cls._instances = weakref.WeakSet()
def __call__(cls, *args, **kwargs):
# Create instance (calls __init__ and __new__ methods)
inst = super(MetaInstanceRegistry, cls).__call__(*args, **kwargs)
# Store weak reference to instance. WeakSet will automatically remove
# references to objects that have been garbage collected
cls._instances.add(inst)
return inst
def _get_instances(cls, recursive=False):
"""Get all instances of this class in the registry. If recursive=True
search subclasses recursively"""
instances = list(cls._instances)
if recursive:
for Child in cls.__subclasses__():
instances += Child._get_instances(recursive=recursive)
# Remove duplicates from multiple inheritance.
return list(set(instances))
Usage: Create a registry and subclass it.
class Registry(object):
__metaclass__ = MetaInstanceRegistry
class Base(Registry):
def __init__(self, x):
self.x = x
class A(Base):
pass
class B(Base):
pass
class C(B):
pass
a = A(x=1)
a2 = A(2)
b = B(x=3)
c = C(4)
for cls in [Base, A, B, C]:
print cls.__name__
print cls._get_instances()
print cls._get_instances(recursive=True)
print
del c
print C._get_instances()
If using abstract base classes from the abc module, just subclass abc.ABCMeta to avoid metaclass conflicts:
from abc import ABCMeta, abstractmethod
class ABCMetaInstanceRegistry(MetaInstanceRegistry, ABCMeta):
pass
class ABCRegistry(object):
__metaclass__ = ABCMetaInstanceRegistry
class ABCBase(ABCRegistry):
__metaclass__ = ABCMeta
#abstractmethod
def f(self):
pass
class E(ABCBase):
def __init__(self, x):
self.x = x
def f(self):
return self.x
e = E(x=5)
print E._get_instances()

Another option for quick low-level hacks and debugging is to filter the list of objects returned by gc.get_objects() and generate the dictionary on the fly that way. In CPython that function will return you a (generally huge) list of everything the garbage collector knows about, so it will definitely contain all of the instances of any particular user-defined class.
Note that this is digging a bit into the internals of the interpreter, so it may or may not work (or work well) with the likes of Jython, PyPy, IronPython, etc. I haven't checked. It's also likely to be really slow regardless. Use with caution/YMMV/etc.
However, I imagine that some people running into this question might eventually want to do this sort of thing as a one-off to figure out what's going on with the runtime state of some slice of code that's behaving strangely. This method has the benefit of not affecting the instances or their construction at all, which might be useful if the code in question is coming out of a third-party library or something.

Here's a similar approach to Blckknght's, which works with subclasses as well. Thought this might be of interest, if someone ends up here. One difference, if B is a subclass of A, and b is an instance of B, b will appear in both A.instances and B.instances. As stated by Blckknght, this depends on the use case.
from weakref import WeakSet
class RegisterInstancesMixin:
instances = WeakSet()
def __new__(cls, *args, **kargs):
o = object.__new__(cls, *args, **kargs)
cls._register_instance(o)
return o
#classmethod
def print_instances(cls):
for instance in cls.instances:
print(instance)
#classmethod
def _register_instance(cls, instance):
cls.instances.add(instance)
for b in cls.__bases__:
if issubclass(b, RegisterInstancesMixin):
b._register_instance(instance)
def __init_subclass__(cls):
cls.instances = WeakSet()
class Animal(RegisterInstancesMixin):
pass
class Mammal(Animal):
pass
class Human(Mammal):
pass
class Dog(Mammal):
pass
alice = Human()
bob = Human()
cannelle = Dog()
Animal.print_instances()
Mammal.print_instances()
Human.print_instances()
Animal.print_instances() will print three objects, whereas Human.print_instances() will print two.

Using the answer from #Joel Cornett I've come up with the following, which seems to work. i.e. i'm able to total up object variables.
import os
os.system("clear")
class Foo():
instances = []
def __init__(self):
Foo.instances.append(self)
self.x = 5
class Bar():
def __init__(self):
pass
def testy(self):
self.foo1 = Foo()
self.foo2 = Foo()
self.foo3 = Foo()
foo = Foo()
print Foo.instances
bar = Bar()
bar.testy()
print Foo.instances
x_tot = 0
for inst in Foo.instances:
x_tot += inst.x
print x_tot
output:
[<__main__.Foo instance at 0x108e334d0>]
[<__main__.Foo instance at 0x108e334d0>, <__main__.Foo instance at 0x108e33560>, <__main__.Foo instance at 0x108e335a8>, <__main__.Foo instance at 0x108e335f0>]
5
10
15
20

(For Python)
I have found a way to record the class instances via the "dataclass" decorator while defining a class. Define a class attribute 'instances' (or any other name) as a list of the instances you want to record. Append that list with the 'dict' form of created objects via the dunder method __dict__. Thus, the class attribute 'instances' will record instances in the dict form, which you want.
For example,
from dataclasses import dataclass
#dataclass
class player:
instances=[]
def __init__(self,name,rank):
self.name=name
self.rank=rank
self.instances.append(self.__dict__)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

How does Python access an objects attribute when dict is overridden? - python

Related

Some problems about Python inherited classmethod

Why are non-overriding descriptors shadowed but overriding descriptors are not? [duplicate]

Can Python classes have members that are accessible, but not from an instance of the class?

Why descriptors (in Python) cannot be inserted (placed) in ALREADY initialized object?

How to keep track of class instances?

Categories

Resources