Check if class attribute was defined or derived in given class - python

Example
import inspect
class A:
    foo = 1
class B:
    foo = 2
class C:
    foo = 3
class D(A, B, C):
    pass
def collect_foo(cls):
    foos = []
    for c in inspect.getmro(cls):
        if hasattr(c, 'foo'):
            foos.append(c.foo)
    return foos
Now collect_foo(D) returns [1, 1, 2, 3]; 1 appears twice because D inherits it from A. How can I get only the unique foos? My first idea was to check whether an attribute is declared in a given class or inherited by it. Is that possible, and how?

Just check
'foo' in c.__dict__
instead of
hasattr(c, 'foo')
This will only yield True if the attribute is defined in c itself.
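Applied to the example from the question, the check might look like the sketch below. It uses vars(c), which returns the same mapping as c.__dict__; the classes are the ones from the question, and only the test inside the loop has changed.

```python
import inspect

class A:
    foo = 1

class B:
    foo = 2

class C:
    foo = 3

class D(A, B, C):
    pass

def collect_foo(cls):
    """Collect foo only from classes in the MRO that define it themselves."""
    foos = []
    for c in inspect.getmro(cls):
        # vars(c) is c.__dict__: it holds only names assigned in c's own body,
        # so inherited attributes are skipped.
        if 'foo' in vars(c):
            foos.append(vars(c)['foo'])
    return foos

print(collect_foo(D))  # [1, 2, 3]
```

D contributes nothing because it defines no foo of its own, so the duplicate disappears.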

I believe this will work... Look to see if it is in the __dict__ attribute of the class. But, be sure you really want to do this first.
Example:
if name in cls.__dict__:
    # ... your code here ...
    pass

"The thing is there are some attributes which I don't want to be overriden but mixed in and available for derived class"
This is exactly what name mangling in Python does. Attributes that should not be overridden like this should start with two underscores. That way they don't get overridden, but remain unique for each class.
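A minimal sketch of how name mangling keeps such attributes apart. The class and attribute names here (Base, Derived, __tag) are made up for illustration.

```python
class Base:
    __tag = 'base'            # stored on the class as _Base__tag

    def base_tag(self):
        return self.__tag     # compiled as self._Base__tag

class Derived(Base):
    __tag = 'derived'         # stored as _Derived__tag; Base's value is untouched

    def derived_tag(self):
        return self.__tag     # compiled as self._Derived__tag

d = Derived()
print(d.base_tag(), d.derived_tag())  # base derived
```

Each class keeps its own value because the compiler rewrites the name with the enclosing class's name, so the two attributes never collide.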

I agree with the accepted answer, but a @classmethod did not match my case...
I found it useful to check the class dict from the context of a base class or mix-in, where I only want to execute some code path if it applies to the derived class, such as if a particular method is overridden.
For example (in Python 3.6):
if method_name in self.__class__.__dict__:
    # do something
    pass
I rationalized it as follows:
self is the instance.
__class__ is that instance's actual type.
Therefore, __dict__ only contains the methods actually defined in the derived class itself.
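Here is one way the mix-in check could look in practice. ReportMixin, header and render are hypothetical names chosen for this sketch, not anything from the original code.

```python
class ReportMixin:
    def header(self):
        return 'default header'

    def render(self):
        # True only when the instance's own class defines header itself.
        if 'header' in vars(type(self)):
            return 'custom: ' + self.header()
        return 'plain: ' + self.header()

class Plain(ReportMixin):
    pass

class Fancy(ReportMixin):
    def header(self):
        return 'fancy header'

class FancyChild(Fancy):
    pass

print(Plain().render())       # plain: default header
print(Fancy().render())       # custom: fancy header
print(FancyChild().render())  # plain: fancy header
```

Note the last line: the check only looks at the most-derived class, so an override made in an intermediate class (Fancy) is not detected for FancyChild. Walking type(self).__mro__ would be needed to catch that case.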


dataclasses.dataclass with __init_subclass__

My confusion is with the interplay between dataclasses & __init_subclass__.
I am trying to implement a base class that will exclusively be inherited from. In this example, A is the base class. It is my understanding from reading the python docs on dataclasses that simply adding a decorator should automatically create some special dunder methods for me. Quoting their docs:
For example, this code:
from dataclasses import dataclass
@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0
    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
will add, among other things, a __init__() that looks like:
def __init__(self, name: str, unit_price: float, quantity_on_hand: int = 0):
    self.name = name
    self.unit_price = unit_price
    self.quantity_on_hand = quantity_on_hand
This is an instance variable, no? From the classes docs, it shows a toy example, which reads super clear.
class Dog:
    kind = 'canine'  # class variable shared by all instances
    def __init__(self, name):
        self.name = name  # instance variable unique to each instance
A main gap in my understanding is this: is it an instance variable or a class variable? From my testing below, it is a class variable, but the docs show an instance variable as its approximate implementation. It may be that most of my problem is there. I've also read the Python docs on classes, which do not go into dataclasses.
The problem continues with the seemingly limited docs on __init_subclass__, which yields another gap in my understanding. I am also making use of __init_subclass__, in order to enforce that my subclasses have indeed instantiated the variable x.
Below, we have A, which has an instance variable x set to None. B, C, and D all subclass A, in different ways (hoping) to determine implementation specifics.
B inherits from A, setting a class variable of x.
D is a dataclass, which inherits from A, setting what would appear to be a class variable of x. However, given their docs from above, it seems that the class variable x of D should be created as an instance variable. Thus, when D is created, it should first call __init_subclass__, in that function, it will check to see if x exists in D - by my understanding, it should not; however, the code passes scot-free. I believe D() will create x as an instance variable because the dataclass docs show that this will create an __init__ for the user.
"will add, among other things..." <insert __init__ code>
I must be wrong here but I'm struggling to put it together.
import dataclasses
class A:
    def __init__(self):
        self.x = None
    def __init_subclass__(cls):
        if not getattr(cls, 'x') or not cls.x:
            raise TypeError(
                f'Cannot instantiate {cls.__name__}, as all subclasses of {cls.__base__.__name__} must set x.'
            )
class B(A):
    x = 'instantiated-in-b'
@dataclasses.dataclass
class D(A):
    x: str = 'instantiated-in-d'
class C(A):
    def __init__(self):
        self.x = 'instantiated-in-c'
print('B', B())
print('D', D())
print('C', C())
The code, per my expectation, properly fails with C(). Executing the above code will succeed with D, which does not compute for me. In my understanding (which is wrong), I am defining a field, which means that dataclass should expand my class variables as instance variables. (The previous statement is most probably where I am wrong, but I cannot find anything that documents this behavior. Are data classes not actually expanding class variables as instance variables? It certainly appears that way from the visual explanation in their docs.) From the dataclass docs:
The dataclass() decorator examines the class to find fields. A field is defined as a class variable that has a type annotation.
Thus - why - when creating an instance D() - does it slide past the __init_subclass__ of its parent A?
Apologies for the lengthy post. I must be missing something simple, so if someone can point me in the right direction, that would be excellent. TIA!
I have just found the implementation for dataclasses from the CPython github.
Related Articles:
Understanding __init_subclass__
python-why-use-self-in-a-class
proper-way-to-create-class-variable-in-data-class
how-to-get-instance-variables-in-python
enforcing-class-variables-in-a-subclass
__init_subclass__ is called when initializing a subclass. Not when initializing an instance of a subclass - it's called when initializing the subclass itself. Your exception occurs while trying to create the C class, not while trying to evaluate C().
Decorators, such as @dataclass, are a post-processing mechanism, not a pre-processing mechanism. A class decorator takes an existing class that has already gone through all the standard initialization, including __init_subclass__, and modifies the class. Since this happens after __init_subclass__, __init_subclass__ doesn't see any of the modifications that @dataclass performs.
Even if the decorator were to be applied first, D still would have passed the check in A.__init_subclass__, because the dataclass decorator will set D.x to the default value of the x field anyway, so __init_subclass__ will find a value of x. In this case, that happens to be the same thing you set D.x to in the original class definition, but it can be a different object in cases where you construct field objects explicitly.
(Also, you probably wanted to write hasattr instead of getattr in not getattr(cls, 'x').)
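The ordering described above can be observed with a small sketch. The names Base, Child, events and traced_dataclass are hypothetical; traced_dataclass is just a thin wrapper around dataclasses.dataclass that records when it runs.

```python
import dataclasses

events = []

class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs while the subclass is being created. The decorator has not
        # run yet, so the dataclass machinery has left no trace on cls.
        events.append(('init_subclass', cls.__name__,
                       '__dataclass_fields__' in vars(cls)))

def traced_dataclass(cls):
    """Hypothetical wrapper: log the call, then apply the real decorator."""
    events.append(('decorator', cls.__name__))
    return dataclasses.dataclass(cls)

@traced_dataclass
class Child(Base):
    x: str = 'set-in-child'

events.append(('instance', Child().x))

for event in events:
    print(event)
```

The recorded order is __init_subclass__ first (with no dataclass fields present yet), then the decorator, and only later the instance creation. The class attribute Child.x also still holds the default, which is why a check on cls.x passes either way.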

How to store functions as class variables in python?

I am writing a framework, and I want my base class to use different functions for renaming in the child classes. I figured the best way would be to use a class attribute, as in A, but I got a TypeError when calling rename_columns(). However, it worked with an implementation like B:
import pandas as pd
class A:
    my_func_mask = str.lower
    foo = 'bar'
    def rename_columns(self, data):
        return data.rename(columns=self.my_func_mask)
class B(A):
    def rename_columns(self, data):
        return data.rename(columns=self.__class__.my_func_mask)
So I experimented with the above a bit, and I get the following:
a = A()
a.foo # Works fine, gives back 'bar'
a.__class__.my_func_mask # Works as expected `a.__class__.my_func_mask is str.lower` is true
a.my_func_mask # throws TypeError: descriptor 'lower' for 'str' objects doesn't apply to 'A' object
My question is: why can I use regular typed (int, str, etc.) values as class attributes and access them on the instance as well, while I cannot do that for functions?
What happens during the attribute lookup in these cases? What is the difference in the attribute resolution process?
Both foo and my_func_mask are in __class__.__dict__, so I am a bit puzzled. Thanks for the clarifications!
You are storing an unbound built-in method on your class, meaning it is a descriptor object. When you then try to access that on self, descriptor binding applies but the __get__ method called to complete the binding tells you that it can't be bound to your custom class instances, because the method would only work on str instances. That's a strict limitation of most methods of built-in types.
You need to store it in a different manner; putting it inside another container, such as a list or dictionary, would avoid binding. Or you could wrap it in a staticmethod descriptor to have it be bound and return the original. Another option is to not store this as a class attribute, and simply create an instance attribute in __init__.
But in this case, I'd not store str.lower as an attribute value, at all. I'd store None and fall back to str.lower when you still encounter None:
return data.rename(columns=self.my_func_mask or str.lower)
Setting my_func_mask to None is a better indicator that a default is going to be used, clearly distinguishable from explicitly setting str.lower as the mask.
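The options mentioned in this answer can be compared side by side without pandas. This is only a sketch; the class names Plain, Wrapped and InContainer are invented for the comparison.

```python
class Plain:
    mask = str.lower                  # method descriptor belonging to str

class Wrapped:
    mask = staticmethod(str.lower)    # staticmethod hands back the original

class InContainer:
    masks = {'default': str.lower}    # reached via dict lookup, no binding

try:
    Plain().mask                      # binding to a non-str instance fails
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)                    # TypeError
print(Plain.mask is str.lower)                 # True
print(Wrapped().mask('ABC'))                   # abc
print(InContainer().masks['default']('ABC'))   # abc
```

Access through the class works because no instance is involved, which matches what the question observed with a.__class__.my_func_mask.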
You need to wrap it in staticmethod.
class A:
    my_func_mask = staticmethod(str.lower)
    foo = 'bar'
>>> A().my_func_mask is str.lower
True
Everything that is placed in the class definition is bound to the class, but you can't bind a built-in to your own class.
Essentially, all code that you place in a class is executed when the class is created. All items in locals() are then bound to your class at the end of the class. That's why this also works to bind a method to your class:
def abc(self):
    print('{} from outside the class'.format(self))
class A:
    f1 = abc
    f2 = lambda self: print('{} from lambda'.format(self))
    def f3(self):
        print('{} from method'.format(self))
To not have the function bound to your class, you have to place it in the __init__ method of your class:
class A:
    def __init__(self):
        self.my_func_mask = str.lower
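To make the contrast concrete, here is a small sketch (shout, InBody and OnInstance are hypothetical names). A function found on the class is bound to the instance when accessed; a callable stored in the instance's own dictionary is returned as is.

```python
def shout(self):
    return 'bound to ' + type(self).__name__

class InBody:
    action = shout                  # class attribute: becomes a bound method

class OnInstance:
    def __init__(self):
        self.action = str.lower     # instance attribute: no binding happens

print(InBody().action())            # bound to InBody
print(OnInstance().action('ABC'))   # abc
```

Descriptor binding only applies to attributes found on the class, which is why the instance attribute can hold str.lower without trouble.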

Mutating objects referenced by class attributes through instance methods

This is a two-part query, which broadly relates to class attributes referencing mutable and immutable objects, and how these should be dealt with in code design. I have abstracted away the details to provide an example class below.
In this example, the class is designed for two instances. Through an instance method, each can access a class attribute that references a mutable object (a list in this case) and “take” elements from it into its own instance attribute, mutating both the shared list and the list its instance attribute references. If one instance “takes” an element of the class attribute, that element is subsequently unavailable to the other instance, which is the effect I wish to achieve. I find this a convenient way of avoiding the use of class methods, but is it bad practice?
Also in this example, there is a class method that reassigns an immutable object (a Boolean value, in this case) to a class attribute based on the state of an instance attribute. I can achieve this by using a class method with cls as the first argument and self as the second argument, but I’m not sure if this is correct. On the other hand, perhaps this is how I should be dealing with the first part of this query?
class Foo(object):
    mutable_attr = ['1', '2']
    immutable_attr = False
    def __init__(self):
        self.instance_attr = []
    def change_mutable(self):
        self.instance_attr.append(self.mutable_attr[0])
        self.mutable_attr.remove(self.mutable_attr[0])
    @classmethod
    def change_immutable(cls, self):
        if len(self.instance_attr) == 1:
            cls.immutable_attr = True
eggs = Foo()
spam = Foo()
If you want a class-level attribute (which, as you say, is "visible" to all instances of this class) using a class method like you show is fine. This is, mostly, a question of style and there are no clear answers here. So what you show is fine.
I just want to point out that you don't have to use a class method to accomplish your goal. To accomplish your goal this is also perfectly fine (and in my opinion, more standard):
class Foo(object):
    # ... same as it ever was ...
    def change_immutable(self):
        """If instance has list length of 1, change immutable_attr for all insts."""
        if len(self.instance_attr) == 1:
            type(self).immutable_attr = True
Or even:
    def change_immutable(self):
        """If instance has list length of 1, change immutable_attr for all insts."""
        if len(self.instance_attr) == 1:
            Foo.immutable_attr = True
if that's what you want to do. The major point being that you are not forced into using a class method to get/set class level attributes.
The type builtin function (https://docs.python.org/2/library/functions.html#type) simply returns the class of an instance. For new style classes (most classes nowadays, ones that ultimately descend from object) type(self) is the same as self.__class__, but using type is the more idiomatic way to access an object's type.
You use type when you want to write code that gets an object's ultimate type, even if it's subclassed. This may or may not be what you want to do. For example, say you have this:
class Baz(Foo):
    pass
bazzer = Baz()
bazzer.change_mutable()
bazzer.change_immutable()
Then the code:
type(self).immutable_attr = True
Changes the immutable_attr on the Baz class, not the Foo class. That may or may not be what you want -- just be aware that only objects that descend from Baz see this. If you want to make it visible to all descendants of Foo, then the more appropriate code is:
Foo.immutable_attr = True
Hope this helps -- this question is a good one but a bit open ended. Again, major point being you are not forced to use class methods to set/get class attrs -- but not that there's anything wrong with that either :)
Just finally note the way you first wrote it:
@classmethod
def change_immutable(cls, self):
    if len(self.instance_attr) == 1:
        cls.immutable_attr = True
Is like doing the:
type(self).immutable_attr = True
way, because the cls variable will not necessarily be Foo if it's subclassed. If you for sure want to set it for all instances of Foo, then just setting the Foo class directly:
Foo.immutable_attr = True
is the way to go.
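The difference between the two spellings shows up as soon as a subclass is involved. A stripped-down sketch, with a hypothetical flag attribute standing in for immutable_attr:

```python
class Foo:
    flag = False

    def set_on_own_class(self):
        type(self).flag = True   # writes to the instance's actual class

    def set_on_foo(self):
        Foo.flag = True          # always writes to Foo itself

class Baz(Foo):
    pass

Baz().set_on_own_class()
after_first = (Foo.flag, Baz.flag)
print(after_first)    # (False, True)

Baz().set_on_foo()
after_second = (Foo.flag, Baz.flag)
print(after_second)   # (True, True)
```

After the first call only Baz carries the new value; Foo and any other subclasses of Foo still see False until Foo.flag itself is assigned.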
This is one possibility:
class Foo(object):
    __mutable_attr = ['1', '2']
    __immutable_attr = False
    def __init__(self):
        self.instance_attr = []
    def change_mutable(self):
        self.instance_attr.append(self.__class__.__mutable_attr.pop(0))
        if len(self.instance_attr) == 1:
            self.__class__.__immutable_attr = True
    @property
    def immutable_attr(self):
        return self.__class__.__immutable_attr
So a little bit of explanation:
1. I'm making it harder to access class attributes from the outside to protect them from accidental change by prefixing them with double underscore.
2. I'm doing pop() and append() in one line.
3. I'm setting the value for __immutable_attr immediately after modifying __mutable_attr if the condition is met.
4. I'm exposing immutable_attr as a read-only property to provide an easy way to check its value.
5. I'm using self.__class__ to access the class of the instance. I find it more readable than type(self); either way, name mangling inside the class body is what lets the methods reach the double-underscore attributes.

Can set any property of Python object [duplicate]

This question already has answers here:
Usage of __slots__?
(14 answers)
Can't set attributes on instance of "object" class
(7 answers)
Closed 7 months ago.
For example, this code in Python:
a = object()
a.b = 3
throws AttributeError: 'object' object has no attribute 'b'
But, this piece of code:
class c(object): pass
a = c()
a.b = 3
is just fine. Why can I assign property b, when class c does not have that property? How can I make my classes have only the properties I define?
The object type is a built-in class written in C and doesn't let you add attributes to it. It has been expressly coded to prevent it.
The easiest way to get the same behavior in your own classes is to use the __slots__ attribute to define a list of the exact attributes you want to support. Python will reserve space for just those attributes and not allow any others.
class c(object):
    __slots__ = "foo", "bar", "baz"
a = c()
a.foo = 3  # works
a.b = 3    # AttributeError
Of course, there are some caveats with this approach: the old pickle protocols 0 and 1 can't pickle such objects unless you define __getstate__(), and code that expects every object to have a __dict__ attribute will break. A "more Pythonic" way would be to use a custom __setattr__() as shown by another poster. Of course there are plenty of ways around that, and no way around setting __slots__ (aside from subclassing and adding your attributes to the subclass).
In general, this is not something you should actually want to do in Python. If the user of your class wants to store some extra attributes on instances of the class, there's no reason not to let them, and in fact a lot of reasons why you might want to.
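For reference, a short sketch of what __slots__ does and of the subclassing escape hatch mentioned above. Point and Loose are hypothetical classes used only for this illustration.

```python
class Point:
    __slots__ = ('x', 'y')    # only these attribute names are allowed

p = Point()
p.x = 3                       # fine: x is a declared slot

try:
    p.z = 1                   # not a slot, so this is rejected
    rejected = False
except AttributeError:
    rejected = True

class Loose(Point):           # no __slots__ here, so instances get a __dict__
    pass

q = Loose()
q.z = 1                       # allowed again on the subclass

print(rejected)                 # True
print(hasattr(p, '__dict__'))   # False
print(q.z)                      # 1
```

The restriction therefore only holds for classes that keep declaring __slots__ all the way down the hierarchy.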
You can override the behavior of the __setattr__ magic method like so.
class C(object):
    def __setattr__(self, name, value):
        allowed_attrs = ('a', 'b', 'c')
        if name not in allowed_attrs:
            # raise an exception, or do something else
            raise AttributeError(name)
        self.__dict__[name] = value
Of course, this only guards assignments made through attribute syntax such as a.b = value. The attribute can still be set directly with a.__dict__['b'] = value; to prevent that, you would also have to control access to __dict__.
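A minimal sketch of that loophole, assuming a hypothetical Restricted class with an allow-list stored in a class attribute named _allowed:

```python
class Restricted:
    _allowed = frozenset({'a', 'b', 'c'})

    def __setattr__(self, name, value):
        if name not in self._allowed:
            raise AttributeError('cannot set attribute {!r}'.format(name))
        super().__setattr__(name, value)

r = Restricted()
r.a = 1                       # on the allow-list

try:
    r.z = 2                   # rejected by __setattr__
    blocked = False
except AttributeError:
    blocked = True

vars(r)['z'] = 2              # writes to the instance dict, skipping __setattr__

print(blocked, r.z)           # True 2
```

The guard is a convention for well-behaved callers rather than a hard barrier.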
Python generally allows you to set any attribute on any object. This is a special case where the object class acts differently. There are also some modules implemented in C that act similarly.
If you want your object to behave like this, you can define a __setattr__(self, name, value) method that explicitly does a raise AttributeError() if you try to set a member that's not on the "approved list" (see http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/389916)
A plain object instance has no features of its own, and setting attributes on an instance of the base object type is expressly disabled. You must subclass it to be able to create attributes.
Hint: If you want a simple object to use as something on which to store properties, you can do so by creating an anonymous function with lambda. Functions, being objects, are able to store attributes as well, so this is perfectly legit:
>>> a = lambda: None
>>> a.b = 3
>>> a.b
3
This happens because when you say a.b = 3, it creates an attribute b on a. For example,
class a: pass
print(a.b)
raises AttributeError, because class a has no attribute b.
However, this code,
class a: pass
a.b = 3
print(a.b)
prints 3, as it sets the value of b in a to 3.

Differences between static and instance variables in python. Do they even exist?

A random class definition:
class ABC:
    x = 6
Setting some values, first for the abc instance, later for the static variable:
abc = ABC()
abc.x = 2
ABC.x = 5
and then print the results:
print(abc.x)
print(ABC.x)
which prints
2
5
Now, I don't really get what is going on, because if I replace x = 6 in the class definition with "pass", it outputs the same thing. My question is: what is the purpose of defining a variable in the class definition in Python if it seems anyone can set any variable at any time without doing so?
Also, does python know the difference between instance and static variables? From what I saw, I'd say so.
Warning: the following is an oversimplification; I'm ignoring __new__() and a bunch of other special class methods, and handwaving a lot of details. But this explanation will get you pretty far in Python.
When you create an instance of a class in Python, like calling ABC() in your example:
abc = ABC()
Python creates a new empty object and sets its class to ABC. Then it calls the __init__() if there is one. Finally it returns the object.
When you ask for an attribute of an object, first it looks in the instance. If it doesn't find it, it looks in the instance's class. Then in the base class(es) and so on. If it never finds anybody with the attribute defined, it throws an exception.
When you assign to an attribute of an object, it creates that attribute if the object doesn't already have one. Then it sets the attribute to that value. If the object already had an attribute with that name, it drops the reference to the old value and takes a reference to the new one.
These rules make the behavior you observe easy to predict. After this line:
abc = ABC()
only the ABC object (the class) has an attribute named x. The abc instance doesn't have its own x yet, so if you ask for one you're going to get the value of ABC.x. But then you reassign the attribute x on both the class and the object. And when you subsequently examine those attributes you observe the values you put there are still there.
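The sequence from the question can be traced by looking at the instance's own dictionary at each step. A small sketch using the same ABC class:

```python
class ABC:
    x = 6

abc = ABC()
# The instance has no attributes of its own yet; abc.x falls back to ABC.x.
before = (dict(vars(abc)), abc.x)
print(before)   # ({}, 6)

abc.x = 2       # creates an entry in the instance's own dictionary
ABC.x = 5       # rebinds the attribute on the class

after = (dict(vars(abc)), abc.x, ABC.x)
print(after)    # ({'x': 2}, 2, 5)
```

The two assignments touch different objects, so each value stays where it was put.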
Now you should be able to predict what this code does:
class ABC:
    x = 6
a = ABC()
ABC.xyz = 5
print(ABC.xyz, a.xyz)
Yes: it prints two fives. You might have expected it to throw an AttributeError exception. But Python finds the attribute in the class--even though it was added after the instance was created.
This behavior can really get you in to trouble. One classic beginner mistake in Python:
class ABC:
    x = []
a = ABC()
a.x.append(1)
b = ABC()
print(b.x)
That will print [1]. All instances of ABC() are sharing the same list. What you probably wanted was this:
class ABC:
    def __init__(self):
        self.x = []
a = ABC()
a.x.append(1)
b = ABC()
print(b.x)
That will print an empty list as you expect.
To answer your exact questions:
My question is: what is the purpose of defining a variable in the class definition in Python if it seems anyone can set any variable at any time without doing so?
I assume this means "why should I assign members inside the class, instead of inside the __init__ method?"
As a practical matter, this means the instances don't have their own copy of the attribute (or at least not yet). This means the instances are smaller; it also means accessing the attribute is slower. It also means the instances all share the same value for that attribute, which in the case of mutable objects may or may not be what you want. Finally, assignments here mean that the value is an attribute of the class, and that's the most straightforward way to set attributes on the class.
As a purely stylistic matter it's shorter code, as you don't have all those instances of self. all over. Beyond that it doesn't make much difference. However, assigning attributes in the __init__ method ensures they are unambiguously instance variables.
I'm not terribly consistent myself. The only thing I'm sure to do is assign all the mutable objects that I don't want shared in the __init__ method.
Also, does python know the difference between instance and static variables? From what I saw, I'd say so.
Python classes don't have class static variables like C++ does. There are only attributes: attributes of the class object, and attributes of the instance object. And if you ask for an attribute, and the instance doesn't have it, you'll get the attribute from the class.
The closest approximation of a class static variable in Python would be a hidden module attribute, like so:
_x = 3
class ABC:
    def method(self):
        global _x
        # ...
It's not part of the class per se. But this is a common Python idiom.
class SomeClass:
    x = 6  # class variable
    def __init__(self):
        self.y = 666  # instance variable
There is virtue in declaring a class scoped variable: it serves as default for one. Think of class scoped variable as you would think of "static" variables in some other languages.
Python makes a distinction between the two. The purpose could be multiple, but one example is this:
class token(object):
    id = 0
    def __init__(self, value):
        self.value = value
        self.id = token.id
        token.id += 1
Here, the class variable token.id is incremented each time a new instance is created, and each instance takes a unique ID at the same time, stored in self.id. The two are stored in different places, one in the class object and one in the instance object. You can indeed compare that to static and instance variables in OO languages like C++ or C#.
In that example, if you do:
print(token.id)
you will see the next ID to be assigned, whereas:
x = token(10)
print(x.id)
will give the id of that instance.
Anyone can also put other attributes on an instance or on a class, that's right, but that wouldn't be interesting since the class code is not intended to use them. The point of an example like the one above is that the class code uses them.
A class-level variable (called "static" in other languages) is owned by the class, and shared by all instances of the class.
An instance variable belongs to each distinct instance of the class.
However.
You can add a new instance variable any time you want.
So getting abc.x requires first checking for an instance variable. If there is no instance variable, it will try the class variable.
And setting abc.x will create (or replace) an instance variable.
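One consequence of these rules, sketched below, is that deleting the instance variable makes the class variable visible again. The ABC class is the one from the question.

```python
class ABC:
    x = 6

abc = ABC()
abc.x = 2            # creates an instance variable that shadows ABC.x
shadowed = abc.x

del abc.x            # removes only the instance variable
restored = abc.x     # lookup falls back to the class variable

try:
    del abc.x        # nothing left on the instance to delete
    second_delete_failed = False
except AttributeError:
    second_delete_failed = True

print(shadowed, restored, second_delete_failed)   # 2 6 True
```

Assignment and deletion act on the instance only, while reading falls through to the class.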
Every object has a __dict__. The class ABC and its instance, abc, are both objects, and so each has their own separate __dict__:
In [3]: class ABC:
   ...:     x=6
Notice ABC.__dict__ has a 'x' key:
In [4]: ABC.__dict__
Out[4]: {'__doc__': None, '__module__': '__main__', 'x': 6}
In [5]: abc=ABC()
In [6]: abc.__dict__
Out[6]: {}
Notice that if 'x' is not in abc.__dict__, then the __dict__ of abc's class, followed by those of its base class(es), is searched. So abc.x is "inherited" from ABC:
In [14]: abc.x
Out[14]: 6
But if we set abc.x then we are changing abc.__dict__, not ABC.__dict__:
In [7]: abc.x = 2
In [8]: abc.__dict__
Out[8]: {'x': 2}
In [9]: ABC.__dict__
Out[9]: {'__doc__': None, '__module__': '__main__', 'x': 6}
Of course, we can change ABC.__dict__ if we wish:
In [10]: ABC.x = 5
In [11]: ABC.__dict__
Out[11]: {'__doc__': None, '__module__': '__main__', 'x': 5}
The benefit of a "static" or in Python a "class attribute" is that each instance of the class will have access to the same class attribute. This is not true for instance attributes as you may be aware.
Take for example:
class A(object):
    b = 1
A.b # => 1
inst = A()
inst2 = A()
inst.b # => 1
inst2.b # => 1
A.b = 5
inst.b # => 5
inst2.b # => 5
As you can see the instance of the class has access to the class attribute which can be set by specifying the class name and then the class attribute.
The tricky part is when you have a class attribute and instance attribute named the same thing. This requires an understanding of what is going on under the hood.
inst.__dict__ # => {}
A.__dict__ # => {..., 'b': 5}
Notice how the instance does not have b as an attribute? Above, when we called inst.b Python actually checks inst.__dict__ for the attribute, if it cannot be found, then it searches A.__dict__ (the class's attributes). Of course, when Python looks up b in the class's attributes it is found and returned.
You can get some confusing output if you then set an instance attribute with the same name.
For example:
inst.b = 10
inst.__dict__ #=> {'b': 10}
A.b # => 5
inst.b # => 10
You can see that the instance of the class now has the b instance attribute and therefore Python returns that value.
