Python Class vs. Module Attributes - python

I'm interested in hearing some discussion about class attributes in Python. For example, what is a good use case for class attributes? For the most part, I cannot come up with a case where a class attribute is preferable to a module-level attribute. If this is true, then why have them around?
The problem I have with them is that it is almost too easy to clobber a class attribute value by mistake, and then your "global" value has turned into a local instance attribute.
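A minimal sketch of the clobbering described above. The class `Config` and the attribute `retries` are hypothetical names used only for illustration. Assigning through an instance does not change the class attribute; it creates an instance attribute that shadows it.

```python
class Config:
    retries = 3  # class attribute, shared by every instance


a = Config()
b = Config()

# Assigning through the instance creates a new instance attribute on `a`;
# the class attribute is left untouched.
a.retries = 10

# Assigning through the class changes the shared value.
Config.retries = 5

print(a.retries)  # the instance attribute shadows the class attribute
print(b.retries)  # falls back to the class attribute
```

After this runs, `vars(a)` contains `retries` while `vars(b)` is empty, which is exactly the "global turned local" situation.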
Feel free to comment on how you would handle the following situations:
1. Constant values used by a class and/or its subclasses. This may include "magic number" dictionary keys or list indexes that will never change, but possibly need one-time initialization.
2. A default class attribute that on rare occasions is updated for a special instance of the class.
3. A global data structure used to represent an internal state of a class shared between all instances.
4. A class that initializes a number of default attributes, not influenced by constructor arguments.
Some Related Posts:
Difference Between Class and Instance Attributes

#4:
I never use class attributes to initialize default instance attributes (the ones you normally put in __init__). For example:
class Obj(object):
    def __init__(self):
        self.users = 0
and never:
class Obj(object):
    users = 0
Why? Because it's inconsistent: it doesn't do what you want when you assign anything but an immutable object:
class Obj(object):
    users = []
causes the users list to be shared across all objects, which in this case isn't wanted. It's confusing to split these into class attributes and assignments in __init__ depending on their type, so I always put them all in __init__, which I find clearer anyway.
As for the rest, I generally put class-specific values inside the class. This isn't so much because globals are "evil"--they're not so big a deal as in some languages, because they're still scoped to the module, unless the module itself is too big--but if external code wants to access them, it's handy to have all of the relevant values in one place. For example, in module.py:
class Obj(object):
    class Exception(Exception): pass
    ...
and then:
from module import Obj
try:
    o = Obj()
    o.go()
except o.Exception:
    print("error")
Aside from allowing subclasses to change the value (which isn't always wanted anyway), it means I don't have to laboriously import exception names and a bunch of other stuff needed to use Obj. "from module import Obj, ObjException, ..." gets tiresome quickly.

what is a good use case for class attributes
Case 0. Class methods are just class attributes. This is not just a technical similarity - you can access and modify class methods at runtime by assigning callables to them.
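As a small illustration of Case 0, the sketch below rebinds a method by assigning a plain function to the class. `Greeter` and `shout` are hypothetical names chosen for the example.

```python
class Greeter:
    def greet(self):
        return "hello"


g = Greeter()  # created before the method is replaced


def shout(self):
    return "HELLO"


# A method is just a class attribute, so it can be rebound at runtime.
Greeter.greet = shout

result = g.greet()  # existing instances pick up the new function
```

Because the lookup goes through the class each time, the instance created earlier sees the replacement too.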
Case 1. A module can easily define several classes. It's reasonable to encapsulate everything about class A into A... and everything about class B into B.... For example,
# module xxx
class X:
    MAX_THREADS = 100
    ...
# main program
from xxx import X
if nthreads < X.MAX_THREADS: ...
Case 2. This class has lots of default attributes which can be modified in an instance. Here the ability to leave an attribute as a 'global default' is a feature, not a bug.
class NiceDiff:
    """Formats time difference given in seconds into a form '15 minutes ago'."""
    magic = .249
    pattern = 'in {0}', 'right now', '{0} ago'
    divisions = 1
    # there are more default attributes
One creates instance of NiceDiff to use the existing or slightly modified formatting, but a localizer to a different language subclasses the class to implement some functions in a fundamentally different way and redefine constants:
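class Разница(NiceDiff): # NiceDiff localized to Russian
    '''Из разницы во времени, типа -300, делает конкретно '5 минут назад'.'''
    # Docstring in English: turns a time difference such as -300 into '5 minutes ago'.
    pattern = 'через {0}', 'прям щас', '{0} назад'  # 'in {0}', 'right now', '{0} ago'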
Your cases:
constants -- yes, I put them in the class. It would be strange to write self.CONSTANT = ..., so I don't see a big risk of clobbering them.
Default attribute -- mixed; as above, it may go in the class, but may also go in __init__ depending on the semantics.
Global data structure -- goes in the class if used only by the class, but may also go in the module; in either case it must be very well documented.

Class attributes are often used to allow overriding defaults in subclasses. For example, BaseHTTPRequestHandler has class constants sys_version and server_version, the latter defaulting to "BaseHTTP/" + __version__. SimpleHTTPRequestHandler overrides server_version to "SimpleHTTP/" + __version__.
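The same override pattern can be sketched with two hypothetical classes. These are stand-ins written for this example, not the real http.server handlers, and the version strings are made up.

```python
class BaseHandler:
    server_version = "Base/0.1"  # default, meant to be overridden

    def version_string(self):
        # Looked up through self, so a subclass override is honoured.
        return "Server: " + self.server_version


class SimpleHandler(BaseHandler):
    server_version = "Simple/0.1"  # only the constant changes


base_line = BaseHandler().version_string()
simple_line = SimpleHandler().version_string()
```

The method is written once in the base class; the subclass changes behaviour only by redefining the class attribute.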

Encapsulation is a good principle: when an attribute is inside the class it pertains to instead of being in the global scope, this gives additional information to people reading the code.
In your situations 1-4, I would thus avoid globals as much as I can, and prefer using class attributes, which allow one to benefit from encapsulation.

Related

Undoing a decade of singleton pattern and class-level configuration

Overview
I need to duplicate a whole inheritance tree of classes. Simply deep-copying the class objects does not work; a proper factory pattern involves a huge amount of code changes; I'm not sure how to use metaclasses to accomplish this.
Background
The software I work on implements support for specialized external hardware, connected to the host computer via USB. Many years ago, it was assumed that there would only ever be one type of hardware in use at a time. Consequently, the hardware object is used as a singleton. Over the years, secondary classes were configured based on the currently active hardware class.
At the moment, it is impossible to use this library with two types of hardware at the same time, since the class objects cannot be configured for both types of hardware at once.
In recent years, we have avoided this issue by creating one Python process for each hardware, but this is becoming untenable.
Here is an extremely simplified example of the architecture:
# ----------
# Hardware classes
class HwBase():
    def customizeComponent(self, compDict):
        compDict['ComponentBase'].hardware = self
class HwA(HwBase):
    def customizeComponent(self, compDict):
        super().customizeComponent(compDict)
        compDict['AnotherComponent'].prop.configure(1,2,3)
class HwB(HwBase):
    def customizeComponent(self, compDict):
        super().customizeComponent(compDict)
        compDict['AnotherComponent'].prop.configure(4,5,6)
# ----------
# Property classes
class SpecialProperty(property):
    def __init__(self, fvalidate):
        self.fvalidate = fvalidate
        # handle fset, fget, etc. here.
        # super().__init__()
# ----------
# Component classes
class ComponentBase():
    hardware = None
    def validateProp(self, val):
        return val < self.maxVal
    prop = SpecialProperty(fvalidate=validateProp)
class SomeComponent():
    """Users directly instantiate and use this component via an interactive shell.
    This component does complex operations with the hardware attribute"""
    def validateThing(self, val):
        return isinstance(val, ComponentBase)
    thing = SpecialProperty(fvalidate=validateThing)
class AnotherComponent():
    """Users directly instantiate and use this component via an interactive shell
    This component does complex operations with the hardware attribute"""
    maxVal = 15
# ----------
# Initialization
def initialize():
    """ This is only called once per python instance."""
    #activeCls = HwA
    activeCls = HwB
    allComponents = {
        'ComponentBase': ComponentBase,
        'SomeComponent': SomeComponent,
        'AnotherComponent': AnotherComponent
    }
    hwInstance = activeCls()
    hwInstance.customizeComponent(allComponents)
    return allComponents
components = initialize()
# ----------
# User code goes here
someInstance1 = components['SomeComponent']()
someInstance2 = components['SomeComponent']()
someInstance1.prop = 10
someInstance2.prop = 10
The overarching goal would be to interact with both HwA and HwB at the same time. Since most interactions are done via components instead of the Hw objects themselves, I believe the solution involves having multiple versions of the components, e.g.: two separate inheritance trees, for a total of 6 final components, one tree/set configured for each hardware. This is what I need help with.
Potential solutions
Consider that I have tens of different hardware types to configure for. Furthermore, there are hundreds of different leaf component classes, with many extra base and mixin classes.
Move all configuration steps in the component's init method
Not possible due to the use of properties; these need to be set on the class.
Deepcopy the class objects
Copy all class objects and swap in the appropriate __bases__. Mutable class variables need to be carefully handled. However, I'm not sure how to deal with properties for this, since class-body references within the property objects (such as fvalidate) need to be updated to those of the copied class.
This requires a significant amount of manual intervention to work. Not impossible, but prone to breaking in the long term.
Factory pattern
Wrap all component definition in a factory function:
def ComponentBaseFactory(hw):
    class SomeComponent(cache[hw].ComponentBase):
        pass
and have some sort of component cache which would handle creating all class objects during initialize()
This is what I consider the most architecturally-correct option available. Since the class body is re-executed on every factory call, the attributes of the properties will reference the appropriate class object.
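A minimal sketch of such a factory with a cache, under stated assumptions: `Validated` is a hypothetical stand-in for SpecialProperty, and `build_components` and `_cache` are invented names. It only shows that re-executing the class bodies yields independent objects per hardware; it is not the library's actual design.

```python
class Validated:
    """Hypothetical stand-in for SpecialProperty: holds per-class settings."""

    def __init__(self):
        self.limits = None

    def configure(self, *limits):
        self.limits = limits


_cache = {}  # hardware name -> dict of freshly built component classes


def build_components(hw_name):
    """Re-execute the class bodies once per hardware and cache the result."""
    if hw_name not in _cache:
        class ComponentBase:
            hardware = hw_name
            prop = Validated()  # a new object on every factory call

        class SomeComponent(ComponentBase):
            pass

        _cache[hw_name] = {
            "ComponentBase": ComponentBase,
            "SomeComponent": SomeComponent,
        }
    return _cache[hw_name]


comps_a = build_components("HwA")
comps_b = build_components("HwB")
comps_a["SomeComponent"].prop.configure(1, 2, 3)
comps_b["SomeComponent"].prop.configure(4, 5, 6)
```

Each call with a new hardware name produces a separate class tree, so configuring one tree leaves the other untouched, while repeated calls with the same name reuse the cached classes.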
Downside: huge code footprint. I am familiar with doing codebase-wide changes via sed or python scripts, but this would be quite a lot.
Add metaclasses on components
I am not sure how to proceed for this. Based on the Python data model (py3.7), the following happens at class creation (which happens right after the class definition indentation ends):
1. MRO entries are resolved;
2. the appropriate metaclass is determined;
3. the class namespace is prepared;
4. the class body is executed;
5. the class object is created.
I would need to redo these steps after the class has been defined (like a factory function!), but I'm not sure how to redo step 4. Specifically, the Python documentation states in section 3.3.3.5 that the class body is executed as with a "special?" form of the exec() builtin. How can I re-exec the class body with a different set of locals/globals? Even if I access the class body's code with inspect shenanigans, I'm not sure I'll be able to reproduce the module environment properly.
Even if I mess with __prepare__ and __new__, I don't see how I can fix the cross-references introduced in the class code block regarding the property instantiation.
Components as metaclasses
A metaclass is a class factory, just like a class is an object factory. SomeComponent and AnotherComponent could be declared as metaclasses, then get instantiated with the Hw object during initialize():
SomeComponent = SomeComponentMeta(hw)
This is similar to the factory pattern, but would also require quite a few code changes: a lot of class code would have to be moved to the metaclass __init__.
I would have to spend a lot more time here to properly understand what you need, but if your "TL;DR" of executing the class body with different global/nonlocal variables is the bottom line, the factory approach is a very clean and readable way, as you had considered.
At first, I did not think a metaclass would be a good approach here, although it could be used to customize your special properties (on my first read, I could not figure out what they actually do, and how they should differ between your final classes). If a function acting as a class factory can specialize your properties, it would work nonetheless.
If what you need is for the properties to be independent for HwA and HwB, as in accessing a different list object in HwA than the one accessed in HwB, then yes, a metaclass could take care of that, by automatically recreating any properties when creating a subclass (so that the property objects themselves are not shared with the superclasses and across the hierarchy).
If that is what you need, leave a comment, and I can write some proof-of-concept code.
Anyway, it is possible to create a metaclass that, upon creating a subclass, will look through the hierarchy for all SpecialProperty objects and create new instances of those for the subclass, so that a base value set on a superclass remains valid for the subclasses, but when configuration runs, each class has an independent configuration. (As it turns out, no metaclass is needed: we are covered by __init_subclass__.)
Another thing to take care of is that subclasses of property cannot simply be copied with Python's copy.copy (tested empirically), so we need a way to create reliable copies of those. I include one function below, but it might need to be improved to work with the actual SpecialProperty class.
from copy import copy
def copy_property(prop):
    cls = prop.__class__
    new_prop = cls.__new__(cls)
    # Initialize the attributes that can't be set from Python code, inplace:
    property.__init__(new_prop, prop.fget, prop.fset, prop.fdel)
    if hasattr(prop, "__dict__"): # only exists for subclasses of property
        # Possible adaptation needed: it may be that for some attributes of
        # SpecialProperty, a deepcopy would be needed.
        # But for the given example attribute of "fvalidate" a simple copy is better:
        new_prop.__dict__ = copy(prop.__dict__)
    return new_prop
# Python 3.6 introduced `__init_subclass__` which is called at subclass _creation_
# time. With it, the logic can be inserted in ComponentBase and there is no need for
# a metaclass.
class ComponentBase():
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for attrname in dir(cls):
            attr = getattr(cls, attrname)
            if not isinstance(attr, SpecialProperty):
                continue
            new_prop = copy_property(attr)
            setattr(cls, attrname, new_prop)
    hardware = None
    ...
As you can see, there are some workarounds that had to be done because your project opted for subclassing property. I am leaving this remark here as a reminder that unless property fits one's exact needs, it is cleaner to write a new class implementing the descriptor protocol, just by implementing __set__, __get__ and __delete__ directly.
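For reference, a descriptor written directly against the protocol might look like the sketch below. `Validated`, `Component`, `_check` and `max_val` are hypothetical names; the real SpecialProperty presumably does more than this.

```python
class Validated:
    """Data descriptor that validates on assignment (illustrative only)."""

    def __init__(self, fvalidate):
        self.fvalidate = fvalidate

    def __set_name__(self, owner, name):
        # Store values in the instance dict under a private key.
        self.storage = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class: return the descriptor
        return getattr(instance, self.storage)

    def __set__(self, instance, value):
        if not self.fvalidate(instance, value):
            raise ValueError("rejected value: %r" % (value,))
        setattr(instance, self.storage, value)

    def __delete__(self, instance):
        delattr(instance, self.storage)


class Component:
    max_val = 15

    def _check(self, val):
        return val < self.max_val

    prop = Validated(fvalidate=_check)


c = Component()
c.prop = 10  # accepted, since 10 < 15
```

Assigning a value that fails the check, such as `c.prop = 20`, raises ValueError and leaves the stored value unchanged. Since this is a plain class, it can be copied or re-created per subclass without the workarounds that a property subclass needs.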

Properties seem to set to the same value for all objects (Python) [duplicate]

What is the difference between class and instance variables in Python?
class Complex:
    a = 1
and
class Complex:
    def __init__(self):
        self.a = 1
Using the call: x = Complex().a in both cases assigns x to 1.
A more in-depth answer about __init__() and self will be appreciated.
When you write a class block, you create class attributes (or class variables). All the names you assign in the class block, including methods you define with def become class attributes.
After a class instance is created, anything with a reference to the instance can create instance attributes on it. Inside methods, the "current" instance is almost always bound to the name self, which is why you are thinking of these as "self variables". Usually in object-oriented design, the code attached to a class is supposed to have control over the attributes of instances of that class, so almost all instance attribute assignment is done inside methods, using the reference to the instance received in the self parameter of the method.
Class attributes are often compared to static variables (or methods) as found in languages like Java, C#, or C++. However, if you want to aim for deeper understanding I would avoid thinking of class attributes as "the same" as static variables. While they are often used for the same purposes, the underlying concept is quite different. More on this in the "advanced" section below the line.
An example!
class SomeClass:
    def __init__(self):
        self.foo = 'I am an instance attribute called foo'
        self.foo_list = []
    bar = 'I am a class attribute called bar'
    bar_list = []
After executing this block, there is a class SomeClass, with 3 class attributes: __init__, bar, and bar_list.
Then we'll create an instance:
instance = SomeClass()
When this happens, SomeClass's __init__ method is executed, receiving the new instance in its self parameter. This method creates two instance attributes: foo and foo_list. Then this instance is assigned into the instance variable, so it's bound to a thing with those two instance attributes: foo and foo_list.
But:
print(instance.bar)
gives:
I am a class attribute called bar
How did this happen? When we try to retrieve an attribute through the dot syntax, and the attribute doesn't exist, Python goes through a bunch of steps to try and fulfill your request anyway. The next thing it will try is to look at the class attributes of the class of your instance. In this case, it found an attribute bar in SomeClass, so it returned that.
That's also how method calls work by the way. When you call mylist.append(5), for example, mylist doesn't have an attribute named append. But the class of mylist does, and it's bound to a method object. That method object is returned by the mylist.append bit, and then the (5) bit calls the method with the argument 5.
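The lookup order can be checked directly with `vars()`, which exposes an object's own namespace. This is a small sketch reusing a trimmed version of the SomeClass example from above.

```python
class SomeClass:
    bar = "I am a class attribute called bar"

    def __init__(self):
        self.foo = "I am an instance attribute called foo"


instance = SomeClass()

# The instance's own namespace holds only what __init__ stored on it.
instance_names = sorted(vars(instance))

# `bar` lives in the class namespace; instance.bar finds it there.
bar_on_class = "bar" in vars(SomeClass)

# Methods work the same way: `append` is found on the list class, and
# looking it up through an instance yields a bound method.
mylist = []
bound = mylist.append
bound(5)                # same effect as mylist.append(5)
list.append(mylist, 6)  # explicit form: class attribute plus instance
```

Here `instance_names` contains only `foo`, while `bar` is found in the class namespace, and both calls append to the same list.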
The way this is useful is that all instances of SomeClass will have access to the same bar attribute. We could create a million instances, but we only need to store that one string in memory, because they can all find it.
But you have to be a bit careful. Have a look at the following operations:
sc1 = SomeClass()
sc1.foo_list.append(1)
sc1.bar_list.append(2)
sc2 = SomeClass()
sc2.foo_list.append(10)
sc2.bar_list.append(20)
print(sc1.foo_list)
print(sc1.bar_list)
print(sc2.foo_list)
print(sc2.bar_list)
What do you think this prints?
[1]
[2, 20]
[10]
[2, 20]
This is because each instance has its own copy of foo_list, so they were appended to separately. But all instances share access to the same bar_list. So when we did sc1.bar_list.append(2) it affected sc2, even though sc2 didn't exist yet! And likewise sc2.bar_list.append(20) affected the bar_list retrieved through sc1. This is often not what you want.
Advanced study follows. :)
To really grok Python, coming from traditional statically typed OO-languages like Java and C#, you have to learn to rethink classes a little bit.
In Java, a class isn't really a thing in its own right. When you write a class you're more declaring a bunch of things that all instances of that class have in common. At runtime, there's only instances (and static methods/variables, but those are really just global variables and functions in a namespace associated with a class, nothing to do with OO really). Classes are the way you write down in your source code what the instances will be like at runtime; they only "exist" in your source code, not in the running program.
In Python, a class is nothing special. It's an object just like anything else. So "class attributes" are in fact exactly the same thing as "instance attributes"; in reality there's just "attributes". The only reason for drawing a distinction is that we tend to use objects which are classes differently from objects which are not classes. The underlying machinery is all the same. This is why I say it would be a mistake to think of class attributes as static variables from other languages.
But the thing that really makes Python classes different from Java-style classes is that just like any other object each class is an instance of some class!
In Python, most classes are instances of a builtin class called type. It is this class that controls the common behaviour of classes, and makes all the OO stuff the way it does. The default OO way of having instances of classes that have their own attributes, and have common methods/attributes defined by their class, is just a protocol in Python. You can change most aspects of it if you want. If you've ever heard of using a metaclass, all that is is defining a class that is an instance of a different class than type.
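A short sketch of that relationship, using Python 3 syntax. `Plain`, `Meta`, `Special` and `describe` are hypothetical names for illustration.

```python
class Plain:
    pass


class Meta(type):
    """A metaclass: a class whose instances are themselves classes."""

    def describe(cls):
        # Methods defined on the metaclass are available on the class itself.
        return "class " + cls.__name__


class Special(metaclass=Meta):
    pass


plain_type = type(Plain)      # the built-in `type`
special_type = type(Special)  # our metaclass
description = Special.describe()
```

`Plain` is an instance of `type`, `Special` is an instance of `Meta`, and `Special` gets `describe` from its metaclass the same way an ordinary instance gets methods from its class.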
The only really "special" thing about classes (aside from all the builtin machinery to make them work the way they do by default), is the class block syntax, to make it easier for you to create instances of type. This:
class Foo(BaseFoo):
    def __init__(self, foo):
        self.foo = foo
    z = 28
is roughly equivalent to the following:
def __init__(self, foo):
    self.foo = foo
classdict = {'__init__': __init__, 'z': 28 }
Foo = type('Foo', (BaseFoo,), classdict)
And it will arrange for all the contents of classdict to become attributes of the object that gets created.
So then it becomes almost trivial to see that you can access a class attribute by Class.attribute just as easily as i = Class(); i.attribute. Both i and Class are objects, and objects have attributes. This also makes it easy to understand how you can modify a class after it's been created; just assign its attributes the same way you would with any other object!
In fact, instances have no particular special relationship with the class used to create them. The way Python knows which class to search for attributes that aren't found in the instance is by the hidden __class__ attribute. Which you can read to find out what class this is an instance of, just as with any other attribute: c = some_instance.__class__. Now you have a variable c bound to a class, even though it probably doesn't have the same name as the class. You can use this to access class attributes, or even call it to create more instances of it (even though you don't know what class it is!).
And you can even assign to i.__class__ to change what class it is an instance of! If you do this, nothing in particular happens immediately. It's not earth-shattering. All that it means is that when you look up attributes that don't exist in the instance, Python will go look at the new contents of __class__. Since that includes most methods, and methods usually expect the instance they're operating on to be in certain states, this usually results in errors if you do it at random, and it's very confusing, but it can be done. If you're very careful, the thing you store in __class__ doesn't even have to be a class object; all Python's going to do with it is look up attributes under certain circumstances, so all you need is an object that has the right kind of attributes (some caveats aside where Python does get picky about things being classes or instances of a particular class).
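A minimal sketch of reassigning `__class__`, with two hypothetical, unrelated classes. It works here because both are ordinary classes with compatible instance layouts; Python refuses the assignment in some other cases.

```python
class Dog:
    def speak(self):
        return "woof"


class Cat:
    def speak(self):
        return "meow"


pet = Dog()
pet.name = "Rex"     # instance attribute, stored on the object itself
before = pet.speak()

pet.__class__ = Cat  # methods are now looked up on Cat
after = pet.speak()
```

The instance attribute `name` survives the switch, since it lives on the object; only the place where missing attributes are looked up has changed.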
That's probably enough for now. Hopefully (if you've even read this far) I haven't confused you too much. Python is neat when you learn how it works. :)
What you're calling an "instance" variable isn't actually an instance variable; it's a class variable. See the language reference about classes.
In your example, the a appears to be an instance variable because it is immutable. Its nature as a class variable can be seen in the case when you assign a mutable object:
>>> class Complex:
...     a = []
...
>>> b = Complex()
>>> c = Complex()
>>>
>>> # What do they look like?
>>> b.a
[]
>>> c.a
[]
>>>
>>> # Change b...
>>> b.a.append('Hello')
>>> b.a
['Hello']
>>> # What does c look like?
>>> c.a
['Hello']
If you used self, then it would be a true instance variable, and thus each instance would have its own unique a. An object's __init__ function is called when a new instance is created, and self is a reference to that instance.

Does Python require intimate knowledge of all classes in the inheritance chain?

Python classes have no concept of public/private, so we are told to not touch something that starts with an underscore unless we created it. But does this not require complete knowledge of all classes from which we inherit, directly or indirectly? Witness:
class Base(object):
    def __init__(self):
        super(Base, self).__init__()
        self._foo = 0
    def foo(self):
        return self._foo + 1
class Sub(Base):
    def __init__(self):
        super(Sub, self).__init__()
        self._foo = None
Sub().foo()
Expectedly, a TypeError is raised when None + 1 is evaluated. So I have to know that _foo exists in the base class. To get around this, __foo can be used instead, which solves the problem by mangling the name. This seems to be, if not elegant, an acceptable solution. However, what happens if Base inherits from a class (in a separate package) called Sub? Now __foo in my Sub overrides __foo in the grandparent Sub.
This implies that I have to know the entire inheritance chain, including all "private" objects each uses. The fact that Python is dynamically-typed makes this even harder, since there are no declarations to search for. The worst part, however, is probably the fact Base might inherit from object right now, but in some future release, it switches to inheriting from Sub. Clearly if I know Sub is inherited from, I can rename my class, however annoying that is. But I can't see into the future.
Is this not a case where a true private data type would prevent a problem? How, in Python, can I be sure that I'm not accidentally stepping on somebody's toes if those toes might spring into existence at some point in the future?
EDIT: I've apparently not made clear the primary question. I'm familiar with name mangling and the difference between a single and a double underscore. The question is: how do I deal with the fact that I might clash with classes whose existence I don't know of right now? If my parent class (which is in a package I did not write) happens to start inheriting from a class with the same name as my class, even name mangling won't help. Am I wrong in seeing this as a (corner) case that true private members would solve, but that Python has trouble with?
EDIT: As requested, the following is a full example:
File parent.py:
class Sub(object):
    def __init__(self):
        self.__foo = 12
    def foo(self):
        return self.__foo + 1
class Base(Sub):
    pass
File sub.py:
import parent
class Sub(parent.Base):
    def __init__(self):
        super(Sub, self).__init__()
        self.__foo = None
Sub().foo()
The grandparent's foo is called, but my __foo is used.
Obviously you wouldn't write code like this yourself, but parent could easily be provided by a third party, the details of which could change at any time.
Use private names (instead of protected ones), starting with a double underscore:
class Sub(Base):
    def __init__(self):
        super(Sub, self).__init__()
        self.__foo = None
#            ^^
will not conflict with _foo or __foo in Base. This is because Python prefixes such names with a single underscore and the name of the class; the following two lines are equivalent:
class Sub(Base):
    def x(self):
        self.__foo = None # .. is the same as ..
        self._Sub__foo = None
(In response to the edit:) The chance that two classes in a class hierarchy not only have the same name, but that they are both using the same property name, and are both using the private mangled (__) form is so minuscule that it can be safely ignored in practice (I for one haven't heard of a single case so far).
In theory, however, you are correct in that in order to formally verify correctness of a program, one must know the entire inheritance chain. Luckily, formal verification usually requires a fixed set of libraries in any case.
This is in the spirit of the Zen of Python, which includes
practicality beats purity.
Name mangling includes the class so your Base.__foo and Sub.__foo will have different names. This was the entire reason for adding the name mangling feature to Python in the first place. One will be _Base__foo, the other _Sub__foo.
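The effect is easy to see by inspecting the instance's namespace. This sketch uses Python 3 `super()` and mirrors the Base/Sub shape from the question, with `__foo` in both classes.

```python
class Base:
    def __init__(self):
        self.__foo = 0          # stored as _Base__foo

    def foo(self):
        return self.__foo + 1   # reads _Base__foo


class Sub(Base):
    def __init__(self):
        super().__init__()
        self.__foo = None       # stored as _Sub__foo, no clash


s = Sub()
names = sorted(vars(s))
result = s.foo()
```

The instance ends up with two separate attributes, `_Base__foo` and `_Sub__foo`, so `foo()` still computes `0 + 1`.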
Many people prefer to use composition (has-a) instead of inheritance (is-a) for some of these very reasons.
This implies that I have to know the entire inheritance chain. . .
Yes, you should know the entire inheritance chain, or the docs for the object you are directly sub-classing should tell you what you need to know.
Subclassing is an advanced feature, and should be treated with care.
A good example of docs specifying what should be overridden in a subclass is the threading.Thread class:
This class represents an activity that is run in a separate thread of control. There are two ways to specify the activity: by passing a callable object to the constructor, or by overriding the run() method in a subclass. No other methods (except for the constructor) should be overridden in a subclass. In other words, only override the __init__() and run() methods of this class.
How often do you modify base classes in inheritance chains to introduce inheritance from a class with the same name as a subclass further down the chain???
Less flippantly, yes, you have to know the code you are working with. You certainly have to know the public names being used, after all. Python being Python, discovering the public names in use by your ancestor classes takes pretty much the same effort as discovering the private ones.
In years of Python programming, I have never found this to be much of an issue in practice. When you're naming instance variables, you should have a pretty good idea whether (a) a name is generic enough that it's likely to be used in other contexts and (b) the class you're writing is likely to be involved in an inheritance hierarchy with other unknown classes. In such cases, you think a bit more carefully about the names you're using; self.value isn't a great idea for an attribute name, and neither is something like Adaptor a great class name.
In contrast, I have run into difficulties with the overuse of double-underscore names a number of times. Python being Python, even "private" names tend to be accessed by code defined outside the class. You might think that it would always be bad practice to let an external function access "private" attributes, but what about things like getattr and hasattr? The invocation of them can be in the class's own code, so the class is still controlling all access to the private attributes, but they still don't work without you doing the name-mangling manually. If Python had actually-enforced private variables you couldn't use functions like those on them at all. These days I tend to reserve double-underscore names for cases when I'm writing something very generic like a decorator, metaclass, or mixin that needs to add a "secret attribute" to the instances of the (unknown) classes it's applied to.
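The hasattr/getattr problem can be reproduced in a few lines. `Account` and `__balance` are hypothetical names; the point is only that string arguments are not mangled, even inside the class body.

```python
class Account:
    def __init__(self):
        self.__balance = 100  # stored as _Account__balance

    def has_plain_name(self):
        # The string is not mangled, so this looks for "__balance" literally.
        return hasattr(self, "__balance")

    def has_mangled_name(self):
        # Mangling has to be done by hand when the name is a string.
        return hasattr(self, "_Account__balance")


acct = Account()
plain = acct.has_plain_name()
mangled = acct.has_mangled_name()
```

`plain` is False and `mangled` is True, even though both checks run inside the class that owns the attribute.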
And of course there's the standard dynamic language argument: the reality is that you have to test your code thoroughly to have much justification in making the claim "my software works". Such testing will be very unlikely to miss the bugs caused by accidentally clashing names. If you are not doing that testing, then many more uncaught bugs will be introduced by other means than by accidental name clashes.
In summation, the lack of private variables is just not that big a deal in idiomatic Python code in practice, and the addition of true private variables would cause more frequent problems in other ways IMHO.
Mangling happens with double underscores. Single underscores are more of a "please don't".
You don't need to know all the details of all parent classes (note that deep inheritance is usually best avoided), because you can still dir() and help() and any other form of introspection you can come up with.
As noted, you can use name mangling. However, you can stick with a single underscore (or none!) if you document your code adequately - you should not have so many private variables that this proves to be a problem. Just say if a method relies on a private variable, and add either the variable, or the name of the method to the class docstring to alert users.
Further, if you create unit tests, you should create tests that check invariants on members, and accordingly these should be able to show up such name clashes.
If you really want to have "private" variables, and for whatever reason name-mangling doesn't meet your needs, you can factor your private state into another object:
class Foo(object):
    class Stateholder(object): pass
    def __init__(self):
        self._state = Foo.Stateholder()
        self._state.private = 1

how to override class, or undeclare class or redeclare a Class in python?

Is it possible to override, undeclare, or redeclare a class in Python?
Yes, just declare it again:
class Foo(object): x = 1
class Foo(object): x = 2
The above code will not raise any error, and the name Foo will refer to the second class declared. Note however, that the class declared by the first declaration will still exist if anything refers to it, e.g. an instance, or a derived class.
This means that existing instances will not change class when you declare a new class with the same name, and existing subclasses will not magically inherit from the new class.
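A small sketch of that behaviour. The extra name `OldFoo` is introduced here only so the first class can still be referred to after the name Foo is rebound.

```python
class Foo:
    x = 1


old_instance = Foo()
OldFoo = Foo  # keep a reference to the first class object


class Foo:  # rebinds the name Foo to a brand-new class
    x = 2


new_instance = Foo()
```

`old_instance` still belongs to the first class and sees `x == 1`; it is not an instance of the new `Foo` at all.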
Probably the simplest method to deal with subclasses is to also re-declare them, so they inherit from the "renewed" base class. An alternative would be to mess with their __bases__ property, although I can't tell you if that would have unexpected results (there will almost certainly be some corner cases where this would not work).
As to existing instances, it is possible to re-assign their __class__ property with a new class. This does present two issues - first you have to find them (see this question: Printing all instances of a class), and second of all, items stored in instance __dict__ or __slots__ properties will still be there in those instances. If that is not something that should happen with your new class definition, you will have to write appropriate code to handle that as part of the transformation.
In summary, it's unlikely to be worth it except in quite simple cases. If you need complete uptime for a running system, you might be better off using a replication-based approach to achieve code changes.
Update: If this is the kind of thing you know you're going to do, another solution would be to use the strategy pattern.
Undeclare a class using del className as usual.

How can I make Python/Sphinx document object attributes only declared in __init__?

I have Python classes with object attributes which are only declared as part of running the constructor, like so:
import os
class Foo(object):
    def __init__(self, base):
        self.basepath = base
        temp = []
        for run in os.listdir(self.basepath):
            if self.foo(run):
                temp.append(run)
        self.availableruns = tuple(sorted(temp))
If I now use either help(Foo) or attempt to document Foo in Sphinx, the self.basepath and self.availableruns attributes are not shown. That's a problem for users of our API.
I've tried searching for a standard way to ensure that these "dynamically declared" attributes can be found (and preferably docstring'd) by the parser, but no luck so far. Any suggestions? Thanks.
I've tried searching for a standard way to ensure that these "dynamically declared" attributes can be found (and preferably docstring'd) by the parser, but no luck so far. Any suggestions?
They cannot ever be "detected" by any parser.
Python has setattr. The complete set of attributes is never "detectable", in any sense of the word.
You absolutely must describe them in the docstring.
[Unless you want to do a bunch of meta-programming to generate docstrings from stuff you gathered from inspect or something. Even then, your "solution" would be incomplete as soon as you start using setattr.]
class Foo(object):
    """
    :ivar basepath:
    :ivar availableruns:
    """
    def __init__(self, base):
        ...
You could define a class variable with the same name as the instance variable. That class variable will then be shadowed by the instance variable when you set it. E.g:
class Foo(object):
    #: Doc comment for availableruns
    availableruns = ()
    def __init__(self, base):
        ...
        self.availableruns = tuple(sorted(temp))
Indeed, if the instance variable has a useful immutable default value (e.g. None or the empty tuple), then you can save a little memory by just not setting the variable if it should have its default value. Of course, this approach won't work if you're talking about an instance variable that you might want to delete (e.g., del foo.availableruns), but I find that's not a very common case.
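A sketch of the default-plus-shadowing idea, simplified so it runs on its own: the constructor here takes a hypothetical `runs` argument instead of scanning a directory as in the question.

```python
class Foo:
    availableruns = ()  # class-level default, visible to introspection tools

    def __init__(self, runs=None):
        if runs:
            # Only instances with real data get their own attribute.
            self.availableruns = tuple(sorted(runs))


empty = Foo()
full = Foo(["b", "a"])

before = full.availableruns  # the instance's own tuple
del full.availableruns       # removes only the instance attribute
after = full.availableruns   # the class default shows through again
```

This also illustrates the deletion caveat: `del` on an instance that never set the attribute (such as `empty`) raises AttributeError, and after a successful `del` the name still resolves to the class default rather than disappearing.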
If you're using sphinx, and have "autoattribute" set, then this should get documented appropriately. Or, depending on the context of what you're doing, you could just directly use the Sphinx .. py:attribute:: directive.
