I think I understand the concept of "name mangling" in Python, but there's something that I have probably missed. Take a look at the following code:
#!/usr/bin/env python

class Base(object):
    __data = "Base"

    @classmethod
    def func(cls):
        return "Class name is {}, data is {}".format(cls.__name__, cls.__data)

class A(Base):
    __data = "A"

class B(A):
    __data = "B"

print Base.func()
print A.func()
print B.func()
Here is the output I get:
Class name is Base, data is Base
Class name is A, data is Base
Class name is B, data is Base
Now, I understand that for each class the actual name of the class attribute is mangled to _<Class name>__data. For instance, for Base it would be _Base__data, for A it would be _A__data, etc.
My question is: inside func, the class name is identified correctly (Base, A and B), but cls.__data always resolves to cls._Base__data. Why is that? I mean, if __name__ is A or B, then I know I'm inside class A or B, so I expect cls.__data to be the one of A or B respectively. What am I missing here?
You are not "missing" anything; on the contrary, you just "found out" what name mangling does: it is made to ensure that double-underscore attributes used inside a method always refer to the attribute defined in the same class as that method, and never to one redefined in a subclass.
If you simply want an attribute that is overridden in each subclass, use the normal behavior that applies to all other attributes, i.e. names not prefixed by two underscores.
So, what happens is that the .__data name used inside func is itself mangled, at compile time, to _Base__data.
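If you do want func to pick up the value as redefined in each subclass, one workaround (a sketch, not part of the original code) is to build the mangled name at runtime from cls.__name__, falling back to the inherited value:

class Base(object):
    __data = "Base"

    @classmethod
    def func(cls):
        # Look up the name as mangled for cls itself; fall back to
        # Base's value if the subclass didn't define its own __data.
        data = getattr(cls, '_{0}__data'.format(cls.__name__), cls._Base__data)
        return "Class name is {}, data is {}".format(cls.__name__, data)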
OrderedDict
Python's collections.OrderedDict has an extra trap: Python ships both a pure-Python implementation, which uses __ for its "private" attributes as explained above, and a native implementation in C, whose private structures are not exposed to Python at all.
And the collections module ends the OrderedDict code block with these lines:
try:
    from _collections import OrderedDict
except ImportError:
    # Leave the pure Python version in place.
    pass
That is: the normal collections.OrderedDict is written in C, with a lot of opaque structures that can't be tapped into by subclasses.
The only way to get access to the Python-defined OrderedDict is to delete the OrderedDict attribute from the _collections module (in _collections, not the collections module) and reload the collections module.
If you do that and instantiate an ordered dict, the private data structures can be seen:
from imp import reload
import _collections, collections

backup = _collections.OrderedDict
del _collections.OrderedDict
collections = reload(collections)
PyOrderedDict = collections.OrderedDict
_collections.OrderedDict = backup

a = PyOrderedDict()
dir(a)
Out[xx]:
['_OrderedDict__hardroot',
 '_OrderedDict__map',
 '_OrderedDict__marker',
 '_OrderedDict__root',
 '_OrderedDict__update',
 '__class__',
 '__contains__',
 '__delattr__',
 ...
]
As you have noticed, the name used for name mangling is the name of the class where a method is declared, not the derived type of the current object.
The documentation for this feature explicitly gives an example about protecting a variable from a derived class (rather than from external code using an instance variable).
Name mangling is helpful for letting subclasses override methods without breaking intraclass method calls. For example:
class Mapping:
    def __init__(self, iterable):
        self.items_list = []
        self.__update(iterable)

    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)

    __update = update   # private copy of original update() method

class MappingSubclass(Mapping):
    def update(self, keys, values):
        # provides new signature for update()
        # but does not break __init__()
        for item in zip(keys, values):
            self.items_list.append(item)
Related
I have found no reference to a short constructor call that would initialize attributes of the caller's choice. I am looking for
class AClass:
    def __init__(self):
        pass

instance = AClass(var1=3, var2=5)
instead of writing the heavier
class AClass:
    def __init__(self, var1, var2):
        self.var1 = var1
        self.var2 = var2
or the much heavier
instance = AClass()
instance.var1 = 3
instance.var2 = 5
Am I missing something?
This is an excellent question and has been a puzzle also for me.
In the modern Python world, there are three (excellent) shorthand initializers (this term is clever, I am adopting it), depending on your needs. None requires any footwork with __init__ methods (which is what you wanted to avoid in the first place).
Namespace object
If you wish to assign arbitrary values to an instance (i.e. not enforced by the class), you should use a particular data structure called namespace. A namespace object is an object accessible with the dot notation, to which you can assign basically what you want.
You can import the Namespace class from argparse (it is covered here: How do I create a Python namespace (argparse.parse_args value)?). Since Python 3.3, a SimpleNamespace class is available in the standard types module.
from types import SimpleNamespace
instance = SimpleNamespace(var1=var1, var2=var2)
You can also write:
instance = SimpleNamespace()
instance.var1 = var1
instance.var2 = var2
Let's say it's the "quick and dirty" way, which works in a number of cases. In general there is not even a need to declare your class.
If you want your instances to have a few methods and properties as well, you can still do:
class AClass(SimpleNamespace):
    def mymethod(self, ...):
        pass
And then:
instance = AClass(var1=var1, var2=var2, ...)
That gives you maximum flexibility.
Named tuple
On the other hand, if you want the class to enforce those attributes, then you have another, more solid option.
A named tuple produces immutable instances, which are initialized once and for all. Think of them as ordinary tuples, but with each item also accessible with the dot notation. The namedtuple factory is part of the standard distribution of Python. This is how you generate your class:
from collections import namedtuple
AClass = namedtuple("AClass", "var1 var2")
Note how cool and short the definition is: no __init__ method required. You can actually complete your class after that (see the sketch below).
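For instance, a sketch of "completing" the generated class by subclassing it to add a derived property (the property is purely illustrative):

from collections import namedtuple

class AClass(namedtuple("AClass", "var1 var2")):
    """Immutable pair with an extra derived property."""
    @property
    def total(self):
        return self.var1 + self.var2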
And to create an object:
instance = AClass(var1, var2)
or
instance = AClass(var1=var1, var2=var2)
Named list
But what if you want the instance to be mutable, i.e. to allow you to update its properties? The answer is the named list (also known as RecordClass). Conceptually it is like a normal list whose items are also accessible with the dot notation.
There are various implementations. I personally use the aptly named namedlist.
The syntax is identical:
from namedlist import namedlist
AClass = namedlist("AClass", "var1 var2")
And to create an object:
instance = AClass(var1, var2)
or:
instance = AClass(var1=var1, var2=var2)
And you can then modify them:
instance.var1 = var3
But you can't add an attribute that is not defined.
>>> instance.var4 = var4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'instance' object has no attribute 'var4'
Usage
Here are my two bits:
The namespace object gives maximum flexibility, and there is not even a need to declare a class, at the risk of having instances that don't behave consistently (but Python is a language for consenting adults). If you have only one instance and/or you know what you're doing, that is the way to go.
The namedtuple class generator is perfect for objects returned from functions (see this brief explanation in a lecture from Raymond Hettinger). Rather than returning bland tuples whose fields the user needs to look up in the documentation, the tuple returned is self-explanatory (a dir or help will do it). And it's compatible with tuple usage anyway (e.g. k, v, z = my_func()). Plus it's immutable, which has its own advantages.
namedlist class generator is useful in a wide range of cases, including when you need to return multiple values from a function, which then need to be amended at a later stage (and you can still unpack them: k, v, z = instance). If you need a mutable object from a proper class with enforced attributes, that might be the go-to solution.
If you use them well, this might significantly cut down time spent on writing classes and handling instances!
Update (September 2020)
@PPC: your dream has come true.
Since Python 3.7, a new tool is available as a standard: dataclasses (unsurprisingly, the designer of the named list package, Eric V. Smith, is also behind it).
In essence, it provides an automatic initialization of class variables.
from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
(from the official doc)
What the @dataclass decorator does is automatically add the __init__() method:
def __init__(self, name: str, unit_price: float, quantity_on_hand: int = 0):
    self.name = name
    self.unit_price = unit_price
    self.quantity_on_hand = quantity_on_hand
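Usage is then straightforward (the values here are made up for illustration):

item = InventoryItem("widget", unit_price=3.0, quantity_on_hand=10)
print(item.total_cost())  # 30.0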
IMHO, it's a pretty, eminently pythonic solution.
Eric also maintains a backport of dataclasses on github, for Python 3.6.
You can update the __dict__ attribute of your object directly; that is where the attributes are stored:
class AClass:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

c = AClass(var1=1, var2='a')
You can use the dictionary representation of the object's attributes, and update its elements with the keyword arguments given to the constructor:
class AClass:
    def __init__(self, **kwargs):
        self.__dict__.update(**kwargs)

instance = AClass(var1=3, var2=5)
print(instance.var1, instance.var2)  # prints 3 5
However, consider this question and its answers regarding the style of this approach. Unless you know what you are doing, it is better to set the arguments explicitly, one by one; that will be more understandable for you and for other people later, and explicit is better than implicit. If you do it the __dict__.update way, document it properly (one way to keep it safer is sketched below).
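For instance, a sketch that keeps the shortcut but fails loudly on unexpected names (the allowed set is a hypothetical whitelist, not part of the original answer):

class AClass:
    def __init__(self, **kwargs):
        allowed = {'var1', 'var2'}  # hypothetical whitelist of attribute names
        unexpected = set(kwargs) - allowed
        if unexpected:
            raise TypeError('unexpected arguments: {}'.format(sorted(unexpected)))
        self.__dict__.update(kwargs)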
Try
class AClass:
    def __init__(self, **vars):
        self.var1 = vars.get('var1')
Background
I wish to use a metaclass in order to add helper methods based on the original class. If the method I wish to add uses self.__attributeName, I get an AttributeError (because of name mangling), but an existing, identical method has no such problem.
Code example
Here is a simplified example
# Function to be added as a method of Test
def newfunction2(self):
    """Function identical to newfunction"""
    print self.mouse
    print self._dog
    print self.__cat

class MetaTest(type):
    """Metaclass to process the original class and
    add new methods based on the original class
    """
    def __new__(meta, name, base, dct):
        newclass = super(MetaTest, meta).__new__(
            meta, name, base, dct
        )
        # Condition for adding newfunction2
        if "newfunction" in dct:
            print "Found newfunction!"
            print "Add newfunction2!"
            setattr(newclass, "newfunction2", newfunction2)
        return newclass

# Class to be modified by MetaTest
class Test(object):
    __metaclass__ = MetaTest

    def __init__(self):
        self.__cat = "cat"
        self._dog = "dog"
        self.mouse = "mouse"

    def newfunction(self):
        """Function identical to newfunction2"""
        print self.mouse
        print self._dog
        print self.__cat

T = Test()
T.newfunction()
T.newfunction2()  # AttributeError: 'Test' object has no attribute '__cat'
Question
Is there a way of adding newfunction2 that could use self.__cat?
(Without renaming self.__cat to self._cat.)
And maybe something more fundamental: why isn't self.__cat treated the same way in both cases, since newfunction2 is now part of Test?
Name mangling happens when the methods in a class are compiled. Attribute names like __foo are turned into _ClassName__foo, where ClassName is the name of the class the method is defined in. Note that you can use name mangling on attributes of other objects too!
In your code, the name mangling in newfunction2 doesn't work because the function is not part of the class when it is compiled. Thus the lookups of __cat don't get turned into _Test__cat the way they did in Test.__init__. You could explicitly look up the mangled version of the attribute name if you want, but it sounds like you want newfunction2 to be generic and able to be added to multiple classes. Unfortunately, that doesn't work with name mangling.
Indeed, preventing code not defined in your class from accessing your attributes is the whole reason to use name mangling. Usually it's only worth bothering with if you're writing a proxy or mixin type and you don't want your internal-use attributes to collide with the attributes of the class you're proxying or mixing in with (which you won't know in advance).
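If you do need a generic lookup anyway, one workaround is to try the mangled name for each class in the instance's MRO (a sketch; find_mangled is a made-up helper, not a stdlib function):

def find_mangled(obj, name):
    # We don't know which class "owns" the attribute, so try the
    # mangled form '_<ClassName>' + name for every class in the MRO.
    for cls in type(obj).__mro__:
        try:
            return getattr(obj, '_{0}{1}'.format(cls.__name__.lstrip('_'), name))
        except AttributeError:
            pass
    raise AttributeError(name)

# e.g. find_mangled(T, '__cat') returns "cat" for the Test instance above.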
To answer both of your questions:
You will need to change self.__cat to self._Test__cat when you call it from newfunction2, thanks to the name-mangling rule.
Python documentation:
This mangling is done without regard to the syntactic position of the
identifier, as long as it occurs within the definition of a class.
Let me break it down for you: it is saying that it doesn't matter where the interpreter is when it encounters the name; the name will only be mangled if it occurs textually within the definition of a class, which in your case it does not, since newfunction2 is not defined directly inside a class body. So when it reads self.__cat, it keeps it as self.__cat, rather than textually replacing it with self._Test__cat, because the function isn't defined inside the Test class.
You can use <Test instance>._Test__cat to access the __cat attribute from the Test class. (where <Test instance> is replaced by self or any other instance of the Test class)
Learn more in the Python docs.
class B:
    def __init__(self):
        self.__private = 0

    def __private_method(self):
        '''A private method via inheritance'''
        return '{!r}'.format(self)

    def internal_method(self):
        return '{!s}'.format(self)

class C(B):
    def __init__(self):
        super().__init__()
        self.__private = 1

    def __private_method(self):
        return 'Am from class C'

c = C()
print(c.__dict__)
b = B()
print(b.__dict__)
print(b._B__private)
print(c._C__private_method())
Suppose I have a large number of classes defined by an import of a large library codebase, which I don't want to hack around with, for reasons of maintainability. They all inherit from BaseClass, and BaseClass contains a method which I want to augment. I think the following is a workable solution:
class MyMixin(object):
    def method(self, args):
        ...  # 1. a few lines of code copied from BaseClass's def of method
        ...  # 2. some lines of my code that can't go before or after the copied code
        ...  # 3. and the rest of the copied code

class MyAbcClass(MyMixin, AbcClass):
    pass

# many similar lines

class MyZzzClass(MyMixin, ZzzClass):
    pass
The question. Is there a way to take, say, a list of ("MyXxxClass", XxxClass) tuples, and write code that defines the MyXxxClasses? And is it sufficiently comprehensible that it beats the repetition in the above?
Use three-arg type to define the classes, then set them on the module's global dictionary:
todefine = [('MyAbcClass', AbcClass), ...]
for name, base in todefine:
    globals()[name] = type(name, (MyMixin, base), {})
If the names to define follow the fixed pattern you gave ("My" + base class name), you can repeat yourself even less by dynamically constructing the name to define:
todefine = [AbcClass, ...]
for base in todefine:
    name = "My" + base.__name__
    globals()[name] = type(name, (MyMixin, base), {})
And if you are trying to wrap all the classes from a given module, you can avoid even explicitly listing the classes by introspecting the module to generate todefine programmatically (if you know the module has or lacks __all__ you can just use the appropriate approach instead of trying one and defaulting to the other):
import inspect

try:
    # For modules that define __all__, we want all exported classes,
    # even if they weren't originally defined in the module
    todefine = filter(inspect.isclass,
                      (getattr(somemodule, name) for name in somemodule.__all__))
except AttributeError:
    # If __all__ is not defined, use a heuristic approach: exclude private
    # names defined with a leading underscore, and objects that were
    # imported from other modules (so if the module does
    # `from itertools import chain`, we don't wrap chain)
    todefine = (obj for name, obj in vars(somemodule).items()
                if not name.startswith('_')
                and inspect.isclass(obj)
                and inspect.getmodule(obj) is somemodule)
Python's inner/nested classes confuse me. Is there something that can't be accomplished without them? If so, what is that thing?
Quoted from http://www.geekinterview.com/question_details/64739:
Advantages of inner class:
Logical grouping of classes: If a class is useful to only one other class then it is logical to embed it in that class and keep the two together. Nesting such "helper classes" makes their package more streamlined.
Increased encapsulation: Consider two top-level classes, A and B, where B needs access to members of A that would otherwise be declared private. By hiding class B within class A, A's members can be declared private and B can still access them. In addition, B itself can be hidden from the outside world.
More readable, maintainable code: Nesting small classes within top-level classes places the code closer to where it is used.
The main advantage is organization. Anything that can be accomplished with inner classes can be accomplished without them.
Is there something that can't be accomplished without them?
No. They are absolutely equivalent to defining the class normally at top level, and then copying a reference to it into the outer class.
I don't think there's any special reason nested classes are ‘allowed’, other than it makes no particular sense to explicitly ‘disallow’ them either.
If you're looking for a class that exists within the lifecycle of the outer/owner object, and always has a reference to an instance of the outer class (inner classes as Java does them), then Python's nested classes are not that thing. But you can hack up something like that thing:
import weakref, new

class innerclass(object):
    """Descriptor for making inner classes.

    Adds a property 'owner' to the inner class, pointing to the outer
    owner instance.
    """
    # Use a weakref dict to memoise previous results so that
    # instance.Inner() always returns the same inner classobj.
    #
    def __init__(self, inner):
        self.inner = inner
        self.instances = weakref.WeakKeyDictionary()

    # Not thread-safe - consider adding a lock.
    #
    def __get__(self, instance, _):
        if instance is None:
            return self.inner
        if instance not in self.instances:
            self.instances[instance] = new.classobj(
                self.inner.__name__, (self.inner,), {'owner': instance}
            )
        return self.instances[instance]

# Using an inner class
#
class Outer(object):
    @innerclass
    class Inner(object):
        def __repr__(self):
            return '<%s.%s inner object of %r>' % (
                self.owner.__class__.__name__,
                self.__class__.__name__,
                self.owner
            )
>>> o1 = Outer()
>>> o2 = Outer()
>>> i1 = o1.Inner()
>>> i1
<Outer.Inner inner object of <__main__.Outer object at 0x7fb2cd62de90>>
>>> isinstance(i1, Outer.Inner)
True
>>> isinstance(i1, o1.Inner)
True
>>> isinstance(i1, o2.Inner)
False
(This uses class decorators, which are new in Python 2.6 and 3.0. Otherwise you'd have to say "Inner = innerclass(Inner)" after the class definition.)
There's something you need to wrap your head around to be able to understand this. In most languages, class definitions are directives to the compiler. That is, the class is created before the program is ever run. In Python, all statements are executable. That means that this statement:
class foo(object):
    pass
is a statement that is executed at runtime just like this one:
x = y + z
This means that not only can you create classes within other classes, you can create classes anywhere you want to. Consider this code:
def foo():
    class bar(object):
        ...
    z = bar()
Thus, the idea of an "inner class" isn't really a language construct; it's a programmer construct. Guido has a very good summary of how this came about here. But essentially, the basic idea is that this simplifies the language's grammar.
Nesting classes within classes:
Nested classes bloat the class definition, making it harder to see what's going on.
Nested classes can create coupling that would make testing more difficult.
In Python you can put more than one class in a file/module, unlike Java, so the class still remains close to the top-level class; you can even prefix the class name with an "_" to signify that others shouldn't be using it.
The place where nested classes can prove useful is within functions:
def some_func(a, b, c):
    class SomeClass(a):
        def some_method(self):
            return b
    SomeClass.__doc__ = c
    return SomeClass
The class captures the values from the enclosing function, allowing you to create classes dynamically, like template metaprogramming in C++.
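A quick usage illustration (the argument values are made up):

Greeter = some_func(object, "hello", "A dynamically created class.")
g = Greeter()
print(g.some_method())   # hello
print(Greeter.__doc__)   # A dynamically created class.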
I understand the arguments against nested classes, but there is a case for using them on some occasions. Imagine I'm creating a doubly-linked list class, and I need to create a node class for maintaining the nodes. I have two choices: create the Node class inside the DoublyLinkedList class, or create the Node class outside the DoublyLinkedList class. I prefer the first choice in this case, because the Node class is only meaningful inside the DoublyLinkedList class. While there's no hiding/encapsulation benefit, there is a grouping benefit of being able to say the Node class is part of the DoublyLinkedList class (a minimal sketch follows).
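A minimal sketch of that grouping (the names and methods are illustrative, not a full implementation):

class DoublyLinkedList(object):

    class Node(object):
        """Only meaningful as part of DoublyLinkedList, hence nested."""
        def __init__(self, value, prev=None, next=None):
            self.value = value
            self.prev = prev
            self.next = next

    def __init__(self):
        self.head = None

    def push_front(self, value):
        # The nested class is reached through the outer class's namespace.
        node = self.Node(value, next=self.head)
        if self.head is not None:
            self.head.prev = node
        self.head = node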
Is there something that can't be accomplished without them? If so,
what is that thing?
There is something that cannot be easily done without: inheritance of related classes.
Here is a minimalist example with the related classes A and B:
class A(object):
    class B(object):
        def __init__(self, parent):
            self.parent = parent

    def make_B(self):
        return self.B(self)

class AA(A):  # Inheritance
    class B(A.B):  # Inheritance, same class name
        pass
This code leads to a quite reasonable and predictable behaviour:
>>> type(A().make_B())
<class '__main__.A.B'>
>>> type(A().make_B().parent)
<class '__main__.A'>
>>> type(AA().make_B())
<class '__main__.AA.B'>
>>> type(AA().make_B().parent)
<class '__main__.AA'>
If B were a top-level class, you could not write self.B() in the method make_B; you would simply write B(), and thus lose the dynamic binding to the adequate classes.
Note that in this construction, you should never refer to class A in the body of class B. This is the motivation for introducing the parent attribute in class B.
Of course, this dynamic binding can be recreated without inner class at the cost of a tedious and error-prone instrumentation of the classes.
1. Two functionally equivalent ways
The two ways shown below are functionally identical. However, there are some subtle differences, and there are situations when you would like to choose one over the other.
Way 1: Nested class definition (="Nested class")
class MyOuter1:
    class Inner:
        def show(self, msg):
            print(msg)
Way 2: With a module-level Inner class attached to the outer class (="Referenced inner class")
class _InnerClass:
    def show(self, msg):
        print(msg)

class MyOuter2:
    Inner = _InnerClass
The underscore is used to follow PEP 8: "internal interfaces (packages, modules, classes, functions, attributes or other names) should ... be prefixed with a single leading underscore."
2. Similarities
The code snippet below demonstrates the functional similarities of the "Nested class" and the "Referenced inner class"; they behave the same way in code that checks the type of an inner-class instance. Needless to say, m.Inner().anymethod() would behave the same for m1 and m2.
m1 = MyOuter1()
m2 = MyOuter2()
innercls1 = getattr(m1, 'Inner', None)
innercls2 = getattr(m2, 'Inner', None)
isinstance(innercls1(), MyOuter1.Inner)
# True
isinstance(innercls2(), MyOuter2.Inner)
# True
type(innercls1()) == mypackage.outer1.MyOuter1.Inner
# True (when part of mypackage)
type(innercls2()) == mypackage.outer2.MyOuter2.Inner
# True (when part of mypackage)
3. Differences
The differences of "Nested class" and "Referenced inner class" are listed below. They are not big, but sometimes you would like to choose one or the other based on these.
3.1 Code Encapsulation
With "Nested classes" it is possible to encapsulate code better than with "Referenced inner class". A class in the module namespace is a global variable. The purpose of nested classes is to reduce clutter in the module and put the inner class inside the outer class.
Even though no-one* is using from packagename import *, a low number of module-level variables can be nice, for example when using an IDE with code completion / intellisense.
*Right?
3.2 Readability of code
The Django documentation instructs you to use the inner class Meta for model metadata. It is a bit clearer* to instruct framework users to write a class Foo(models.Model) with an inner class Meta:
class Ox(models.Model):
    horn_length = models.IntegerField()

    class Meta:
        ordering = ["horn_length"]
        verbose_name_plural = "oxen"
instead of "write a class _Meta, then write a class Foo(models.Model) with Meta = _Meta";
class _Meta:
    ordering = ["horn_length"]
    verbose_name_plural = "oxen"

class Ox(models.Model):
    Meta = _Meta
    horn_length = models.IntegerField()
With the "Nested class" approach the code can be read a nested bullet point list, but with the "Referenced inner class" method one has to scroll back up to see the definition of _Meta to see its "child items" (attributes).
The "Referenced inner class" method can be more readable if your code nesting level grows or the rows are long for some other reason.
* Of course, a matter of taste
3.3 Slightly different error messages
This is not a big deal, but just for completeness: when accessing a non-existent attribute of the inner class, we see slightly different exceptions. Continuing the example given in Section 2:
innercls1.foo()
# AttributeError: type object 'Inner' has no attribute 'foo'
innercls2.foo()
# AttributeError: type object '_InnerClass' has no attribute 'foo'
This is because the types of the inner classes are
type(innercls1())
#mypackage.outer1.MyOuter1.Inner
type(innercls2())
#mypackage.outer2._InnerClass
The main use case I have for this is to prevent the proliferation of small modules and to prevent namespace pollution when separate modules are not needed: I am extending an existing class, but that existing class must reference another subclass that should always be coupled to it. For example, I may have a utils.py module that has many helper classes in it which aren't necessarily coupled together, but I want to reinforce coupling for some of them. For example, when I implement https://stackoverflow.com/a/8274307/2718295:
utils.py:
import json, decimal

class Helper1(object):
    pass

class Helper2(object):
    pass

# Here is the notorious JSONEncoder extension to serialize Decimals to JSON floats
class DecimalJSONEncoder(json.JSONEncoder):

    class _repr_decimal(float):  # Because float.__repr__ cannot be monkey-patched
        def __init__(self, obj):
            self._obj = obj

        def __repr__(self):
            return '{:f}'.format(self._obj)

    def default(self, obj):  # override JSONEncoder.default
        if isinstance(obj, decimal.Decimal):
            return self._repr_decimal(obj)
        # else
        return super(DecimalJSONEncoder, self).default(obj)
        # could also have inherited from object and used
        # return json.JSONEncoder.default(self, obj)
Then we can:
>>> from utils import DecimalJSONEncoder
>>> import json, decimal
>>> json.dumps({'key1': decimal.Decimal('1.12345678901234'),
... 'key2': 'strKey2Value'}, cls=DecimalJSONEncoder)
'{"key2": "strKey2Value", "key1": 1.12345678901234}'
Of course, we could have eschewed inheriting from json.JSONEncoder altogether and just overridden default():
import decimal, json

class Helper1(object):
    pass

def json_encoder_decimal(obj):
    class _repr_decimal(float):
        ...

    if isinstance(obj, decimal.Decimal):
        return _repr_decimal(obj)
    return json.JSONEncoder().default(obj)
>>> json.dumps({'key1': decimal.Decimal('1.12345678901234')}, default=json_encoder_decimal)
'{"key1": 1.12345678901234}'
But sometimes just for convention, you want utils to be composed of classes for extensibility.
Here's another use-case: I want a factory for mutables in my OuterClass without having to invoke copy:
class OuterClass(object):

    class DTemplate(dict):
        def __init__(self):
            self.update({'key1': [1, 2, 3],
                         'key2': {'subkey': [4, 5, 6]}})

    def __init__(self):
        self.outerclass_dict = {
            'outerkey1': self.DTemplate(),
            'outerkey2': self.DTemplate()}

obj = OuterClass()
obj.outerclass_dict['outerkey1']['key2']['subkey'].append(4)
assert obj.outerclass_dict['outerkey2']['key2']['subkey'] == [4, 5, 6]
I prefer this pattern over the @staticmethod decorator you would otherwise use for a factory function (a contrasting sketch follows).
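For contrast, a sketch of the @staticmethod-factory version of the same idea (OuterClass2 is a hypothetical variant, not from the original answer):

class OuterClass2(object):

    @staticmethod
    def make_dtemplate():
        # Factory function returning a fresh mutable template on each call.
        return {'key1': [1, 2, 3], 'key2': {'subkey': [4, 5, 6]}}

    def __init__(self):
        self.outerclass_dict = {
            'outerkey1': self.make_dtemplate(),
            'outerkey2': self.make_dtemplate()}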
I have used Python's inner classes to create deliberately buggy subclasses within unittest functions (i.e. inside def test_something():) in order to get closer to 100% test coverage (e.g. testing very rarely triggered logging statements by overriding some methods).
In retrospect it's similar to Ed's answer https://stackoverflow.com/a/722036/1101109
Such inner classes should go out of scope and be ready for garbage collection once all references to them have been removed. For instance, take the following inner.py file:
class A(object):
    pass

def scope():
    class Buggy(A):
        """Do tests or something"""
    assert isinstance(Buggy(), A)
I get the following curious results under OSX Python 2.7.6:
>>> from inner import A, scope
>>> A.__subclasses__()
[]
>>> scope()
>>> A.__subclasses__()
[<class 'inner.Buggy'>]
>>> del A, scope
>>> from inner import A
>>> A.__subclasses__()
[<class 'inner.Buggy'>]
>>> del A
>>> import gc
>>> gc.collect()
0
>>> gc.collect() # Yes I needed to call the gc twice, seems reproducible
3
>>> from inner import A
>>> A.__subclasses__()
[]
Hint - Don't go on and try doing this with Django models, which seemed to keep other (cached?) references to my buggy classes.
So in general, I wouldn't recommend using inner classes for this kind of purpose unless you really do value that 100% test coverage and can't use other methods. Though I think it's nice to be aware that if you use __subclasses__(), it can sometimes get polluted by inner classes. Either way, if you followed this far, I think we're pretty deep into Python at this point, private dunderscores and all.
The following code
import types

class A:
    class D:
        pass

    class C:
        pass

for d in dir(A):
    if type(eval('A.' + d)) is types.ClassType:
        print d
outputs
C
D
How do I get it to output in the order in which these classes were defined in the code? I.e.
D
C
Is there any way other than using inspect.getsource(A) and parsing that?
Note that that parsing is already done for you in inspect - take a look at inspect.findsource, which searches the module for the class definition and returns the source and line number. Sorting on that line number (you may also need to split out classes defined in separate modules) should give the right order.
However, this function doesn't seem to be documented, and is just using a regular expression to find the line, so it may not be too reliable.
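A sketch of that approach for the nested classes above (assuming everything is defined in one module, and keeping the reliability caveat in mind; the helper name is made up):

import inspect

def nested_classes_in_order(outer):
    # Sort the classes found in `outer` by the line number at which
    # inspect.findsource locates their definition.
    nested = [obj for obj in vars(outer).values() if inspect.isclass(obj)]
    return sorted(nested, key=lambda cls: inspect.findsource(cls)[1])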
Another option is to use metaclasses, or some other way to attach ordering information to the objects, either implicitly or explicitly. For example:
import itertools, operator

next_id = itertools.count().next

class OrderedMeta(type):
    def __init__(cls, name, bases, dct):
        super(OrderedMeta, cls).__init__(name, bases, dct)
        cls._order = next_id()

# Set the default metaclass
__metaclass__ = OrderedMeta

class A:
    class D:
        pass

    class C:
        pass

print sorted([cls for cls in [getattr(A, name) for name in dir(A)]
              if isinstance(cls, OrderedMeta)], key=operator.attrgetter("_order"))
However, this is a fairly intrusive change (it requires setting the metaclass of any classes you're interested in to OrderedMeta).
The inspect module also has the findsource function. It returns a tuple of source lines and line number where the object is defined.
>>> import inspect
>>> import StringIO
>>> inspect.findsource(StringIO.StringIO)[1]
41
>>>
The findsource function actually searches through the source file and looks for likely candidates if it is given a class object.
Given a method-, function-, traceback-, frame-, or code-object, it simply looks at the co_firstlineno attribute of the (contained) code-object.
No, you can't get those attributes in the order you're looking for. Python attributes are stored in a dict (read: hashmap), which has no awareness of insertion order.
Also, I would avoid the use of eval by simply saying
if type(getattr(A, d)) is types.ClassType:
    print d
in your loop. Note that you can also just iterate through the key/value pairs in A.__dict__.
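A quick sketch of that (Python 2, matching the question's old-style classes):

import types

for name, obj in A.__dict__.items():
    if type(obj) is types.ClassType:
        print name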
AFAIK, no -- there isn't*. This is because all of a class's attributes are stored in a dictionary (which is, as you know, unordered).
*: it might actually be possible, but that would require either decorators or possibly metaclass hacking. Do either of those interest you?
import inspect

class ExampleObject:
    def example2(self):
        pass

    def example1(self):
        pass

context = ExampleObject

def sort_key(item):
    return inspect.findsource(item)[1]

properties = [
    getattr(context, attribute) for attribute in dir(context)
    if callable(getattr(context, attribute)) and
    not attribute.startswith('__')
]
properties.sort(key=sort_key)
print(properties)
Should print out:
[<function ExampleObject.example2 at 0x7fc2baf9e940>, <function ExampleObject.example1 at 0x7fc2bae5e790>]
I needed this as well for a compiler I'm building, and it proved very useful.
I'm not trying to be glib here, but would it be feasible for you to organize the classes in your source alphabetically? I find that when there are lots of classes in one file this can be useful in its own right.