Access all instances of a class while debugging - python

Is there a way in Python/PyDev to see and access instances of a certain class while debugging?
For instance, if I define SomeClass and various modules in a single python interpreter script instantiate this class, is there a way to see how many such instances exist in the interpreter and to access their attributes in a central fashion, without coercing the code to hold references to them from a single location (such as the module where the class is defined)?

You could find all such objects using gc.get_objects().
For example, if you define the class Foo in the module othermod.py:
class Foo(object):
    pass
f2 = Foo()
then you can count all instances of Foo in script script.py like this:
import gc
import othermod
f = othermod.Foo()
objs = gc.get_objects()
# print(len(objs))
# 3519
print(len([obj for obj in objs if isinstance(obj,othermod.Foo)]))
# 2
Caveat: gc.get_objects does not track instances of atomic types (like int or str), but it sounds like that is not the kind of object you want to track.
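To make the approach above reusable from a debugger console, the list comprehension can be wrapped in a small helper. This is only a sketch: the name instances_of and the demo class are made up for illustration, and it only sees objects that the garbage collector tracks.

```python
import gc


def instances_of(cls):
    """Return every live, gc-tracked object that is an instance of cls."""
    return [obj for obj in gc.get_objects() if isinstance(obj, cls)]


class Foo(object):
    def __init__(self, name):
        self.name = name


# Instances created in different places, with no central registry.
first = Foo("first")
holder = {"nested": Foo("second")}

found = instances_of(Foo)
names = sorted(obj.name for obj in found)
print(len(found), names)
```

Note that the returned list itself keeps the instances alive, so drop it once you are done inspecting.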

Another option is to use the objgraph module:
In [1]: class A(object): pass
In [2]: class B: pass
In [3]: test1 = [A() for i in range(3)]
In [4]: test2 = [A() for i in range(3)]
In [5]: test3 = [B() for i in range(5)]
In [6]: import objgraph
In [7]: objgraph.by_type('A')
Out[7]:
[<__main__.A at 0x2ccc130>,
<__main__.A at 0x2ccc150>,
<__main__.A at 0x2ccc170>,
<__main__.A at 0x2cbb790>,
<__main__.A at 0x2cbb1b0>,
<__main__.A at 0x2cbb7f0>]
But it will not work for old-style classes:
In [8]: objgraph.by_type('B')
Out[8]: []
objgraph uses information from the garbage collector, as in unutbu's answer.

Related

Is it possible to make the output of `type` return a different class?

So disclaimer: this question has piqued my curiosity a bit, and I'm asking this for purely educational purposes. More of a challenge for the Python gurus here I suppose!
Is it possible to make the output of type(foo) return a different value than the actual instance class? i.e. can it pose as an imposter and pass a check such as type(Foo()) is Bar?
@juanpa.arrivillaga made a suggestion of manually re-assigning __class__ on the instance, but that has the effect of changing how all other methods would be called. e.g.
class Foo:
    def test(self):
        return 1
class Bar:
    def test(self):
        return 2
foo = Foo()
foo.__class__ = Bar
print(type(foo) is Bar)
print(foo.test())
>>> True
>>> 2
The desired outputs would be True, 1, i.e. the class returned by type is different from the instance's real class, and the instance methods defined in the real class still get invoked.
No - the __class__ attribute is fundamental information in the layout of all Python objects as "seen" at the C API level itself, and that is what the call to type checks.
That means: every Python object has a slot in its in-memory layout with space for a single pointer to the Python object that is that object's class.
Even if you use ctypes or other means to bypass the protection on that slot and change it from Python code (since modifying obj.__class__ with = is guarded at the C level), changing it effectively changes the object's type: the value in the __class__ slot IS the object's class, and in your example the test method would be picked from the class stored there (Bar).
However, there is more to say here: all documentation treats type(obj) as equivalent to obj.__class__, but if the object's class defines a descriptor named __class__, that descriptor is used when one writes obj.__class__. type(obj), however, checks the instance's __class__ slot directly and returns the true class.
So, this can "lie" to code using obj.__class__, but not type(obj):
class Bar:
    def test(self):
        return 2
class Foo:
    def test(self):
        return 1
    @property
    def __class__(self):
        return Bar
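A short check of the difference described above, written as a self-contained sketch (the classes are repeated so that the snippet runs on its own):

```python
class Bar:
    def test(self):
        return 2


class Foo:
    def test(self):
        return 1

    @property
    def __class__(self):
        # Only attribute-style access (obj.__class__) goes through this property.
        return Bar


foo = Foo()
print(foo.__class__ is Bar)   # attribute access is answered by the property
print(type(foo) is Foo)       # type() reads the real class
print(foo.test())             # methods still come from Foo
```

So code that inspects obj.__class__ is fooled, while code that calls type(obj) is not.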
Property on the metaclass
Trying to mess with creating a __class__ descriptor on the metaclass of Foo itself will be messy -- both type(Foo()) and repr(Foo()) will report an instance of Bar, but the "real" object class will be Foo. In a sense, yes, it makes type(Foo()) lie, but not in the way you were thinking about - type(Foo()) will output the repr of Bar(), but it is Foo's repr that is messed up, due to implementation details inside type.__call__:
In [73]: class M(type):
...:     @property
...:     def __class__(cls):
...:         return Bar
...:
In [74]: class Foo(metaclass=M):
...:     def test(self):
...:         return 1
...:
In [75]: type(Foo())
Out[75]: <__main__.Bar at 0x55665b000578>
In [76]: type(Foo()) is Bar
Out[76]: False
In [77]: type(Foo()) is Foo
Out[77]: True
In [78]: Foo
Out[78]: <__main__.Bar at 0x55665b000578>
In [79]: Foo().test()
Out[79]: 1
In [80]: Bar().test()
Out[80]: 2
In [81]: type(Foo())().test()
Out[81]: 1
Modifying type itself
Since no one "imports" type from anywhere and everyone just uses the built-in type itself, it is possible to monkeypatch the built-in type callable to report a false class, and it will work for all Python code in the same process that relies on the call to type:
original_type = __builtins__["type"] if isinstance(__builtins__, dict) else __builtins__.type
def type(obj_or_name, bases=None, attrs=None, **kwargs):
    if bases is not None:
        return original_type(obj_or_name, bases, attrs, **kwargs)
    if hasattr(obj_or_name, "__fakeclass__"):
        return getattr(obj_or_name, "__fakeclass__")
    return original_type(obj_or_name)
if isinstance(__builtins__, dict):
    __builtins__["type"] = type
else:
    __builtins__.type = type
del type
There is one trick here that I had not found in the docs: when accessing __builtins__ in a program, it works as a dictionary. However, in an interactive environment such as Python's REPL or IPython, it is a module. Retrieving the original type and writing the modified version to __builtins__ has to take that into account; the code above works both ways.
And testing this (I imported the snippet above from a .py file on disk):
>>> class Bar:
...     def test(self):
...         return 2
...
>>> class Foo:
...     def test(self):
...         return 1
...     __fakeclass__ = Bar
...
>>> type(Foo())
<class '__main__.Bar'>
>>>
>>> Foo().__class__
<class '__main__.Foo'>
>>> Foo().test()
1
Although this works for demonstration purposes, replacing the built-in type causes "dissonances" that prove fatal in a more complex environment such as IPython: IPython will crash and terminate immediately if the snippet above is run.

how to make a copy of a class in python?

I have a class A
class A(object):
    a = 1
    def __init__(self):
        self.b = 10
    def foo(self):
        print type(self).a
        print self.b
Then I want to create a class B, which is equivalent to A but has a different name and a different value for the class member a.
This is what I have tried:
class A(object):
    a = 1
    def __init__(self):
        self.b = 10
    def foo(self):
        print type(self).a
        print self.b
A_dummy = type('A_dummy',(object,),{})
A_attrs = {attr:getattr(A,attr) for attr in dir(A) if (not attr in dir(A_dummy))}
B = type('B',(object,),A_attrs)
B.a = 2
a = A()
a.foo()
b = B()
b.foo()
However, I got an error:
File "test.py", line 31, in main
b.foo()
TypeError: unbound method foo() must be called with A instance as first argument (got nothing instead)
So how can I cope with this sort of job (creating a copy of an existing class)? Maybe a metaclass is needed? But what I would prefer is just a function FooCopyClass, such that:
B = FooCopyClass('B',A)
A.a = 10
B.a = 100
print A.a # get 10 as output
print B.a # get 100 as output
In this case, modifying the class member of B won't influence A, and vice versa.
The problem you're encountering is that looking up a method attribute on a Python 2 class creates an unbound method; it doesn't return the underlying raw function (on Python 3, unbound methods are abolished, and what you're attempting would work just fine). You need to bypass the descriptor protocol machinery that converts from function to unbound method. The easiest way is to use vars to grab the class's attribute dictionary directly:
# Make copy of A's attributes
Bvars = vars(A).copy()
# Modify the desired attribute
Bvars['a'] = 2
# Construct the new class from it
B = type('B', (object,), Bvars)
Equivalently, you could copy and initialize B in one step, then reassign B.a after:
# Still need to copy; can't initialize from the proxy type vars(SOMECLASS)
# returns to protect the class internals
B = type('B', (object,), vars(A).copy())
B.a = 2
Or for slightly non-idiomatic one-liner fun:
B = type('B', (object,), dict(vars(A), a=2))
Either way, when you're done:
B().foo()
will output:
2
10
as expected.
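For completeness, here is a rough Python 3 sketch of the FooCopyClass function the question asks for, built on the same vars() idea. The name foo_copy_class and the overrides keyword arguments are my own choices, not an established API; foo returns a tuple instead of printing so the result is easy to check. It is a shallow copy, so mutable class attributes are still shared.

```python
def foo_copy_class(new_name, cls, **overrides):
    """Build a new class with the same bases and attributes as cls (shallow copy)."""
    # '__dict__' and '__weakref__' are descriptors that belong to the original class;
    # type() creates fresh ones for the new class.
    attrs = {name: value for name, value in vars(cls).items()
             if name not in ("__dict__", "__weakref__")}
    attrs.update(overrides)
    return type(new_name, cls.__bases__, attrs)


class A(object):
    a = 1

    def __init__(self):
        self.b = 10

    def foo(self):
        return (type(self).a, self.b)


B = foo_copy_class("B", A, a=2)
A.a = 10
B.a = 100
print(A.a, B.a, B().foo(), B.__name__)
```

B is a sibling of A rather than a subclass, so later changes to A's class attributes do not show up on B.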
You may be trying to (1) create copies of classes for some real application: in that case, be aware that copy.deepcopy will not do it for you, since it treats classes as atomic and returns the very same class object; build the new class from a copy of the original's __dict__ instead, and set the copy's __name__ if needed. That works in both Python 2 and Python 3.
(2) learn about and understand Python's internal class organization: in that case, there is no reason to fight with Python 2, as some wrinkles there were fixed in Python 3.
In any case, if you use dir to fetch a class's attributes, you will end up with more than you want, as dir also retrieves the methods and attributes of all superclasses. So, even if your method is made to work (in Python 2 that means getting the .im_func attribute of the retrieved unbound methods, to use them as raw functions when creating a new class), your class would have more methods than the original one.
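Actually, in both Python 2 and Python 3, copying the class __dict__ will suffice. If you do not want mutable class attributes to be shared between the copies, deep-copy the attribute values with deepcopy. In Python 3:
class A(object):
    b = []
    def foo(self):
        print(self.b)
from copy import deepcopy
def copy_class(cls, new_name):
    # vars(cls) is a read-only mappingproxy that deepcopy cannot copy directly,
    # so build a plain dict; skip the per-class __dict__/__weakref__ descriptors,
    # which type() recreates for the new class
    attrs = {name: deepcopy(value) for name, value in vars(cls).items()
             if name not in ('__dict__', '__weakref__')}
    return type(new_name, cls.__bases__, attrs)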
In Python 2, it would work almost the same, but there is no convenient way to get the explicit bases of an existing class (i.e. __bases__ is not set). You can use __mro__ for the same effect. The only thing is that all ancestor classes are passed in a hardcoded order as bases of the new class, and in a complex hierarchy you could have differences between the behaviors of B descendants and A descendants if multiple-inheritance is used.

Dynamically adding __slots__ to an imported class [duplicate]

Suppose I have a class with __slots__
class A:
    __slots__ = ['x']
a = A()
a.x = 1 # works fine
a.y = 1 # AttributeError (as expected)
Now I am going to change __slots__ of A.
A.__slots__.append('y')
print(A.__slots__) # ['x', 'y']
b = A()
b.x = 1 # OK
b.y = 1 # AttributeError (why?)
b was created after __slots__ of A had changed, so Python, in principle, could allocate memory for b.y. Why didn't it?
How to properly modify __slots__ of a class, so that new instances have the modified attributes?
You cannot dynamically alter the __slots__ attribute after creating the class, no. That's because the value is used to create special descriptors for each slot. From the __slots__ documentation:
__slots__ are implemented at the class level by creating descriptors (Implementing Descriptors) for each variable name. As a result, class attributes cannot be used to set default values for instance variables defined by __slots__; otherwise, the class attribute would overwrite the descriptor assignment.
You can see the descriptors in the class __dict__:
>>> class A:
...     __slots__ = ['x']
...
>>> A.__dict__
mappingproxy({'__module__': '__main__', '__doc__': None, 'x': <member 'x' of 'A' objects>, '__slots__': ['x']})
>>> A.__dict__['x']
<member 'x' of 'A' objects>
>>> a = A()
>>> A.__dict__['x'].__get__(a, A)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: x
>>> A.__dict__['x'].__set__(a, 'foobar')
>>> A.__dict__['x'].__get__(a, A)
'foobar'
>>> a.x
'foobar'
You cannot yourself create these additional descriptors. Even if you could, you cannot allocate more memory space for the extra slot references on the instances produced for this class, as that's information stored in the C struct for the class, and not in a manner accessible to Python code.
That's all because __slots__ is only an extension of the low-level handling of the elements that make up Python instances to Python code; the __dict__ and __weakref__ attributes on regular Python instances were always implemented as slots:
>>> class Regular: pass
...
>>> Regular.__dict__['__dict__']
<attribute '__dict__' of 'Regular' objects>
>>> Regular.__dict__['__weakref__']
<attribute '__weakref__' of 'Regular' objects>
>>> r = Regular()
>>> Regular.__dict__['__dict__'].__get__(r, Regular) is r.__dict__
True
All the Python developers did here was extend the system to add a few more of such slots using arbitrary names, with those names taken from the __slots__ attribute on the class being created, so that you can save memory; dictionaries take more memory than simple references to values in slots do. By specifying __slots__ you disable the __dict__ and __weakref__ slots, unless you explicitly include those in the __slots__ sequence.
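The last point is easy to verify. A small sketch (class names are invented for the demo) showing that a slotted class loses __dict__ and weak-reference support, and gets both back when they are listed explicitly:

```python
import weakref


class Plain:
    pass


class Slotted:
    __slots__ = ("x",)


class SlottedWithExtras:
    # Listing '__dict__' and '__weakref__' explicitly brings both back.
    __slots__ = ("x", "__dict__", "__weakref__")


print(hasattr(Plain(), "__dict__"), hasattr(Slotted(), "__dict__"))

try:
    weakref.ref(Slotted())
except TypeError as exc:
    print("no weak references:", exc)

obj = SlottedWithExtras()
obj.anything = 1          # stored in the re-enabled __dict__
ref = weakref.ref(obj)    # allowed again
print(obj.__dict__, ref() is obj)
```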
The only way to extend slots then is to subclass; you can dynamically create a subclass with the type() function or by using a factory function:
def extra_slots_subclass(base, *slots):
    class ExtraSlots(base):
        __slots__ = slots
    ExtraSlots.__name__ = base.__name__
    return ExtraSlots
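A quick usage sketch of such a factory, with the function repeated so the snippet is self-contained (the names A2 and obj are only for the demo):

```python
def extra_slots_subclass(base, *slots):
    class ExtraSlots(base):
        __slots__ = slots
    ExtraSlots.__name__ = base.__name__
    return ExtraSlots


class A:
    __slots__ = ["x"]


A2 = extra_slots_subclass(A, "y")
obj = A2()
obj.x = 1   # slot inherited from A
obj.y = 2   # slot added by the subclass
try:
    obj.z = 3
except AttributeError:
    print("z is still rejected")
print(A2.__name__, issubclass(A2, A), hasattr(obj, "__dict__"))
```

Because both the base and the subclass define __slots__, instances of the subclass still have no per-instance __dict__.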
It appears to me that type turns __slots__ into a tuple as one of its first orders of business. It then stores the tuple on the extended type object. Since, beneath it all, Python is looking at a tuple, there is no way to mutate it. Indeed, I'm not even sure you can access it unless you pass a tuple in to the instance in the first place.
The fact that the original object that you set still remains as an attribute on the type is (perhaps) just a convenience for introspection.
You can't modify __slots__ and expect to have that show up somewhere (and really, from a readability perspective, you probably don't want to do that anyway, right?)...
Of course, you can always subclass to extend the slots:
>>> class C(A):
...     __slots__ = ['z']
...
>>> c = C()
>>> c.x = 1
>>> c.z = 1
You cannot modify the __slots__ attribute after class creation, because doing so would lead to strange behaviour.
Imagine the following.
class A:
    __slots__ = ["x"]
a = A()
A.__slots__.append("y")
a.y = None
What should happen in this scenario? No space was originally allocated for a second slot, but according to the slots attribute, a should be able to have space for y.
__slots__ is not about protecting what names can and cannot be accessed. Rather __slots__ is about reducing the memory footprint of an object. By attempting to modify __slots__ you would defeat the optimisations that __slots__ is meant to achieve.
How __slots__ reduces memory footprint
Normally, an object's attributes are stored in a dict, which requires a fair bit of memory itself. If you are creating millions of objects then the space required by these dicts becomes prohibitive. __slots__ informs the Python machinery that makes the class object that there will only be so many attributes referred to by instances of this class and what the names of the attributes will be. Therefore, the class can make an optimisation by storing the attributes directly on the instance rather than in a dict. It places the memory for the (pointers to the) attributes directly on the object, rather than creating a new dict for the object.
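You can get a rough feel for the difference with sys.getsizeof. This is only an approximate sketch: the exact numbers depend on the Python version and platform, and the class names are invented for the demo.

```python
import sys


class WithDict:
    def __init__(self):
        self.x = 1
        self.y = 2


class WithSlots:
    __slots__ = ("x", "y")

    def __init__(self):
        self.x = 1
        self.y = 2


plain = WithDict()
slotted = WithSlots()

# getsizeof() does not follow references, so add the per-instance dict by hand.
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_size = sys.getsizeof(slotted)
print(plain_size, slotted_size)
```

The slotted instance should come out smaller, and the gap adds up when you create millions of objects.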
Putting the answers to this and related questions together, I want to highlight a solution to this problem:
You can effectively modify __slots__ by creating a subclass with the same name and then replacing the parent class with its child. Note that you can do this for classes declared and used in any module, not just yours!
Consider the following module which declares some classes:
module.py:
class A(object):
    # some class a user should import
    __slots__ = ('x', 'b')
    def __init__(self):
        self.b = B()
class B(object):
    # let's suppose we can't use it directly,
    # it's returned as a part of another class
    __slots__ = ('z',)
Here's how you can add attributes to these classes:
>>> import module
>>> from module import A
>>>
>>> # for classes imported into your module:
>>> A = type('A', (A,), {'__slots__': ('foo',)})
>>> # for classes which will be instantiated by the `module` itself:
>>> module.B = type('B', (module.B,), {'__slots__': ('bar',)})
>>>
>>> a = A()
>>> a.x = 1
>>> a.foo = 2
>>>
>>> b = a.b
>>> b.z = 3
>>> b.bar = 4
>>>
But what if you receive class instances from some third-party module using the module?
module_3rd_party.py:
from module import A
def get_instance():
    return A()
No problem, it will also work! The only difference is that you may need to patch the classes before you import the third-party module (in case it imports classes from the module):
>>> import module
>>>
>>> module.A = type('A', (module.A,), {'__slots__': ('foo',)})
>>> module.B = type('B', (module.B,), {'__slots__': ('bar',)})
>>>
>>> # note that we import `module_3rd_party` AFTER we patch the `module`
>>> from module_3rd_party import get_instance
>>>
>>> a = get_instance()
>>> a.x = 1
>>> a.foo = 2
>>>
>>> b = a.b
>>> b.z = 3
>>> b.bar = 4
>>>
It works because Python imports each module only once and then shares it between all other modules, so the changes you make to a module affect all code running alongside yours.
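The whole trick can be tried in a single file by simulating module.py in memory. In this sketch, demo_module is a hypothetical stand-in registered in sys.modules by hand; with a real module on disk you would simply import it.

```python
import sys
import types

SOURCE = '''
class A(object):
    __slots__ = ('x', 'b')
    def __init__(self):
        self.b = B()

class B(object):
    __slots__ = ('z',)
'''

# Stand-in for a real module.py on disk; "demo_module" is a hypothetical name.
demo_module = types.ModuleType("demo_module")
exec(SOURCE, demo_module.__dict__)
sys.modules["demo_module"] = demo_module

# Replace both classes with same-named subclasses that add a slot.
demo_module.A = type("A", (demo_module.A,), {"__slots__": ("foo",)})
demo_module.B = type("B", (demo_module.B,), {"__slots__": ("bar",)})

import demo_module as patched  # a later import gets the cached, patched module

a = patched.A()
a.x, a.foo = 1, 2
a.b.z, a.b.bar = 3, 4
print(a.foo, a.b.bar, patched is demo_module)
```

A.__init__ looks up B in the module's globals at call time, which is why replacing module-level B is enough for a.b to get the extra slot.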

Check if class object

Is it possible in Python to check if an object is a class object? I.e., if you have
class Foo(object):
    pass
how could you check if o is Foo (or some other class) or an instance of Foo (or any other class instance)? In Java this would be a simple matter: just check if the object is an instance of Class. Is there something similar in Python, or are you supposed to just not care?
Slight clarification: I'm trying to make a function that prints information about the parameter it's given. So if you pass in o, where o = Foo(), it prints out information about Foo. If you pass in Foo, it should print out the exact same information, not information about Type.
Use the isinstance builtin function.
>>> o = Foo()
>>> isinstance(o, Foo)
True
>>> isinstance(13, Foo)
False
This also works for subclasses:
>>> class Bar(Foo): pass
>>> b = Bar()
>>> isinstance(b, Foo)
True
>>> isinstance(b, Bar)
True
Yes, normally, you are supposed to not particularly care what type the object is. Instead, you just call the method you want on o so that people can plug in arbitrary objects that conform to your interface. This wouldn't be possible if you were to aggressively check the types of objects that you're using. This principle is called duck typing, and allows you a bit more freedom in how you choose to write your code.
Python is pragmatic though, so feel free to use isinstance if it makes sense for your particular program.
Edit:
To check if some variable is a class vs an instance, you can do this:
>>> isinstance(Foo, type) # returns true if the variable is a type.
True
>>> isinstance(o, type)
False
My end goal is to make a function that prints out information about an object if it's an instance and prints something different if it's a class. So this time I do care.
First, understand that classes are instances — they're instances of type:
>>> class Foo(object):
...     pass
...
>>> isinstance(Foo, type)
True
So, you can pick out classes that way, but keep in mind that classes are instances too. (And thus, you can pass classes to functions, return them from functions, store them in lists, create them on the fly…)
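Putting that check to work, a function along the lines the question describes might look like the sketch below. The name describe and the output format are invented for illustration; it covers new-style (Python 3) classes only.

```python
def describe(obj):
    """Return the same description whether given a class or one of its instances."""
    cls = obj if isinstance(obj, type) else type(obj)
    kind = "class" if cls is obj else "instance"
    return "{} (passed as {}): bases={}".format(
        cls.__name__, kind, [base.__name__ for base in cls.__bases__])


class Foo(object):
    pass


print(describe(Foo))
print(describe(Foo()))
```

The isinstance(obj, type) test decides which class to report on; everything after that is identical for both cases.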
Use the isinstance() function:
isinstance(o, Foo)
You can also use it to compare o to object:
In [18]: class Foo(object): pass
In [20]: o_instance = Foo()
In [21]: o_class = Foo
In [22]: isinstance(o_instance, Foo)
Out[22]: True
In [23]: isinstance(o_class, Foo)
Out[23]: False
In [24]: isinstance(o_instance, object)
Out[24]: True
In [25]: isinstance(o_class, object)
Out[25]: True
I had to do as Thanatos said and check
isinstance(Foo, type)
But in the case of old-style classes (Python 2) you also have to do
isinstance(Foo, types.ClassType)
