Garbage collect a class with a reference to its instance?

Consider this code snippet:
import gc
from weakref import ref

def leak_class(create_ref):
    class Foo(object):
        # make cycle non-garbage-collectable
        def __del__(self):
            pass
    if create_ref:
        # create a strong reference cycle
        Foo.bar = Foo()
    return ref(Foo)

# without reference cycle
r = leak_class(False)
gc.collect()
print r()  # prints None

# with reference cycle
r = leak_class(True)
gc.collect()
print r()  # prints <class '__main__.Foo'>
It creates a reference cycle that cannot be collected, because the referenced instance has a __del__ method. The cycle is created here:
# create a strong reference cycle
Foo.bar = Foo()
This is just a proof of concept; the reference could be added by some external code, a descriptor or anything. If that's not clear to you, remember that each object maintains a reference to its class:
+-------------+               +--------------------+
|             |    Foo.bar    |                    |
| Foo (class) +-------------->| foo (Foo instance) |
|             |               |                    |
+-------------+               +----------+---------+
      ^                                  |
      |           foo.__class__          |
      +----------------------------------+
If I could guarantee that Foo.bar is only accessed from Foo, the cycle wouldn't be necessary, as theoretically the instance could hold only a weak reference to its class.
Can you think of a practical way to make this work without a leak?
As some are asking why external code would modify a class yet not control its lifecycle, consider this example, similar to the real-life case I was working on:
class Descriptor(object):
    def __get__(self, obj, kls=None):
        if obj is None:
            try:
                obj = kls._my_instance
            except AttributeError:
                obj = kls()
                kls._my_instance = obj
        return obj.something()

# usage example #
class Example(object):
    foo = Descriptor()
    def something(self):
        return 100

print Example.foo
In this code only Descriptor (a non-data descriptor) is part of the API I'm implementing; the Example class shows how the descriptor would be used.
Why does the descriptor store a reference to an instance inside the class itself? Basically for caching purposes. Descriptor requires this contract with the implementor: it can be used in any class, assuming that
The class has a constructor with no args, that gives an "anonymous instance" (my definition)
The class has some behavior-specific methods (something here).
An instance of the class can stay alive for an undefined amount of time.
It doesn't assume anything about:
How long it takes to construct an object
Whether the class implements __del__ or other magic methods
How long the class is expected to live
Moreover the API was designed to avoid any extra load on the class implementor. I could have moved the responsibility for caching the object to the implementor, but I wanted a standard behavior.
There actually is a simple solution to this problem: make the default behavior to cache the instance (like it does in this code) but allow the implementor to override it if they have to implement __del__.
Of course this wouldn't be as simple if we assumed that the class state had to be preserved between calls.
As a starting point, I was coding a "weak object", an implementation of object that only kept a weak reference to its class:
from weakref import proxy

def make_proxy(strong_kls):
    kls = proxy(strong_kls)
    class WeakObject(object):
        def __getattribute__(self, name):
            try:
                attr = kls.__dict__[name]
            except KeyError:
                raise AttributeError(name)
            try:
                return attr.__get__(self, kls)
            except AttributeError:
                return attr
        def __setattr__(self, name, value):
            # TODO: implement...
            pass
    return WeakObject

Foo.bar = make_proxy(Foo)()
It appears to work for a limited number of use cases, but I'd have to reimplement the whole set of object methods, and I don't know how to deal with classes that override __new__.

For your example, why don't you store _my_instance in a dict on the descriptor class, rather than on the class holding the descriptor? You could use a weakref or WeakValueDictionary in that dict, so that when the object disappears the dict will just lose its reference and the descriptor will create a new one on the next access.
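A minimal sketch of that suggestion, assuming the descriptor keeps its own WeakValueDictionary keyed by the owner class (note the trade-off: since the cache only holds a weak reference, the cached instance can be collected as soon as no caller holds a strong one):

from weakref import WeakValueDictionary

class Descriptor(object):
    # cache lives on the descriptor, keyed by the owner class, so the
    # owner class never holds a strong reference to its own instance
    _instances = WeakValueDictionary()

    def __get__(self, obj, kls=None):
        if obj is None:
            try:
                obj = Descriptor._instances[kls]
            except KeyError:
                obj = kls()
                Descriptor._instances[kls] = obj
        return obj.something()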
Edit: I think you have a misunderstanding about the possibility of collecting the class while the instance lives on. Methods in Python are stored on the class, not the instance (barring peculiar tricks). If you have an object obj of class Class, and you allowed Class to be garbage collected while obj still exists, then calling a method obj.meth() on the object would fail, because the method would have disappeared along with the class. That is why your only option is to weaken your class->obj reference; even if you could make objects weakly reference their class, all it would do is break the class if the weakness ever "took effect" (i.e., if the class were collected while an instance still existed).

The problem you're facing is just a special case of the general ref-cycle-with-__del__ problem.
I don't see anything unusual in the way the cycles are created in your case, which is to say, you should resort to the standard ways of avoiding the general problem.
I think implementing and using a weak object would be hard to get right, and you would still need to remember to use it in all places where you define __del__. It doesn't sound like the best approach.
Instead, you should try the following:
consider not defining __del__ in your class (recommended)
in classes which define __del__, avoid reference cycles (in general, it might be hard/impossible to make sure no cycles are created anywhere in your code. In your case, seems like you want the cycles to exist)
explicitly break the cycles, using del (if there are appropriate points to do that in your code)
scan the gc.garbage list, and explicitly break reference cycles (using del)
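For the last option, a minimal sketch under Python 2 semantics (where cycles involving __del__ end up in gc.garbage; the bar attribute is just this question's cycle-creating example):

import gc

gc.collect()                 # uncollectable __del__ cycles land in gc.garbage
for obj in gc.garbage:
    if hasattr(obj, 'bar'):  # the cycle-creating attribute from this example
        del obj.bar          # break the cycle explicitly
del gc.garbage[:]            # gc.garbage itself holds strong references
gc.collect()                 # now the objects can be reclaimed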

Related

Why to put a property variable in the constructor

From other languages I am used to coding a class property, and afterwards I can access it without having it in the constructor, like:
class MyClass:
    def __init__(self):
        self._value = 0

    @property
    def my_property(self):
        print('I got the value: {}'.format(self._value))
In almost every example I worked through, the property variable was set in the constructor (self._value), like this:
class MyClass:
    def __init__(self, value=0):
        self._value = value
To me this makes no sense, since you want to set it in the property. Could anyone explain the use of placing the value variable in the constructor?
Python objects are not struct-based (like C++ or Java), they are dict-based (like Javascript). This means that instance attributes are dynamic (you can add new attributes or delete existing ones at runtime), are not defined at the class level but at the instance level, and are defined quite simply by assigning to them. While they can technically be defined anywhere in the code (even outside the class), the convention (and good practice) is to define them (possibly just to default values) in the initializer (the __init__ method - the real constructor is named __new__, but there are very few reasons to override it) to make clear which attributes an instance of a given class is supposed to have.
Note the use of the term "attribute" here - in python, we don't talk about "member variables" or "member functions" but about "attributes" and "methods". Actually, since Python classes are objects too (instance of the type class or a subclass of), they have attributes too, so we have instance attributes (which are per-instance) and class attributes (which belong to the class object itself, and are shared amongst instances). A class attribute can be looked up on an instance, as long as it's not shadowed by an instance attribute of the same name.
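A small illustration of that lookup rule (the names are arbitrary):

class C(object):
    shared = 0            # class attribute, common to all instances

c = C()
print(c.shared)           # 0 - found on the class
c.shared = 1              # assignment creates an instance attribute...
print(c.shared)           # 1 - ...which shadows the class attribute
print(C.shared)           # 0 - the class attribute is untouched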
Also, since Python functions are objects too (hint: in Python, everything - everything you can put on the RHS of an assignment that is - is an object), there are no distinct namespaces for "data" attributes and "function" attributes, and Python's "methods" are actually functions defined on the class itself - IOW they are class attributes that happen to be instances of the function type. Since methods need to access the instance to be able to work on it, there's a special mechanism that allow to "customize" attribute access so a given object - if it implements the proper interface - can return something else than itself when it's looked up on an instance but resolved on the class. This mechanism is used by functions so they turn themselves into methods (callable objects that wraps the function and instance together so you don't have to pass the instance to the function), but also more generally as the support for computed attributes.
The property class is a generic implementation of computed attributes that wraps a getter function (and optionally a setter and a deleter) - so in Python "property" has a very specific meaning (the property class itself or an instance of it). Also, the @decorator syntax is nothing magical (and isn't specific to properties); it's just syntactic sugar, so given a "decorator" function:
def decorator(func):
    return something

this:

@decorator
def foo():
    # code here

is just a shortcut for:

def foo():
    # code here
foo = decorator(foo)
Here I defined decorator as a function, but any callable object (a "callable" is an instance of a class that defines the __call__ magic method) can be used instead - and Python classes are callables (it's actually by calling a class that you instantiate it).
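For example, a class can itself serve as a decorator, because calling the class returns a (callable) instance - an illustrative sketch:

class CallCounter(object):
    # wraps a function and counts how many times it is called
    def __init__(self, func):
        self.func = func
        self.count = 0

    def __call__(self, *args, **kwargs):
        self.count += 1
        return self.func(*args, **kwargs)

@CallCounter
def greet():
    return 'hello'

greet()
greet()
print(greet.count)  # 2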
So back to your code:
# in py2, you want to inherit from `object` for
# descriptors and other fancy things to work.
# this is useless in py3 but doesn't break anything either...
class MyClass(object):
    # the `__init__` function will become an attribute
    # of the `MyClass` class object
    def __init__(self, value=0):
        # defines the instance attribute named `_value`
        # the leading underscore denotes an "implementation attribute"
        # - something that is not part of the class public interface
        # and should not be accessed externally (IOW a protected attribute)
        self._value = value

    # this first defines the `my_property` function, then
    # passes it to `property()`, and rebinds the `my_property` name
    # to the newly created `property` instance. The `my_property` function
    # will then become the property's getter (its `fget` instance attribute)
    # and will be called when the `my_property` name is resolved on a `MyClass` instance
    @property
    def my_property(self):
        print('I got the value: {}'.format(self._value))
        # let's at least return something
        return self._value
You may then want to inspect both the class and an instance of it:
>>> print(MyClass.__dict__)
{'__module__': 'oop', '__init__': <function MyClass.__init__ at 0x7f477fc4a158>, 'my_property': <property object at 0x7f477fc639a8>, '__dict__': <attribute '__dict__' of 'MyClass' objects>, '__weakref__': <attribute '__weakref__' of 'MyClass' objects>, '__doc__': None}
>>> print(MyClass.my_property)
<property object at 0x7f477fc639a8>
>>> print(MyClass.my_property.fget)
<function MyClass.my_property at 0x7f477fc4a1e0>
>>> m = MyClass(42)
>>> print(m.__dict__)
{'_value': 42}
>>> print(m.my_property)
I got the value: 42
42
>>>
As a conclusion: if you hope to do anything useful with a language, you have to learn that language - you cannot just expect it to work like the other languages you know. While some features are based on common concepts (i.e. functions, classes, etc.), they can actually be implemented in a totally different way (Python's object model has almost nothing in common with Java's), so just trying to write Java (or C, or C++, etc.) in Python will not work (just like trying to write Python in Java, FWIW).
NB: just for the sake of completeness: Python objects can actually be made "struct-based" by using __slots__ - but the aim here is not to prevent dynamically adding attributes (that's only a side effect) but to make instances of those classes "lighter" in size (which is useful when you know you're going to have thousands or more instances of them at a given time).
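A quick sketch of __slots__ (the class name is arbitrary):

class Point(object):
    # instances get fixed storage for exactly these attributes
    # and no per-instance __dict__
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
p.x = 10      # fine - 'x' is in __slots__
# p.z = 3     # AttributeError: 'Point' object has no attribute 'z'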
Because @property is not a decorator for a variable; it is a decorator for a function, and it allows that function to behave like a property. You still need to create a variable to back a function decorated by @property. Quoting the Python docs:
The #property decorator turns the voltage() method into a “getter” for a read-only attribute with the same name, and it sets the docstring for voltage to “Get the current voltage.”
A property object has getter, setter, and deleter methods usable as decorators that create a copy of the property with the corresponding accessor function set to the decorated function. This is best explained with an example:
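The example being referred to is along these lines (adapted from the property documentation):

class C:
    def __init__(self):
        self._x = None

    @property
    def x(self):
        """I'm the 'x' property."""
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @x.deleter
    def x(self):
        del self._x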
I'm guessing you're coming from a language like C++ or Java, where it is common to make attributes private and then write explicit getters and setters for them? In Python there is no such thing as private other than by convention, and there is no need to write getters and setters for a variable that you only need to write and read as-is. @property and the corresponding setter decorators can be used if you want to add additional behaviour (e.g. logging access) or if you want pseudo-properties that you can access just like real ones; e.g. you might have a Circle class that is defined by its radius, but you could define a @property for the diameter so you can still write circle.diameter.
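A minimal sketch of that Circle idea:

class Circle(object):
    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):
        # computed from the radius on every access - no stored attribute
        return self.radius * 2

c = Circle(3)
print(c.diameter)  # 6 - reads like a plain attribute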
More specifically to your question: You want to have the property as an argument of the initializer if you want to set the property at the time when you create the object. You wouldn't want to create an empty object and then immediately fill it with properties as that would create a lot of noise and make the code less readable.
Just an aside: __init__ isn't actually a constructor. The constructor for Python objects is __new__, and you almost never override it.

Circular reference - break on single reference

TL;DR Is there any way to create a weak reference that will call a callback upon having 1 strong reference left instead of 0?
For those who think it's an XY problem, here's the long explanation:
I have quite a challenging issue that I'm trying to solve with my code.
Suppose we have an instance of some class Foo, and a different class Bar which references the instance as it uses it:
class Foo:  # Can be anything
    pass

class Bar:
    """I must hold the instance in order to do stuff"""
    def __init__(self, inst):
        self.inst = inst

foo_to_bar = {}

def get_bar(foo):
    """Creates Bar if one doesn't exist"""
    return foo_to_bar.setdefault(foo, Bar(foo))

# We can either have
bar = get_bar(Foo())
# Bar must hold a strong reference to foo

# Or
foo = Foo()
bar = get_bar(foo)
bar2 = get_bar(foo)  # Same Bar
del bar
del bar2
bar3 = get_bar(foo)  # Same Bar
# In this case, as long as foo exists, we want the same bar to show up,
# therefore, foo must in some way hold a strong reference back to bar
Now here's the tricky part: you could solve this issue with a circular reference, where foo references bar and bar references foo, but hey, where's the fun in that? It will take longer to clean up, will not work if Foo defines __slots__, and is generally a poor solution.
Is there any way I can create a foo_to_bar mapping that cleans up when there is only a single reference left to both foo and bar? In essence:
import weakref
foo_to_bar = weakref.WeakKeyDictionary()
# If bar is referenced only once (as the dict value) and foo is
# referenced only once (from bar.inst) their mapping will be cleared out
This way it can work perfectly, as having foo outside the function makes sure bar is still there (I might require classes defining __slots__ to include __weakref__), and having bar outside the function results in foo still being there (because of the strong reference in Bar).
A WeakKeyDictionary does not work because {weakref.ref(inst): bar.inst} will still cause a circular reference.
Alternatively, is there any way to hook into the reference counting mechanism (in order to clean when both objects get to 1 reference each) without incurring significant overhead?
You are overthinking this. You don't need to track if there is just one reference left. Your mistake is to create a circular reference in the first place.
Store _BarInner objects in your cache; these hold no reference to the Foo instances. Upon access to the mapping, return a lightweight Bar instance that contains both the _BarInner and Foo references:
from weakref import WeakKeyDictionary
from collections.abc import Mapping

class Foo:
    pass

class Bar:
    """I must hold the instance in order to do stuff"""
    def __init__(self, inst, inner):
        self._inst = inst
        self._inner = inner

    # Access to interesting stuff is proxied on to the inner object,
    # with the instance information included *as needed*.
    @property
    def spam(self):
        return self._inner.spam(self._inst)

class _BarInner:
    """The actual data you want to cache"""
    def spam(self, instance):
        # do something with instance, but *do not store any
        # references to that object on self*
        pass

class BarMapping(Mapping):
    def __init__(self):
        self._mapping = WeakKeyDictionary()

    def __getitem__(self, inst):
        inner = self._mapping.get(inst)
        if inner is None:
            inner = self._mapping[inst] = _BarInner()
        return Bar(inst, inner)

    # Mapping is an ABC; __len__ and __iter__ must be provided as well
    def __len__(self):
        return len(self._mapping)

    def __iter__(self):
        return iter(self._mapping)
Translating this to the bdict project linked in the comments, you can simplify things drastically:
Don't worry about lack of support for weak references in projects. Document that your project will only support per-instance data on types that have a __weakref__ attribute. That's enough.
Don't distinguish between slots and no-slots types. Always store per-instance data away from the instances. This lets you simplify your code.
The same goes for the 'strong' and 'autocache' flags. The flyweight should always keep a strong reference. Per-instance data should always be stored.
Use a single class for the descriptor return value. The ClassBoundDict type is all you need. Store the instance and owner data passed to __get__ in that object, and vary behaviour in __setitem__ accordingly.
Look at collections.ChainMap() to encapsulate access to the class and instance mappings for read access.
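A brief sketch of the ChainMap suggestion (the mapping contents are made up for illustration):

from collections import ChainMap

class_data = {'colour': 'red', 'size': 'large'}
instance_data = {'colour': 'blue'}

# lookups try instance_data first, then fall back to class_data
combined = ChainMap(instance_data, class_data)
print(combined['colour'])  # 'blue' - found in the first mapping
print(combined['size'])    # 'large' - falls through to the second

del instance_data['colour']
print(combined['colour'])  # 'red' - the chain sees the change immediately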

Modifying class __dict__ when shadowed by a property

I am attempting to modify a value in a class __dict__ directly using something like X.__dict__['x'] += 1. It is impossible to do the modification like that because a class __dict__ is actually a mappingproxy object that does not allow direct modification of values. The reason for attempting direct modification or equivalent is that I am trying to hide the class attribute behind a property defined on the metaclass with the same name. Here is an example:
class Meta(type):
    def __new__(cls, name, bases, attrs, **kwargs):
        attrs['x'] = 0
        return super().__new__(cls, name, bases, attrs)

    @property
    def x(cls):
        return cls.__dict__['x']

class Class(metaclass=Meta):
    def __init__(self):
        self.id = __class__.x
        __class__.__dict__['x'] += 1
This example shows a scheme for creating an auto-incremented ID for each instance of Class. The line __class__.__dict__['x'] += 1 cannot be replaced by setattr(__class__, 'x', __class__.x + 1) because x is a property with no setter in Meta. It would just change a TypeError from mappingproxy into an AttributeError from property.
I have tried messing with __prepare__, but that has no effect. The implementation in type already returns a mutable dict for the namespace. The immutable mappingproxy seems to get set in type.__new__, which I don't know how to avoid.
I have also attempted to rebind the entire __dict__ reference to a mutable version, but that failed as well: https://ideone.com/w3HqNf, implying that perhaps the mappingproxy is not created in type.__new__.
How can I modify a class dict value directly, even when shadowed by a metaclass property? While it may be effectively impossible, setattr is able to do it somehow, so I would expect that there is a solution.
My main requirement is to have a class attribute that appears to be read only and does not use additional names anywhere. I am not absolutely hung up on the idea of using a metaclass property with an eponymous class dict entry, but that is usually how I hide read only values in regular instances.
EDIT
I finally figured out where the class __dict__ becomes immutable. It is described in the last paragraph of the "Creating the Class Object" section of the Data Model reference:
When a new class is created by type.__new__, the object provided as the namespace parameter is copied to a new ordered mapping and the original object is discarded. The new copy is wrapped in a read-only proxy, which becomes the __dict__ attribute of the class object.
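A small demonstration of that read-only proxy, and of the fact that setattr (i.e. type.__setattr__) is the supported way to reach the underlying dict when no metaclass property is in the way:

class C(object):
    x = 0

print(type(C.__dict__))    # <class 'mappingproxy'>
try:
    C.__dict__['x'] = 1    # direct item assignment is refused
except TypeError as exc:
    print(exc)

setattr(C, 'x', 1)         # type.__setattr__ mutates the underlying dict
print(C.__dict__['x'])     # 1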
Probably the best way: just pick another name. Call the property x and the dict key '_x', so you can access it the normal way.
Alternative way: add another layer of indirection:
class Meta(type):
    def __new__(cls, name, bases, attrs, **kwargs):
        attrs['x'] = [0]
        return super().__new__(cls, name, bases, attrs)

    @property
    def x(cls):
        return cls.__dict__['x'][0]

class Class(metaclass=Meta):
    def __init__(self):
        self.id = __class__.x
        __class__.__dict__['x'][0] += 1
That way you don't have to modify the actual entry in the class dict.
Super-hacky way that might outright segfault your Python: access the underlying dict through the gc module.
import gc

class Meta(type):
    def __new__(cls, name, bases, attrs, **kwargs):
        attrs['x'] = 0
        return super().__new__(cls, name, bases, attrs)

    @property
    def x(cls):
        return cls.__dict__['x']

class Class(metaclass=Meta):
    def __init__(self):
        self.id = __class__.x
        gc.get_referents(__class__.__dict__)[0]['x'] += 1
This bypasses critical work type.__setattr__ does to maintain internal invariants, particularly in things like CPython's type attribute cache. It is a terrible idea, and I'm only mentioning it so I can put this warning here, because if someone else comes up with it, they might not know that messing with the underlying dict is legitimately dangerous.
It is very easy to end up with dangling references doing this, and I have segfaulted Python quite a few times experimenting with this. Here's one simple case that crashed on Ideone:
import gc

class Foo(object):
    x = []

Foo().x
gc.get_referents(Foo.__dict__)[0]['x'] = []
print(Foo().x)
Output:
*** Error in `python3': double free or corruption (fasttop): 0x000055d69f59b110 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x70bcb)[0x2b32d5977bcb]
/lib/x86_64-linux-gnu/libc.so.6(+0x76f96)[0x2b32d597df96]
/lib/x86_64-linux-gnu/libc.so.6(+0x7778e)[0x2b32d597e78e]
python3(+0x2011f5)[0x55d69f02d1f5]
python3(+0x6be7a)[0x55d69ee97e7a]
python3(PyCFunction_Call+0xd1)[0x55d69efec761]
python3(PyObject_Call+0x47)[0x55d69f035647]
... [it continues like that for a while]
And here's a case with wrong results and no noisy error message to alert you to the fact that something has gone wrong:
import gc

class Foo(object):
    x = 'foo'

print(Foo().x)
gc.get_referents(Foo.__dict__)[0]['x'] = 'bar'
print(Foo().x)
Output:
foo
foo
I make absolutely no guarantees as to any safe way to use this, and even if things happen to work out on one Python version, they may not work on future versions. It can be fun to fiddle with, but it's not something to actually use. Seriously, don't do it. Do you want to explain to your boss that your website went down or your published data analysis will need to be retracted because you took this bad idea and used it?
This probably counts as an "additional name" you don't want, but I've implemented this using a dictionary in the metaclass where the keys are the classes. The __next__ method on the metaclass makes the class itself iterable, such that you can just do next() to get the next ID. The dunder method also keeps the method from being available through the instances. The dictionary storing the next id has a name starting with a double underscore, so it's not easily discoverable from any of the classes that use it. The incrementing ID functionality is thus entirely contained in the metaclass.
I tucked the assignment of the id into a __new__ method on a base class, so you don't have to worry about it in __init__. This also allows you to del Meta so all the machinery is a little harder to get to.
class Meta(type):
    __ids = {}

    @property
    def id(cls):
        return __class__.__ids.setdefault(cls, 0)

    def __next__(cls):
        id = __class__.__ids.setdefault(cls, 0)
        __class__.__ids[cls] += 1
        return id

class Base(metaclass=Meta):
    def __new__(cls, *args, **kwargs):
        self = object.__new__(cls)
        self.id = next(cls)
        return self

del Meta

class Class(Base):
    pass

class Brass(Base):
    pass

c0 = Class()
c1 = Class()
b0 = Brass()
b1 = Brass()
assert (b0.id, b1.id, c0.id, c1.id) == (0, 1, 0, 1)
assert (Class.id, Brass.id) == (2, 2)
assert not hasattr(Class, "__ids")
assert not hasattr(Brass, "__ids")
Note that I've used the same name for the attribute on both the class and the object. That way Class.id is the number of instances you've created, while c1.id is the ID of that specific instance.
My main requirement is to have a class attribute that appears to be read only and does not use additional names anywhere. I am not absolutely hung up on the idea of using a metaclass property with an eponymous class dict entry, but that is usually how I hide read only values in regular instances.
What you are asking for is a contradiction: If your example worked, then __class__.__dict__['x'] would be an "additional name" for the attribute. So clearly we need a more specific definition of "additional name." But to come up with that definition, we need to know what you are trying to accomplish (NB: The following goals are not mutually exclusive, so you may want to do all of these things):
You want to make the value completely untouchable, except within the Class.__init__() method (and the same method of any subclasses): This is unPythonic and quite impossible. If __init__() can modify the value, then so can anyone else. You might be able to accomplish something like this if the modifying code lives in Class.__new__(), which the metaclass dynamically creates in Meta.__new__(), but that's extremely ugly and hard to understand.
You want the code that manipulates the value to be "nicely encapsulated": Write a method in the metaclass that increments the private value (or does whatever other modification you need), and provide a read-only metaclass property that accesses it under the public name.
You are concerned about a subclass accidentally clashing names with the private name: Prefix the private name with a double underscore to invoke automatic name mangling. While this is usually seen as a bit unPythonic, it is appropriate for cases where name collisions may be less obvious to subclass authors, such as the internal names of a metaclass colliding with the internal names of a regular class instantiated from it.
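A short illustration of that name mangling (the class names are arbitrary):

class Base(object):
    def __init__(self):
        self.__token = 'base'   # stored as self._Base__token

class Sub(Base):
    def __init__(self):
        super().__init__()
        self.__token = 'sub'    # stored as self._Sub__token - no clash

s = Sub()
print(s._Base__token, s._Sub__token)  # base sub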

Python memory allocation, when using bound, static or class functions?

I am curious about this: what actually happens to the Python objects once you create a class that contains each one of these functions?
Looking at some examples, I see that defining a bound, static or class function in fact creates a class object, which is the one that contains all 3 functions.
Is this always true, no matter which function I call? And is the parent class (object in this case, but it can be anything I think) always called, since the constructor in my class invokes it implicitly?
class myclass(object):
    a = 1
    b = True

    def myfunct(self, b):
        return (self.a + b)

    @staticmethod
    def staticfunct(b):
        print b

    @classmethod
    def classfunct(cls, b):
        cls.a = b
Since it was not clear: what is the lifecycle of this class object when I use it as follows?
from mymodule import myclass
class1 = myclass()
class1.staticfunct(4)
class1.classfunct(3)
class1.myfunct
In the case of the static method, the myclass object gets allocated and then the function is run, but class and bound methods are not generated?
In the case of the class function, is it the same as above?
In the case of the bound function, is everything in the class allocated?
The class statement creates the class. That is an object which has all three functions, but the first (myfunct) is unbound and cannot be called without an instance object of this class.
The instances of this class (in case you create them) will have bound versions of this function and references to the static and the class functions.
So, both the class and the instances have all three functions.
None of these functions create a class object, though. That is done by the class statement. (To be precise: the class object is created when the interpreter completes the class statement, i.e. the class does not yet exist while the functions inside it are being created; mind-boggling, but seldom necessary to know.)
If you do not override the __init__() function, it will be inherited and called for each created instance, yes.
Since it was not clear: what is the lifecycle for this object class,
when I use it as following?
from mymodule import myclass
This will create the class, and code for all functions. They will be classmethod, staticmethod, and method (which you can see by using type() on them)
class1 = myclass()
This will create an instance of the class, which has a dictionary and a lot of other stuff. It doesn't do anything to your methods though.
class1.staticfunct(4)
This calls your staticfunct.
class1.classfunct(3)
This calls your classfunct.
class1.myfunct
This will create a new object that is a bound myfunct method of class1. It is often useful to bind this to a variable if you are going to be calling it over and over. But this bound method has a normal lifetime.
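For instance (a hypothetical tight loop, reusing myclass from above):

inst = myclass()
myfunct = inst.myfunct                       # create the bound method object once
results = [myfunct(i) for i in range(1000)]  # no rebinding on each call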
Here is an example you might find illustrative:
>>> class foo(object):
... def bar(self):
... pass
...
>>> x = foo()
>>> x.bar is x.bar
False
Every time you access x.bar, it creates a new bound method object.
And another example showing class methods:
>>> class foo(object):
...     @classmethod
...     def bar(cls):
...         pass
...
>>> foo.bar
<bound method type.bar of <class '__main__.foo'>>
Your class myclass actually has four methods that are important: the three you explicitly coded and the constructor, __init__ which is inherited from object. Only the constructor creates a new instance. So in your code one instance is created, which you have named class1 (a poor choice of name).
myfunct creates a new integer by adding class1.a to its argument b (4, say). The lifecycle of class1 is not affected, nor are the variables class1.a, class1.b, myclass.a or myclass.b.
staticfunct just prints something, and the attributes of myclass and class1 are irrelevant.
classfunct modifies the variable myclass.a. It has no effect on the lifecycle or state of class1.
The variable myclass.b is never used or accessed at all; the variables named b in the individual functions refer to the values passed in the function's arguments.
Additional info added based on the OP's comments:
Everything in Python is an object - including the basic data types (ints, strings, floats, etc.). That includes the class itself (a class object), every method (a method object) and every instance you create. Once created, each object remains alive until every reference to it disappears; then it is garbage-collected.
So in your example, when the interpreter reaches the end of the class statement body an object named "myclass" exists, and additional objects exist for each of its members (myclass.a, myclass.b, myclass.myfunct, myclass.staticfunct etc.) There is also some overhead for each object; most objects have a member named __dict__ and a few others. When you instantiate an instance of myclass, named "class1", another new object is created. But there are no new method objects created, and no instance variables since you don't have any of those. class1.a is a pseudonym for myclass.a and similarly for the methods.
If you want to get rid of an object, i.e., have it garbage-collected, you need to eliminate all references to it. In the case of global variables you can use the "del" statement for this purpose:
A = myclass()
del A
Will create a new instance and immediately delete it, releasing its resources for garbage collection. Of course you then cannot subsequently use the object, for example print(A) will now give you an exception.

Why doesn't the weakref work on this bound method?

I have a project where I'm trying to use weakrefs with callbacks, and I don't understand what I'm doing wrong. I have created a simplified test that shows the exact behavior I'm confused by.
Why is it that in this test test_a works as expected, but the weakref for self.MyCallbackB disappears between the class initialization and the call to test_b? I thought that as long as the instance (a) exists, the reference to self.MyCallbackB should exist, but it doesn't.
import weakref

class A(object):
    def __init__(self):
        def MyCallbackA():
            print 'MyCallbackA'
        self.MyCallbackA = MyCallbackA
        self._testA = weakref.proxy(self.MyCallbackA)
        self._testB = weakref.proxy(self.MyCallbackB)

    def MyCallbackB(self):
        print 'MyCallbackB'

    def test_a(self):
        self._testA()

    def test_b(self):
        self._testB()

if __name__ == '__main__':
    a = A()
    a.test_a()
    a.test_b()
You want a WeakMethod.
An explanation why your solution doesn't work can be found in the discussion of the recipe:
Normal weakref.refs to bound methods don't quite work the way one expects, because bound methods are first-class objects; weakrefs to bound methods are dead-on-arrival unless some other strong reference to the same bound method exists.
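For what it's worth, since Python 3.4 the standard library ships this pattern as weakref.WeakMethod - a minimal sketch:

import weakref

class A:
    def callback(self):
        print('called')

a = A()
wm = weakref.WeakMethod(a.callback)  # survives the transient bound method
method = wm()       # returns a fresh bound method while a is alive
if method is not None:
    method()        # prints 'called'
del a
print(wm())         # None - the instance is gone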
According to the documentation for the weakref module:

In the following, the term referent means the object which is referred to by a weak reference.

A weak reference to an object is not enough to keep the object alive: when the only remaining references to a referent are weak references, garbage collection is free to destroy the referent and reuse its memory for something else.
What's happening with MyCallbackA is that you are holding a reference to it in the instances of A, thanks to:
self.MyCallbackA = MyCallbackA
Now, there is no reference to the bound method MyCallbackB in your code. It is held only in a.__class__.__dict__ as an unbound method. Basically, a bound method is created (and returned to you) every time you do self.methodName. (AFAIK, a bound method works like a property - using a read-only descriptor - at least for new-style classes; I am sure something similar, i.e. without descriptors, happens for old-style classes. I'll leave it to someone more experienced to verify that claim about old-style classes.) So, self.MyCallbackB dies as soon as the weakref is created, because there is no strong reference to it!
My conclusions are based on:
import weakref

# trace is called when the referent is deleted - see the weakref docs
def trace(x):
    print "Del MycallbackB"

class A(object):
    def __init__(self):
        def MyCallbackA():
            print 'MyCallbackA'
        self.MyCallbackA = MyCallbackA
        self._testA = weakref.proxy(self.MyCallbackA)
        print "Create MyCallbackB"
        # To fix it, do -
        # self.MyCallbackB = self.MyCallbackB
        # The name on the LHS could be anything, even foo!
        self._testB = weakref.proxy(self.MyCallbackB, trace)
        print "Done playing with MyCallbackB"

    def MyCallbackB(self):
        print 'MyCallbackB'

    def test_a(self):
        self._testA()

    def test_b(self):
        self._testB()

if __name__ == '__main__':
    a = A()
    #print a.__class__.__dict__["MyCallbackB"]
    a.test_a()
Output
Create MyCallbackB
Del MycallbackB
Done playing with MyCallbackB
MyCallbackA
Note: I tried verifying this for old-style classes. It turned out that print a.test_a.__get__ outputs
<method-wrapper '__get__' of instancemethod object at 0xb7d7ffcc>
for both new- and old-style classes. So it may not really be a descriptor, just something descriptor-like. In any case, the point is that a bound-method object is created when you access an instance method through self, and unless you maintain a strong reference to it, it will be deleted.
The other answers address the why in the original question, but either don't provide a workaround or refer to external sites.
After working through several other posts on StackExchange on this topic, many of which are marked as duplicates of this question, I finally came to a succinct workaround. When I know the nature of the object I'm dealing with, I use the weakref module; when I might instead be dealing with a bound method (as occurs in my code when using event callbacks), I now use the following WeakRef class as a direct replacement for weakref.ref(). I've tested this with Python 2.4 through and including Python 2.7, but not on Python 3.x.
import weakref

class WeakRef:
    def __init__(self, item):
        try:
            # bound method: hold weak references to the underlying function
            # and instance separately (im_func/im_self are the Python 2 names)
            self.method = weakref.ref(item.im_func)
            self.instance = weakref.ref(item.im_self)
        except AttributeError:
            # not a bound method: an ordinary weakref will do
            self.reference = weakref.ref(item)
        else:
            self.reference = None

    def __call__(self):
        if self.reference is not None:
            return self.reference()
        instance = self.instance()
        if instance is None:
            return None
        method = self.method()
        return getattr(instance, method.__name__)
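A usage sketch under Python 2 semantics (the class and method names are made up):

class Target:
    def ping(self):
        return 'pong'

t = Target()
r = WeakRef(t.ping)  # does not keep the transient bound method alive
print r()()          # 'pong' - re-binds via getattr while t exists
del t
print r()            # None - the instance is gone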
