Python: deletion of a self-referencing object

I want to ask how to delete an object with a self-reference in Python.
Consider a simple example class that reports when it is created and when it is deleted:
#!/usr/bin/python
class TTest:
    def __init__(self):
        self.sub_func = None
        print 'Created', self
    def __del__(self):
        self.sub_func = None
        print 'Deleted', self
    def Print(self):
        print 'Print', self
This class has an attribute self.sub_func to which we intend to assign a function. I want to assign self.sub_func a function that uses an instance of TTest. See the following case:
def SubFunc1(t):
    t.Print()

def DefineObj1():
    t = TTest()
    t.sub_func = lambda: SubFunc1(t)
    return t

t = DefineObj1()
t.sub_func()
del t
The result is:
Created <__main__.TTest instance at 0x7ffbabceee60>
Print <__main__.TTest instance at 0x7ffbabceee60>
that is to say, even though we executed "del t", t was not deleted.
I guess the reason is that t.sub_func is a self-referencing object, so the reference count of t does not drop to zero at "del t"; thus t is not reclaimed by the garbage collector.
To solve this problem, I need to insert
t.sub_func = None
before "del t"; in that case, the output is:
Created <__main__.TTest instance at 0x7fab9ece2e60>
Print <__main__.TTest instance at 0x7fab9ece2e60>
Deleted <__main__.TTest instance at 0x7fab9ece2e60>
But this is strange. t.sub_func is part of t, so I do not want to have to worry about clearing t.sub_func when deleting t.
Could you tell me if you know a good solution?

How do you make sure an object in a reference cycle gets deleted when it is no longer reachable? The simplest solution is not to define a __del__ method. Very few classes, if any, need a __del__ method, and Python makes no guarantees about when, or even whether, a __del__ method will be called.
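As a quick check (using only the standard library's gc and weakref modules), an object that sits in a reference cycle but has no __del__ method is collected normally:

```python
import gc
import weakref

class TTest(object):
    # no __del__ defined, so the cycle collector is free to reclaim instances
    pass

def make_cycle():
    t = TTest()
    t.sub_func = lambda: t  # the closure keeps t alive: a reference cycle
    return weakref.ref(t)

r = make_cycle()
gc.collect()        # the collector breaks the cycle because no __del__ is involved
print(r() is None)  # True: the object was reclaimed
```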
There are several ways you can alleviate this problem.
Use a function rather than a lambda that contains and checks a weak reference. This requires explicitly checking that the object is still alive each time the function is called.
Create a unique class for each object so that we can store the function on the class rather than as a monkey-patched attribute. This could get memory-heavy.
Define a property that knows how to fetch the given function and turn it into a method. This is my personal favourite, as it closely approximates how bound methods are created from a class's unbound methods.
Using weak references
import weakref

class TTest:
    def __init__(self):
        self.func = None
        print 'Created', self
    def __del__(self):
        print 'Deleted', self
    def print_self(self):
        print 'Print', self

def print_func(t):
    t.print_self()

def create_ttest():
    t = TTest()
    weak_t = weakref.ref(t)
    def func():
        t1 = weak_t()
        if t1 is None:
            raise TypeError("TTest object no longer exists")
        print_func(t1)
    t.func = func
    return t

if __name__ == "__main__":
    t = create_ttest()
    t.func()
    del t
Creating a unique class
class TTest:
    def __init__(self):
        print 'Created', self
    def __del__(self):
        print 'Deleted', self
    def print_self(self):
        print 'Print', self

def print_func(t):
    t.print_self()

def create_ttest():
    class SubTTest(TTest):
        def func(self):
            print_func(self)
    SubTTest.func1 = print_func
    # The above also works: the first argument will be the object
    # the function was called on.
    return SubTTest()

if __name__ == "__main__":
    t = create_ttest()
    t.func()
    t.func1()
    del t
Using properties
import types

class TTest:
    def __init__(self, func):
        self._func = func
        print 'Created', self
    def __del__(self):
        print 'Deleted', self
    def print_self(self):
        print 'Print', self
    @property
    def func(self):
        return types.MethodType(self._func, self)

def print_func(t):
    t.print_self()

def create_ttest():
    def func(self):
        print_func(self)
    t = TTest(func)
    return t

if __name__ == "__main__":
    t = create_ttest()
    t.func()
    del t

From the official CPython docs:
Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. Python doesn’t collect such cycles automatically because, in general, it isn’t possible for Python to guess a safe order in which to run the __del__() methods. If you know a safe order, you can force the issue by examining the garbage list, and explicitly breaking cycles due to your objects within the list. Note that these objects are kept alive even so by virtue of being in the garbage list, so they should be removed from garbage too. For example, after breaking cycles, do del gc.garbage[:] to empty the list. It’s generally better to avoid the issue by not creating cycles containing objects with __del__() methods, and garbage can be examined in that case to verify that no such cycles are being created.
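Note that the quoted behavior describes Python 2. Since Python 3.4 (PEP 442), cycles containing objects with __del__ methods are collected anyway, and gc.garbage normally stays empty. A quick check under Python 3:

```python
import gc

class WithDel(object):
    def __del__(self):
        pass

a = WithDel()
a.self_ref = a  # reference cycle through the instance
del a

gc.collect()       # Python 3.4+ reclaims the cycle despite __del__
print(gc.garbage)  # [] - nothing is left uncollectable
```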
See also: http://engineering.hearsaysocial.com/2013/06/16/circular-references-in-python/

Related

Is it a good practice to keep reference in a class variable to the current instance of it?

I have a class that will only ever have one instance at a time. I'm just starting OOP in Python and I was wondering which is the better approach: to assign an instance of this class to a variable and operate on that variable, or to have the instance referenced in a class variable instead. Here is an example of what I mean:
Referenced instance:
class Transaction(object):
    current_transaction = None
    in_progress = False
    def __init__(self):
        self.__class__.current_transaction = self
        self.__class__.in_progress = True
        self.name = 'abc'
        self.value = 50
    def update(self):
        do_smth()

Transaction()
if Transaction.in_progress:
    Transaction.current_transaction.update()
    print Transaction.current_transaction.name
    print Transaction.current_transaction.value
Instance in a variable:
class Transaction(object):
    def __init__(self):
        self.name = 'abc'
        self.value = 50
    def update(self):
        do_smth()

current_transaction = Transaction()
in_progress = True

if in_progress:
    current_transaction.update()
    print current_transaction.name
    print current_transaction.value
It's possible to see that you've encapsulated too much in the first case just by comparing the overall readability of the code: the second is much cleaner.
A better way to implement the first option is to use class methods: decorate all your methods with @classmethod and then call them with Transaction.method().
There's no practical difference in code quality between these two options. However, assuming that the class is final, that is, without derived classes, I would go for a third choice: use the module as a singleton and kill the class. This would be the most compact and most readable choice. You don't need classes to create singletons.
I think the first version doesn't make much sense, and the second version of your code would be better in almost all situations. It can sometimes be useful to write a Singleton class (where only one instance ever exists) by overriding __new__ to always return the saved instance (after it's been created the first time). But usually you don't need that unless you're wrapping some external resource that really only ever makes sense to exist once.
If your other code needs to share a single instance, there are other ways to do so (e.g. a global variable in some module or a constructor argument for each other object that needs a reference).
Note that if your instances have a very well defined life cycle, with specific events that should happen when they're created and destroyed, and unknown code running and using the object in between, the context manager protocol may be something you should look at, as it lets you use your instances in with statements:
with Transaction() as trans:
    trans.whatever()  # the Transaction will be notified if anything raises
    other_stuff()     # an exception that is not caught within the with block
    trans.foo()       # (so it can do a rollback if it wants to)
foo()  # the Transaction will be cleaned up (e.g. committed) when the indented with block ends
Implementing the context manager protocol requires an __enter__ and __exit__ method.
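A minimal sketch of that protocol (this Transaction class and its commit/rollback behavior are hypothetical, just to show the shape):

```python
class Transaction(object):
    def __init__(self):
        self.events = []  # records what happened, for illustration

    def __enter__(self):
        self.events.append('begin')
        return self  # bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None when the with block finished without an exception
        self.events.append('commit' if exc_type is None else 'rollback')
        return False  # do not swallow exceptions

t = Transaction()
with t:
    pass
print(t.events)  # ['begin', 'commit']

t2 = Transaction()
try:
    with t2:
        raise ValueError('boom')
except ValueError:
    pass
print(t2.events)  # ['begin', 'rollback']
```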

How to replace/bypass a class property?

I would like to have a class with an attribute attr that, when accessed for the first time, runs a function and returns a value, and then becomes this value (its type changes, etc.).
A similar behavior can be obtained with:
class MyClass(object):
    @property
    def attr(self):
        try:
            return self._cached_result
        except AttributeError:
            result = ...
            self._cached_result = result
            return result

obj = MyClass()
print obj.attr # First calculation
print obj.attr # Cached result is used
However, .attr does not become the initial result, when doing this. It would be more efficient if it did.
A difficulty is that after obj.attr is defined as a property, it cannot easily be replaced, because infinite loops appear naturally. In the code above, the obj.attr property has no setter, so it cannot be directly assigned. If a setter is defined, then replacing obj.attr inside that setter creates an infinite loop (the setter is invoked from within the setter). I also thought of deleting the property first so as to be able to do a regular self.attr = …, with del self.attr, but this calls the property's deleter (if any), which recreates the infinite-loop problem (modifications of self.attr generally go through the property machinery).
So, is there a way to bypass the property mechanism and replace the bound property obj.attr by anything, from within MyClass.attr.__getter__?
This looks a bit like premature optimization : you want to skip a method call by making a descriptor change itself.
It's perfectly possible, but it would have to be justified.
To modify the descriptor from your property, you'd have to be editing your class, which is probably not what you want.
I think a better way to implement this would be to :
do not define obj.attr
override __getattr__, if argument is "attr", obj.attr = new_value, otherwise raise AttributeError
As soon as obj.attr is set, __getattr__ will not be called any more, as it is only called when the attribute does not exist. (__getattribute__ is the one that would get called all the time.)
The main difference from your initial proposal is that the first attribute access is slower, because of the method-call overhead of __getattr__, but after that it is as fast as a regular __dict__ lookup.
Example:
class MyClass(object):
    def __getattr__(self, name):
        if name == 'attr':
            self.attr = ...
            return self.attr
        raise AttributeError(name)

obj = MyClass()
print obj.attr # First calculation
print obj.attr # Cached result is used
EDIT: Please see the other answer, especially if you use Python 3.6 or later.
For new-style classes, which use the descriptor protocol, you could do this by creating a custom descriptor class whose __get__() method is called at most once per instance. When that happens, the result is cached by creating an instance attribute with the same name as the decorated method.
Here's what I mean.
from __future__ import print_function

class cached_property(object):
    """Descriptor class that makes a method lazily evaluated and caches the result."""
    def __init__(self, func):
        self.func = func

    def __get__(self, inst, cls):
        if inst is None:
            return self
        value = self.func(inst)
        setattr(inst, self.func.__name__, value)
        return value

class MyClass(object):
    @cached_property
    def attr(self):
        print('doing long calculation...', end='')
        result = 42
        return result

obj = MyClass()
print(obj.attr) # -> doing long calculation...42
print(obj.attr) # -> 42
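For reference, Python 3.8+ ships this exact pattern in the standard library as functools.cached_property:

```python
import functools

class MyClass(object):
    @functools.cached_property
    def attr(self):
        # runs once; the result is then stored in the instance __dict__,
        # so later accesses bypass the descriptor entirely
        return 42

obj = MyClass()
print(obj.attr)                # 42 (computed on first access)
print('attr' in obj.__dict__)  # True: cached on the instance
```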

Set object attributes to every variable used in a method in python

How can I set (almost) all local variables in an object's method to be attributes of that object?
class Obj(object):
    def do_something(self):
        localstr = 'hello world'
        localnum = 1
        #TODO store vars in the object for easier inspection

x = Obj()
x.do_something()
print x.localstr, x.localnum
Inspired by Python update object from dictionary, I came up with the following:
class Obj(object):
    def do_something(self):
        localstr = 'hello world'
        localnum = 1
        # store vars in the object for easier inspection
        l = locals().copy()
        del l['self']
        for key, value in l.iteritems():
            setattr(self, key, value)

x = Obj()
x.do_something()
print x.localstr, x.localnum
There is already a Python debugger that lets you inspect local variables, so there is no point in polluting objects with random instance attributes.
Your approach also does not work if more than one method uses the same local variable names, since one method could overwrite some of the instance attributes, leaving the object in an ambiguous state.
Your solution also goes against the DRY principle, since you must add the code before every return.
Another disadvantage is that you often want to know the state of the local variables at more than one point during method execution, and that is not possible with your answer.
If you really want to save the local variables manually, then something like this is probably much better than your solution:
import inspect
from collections import defaultdict

class LogLocals(object):
    NO_BREAK_POINT = object()

    def __init__(self):
        self.__locals = defaultdict(lambda: defaultdict(list))

    def register_locals(self, local_vars, method_name=None,
                        break_point=NO_BREAK_POINT):
        if method_name is None:
            method_name = inspect.currentframe(1).f_code.co_name
        self.__locals[method_name][break_point].append(local_vars)

    def reset_locals(self, method_name=None, break_point=NO_BREAK_POINT,
                     all_=False):
        if method_name is None:
            method_name = inspect.currentframe(1).f_code.co_name
        if all_:
            del self.__locals[method_name]
        else:
            del self.__locals[method_name][break_point]

    def get_locals(self, method_name, break_point=NO_BREAK_POINT):
        return self.__locals[method_name][break_point]
You simply have to inherit from it and call register_locals(locals()) when you want to save the state. It also allows distinguishing between "break points" and, most importantly, it does not pollute the instances.
It also distinguishes between different calls, returning a list of states instead of only the last state.
If you want to access the locals of some call via attributes you can simply do something like:
class SimpleNamespace(object): # python 3.3 already provides this
    def __init__(self, attrs):
        self.__dict__.update(attrs)

the_locals = x.get_locals('method_1')[-1] # take only the last call's locals
x = SimpleNamespace(the_locals)
x.some_local_variable
Anyway, I believe there is not much use for this. You ought to use the Python debugger.

why are my weakrefs dead in the water when they point to a method? [duplicate]

Possible Duplicate:
Why doesn't the weakref work on this bound method?
I'm using weakrefs in an observer pattern and noticed an interesting phenomenon. If I create an object and add one of its methods as an observer of an Observable, the reference is dead almost instantly. Can anyone explain what is happening?
I'm also interested in thoughts for why this might be a bad idea. I've decided not to use the weakrefs and just make sure to clean up after myself properly with Observable.removeobserver, but my curiosity is killing me here.
Here's the code:
from weakref import ref

class Observable:
    __observers = None

    def addobserver(self, observer):
        if not self.__observers:
            self.__observers = []
        self.__observers.append(ref(observer))
        print 'ADDING observer', ref(observer)

    def removeobserver(self, observer):
        self.__observers.remove(ref(observer))

    def notify(self, event):
        for o in self.__observers:
            if o() is None:
                print 'observer was deleted (removing)', o
                self.__observers.remove(o)
            else:
                o()(event)

class C(Observable):
    def set(self, val):
        self.notify(val)

class bar(object):
    def __init__(self):
        self.c = C()
        self.c.addobserver(self.foo)
        print self.c._Observable__observers

    def foo(self, x):
        print 'callback', x #never reached

b = bar()
b.c.set(3)
and here's the output:
ADDING observer <weakref at 0xaf1570; to 'instancemethod' at 0xa106c0 (foo)>
[<weakref at 0xaf1570; dead>]
observer was deleted (removing) <weakref at 0xaf1570; dead>
the main thing to note is that the print statement after the call to addobserver shows that the weakref is already dead.
Whenever you reference an object's method, a bit of magic happens, and it's that magic that's getting in your way.
Specifically, Python looks up the method on the object's class, then combines it with the object itself to create a kind of callable called a bound method. Every time e.g. the expression self.foo is evaluated, a new bound method instance is created. If you immediately take a weakref to that, then there are no other references to the bound method (even though both the object and the class's method still have live refs) and the weakref dies.
See this snippet on ActiveState for a workaround.
Each time you access a method of an instance, obj.m, a wrapper (called a "bound method") is generated that is callable and adds self (obj) as the first argument when called. This is a neat solution for passing self "implicitly" and allows passing instance methods around in the first place. But it also means that each time you write obj.m, a new (very lightweight) object is created, and unless you keep a (non-weak) reference to it around, it will be garbage-collected, because nobody else keeps it alive for you.
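A quick illustration of that (the immediate collection relies on CPython's reference counting):

```python
import weakref

class Obj(object):
    def m(self):
        return 'called'

obj = Obj()
print(obj.m is obj.m)  # False: each access builds a fresh bound-method object
print(obj.m == obj.m)  # True: the wrappers still compare equal

# a weakref to the fresh wrapper dies at once - nothing else references it
r = weakref.ref(obj.m)
print(r() is None)     # True in CPython: the wrapper was already reclaimed
```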

Why doesn't the weakref work on this bound method?

I have a project where I'm trying to use weakrefs with callbacks, and I don't understand what I'm doing wrong. I have created a simplified test that shows the exact behavior I'm confused by.
Why is it that in this test test_a works as expected, but the weakref for self.MyCallbackB disappears between class initialization and the call to test_b? I thought that as long as the instance (a) exists, the reference to self.MyCallbackB should exist, but it doesn't.
import weakref

class A(object):
    def __init__(self):
        def MyCallbackA():
            print 'MyCallbackA'
        self.MyCallbackA = MyCallbackA
        self._testA = weakref.proxy(self.MyCallbackA)
        self._testB = weakref.proxy(self.MyCallbackB)

    def MyCallbackB(self):
        print 'MyCallbackB'

    def test_a(self):
        self._testA()

    def test_b(self):
        self._testB()

if __name__ == '__main__':
    a = A()
    a.test_a()
    a.test_b()
You want a WeakMethod.
An explanation why your solution doesn't work can be found in the discussion of the recipe:
Normal weakref.refs to bound methods don't quite work the way one expects, because bound methods are first-class objects; weakrefs to bound methods are dead-on-arrival unless some other strong reference to the same bound method exists.
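Since Python 3.4, this recipe ships in the standard library as weakref.WeakMethod, which keeps weak references to the instance and the function separately and rebuilds the bound method on demand:

```python
import weakref

class A(object):
    def callback(self):
        return 'called'

a = A()
wm = weakref.WeakMethod(a.callback)

m = wm()             # re-creates the bound method while a is alive
print(m())           # called

del a, m             # drop the instance (and the re-created bound method)
print(wm() is None)  # True: the weak method is now dead
```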
According to the documentation for the Weakref module:
In the following, the term referent means the object which is referred to by a weak reference.
A weak reference to an object is not enough to keep the object alive: when the only remaining references to a referent are weak references, garbage collection is free to destroy the referent and reuse its memory for something else.
What's happening with MyCallbackA is that you are holding a reference to it in the instances of A, thanks to:
self.MyCallbackA = MyCallbackA
Now, there is no reference to the bound method MyCallbackB in your code. It is held only in a.__class__.__dict__ as an unbound method. Basically, a bound method is created (and returned to you) each time you do self.methodName. (AFAIK, a bound method works like a property, using a read-only descriptor, at least for new-style classes. I am sure something similar, i.e. without descriptors, happens for old-style classes; I'll leave it to someone more experienced to verify that claim.) So the weakref to self.MyCallbackB dies as soon as it is created, because there is no strong reference to the bound method!
My conclusions are based on:
import weakref

# trace is called when the object is deleted - see the weakref docs.
def trace(x):
    print "Del MycallbackB"

class A(object):
    def __init__(self):
        def MyCallbackA():
            print 'MyCallbackA'
        self.MyCallbackA = MyCallbackA
        self._testA = weakref.proxy(self.MyCallbackA)
        print "Create MyCallbackB"
        # To fix it, do -
        # self.MyCallbackB = self.MyCallBackB
        # The name on the LHS could be anything, even foo!
        self._testB = weakref.proxy(self.MyCallbackB, trace)
        print "Done playing with MyCallbackB"

    def MyCallbackB(self):
        print 'MyCallbackB'

    def test_a(self):
        self._testA()

    def test_b(self):
        self._testB()

if __name__ == '__main__':
    a = A()
    #print a.__class__.__dict__["MyCallbackB"]
    a.test_a()
Output
Create MyCallbackB
Del MycallbackB
Done playing with MyCallbackB
MyCallbackA
Note:
I tried verifying this for old-style classes. It turned out that print a.test_a.__get__ outputs
<method-wrapper '__get__' of instancemethod object at 0xb7d7ffcc>
for both new- and old-style classes, so it may not really be a descriptor, just something descriptor-like. In any case, the point is that a bound-method object is created when you access an instance method through self, and unless you maintain a strong reference to it, it will be deleted.
The other answers address the why in the original question, but either don't provide a workaround or refer to external sites.
After working through several other posts on StackExchange on this topic, many of which are marked as duplicates of this question, I finally arrived at a succinct workaround. When I know the nature of the object I'm dealing with, I use the weakref module; when I might instead be dealing with a bound method (as occurs in my code when using event callbacks), I now use the following WeakRef class as a direct replacement for weakref.ref(). I've tested this with Python 2.4 through Python 2.7 inclusive, but not on Python 3.x.
import weakref

class WeakRef:
    def __init__(self, item):
        try:
            # bound method: keep weakrefs to the function and instance separately
            self.method = weakref.ref(item.im_func)
            self.instance = weakref.ref(item.im_self)
        except AttributeError:
            # not a bound method: fall back to a plain weakref
            self.reference = weakref.ref(item)
        else:
            self.reference = None

    def __call__(self):
        if self.reference is not None:
            return self.reference()
        instance = self.instance()
        if instance is None:
            return None
        method = self.method()
        return getattr(instance, method.__name__)
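Under Python 3, bound methods expose __func__ and __self__ instead of im_func and im_self, so the same workaround can be sketched like this (class and method names here are just for illustration):

```python
import weakref

class WeakRef3(object):
    """Weak reference that also handles bound methods (Python 3 spelling)."""
    def __init__(self, item):
        try:
            self.method = weakref.ref(item.__func__)
            self.instance = weakref.ref(item.__self__)
        except AttributeError:
            self.reference = weakref.ref(item)  # plain object, not a bound method
        else:
            self.reference = None

    def __call__(self):
        if self.reference is not None:
            return self.reference()
        instance = self.instance()
        if instance is None:
            return None
        method = self.method()
        # rebuild the bound method on demand, as long as the instance lives
        return getattr(instance, method.__name__)

class Target(object):
    def hello(self):
        return 'hello'

t = Target()
r = WeakRef3(t.hello)
print(r()())        # hello - the bound method is rebuilt on demand
del t
print(r() is None)  # True once the instance is gone
```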
