I have a class that opens a file for writing. In my destructor, I call the function that closes the file:
class MyClass:
    def __del__(self):
        self.close()

    def close(self):
        if self.__fileHandle__ is not None:
            self.__fileHandle__.close()
but when I delete the object with code like:
myobj = MyClass()
myobj.open()
del myobj
if I try to reinstantiate the object, I get a value error:
ValueError: The file 'filename' is already opened. Please close it before reopening in write mode.
whereas if I call myobj.close() before del myobj I don't get this problem. So why isn't __del__() getting called?
Are you sure you want to use __del__? There are issues with __del__ and garbage collection.
You could make MyClass a context manager instead:
class MyClass(object):
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self.__fileHandle__ is not None:
            self.__fileHandle__.close()
By doing so, you could use MyClass like this:
with MyClass() as myobj:
    ...
and myobj.__exit__ (and thus self.__fileHandle__.close()) will be called when Python leaves the with-block.
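For completeness, here is a minimal sketch of how the pieces could fit together. It assumes your class has the open() method from your question and that it stores the handle in self.__fileHandle__:

class MyClass(object):
    def __init__(self):
        self.__fileHandle__ = None

    def open(self):
        # hypothetical: open whatever file your real class manages
        self.__fileHandle__ = open('filename', 'w')

    def __enter__(self):
        self.open()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self.__fileHandle__ is not None:
            self.__fileHandle__.close()
            self.__fileHandle__ = None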
That's not what del does. It's unfortunate that __del__ has the same name as del, because they are not related to each other. In modern terminology, the __del__ method would be called a finalizer, not a destructor and the difference is important.
The short difference is that it's easy to guarantee when a destructor is called, but you have very few guarantees about when __del__ will be called and it might never be called. There are many different circumstances that can cause this.
If you want lexical scoping, use a with statement. Otherwise, call myobj.close() directly. The del statement only deletes references, not objects.
I found another answer (link) to a different question that answers this in more detail. It is unfortunate that the accepted answer to that question contains egregious errors.
Edit: As commenters noted, you need to inherit from object. That is fine, but it is still possible that __del__ will never be called (you could just be getting lucky). See the linked answer above.
Your code should inherit from object - not doing so has been considered out of date (except in special cases) for at least six years.
You can read about __del__ here: http://docs.python.org/reference/datamodel.html#object.del
The short version of why you need to inherit from object is that __del__ is only "magic" on new-style classes.
If you need to rely on calling of a finalizer, I strongly suggest that you use the context manager approach recommended in other answers, because that is a portable, robust solution.
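If rewriting the class as a context manager is not an option, contextlib.closing gives you the same guarantee for any object with a close() method. A sketch, reusing MyClass from the question:

from contextlib import closing

with closing(MyClass()) as myobj:
    myobj.open()
    # ... work with myobj ...
# myobj.close() is called here, even if an exception was raised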
Perhaps something else is referencing it, which is why __del__ isn't being called (yet).
Consider this code:
#!/usr/bin/env python

import time

class NiceClass():
    def __init__(self, num):
        print "Make %i" % num
        self.num = num

    def __del__(self):
        print "Unmake %i" % self.num

x = NiceClass(1)
y = NiceClass(2)
z = NiceClass(3)
lst = [x, y, z]

time.sleep(1)
del x
del y
del z
time.sleep(1)
print "Deleting the list."
del lst
time.sleep(1)
It doesn't call __del__ of NiceClass instances until we delete the list that references them.
Unlike C++, __del__ is not called unconditionally to destroy the object on demand; garbage collection makes things a bit harder. Here is some info: http://arctrix.com/nas/python/gc/
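In particular, under Python 2 an object whose class defines __del__ and that ends up in a reference cycle is never collected at all: the collector parks it in gc.garbage instead (since PEP 442, Python 3.4+ does collect such cycles). A minimal Python 2 demonstration:

import gc

class Cyclic(object):
    def __del__(self):
        print "Unmake"

a = Cyclic()
a.me = a          # reference cycle: a -> a.me -> a
del a
gc.collect()
print gc.garbage  # the instance sits here; "Unmake" was never printed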
I have an application with a ProcessPoolExecutor, to which I deliver an object instance that has a destructor implemented using the __del__ method.
The problem is, that the __del__ method deletes files from the disk, that are common to all the threads (processes). When a process in the pool finishes its job, it calls the __del__ method of the object it got and thus ruins the resources of the other threads (processes).
I tried to prepare a "safe" object, without a destructor, which I would use when submitting jobs to the pool:
my_safe_object = copy.deepcopy(my_object)
delattr(my_safe_object, '__del__')
But the delattr call fails with the following error:
AttributeError: __del__
Any idea how to get rid of the __del__ method of an existing object at runtime?
UPDATE - My solution:
Eventually I solved it using quite an elegant workaround:
class C:
    def __init__(self):
        self.orig_id = id(self)
        # ... CODE ...

    def __del__(self):
        if id(self) != self.orig_id:
            return
        # ... CODE ...
So the field orig_id is only computed for the original object, where the constructor is really executed. The other object "clones" are created using a deep copy, so their orig_id value will contain the id of the original object. Thus, when the clones are destroyed and call __del__, they compare their own id with the original object's id and return early, as the IDs will not match. Only the original object goes on to execute the body of __del__.
The best thing to do here, if you have access to the object's class code, is not to rely on __del__ at all. The fact that __del__ has a permanent side effect could be a problem by itself, but in an environment using multiprocessing it is definitely a no-go!
Here is why: first, __del__ is a method that lives on the instance's class, like most "magic" methods (which is why you can't delete it from an instance). Second, __del__ is called when the number of references to an object reaches zero. But if you don't have any reference to an object in the "master" process, that does not mean all the child processes are done with it. This is likely the source of your problem: reference counting is independent in each process. And third, you don't have much control over when __del__ is called, even in a single-process application. It is not hard to have a dangling reference to an object in a dictionary or a cache somewhere, so tying important application behavior to __del__ is normally discouraged. And all of this only holds for recent Python versions (roughly 3.5+); before that, __del__ was even more unreliable, and Python did not ensure it was called at all.
So, as the other answers put it, you could try to suppress __del__ directly on the class, but that would have to be done on the object's class in all the sub-processes as well.
Therefore, the way I recommend doing this is to have a method that is called explicitly to perform the file erasing and other side effects when disposing of an object: simply rename your __del__ method and call it only in the main process.
If you want to ensure that this "destructor" is called, Python does offer some automatic control through the context protocol: you use your objects inside a with statement block and perform the destruction in an __exit__ method, which is called automatically at the end of the with block. Of course, you would have to devise a way for the with block to be left only once the subprocess is finished working on the instance. That is why, in this case, I think an ordinary, explicit clean-up method, called in your main process when consuming the "result" of whatever you executed off-process, is easier.
TL;DR
Change your source object's class clean-up code from __del__ to an ordinary method, like cleanup
When submitting your instances for off-process execution, call the clean-up in your main process using the concurrent.futures.as_completed call.
In case you can't change the source code for the object's class, inherit from it, override __del__ with a no-op method, and force the object's __class__ attribute to the inherited class before submitting it to other processes:
import concurrent.futures

class SafeObject(BombObject):
    def __del__(self):
        pass

def execute(obj):
    # this function is executed in the other process
    ...

def execute_all(obj_list):
    executor = concurrent.futures.ProcessPoolExecutor(max_workers=XX)
    with executor:
        futures = {}
        for obj in obj_list:
            obj.__class__ = SafeObject
            futures[executor.submit(execute, obj)] = obj
        for future in concurrent.futures.as_completed(futures):
            value = future.result()  # add try/except around this as needed
            obj = futures[future]
            BombObject.__del__(obj)  # or just restore __class__ if the instances will be needed elsewhere
        del futures  # needed to clean up the extra references to the objects held in the futures dict
(Please note that the with statement above follows the recommended usage for ProcessPoolExecutor from the docs, and is unrelated to the custom __exit__ method I suggested earlier in the answer. Building a with-block equivalent that lets you take full advantage of the ProcessPoolExecutor would require some ingenuity.)
In general, methods belong to the class. While you can usually shadow a method on an instance, special "dunder" methods are looked up on the class directly, bypassing the instance. So consider:
In [1]: class Foo:
   ...:     def __int__(self):
   ...:         return 42
   ...:

In [2]: foo = Foo()

In [3]: int(foo)
Out[3]: 42

In [4]: foo.__int__ = lambda self: 43

In [5]: int(foo)
Out[5]: 42
You can read more about this behavior in the docs
For custom classes, implicit invocations of special methods are only guaranteed to work correctly if defined on an object’s type, not in the object’s instance dictionary.
I think the cleanest solution, if you are using multiprocessing, is to simply derive from the class and override __del__. I fear that monkey-patching the class will not play nice with multiprocessing, unless you monkey-patch it in all the processes. Not sure how the pickling will work out here.
I have a class that will only ever have one object at a time. I'm just starting OOP in Python, and I was wondering which is the better approach: to assign an instance of this class to a variable and operate on that variable, or to keep the instance referenced in a class variable instead. Here is an example of what I mean:
Referenced instance:
class Transaction(object):
    current_transaction = None
    in_progress = False

    def __init__(self):
        self.__class__.current_transaction = self
        self.__class__.in_progress = True
        self.name = 'abc'
        self.value = 50

    def update(self):
        do_smth()

Transaction()
if Transaction.in_progress:
    Transaction.current_transaction.update()

print Transaction.current_transaction.name
print Transaction.current_transaction.value
Instance in a variable:
class Transaction(object):
    def __init__(self):
        self.name = 'abc'
        self.value = 50

    def update(self):
        do_smth()

current_transaction = Transaction()
in_progress = True

if in_progress:
    current_transaction.update()

print current_transaction.name
print current_transaction.value
It's possible to see that you've encapsulated too much in the first case just by comparing the overall readability of the code: the second is much cleaner.
A better way to implement the first option is to use class methods: decorate all your methods with @classmethod and then call them as Transaction.method().
There's no practical difference in code quality between these two options. However, assuming the class is final, that is, without derived classes, I would go for a third choice: use the module as a singleton and kill the class. This would be the most compact and most readable choice. You don't need classes to create singletons.
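A minimal sketch of the module-as-singleton idea (the file name transaction.py and the function names are illustrative assumptions):

# transaction.py -- the module itself is the single "instance"
name = 'abc'
value = 50
in_progress = False

def start():
    global in_progress
    in_progress = True

def update():
    pass  # the do_smth() work from the question would go here

# elsewhere:
#   import transaction
#   transaction.start()
#   if transaction.in_progress:
#       transaction.update()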
I think the first version doesn't make much sense, and the second version of your code would be better in almost all situations. It can sometimes be useful to write a Singleton class (where only one instance ever exists) by overriding __new__ to always return the saved instance (after it's been created the first time). But usually you don't need that unless you're wrapping some external resource that really only ever makes sense to exist once.
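For reference, a minimal sketch of that __new__-based singleton (one common recipe among several):

class Singleton(object):
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super(Singleton, cls).__new__(cls)
        return cls._instance

a = Singleton()
b = Singleton()
assert a is b  # both names refer to the one saved instance
# note: __init__ still runs on every call; guard it if that matters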
If your other code needs to share a single instance, there are other ways to do so (e.g. a global variable in some module or a constructor argument for each other object that needs a reference).
Note that if your instances have a very well-defined life cycle, with specific events that should happen when they're created and destroyed, and unknown code running in between that uses the object, the context manager protocol may be something you should look at, as it lets you use your instances in with statements:
with Transaction() as trans:
    trans.whatever()  # the Transaction will be notified if anything raises
    other_stuff()     # an exception that is not caught within the with block
    trans.foo()       # (so it can do a rollback if it wants to)

foo()  # the Transaction will be cleaned up (e.g. committed) when the indented with block ends
Implementing the context manager protocol requires an __enter__ and __exit__ method.
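A sketch of what that might look like for a transaction-style object (the commit and rollback method names are assumptions, not anything from the question):

class Transaction(object):
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.commit()    # no exception: finish normally
        else:
            self.rollback()  # something raised inside the with block
        return False         # don't swallow the exception

    def commit(self):
        print 'committed'

    def rollback(self):
        print 'rolled back'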
Can someone explain why the following code behaves the way it does:
import types

class Dummy():
    def __init__(self, name):
        self.name = name

    def __del__(self):
        print "delete", self.name

d1 = Dummy("d1")
del d1
d1 = None
print "after d1"

d2 = Dummy("d2")
def func(self):
    print "func called"
d2.func = types.MethodType(func, d2)
d2.func()
del d2
d2 = None
print "after d2"

d3 = Dummy("d3")
def func(self):
    print "func called"
d3.func = types.MethodType(func, d3)
d3.func()
d3.func = None
del d3
d3 = None
print "after d3"
The output (note that the destructor for d2 is never called) is this (Python 2.7):
delete d1
after d1
func called
after d2
func called
delete d3
after d3
Is there a way to "fix" the code so the destructor is called without deleting the method added? I mean, the best place to put the d2.func = None would be in the destructor!
Thanks
[edit] Based on the first few answers, I'd like to clarify that I'm not asking about the merits (or lack thereof) of using __del__. I tried to create the shortest function that would demonstrate what I consider to be non-intuitive behavior. I'm assuming a circular reference has been created, but I'm not sure why. If possible, I'd like to know how to avoid the circular reference....
You cannot assume that __del__ will ever be called - it is not a place to hope that resources are automagically deallocated. If you want to make sure that a (non-memory) resource is released, you should make a release() or similar method and then call that explicitly (or use it in a context manager as pointed out by Thanatos in comments below).
At the very least you should read the __del__ documentation very closely, and then you should probably not try to use __del__. (Also refer to the gc.garbage documentation for other bad things about __del__)
I'm providing my own answer because, while I appreciate the advice to avoid __del__, my question was how to get it to work properly for the code sample provided.
Short version: The following code uses weakref to avoid the circular reference. I thought I'd tried this before posting the question, but I guess I must have done something wrong.
import types, weakref

class Dummy():
    def __init__(self, name):
        self.name = name

    def __del__(self):
        print "delete", self.name

d2 = Dummy("d2")
def func(self):
    print "func called"
d2.func = types.MethodType(func, weakref.ref(d2))  # This works
#d2.func = func.__get__(weakref.ref(d2), Dummy)    # This works too
d2.func()
del d2
d2 = None
print "after d2"
Longer version:
When I posted the question, I did search for similar questions. I know you can use with instead, and that the prevailing sentiment is that __del__ is BAD.
Using with makes sense, but only in certain situations. Opening a file, reading it, and closing it is a good example where with is a perfectly good solution. You've got a specific block of code where the object is needed, and you want to clean up the object at the end of the block.
A database connection seems to be used often as an example that doesn't work well using with, since you usually need to leave the section of code that creates the connection and have the connection closed in a more event-driven (rather than sequential) timeframe.
If with is not the right solution, I see two alternatives:
You make sure __del__ works (see this blog for a better description of weakref usage).
You use the atexit module to run a callback when your program closes. See this topic for example.
While I tried to provide simplified code, my real problem is more event-driven, so with is not an appropriate solution (with is fine for the simplified code). I also wanted to avoid atexit, as my program can be long-running, and I want to be able to perform the cleanup as soon as possible.
So, in this specific case, I find it to be the best solution to use weakref and prevent circular references that would prevent __del__ from working.
This may be an exception to the rule, but there are use-cases where using weakref and __del__ is the right implementation, IMHO.
Instead of del, you can use the with statement.
http://effbot.org/zone/python-with-statement.htm
Just like with file objects, you could do something like:

with Dummy('d1') as d:
    pass  # stuff
# d's __exit__ method is guaranteed to have been called
del doesn't call __del__
del, in the way you are using it, removes a local variable. __del__ is called when the object is destroyed. Python as a language makes no guarantees as to when it will destroy an object.

CPython, the most common implementation of Python, uses reference counting. As a result, del will often work as you expect. However, it will not work when you have a reference cycle:
d3 -> d3.func -> d3
Python doesn't detect this and so won't clean it up right away. And it's not just reference cycles. If an exception is thrown, you probably still want your destructor called. However, Python will typically hold onto the local variables as part of its traceback.
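A quick illustration of the traceback point (Python 2 behavior: sys.exc_info keeps the last handled exception, and with it the raising frame's locals, alive until the handling scope exits):

import sys

class Dummy(object):
    def __del__(self):
        print "delete"

def work():
    local = Dummy()
    raise ValueError

try:
    work()
except ValueError:
    pass

# The saved traceback still references work()'s frame and thus `local`,
# so "delete" has not printed yet.
print "after except"
sys.exc_clear()  # Python 2 only: drop the saved exception; now __del__ runs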
The solution is not to depend on the __del__ method. Rather, use a context manager.
class Dummy:
    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        print "Destroying", self

with Dummy() as dummy:
    # Do whatever you want with dummy in here
    pass
# __exit__ will be called before you get here
This is guaranteed to work, and you can even check the parameters to see whether you are handling an exception and do something different in that case.
A full example of a context manager.
class Dummy(object):
    def __init__(self, name):
        self.name = name

    def __enter__(self):
        return self

    def __exit__(self, exct_type, exce_value, traceback):
        print 'cleanup:', self

    def __repr__(self):
        return 'Dummy(%r)' % (self.name,)

with Dummy("foo") as d:
    print 'using:', d
print 'later:', d
It seems to me the real heart of the matter is here:
adding the functions is dynamic (at runtime) and not known in advance
I sense that what you are really after is a flexible way to bind different functionality to an object representing program state, also known as polymorphism. Python does that quite well, not by attaching/detaching methods, but by instantiating different classes. I suggest you look again at your class organization. Perhaps you need to separate a core, persistent data object from transient state objects. Use the has-a paradigm rather than is-a: each time state changes, you either wrap the core data in a state object, or you assign the new state object to an attribute of the core.
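A tiny sketch of that has-a arrangement (all names here are illustrative, not from the question):

class LoudState(object):
    def bark(self, core):
        print "WOOF WOOF, %s" % core.name

class QuietState(object):
    def bark(self, core):
        print "woof woof, %s" % core.name

class Core(object):
    def __init__(self, name):
        self.name = name
        self.state = QuietState()  # swap this attribute as state changes

    def bark(self):
        self.state.bark(self)

dog = Core("Rex")
dog.bark()              # woof woof, Rex
dog.state = LoudState()
dog.bark()              # WOOF WOOF, Rex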
If you're sure you can't use that kind of pythonic OOP, you could still work around your problem another way by defining all your functions in the class to begin with and subsequently binding them to additional instance attributes (unless you're compiling these functions on the fly from user input):
class LongRunning(object):
    def bark_loudly(self):
        print("WOOF WOOF")

    def bark_softly(self):
        print("woof woof")

while True:
    d = LongRunning()
    d.bark = d.bark_loudly
    d.bark()
    d.bark = d.bark_softly
    d.bark()
An alternative solution to using weakref is to dynamically bind the function to the instance only when it is called by overriding __getattr__ or __getattribute__ on the class to return func.__get__(self, type(self)) instead of just func for functions bound to the instance. This is how functions defined on the class behave. Unfortunately (for some use cases) python doesn't perform the same logic for functions attached to the instance itself, but you can modify it to do this. I've had similar problems with descriptors bound to instances. Performance here probably isn't as good as using weakref, but it is an option that will work transparently for any dynamically assigned function with the use of only python builtins.
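A sketch of the __getattribute__ variant (a minimal version: it binds any plain function found in the instance dict at lookup time, so no bound method is ever stored on the instance and no cycle is created):

import types

class AutoBind(object):
    def __getattribute__(self, name):
        value = object.__getattribute__(self, name)
        # bind plain functions stored on the instance, the way class functions are bound
        if isinstance(value, types.FunctionType) and name in object.__getattribute__(self, '__dict__'):
            return value.__get__(self, type(self))
        return value

obj = AutoBind()
def func(self):
    print "func called on", self
obj.func = func  # stored unbound: no instance -> bound method -> instance cycle
obj.func()       # bound fresh on each access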
If you find yourself doing this often, you might want a custom metaclass that does dynamic binding of instance-level functions.
Another alternative is to add the function directly to the class, which will then properly perform the binding when it's called. For a lot of use cases, this would have some headaches involved: namely, properly namespacing the functions so they don't collide. The instance id could be used for this, though since ids in CPython aren't guaranteed unique over the life of the program, you'd need to ponder this a bit to make sure it works for your use case... in particular, you probably need to make sure you delete the class function when an object goes out of scope, so that its id/memory address becomes available again. __del__ is perfect for this :). Alternatively, you could clear out all methods namespaced to the instance on object creation (in __init__ or __new__).
Another alternative (rather than messing with python magic methods) is to explicitly add a method for calling your dynamically bound functions. This has the downside that your users can't call your function using normal python syntax:
class MyClass(object):
    def dynamic_func(self, func_name):
        return getattr(self, func_name).__get__(self, type(self))

    def call_dynamic_func(self, func_name, *args, **kwargs):
        return getattr(self, func_name).__get__(self, type(self))(*args, **kwargs)

    """
    Alternate without using descriptor functionality:

    def call_dynamic_func(self, func_name, *args, **kwargs):
        return getattr(self, func_name)(self, *args, **kwargs)
    """
Just to make this post complete, I'll show your weakref option as well:
import weakref

inst = MyClass()
def func(self):
    print 'My func'

# You could also use the types module, but the descriptor method is cleaner IMO
inst.func = func.__get__(weakref.ref(inst), type(inst))
Possible Duplicate:
Why doesn't the weakref work on this bound method?
I'm using weakrefs in an observer pattern and noticed an interesting phenomenon. If I create an object and add one of its methods as an observer of an Observable, the reference is dead almost instantly. Can anyone explain what is happening?
I'm also interested in thoughts for why this might be a bad idea. I've decided not to use the weakrefs and just make sure to clean up after myself properly with Observable.removeobserver, but my curiosity is killing me here.
Here's the code:
from weakref import ref

class Observable:
    __observers = None

    def addobserver(self, observer):
        if not self.__observers:
            self.__observers = []
        self.__observers.append(ref(observer))
        print 'ADDING observer', ref(observer)

    def removeobserver(self, observer):
        self.__observers.remove(ref(observer))

    def notify(self, event):
        for o in self.__observers:
            if o() is None:
                print 'observer was deleted (removing)', o
                self.__observers.remove(o)
            else:
                o()(event)

class C(Observable):
    def set(self, val):
        self.notify(val)

class bar(object):
    def __init__(self):
        self.c = C()
        self.c.addobserver(self.foo)
        print self.c._Observable__observers

    def foo(self, x):
        print 'callback', x  # never reached

b = bar()
b.c.set(3)
and here's the output:
ADDING observer <weakref at 0xaf1570; to 'instancemethod' at 0xa106c0 (foo)>
[<weakref at 0xaf1570; dead>]
observer was deleted (removing) <weakref at 0xaf1570; dead>
The main thing to note is that the print statement after the call to addobserver shows that the weakref is already dead.
Whenever you reference an object's method, there's a bit of magic that happens, and it's that magic that's getting in your way.
Specifically, Python looks up the method on the object's class, then combines it with the object itself to create a kind of callable called a bound method. Every time e.g. the expression self.foo is evaluated, a new bound method instance is created. If you immediately take a weakref to that, then there are no other references to the bound method (even though both the object and the class's method still have live refs) and the weakref dies.
See this snippet on ActiveState for a workaround.
Each time you access a method of an instance, obj.m, a wrapper (called a "bound method") is generated; it is callable and adds self (obj) as the first argument when called. This is a neat solution for passing self "implicitly" and is what allows passing instance methods around in the first place. But it also means that each time you write obj.m, a new (very lightweight) object is created, and unless you keep a (non-weak) reference to it around, it will be GC'd, because nobody else will keep it alive for you.
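You can see this in a quick session; every attribute access builds a fresh bound method:

import weakref

class A(object):
    def foo(self):
        pass

a = A()
print a.foo == a.foo    # True  -- equal bound methods...
print a.foo is a.foo    # False -- ...but a new object on each access
r = weakref.ref(a.foo)  # the only reference to this bound method is weak
print r()               # None  -- dead on arrival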
I have a project where I'm trying to use weakrefs with callbacks, and I don't understand what I'm doing wrong. I have created a simplified test that shows the exact behavior I'm confused by.

Why is it that in this test test_a works as expected, but the weakref for self.MyCallbackB disappears between the class initialization and calling test_b? I thought that as long as the instance (a) exists, the reference to self.MyCallbackB should exist, but it doesn't.
import weakref

class A(object):
    def __init__(self):
        def MyCallbackA():
            print 'MyCallbackA'
        self.MyCallbackA = MyCallbackA

        self._testA = weakref.proxy(self.MyCallbackA)
        self._testB = weakref.proxy(self.MyCallbackB)

    def MyCallbackB(self):
        print 'MyCallbackB'

    def test_a(self):
        self._testA()

    def test_b(self):
        self._testB()

if __name__ == '__main__':
    a = A()
    a.test_a()
    a.test_b()
You want a WeakMethod.
An explanation why your solution doesn't work can be found in the discussion of the recipe:
Normal weakref.refs to bound methods don't quite work the way one expects, because bound methods are first-class objects; weakrefs to bound methods are dead-on-arrival unless some other strong reference to the same bound method exists.
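For what it's worth, on Python 3.4 and later this recipe ships in the standard library as weakref.WeakMethod. A Python 3 sketch:

import weakref

class A:
    def cb(self):
        print('called')

a = A()
wm = weakref.WeakMethod(a.cb)  # stays valid as long as `a` lives
method = wm()       # rebind a real bound method on demand
method()            # prints: called
del method, a       # drop all strong references
print(wm())         # None -- the underlying instance is gone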
According to the documentation for the weakref module:

    In the following, the term referent means the object which is referred to by a weak reference.

    A weak reference to an object is not enough to keep the object alive: when the only remaining references to a referent are weak references, garbage collection is free to destroy the referent and reuse its memory for something else.
What's happening with MyCallbackA is that you are holding a reference to it in the instance of A, thanks to:

self.MyCallbackA = MyCallbackA

Now, there is no reference to the bound method MyCallbackB in your code. It is held only in a.__class__.__dict__ as an unbound method. Basically, a bound method is created (and returned to you) every time you do self.methodName. (AFAIK, a bound method works like a property, via a read-only descriptor, at least for new-style classes. I am sure something similar, i.e. without descriptors, happens for old-style classes; I'll leave it to someone more experienced to verify that claim.) So self.MyCallbackB dies as soon as the weakref is created, because there is no strong reference to it!

My conclusions are based on:
import weakref

# trace is called when the referent is about to be finalized - see the weakref docs
def trace(x):
    print "Del MycallbackB"

class A(object):
    def __init__(self):
        def MyCallbackA():
            print 'MyCallbackA'
        self.MyCallbackA = MyCallbackA
        self._testA = weakref.proxy(self.MyCallbackA)

        print "Create MyCallbackB"
        # To fix it, do:
        # self.MyCallbackB = self.MyCallbackB
        # The name on the LHS could be anything, even foo!
        self._testB = weakref.proxy(self.MyCallbackB, trace)
        print "Done playing with MyCallbackB"

    def MyCallbackB(self):
        print 'MyCallbackB'

    def test_a(self):
        self._testA()

    def test_b(self):
        self._testB()

if __name__ == '__main__':
    a = A()
    #print a.__class__.__dict__["MyCallbackB"]
    a.test_a()
Output
Create MyCallbackB
Del MycallbackB
Done playing with MyCallbackB
MyCallbackA
Note: I tried verifying this for old-style classes. It turned out that print a.test_a.__get__ outputs

<method-wrapper '__get__' of instancemethod object at 0xb7d7ffcc>

for both new- and old-style classes. So it may not really be a descriptor, just something descriptor-like. In any case, the point is that a bound-method object is created when you access an instance method through self, and unless you maintain a strong reference to it, it will be deleted.
The other answers address the why in the original question, but either don't provide a workaround or refer to external sites.
After working through several other posts on StackExchange on this topic, many of which are marked as duplicates of this question, I finally came to a succinct workaround. When I know the nature of the object I'm dealing with, I use the weakref module; when I might instead be dealing with a bound method (as occurs in my code when using event callbacks), I now use the following WeakRef class as a direct replacement for weakref.ref(). I've tested this with Python 2.4 through 2.7 inclusive, but not on Python 3.x.
import weakref

class WeakRef:
    def __init__(self, item):
        try:
            self.method = weakref.ref(item.im_func)
            self.instance = weakref.ref(item.im_self)
        except AttributeError:
            # not a bound method: fall back to an ordinary weakref
            self.reference = weakref.ref(item)
        else:
            self.reference = None

    def __call__(self):
        if self.reference is not None:
            return self.reference()
        instance = self.instance()
        if instance is None:
            return None
        method = self.method()
        return getattr(instance, method.__name__)
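A quick usage sketch (Python 2, since the class relies on the im_func/im_self attributes of bound methods):

class Target(object):
    def handler(self, event):
        print 'handled', event

t = Target()
r = WeakRef(t.handler)
r()('ping')  # resolves a live bound method: prints "handled ping"
del t
print r()    # None -- the instance is gone, so the callback is dead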