I have learned that python does not guarantee that __del__ is called whenever an object is deleted.
In other words, del x does not necessarily invoke its destructor x.__del__().
If I want to ensure proper object cleanup, I should use a context manager (in a with statement).
I know it's stupid, but for a couple of reasons (please don't ask why) I am tied to a system with Python 2.4; therefore context managers are out of the question (they were introduced in Python 2.5).
So I need an alternative solution, and hence my question: are there best practices that would help me use __del__ reliably? I am thinking along the lines of "if Python provides such functionality, there must be a way it can be used effectively (I'm just too stupid to figure out how)"...
Or am I just being naive? Should I forget about __del__ and move on to a completely different approach?
In short: No, there is no way to ensure it gets called.
The answer is to implement context managers yourself. A with statement roughly translates to:
x.__enter__()
try:
    ...
finally:
    x.__exit__()
So just do it manually. It is a little more complex than that, so I recommend reading PEP 343 to fully understand how context managers work.
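For pre-2.5 code, that translation can be applied by hand. Here is a minimal sketch of emulating a with statement with try/finally; the Resource class and its open flag are invented for illustration:

```python
class Resource(object):
    """Hypothetical resource exposing the context-manager protocol."""
    def __init__(self):
        self.open = False

    def __enter__(self):
        self.open = True   # acquire the resource
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.open = False  # release the resource

# Manual equivalent of "with Resource() as x: ..." for Python < 2.5:
x = Resource()
x.__enter__()
try:
    pass  # body of the would-be with statement
finally:
    # A real with statement passes exception details here; None
    # stands in for the no-exception case.
    x.__exit__(None, None, None)
```

The finally clause guarantees __exit__ runs even if the body raises.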
One option is to call your cleanup method close(); then, in future versions of Python, people can easily use contextlib.closing to turn it into a real context manager.
Instead of __del__, give your class a method called something like close, then call that explicitly:
foo = Foo()
try:
    foo.do_interesting_stuff()
finally:
    foo.close()
For extra safety and forward-compatibility, have __exit__ and __del__ call close as well.
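A minimal sketch of that pattern (the Foo class and its closed flag are made up for illustration): close() does the real cleanup, while __exit__ and __del__ simply delegate to it:

```python
class Foo(object):
    """Hypothetical resource holder with an explicit close()."""
    def __init__(self):
        self.closed = False

    def close(self):
        if not self.closed:      # make close() safe to call twice
            self.closed = True   # release the real resource here

    def __enter__(self):
        # Forward compatibility: usable in a with statement (Python >= 2.5).
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

    def __del__(self):
        # Last-resort safety net; not guaranteed to run.
        self.close()

foo = Foo()
try:
    pass  # foo.do_interesting_stuff()
finally:
    foo.close()
```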
I do things mostly in C++, where the destructor method is really meant for destruction of an acquired resource. Recently I started with Python (which is really fun and fantastic), and I came to learn that it has garbage collection like Java.
Thus, there is no heavy emphasis on object ownership (construction and destruction).
As far as I've learned, the __init__() method makes sense to me in Python (even more than it does in Ruby), but do we really need to implement the __del__() method in our classes? Will my class lack something if I omit __del__()? The one scenario where I could see __del__() being useful is if I want to log something when an object is destroyed. Is there anything other than this?
In the Python 3 docs the developers have now made clear that destructor is in fact not the appropriate name for the method __del__.
object.__del__(self)
Called when the instance is about to be destroyed. This is also called a finalizer or (improperly) a destructor.
Note that the OLD Python 3 docs used to suggest that 'destructor' was the proper name:
object.__del__(self)
Called when the instance is about to be destroyed. This is also called a destructor. If a base class has a __del__() method, the derived class’s __del__() method, if any, must explicitly call it to ensure proper deletion of the base class part of the instance.
From other answers, but also from Wikipedia:
In a language with an automatic garbage collection mechanism, it would be difficult to deterministically ensure the invocation of a destructor, and hence these languages are generally considered unsuitable for RAII [Resource Acquisition Is Initialization]
So you should almost never implement __del__, but it gives you the opportunity to do so in some (rare?) use cases.
As the other answers have already pointed out, you probably shouldn't implement __del__ in Python. If you find yourself in the situation thinking you'd really need a destructor (for example if your class wraps a resource that needs to be explicitly closed) then the Pythonic way to go is using context managers.
Is __del__ really a destructor?
No, the __del__ method is not a destructor; it is just a normal method you can call whenever you want to perform any operation, but it is also called before the garbage collector destroys the object (with no guarantee of exactly when that happens).
Think of it as a cleanup or last-will method.
It is so uncommon that I only learned about it today (and I've been using Python for a long time).
Memory is deallocated, files are closed, and so on by the GC. But you may need to perform some task with effects outside of the class.
My use case is implementing some sort of RAII for temporary directories: I'd like them to be removed no matter what.
Instead of removing the directory after the processing (which, after some change, was no longer being run), I moved the removal into the __del__ method, and it works as expected.
This is a very specific case, where we don't really care about when the method is called, as long as it's called before leaving the program. So, use with care.
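A sketch of that temporary-directory use case (the class name and layout are invented here, not taken from the original code): the directory is created in __init__ and removed in __del__ as a best effort:

```python
import shutil
import tempfile

class TempWorkspace(object):
    """Hypothetical sketch: RAII-style temporary directory, removed in __del__."""
    def __init__(self):
        self.path = tempfile.mkdtemp()

    def __del__(self):
        # Best-effort cleanup: runs when the object is reclaimed,
        # but is not guaranteed on every interpreter shutdown.
        shutil.rmtree(self.path, ignore_errors=True)
```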
I have a class with a few methods - each one sets some internal state, and usually requires some other method to be called first to set the stage.
Typical invocation goes like this:
c = MyMysteryClass()
c.connectToServer()
c.downloadData()
c.computeResults()
In some cases only connectToServer() and downloadData() will be called (or even just connectToServer() alone).
The question is: how should those methods behave when they are called in wrong order (or, in other words, when the internal state is not yet ready for their task)?
I see two solutions:
They should throw an exception
They should call correct previous method internally
Currently I'm using second approach, as it allows me to write less code (I can just write c.computeResults() and know that two other methods will be called if necessary). Plus, when I call them multiple times, I don't have to keep track of what was already called and so I avoid multiple reconnecting or downloading.
On the other hand, first approach seems more predictable from the caller perspective, and possibly less error prone.
And of course, there is a possibility for a hybrid solution: throw an exception, and add another layer of methods with internal state detection and proper calling of previous ones. But that seems to be a bit of overkill.
Your suggestions?
They should throw an exception. As said in the Zen of Python: Explicit is better than implicit. And, for that matter, Errors should never pass silently. Unless explicitly silenced. If the methods are called out of order that's a programmer's mistake, and you shouldn't try to fix that by guessing what they mean. You might accidentally cover up an oversight in a way that looks like it works, but is not actually an accurate reflection of the programmer's intent. (That programmer may be future you.)
If these methods are usually called immediately one after another, you could consider collating them by adding a new method that simply calls them all in a row. That way you can use that method and not have to worry about getting it wrong.
Note that classes that handle internal state in this way are sometimes called for but are often not, in fact, necessary. Depending on your use case and the needs of the rest of your application, you may be better off doing this with functions and actually passing connection objects, etc. from one method to another, rather than using a class to store internal state. See for instance Stop Writing Classes. This is just something to consider and not an imperative; plenty of reasonable people disagree with the theory behind Stop Writing Classes.
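A minimal sketch of both suggestions together (the class and its internals are hypothetical stand-ins for the original MyMysteryClass): each step raises if its prerequisite has not run, and a convenience method calls the steps in order:

```python
class MysteryClient(object):
    """Hypothetical sketch of the class from the question."""
    def __init__(self):
        self._connected = False
        self._data = None

    def connectToServer(self):
        self._connected = True  # stand-in for the real connection logic

    def downloadData(self):
        if not self._connected:
            raise RuntimeError("connectToServer() must be called first")
        self._data = [1, 2, 3]  # placeholder payload

    def computeResults(self):
        if self._data is None:
            raise RuntimeError("downloadData() must be called first")
        return sum(self._data)

    def runAll(self):
        # Convenience method: performs the steps in the correct order.
        self.connectToServer()
        self.downloadData()
        return self.computeResults()
```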
You should raise exceptions. It is good programming practice to raise exceptions to make your code easier to understand, for the following reasons:
What you are describing fits the literal description of an "exception" -- it is an exception to normal proceedings.
If you build in some kind of workaround, you will likely end up with "spaghetti code" = BAD.
When you, or someone else, goes back and reads this code later, it will be difficult to understand if you do not provide the hint that executing these methods out of order is exceptional.
Here's a good source:
http://jeffknupp.com/blog/2013/02/06/write-cleaner-python-use-exceptions/
As my CS professor always said "Good programmers can write code that computers can read, but great programmers write code that humans and computers can read".
I hope this helps.
If it's possible, you should make the dependencies explicit.
For your example:
c = MyMysteryClass()
connection = c.connectToServer()
data = c.downloadData(connection)
results = c.computeResults(data)
This way, even if you don't know how the library works, there's only one order the methods could be called in.
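Fleshing that idea out into a runnable sketch (the return values below are placeholders, not the original implementation):

```python
class MyMysteryClass(object):
    """Hypothetical sketch: each step returns the value the next step needs."""
    def connectToServer(self):
        return "connection"      # stand-in for a real connection object

    def downloadData(self, connection):
        return [1, 2, 3]         # stand-in for the downloaded payload

    def computeResults(self, data):
        return sum(data)

c = MyMysteryClass()
connection = c.connectToServer()
data = c.downloadData(connection)
results = c.computeResults(data)
```

Because each method requires its predecessor's return value, it is impossible to call them out of order without a TypeError.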
It is possible for a generator to manage a resource, e.g. by yielding from inside a context manager.
The resource is freed as soon as the close() method of the generator is called (or an exception is raised).
As it's easy to forget to call close() at the end, I think it's natural to use a context manager for that as well (and also to handle potential exceptions).
I know that I can use contextlib.closing for that, but wouldn't it be much nicer to directly use the generator in the with statement?
Is there a reason why a generator should not be a context manager?
In general, the reason you don't see more generators as context managers and vice versa is that they're aimed at solving different problems. Context managers came about because they provide a clean and concise way of scoping executable code to a resource.
There is one very good reason you might want to separate a class that implements __iter__() from also being a context manager: the Single Responsibility Principle. Single Responsibility boils down to the concept
Make a class do one thing and do it well
Lists are iterable but that's because they're a collection. They manage no state other than what they hold and iteration is just another way of accessing that state. Unless you need iteration as a means of accessing the state of a contained object then I can't see a reason to mix and match the two together. Even then, I would go to great lengths to separate it out in true OO style.
Like Wheaties said, you want to have classes do only "one thing and do it well". In particular with context managers, they are managing a context. So ask yourself, what is the context here? Most of the time, it will be having a resource open. A while ago I asked about using a queue with a context manager, and the response was basically that a queue did not make sense as a context. However, "in a task" was the real context that I was in and it made sense to make a context manager for that.
Additionally, there is no iterated with statement. For example, I cannot open a file and iterate through it in one statement like this:
for line in file with open(filename) as file:
    ...
It has to be done in two lines:
with open(filename) as file:
    for line in file:
        ...
This is good because the context being managed is not "we are iterating through the file", it is "we have a file open". So again, what is the context? What are you really doing? Most likely, your managed context is not actually the iteration through the resource. However, if you look at your specific problem you might discover that you do indeed have a situation in which the generator is managing a context. Hopefully understanding what the context really is should give you some ideas on how to appropriately manage it.
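For the common case where the generator's cleanup lives in a try/finally around its yield, contextlib.closing already gives you the with-statement behaviour. A small sketch (the stream generator and log list are invented for illustration):

```python
from contextlib import closing

log = []

def stream():
    """Hypothetical generator that guards a resource with try/finally."""
    log.append("acquired")
    try:
        for i in range(3):
            yield i
    finally:
        # Runs when close() is called (or the generator is exhausted).
        log.append("released")

# closing() calls gen.close() on exit, which triggers the finally block.
with closing(stream()) as gen:
    first = next(gen)
```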
I have a class where I create a file object in the constructor. This class also implements a finish() method as part of its interface, and in this method I close the file object. The problem is that if I get an exception before this point, the file will not be closed. The class in question has a number of other methods that use the file object. Do I need to wrap all of these in a try/finally clause, or is there a better approach?
You could make your class a context-manager, and then wrap object creation and use of that class in a with-statement. See PEP 343 for details.
To make your class a context-manager, it has to implement the methods __enter__() and __exit__(). __enter__() is called when you enter the with-statement, and __exit__() is guaranteed to be called when you leave it, no matter how.
You could then use your class like this:
with MyClass() as foo:
    # use foo here
If you acquire your resources in the constructor, you can make __enter__() simply return self without doing anything. __exit__() should just call your finish()-method.
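Putting that together, a minimal sketch (ManagedFile is a made-up name standing in for the class from the question):

```python
class ManagedFile(object):
    """Hypothetical sketch: resource acquired in __init__, released by finish()."""
    def __init__(self, path):
        self._file = open(path)

    def finish(self):
        self._file.close()

    def __enter__(self):
        # The resource is already acquired in the constructor.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Guaranteed to run when the with block exits, even on exceptions.
        self.finish()
```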
For short-lived file objects, a try/finally pair or the more succinct with statement is recommended as a clean way to make sure the files are flushed and the related resources are released.
For long-lived file objects, you can register an explicit close with the atexit module, or just rely on the interpreter cleaning up before it exits.
At the interactive prompt, most people don't bother for simple experiments where there isn't much of a downside to leaving files unclosed or relying on refcounting or GC to close for you.
Closing your files is considered good technique. In reality though, not explicitly closing files rarely has any noticeable effects.
You can either have a try...finally pair, or make your class a context manager suitable for use in the with statement.
I have been thinking about how I write classes in Python - more specifically, how the constructor is implemented and how the object should be destroyed. I don't want to rely on CPython's reference counting to do object cleanup. This basically tells me I should use with statements to manage my object lifetimes, and that I need an explicit close/dispose method (this method could be called from __exit__ if the object is also a context manager).
class Foo(object):
    def __init__(self):
        pass

    def close(self):
        pass
Now, if all my objects behave in this way and all my code uses with statements or explicit calls to close() (or dispose()), I don't really see the need to put any code in __del__. Should we really use __del__ to dispose of our objects?
Short answer : No.
Long answer: Using __del__ is tricky, mainly because it's not guaranteed to be called. That means you can't do things there that absolutely have to be done. This in turn means that __del__ can basically only be used for cleanups that would happen sooner or later anyway, like cleaning up resources that would be cleaned up when the process exits, so it doesn't matter if __del__ doesn't get called. Of course, these are also generally the same things Python will do for you. So that kind of makes __del__ useless.
Also, __del__ gets called when Python garbage-collects, and you didn't want to wait for Python's garbage collecting, which means you can't use __del__ anyway.
So, don't use __del__. Use __enter__/__exit__ instead.
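If writing __enter__/__exit__ by hand feels heavy, contextlib.contextmanager offers a generator-based alternative. A small sketch (the disposable function and its dict "resource" are invented for illustration):

```python
from contextlib import contextmanager

@contextmanager
def disposable(resource_name):
    """Hypothetical sketch: generator-based alternative to __enter__/__exit__."""
    resource = {"name": resource_name, "open": True}  # acquire
    try:
        yield resource
    finally:
        resource["open"] = False  # dispose, even on exceptions

with disposable("db") as r:
    assert r["open"]
```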
FYI: Here is an example of a non-circular situation where the destructor did not get called:
class A(object):
    def __init__(self):
        print('Constructing A')

    def __del__(self):
        print('Destructing A')

class B(object):
    a = A()
OK, so it's a class attribute. Evidently that's a special case. But it just goes to show that making sure __del__ gets called isn't straightforward. I'm pretty sure I've seen more non-circular situations where __del__ isn't called.
Not necessarily. You'll encounter problems when you have cyclic references. Eli Bendersky does a good job of explaining this in his blog post:
Safely using destructors in Python
If you are sure you will not go into cyclic references, then using __del__ in that way is OK: as soon as the reference count goes to zero, the CPython VM will call that method and destroy the object.
If you plan to use cyclic references - please think it very thoroughly, and check if weak references may help; in many cases, cyclic references are a first symptom of bad design.
If you have no control on the way your object is going to be used, then using __del__ may not be safe.
If you plan to use Jython or IronPython, __del__ is completely unreliable, because final object destruction will happen at garbage collection, and that's something you cannot control.
In sum, in my opinion, __del__ is usually perfectly safe and good; however, in many situations it could be better to take a step back and look at the problem from a different perspective; a good use of try/except and of with contexts may be a more Pythonic solution.