Garbage collection of object after exception - python

I have observed that after an exception I am left with an object whose destructor is never called, which causes a lock to be held. What is the best way to improve the situation? Would calling del in an except block be the solution?
b=BigHash(DB_DIR, url)
meta = bdecode(b.get())
return meta
b holds a lock which is released on destruction (it's a C++ object)
an exception is thrown by b.get().

No matter what, you want the lock to be released - whether or not an exception is thrown. In that case, it's probably best to release the lock/delete b in a finally: clause:
b = BigHash(DB_DIR, url)
try:
    meta = bdecode(b.get())
finally:
    del b  # or whatever you need to do to release the lock
return meta
You could also use a context manager - http://docs.python.org/library/stdtypes.html#typecontextmanager. Simply add code to free the lock in the BigHash.__exit__ function, which will be called after leaving the with block in the following code:
with BigHash(DB_DIR, url) as b:
    meta = bdecode(b.get())
    return meta

You need to do something like this to make sure b is unlocked:
b = BigHash(DB_DIR, url)
try:
    meta = bdecode(b.get())
    return meta
finally:
    pass  # unlock b here
A cleaner way would be if BigHash can work as a context, so you can write
with BigHash(DB_DIR, url) as b:
    meta = bdecode(b.get())
    return meta
You might have to add some code to BigHash to make it work as a context, though.

Calling del on a name is something you should pretty much never do. Calling del does not guarantee anything useful about what will happen to the underlying object. You should never depend on a __del__ method for something you need to happen.
del only gets rid of one reference to an object, which can be confusing when you may have made more without thinking. Therefore, del is useful for cleaning up a namespace, not for controlling the lifetime of objects, and it's not even great for that—the proper way to control a name's lifetime is to put it in a function and have it go out of scope or put it in a with block.
You need to equip BigHash with the ability to release the lock explicitly, with a release, unlock, or close method. If you want to use this with a context manager (with), you can define __exit__, which will get called at a predictable, useful time.
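As a rough sketch (the method bodies here are placeholders, since the real lock and lookup live in the C++ extension), BigHash could grow an explicit release method plus the context-manager hooks like this:
class BigHash(object):
    def __init__(self, db_dir, url):
        pass  # acquire the underlying C++ lock here

    def get(self):
        pass  # look up and return the stored value

    def release(self):
        pass  # explicitly release the C++ lock

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.release()
        return False  # don't suppress exceptions raised inside the with block
With that in place, the with version shown above releases the lock at a predictable point whether or not b.get() raises.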


Is it possible to delete a record after "return" statement?

I made a function in 'hr.contract' that generates a payslip. But this payslip is only used to simulate and calculate some salaries, so after creating the payslip I must delete it.
Also, I made a function to print the payslip report from the contract form. The problem is that when I click the "print" button I create the payslip and return its report, but then I can't figure out a way to delete the payslip I created.
def generate_report(self):
    # I get these values from other methods;
    # I put 1 and 20 just to avoid confusion in the question.
    run_id = 1
    indicador_id = 20
    payslip = self.generate_fake_nominee(run_id, self.employee_id.id, indicador_id, self.id)
    report = payslip.print_nominee_report()
    return report
I can't do stuff after the return, so any ideas?
Assuming payslip is the only reference, it should be deleted when the function returns. If it holds resources that might not be cleaned up properly, either:
It should implement the context management protocol and a with statement could be used:
with self.generate_fake_nominee(run_id, self.employee_id.id, indicador_id, self.id) as payslip:
    return payslip.print_nominee_report()
or,
If you need to call some cleanup function manually, a finally block can be used:
payslip = self.generate_fake_nominee(run_id, self.employee_id.id, indicador_id, self.id)
try:
    return payslip.print_nominee_report()
finally:
    payslip.cleanup_function()
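For completeness, a minimal sketch of what the first option could look like on the payslip class itself (the class name here is purely illustrative, and cleanup_function stands in for whatever actually removes the temporary record):
class FakeNomineePayslip(object):  # hypothetical class, for illustration only
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.cleanup_function()  # e.g. delete the temporary payslip record
        return False             # don't swallow exceptions from the with block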
To be clear, putting del payslip in a finally block would be completely pointless; the name will be unbound when the function returns anyway, so del accomplishes nothing. Either the object bound to payslip would be freed anyway (if it had no other aliases) or it wouldn't be freed regardless (because it's aliased elsewhere). payslip is unbound when the function returns no matter what, and all del does is make the unbinding explicit, which is slower and pointlessly verbose.
Update: It seems like the report in your original code has a live dependency on the payslip object, so you can't actually clean up the payslip until the caller is done with the report. If that's the case, you have two options as well:
Figure out how to remove the live dependency, so report is a standalone set of data (depends on code you haven't shown us, but this is the best solution when possible)
Use a finalizer tied to report that preserves payslip until report is garbage collected, at which point it cleans it up, a la:
# At top of module
import weakref

def generate_report(self):
    # I get these values from other methods;
    # I put 1 and 20 just to avoid confusion in the question.
    run_id = 1
    indicador_id = 20
    payslip = self.generate_fake_nominee(run_id, self.employee_id.id, indicador_id, self.id)
    report = payslip.print_nominee_report()
    # Ensure cleanup_function is called when report is gc-ed
    weakref.finalize(report, payslip.cleanup_function)
    return report
This option isn't perfect; even on CPython with its deterministic reference counting, if a reference cycle gets involved it can take an arbitrary amount of time after you're done with report before the cycle collector is invoked and report is actually collected (and the cleanup function invoked). On non-CPython interpreters with true garbage collectors, this delay happens even without reference cycles.

What is the benefit of using a context manager with multiprocessing.Manager?

In the documentation, Manager is used with a context manager (i.e. with) like so:
from multiprocessing.managers import BaseManager

class MathsClass:
    def add(self, x, y):
        return x + y
    def mul(self, x, y):
        return x * y

class MyManager(BaseManager):
    pass

MyManager.register('Maths', MathsClass)

if __name__ == '__main__':
    with MyManager() as manager:
        maths = manager.Maths()
        print(maths.add(4, 3))  # prints 7
        print(maths.mul(7, 8))  # prints 56
But what is the benefit of this, with the exception of the namespace? For opening file streams, the benefit is quite obvious in that you don't have to manually .close() the connection, but what is it for Manager? If you don't use it in a context, what steps do you have to use to ensure that everything is closed properly?
In short, what is the benefit of using the above over something like:
manager = MyManager()
maths = manager.Maths()
print(maths.add(4, 3)) # prints 7
print(maths.mul(7, 8)) # prints 56
But what is the benefit of this (...)?
First, you get the primary benefit of almost any context managers. You have a well-defined lifetime for the resource. It is allocated and acquired when the with ...: block is opened. It is released when the blocks ends (either by reaching the end or because an exception is raised). It is still deallocated whenever the garbage collector gets around to it but this is of less concern since the external resource has already been released.
In the case of multiprocessing.Manager (which is a function that returns a SyncManager, even though Manager looks a lot like a class), the resource is a "server" process that holds state and a number of worker processes that share that state.
what is [the benefit of using a context manager] for Manager?
If you don't use a context manager and you don't call shutdown on the manager, then the "server" process will continue running until the SyncManager's __del__ is run. In some cases, this might happen soon after the code that created the SyncManager is done (for example, if it is created inside a short function, the function returns normally, and you're using CPython, then the reference counting system will probably quickly notice the object is dead and call its __del__). In other cases, it might take longer (if an exception is raised and holds on to a reference to the manager, then it will be kept alive until that exception is dealt with). In some bad cases, it might never happen at all (if the SyncManager ends up in a reference cycle, collection has to wait for the cycle collector, and on Python versions before 3.4 a __del__ method in a cycle could prevent collection entirely; or your process might crash before __del__ is called). In all these cases, you're giving up control of when the extra Python process created by SyncManager is cleaned up. That process may represent non-trivial resource usage on your system. In really bad cases, if you create SyncManagers in a loop, you may end up with many of them alive at the same time, easily consuming huge quantities of resources.
If you don't use it in a context, what steps do you have to use to ensure that everything is closed properly?
You have to implement the context manager protocol yourself, as you would for any context manager you used without with. It's tricky to do in pure-Python while still being correct. Something like:
import sys

manager = None
try:
    manager = MyManager()
    manager.__enter__()
    # use it ...
except:
    if manager is not None:
        manager.__exit__(*sys.exc_info())
    raise
else:
    if manager is not None:
        manager.__exit__(None, None, None)
start and shutdown are also aliases of __enter__ and __exit__, respectively.
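If you would rather not call the dunder methods directly, the same lifetime control can be written with those methods instead (a minimal sketch, meant to sit inside the if __name__ == '__main__': block of the example above):
manager = MyManager()
manager.start()              # like __enter__: starts the server process
try:
    maths = manager.Maths()
    print(maths.add(4, 3))   # prints 7
    print(maths.mul(7, 8))   # prints 56
finally:
    manager.shutdown()       # like __exit__: stops the server process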

Evaluate and assign expression in or before with statement

If I am correct, the with statement doesn't introduce a local scope of its own.
These are examples from Learning Python:
with open(r'C:\misc\data') as myfile:
    for line in myfile:
        print(line)
        ...more code here...
and
lock = threading.Lock()  # After: import threading
with lock:
    # critical section of code
    ...access shared resources...
Is the second example equivalent to the following rewritten in a way similar to the first example?
with threading.Lock() as lock:
    # critical section of code
    ...access shared resources...
What are their differences?
Is the first example equivalent to the following rewritten in a way similar to the second example?
myfile = open(r'C:\misc\data')
with myfile:
    for line in myfile:
        print(line)
        ...more code here...
What are their differences?
When with enters a context, it calls a hook on the context manager object, called __enter__, and the return value of that hook can optionally be assigned to a name using as <name>. Many context managers return self from their __enter__ hook. If they do, then you can indeed take your pick between creating the context manager on a separate line or capturing the object with as.
Out of your two examples, only the file object returned from open() has an __enter__ hook that returns self. For threading.Lock(), __enter__ returns the same value as Lock.acquire(), so a boolean, not the lock object itself.
You'll need to look for explicit documentation that confirms this; this is not always that clear however. For Lock objects, the relevant section of the documentation states:
All of the objects provided by this module that have acquire() and release() methods can be used as context managers for a with statement. The acquire() method will be called when the block is entered, and release() will be called when the block is exited.
and for file objects, the IOBase documentation is rather on the vague side and you have to infer from the example that the file object is returned.
The main thing to take away is that returning self is not mandatory, nor is it always desired. Context managers are entirely free to return something else. For example, many database connection objects are context managers that let you manage the transaction (roll back or commit automatically, depending on whether or not there was an exception), where entering returns a new cursor object bound to the connection.
To be explicit:
for your open() example, the two examples are for all intents and purposes exactly the same. Both call open(), and if that does not raise an exception, you end up with a reference to that file object named myfile. In both cases the file object will be closed after the with statement is done. The name continues to exist after the with statement is done.
There is a difference, but it is mostly technical. For with open(...) as myfile:, the file object is created, has its __enter__ method called, and then myfile is bound. For the myfile = open(...) case, myfile is bound first and __enter__ is called later.
For your with threading.Lock() as lock: example, using as lock will set lock to True (locking always either succeeds or blocks indefinitely this way). This differs from the lock = threading.Lock() case, where lock is bound to the lock object.
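A quick interactive check makes the difference visible (output is from CPython; the exact repr of the lock object may vary):
import threading

lock = threading.Lock()
with lock as value:
    print(value)   # True -- the value returned by acquire()
    print(lock)    # something like <locked _thread.lock object at 0x...>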
Here's a good explanation. I'll paraphrase the key part:
The with statement could be thought of like this code:
set things up
try:
    do something
finally:
    tear things down
Here, “set things up” could be opening a file, or acquiring some sort of external resource, and “tear things down” would then be closing the file, or releasing or removing the resource. The try-finally construct guarantees that the “tear things down” part is always executed, even if the code that does the work doesn’t finish.
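As a concrete illustration of that pattern, the open() example could be written out by hand roughly like this (simplified: the real with statement also passes any exception details to __exit__ and checks its return value; it also assumes the file path exists):
myfile = open(r'C:\misc\data')          # set things up
myfile.__enter__()                      # for file objects this simply returns the file
try:
    for line in myfile:                 # do something
        print(line)
finally:
    myfile.__exit__(None, None, None)   # tear things down: closes the file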

How to use a context manager in python

Below is a hypothetical piece of code
with dbengine.connect(**details) as db:
    cur = db.exec(sql_string)
    results = cur.fetchall()
return results
In this case I would expect that when tabbed out of that with block db.close() is called and db is marked for garbage collection.
At work I've started seeing this code crop up:
with something() as myobj:
    logger.info('I got an obj!')
return myobj
I don't know if you should be using with like the new keyword in Java. Could someone direct me to any good docs that might explain what you can/can't and should/shouldn't do when using with?
P.S. The log messages really are that lame :-)
The target name that the with statement binds the context manager's __enter__ return value to (the name after as) is not scoped to just the with statement. Like a for loop variable, the as target name lives in the current function or module namespace. The name does not disappear, and is not otherwise cleared, when the with suite ends.
As such, return myobj outside of the with statement is perfectly legal, if somewhat nonsensical. All that the with statement guarantees is that the something().__exit__() method will have been called when the block completes (be that by reaching the end of the block, or because of a continue, break or return statement, or because an exception has been raised).
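A quick way to see that the name survives the block (a throwaway example using a file object as the context manager):
with open('example.txt', 'w') as f:
    f.write('hello')

print(f)          # the name f is still bound after the block...
print(f.closed)   # True -- ...but the context manager has already closed the file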
That said, you'd be better off just moving the return inside the with statement:
with something() as myobj:
    logger.info('I got an obj!')
    return myobj
and
with dbengine.connect(**details) as db:
    cur = db.exec(sql_string)
    return cur.fetchall()
The context manager will still be cleaned up properly, but now the return statement looks like it is a logical part of the with block. The execution order is not altered; something().__exit__() is called, then the function returns.
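If you want to see that ordering for yourself, a small sketch using contextlib.contextmanager (standing in for something()) makes it visible:
from contextlib import contextmanager

@contextmanager
def something():
    print('enter')
    try:
        yield 'an obj'
    finally:
        print('exit')   # runs before the caller receives the returned value

def demo():
    with something() as myobj:
        return myobj

print(demo())   # prints: enter, exit, an obj -- in that order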
As always, the Python documentation on the with syntax is excellent. You could also review the documentation on context managers and the original proposal, PEP-343.

Workaround for python 2.4's yield not allowed in try block with finally clause

I'm stuck on Python 2.4, so I can't use a finally clause with yield in my generators. Is there any way to work around this?
I can't find any mention of how to work around this limitation in Python 2.4, and the workarounds I've thought of (mainly involving __del__ and trying to make sure it runs within a reasonable time) aren't very appealing.
You can duplicate code to avoid the finally block:
try:
    yield 42
finally:
    do_something()
Becomes:
try:
    yield 42
except:  # bare except, catches *anything*
    do_something()
    raise  # re-raise same exception
do_something()
(I've not tried this on Python 2.4; you may have to look at sys.exc_info() instead of the re-raise statement above, as in raise sys.exc_info()[0], sys.exc_info()[1], sys.exc_info()[2].)
The only code that's guaranteed to run when a generator instance is simply abandoned (garbage collected) is the __del__ methods of its local variables (if no references to those objects exist outside the generator) and the callbacks of weak references to its local variables (ditto). I recommend the weak reference route because it's non-invasive (you don't need a special class with a __del__ -- just anything that's weakly referenceable). E.g.:
import weakref

def gen():
    x = set()
    def finis(*_):
        print 'finis!'
    y = weakref.ref(x, finis)
    for i in range(99):
        yield i

for i in gen():
    if i > 5: break
This does print finis!, as desired.
