Resource Acquisition Is Initialization, in Python

I am new to Python. I come from C++.
In some code reviews, I've had several peers wanting me to move things from __init__ and __del__ to a start and stop method. Most of the time, this goes against the RAII that was beaten into my head over decades of C++.
https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization
Is RAII not a thing in Python?
Shouldn't it be?
After all, we can throw exceptions and we'd want to release resources when we do, no?
If it isn't, can someone give some insight as to why things are done differently? Is there a language feature that I don't understand?
If I have:

class Poop:
    def __init__(self):
        pass  # acquire some Windows resource

    def __del__(self):
        pass  # release some Windows resource

def foo():
    poop = Poop()
    raise Exception("Poop happens")
The Windows Resource is released, right?

RAII works in C++ because destruction is deterministic.
In garbage collected languages like Python, your object could theoretically never be destroyed, even if you call del on it.
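A minimal sketch of the problem (finalizer timing is up to the collector, not the del statement):

import gc

class Holder:
    def __del__(self):
        print("finalized")

a = Holder()
b = Holder()
a.other, b.other = b, a   # reference cycle
del a, b                  # the names are gone, but the cycle keeps both alive
gc.collect()              # CPython 3.4+ collects the cycle and runs __del__
                          # here; older versions never collected such cycles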
Anyway, the idiomatic way to handle resources in Python is not with RAII, nor with start/stop, but with context managers.
The simplest example is with a file object:
with open('this_file.txt') as f:
    # ... do stuff with f ...
# ... back to code that doesn't touch f ...
The with statement is, more or less, a try-finally block that creates a resource and ensures that the resource is cleaned up when the block ends; something like this:
try:
    f = open('this_file.txt')
    # ... do stuff with f ...
finally:
    f.close()
# ... back to code that doesn't touch f ...
I don't know Java, but I believe that the JVM also uses garbage collection, and similarly try-finally is an idiom for resource management in Java.
Anyway, the with statement takes a context manager, which is an instance of a class defining the __enter__ and __exit__ methods (see the docs).
For completeness, there may be cases where you want a context manager, but don't want to define a whole class just for that. In that case, contextlib may help.
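For instance, a generator-based manager can be written in a few lines. A minimal sketch, where acquire and release are placeholder names for whatever obtains and frees the resource:

from contextlib import contextmanager

@contextmanager
def managed_resource():
    resource = acquire()    # placeholder acquisition
    try:
        yield resource      # the body of the with block runs here
    finally:
        release(resource)   # runs even if the block raises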
A worked example; say you have a resource:
class Resource:
    def method(self):
        pass

get_resource = Resource
release_resource = lambda x: None
A RAII-like class might look something like this:
class RAIILike:
    def __init__(self):
        self.resource = get_resource()

    def __del__(self):
        release_resource(self.resource)

    def do_complex_thing(self):
        # do something complex with resource
        pass

raii_thingy = RAIILike()
And you would use the resource like this:
raii_thingy.resource.method()
On the other hand, a context managed resource could look like this...
class ContextManagedResource:
    def __enter__(self):
        self._resource = get_resource()
        return self._resource

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None:
            # an exception was raised inside the with block; handle it here
            pass
        release_resource(self._resource)
        # returning True suppresses any exception from the block;
        # return False (or None) to let it propagate instead
        return True
...and be used like this:
with ContextManagedResource() as res:
    res.method()
Once the with block ends, the resource will be automatically released, regardless of whether the object that obtained it has been garbage collected.

Your own reference to Wikipedia says:
Perl, Python (in the CPython implementation), and PHP manage object lifetime by reference counting, which makes it possible to use RAII. Objects that are no longer referenced are immediately destroyed or finalized and released, so a destructor or finalizer can release the resource at that time. However, it is not always idiomatic in such languages, and is specifically discouraged in Python (in favor of context managers and finalizers from the weakref package).
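The weakref finalizers mentioned there can attach cleanup to an object without defining __del__. A minimal sketch (Python 3.4+; release_resource here is a placeholder cleanup function, not from the question):

import weakref

def release_resource(res):          # placeholder cleanup function
    print("released", res)

class Owner:
    def __init__(self):
        self.resource = "some handle"   # placeholder acquisition
        # The callback and its arguments must not reference self,
        # otherwise the object can never become unreachable.
        self.finalizer = weakref.finalize(self, release_resource, self.resource)

owner = Owner()
owner.finalizer()   # may be called explicitly; runs at most once
del owner           # otherwise it runs when the object is collected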

You can do RAII in Python, or get pretty close. However, unlike C++, where you do the work in the constructor and destructor, in Python you need to use the __enter__ and __exit__ dunder methods. This post has an excellent write-up of how to write these methods and how they behave in the presence of exceptions: https://preshing.com/20110920/the-python-with-statement-by-example/

Related

Difference between Context Managers and Decorators in Python

What is the main difference between the two? I have been studying Python and came across them. A decorator is essentially a function that wraps another function and you can do anything before and after a particular function executes.
def my_decorator(some_function):
    def wrapper(*args, **kwargs):
        print("Do something before the function is called")
        some_function(*args, **kwargs)
        print("Do something after the function is called")
    return wrapper

@my_decorator
def addition(a, b):
    result = a + b
    print("Addition of {} and {} is {}".format(a, b, result))
But after studying context managers, I couldn't help but notice that they too have an enter and an exit, where you can do very similar operations.
from contextlib import contextmanager

@contextmanager
def open_file(path, mode):
    the_file = open(path, mode)
    yield the_file
    the_file.close()

files = []
for x in range(100000):
    with open_file('foo.txt', 'w') as infile:
        files.append(infile)

for f in files:
    if not f.closed:
        print('not closed')
Everything before the yield is treated as part of the "enter", and everything after it as part of the "exit". Although context managers and decorators are syntactically different, their behavior can look similar. So what is the difference? In which scenarios should one use either of them?
They are completely separate concepts and should not be seen in the same light.
A decorator lets you augment or replace a function or a class when it is defined. This is far broader than just executing things before or after a function call. Sure, your specific decorator lets you do something just before and after a function call, provided no exception is raised, or you explicitly handle exceptions. But you could also use a decorator to add an attribute to the function object, or to update some kind of registry. Or to return something entirely different and ignore the original function. Or to produce a wrapper that manipulates the arguments passed in, or the return value of the original function. A context manager can't do any of those things.
A context manager on the other hand lets you abstract away try: ... finally: constructs, in that no matter how the block exits, you get to execute some more code at the end of the block. Even if the block raises an exception, or uses return to exit a function, the context manager __exit__ method is still going to be called, regardless. A context manager can even suppress any exceptions raised in the block.
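For example, a minimal sketch of that suppression behavior (the standard library's contextlib.suppress does this generically):

class SuppressValueError:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # a true return value tells Python to swallow the exception
        return exc_type is not None and issubclass(exc_type, ValueError)

with SuppressValueError():
    raise ValueError("never propagates")
print("execution continues here")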
The two concepts are otherwise not related at all. Use decorators when you need to do something to or with functions or classes at the moment they are defined. Use context managers when you want to clean up or take other actions after a block ends.
They are completely different concepts.
A context manager is an object used with Python's with keyword: it runs code when entering the block and again when exiting it.
A decorator is a modification to a function or class definition: it runs code that replaces the function as it is being defined.
@D
def Y(...):
    ...

is just another way of writing

def Y(...):
    ...

Y = D(Y)
Good thinking; indeed the concepts have many similarities. There are important differences, though, so it is safer to think of them as totally different concepts.
Any context manager created with contextlib.contextmanager is also a decorator, as described here: https://docs.python.org/3/library/contextlib.html#using-a-context-manager-as-a-function-decorator
Context managers can be used to wrap code with setup and teardown steps. Decorators are a more general construct that allows us to modify functions in many ways, including wrapping them with setup/teardown logic. So it seems pretty natural to ask: why can't we use a context manager as a decorator?
We can, and in fact contextlib has already done it for you. If we write a context manager like so:
from contextlib import contextmanager

@contextmanager
def my_context():
    print("setup")
    yield
    print("teardown")
We can use it as a context manager in a with block or we can use it as a decorator:
def foo():
    with my_context():
        print("foo ran")

@my_context()
def bar():
    print("bar ran")
>>> foo()
setup
foo ran
teardown
>>> bar()
setup
bar ran
teardown
Which should you use?
Use a with block when your enclosed code needs access to the object returned by the context manager, e.g. file handling:
with open("my_file.txt") as file:
file.read() # needs access to the file object
Use as a decorator when an entire function needs to be wrapped in a context and doesn't need any context variables:
@contextmanager
def suppress_all_exceptions():
    try: yield
    except: pass

@suppress_all_exceptions()
def div_by_zero():
    print("hi")
    x = 1 / 0  # exception suppressed
Note: the same functionality can also be achieved by subclassing contextlib.ContextDecorator:
import contextlib

class MyContext(contextlib.ContextDecorator):
    def __enter__(self): ...
    def __exit__(self, *exc): ...
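An instance of such a subclass then works in both roles (a sketch reusing the MyContext skeleton above):

@MyContext()        # decorator form: wraps each call
def baz():
    print("baz ran")

with MyContext():   # context-manager form
    print("block ran")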

Use with statement in a class that wraps a resource

If I have a class that wraps a resource, e.g., an sqlite database connection or a file, is there a way I can use the with statement to close the resource when my object goes out of scope or is garbage collected?
To clarify what I mean, I want to avoid this:
class x:
    def __init__(self):
        pass  # open resource

    def close(self):  # or __del__, which is even worse
        pass  # close resource
but make it in such a way that the resource is always freed as in
with open('foo') as f:
    # use resource
    pass
You need to provide __enter__ and __exit__ methods. See PEP 343.
This PEP adds a new statement "with" to the Python language to make it possible to factor out standard uses of try/finally statements. In this PEP, context managers provide __enter__() and __exit__() methods that are invoked on entry to and exit from the body of the with statement.
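For the sqlite case from the question, a minimal sketch of such a wrapper might look like this (the class and table names are illustrative, not from the question):

import sqlite3

class Database:
    def __init__(self, path):
        self.conn = sqlite3.connect(path)   # open the resource

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.conn.commit()              # commit only on success
        self.conn.close()                   # always close

with Database('foo.db') as db:
    db.conn.execute("CREATE TABLE IF NOT EXISTS t (x)")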
Use contextlib.closing:
import contextlib

with contextlib.closing(thing) as thing:
    do_stuff_with(thing)
# thing is closed now
You can always put any cleanup code you need into a class's __del__ method:
class x:
    def __init__(self):
        self.thing = get_thing()

    def __del__(self):
        self.thing.close()
But you shouldn't.
This is a bad idea, for a few reasons. If you're using CPython, having a custom __del__ method historically meant the GC couldn't break reference cycles involving the object (PEP 442 lifted this restriction in Python 3.4). If you're using most other Python implementations, __del__ methods aren't called at predictable times.
This is why you usually put cleanup in explicit close methods. That's the best you can do within the class itself. It's always up to the user of your class to make sure the close method gets called, not the class itself.
So, there's no way you can use a with statement, or anything equivalent, inside your class. But you can make it easier for users of your class to use a with statement, by making your class into a context manager, as described in roippi's answer, or just by suggesting they use contextlib.closing in your documentation.

Python missing __exit__ method

Some background: I work in a large bank and I'm trying to re-use some Python modules, which I cannot change, only import. I also don't have the option of installing any new utilities/functions etc (running Python 2.6 on Linux).
I've got this at present:
In my module:
from common.databaseHelper import BacktestingDatabaseHelper

class mfReportProcess(testingResource):
    def __init__(self):
        self.db = BacktestingDatabaseHelper.fromConfig('db_name')
One of the methods called within the 'testingResource' class has this:
with self.db as handler:
which falls over with this:
with self.db as handler:
AttributeError: 'BacktestingDatabaseHelper' object has no attribute '__exit__'
and, indeed, there is no __exit__ method in the 'BacktestingDatabaseHelper' class, a class which I cannot change.
However, this code I'm trying to re-use works perfectly well for other apps - does anyone know why I get this error when no one else does?
Is there some way of defining __exit__ locally?
Many thanks in advance.
EDITED to add:
I've tried to add my own class to set up DB access but can't get it to work - added this to my module:
class myDB(BacktestingDatabaseHelper):
    def __enter__(self):
        self.db = fromConfig('db_name')
    def __exit__(self):
        self.db.close()
and added:
self.db = myDB
into the __init__ method of my main class, but I get this error:
with self.db as handler:
TypeError: unbound method __enter__() must be called with myDB instance as first argument (got nothing instead)
Any suggestions as to how to do this properly?
Using the with protocol assumes that the object used in with implements the context manager protocol.
Basically this means that the class definition should have __enter__() and __exit__() methods defined. If you use an object without these, Python will throw an AttributeError complaining about the missing __exit__ attribute.
The error means that BacktestingDatabaseHelper is not designed to be used in a with statement. Sounds like the classes testingResource and BacktestingDatabaseHelper are not compatible with each other (perhaps your version of common.databaseHelper is out of date).
As you cannot change the with statement, you must add a class deriving from BacktestingDatabaseHelper which adds appropriate __enter__() and __exit__() functions and use this instead.
Here is an example which tries to be as close to the original as possible:
class myDB(BacktestingDatabaseHelper):
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        self.db.close()
    def fromConfig(self, name):
        x = super(myDB, self).fromConfig(name)
        assert isinstance(x, BacktestingDatabaseHelper)
        x.__class__ = myDB  # not sure if that really works
        [...]

self.db = myDB.fromConfig('tpbp')
The problem is, however, that I am not sure what __enter__ is supposed to return. If you take MySQLdb, for example, the context manager of the connection creates a cursor representing one transaction. If that's the case here as well, you have to work out something else...
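(For comparison: Python's own sqlite3 connection behaves that way too; its context manager scopes a transaction, not the connection's lifetime. A small sketch:)

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x)")
# the with block commits on success and rolls back on an exception,
# but it does NOT close the connection
with con:
    con.execute("INSERT INTO t VALUES (1)")
con.close()   # closing remains a separate, explicit step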
You might want to try the contextlib.contextmanager decorator to wrap your object so that it supports the context manager protocol.
The 'with' keyword is basically a shortcut for writing out:

try:
    # do something
finally:
    handler.__exit__()
which is useful if your handler object is using up resources (like, for example, an open file stream). It makes sure that no matter what happens in the 'do something' part, the resource is released cleanly.
In your case, your handler object doesn't have an __exit__ method, and so with fails. I would assume that other people can use BacktestingDatabaseHelper because they're not using with.
As for what you can do now, I would suggest forgetting with and using try ... finally instead, rather than trying to add your own version of __exit__ to the object. You'll just have to make sure you release the handler properly (how you do this will depend on how BacktestingDatabaseHelper is supposed to be used), e.g.
try:
    handler = self.db
    # do stuff
finally:
    handler.close()
Edit:
Since you can't change it, you should do something like @Daniel Roseman suggests to wrap BacktestingDatabaseHelper. Depending on how best to clean up BacktestingDatabaseHelper (as above), you can write something like:
from contextlib import contextmanager

@contextmanager
def closing(thing):
    try:
        yield thing
    finally:
        thing.close()
and use this as:
class mfReportProcess(testingResource):
    def __init__(self):
        self.db = closing(BacktestingDatabaseHelper.fromConfig('db_name'))
(this is directly from the documentation).

I don't understand this python __del__ behaviour

Can someone explain why the following code behaves the way it does:
import types

class Dummy():
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print "delete", self.name

d1 = Dummy("d1")
del d1
d1 = None
print "after d1"

d2 = Dummy("d2")
def func(self):
    print "func called"
d2.func = types.MethodType(func, d2)
d2.func()
del d2
d2 = None
print "after d2"

d3 = Dummy("d3")
def func(self):
    print "func called"
d3.func = types.MethodType(func, d3)
d3.func()
d3.func = None
del d3
d3 = None
print "after d3"
The output (note that the destructor for d2 is never called) is this (Python 2.7):
delete d1
after d1
func called
after d2
func called
delete d3
after d3
Is there a way to "fix" the code so the destructor is called without deleting the method added? I mean, the best place to put the d2.func = None would be in the destructor!
Thanks
[edit] Based on the first few answers, I'd like to clarify that I'm not asking about the merits (or lack thereof) of using __del__. I tried to create the shortest function that would demonstrate what I consider to be non-intuitive behavior. I'm assuming a circular reference has been created, but I'm not sure why. If possible, I'd like to know how to avoid the circular reference....
You cannot assume that __del__ will ever be called - it is not a place to hope that resources are automagically deallocated. If you want to make sure that a (non-memory) resource is released, you should make a release() or similar method and then call that explicitly (or use it in a context manager as pointed out by Thanatos in comments below).
At the very least you should read the __del__ documentation very closely, and then you should probably not try to use __del__. (Also refer to the gc.garbage documentation for other bad things about __del__)
I'm providing my own answer because, while I appreciate the advice to avoid __del__, my question was how to get it to work properly for the code sample provided.
Short version: The following code uses weakref to avoid the circular reference. I thought I'd tried this before posting the question, but I guess I must have done something wrong.
import types, weakref

class Dummy():
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print "delete", self.name

d2 = Dummy("d2")
def func(self):
    print "func called"
d2.func = types.MethodType(func, weakref.ref(d2))  # this works
#d2.func = func.__get__(weakref.ref(d2), Dummy)    # this works too
d2.func()
del d2
d2 = None
print "after d2"
print "after d2"
Longer version:
When I posted the question, I did search for similar questions. I know you can use with instead, and that the prevailing sentiment is that __del__ is BAD.
Using with makes sense, but only in certain situations. Opening a file, reading it, and closing it is a good example where with is a perfectly good solution. You've got a specific block of code where the object is needed, and you want to clean up the object at the end of the block.
A database connection seems to be used often as an example that doesn't work well using with, since you usually need to leave the section of code that creates the connection and have the connection closed in a more event-driven (rather than sequential) timeframe.
If with is not the right solution, I see two alternatives:

1. You make sure __del__ works (see this blog for a better description of weakref usage).
2. You use the atexit module to run a callback when your program closes (a minimal sketch follows this list). See this topic for example.
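A minimal atexit sketch (the cleanup function is a placeholder):

import atexit

def cleanup():
    print("releasing resources")   # e.g. commit and close a connection

atexit.register(cleanup)           # runs once at normal interpreter shutdown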
While I tried to provide simplified code, my real problem is more event-driven, so with is not an appropriate solution (with is fine for the simplified code). I also wanted to avoid atexit, as my program can be long-running, and I want to be able to perform the cleanup as soon as possible.
So, in this specific case, I find it to be the best solution to use weakref and prevent circular references that would prevent __del__ from working.
This may be an exception to the rule, but there are use-cases where using weakref and __del__ is the right implementation, IMHO.
Instead of del, you can use the with statement.
http://effbot.org/zone/python-with-statement.htm
just like with file-type objects, you could do something like

with Dummy('d1') as d:
    # stuff
    pass
# d's __exit__ method is guaranteed to have been called
del doesn't call __del__
del, used the way you are using it, removes a local variable. __del__ is called when the object is destroyed, and Python as a language makes no guarantees as to when it will destroy an object.
CPython, the most common implementation of Python, uses reference counting. As a result, del will often work as you expect. However, it will not work when you have a reference cycle:
d3 -> d3.func -> d3
Reference counting alone doesn't detect this cycle, so Python won't clean it up right away. And it's not just reference cycles: if an exception is thrown, you probably still want your destructor to run, but Python will typically hold onto the local variables as part of its traceback.
The solution is not to depend on the __del__ method. Rather, use a context manager.
class Dummy:
    def __enter__(self):
        return self
    def __exit__(self, type, value, traceback):
        print "Destroying", self

with Dummy() as dummy:
    # Do whatever you want with dummy in here
    pass
# __exit__ will be called before you get here
This is guaranteed to work, and you can even check the parameters to see whether you are handling an exception and do something different in that case.
A full example of a context manager.
class Dummy(object):
    def __init__(self, name):
        self.name = name
    def __enter__(self):
        return self
    def __exit__(self, exct_type, exce_value, traceback):
        print 'cleanup:', self
    def __repr__(self):
        return 'Dummy(%r)' % (self.name,)

with Dummy("foo") as d:
    print 'using:', d
print 'later:', d
It seems to me the real heart of the matter is here:
adding the functions is dynamic (at runtime) and not known in advance
I sense that what you are really after is a flexible way to bind different functionality to an object representing program state, also known as polymorphism. Python does that quite well, not by attaching/detaching methods, but by instantiating different classes. I suggest you look again at your class organization. Perhaps you need to separate a core, persistent data object from transient state objects. Use the has-a paradigm rather than is-a: each time state changes, you either wrap the core data in a state object, or you assign the new state object to an attribute of the core.
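For instance, a minimal has-a sketch (class names are illustrative):

class LoudState:
    def bark(self):
        print("WOOF WOOF")

class SoftState:
    def bark(self):
        print("woof woof")

class Dog:
    def __init__(self):
        self.state = LoudState()   # has-a: swap state objects, not methods

    def bark(self):
        self.state.bark()          # delegate to the current state

d = Dog()
d.bark()                # WOOF WOOF
d.state = SoftState()   # change behavior without touching d's methods
d.bark()                # woof woof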
If you're sure you can't use that kind of pythonic OOP, you could still work around your problem another way by defining all your functions in the class to begin with and subsequently binding them to additional instance attributes (unless you're compiling these functions on the fly from user input):
class LongRunning(object):
    def bark_loudly(self):
        print("WOOF WOOF")

    def bark_softly(self):
        print("woof woof")

while True:
    d = LongRunning()
    d.bark = d.bark_loudly
    d.bark()
    d.bark = d.bark_softly
    d.bark()
An alternative solution to using weakref is to dynamically bind the function to the instance only when it is called by overriding __getattr__ or __getattribute__ on the class to return func.__get__(self, type(self)) instead of just func for functions bound to the instance. This is how functions defined on the class behave. Unfortunately (for some use cases) python doesn't perform the same logic for functions attached to the instance itself, but you can modify it to do this. I've had similar problems with descriptors bound to instances. Performance here probably isn't as good as using weakref, but it is an option that will work transparently for any dynamically assigned function with the use of only python builtins.
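A minimal sketch of that idea (Python 3 syntax; the class and function names are illustrative):

import types

class AutoBind:
    def __getattribute__(self, name):
        attr = object.__getattribute__(self, name)
        # Plain functions stored in the instance dict are returned unbound
        # by default; bind them to self on the fly, the way class-level
        # functions are bound via the descriptor protocol.
        if isinstance(attr, types.FunctionType) and \
                name in object.__getattribute__(self, '__dict__'):
            return attr.__get__(self, type(self))
        return attr

obj = AutoBind()

def greet(self):
    print("hello from", self)

obj.greet = greet   # the instance dict holds a plain function: no cycle
obj.greet()         # bound at call time, so self is passed automatically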
If you find yourself doing this often, you might want a custom metaclass that does dynamic binding of instance-level functions.
Another alternative is to add the function directly to the class, which will then properly perform the binding when it's called. For a lot of use cases, this would have some headaches involved: namely, properly namespacing the functions so they don't collide. The instance id could be used for this; though, since an id in CPython is only guaranteed unique while the object is alive, you'd need to ponder this a bit to make sure it works for your use case... in particular, you probably need to make sure you delete the class function when an object goes out of scope, and thus its id/memory address is available again. __del__ is perfect for this :). Alternatively, you could clear out all methods namespaced to the instance on object creation (in __init__ or __new__).
Another alternative (rather than messing with python magic methods) is to explicitly add a method for calling your dynamically bound functions. This has the downside that your users can't call your function using normal python syntax:
class MyClass(object):
    def dynamic_func(self, func_name):
        return getattr(self, func_name).__get__(self, type(self))

    def call_dynamic_func(self, func_name, *args, **kwargs):
        return getattr(self, func_name).__get__(self, type(self))(*args, **kwargs)

    """
    Alternate without using descriptor functionality:
    def call_dynamic_func(self, func_name, *args, **kwargs):
        return getattr(self, func_name)(self, *args, **kwargs)
    """
Just to make this post complete, I'll show your weakref option as well:
import weakref

inst = MyClass()

def func(self):
    print 'My func'

# You could also use the types module, but the descriptor method is cleaner IMO
inst.func = func.__get__(weakref.ref(inst), type(inst))

Cleaning up an internal pysqlite connection on object destruction

I have an object with an internal database connection that's active throughout its lifetime. At the end of the program's run, the connection has to be committed and closed. So far I've used an explicit close method, but this is somewhat cumbersome, especially when exceptions can happen in the calling code.
I'm considering using the __del__ method for closing, but after some reading online I have concerns. Is this a valid usage pattern? Can I be sure that the internal resources will be freed in __del__ correctly?
This discussion raised a similar question but found no satisfactory answer. I don't want to have an explicit close method, and using with isn't an option, because my object isn't used as simply as open-play-close; it is kept as a member of another, larger object that uses it while running in a GUI.
C++ has perfectly working destructors where one can free resources safely, so I would imagine Python has something agreed-upon too. For some reason it seems not to be the case, and many in the community vow against __del__. What's the alternative, then?
Read up on the with statement. You're describing its use case.
You'll need to wrap your connection in a "Context Manager" class that handles the __enter__ and __exit__ methods used by the with statement.
See PEP 343 for more information.
Edit
"my object isn't used as simply as open-play-close, but is kept as a member of another, larger object"
class AnObjectWhichMustBeClosed(object):
    def __enter__(self):
        return self  # acquire

    def __exit__(self, type, value, traceback):
        pass  # release

    def open(self, dbConnectionInfo):
        pass  # open the connection, updating the state for __exit__ to handle

class ALargerObject(object):
    def __init__(self):
        pass

    def injectTheObjectThatMustBeClosed(self, anObject):
        self.useThis = anObject

class MyGuiApp(object):
    def run(self):
        # build GUI objects
        large = ALargerObject()
        with AnObjectWhichMustBeClosed() as x:
            large.injectTheObjectThatMustBeClosed(x)
            mainLoop()
Some folks call this "Dependency Injection" and "Inversion of Control". Other folks call this the Strategy pattern. The "ObjectThatMustBeClosed" is a strategy, plugged into some larger object. The assembly is created at a top-level of the GUI app, since that's usually where resources like databases are acquired.
You can make a connection module, since modules keep the same object in the whole application, and register a function to close it with the atexit module
# db.py:
import sqlite3
import atexit

con = None

def get_connection():
    global con
    if not con:
        con = sqlite3.connect('somedb.sqlite')
        atexit.register(close_connection, con)
    return con

def close_connection(some_con):
    some_con.commit()
    some_con.close()

# your_program.py
import db

con = db.get_connection()
cur = con.cursor()
cur.execute("SELECT ...")
This suggestion is based on the assumption that the connection in your application looks like a single instance (a singleton), which a module-level global provides well.
If that's not the case (you need multiple connections), then you can go for a destructor. However, destructors don't go well with garbage collectors and circular references: you must remove the circular reference yourself before the destructor is called. So if you use one, don't keep circular references around, or you'll have to break them yourself.
Also, what you said about C++ is wrong: if you use destructors in C++, they are called either when the block that defines the object finishes (like Python's with) or when you use the delete keyword (which deallocates an object created with new). Outside of that, you must use an explicit close() that is not the destructor. So it is just like Python - Python is even "better" because it has a garbage collector.
