I am doing something like this...
conn = sqlite3.connect(db_filename)
with conn:
    cur = conn.cursor()
    cur.execute( ... )
with automatically commits the changes. But the docs say nothing about closing the connection.
Actually I can use conn in later statements (which I have tested). Hence it seems that the context manager is not closing the connection.
Do I have to close the connection manually? What happens if I leave it open?
EDIT
My findings:
The connection is not closed by the context manager; I have tested and confirmed it. Upon __exit__, the context manager only ends the transaction: it calls conn.commit() on success, or conn.rollback() if an exception was raised
with conn and with sqlite3.connect(db_filename) as conn are equivalent, so using either will still keep the connection alive
The with statement does not create a new scope, hence all the variables created inside the suite of with will be accessible outside it
Finally, you should close the connection manually
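A minimal sketch of these findings, assuming an in-memory database and a throwaway table named t (both chosen only for illustration): the connection stays usable after the with block and only stops working after an explicit close().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
with conn:  # commits on success, rolls back on exception
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")

# The connection is still open and usable after the with block.
rows_after_with = conn.execute("SELECT x FROM t").fetchall()

conn.close()  # explicit close
try:
    conn.execute("SELECT 1")
    usable_after_close = True
except sqlite3.ProgrammingError:
    # sqlite3 refuses to operate on a closed connection
    usable_after_close = False
```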
In answer to the specific question of what happens if you do not close a SQLite database, the answer is quite simple and applies to using SQLite in any programming language. When the connection is closed explicitly by code or implicitly by program exit then any outstanding transaction is rolled back. (The rollback is actually done by the next program to open the database.) If there is no outstanding transaction open then nothing happens.
This means you do not need to worry too much about always closing the database before process exit, and that you should pay attention to transactions making sure to start them and commit at appropriate points.
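To see this for yourself, here is a small sketch, assuming a scratch database file in a temporary directory and a hypothetical table t. The insert is never committed, so it should be gone when the file is reopened.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")  # scratch file for the demo

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()
conn.execute("INSERT INTO t VALUES (1)")  # starts a transaction implicitly
conn.close()  # closed without commit, so the pending insert is discarded

conn = sqlite3.connect(path)
count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
conn.close()
```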
You have a valid underlying concern here, however it's also important to understand how sqlite operates too:
1. connection open
2. transaction started
3. statement executes
4. transaction done
5. connection closed
in terms of data correctness, you only need to worry about transactions and not open handles. sqlite only holds a lock on a database inside a transaction(*) or statement execution.
however in terms of resource management, e.g. if you plan to remove sqlite file or use so many connections you might run out of file descriptors, you do care about open out-of-transaction connections too.
there are two ways a connection is closed: either you call .close() explicitly after which you still have a handle but can't use it, or you let the connection go out of scope and get garbage-collected.
if you must close a connection, close it explicitly, according to Python's motto "explicit is better than implicit."
if you are only checking code for side-effects, letting a last variable holding reference to connection go out of scope may be acceptable, but keep in mind that exceptions capture the stack, and thus references in that stack. if you pass exceptions around, connection lifetime may be extended arbitrarily.
caveat programmator, sqlite uses "deferred" transactions by default, that is the transaction only starts when you execute a statement. In the example above, transaction runs from 3 to 4, rather than from 2 to 4.
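As a rough illustration of when a transaction is actually open, Python's sqlite3 module exposes conn.in_transaction. This sketch assumes the module's default (legacy) transaction handling and an in-memory database with a made-up table t.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")

before = conn.in_transaction   # connection open, no transaction yet
conn.execute("INSERT INTO t VALUES (1)")
during = conn.in_transaction   # the INSERT opened a transaction
conn.commit()
after = conn.in_transaction    # transaction done, connection still open

conn.close()
```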
This is the code that I use. The Connection and the Cursor will automatically close thanks to contextlib.closing(). The Connection will automatically commit thanks to the context manager.
import sqlite3
import contextlib

def execute_statement(statement):
    with contextlib.closing(sqlite3.connect(path_to_file)) as conn:  # auto-closes
        with conn:  # auto-commits
            with contextlib.closing(conn.cursor()) as cursor:  # auto-closes
                cursor.execute(statement)
You can use a with block like this:
from contextlib import closing
import sqlite3

def query(self, db_name, sql):
    with closing(sqlite3.connect(db_name)) as con, con, \
            closing(con.cursor()) as cur:
        cur.execute(sql)
        return cur.fetchall()
connects
starts a transaction
creates a db cursor
performs the operation and returns the results
closes the cursor
commits/rolls-back the transaction
closes the connection
all safe in both happy and exceptional cases
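The happy and exceptional paths can be checked with a sketch like the one below. The helper name run, the fail flag, the scratch file and the table t are all assumptions made up for the demonstration; only the with line mirrors the pattern above.

```python
import contextlib
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")  # scratch file for the demo

def run(sql, fail=False):
    # connect, enter the transaction context, create a cursor
    with contextlib.closing(sqlite3.connect(path)) as con, con, \
            contextlib.closing(con.cursor()) as cur:
        cur.execute(sql)
        if fail:
            raise RuntimeError("simulated failure")  # triggers the rollback path
        return cur.fetchall()

run("CREATE TABLE t (x INTEGER)")
run("INSERT INTO t VALUES (1)")           # committed on exit
try:
    run("INSERT INTO t VALUES (2)", fail=True)  # rolled back on exit
except RuntimeError:
    pass
rows = run("SELECT x FROM t ORDER BY x")
```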
Your version leaves conn in scope after connection usage.
EXAMPLE:
your version
conn = sqlite3.connect(db_filename)  # DECLARE CONNECTION OUT OF WITH BLOCK
with conn:  # USE CONNECTION IN WITH BLOCK
    cur = conn.cursor()
    cur.execute( ... )
# conn variable is still in scope, so you can use it again
new version
with sqlite3.connect(db_filename) as conn:  # DECLARE CONNECTION AT START OF WITH BLOCK
    cur = conn.cursor()
    cur.execute( ... )
# conn is still bound here too (a with block does not create a scope), and as
# Avaris said the connection is NOT closed: sqlite3's context manager only
# commits or rolls back, so call conn.close() yourself.
For managing a connection to a database I usually do this,
# query method belonging to a DB manager class
def query(self, sql):
    con = sqlite3.connect(self.dbName)
    with con:
        cur = con.cursor()
        cur.execute(sql)
        res = cur.fetchall()
    if con:
        con.close()
    return res
doing so, I'm sure that the connection is explicitly closed.
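One caveat: if cur.execute() raises, the close() call after the with block is skipped. A possible variant, sketched here with a hypothetical DBManager class standing in for the manager class mentioned above, puts the close in a finally clause so it runs on both paths.

```python
import sqlite3

class DBManager:  # hypothetical name for the DB manager class
    def __init__(self, db_name):
        self.dbName = db_name

    def query(self, sql):
        con = sqlite3.connect(self.dbName)
        try:
            with con:  # commit on success, rollback on exception
                return con.execute(sql).fetchall()
        finally:
            con.close()  # runs whether or not the query raised

manager = DBManager(":memory:")
result = manager.query("SELECT 1")
```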
I want to create a Database class which can create cursors on demand.
It must be possible to use the cursors in parallel (two or more cursor can coexist) and, since we can only have one cursor per connection, the Database class must handle multiple connections.
For performance reasons we want to reuse connections as much as possible and avoid creating a new connection every time a cursor is created:
whenever a request is made the class will try to find, among the opened connections, the first non-busy connection and use it.
A connection is still busy as long as the cursor has not been consumed.
Here is an example of such class:
class Database:
    ...
    def get_cursor(self, query):
        selected_connection = None
        # Find usable connection
        for con in self.connections:
            if not con.is_busy():  # <--- This is not PEP 249
                selected_connection = con
                break
        # If all connections are busy, create a new one
        if selected_connection is None:
            selected_connection = self._new_connection()
            self.connections.append(selected_connection)
        # Return cursor on query
        cur = selected_connection.cursor()
        cur.execute(query)
        return cur
However, looking at the PEP 249 standard, I cannot find any way to check whether a connection is actually being used or not.
Some implementations, such as MySQL Connector, offer ways to check whether a connection still has unread content; however, as far as I know, those are not part of PEP 249.
Is there a way I can achieve what is described above for any PEP 249 compliant Python database API?
Perhaps you could use the status of the cursor to tell you if a cursor is being used. Let's say you had the following cursor:
new_cursor = new_connection.cursor()
new_cursor.execute(new_query)
and you wanted to see if that connection was available for another cursor to use. You might be able to do something like:
if new_cursor.rowcount == -1:
    another_new_cursor = new_connection.cursor()
    ...
Of course, all this really tells you is that the cursor hasn't executed anything yet since the last time it was closed. It could point to a cursor that is done (and therefore a connection that has been closed) or it could point to a cursor that has just been created or attached to a connection. Another option is to use a try/except block, something along the lines of:
try:
    another_new_cursor = new_connection.cursor()
except Exception:  # not actually sure which error would go here, but you get the idea
    print("this connection is busy.")
Of course, you probably don't want to be spammed with printed messages but you can do whatever you want in that except block, sleep for 5 seconds, wait for some other variable to be passed, wait for user input, etc. If you are restricted to PEP 249, you are going to have to do a lot of things from scratch. Is there a reason you can't use external libraries?
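If you do stay within PEP 249, one way to do it "from scratch" is to track busy connections yourself instead of asking the driver. The following is only a sketch under that assumption: TrackedCursor, the _busy set and the connect factory argument are invented names, and sqlite3 is used purely as a stand-in DB-API driver. A connection counts as busy from the moment a cursor is handed out until that cursor is closed.

```python
import sqlite3

class TrackedCursor:
    """Wraps a DB-API cursor and frees its connection when closed."""
    def __init__(self, pool, conn, cursor):
        self._pool, self._conn, self._cursor = pool, conn, cursor

    def __getattr__(self, name):
        # Delegate fetchone(), fetchall(), rowcount, ... to the real cursor.
        return getattr(self._cursor, name)

    def close(self):
        self._cursor.close()
        self._pool._busy.discard(id(self._conn))  # connection is reusable again

class Database:
    def __init__(self, connect):
        self._connect = connect   # zero-argument factory returning a connection
        self.connections = []
        self._busy = set()        # ids of connections with an open cursor

    def get_cursor(self, query):
        # Reuse the first non-busy connection, or open a new one.
        conn = next((c for c in self.connections if id(c) not in self._busy), None)
        if conn is None:
            conn = self._connect()
            self.connections.append(conn)
        self._busy.add(id(conn))
        cur = conn.cursor()
        cur.execute(query)
        return TrackedCursor(self, conn, cur)

db = Database(lambda: sqlite3.connect(":memory:"))
c1 = db.get_cursor("SELECT 1")
c2 = db.get_cursor("SELECT 2")   # c1 not closed yet, so a second connection is opened
c1.close()
c3 = db.get_cursor("SELECT 3")   # reuses the connection freed by c1
```

The catch is that callers must remember to close their cursors, otherwise the pool just keeps growing.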
EDIT: If you are willing to move outside of PEP 249, here is something that might work, but it may not be suitable for your purposes. If you make use of the mysql python library, you can take advantage of the is_connected method.
new_connection = mysql.connector.connect(host='myhost',
                                         database='myDB',
                                         user='me',
                                         password='myPassword')
# ...stuff happens...
if new_connection.is_connected():
    pass
else:
    another_new_cursor = new_connection.cursor()
    ...
I have been using always the command cur.close() once I'm done with the database:
import sqlite3
conn = sqlite3.connect('mydb')
cur = conn.cursor()
# whatever actions in the database
cur.close()
However, I just saw in some cases the following approach:
import sqlite3
conn = sqlite3.connect('mydb')
cur = conn.cursor()
# whatever actions in the database
cur.close()
conn.close()
And in the official documentation sometimes the cursor is closed, sometimes the connection and sometimes both.
My questions are:
Is there any difference between cur.close() and conn.close()?
Is it enough to close one, once I am done (or I must close both)? If so, which one is preferable?
[On closing cursors]
If you close the cursor, you are simply flagging it as invalid to process further requests ("I am done with this").
So, in the end of a function/transaction, you should keep closing the cursor, giving hint to the database that that transaction is finished.
A good pattern is to make cursors short-lived: you get one from the connection object, do what you need, and then discard it. So closing makes sense, and you should keep using cursor.close() at the end of the code section that makes use of it.
I believe (couldn't find any references) that if you just let the cursor fall out of scope (end of function, or simply del cursor) you should get the same behavior. But for the sake of good coding practices you should explicitly close it.
[Connection Objects]
When you are actually done with the database, you should close your connection to it. that means connection.close()
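The difference is easy to observe with sqlite3. A quick sketch, assuming an in-memory database: closing a cursor only invalidates that cursor, and the connection can still hand out new ones until it is closed itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("SELECT 1")
cur.close()  # "I am done with this cursor"

try:
    cur.execute("SELECT 1")
    cursor_reusable = True
except sqlite3.ProgrammingError:
    cursor_reusable = False  # a closed cursor cannot run further statements

# The connection itself is unaffected and can create another cursor.
other = conn.cursor()
value = other.execute("SELECT 2").fetchone()[0]
other.close()

conn.close()  # now we are done with the database
```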
I'm using PostgreSQL 9.3, and SQLAlchemy 1.0.11
I have code that looks like this:
import sqlalchemy as sa
engine = sa.create_engine('postgresql+psycopg2://me@myhost/mydb')
conn = engine.connect()
metadata = sa.MetaData()
# Real table has more columns
mytable = sa.Table(
    'my_temp_table', metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('something', sa.String(200)),
    prefixes=['TEMPORARY'],
)
metadata.create_all(engine)
pg_conn = engine.raw_connection()
with pg_conn.cursor() as cursor:
    cursor.copy_expert('''COPY my_temp_table (id, something)
                          FROM STDIN WITH CSV''',
                       open('somecsvfile', 'r'))
Now this works just fine - cursor.rowcount reports the expected number of rows inserted. I can even run cursor.execute('SELECT count(*) FROM my_temp_table'); print(cursor.fetchone()) and it will display the same number. The problem is when I try to run a query from SQLAlchemy's connection, e.g.
result = conn.execute(sa.text('SELECT count(*) FROM my_temp_table'))
It doesn't matter where I put that. I've tried several places:
inside the with block
outside the with block
after a cursor.close()
after a pg_conn.close()
Nothing seems to work - no matter where I run the query from, it barfs with:
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) relation "my_temp_table" does not exist
The funny thing is that if I wrap that code in a try/except then I can do cursor.execute(...) in the except block successfully.
Actually, now that I'm writing this out, it appears that using the sqlalchemy connection anywhere fails to see that those tables exist.
So what gives? Why doesn't my SQLAlchemy connection see these tables, but the postgres (engine.raw_connection()) does?
Edit:
To further the mystery - if I create the connection after the metadata.create_all(engine), it works! Well, sort of.
I can select from the tables, but then when I get the engine.raw_connection() it fails on .copy_expert because it can't find the table.
The first thing to note is that temporary tables are only visible to the connection which created them.
The second is that an Engine doesn't encapsulate a single connection; it manages a connection pool.
Finally, the documentation points out that operations performed directly on an Engine (engine.execute("select ...") in their example) will internally acquire and release their own connections.
With all of this in mind, it's clear what's going on in your example:
conn = engine.connect() acquires Connection #1 from the pool.
metadata.create_all(engine) implicitly acquires Connection #2 (as #1 is still "in use" from the engine's perspective), uses it to create the table, and releases it back to the pool.
pg_conn = engine.raw_connection() acquires #2 again, so the COPY executed via this object can still see the table.
conn is still using #1, and nothing you do via this object will be able to see your temp table.
In your second case:
metadata.create_all(engine) implicitly acquires/uses/releases Connection #1.
conn = engine.connect() acquires #1 and holds it.
pg_conn = engine.raw_connection() acquires #2, and the COPY fails to find the temp table.
The moral of the story: if you're doing something which relies on the connection state, you'd better be sure which connection you're using. Running commands directly on the engine is fine for standalone operations, but for anything involving temp tables, you should acquire one connection and stick with it through every step (including the table creation, which I suggest you change to metadata.create_all(conn)).
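The per-connection visibility of temp tables is not specific to PostgreSQL. As a stand-in that runs without a server, here is a sketch using two plain sqlite3 connections to the same scratch file (the file path is an assumption for the demo; SQLite's temporary tables are likewise private to the connection that created them).

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")  # scratch file for the demo

conn1 = sqlite3.connect(path)
conn2 = sqlite3.connect(path)  # a second, independent connection

conn1.execute("CREATE TEMPORARY TABLE my_temp_table (id INTEGER)")
conn1.execute("INSERT INTO my_temp_table VALUES (1)")
seen_by_conn1 = conn1.execute("SELECT count(*) FROM my_temp_table").fetchone()[0]

try:
    conn2.execute("SELECT count(*) FROM my_temp_table")
    seen_by_conn2 = True
except sqlite3.OperationalError:
    seen_by_conn2 = False  # the other connection cannot see the temp table

conn1.close()
conn2.close()
```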
Well, this doesn't answer the why, but it is how to accomplish what I want.
Rather than:
pg_conn = engine.raw_connection()
with pg_conn.cursor() as cursor:
Just replace it with:
with conn.connection.cursor() as cursor:
The SQLAlchemy connection object exposes its underlying DBAPI connection via the .connection property. And whatever magic involved there does the right thing.
MySQLdb Connections have a rudimentary context manager that creates a cursor on enter, either rolls back or commits on exit, and implicitly doesn't suppress exceptions. From the Connection source:
def __enter__(self):
    if self.get_autocommit():
        self.query("BEGIN")
    return self.cursor()

def __exit__(self, exc, value, tb):
    if exc:
        self.rollback()
    else:
        self.commit()
So, does anyone know why the cursor isn't closed on exit?
At first, I assumed it was because closing the cursor didn't do anything and that cursors only had a close method in deference to the Python DB API (see the comments to this answer). However, the fact is that closing the cursor burns through the remaining results sets, if any, and disables the cursor. From the cursor source:
def close(self):
    """Close the cursor. No further queries will be possible."""
    if not self.connection: return
    while self.nextset(): pass
    self.connection = None
It would be so easy to close the cursor at exit, so I have to suppose that it hasn't been done on purpose. On the other hand, we can see that when a cursor is deleted, it is closed anyway, so I guess the garbage collector will eventually get around to it. I don't know much about garbage collection in Python.
def __del__(self):
    self.close()
    self.errorhandler = None
    self._result = None
Another guess is that there may be a situation where you want to re-use the cursor after the with block. But I can't think of any reason why you would need to do this. Can't you always finish using the cursor inside its context, and just use a separate context for the next transaction?
To be very clear, this example obviously doesn't make sense:
with conn as cursor:
    cursor.execute(select_stmt)

rows = cursor.fetchall()
It should be:
with conn as cursor:
    cursor.execute(select_stmt)
    rows = cursor.fetchall()
Nor does this example make sense:
# first transaction
with conn as cursor:
    cursor.execute(update_stmt_1)

# second transaction, reusing cursor
try:
    cursor.execute(update_stmt_2)
except:
    conn.rollback()
else:
    conn.commit()
It should just be:
# first transaction
with conn as cursor:
    cursor.execute(update_stmt_1)

# second transaction, new cursor
with conn as cursor:
    cursor.execute(update_stmt_2)
Again, what would be the harm in closing the cursor on exit, and what benefits are there to not closing it?
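For comparison, here is roughly what a context manager that does close the cursor could look like. This is a sketch, not MySQLdb code: the helper name transaction is made up, it relies only on the DB-API cursor(), commit(), rollback() and close() calls, and sqlite3 is used as a stand-in driver so the example runs without a MySQL server.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def transaction(conn):
    """Yield a cursor; commit or roll back on exit, then close the cursor."""
    cur = conn.cursor()
    try:
        yield cur
    except BaseException:
        conn.rollback()
        raise
    else:
        conn.commit()
    finally:
        cur.close()  # the step the MySQLdb context manager leaves out

conn = sqlite3.connect(":memory:")
with transaction(conn) as cursor:
    cursor.execute("CREATE TABLE t (x INTEGER)")
    cursor.execute("INSERT INTO t VALUES (1)")

try:
    with transaction(conn) as cursor:
        cursor.execute("INSERT INTO t VALUES (2)")
        raise RuntimeError("simulated failure")  # forces the rollback path
except RuntimeError:
    pass

with transaction(conn) as cursor:
    cursor.execute("SELECT count(*) FROM t")
    count = cursor.fetchone()[0]
```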
To answer your question directly: I cannot see any harm whatsoever in closing at the end of a with block. I cannot say why it is not done in this case. But, as there is a dearth of activity on this question, I had a search through the code history and will throw in a few thoughts (guesses) on why the close() may not be called:
There is a small chance that spinning through calls to nextset() may throw an exception - possibly this had been observed and seen as undesirable. This may be why the newer version of cursors.py contains this structure in close():
def close(self):
    """Close the cursor. No further queries will be possible."""
    if not self.connection:
        return

    self._flush()
    try:
        while self.nextset():
            pass
    except:
        pass
    self.connection = None
There is the (somewhat remote) potential that it might take some time to spin through all the remaining results doing nothing. Therefore close() may not be called to avoid doing some unnecessary iterations. Whether you think it's worth saving those clock cycles is subjective, I suppose, but you could argue along the lines of "if it's not necessary, don't do it".
Browsing the sourceforge commits, the functionality was added to the trunk by this commit in 2007 and it appears that this section of connections.py has not changed since. That's a merge based on this commit, which has the message
Add Python-2.5 support for with statement as described in http://docs.python.org/whatsnew/pep-343.html Please test
And the code you quote has never changed since.
This prompts my final thought - it's probably just a first attempt / prototype that just worked and therefore never got changed.
More modern version
You link to source for a legacy version of the connector. I note there is a more active fork of the same library here, which I link to in my comments about "newer version" in point 1.
Note that the more recent version of this module has implemented __enter__() and __exit__() within cursor itself: see here. __exit__() here does call self.close() and perhaps this provides a more standard way to use the with syntax e.g.
with conn.cursor() as c:
    # Do your thing with the cursor
End notes
N.B. I guess I should add, as far as I understand garbage collection (not an expert either) once there are no references to conn, it will be deallocated. At this point there will be no references to the cursor object and it will be deallocated too.
However, calling cursor.close() does not mean that it will be garbage collected. It simply burns through the results and sets the connection to None. This means it can't be re-used, but it won't be garbage collected immediately. You can convince yourself of that by manually calling cursor.close() after your with block and then, say, printing some attribute of cursor.
N.B. 2 I think this is a somewhat unusual use of the with syntax as the conn object persists because it is already in the outer scope - unlike, say, the more common with open('filename') as f: where there are no objects hanging around with references after the end of the with block.
I'm using Psycopg2 in Python to access a PostgreSQL database. I'm curious if it's safe to use the with closing() pattern to create and use a cursor, or if I should use an explicit try/except wrapped around the query. My question is concerning inserting or updating, and transactions.
As I understand it, all Psycopg2 queries occur within a transaction, and it's up to calling code to commit or rollback the transaction. If within a with closing(... block an error occurs, is a rollback issued? In older versions of Psycopg2, a rollback was explicitly issued on close() but this is not the case anymore (see http://initd.org/psycopg/docs/connection.html#connection.close).
My question might make more sense with an example. Here's an example using with closing(...
with closing(db.cursor()) as cursor:
    cursor.execute("""UPDATE users
                      SET password = %s, salt = %s
                      WHERE user_id = %s""",
                   (pw_tuple[0], pw_tuple[1], user_id))
    module.raise_unexpected_error()
    db.commit()  # commit() lives on the connection, not the cursor
What happens when module.raise_unexpected_error() raises its error? Is the transaction rolled back? As I understand transactions, I either need to commit them or roll them back. So in this case, what happens?
Alternately I could write my query like this:
cursor = None
try:
    cursor = db.cursor()
    cursor.execute("""UPDATE users
                      SET password = %s, salt = %s
                      WHERE user_id = %s""",
                   (pw_tuple[0], pw_tuple[1], user_id))
    module.raise_unexpected_error()
    db.commit()  # commit()/rollback() live on the connection, not the cursor
except BaseException:
    if cursor is not None:
        db.rollback()
finally:
    if cursor is not None:
        cursor.close()
Also I should mention that I have no idea if Psycopg2's connection class cursor() method could raise an error or not (the documentation doesn't say) so better safe than sorry, no?
Which method of issuing a query and managing a transaction should I use?
Your link to the Psycopg2 docs kind of explains it itself, no?
... Note that closing a connection without committing the changes first will
cause any pending change to be discarded as if a ROLLBACK was
performed (unless a different isolation level has been selected: see
set_isolation_level()).
Changed in version 2.2: previously an explicit ROLLBACK was issued by
Psycopg on close(). The command could have been sent to the backend at
an inappropriate time, so Psycopg currently relies on the backend to
implicitly discard uncommitted changes. Some middleware are known to
behave incorrectly though when the connection is closed during a
transaction (when status is STATUS_IN_TRANSACTION), e.g. PgBouncer
reports an unclean server and discards the connection. To avoid this
problem you can ensure to terminate the transaction with a
commit()/rollback() before closing.
So, unless you're using a different isolation level, or using PgBouncer, your first example should work fine. However, if you desire some finer-grained control over exactly what happens during a transaction, then the try/except method might be best, since it parallels the database transaction state itself.
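One thing worth keeping in mind is that closing() only closes the cursor; it does not end the transaction. The sketch below uses sqlite3 as a stand-in for Psycopg2 (so the placeholder style is ? rather than %s, and the users table is made up), and shows the update still pending after the block fails, until rollback() is called on the connection.

```python
import sqlite3
from contextlib import closing

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (user_id INTEGER, password TEXT)")
db.execute("INSERT INTO users VALUES (1, 'old')")
db.commit()

try:
    with closing(db.cursor()) as cursor:
        cursor.execute("UPDATE users SET password = ? WHERE user_id = ?",
                       ("new", 1))
        raise RuntimeError("simulated unexpected error")
except RuntimeError:
    still_open = db.in_transaction  # closing() closed only the cursor
    db.rollback()                   # end the transaction explicitly

password = db.execute(
    "SELECT password FROM users WHERE user_id = 1").fetchone()[0]
db.close()
```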