python sqlite "BEGIN TRANSACTION" and "COMMIT" commands

If I want to start a transaction in my database through python do I have to execute the sql command 'BEGIN TRANSACTION' explicitly like this:
import sqlite3
conn = sqlite3.connect(db)
c = conn.cursor()
c.execute('BEGIN TRANSACTION;')
# ... some updates on the database ...
conn.commit()  # or c.execute('COMMIT'). Are these two expressions the same?
Is the database locked for change from other clients when I establish the connection or when I begin the transaction or neither?

Only transactions lock the database.
However, Python tries to be clever and automatically begins transactions:
By default, the sqlite3 module opens transactions implicitly before a Data Modification Language (DML) statement (i.e. INSERT/UPDATE/DELETE/REPLACE), and commits transactions implicitly before a non-DML, non-query statement (i.e. anything other than SELECT or the aforementioned).
So if you are within a transaction and issue a command like CREATE TABLE ..., VACUUM, PRAGMA, the sqlite3 module will commit implicitly before executing that command. There are two reasons for doing that. The first is that some of these commands don’t work within transactions. The other reason is that sqlite3 needs to keep track of the transaction state (if a transaction is active or not).
You can control which kind of BEGIN statements sqlite3 implicitly executes (or none at all) via the isolation_level parameter to the connect() call, or via the isolation_level property of connections.
If you want autocommit mode, then set isolation_level to None.
Otherwise leave it at its default, which will result in a plain “BEGIN” statement, or set it to one of SQLite’s supported isolation levels: “DEFERRED”, “IMMEDIATE” or “EXCLUSIVE”.
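For illustration, a minimal sketch of the difference between the two modes (assuming a table t with a column x already exists in test.db):
import sqlite3

# Default mode: the module issues a plain "BEGIN" before the first DML statement.
conn = sqlite3.connect('test.db')      # isolation_level defaults to ""
conn.execute('UPDATE t SET x = 1')     # implicit BEGIN happens here
conn.commit()                          # nothing is saved until you commit

# Autocommit mode: no implicit BEGIN, every statement takes effect at once,
# and BEGIN/COMMIT are entirely up to you.
auto = sqlite3.connect('test.db', isolation_level=None)
auto.execute('UPDATE t SET x = 2')     # committed immediately
auto.execute('BEGIN')                  # explicit transaction
auto.execute('UPDATE t SET x = 3')
auto.execute('COMMIT')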

From the Python docs:
When a database is accessed by multiple connections, and one of the processes modifies the database, the SQLite database is locked until that transaction is committed. The timeout parameter specifies how long the connection should wait for the lock to go away until raising an exception. The default for the timeout parameter is 5.0 (five seconds).
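For example (the filename is an assumption):
import sqlite3

# Wait up to 10 seconds for another connection's lock to clear before
# raising "sqlite3.OperationalError: database is locked".
conn = sqlite3.connect('test.db', timeout=10.0)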

If I want to start a transaction in my database through python do I have to execute the sql command 'BEGIN TRANSACTION' explicitly like this:
It depends: in auto-commit mode, yes; in manual commit mode, no before DML statements, but unfortunately yes before DDL or DQL statements, because manual commit mode is incorrectly implemented in the current version of the SQLite3 database driver (see below).
conn.commit() ## or c.execute('COMMIT'). Are these two expressions the same?
Yes.
Is the database locked for change from other clients when I establish the connection or when I begin the transaction or neither?
When you begin the transaction (cf. SQLite3 documentation).
For a better understanding of auto-commit and manual commit modes, read this answer.

Related

Postgres "CREATE TABLE AS (SELECT ...)" stuck

I am using Python with psycopg2 2.8.6 against PostgreSQL 11.6 (also tried 11.9).
When I run the query
CREATE TABLE tbl AS (SELECT (row_number() over())::integer "id", "col" FROM tbl2)
the code gets stuck (cursor.execute never returns). Killing the backend with pg_terminate_backend removes the query from the server, but the call still does not return. Yet in this case, the target table is created.
Nothing is locking the transaction. The inner SELECT query was tested on its own and works fine.
I tried analysing clues on the server and found out the following inside pg_stat_activity:
Transaction state is idle in transaction
wait_event_type is Client
wait_event is ClientRead
The same thing happens when I run the query from within an SQL editor (pgModeler), but in this case the session is stuck in the Idle state and the target table is created.
I am not sure what is wrong and how to proceed from here.
Thanks!
I am answering my own question here, to make it helpful for others.
The problem was solved by modifying tcp_keepalives_idle Postgres setting from default 2 hours to 5 minutes.
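If you cannot change the server configuration, the same keepalive behaviour can be requested from the client side through libpq's keepalive parameters; a sketch with psycopg2 (the connection details are assumptions):
import psycopg2

conn = psycopg2.connect(
    dbname='myDB', user='myUser', host='localhost', password='myPassword',
    keepalives=1,            # enable TCP keepalives for this connection
    keepalives_idle=300,     # seconds of inactivity before the first probe
    keepalives_interval=30,  # seconds between probes
    keepalives_count=3,      # lost probes before the connection is dropped
)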
The problem is not reproducible; you will have to investigate further. You should share more details about your database table, your Python code and the server OS.
You could also attach strace to the Python process and share the output, so we can see what actually happens during the query.
wait_event_type = Client: The server process is waiting for some activity on a socket from user applications, and that the server expects something to happen that is independent from its internal processes. wait_event will identify the specific wait point.
wait_event = ClientRead: A session that waits for ClientRead is done processing the last query and waits for the client to send the next request. The only way that such a session can block anything is if its state is idle in transaction. All locks are held until the transaction ends, and no locks are held once the transaction finishes.
Idle in transaction: The activity can be idle (i.e., waiting for a client command), idle in transaction (waiting for client inside a BEGIN block), or a command type name such as SELECT. Also, waiting is appended if the server process is presently waiting on a lock held by another session.
The problem could be related to:
Network problems
An uncommitted transaction somewhere else that has created the same table name.
Your transaction is not being committed.
You pointed out that it is not a commit problem because the SQL editor does the same, but in your question you specify that the editor successfully creates the table.
In pgModeler you see idle, which means the session is idle, not the query.
If the session is idle, the "query" column of pg_stat_activity shows the last executed statement in that session.
So this simply means all those sessions properly ended their transaction using a ROLLBACK statement.
If sessions remain in state idle in transaction for a longer time, that is always an application bug where the application is not ending the transaction.
You can do two things:
Set idle_in_transaction_session_timeout so that these transactions are automatically rolled back by the server after a while (see the sketch after this list). This will keep locks from being held indefinitely, but your application will receive an error.
Fix the application as shown below
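A sketch of the first option, set per session from psycopg2 (connection details are assumptions; the setting can also be made global in postgresql.conf):
import psycopg2

conn = psycopg2.connect("dbname='myDB' user='myUser' host='localhost' password='myPassword'")
cur = conn.cursor()
# Roll back any transaction in this session that stays idle for more than 5 minutes.
cur.execute("SET idle_in_transaction_session_timeout = '5min'")
conn.commit()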
.commit() solution
The only way that I found to reproduce the problem is to omit the commit action.
The psycopg2 module is Python DB API-compliant, so the auto-commit feature is off by default.
With this option set to False, you need to call connection.commit() to commit any pending transaction to the database.
Enable auto-commit
You can enable auto-commit as follows:
import psycopg2

connection = None
try:
    connection = psycopg2.connect("dbname='myDB' user='myUser' host='localhost' password='myPassword'")
    connection.autocommit = True   # every statement is committed immediately
except psycopg2.Error:
    print("Connection failed.")

if connection is not None:
    cursor = connection.cursor()
    try:
        # Column aliases are identifiers, so they take double quotes, not single quotes.
        cursor.execute("""CREATE TABLE tbl AS (SELECT (row_number() over())::integer "id", "col" FROM tbl2)""")
    except psycopg2.Error:
        print("Failed to create table.")
with statement
You can also use the with statement, which commits the transaction automatically when the block exits without an exception:
with connection, connection.cursor() as cursor:  # start a transaction and create a cursor
    cursor.execute("""CREATE TABLE tbl AS (SELECT (row_number() over())::integer "id", "col" FROM tbl2)""")
Traditional way
If you don't want to auto-commit the transaction, you need to commit it manually by calling .commit() after your execute:
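import psycopg2

connection = psycopg2.connect("dbname='myDB' user='myUser' host='localhost' password='myPassword'")
cursor = connection.cursor()
cursor.execute("""CREATE TABLE tbl AS (SELECT (row_number() over())::integer "id", "col" FROM tbl2)""")
connection.commit()   # without this, the session stays "idle in transaction"
connection.close()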
Just remove the ( ) around the SELECT...
https://www.postgresql.org/docs/11/sql-createtableas.html

When do commits happen with SQLAlchemy Core?

I've been trying to test this out, but haven't been able to come to a definitive answer. I'm using SQLAlchemy on top of MySQL and trying to prevent having threads that do a select, get a SHARED_READ lock on some table, and then hold on to it (preventing future DDL operations until it's released). This happens when queries aren't committed. I'm using SQLAlchemy Core, where as far as I could tell .execute() essentially works in autocommit mode, issuing a COMMIT after everything it runs unless explicitly told we're in a transaction. Nevertheless, in show processlist, I'm seeing sleeping threads that still have SHARED_READ locks on a table they once queried. What gives?
Judging from your post, you're operating in "non-transactional" mode, either using an SQLAlchemy Connection without an ongoing transaction, or the shorthand engine.execute(). In this mode of operation SQLAlchemy will detect INSERT, UPDATE, DELETE, and DDL statements and automatically issue a commit afterwards, but not for everything, such as SELECT statements. See "Understanding Autocommit". For selects of mutating stored procedures and the like that do require a commit, use
from sqlalchemy import text
conn.execute(text('SELECT ...').execution_options(autocommit=True))
You should also consider closing connections when the thread is done with them for the time being. Closing will call rollback() on the underlying DBAPI connection, which per PEP-0249 is (probably) always in transactional state. This will remove the transactional state and/or locks, and returns the connection to the connection pool. This way you shouldn't need to worry about selects not autocommitting.
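A minimal sketch of that pattern (SQLAlchemy 1.x style; the URL and table name are assumptions):
from sqlalchemy import create_engine, text

engine = create_engine('mysql://user:password@localhost/mydb')

# The with block closes the connection on exit, which rolls back the DBAPI
# transaction, releases any locks, and returns the connection to the pool.
with engine.connect() as conn:
    rows = conn.execute(text('SELECT * FROM some_table')).fetchall()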

SQLAlchemy long running script: User was holding a relation lock for too long

I have an SQLAlchemy session in a script. The script is running for a long time, and it only fetches data from the database, never updates or inserts.
I get quite a lot of errors like
sqlalchemy.exc.DBAPIError: (TransactionRollbackError) terminating connection due to conflict with recovery
DETAIL: User was holding a relation lock for too long.
The way I understand it, SQLAlchemy creates a transaction with the first select issued, and then reuses it. As my script may run for about an hour, it is very likely that a conflict comes up during the lifetime of that transaction.
To get rid of the error, I could use autocommit in the deprecated mode (without doing anything more), but this is explicitly discouraged by the documentation.
What is the right way to deal with the error? Can I use ORM queries without transactions at all?
I ended up closing the session after (almost) every select, like this:
session.query(Foo).all()
session.close()
Since I do not use autocommit, a new transaction is opened automatically by the next query.
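The same pattern can be written with contextlib.closing, so the session is closed (ending its transaction) even if the query raises; Session here stands in for your sessionmaker:
from contextlib import closing

with closing(Session()) as session:
    foos = session.query(Foo).all()
# session.close() has run here, ending the transaction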

python-mysqldb without transactions

I'm reading about how transactions work in Python's MySQLdb. In this tutorial, it says that:
In Python DB API, we do not call the BEGIN statement to start a transaction. A transaction is started when the cursor is created.
So the following line:
cur = con.cursor()
starts a transaction implicitly. It also says that:
We must end a transaction with either a commit() or a rollback() method.
Do I understand it correctly that MySQLdb always uses transactions and there's no way to turn this behavior off? Forcing the user to enclose all queries in transactions seems a bit strange. If so, is there any explanation why that is?
I'm not a huge expert in this, but I think the feature you're looking for here is autocommit.
This automatically commits your commands. Therefore you should be able to skip the 'BEGIN' statements.
Here's a page on it:
http://dev.mysql.com/doc/connector-python/en/connector-python-connectargs.html
You set this up when you create the MySQLdb connection:
conn=MySQLdb.connect(host='blah', autocommit=True)
You should then have a connection that doesn't worry about transactions.
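Note that not every MySQLdb release accepts autocommit as a connect() keyword; if yours rejects it, the same switch is available as a method on the connection (the credentials here are assumptions):
import MySQLdb

conn = MySQLdb.connect(host='blah', user='me', passwd='secret', db='mydb')
conn.autocommit(True)   # switch this session to autocommit mode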
Some storage engines don't support transactions, so if you use one of those you won't need to worry about this detail:
en.wikipedia.org/wiki/Comparison_of_MySQL_database_engines
However, such engines can run into issues if your insert/update fails halfway through!

caching issues in MySQL response with MySQLdb in Django

I use MySQL with the MySQLdb module in Python, in Django.
I'm running in autocommit mode in this case (and Django's transaction.is_managed() actually returns False).
I have several processes interacting with the database.
One process fetches all Task models with Task.objects.all()
Then another process adds a Task model (I can see it in a database management application).
If I call Task.objects.all() again from the first process, I don't see the new task. But if I call connection._commit() and then Task.objects.all(), I see the new Task.
My question is: Is there any caching involved at the connection level? And is this normal behaviour (it does not seem so to me)?
This certainly seems related to autocommit and table locking.
If MySQLdb implements the DB-API 2 spec, it will probably have each connection running as one single continuous transaction. When you say 'running in autocommit mode', do you mean MySQL itself, the MySQLdb module, or Django?
Not committing intermittently perfectly explains the behaviour you are getting:
i) a connection implemented as one single transaction in MySQLdb (probably the default);
ii) connections that are not opened and closed as needed, but (re)used as one or more persistent database connections (my guess; possibly inherited from Django's architecture);
iii) your selects ('reads') take a shared read lock on a table, which means other connections can still read it, but connections wanting to write can't proceed immediately, because the read lock prevents them from getting the exclusive lock needed for writing. The write is thus postponed indefinitely, until the lock is released when you close the connection or commit manually.
I'd do the following in your case:
find out which locks are held on your tables during the scenario above
read about Django and transactions here. A quick skim suggests that using standard Django functionality implicitly causes commits, while sending handcrafted SQL (insert, update, ...) may not. A diagnostic sketch follows below.
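As a quick check grounded in your own observation: end the first process's current transaction before re-querying. _commit() is private API, so treat this as a diagnostic sketch rather than a fix:
from django.db import connection

connection._commit()              # end the open transaction (private API)
tasks = list(Task.objects.all())  # the fresh query now sees the new row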
