Context
So I am trying to figure out how to properly override the auto-transaction when using SQLite in Python. When I try to run
cursor.execute("BEGIN;")
.....an assortment of insert statements...
cursor.execute("END;")
I get the following error:
OperationalError: cannot commit - no transaction is active
Which I understand is because SQLite in Python automatically opens a transaction on each modifying statement, which in this case is an INSERT.
Question:
I am trying to speed up my insertions by doing one transaction per several thousand records.
How can I overcome the automatic opening of transactions?
As @CL. said, you have to set the isolation level to None. Code example:

import sqlite3

s = sqlite3.connect("./data.db")
s.isolation_level = None
try:
    c = s.cursor()
    c.execute("begin")
    ...
    c.execute("commit")
except:
    c.execute("rollback")
The documentation says:
You can control which kind of BEGIN statements sqlite3 implicitly executes (or none at all) via the isolation_level parameter to the connect() call, or via the isolation_level property of connections.
If you want autocommit mode, then set isolation_level to None.
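Putting this together with the original goal (one transaction per several thousand records), here is a minimal sketch; the items table and the sample data are just hypothetical placeholders:

import sqlite3

conn = sqlite3.connect("./data.db")
conn.isolation_level = None   # no implicit BEGIN; we manage transactions ourselves
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS items (a TEXT, b INTEGER)")   # hypothetical table

rows = [("row%d" % i, i) for i in range(10000)]   # stand-in data

BATCH = 5000   # one transaction per several thousand records
cur.execute("begin")
for i, row in enumerate(rows, 1):
    cur.execute("INSERT INTO items (a, b) VALUES (?, ?)", row)
    if i % BATCH == 0:
        cur.execute("commit")
        cur.execute("begin")
cur.execute("commit")
conn.close()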
Related
Sometimes I have a need to execute a query from psycopg2 that is not in a transaction block.
For example:
cursor.execute('create index concurrently on my_table (some_column)')
Doesn't work:
InternalError: CREATE INDEX CONCURRENTLY cannot run inside a transaction block
I don't see any easy way to do this with psycopg2. What am I missing?
I can probably call os.system('psql -c "create index concurrently"') or something similar to get it to run from my Python code; however, it would be much nicer to do it inside Python and not rely on psql actually being in the container.
Yes, I have to use the concurrently option for this particular use case.
Another time I've explored this and not found an obvious answer is when I have a set of SQL commands that I'd like to call with a single execute(), where the first one briefly locks a resource. When I do this, that resource remains locked for the entire duration of the execute() rather than just while the first statement in the SQL string is running, because they all run together in one big happy transaction.
In that case I could break the query up into a series of execute() statements - each becomes its own transaction, which was OK.
It seems like there should be a way, but I seem to be missing it. Hopefully this is an easy answer for someone.
EDIT: Add code sample:
#!/usr/bin/env python3.10
import psycopg2 as pg2

# Set the standard psql environment variables to specify which database this should connect to.
# We have to set these to None explicitly to get psycopg2 to use the env variables.
connDetails = {'database': None, 'host': None, 'port': None, 'user': None, 'password': None}

with (pg2.connect(**connDetails) as conn, conn.cursor() as curs):
    conn.set_session(autocommit=True)
    curs.execute("""
        create index concurrently if not exists my_new_index on my_table (my_column);
    """)
Throws:
psycopg2.errors.ActiveSqlTransaction: CREATE INDEX CONCURRENTLY cannot run inside a transaction block
Per psycopg2 documentation:
It is possible to set the connection in autocommit mode: this way all the commands executed will be immediately committed and no rollback is possible. A few commands (e.g. CREATE DATABASE, VACUUM, CALL on stored procedures using transaction control…) require to be run outside any transaction: in order to be able to run these commands from Psycopg, the connection must be in autocommit mode: you can use the autocommit property.
Hence on the connection:
conn.set_session(autocommit=True)
Further resources from psycopg2 documentation:
transactions-control
connection.autocommit
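For reference, a minimal sketch of a working version, reusing the placeholder connection details and the hypothetical my_new_index / my_table names from the question; autocommit is switched on before any statement is sent:

import psycopg2 as pg2

connDetails = {'database': None, 'host': None, 'port': None, 'user': None, 'password': None}

conn = pg2.connect(**connDetails)
conn.autocommit = True   # same effect as conn.set_session(autocommit=True)
try:
    with conn.cursor() as curs:
        curs.execute("create index concurrently if not exists my_new_index on my_table (my_column);")
finally:
    conn.close()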
I don't understand how to test my repositories.
I want to be sure that I really saved the object with all of its parameters into the database, and that when I execute my SQL statement I really get back what I expect.
But I cannot put "CREATE TABLE test_table" in the setUp method of a unittest case because it will be created multiple times (tests of the same test case are run in parallel). So, as soon as I create two methods in the same class which need to work on the same table, it won't work (the table names clash).
Likewise, I cannot put "CREATE TABLE test_table" in setUpModule, because now the table is created once, but since tests are run in parallel, nothing prevents the same object from being inserted multiple times into my table, which breaks the uniqueness constraint of some field.
Likewise, I cannot "CREATE SCHEMA some_random_schema_name" in every method, because I need to globally "SET search_path TO ..." for a given database, so every method run in parallel will be affected.
The only way I see is to "CREATE DATABASE" for each test, with a unique name, and establish an individual connection to each database. This looks extremely wasteful. Is there a better way?
Also, I cannot use SQLite in memory because I need to test PostgreSQL.
The best solution for this is to use the testing.postgresql module. This fires up a db in user-space, then deletes it again at the end of the run. You can put the following in a unittest suite - either in setUp, setUpClass or setUpModule - depending on what persistence you want:
import testing.postgresql

def setUp(self):
    self.postgresql = testing.postgresql.Postgresql(port=7654)
    # Get the url to connect to with psycopg2 or equivalent
    print(self.postgresql.url())

def tearDown(self):
    self.postgresql.stop()
If you want the database to persist between/after tests, you can run it with the base_dir option to set a directory - which will prevent its removal after shutdown:
name = "testdb"
port = "5678"
path = "/tmp/my_test_db"
testing.postgresql.Postgresql(name=name, port=port, base_dir=path)
Outside of testing it can also be used as a context manager, where it will automatically clean up and shut down when the with block is exited:
with testing.postgresql.Postgresql(port=7654) as psql:
    # do something here
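For completeness, a minimal sketch of using it together with psycopg2; the table and column names here are just hypothetical examples:

import psycopg2
import testing.postgresql

with testing.postgresql.Postgresql() as postgresql:
    conn = psycopg2.connect(postgresql.url())   # url() returns a postgresql:// connection string
    with conn.cursor() as curs:
        curs.execute("CREATE TABLE test_table (id serial PRIMARY KEY, name text UNIQUE)")
        curs.execute("INSERT INTO test_table (name) VALUES (%s)", ("example",))
    conn.commit()
    conn.close()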
I have a problem with a simple UPDATE statement. I wrote a Python tool which creates a lot of UPDATE statements, and after creating them I want to execute them on my Access database, but it doesn't work. This is one statement, for example:
UPDATE FCL_B_COVERSHEET_A SET BRANCH = 0 WHERE OBJ_ID = '1220140910132011062005';
The statement syntax is not the problem. I tested it and it works.
This next code snippet shows the initialization of the connection object.
import pyodbc

strInputPathMDB = "C:\\Test.mdb"
DRV = '{Microsoft Access Driver (*.mdb)}'
con = pyodbc.connect('Driver={0};Dbq={1};Uid={2};Pwd={3};'.format(DRV, strInputPathMDB, "administrator", ""))
After that I wrote a method which executes one SQL statement:
def executeSQLStatement(conConnection, strSQL):
    arcpy.AddMessage(strSQL)
    cursor = conConnection.cursor()
    cursor.execute(strSQL)
    conConnection.commit()
and if I execute this code, everything seems to work - no error message or anything like that - but the data is not updated, and I don't know what I'm doing wrong ...
for strSQL in sqlStateArray:
    executeSQLStatement(con, strSQL)
con.close()
I hope you understand what my problem is. Thanks for your help.
Chris
The issue here was that the .mdb file was in the root folder of the C: drive. Root folders often restrict normal users to read-only access so the database file was being opened as read-only. Moving the .mdb file to a public folder solved the problem.
I'm trying to use an SQLite insert operation in a Python script. It works when I execute it manually on the command line, but when I access it on the web it won't insert anything into the database. Here is my function:
import sqlite3

def insertdb(unique_id, number_of_days):
    conn = sqlite3.connect('database.db')
    print "Opened database successfully"
    conn.execute("INSERT INTO IDENT (ID_NUM,DAYS_LEFT) VALUES (?,?)", (unique_id, number_of_days))
    conn.commit()
    print "Records created successfully"
    conn.close()
When it is executed on the web, it only shows the output "Opened database successfully" but does not seem to insert the value into the database. What am I missing? Is this a server configuration issue? I have checked the database permissions on writing and they are correctly set.
The problem is almost certainly that you're trying to create or open a database named database.db in whatever happens to be the current working directory, and one of the following is true:
The database exists and you don't have permission to write to it. So, everything works until you try to do something that requires write access (like committing an INSERT).
The database exists, and you have permission to write to it, but you don't have permission to create new files in the directory. So, everything works until sqlite needs to create a temporary file (which it almost always will for executing an INSERT).
Meanwhile, you don't mention what web server/container/etc. you're using, but apparently you have it configured to just swallow all errors silently, which is a really, really bad idea for any debugging. Configure it to report the errors in some way. Otherwise, you will never figure out what's going on with anything that goes wrong.
If you don't have control over the server configuration, you can at least wrap all your code in a try/except and manually log exceptions to some file you have write access to (ideally via the logging module, or just open and write if worst comes to worst).
Or, you can just do that with dumb print statements, as you're already doing:
def insertdb(unique_id, number_of_days):
    conn = sqlite3.connect('database.db')
    print "Opened database successfully"
    try:
        conn.execute("INSERT INTO IDENT (ID_NUM,DAYS_LEFT) VALUES (?,?)", (unique_id, number_of_days))
        conn.commit()
        print "Records created successfully"
    except Exception as e:
        print e  # or, better, traceback.print_exc()
    conn.close()
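And a minimal sketch of the logging variant mentioned above (the log file path is just a hypothetical location you have write access to):

import logging
import sqlite3

logging.basicConfig(filename='/tmp/insertdb.log', level=logging.DEBUG)

def insertdb(unique_id, number_of_days):
    conn = sqlite3.connect('database.db')
    try:
        conn.execute("INSERT INTO IDENT (ID_NUM,DAYS_LEFT) VALUES (?,?)",
                     (unique_id, number_of_days))
        conn.commit()
        logging.info("Records created successfully")
    except Exception:
        logging.exception("insert failed")   # writes the full traceback to the log file
    finally:
        conn.close()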
I'm looking for a way to debug queries as they are executed and I was wondering if there is a way to have MySQLdb print out the actual query that it runs, after it has finished inserting the parameters and all that? From the documentation, it seems as if there is supposed to be a Cursor.info() call that will give information about the last query run, but this does not exist on my version (1.2.2).
This seems like an obvious question, but for all my searching I haven't been able to find the answer.
We found an attribute on the cursor object called cursor._last_executed that holds the last query string to run even when an exception occurs. This was easier and better for us in production than using profiling all the time or MySQL query logging as both of those have a performance impact and involve more code or more correlating separate log files, etc.
Hate to answer my own question but this is working better for us.
You can print the last executed query with the cursor attribute _last_executed:
try:
    cursor.execute(sql, (arg1, arg2))
    connection.commit()
except:
    print(cursor._last_executed)
    raise
Currently, there is a discussion about how to get this as a real feature in pymysql (see pymysql issue #330: Add mogrify to Cursor, which returns the exact string to be executed; pymysql should be used instead of MySQLdb).
edit: I haven't tested this yet, but this commit indicates that the following code might work:
cursor.mogrify(sql, (arg1, arg2))
For me (for now, at least), _last_executed doesn't work anymore. In the current version you want to access
cursor.statement
instead. See: https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-statement.html
For mysql.connector:
cursor.statement
https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-statement.html
cursor.statement and cursor._last_executed both raised an AttributeError for me;
cursor._executed
worked instead!
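Since the attribute name evidently varies between drivers and versions, a small helper that simply tries the names mentioned in these answers can save some guesswork (a hedged convenience, not part of any driver's public API):

def last_query(cursor):
    # try the attribute names mentioned above; which one exists depends on the driver/version
    for attr in ("statement", "_last_executed", "_executed"):
        value = getattr(cursor, attr, None)
        if value is not None:
            return value
    return None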
One way to do it is to turn on profiling:
cursor.execute('set profiling = 1')
try:
    cursor.execute('SELECT * FROM blah where foo = %s', [11])
except Exception:
    cursor.execute('show profiles')
    for row in cursor:
        print(row)
cursor.execute('set profiling = 0')
yields
(1L, 0.000154, 'SELECT * FROM blah where foo = 11')
Notice the argument(s) were inserted into the query, and that the query was logged even though the query failed.
Another way is to start the server with logging turned on:
sudo invoke-rc.d mysql stop
sudo mysqld --log=/tmp/myquery.log
Then you have to sift through /tmp/myquery.log to find out what the server received.
I've had luck with cursor._last_executed generally speaking, but it doesn't work correctly when used with cursor.executemany(). That drops all but the last statement. Here's basically what I use now in that instance instead (based on tweaks from the actual MySQLDb cursor source):
def toSqlResolvedList(cursor, sql, dynamicValues):
    sqlList = []
    try:
        db = cursor._get_db()
        if isinstance(sql, unicode):
            sql = sql.encode(db.character_set_name())
        for values in dynamicValues:
            sqlList.append(sql % db.literal(values))
    except:
        pass
    return sqlList
This read-only property returns the last executed statement as a string. The statement property can be useful for debugging and displaying what was sent to the MySQL server.
The string can contain multiple statements if a multiple-statement string was executed. This occurs for execute() with multi=True. In this case, the statement property contains the entire statement string and the execute() call returns an iterator that can be used to process results from the individual statements. The statement property for this iterator shows statement strings for the individual statements.
str = cursor.statement
source: https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-statement.html
I can't say I've ever seen
Cursor.info()
in the documentation, and I can't find it after a few minutes of searching. Maybe you saw some old documentation?
In the mean time you can always turn on MySQL Query Logging and have a look at the server's log files.
Assume that your SQL is something like select * from table1 where 'name' = %s
from _mysql import escape
from MySQLdb.converters import conversions
actual_query = sql % tuple((escape(item, conversions) for item in parameters))