I have the following code that is using MySQLdb for db inserts
self.cursor.execute('START TRANSACTION;')
for item in data:
    self.cursor.execute('INSERT INTO...')
self.cursor.execute('COMMIT;')
self.conn.commit()
Is the self.conn.commit() at the end redundant, or does that need to be there?
If you start a transaction, you're responsible for calling COMMIT; otherwise it will be rolled back when you close the connection. One commit is enough, though: once the COMMIT statement has run, the following self.conn.commit() has nothing left to commit.
As a side note, it's bad form to include ; in your queries unless you're using an interactive shell. The semicolons are not necessary and immediately raise questions about how they came to be included.
The ; delimiter is used by the shell to determine where one command stops and the next starts, something that's not necessary when using code where each statement is supplied as a separate string.
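As a rough sketch of the same batch insert with a single commit and no hand-written START TRANSACTION or semicolons: MySQLdb follows the DB-API and leaves autocommit off by default, so the inserts already run inside a transaction. The table name items, the helper name insert_all, and the use of sqlite3 as a stand-in driver are assumptions made so the example runs anywhere; MySQLdb would use %s placeholders instead of ?.

```python
import sqlite3


def insert_all(conn, rows):
    """Insert every row in one transaction; commit once, roll back on error."""
    cur = conn.cursor()
    try:
        # sqlite3 uses "?" placeholders; MySQLdb would use "%s" instead.
        cur.executemany("INSERT INTO items (name) VALUES (?)", rows)
        conn.commit()  # a single commit ends the transaction
    except Exception:
        conn.rollback()  # undo the partial batch before re-raising
        raise
    finally:
        cur.close()
```

Calling insert_all(conn, [("a",), ("b",)]) either stores both rows or none of them.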
Related
I have a 'throwaway' sql statement that I would like to run. I don't care about the error status, and I don't need to know if it completed successfully. It is to create an index on a table that is very infrequently used. I currently have the connection and cursor object, and here is how I would normally do it:
self.cursor.execute('ALTER TABLE mytable ADD INDEX (_id)')
Easy enough. However, this statement takes about five minutes, and like I mentioned, it's not important enough to block other items that are unrelated to it. Is it possible to execute a cursor statement in the background? Again, I don't need any status or anything from it, and I don't care about 'closing the cursor/connection' or anything -- it really is a throw-away statement on a table that is probably accessed one to five times in its lifetime before being dropped.
threading.Thread(target=lambda tn, cursor: cursor.execute('ALTER TABLE %s ADD INDEX (_id)' % tn), args=('mytable', self.cursor)).start()
What would be the best approach to execute a statement in the background so that it doesn't block future SQL statements?
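One possible approach, sketched under assumptions: give the background thread its own connection instead of handing it the shared cursor, because MySQLdb connections are not meant to be shared between threads. The helper name run_in_background and the connect argument (a zero-argument callable that returns a new DB-API connection) are hypothetical and not part of MySQLdb.

```python
import logging
import threading


def run_in_background(connect, sql):
    """Run one statement on its own connection in a separate thread.

    `connect` is a hypothetical zero-argument callable that returns a new
    DB-API connection, e.g. lambda: MySQLdb.connect(...).
    """
    def worker():
        try:
            conn = connect()  # never reuse the caller's connection or cursor
            try:
                cur = conn.cursor()
                cur.execute(sql)
                conn.commit()
            finally:
                conn.close()
        except Exception:
            # Fire-and-forget: record the failure but do not propagate it.
            logging.exception("background statement failed: %s", sql)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread
```

The returned thread can simply be ignored; the main connection stays free for other statements while the index is built.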
I have a table, skills, which is presently empty despite my attempts to add rows to it. I have the following Python code in a CGI script:
open('/tmp/skills', 'a').write('Reached 1!\n')
if get_cgi('nous2dianoia'):
    open('/tmp/skills', 'a').write('Reached 2!\n')
    #if (get_cgi('previous') and get_cgi('name') and get_cgi('previous') !=
    #get_cgi('name')):
    #cursor.execute('DELETE FROM skills WHERE name = ?;',
    #(get_cgi('previous'),))
    cursor.execute('''INSERT INTO skills (name, nous2dianoia,
        hereandnow2escapist, nf2nt, social2individual, ithou2iit,
        slow2quick) VALUES (?, ?, ?, ?, ?, ?, ?);''',
        (get_cgi('name'), get_cgi('nous2dianoia'),
        get_cgi('hereandnow2escapist'), get_cgi('nf2nt'),
        get_cgi('social2individual'), get_cgi('ithou2iit'),
        get_cgi('slow2quick'),))
open('/tmp/skills', 'a').write('Reached 3!\n')
When I load a page, /tmp/skills has a freshly appended:
Reached 1!
Reached 2!
Reached 3!
However, the table remains empty. (The rest of the script runs without crashing, and displays what one would expect to be displayed if the script were called without any CGI variables passed.)
I haven't started a transaction; the SQL operations are not particularly advanced or intricate.
Any insight into how this can run without any reported error and still leave the skills table empty?
Thanks,
Your insert statement is not automatically committed. From the docs on sqlite3.Connection:
commit()
This method commits the current transaction. If you don’t
call this method, anything you did since the last call to commit() is
not visible from other database connections. If you wonder why you
don’t see the data you’ve written to the database, please check you
didn’t forget to call this method.
To automatically commit, use the connection as a context manager:
# connection.commit() is called automatically upon exit of the context manager
# unless an exception is raised, in which case connection.rollback() is called.
with connection:
    connection.execute(insert_statement)
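To see the effect described in the docs, here is a small sketch that uses two connections to the same database file. The single-column skills table and the helper names visible_rows and demo are made up for illustration.

```python
import os
import sqlite3
import tempfile


def visible_rows(path):
    """Count rows in `skills` as seen from a brand-new connection."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM skills").fetchall()[0][0]
    finally:
        conn.close()


def demo(path):
    writer = sqlite3.connect(path)
    with writer:
        writer.execute("CREATE TABLE skills (name TEXT)")

    writer.execute("INSERT INTO skills (name) VALUES (?)", ("juggling",))
    before = visible_rows(path)  # the insert has not been committed yet

    writer.commit()
    after = visible_rows(path)  # now other connections can see the row
    writer.close()
    return before, after
```

Called with a fresh file path, demo(path) returns (0, 1): the row only becomes visible to other connections after commit().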
So I'm using psycopg2, I have a simple table:
CREATE TABLE IF NOT EXISTS feed_cache (
feed_id int REFERENCES feeds(id) UNIQUE,
feed_cache text NOT NULL,
expire_date timestamp --without time zone
);
I'm calling the following method and query:
#staticmethod
def get_feed_cache(conn, feed_id):
    c = conn.cursor()
    try:
        sql = 'SELECT feed_cache FROM feed_cache WHERE feed_id=%s AND localtimestamp <= expire_date;'
        c.execute(sql, (feed_id,))
        result = c.fetchone()
        if result:
            conn.commit()
            return result[0]
        else:
            print 'DBSELECT.get_feed_cache: %s' % result
            print 'sql: %s' % (c.mogrify(sql, (feed_id,)))
    except:
        conn.rollback()
        raise
    finally:
        c.close()
    return None
I've added the else statement to output the exact sql and result that is being executed and returned.
The get_feed_cache() method is called from a database connection thread pool. When the get_feed_cache() method is called "slowishly" (~1/sec or less) the result is returned as expected, however when called concurrently it will occasionally return None. I have tried multiple ways of writing this query & method.
Some observations:
If I remove 'AND localtimestamp <= expire_date' from the query, the query ALWAYS returns a result.
Executing the query rapidly in serial in psql always returns a result.
The psycopg docs on the cursor's fetch*() methods note that results are cached for the cursor; I'm assuming that the cache is not shared between different cursors. http://initd.org/psycopg/docs/faq.html#best-practices
I have tried using postgresql's now() and current_timestamp functions with the same results. (I am aware of the timezone aspect of now() & current_timestamp)
Conditions to note:
There will NEVER be a case where there is not a feed_cache value for a provided feed_id.
There will NEVER be a case where any value in the feed_cache table is NULL
While testing I have completely disabled any & all writes to this table
I have set the expire_date to be sufficiently far in the future for all values such that the expression 'AND localtimestamp <= expire_date' will always be true.
Here is a copy & pasted output of it returning None:
DBSELECT.get_feed_cache: None
sql: SELECT feed_cache FROM feed_cache WHERE feed_id=5 AND localtimestamp < expire_date;
Well that's pretty much it, I'm not sure what's going on. Maybe I'm making some really dumb mistake and I just don't notice it! My current guess is that it has something to do with psycopg2 and perhaps the way it's caching results between cursors. If the cursors DO share the cache and the queries happen near-simultaneously then it could be possible that the first cursor fetches the result, the second cursor sees there is a cache of the same query, so it does not execute, then the first cursor closes and deletes the cache and the second cursor tries to fetch a now null/None cache.*
That said, psycopg2 states that it's thread-safe for read-only queries, so unless I'm misinterpreting their implementation of thread-safe, this shouldn't be the case.
Thank you for your time!
*After adding a thread lock for the get_feed_cache, acquiring before creating the cursor and releasing before returning, I still occasionally get a None result
I think this might be because the timestamps returned by localtimestamp or current_timestamp are fixed when the transaction starts, not when the statement runs. psycopg also manages transactions behind your back to some degree, so you might be getting a slightly older timestamp.
You could debug this by setting log_statement = all in your server and then observing when the BEGIN statements are executed relative to your queries.
You might want to look into using a function such as clock_timestamp(), which returns the actual current time and keeps advancing within a transaction instead of staying fixed at its start. See http://www.postgresql.org/docs/current/static/functions-datetime.html.
What is the best way to deal with the 1205 "deadlock victim" error when calling SQL Server from Python?
The issue arises when I have multiple Python scripts running, and all are attempting to update a table with a MERGE statement which adds a row if it doesn't yet exist (this query will be called millions of times in each script).
MERGE table_name AS target -- including UPDLOCK or ROWLOCK eventually
                           -- results in deadlock
USING ( VALUES ( ... ) )
    AS row( ... )
ON target.feature = row.feature
WHEN NOT MATCHED THEN
    INSERT (...)
    VALUES (...);
The scripts require immediate access to the table to access the unique id assigned to the row.
Eventually, one of the scripts raises an OperationalError:
Transaction (Process ID 52) was deadlocked on lock resources with
another process and has been chosen as the deadlock victim. Rerun the
transaction.
1) I have tried using a try-except block around the call in Python:
while True:
    try:
        cur.execute(stmt)
        break
    except OperationalError:
        continue
This approach slows the process down considerably. Also, I think I might be doing this incorrectly (I think I might need to reset the connection...).
2) Use a try-catch in SQL Server (something like below...):
WHILE 1 = 1
BEGIN
    BEGIN TRY
        MERGE statement -- see above
        BREAK
    END TRY
    BEGIN CATCH
        SELECT ERROR_NUMBER() AS ErrorNumber
        ROLLBACK
        CONTINUE
    END CATCH;
END
3) Something else?
Thanks for your help. And let me know if you need additional details, etc.
I am using Python 2.7, SQL Server 2008, and pymssql to make the connection.
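For option 1, a tight retry loop tends to re-collide immediately. A common alternative, sketched here as a suggestion and not as a definitive fix, is to roll back, wait a short randomized and growing delay, and give up after a bounded number of attempts. The function name execute_with_retry and its parameters are hypothetical; retry_on would be pymssql's OperationalError in the setup described above, which also matches operational errors other than 1205.

```python
import random
import time


def execute_with_retry(conn, stmt, retry_on, max_attempts=5, base_delay=0.05):
    """Run `stmt`, retrying a bounded number of times on `retry_on` errors."""
    for attempt in range(1, max_attempts + 1):
        cur = conn.cursor()
        try:
            cur.execute(stmt)
            conn.commit()
            return attempt  # number of attempts it took
        except retry_on:
            conn.rollback()  # clear the failed transaction before retrying
            if attempt == max_attempts:
                raise
            # Exponential backoff with jitter so competing scripts spread out.
            delay = base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5)
            time.sleep(delay)
        finally:
            cur.close()
```

The jitter matters because several scripts that deadlock together would otherwise all retry at the same moment.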
I'm using Psycopg2 in Python to access a PostgreSQL database. I'm curious if it's safe to use the with closing() pattern to create and use a cursor, or if I should use an explicit try/except wrapped around the query. My question is concerning inserting or updating, and transactions.
As I understand it, all Psycopg2 queries occur within a transaction, and it's up to calling code to commit or rollback the transaction. If within a with closing(... block an error occurs, is a rollback issued? In older versions of Psycopg2, a rollback was explicitly issued on close() but this is not the case anymore (see http://initd.org/psycopg/docs/connection.html#connection.close).
My question might make more sense with an example. Here's an example using with closing(...
with closing(db.cursor()) as cursor:
    cursor.execute("""UPDATE users
                SET password = %s, salt = %s
                WHERE user_id = %s""",
                (pw_tuple[0], pw_tuple[1], user_id))
    module.raise_unexpected_error()
    db.commit()  # commit() lives on the connection, not the cursor
What happens when module.raise_unexpected_error() raises its error? Is the transaction rolled back? As I understand transactions, I either need to commit them or roll them back. So in this case, what happens?
Alternately I could write my query like this:
cursor = None
try:
    cursor = db.cursor()
    cursor.execute("""UPDATE users
                SET password = %s, salt = %s
                WHERE user_id = %s""",
                (pw_tuple[0], pw_tuple[1], user_id))
    module.raise_unexpected_error()
    db.commit()  # commit() and rollback() live on the connection
except BaseException:
    if cursor is not None:
        db.rollback()
finally:
    if cursor is not None:
        cursor.close()
Also I should mention that I have no idea if Psycopg2's connection class cursor() method could raise an error or not (the documentation doesn't say) so better safe than sorry, no?
Which method of issuing a query and managing a transaction should I use?
Your link to the Psycopg2 docs kind of explains it itself, no?
... Note that closing a connection without committing the changes first will
cause any pending change to be discarded as if a ROLLBACK was
performed (unless a different isolation level has been selected: see
set_isolation_level()).
Changed in version 2.2: previously an explicit ROLLBACK was issued by
Psycopg on close(). The command could have been sent to the backend at
an inappropriate time, so Psycopg currently relies on the backend to
implicitly discard uncommitted changes. Some middleware are known to
behave incorrectly though when the connection is closed during a
transaction (when status is STATUS_IN_TRANSACTION), e.g. PgBouncer
reports an unclean server and discards the connection. To avoid this
problem you can ensure to terminate the transaction with a
commit()/rollback() before closing.
So, unless you're using a different isolation level, or using PgBouncer, your first example should work fine. Keep in mind that closing() only closes the cursor: the uncommitted changes are discarded once the connection itself is closed or rollback() is called on it. If you want finer-grained control over exactly what happens during a transaction, the try/except method might be best, since it parallels the database transaction state itself.
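If the try/except version feels too verbose to repeat, one option is to wrap it in a small context manager. This is a minimal sketch that relies only on the DB-API methods cursor(), commit() and rollback(); the name transaction is made up here and is not part of Psycopg2. As far as I know, Psycopg2 2.5 and later also lets you use the connection itself in a with statement for the same commit-or-rollback behaviour.

```python
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Yield a cursor; commit on success, roll back on any exception."""
    cursor = conn.cursor()
    try:
        yield cursor
        conn.commit()
    except BaseException:
        conn.rollback()
        raise  # re-raise so the caller still sees the error
    finally:
        cursor.close()
```

With this helper the UPDATE from the question would go inside "with transaction(db) as cursor:", and an error raised anywhere in the block rolls the change back before propagating.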