I need to apply a query timeout at a global level in my application. The statement SET SESSION max_execution_time=1 does this with MySQL 5.7, but I am using MySQL 5.6 and cannot upgrade at the moment. Any solution with SQLAlchemy would also help.
It seems there is no equivalent to max_execution_time in MySQL before versions 5.7.4 and 5.7.8 (the setting changed its name between those releases). What you can do instead is create your own periodic job that checks whether queries have exceeded the timeout and kills them manually. Unfortunately that is not quite what the newer MySQL versions do: without inspecting each process's command info you will end up killing all long-running queries, not just read-only SELECTs, and it is nigh impossible to control at the session level.
One way to do that is with a stored procedure that queries the process list and kills as required. Such a stored procedure could look like:
DELIMITER //
CREATE PROCEDURE stmt_timeout_killer (timeout INT)
BEGIN
    DECLARE query_id INT;
    DECLARE done INT DEFAULT FALSE;
    DECLARE curs CURSOR FOR
        SELECT id
        FROM information_schema.processlist
        WHERE command = 'Query' AND time >= timeout;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
    -- Ignore ER_NO_SUCH_THREAD, in case the query finished between
    -- checking the process list and actually killing threads
    DECLARE CONTINUE HANDLER FOR 1094 BEGIN END;

    OPEN curs;

    read_loop: LOOP
        FETCH curs INTO query_id;

        IF done THEN
            LEAVE read_loop;
        END IF;

        -- Prevent suicide
        IF query_id != CONNECTION_ID() THEN
            KILL QUERY query_id;
        END IF;
    END LOOP;

    CLOSE curs;
END//
DELIMITER ;
Alternatively you could implement all that in your application logic, but it would require separate round trips to the database for each query to be killed. What's left then is to call this periodically:
# Somewhere suitable; engine.execute() worked on SQLAlchemy 1.x,
# on 1.4+/2.0 use a connection context instead
with engine.connect() as conn:
    conn.execute(text("CALL stmt_timeout_killer(:timeout)"), {"timeout": 30})
How and where exactly depends heavily on your actual application.
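For example, here is a minimal sketch using a plain daemon thread (the DSN, interval, and timeout are placeholders; a cron job or your framework's scheduler would do just as well):

import threading
import time

from sqlalchemy import create_engine, text

engine = create_engine("mysql+pymysql://user:pass@localhost/db")  # placeholder DSN

def kill_long_queries(interval=5, timeout=30):
    # Invoke the stored procedure from above every `interval` seconds.
    while True:
        with engine.connect() as conn:
            conn.execute(text("CALL stmt_timeout_killer(:timeout)"),
                         {"timeout": timeout})
        time.sleep(interval)

threading.Thread(target=kill_long_queries, daemon=True).start()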
Related
I have the following code that uses MySQLdb for DB inserts:
self.cursor.execute('START TRANSACTION;')
for item in data:
    self.cursor.execute('INSERT INTO...')
self.cursor.execute('COMMIT;')
self.conn.commit()
Is the self.conn.commit() at the end redundant, or does that need to be there?
If you start a transaction you're responsible for calling COMMIT, or it'll get rolled back when you close the connection.
As a note, it's bad form to include ; in your queries unless you're using an interactive shell. They're not necessary in code and immediately raise questions about how they came to be included there.
The ; delimiter is used by the shell to determine where one statement stops and the next starts, something that isn't needed in code where each statement is supplied as a separate string.
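For reference, a minimal sketch of the same block with the driver left in charge of the transaction (the connection parameters, the items table, and data are placeholders):

import MySQLdb

conn = MySQLdb.connect(host="localhost", user="user", passwd="secret", db="mydb")
data = ['a', 'b', 'c']

cursor = conn.cursor()
try:
    # MySQLdb opens a transaction implicitly on the first statement,
    # so no explicit START TRANSACTION is needed.
    for item in data:
        cursor.execute("INSERT INTO items (value) VALUES (%s)", (item,))
    conn.commit()  # one explicit commit replaces both COMMIT calls
except Exception:
    conn.rollback()
    raise
finally:
    cursor.close()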
Situation: I have a live trading script which computes all sorts of stuff every x minutes in my main thread (Python), and order sending is performed from that thread. Receiving and handling the executions of those orders, though, is a different matter: I cannot let x minutes pass, I need them as soon as they come in. So I initialized another thread to check for such data (executions), which lands in a database table (PostgreSQL).
Problem(s): I cannot continuously run a query every xx ms, pull the data from the DB, compare table lengths, and take the difference, for a variety of reasons (I'm not the only one using the DB, performance issues, etc.). So I looked up some solutions and came across this thread (https://dba.stackexchange.com/questions/58214/getting-last-modification-date-of-a-postgresql-database-table), where the gist of it was that
"There is no reliable, authoritative record of the last modified time of a table".
Question: what can I do about it? That is, how do I get near-instantaneous responses from a PostgreSQL table without overloading the whole thing, using Python?
You can use notifications in PostgreSQL:
import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
import select

def dblisten(dsn):
    connection = psycopg2.connect(dsn)
    connection.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
    cur = connection.cursor()
    cur.execute("LISTEN new_id;")
    while True:
        # Block until the socket becomes readable, then drain
        # all pending notifications.
        select.select([connection], [], [])
        connection.poll()
        while connection.notifies:
            notify = connection.notifies.pop().payload
            do_something(notify)
and install a trigger that fires on each insert or update:
CREATE OR REPLACE FUNCTION notify_id_trigger() RETURNS trigger AS $$
BEGIN
    PERFORM pg_notify('new_id', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER data_modified
    AFTER INSERT OR UPDATE ON data_table
    FOR EACH ROW EXECUTE PROCEDURE notify_id_trigger();
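One way to wire this into the trading script (a sketch; the do_something callback and the DSN are placeholders, and a daemon thread is just one option):

import threading

def do_something(payload):
    # Called for every notification; payload is the id sent by the trigger.
    # Look up the row and process the execution here.
    print("row changed, id = %s" % payload)

listener = threading.Thread(
    target=dblisten,
    args=("dbname=trading user=trader",),  # placeholder DSN
    daemon=True,
)
listener.start()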
As part of the upgrade process, our product scripts update a stored procedure for a trigger. There are two daemons running, either of which can update the stored procedure. It seems that PostgreSQL is not serializing the DDL to upgrade the procedure. The exact error is "DB_Cursor: exception in execute: tuple concurrently updated". A Google search yields no exact matches for this error in the search results. It would appear we have a race condition. What is the best approach for avoiding or preventing such an exception? It prevents the upgrade process from succeeding, and one or both of the processes (daemons) must be restarted to retry the upgrade and recover. Is there a known issue with PostgreSQL? We are running PostgreSQL 9.2.5.
It seems that PostgreSQL is not serializing the DDL to upgrade the procedure
Yes. This is mentioned from time to time on pgsql mailing lists, for example recently here:
'tuple concurrently updated' error when granting permissions
Excerpt:
We do have such locking for DDL on tables/indexes, but the theory in the past has been that it's not worth the trouble for objects represented by single catalog rows, such as functions or roles. You can't corrupt the database with concurrent updates on such a row, you'll just get a "tuple concurrently updated" error from all but the first-to-arrive update.
If you're concurrently replacing function bodies, this is clearly your problem.
And the proposed solution is:
In the meantime, you could consider using an application-managed advisory lock if you really need such grants to work transparently.
If by design multiple concurrent clients can decide to perform DDL, then you really should make sure only one of them is doing it. You can do it using advisory locks.
Example in pseudocode:
function try_upgrade(db) {
    if ( ! is_upgrade_needed(db) ) {
        // we check it before acquiring a lock to speed up a common case of
        // no upgrade available
        return UPGRADE_NOT_NEEDED;
    }
    query_result = db->begin_transaction();
    if ( query_result < 0 ) throw Error("begin failed");
    query_result = db->query(
        "select pg_advisory_xact_lock(?)", MAGIC_NUMBER_UPGRADE_LOCK
    );
    if ( query_result < 0 ) throw Error("pg_advisory_xact_lock failed");
    // another client might have performed upgrade between the previous check
    // and acquiring advisory lock
    if ( ! is_upgrade_needed(db) ) {
        query_result = db->rollback_transaction();
        return UPGRADE_NOT_NEEDED;
    }
    perform_upgrade();
    query_result = db->commit_transaction();
    if ( query_result < 0 ) throw Error("commit failed");
    return UPGRADE_PERFORMED;
}
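The same flow in Python with psycopg2 might look roughly like this (a sketch; is_upgrade_needed and perform_upgrade stand in for your own checks, and the lock number is an arbitrary constant all daemons must share):

MAGIC_NUMBER_UPGRADE_LOCK = 314159  # any constant agreed on by all daemons

def try_upgrade(conn):
    # conn is an open psycopg2 connection
    if not is_upgrade_needed(conn):
        # Checked before taking the lock to speed up the common case
        # of no upgrade being available.
        return "UPGRADE_NOT_NEEDED"
    with conn:  # psycopg2 commits on success, rolls back on exception
        with conn.cursor() as cur:
            # Blocks until the lock is ours; released automatically
            # when the transaction ends.
            cur.execute("SELECT pg_advisory_xact_lock(%s)",
                        (MAGIC_NUMBER_UPGRADE_LOCK,))
            # Another daemon might have upgraded between the first
            # check and acquiring the lock.
            if not is_upgrade_needed(conn):
                return "UPGRADE_NOT_NEEDED"
            perform_upgrade(conn)
    return "UPGRADE_PERFORMED"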
So I'm using psycopg2, and I have a simple table:
CREATE TABLE IF NOT EXISTS feed_cache (
    feed_id int REFERENCES feeds(id) UNIQUE,
    feed_cache text NOT NULL,
    expire_date timestamp --without time zone
);
I'm calling the following method and query:
@staticmethod
def get_feed_cache(conn, feed_id):
    c = conn.cursor()
    try:
        sql = 'SELECT feed_cache FROM feed_cache WHERE feed_id=%s AND localtimestamp <= expire_date;'
        c.execute(sql, (feed_id,))
        result = c.fetchone()
        if result:
            conn.commit()
            return result[0]
        else:
            print 'DBSELECT.get_feed_cache: %s' % result
            print 'sql: %s' % (c.mogrify(sql, (feed_id,)))
    except:
        conn.rollback()
        raise
    finally:
        c.close()
    return None
I've added the else statement to output the exact sql and result that is being executed and returned.
The get_feed_cache() method is called from a database connection thread pool. When the get_feed_cache() method is called "slowishly" (~1/sec or less) the result is returned as expected, however when called concurrently it will occasionally return None. I have tried multiple ways of writing this query & method.
Some observations:
If I remove 'AND localtimestamp <= expire_date' from the query, the query ALWAYS returns a result.
Executing the query rapidly in serial in psql always returns a result.
After reading about the fetch*() methods of psycopg's cursor class, I see the docs note that results are cached for the cursor; I'm assuming that the cache is not shared between different cursors. http://initd.org/psycopg/docs/faq.html#best-practices
I have tried using postgresql's now() and current_timestamp functions with the same results. (I am aware of the timezone aspect of now() & current_timestamp)
Conditions to note:
There will NEVER be a case where there is not a feed_cache value for a provided feed_id.
There will NEVER be a case where any value in the feed_cache table is NULL
While testing I have completely disabled any & all writes to this table
I have set the expire_date to be sufficiently far in the future for all values such that the expression 'AND localtimestamp <= expire_date' will always be true.
Here is a copy & pasted output of it returning None:
DBSELECT.get_feed_cache: None
sql: SELECT feed_cache FROM feed_cache WHERE feed_id=5 AND localtimestamp < expire_date;
Well, that's pretty much it; I'm not sure what's going on. Maybe I'm making some really dumb mistake and I just don't notice it! My current guess is that it has something to do with psycopg2 and perhaps the way it caches results between cursors. If the cursors DO share the cache and the queries happen near-simultaneously, it could be that the first cursor fetches the result, the second cursor sees there is a cache of the same query and so does not execute, then the first cursor closes and deletes the cache, and the second cursor tries to fetch a now null/None cache.*
That said, psycopg2 states that it's thread-safe for read-only queries, so unless I'm misinterpreting their implementation of thread safety, this shouldn't be the case.
Thank you for your time!
*After adding a thread lock for the get_feed_cache, acquiring before creating the cursor and releasing before returning, I still occasionally get a None result
I think this might have to do with the fact that the timestamps returned by localtimestamp or current_timestamp are fixed when the transaction starts, not when you run the statement. And psycopg manages transactions behind your back to some degree, so you might be getting a slightly older timestamp.
You could debug this by setting log_statement = all on your server and then observing when the BEGIN statements are executed relative to your queries.
You might want to look into using a function such as clock_timestamp(), which returns the actual current time and keeps advancing within a transaction. See http://www.postgresql.org/docs/current/static/functions-datetime.html.
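A quick way to see the difference (a sketch; the DSN is a placeholder): within a single transaction, localtimestamp stays frozen while clock_timestamp() keeps moving.

import psycopg2

conn = psycopg2.connect("dbname=test")  # placeholder DSN
cur = conn.cursor()

# Both queries run inside the same implicit transaction.
cur.execute("SELECT localtimestamp, clock_timestamp()")
print(cur.fetchone())
cur.execute("SELECT pg_sleep(1)")
cur.execute("SELECT localtimestamp, clock_timestamp()")
print(cur.fetchone())  # localtimestamp unchanged, clock_timestamp() advanced

conn.rollback()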
What is the best way to deal with the 1205 "deadlock victim" error when calling SQL Server from Python?
The issue arises when I have multiple Python scripts running, and all are attempting to update a table with a MERGE statement which adds a row if it doesn't yet exist (this query will be called millions of times in each script).
MERGE table_name AS table -- including UPDLOCK or ROWLOCK eventually
                          -- results in deadlock
USING ( VALUES ( ... ) )
    AS row( ... )
ON table.feature = row.feature
WHEN NOT MATCHED THEN
    INSERT (...)
    VALUES (...)
The scripts require immediate access to the table to access the unique id assigned to the row.
Eventually, one of the scripts raises an OperationalError:
Transaction (Process ID 52) was deadlocked on lock resources with
another process and has been chosen as the deadlock victim. Rerun the
transaction.
1) I have tried using a try-except block around the call in Python:
while True:
    try:
        cur.execute(stmt)
        break
    except OperationalError:
        continue
This approach slows the process down considerably. Also, I think I might be doing this incorrectly (I think I might need to reset the connection; see the sketch at the end of this question).
2) Use a try-catch in SQL Server (something like below...):
WHILE 1 = 1
BEGIN
    BEGIN TRY
        MERGE statement -- see above
        BREAK
    END TRY
    BEGIN CATCH
        SELECT ERROR_NUMBER() AS ErrorNumber
        ROLLBACK
        CONTINUE
    END CATCH;
END
3) Something else?
Thanks for your help. And let me know if you need additional details, etc.
I am using Python 2.7, SQL Server 2008, and pymssql to make the connection.
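Update: to make option 1 concrete, this is the retry-with-rollback-and-backoff variant I'm considering (a sketch only; the backoff numbers are made up, and conn, cur, and stmt are the same objects as above):

import random
import time

from pymssql import OperationalError

MAX_RETRIES = 10

for attempt in range(MAX_RETRIES):
    try:
        cur.execute(stmt)
        conn.commit()
        break
    except OperationalError:
        conn.rollback()  # clear the failed transaction before retrying
        # Randomized backoff so the competing scripts de-synchronize.
        time.sleep(random.uniform(0.01, 0.1) * (attempt + 1))
else:
    raise RuntimeError('MERGE failed after %d deadlock retries' % MAX_RETRIES)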