Mechanical Turk 'rollback' on HIT Creation - python

Here is the code I currently have:
with transaction.commit_manually():
    try:
        m.update_accepted_url(episode_id)
        m.create_hit()
        m.do_insert()
        transaction.commit()
    except:
        transaction.rollback()
Now, what happens if the database operations fail and are rolled back, but create_hit goes through successfully? Is there a way to wrap the create_hit operation in something like a transaction, so that if the db operations fail, it fails too?

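You can pass a unique token with your request (the UniqueRequestToken parameter of CreateHIT), so that retrying the call after an error does not create duplicate HITs: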
http://docs.aws.amazon.com/AWSMechTurk/latest/AWSMturkAPI/ApiReference_CreateHITOperation.html
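Since the HIT is created by an external service, it cannot take part in the database transaction itself. One common workaround is a compensating action: create the HIT with a unique token, and if the database write then fails, undo the HIT explicitly. The following is only a rough sketch of that idea. The callables create_hit, save_record and delete_hit are hypothetical placeholders supplied by the caller, not part of any Mechanical Turk library.

```python
import uuid


def create_hit_with_db_record(create_hit, save_record, delete_hit):
    """Create a HIT, then record it; undo the HIT if recording fails.

    All three arguments are hypothetical callables supplied by the caller:
      create_hit(unique_request_token=...) -> hit_id
      save_record(hit_id)                  -> does the database work
      delete_hit(hit_id)                   -> compensating action
    """
    # A fresh token per logical request makes a retried create_hit call safe.
    token = uuid.uuid4().hex
    hit_id = create_hit(unique_request_token=token)
    try:
        save_record(hit_id)
    except Exception:
        # The database work failed, so remove the HIT that was just created.
        delete_hit(hit_id)
        raise
    return hit_id
```

The compensating call can itself fail, so in practice it may be worth logging the HIT id so that orphaned HITs can be cleaned up later.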

Related

SQLAlchemy, aurora-serverless invalid transaction issue on commit (aurora_data_api.exceptions.DatabaseError)

I'm using the sqlalchemy-aurora-data-api to connect to aurora-postgresql-serverless, with SQLAlchemy as an ORM.
For the most part, this has been working fine, but I keep hitting unexpected errors from the aurora_data_api (which sqlalchemy-aurora-data-api is built upon) during commits.
I've tried to handle this in the application logic by catching the exception and retrying. However, this is still failing:
from functools import wraps

from aurora_data_api.exceptions import DatabaseError
from botocore.exceptions import ClientError

def handle_invalid_transaction_id(func):
    retries = 3

    @wraps(func)
    def inner(*args, **kwargs):
        for i in range(retries):
            try:
                return func(*args, **kwargs)
            except (DatabaseError, ClientError):
                if i < retries - 1:
                    # The aim here is to force a new transaction
                    # when an error occurs, and then retry
                    db.session.close()
                else:
                    # Out of retries: re-raise instead of silently returning None
                    raise
    return inner
And then in my models doing something like this:
class MyModel(db.Model):
    @classmethod
    @handle_invalid_transaction_id
    def create(cls, **kwargs):
        instance = cls(**kwargs)
        db.session.add(instance)
        db.session.commit()
        db.session.close()
        return kwargs
However, I keep hitting unpredictable transaction failures:
DatabaseError: (aurora_data_api.exceptions.DatabaseError) An error occurred (BadRequestException) when calling the ExecuteStatement operation: Transaction AXwQlogMJsPZgyUXCYFg9gUq4/I9FBEUy1zjMTzdZriEuBCF44s+wMX7+aAnyyJH/6arYcHxbCLW73WE8oRYsPMN17MOrqWfUdxkZRBrM/vBUfrP8FKv6Phfr6kK6o7/0mirCtRJUxDQAQPotaeP+hHj6/IOGUCaOnodt4M3015c0dAycuqhsy4= is not found [+26ms]
It is worth noting that these are not particularly long-running transactions, so I do not think that I'm hitting the transaction expiry issue that can occur with aurora-serverless as documented here.
Is there something fundamentally wrong with my approach to this or is there a better way to handle transactions failures when they occur?
Just to close this off, and in case it helps anyone else: we found the issue was in the transactions that were being created in the cursor here
I can't answer the why, but we noticed that transactions were expiring despite the fact that the data had been committed successfully, e.g.:
request 1 - creates a bunch of transactions, write data, exits.
request 2 - creates a bunch of transactions, some transaction id for request 1 fails, exits.
So yeah, I don't think the issue is with the aurora-data-api, but somehow to do with transaction management in general in aurora-serverless. In the end, we forked the repo and refactored it so that everything is handled with ExecuteStatement calls rather than using transactions. It's been working fine so far (note we're using SQLAlchemy, so transactions are handled at the ORM level anyway).

Python and MySQL: If I catch a deadlock error, is everything rolled back?

I'd like to catch a MySQL deadlock error and then retry the failed query. But, do I have to redo every query since the transaction started, or just the one in the try/catch? I'm not sure whether the deadlock error causes everything to be rolled back.
This is performed in Python using raw mysql queries.
insert into table_1 values...
insert into table_2 values ...
try:
    delete from table_1 where ...
except: # set to catch deadlock error
    # Can I just retry the delete statement, or do I also have to do the inserts again?
# commits at end
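The whole transaction is rolled back. When InnoDB picks your transaction as the deadlock victim, everything since the transaction started is undone, so you have to redo the inserts as well, not just the delete.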
Here's the relevant and helpful MySQL documentation:
https://dev.mysql.com/doc/refman/5.5/en/innodb-deadlock-detection.html
https://dev.mysql.com/doc/refman/5.5/en/innodb-deadlocks.html
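Because the whole transaction is rolled back, the retry has to wrap the whole unit of work rather than the single failing statement. Here is a minimal sketch of that, written against a generic DB-API connection. The helper names run_transaction and is_deadlock are made up for illustration, and the check on exc.args[0] assumes your driver exposes the MySQL error number (1213, ER_LOCK_DEADLOCK) there, which you should verify for the driver you use.

```python
import time

MYSQL_DEADLOCK_ERRNO = 1213  # ER_LOCK_DEADLOCK


def is_deadlock(exc):
    # Assumption: the driver puts the MySQL error number in exc.args[0].
    return bool(exc.args) and exc.args[0] == MYSQL_DEADLOCK_ERRNO


def run_transaction(conn, work, attempts=3, delay=0.1):
    """Run work(cursor) in one transaction, replaying all of it on deadlock."""
    for attempt in range(1, attempts + 1):
        cursor = conn.cursor()
        try:
            work(cursor)      # every statement of the transaction goes here
            conn.commit()
            return
        except Exception as exc:
            conn.rollback()   # make sure the client side is back to a clean state
            if attempt == attempts or not is_deadlock(exc):
                raise
            time.sleep(delay * attempt)  # short pause before replaying
        finally:
            cursor.close()
```

With this shape, the two inserts and the delete from the question would all live inside one work function, so a deadlock on the delete replays the inserts too.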

Better approach to handling sqlalchemy disconnects

We've been experimenting with SQLAlchemy's disconnect handling, and how it integrates with the ORM. We've studied the docs, and the advice seems to be to catch the disconnect exception, issue a rollback() and retry the code.
e.g.:
import sqlalchemy as SA

retry = 2
while retry:
    retry -= 1
    try:
        for name in session.query(Names):
            print(name)
        break
    except SA.exc.DBAPIError as exc:
        if retry and exc.connection_invalidated:
            session.rollback()
        else:
            raise
I follow the rationale: you have to roll back any active transactions and replay them to ensure a consistent ordering of your actions.
BUT -- this means a lot of extra code added to every function that wants to work with data. Furthermore, in the case of SELECT, we're not modifying data and the concept of rollback/re-request is not only unsightly, but a violation of the principle of DRY (don't repeat yourself).
I was wondering if others would mind sharing how they handle disconnects with sqlalchemy.
FYI: we're using sqlalchemy 0.9.8 and Postgres 9.2.9
The way I like to approach this is to place all my database code in a lambda or closure, and pass that into a helper function that handles catching the disconnect exception and retrying.
So with your example:
import sqlalchemy as SA

def main():
    def query():
        for name in session.query(Names):
            print(name)
    run_query(query)

def run_query(f, attempts=2):
    while attempts > 0:
        attempts -= 1
        try:
            return f()  # "break" if query was successful and return any results
        except SA.exc.DBAPIError as exc:
            if attempts > 0 and exc.connection_invalidated:
                session.rollback()
            else:
                raise
You can make this more fancy by passing a boolean into run_query to handle the case where you are only doing a read, and therefore want to retry without rolling back.
This helps you satisfy the DRY principle since all the ugly boiler-plate code for managing retries + rollbacks is placed in one location.
Using exponential backoff (https://github.com/litl/backoff):
import logging

import backoff
import sqlalchemy

@backoff.on_exception(
    backoff.expo,
    sqlalchemy.exc.DBAPIError,
    factor=7,
    max_tries=3,
    on_backoff=lambda details: LocalSession.get_main_sql_session().rollback(),
    on_giveup=lambda details: LocalSession.get_main_sql_session().flush(),  # flush the session
    logger=logging
)
def pessimistic_insertion(document_metadata):
    LocalSession.get_main_sql_session().add(document_metadata)
    LocalSession.get_main_sql_session().commit()
This assumes that LocalSession.get_main_sql_session() returns a singleton.

Use try/except with psycopg2 or "with closing"?

I'm using Psycopg2 in Python to access a PostgreSQL database. I'm curious if it's safe to use the with closing() pattern to create and use a cursor, or if I should use an explicit try/except wrapped around the query. My question is concerning inserting or updating, and transactions.
As I understand it, all Psycopg2 queries occur within a transaction, and it's up to calling code to commit or rollback the transaction. If within a with closing(... block an error occurs, is a rollback issued? In older versions of Psycopg2, a rollback was explicitly issued on close() but this is not the case anymore (see http://initd.org/psycopg/docs/connection.html#connection.close).
My question might make more sense with an example. Here's an example using with closing(...
from contextlib import closing

with closing(db.cursor()) as cursor:
    cursor.execute("""UPDATE users
                      SET password = %s, salt = %s
                      WHERE user_id = %s""",
                   (pw_tuple[0], pw_tuple[1], user_id))
    module.raise_unexpected_error()
    db.commit()  # commit() is a connection method, not a cursor method
What happens when module.raise_unexpected_error() raises its error? Is the transaction rolled back? As I understand transactions, I either need to commit them or roll them back. So in this case, what happens?
Alternately I could write my query like this:
cursor = None
try:
    cursor = db.cursor()
    cursor.execute("""UPDATE users
                      SET password = %s, salt = %s
                      WHERE user_id = %s""",
                   (pw_tuple[0], pw_tuple[1], user_id))
    module.raise_unexpected_error()
    db.commit()  # commit() is a connection method, not a cursor method
except BaseException:
    if cursor is not None:
        db.rollback()  # rollback() is a connection method as well
finally:
    if cursor is not None:
        cursor.close()
Also I should mention that I have no idea if Psycopg2's connection class cursor() method could raise an error or not (the documentation doesn't say) so better safe than sorry, no?
Which method of issuing a query and managing a transaction should I use?
Your link to the Psycopg2 docs kind of explains it itself, no?
... Note that closing a connection without committing the changes first will
cause any pending change to be discarded as if a ROLLBACK was
performed (unless a different isolation level has been selected: see
set_isolation_level()).
Changed in version 2.2: previously an explicit ROLLBACK was issued by
Psycopg on close(). The command could have been sent to the backend at
an inappropriate time, so Psycopg currently relies on the backend to
implicitly discard uncommitted changes. Some middleware are known to
behave incorrectly though when the connection is closed during a
transaction (when status is STATUS_IN_TRANSACTION), e.g. PgBouncer
reports an unclean server and discards the connection. To avoid this
problem you can ensure to terminate the transaction with a
commit()/rollback() before closing.
So, unless you're using a different isolation level, or using PgBouncer, your first example should work fine. However, if you desire some finer-grained control over exactly what happens during a transaction, then the try/except method might be best, since it parallels the database transaction state itself.
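One thing worth keeping in mind is that closing a cursor does not end the transaction; only commit(), rollback() or closing the connection does. If you want the rollback to happen automatically when the block raises, psycopg2 connections (since version 2.5) can be used as context managers: the block commits on success and rolls back on an exception, without closing the connection. The sketch below illustrates that pattern using the standard library's sqlite3 module purely as a stand-in, since its connection context manager behaves the same way in this respect and lets the example run without a PostgreSQL server. The table layout and the update_password helper are invented for the example.

```python
import sqlite3
from contextlib import closing


def update_password(conn, user_id, password, salt, fail=False):
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(
                "UPDATE users SET password = ?, salt = ? WHERE user_id = ?",
                (password, salt, user_id),
            )
            if fail:
                raise RuntimeError("simulated unexpected error")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, password TEXT, salt TEXT)"
)
conn.execute("INSERT INTO users VALUES (1, 'old', 'old-salt')")
conn.commit()

# The error inside the block triggers a rollback, so the old value survives.
try:
    update_password(conn, 1, "new", "new-salt", fail=True)
except RuntimeError:
    pass
row_after_failure = conn.execute(
    "SELECT password FROM users WHERE user_id = 1"
).fetchone()

# Without an error the update is committed.
update_password(conn, 1, "new", "new-salt")
row_after_success = conn.execute(
    "SELECT password FROM users WHERE user_id = 1"
).fetchone()
```

Note that psycopg2 uses %s placeholders rather than the ? placeholders shown here for sqlite3.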

SQLAlchemy autocommitting?

I have an issue with SQLAlchemy apparently committing. A rough sketch of my code:
trans = self.conn.begin()
try:
    assert not self.conn.execute(my_obj.__table__.select(my_obj.id == id)).first()
    self.conn.execute(my_obj.__table__.insert().values(id=id))
    assert not self.conn.execute(my_obj.__table__.select(my_obj.id == id)).first()
except:
    trans.rollback()
    raise
I don't commit, and the second assert always fails! In other words, it seems the data is getting inserted into the database even though the code is within a transaction! Is this assessment accurate?
You're right that the changes aren't committed to the DB. But they are auto-flushed by SQLAlchemy when you perform a query; in your case the flush is performed on the lines with the asserts. So if you never explicitly call commit, you will never see these changes in the DB's real data. However, you will get them back as long as you use the same conn object.
You can pass autoflush=False to the session constructor to disable this behavior.
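The underlying point, that a connection sees its own uncommitted writes and that they disappear on rollback, is general database behavior and can be checked without SQLAlchemy. A small sketch using the standard library's sqlite3 module as a stand-in (the my_obj table here is just a placeholder mirroring the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_obj (id INTEGER PRIMARY KEY)")
conn.commit()

# The INSERT implicitly opens a transaction; nothing is committed yet.
conn.execute("INSERT INTO my_obj (id) VALUES (1)")

# The same connection can already see its own uncommitted row.
seen_before_rollback = conn.execute(
    "SELECT id FROM my_obj WHERE id = 1"
).fetchone()

# After a rollback the row is gone, which shows it was never committed.
conn.rollback()
seen_after_rollback = conn.execute(
    "SELECT id FROM my_obj WHERE id = 1"
).fetchone()
```

So a failing second assert in the question is consistent with the insert being pending inside the transaction rather than committed.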
