Proper way to rollback on DB connection fail in Django - python

This is more of a design question than anything else.
Until recently I have been using Django with SQLite in my development environment, but I have now changed to PostgreSQL for production. My app is deployed on Heroku, and after some days I realized that they do random maintenance on the DB and it goes down for a few minutes.
For example, consider a model with three tables: a Procedure, each of which points to a ProcedureList (and a ProcedureList can have more than one Procedure); a ProcedureUser, which links a ProcedureList to a user and sets some user-specific variables for that ProcedureList; and finally a ProcedureState, which links a Procedure with its state for a specific user.
On my app, in one of the views I have a function that modifies the DB in the following way:
user = request.user
plist = ProcedureList.objects.get(id=idFromUrl)
procedures = Procedure.objects.filter(ProcedureList=plist)
pUser = ProcedureUser(plist, user, someVariables)
pUser.save()
for procedure in procedures:
    pState = ProcedureState(plist, user, pUser, procedure, otherVariables)
    pState.save()
So what I'm thinking now is that if Heroku decides to go into maintenance between those .save() calls, we will have a problem. The later calls to .save() will fail and the DB will be left in an inconsistent state. The request by the user will of course fail, and there will be no way to roll back the previous insertions, because the connection to the DB is not possible.
My question is, in case of a DB failure (caused by Heroku maintenance, a network error or whatever), how are we supposed to correctly roll back the DB? Shall we keep a list of insertions and wait for the DB to come back up to roll them back?
I am using Python 3 and Django 4 but I think this is more of a general question than specific to any platform.

in case of a DB failure (caused by Heroku maintenance, a network error or whatever), how are we supposed to correctly roll back the DB?
This is solved by databases through atomic transactions [wiki]. An atomic transaction is a set of queries that is committed in full or not at all. It is thus not possible for such a transaction that some of its queries are applied while others are not.
Django offers a transaction context manager [Django-doc] to perform work in a transaction:
from django.db import transaction

with transaction.atomic():
    user = request.user
    plist = ProcedureList.objects.get(id=idFromUrl)
    procedures = Procedure.objects.filter(ProcedureList=plist)
    pUser = ProcedureUser(plist, user, someVariables)
    pUser.save()
    ProcedureState.objects.bulk_create([
        ProcedureState(plist, user, pUser, procedure, otherVariables)
        for procedure in procedures
    ])
At the end of the context block, it will commit the changes. This means that if the database fails in between, nothing is committed, and the block will raise an exception (a DatabaseError subclass, such as OperationalError for a lost connection or IntegrityError for a constraint violation).
Note: Django has a .bulk_create(…) method [Django-doc] to create multiple items with a single database query, minimizing the bandwidth between the database and the application layer. This will usually outperform creating items in a loop.
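If the connection drops inside the block, the exception propagates out of the with statement. A minimal sketch of catching it in the view, assuming the ORM calls shown above and using the exception classes Django exposes in django.db (the view name is hypothetical):
from django.db import transaction, OperationalError, IntegrityError
from django.http import HttpResponse

def create_procedure_states(request, idFromUrl):  # hypothetical view wrapping the code above
    try:
        with transaction.atomic():
            ...  # the ORM calls shown above
    except (OperationalError, IntegrityError):
        # The block never committed, so there is nothing to roll back manually.
        return HttpResponse('Database temporarily unavailable', status=503)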

Related

How do I manually commit a sqlalchemy database transaction inside a pyramid web app?

I have a Pyramid web app that needs to run a Celery task after committing changes to a sqlalchemy database. I know I can do this using request.tm.get().addAfterCommitHook(). However, that doesn't work for me because I also need to use the task_id of the celery task inside the view. Therefore I need to commit changes to the database before I call task.delay() on my Celery task.
The zope.sqlalchemy documentation says that I can manually commit using transaction.commit(). However, this does not work for me; the celery task runs before the changes are committed to the database, even though I called transaction.commit() before I called task.delay().
My Pyramid view code looks like this:
ride = appstruct_to_ride(dbsession, appstruct)
dbsession.add(ride)
# Flush dbsession so ride gets an id assignment
dbsession.flush()
# Store ride id
ride_id = ride.id
log.info('Created ride {}'.format(ride_id))
# Commit ride to database
import transaction
transaction.commit()
# Queue a task to update ride's weather data
from ..processing.weather import update_ride_weather
update_weather_task = update_ride_weather.delay(ride_id)
url = self.request.route_url('rides')
return HTTPFound(
    url,
    content_type='application/json',
    charset='',
    text=json.dumps(
        {'ride_id': ride_id,
         'update_weather_task_id': update_weather_task.task_id}))
My celery task looks like this:
@celery.task(bind=True, ignore_result=False)
def update_ride_weather(self, ride_id, train_model=True):
    from ..celery import session_factory
    logger.debug('Received update weather task for ride {}'.format(ride_id))
    dbsession = session_factory()
    dbsession.expire_on_commit = False
    with transaction.manager:
        ride = dbsession.query(Ride).filter(Ride.id == ride_id).one()
The celery task fails with NoResultFound:
File "/app/cycling_data/processing/weather.py", line 478, in update_ride_weather
ride=dbsession.query(Ride).filter(Ride.id==ride_id).one()
File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3282, in one
raise orm_exc.NoResultFound("No row was found for one()")
When I inspect the database after the fact, I see that the record was in fact created, after the celery task ran and failed. So this means that transaction.commit() did not commit the transaction as expected, but changes were instead committed automatically by the zope.sqlalchemy machinery after the view returned. How do I commit a transaction manually inside my view code?
request.tm is defined by pyramid_tm and could be the threadlocal transaction.manager object or a per-request object, depending on how you've configured pyramid_tm (look for pyramid_tm.manager_hook being defined somewhere to determine which one is being used).
Your question is tricky because whatever you do should fit into pyramid_tm and how it expects things to operate. Specifically it's planning to control a transaction around the lifecycle of the request - committing early is not a good idea with that transaction. pyramid_tm is trying to help to provide a failsafe ability to rollback the entire request if any failures occur anywhere in the request's lifecycle - not just in your view callable.
Option 1:
Commit early anyway. If you're gonna do this then failures after the commit cannot roll back the committed data, so you could have a request's data partially committed. Ok, fine, that's your question, so the answer is to use request.tm.commit(), probably followed by a request.tm.begin() to start a new transaction for any subsequent changes. You'll also need to be careful not to share sqlalchemy-managed objects across that boundary, like request.user, etc., as they need to be refreshed/merged into the new transaction (SQLAlchemy's identity cache cannot trust data loaded from a different transaction by default, because that's just how isolation levels work).
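A minimal sketch of this option, assuming request.tm is the pyramid_tm transaction manager and reusing the names from the question:
ride = appstruct_to_ride(dbsession, appstruct)
dbsession.add(ride)
dbsession.flush()
ride_id = ride.id

# Commit the request-managed transaction early, then begin a new one
# for whatever the rest of the view (and pyramid_tm) still does.
request.tm.commit()
request.tm.begin()

# The ride row is now durable, so the worker can safely look it up.
update_weather_task = update_ride_weather.delay(ride_id)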
Option 2:
Start a separate transaction just for the data you want to commit early. Ok, so assuming you're not using any threadlocals like transaction.manager or scoped_session, then you can probably start your own transaction and commit it without touching the dbsession that is being controlled by pyramid_tm. Some generic code that works with the pyramid-cookiecutter-starter project structure could be:
from myapp.models import get_tm_session

tmp_tm = transaction.TransactionManager(explicit=True)
with tmp_tm:
    dbsession_factory = request.registry['dbsession_factory']
    tmp_dbsession = get_tm_session(dbsession_factory, tmp_tm)
    # ... do stuff with tmp_dbsession that is committed in this with-statement
    ride = appstruct_to_ride(tmp_dbsession, appstruct)
    # do not use this ride object outside of the with-statement
    tmp_dbsession.add(ride)
    tmp_dbsession.flush()
    ride_id = ride.id

# we are now committed so go ahead and start your background worker
update_weather_task = update_ride_weather.delay(ride_id)

# maybe you want the ride object outside of the tmp_dbsession
ride = dbsession.query(Ride).filter(Ride.id == ride_id).one()
return {...}
This isn't bad - probably about the best you can do as far as failure-modes go without hooking celery into the pyramid_tm-controlled dbsession.

Having issues doing fast enough db inserts inside a Flask endpoint

I have an HTTP POST endpoint in Flask which needs to insert whatever data comes in into a database. This endpoint can receive up to hundreds of requests per second. Doing an insert every time a new request comes in takes too much time. I thought that doing a bulk insert every 1000 requests, with the data from the previous 1000 requests, should work like some sort of caching mechanism. I have tried saving 1000 incoming data objects into a collection and then doing a bulk insert once the array is 'full'.
Currently my code looks like this:
@app.route('/user', methods=['POST'])
def add_user():
    firstname = request.json['firstname']
    lastname = request.json['lastname']
    email = request.json['email']
    usr = User(firstname, lastname, email)
    global bulk
    bulk.append(usr)
    if len(bulk) > 1000:
        db.session.bulk_save_objects(bulk)
        db.session.commit()
        bulk = []
    return user_schema.jsonify(usr)
The problem I'm having with this is that the database becomes 'locked', and I really don't know whether this is a good solution that is just poorly implemented, or a bad idea altogether.
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked
Your error message indicates that you are using an SQLite DB with SQLAlchemy. You may want to try changing the setting of the SQLite "synchronous" flag to turn syncing OFF. This can speed INSERT queries up dramatically, but it comes with an increased risk of data loss. See https://sqlite.org/pragma.html#pragma_synchronous for more details.
With synchronous OFF (0), SQLite continues without syncing as soon as it has handed data off to the operating system. If the application running SQLite crashes, the data will be safe, but the database might become corrupted if the operating system crashes or the computer loses power before that data has been written to the disk surface. On the other hand, commits can be orders of magnitude faster with synchronous OFF.
If your application and use case can tolerate the increased risks, then disabling syncing may negate the need for bulk inserts.
See "How to set SQLite PRAGMA statements with SQLAlchemy": How to set SQLite PRAGMA statements with SQLAlchemy
Once I moved the code on AWS and used the Aurora instance as the database, the problems went away, so I suppose it's safe to conclude that the issue were solely related to my sqlite3 instance.
The final solution gave me satisfactory results and I ended up changing only this line:
db.session.bulk_save_objects(bulk)
to this:
db.session.add_all(bulk)
I can now safely handle 400 or more calls per second (haven't tested beyond that) on that specific endpoint, all ending in valid inserts.
Not an expert on this, but it seems like the database has reached its concurrency limits. You can try using Pony ORM for better concurrency and transaction management:
https://docs.ponyorm.org/transactions.html
By default Pony uses the optimistic concurrency control concept for increasing performance. With this concept, Pony doesn’t acquire locks on database rows. Instead it verifies that no other transaction has modified the data it has read or is trying to modify.
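For illustration, a minimal sketch of how Pony scopes a transaction with its db_session decorator (this assumes User were re-declared as a Pony entity bound to a Database, which it is not in the question's code):
from pony.orm import db_session

@db_session  # opens a transaction for the function body and commits on success
def add_user(firstname, lastname, email):
    User(firstname=firstname, lastname=lastname, email=email)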

Would using transactions in a Celery task in Django application cause problems?

I have a set of celery tasks that I've written. Each of these tasks takes an author id (just as an example) as a parameter and, for each of the author's books, fetches the latest price and stores it in the database.
I'd like to add transactions to my tasks by adding Django's @transaction.commit_on_success decorator to them. If any task crashes, I'd like the whole task to fail and nothing to be saved to the database.
I have a dozen or so celery workers that check the prices of books for an author, and I'm wondering if this simple transactional logic would cause locking and race conditions in my Postgres database.
I've dug around and found this project called django-celery-transactions but I still haven't understood the real issue behind this and what this project tried to solve.
The reasoning is that, if you apply the decorator, the DB transaction in your Django view is not committed until the view has exited. Inside the view, before it returns and triggers the commit, you may invoke tasks that expect the DB transaction to already be committed, i.e. for those entries to exist in the DB.
In order to guard against this race condition (the task starting before your view, and consequently its transaction, has finished) you can either manage it manually or use the module you mentioned, which handles it automatically for you.
The example where it might fail for instance in your case is if you are adding a new author and you have a task that fetches prices for all/any of its books. Should the task execute before the commit for the new author transaction is done, your task will try to fetch Author with an id that does not yet exist.
It depends on several things including: the transaction isolation level of your database, how frequently you check for price updates, and how often you expect prices to change. If, for example, you were making a very large number of updates per second to stock standard PostgreSQL, you might get different results executing the same select statement multiple times in a transaction.
Databases are optimized to handle concurrency so I don't think this is going to be a problem for you; especially if you don't open the transaction until after fetching prices (i.e. use a context manager rather than decorating the task). If — for some reason — things get slow in the future, optimize then (fetch prices less frequently, tweak database configuration, etc.).
As for your other question: django-celery-transactions aims to prevent race conditions between Django and Celery. One example is if you were to pass the primary key of a newly created object to a task: the task may attempt to retrieve the object before the view's transaction has been committed. Boom!
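On Django versions that provide transaction.on_commit, the same guard can be expressed without an extra package by deferring the task until the view's transaction commits. A sketch (Author and fetch_book_prices are hypothetical stand-ins for the question's models and tasks):
from django.db import transaction

def create_author(request):
    author = Author.objects.create(name=request.POST['name'])
    # Only queue the task once the surrounding transaction has committed,
    # so the worker never looks up an id that does not exist yet.
    transaction.on_commit(lambda: fetch_book_prices.delay(author.pk))
    # ... build and return a response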

SQLAlchemy/Pyramid DBSession refresh issue

Here's my scenario:
The first view renders the form, the data goes to the second view, where I store it in the DB (MySQL) and redirect to a third view which shows what was written to the DB.
Storing to db:
DBSession.add(object)
transaction.commit()
DB Session:
DBSession = scoped_session(sessionmaker(expire_on_commit=False,
                                        autocommit=False,
                                        extension=ZopeTransactionExtension()))
After that, when I refresh my page several times, sometimes I can see the DB change and sometimes not: one time old data, the next time new data, and so on...
When I restart the server (locally, pserve), the DB data is up-to-date.
Maybe it's a matter of how the session is created?
Check MySQL's transaction isolation level.
The default for InnoDB is REPEATABLE READ: "All consistent reads within the same transaction read the snapshot established by the first read."
You can specify the isolation level in the call to create_engine. See the SQLAlchemy docs.
I suggest you try the READ COMMITTED isolation level and see if that fixes your problem.
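For example, with SQLAlchemy's create_engine (the connection URL is illustrative):
from sqlalchemy import create_engine

engine = create_engine(
    'mysql://user:password@localhost/mydb',
    isolation_level='READ COMMITTED',  # instead of InnoDB's default REPEATABLE READ
)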
It's not clear exactly what your transaction object is or how it connects to the SQLAlchemy database session. I couldn't see anything about transactions in the Pyramid docs and I don't see anything in your code that links your transaction object to your SQLAlchemy session so maybe there is some configuration missing. What example are you basing this code on?
Also: the sessionmaker call is normally done once at module scope to create a single session factory, which is then used repeatedly to create session objects from the same source. "the sessionmaker() function is normally used to create a top level Session configuration which can then be used throughout an application without the need to repeat the configurational arguments."
It may be the case that since you are creating multiple session factories that there is some data that is supposed to be shared across sessions but is actually not shared because it is created once per factory. Try just calling sessionmaker once and see if that makes a difference.
I believe that your issue is likely to be a persistent session. By default, Pyramid expires all objects in the session after a commit; this means that SQLA will fetch them from the database the next time you want them, and they will be fresh.
You have overridden this default by indicating "expire_on_commit=False" -- so, make sure that after committing a change you call session.expire_all() if you intend for that session object to grab fresh data on subsequent requests. (The session object is the same for multiple requests in Pyramid, but you aren't guaranteed to get the same thread-scoped session) I recommend not setting expire on commit to false, or using a non-global session: see http://docs.pylonsproject.org/projects/pyramid_cookbook/en/latest/database/sqlalchemy.html#using-a-non-global-session
Alternatively, you could make sure you are expiring objects when necessary, knowing that unexpired objects will stay in memory the way they are and will not be refreshed, and may differ from the same object in a different thread-scoped session.
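A minimal sketch of the expire-after-commit approach with the DBSession shown in the question (obj stands for whatever model instance is being saved):
DBSession.add(obj)
transaction.commit()
# Force the next attribute access to re-read from the database, since
# expire_on_commit=False keeps stale copies in the identity map.
DBSession.expire_all()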
The problem is that you're setting expire_on_commit=False. If you remove that, it should work. You can read more about what it does on http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#sqlalchemy.orm.session.Session.commit

SQLAlchemy trying to access a table in the wrong database

We use CherryPy and SQLAlchemy to build our web app and everything was fine until we tested with 2 concurrent users - then things started to go wrong! Not very good for a web app so I'd be very appreciative if anyone could shine some light on this.
TL;DR
We're getting the following error about 10% of the time when two users are using our site (but accessing different databases) at the same time:
ProgrammingError: (ProgrammingError) (1146, "Table 'test_one.other_child_entity' doesn't exist")
This table is not present in that database so the error makes sense but the problem is that SQLAlchemy shouldn't be looking for the table in that database.
I have reproduced the error in an example here https://gist.github.com/1729817
Explanation
We're developing an application that is very dynamic and is based on the entity_name pattern found at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/EntityName
We've since grown that idea so that it stores entities in different databases depending on what user you're logged in as. This is because each user in the system has their own database and can create their own entities (tables). To do this we extend a base entity for each database and then extend that new entity for each additional entity they create in their database.
When the app starts we create a dictionary containing the engine, metadata, classes and tables of all these databases and reflect all of the metadata. When a user logs in they get access to one.
When two users are accessing the site at the same time something is going wrong and SQLAlchemy ends up looking for tables in the wrong database. I guess this is to do with threading but as far as I can see we are following all the rules when it comes to sessions (CP and SQLA), engines, metadata, tables and mappers.
If anyone could give my example (https://gist.github.com/1729817) a quick glance over and point out any glaring problems that would be great.
Update
I can fix the problem by changing the code to use my own custom session router like so:
# Thank you zzzeek (http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/)
class RoutingSession(Session):
    def get_bind(self, mapper=None, clause=None):
        return databases[cherrypy.session.get('database')]['engine']
And then:
Session = scoped_session(sessionmaker(autoflush=True, autocommit=False, class_=RoutingSession))
So just hard-coding it to return the engine that's linked to the database that's set in the session. This is great news but now I want to know why my original code didn't work. Either I'm doing it wrong or the following code is not completely safe:
# Before each request (but after the session tool)
def before_request_body():
    if cherrypy.session.get('logged_in', None) is True:
        # Configure the DB session for this thread to point to the correct DB
        Session.configure(bind=databases[cherrypy.session.get('database')]['engine'])
I guess the binding that's happening here was being overwritten by the user in the other thread, which is strange because I thought scoped_session was all about thread-safety?
