django: Proper way to recover from IntegrityError

What's the proper way to recover from an IntegrityError, or any other errors that could leave my transactions screwed up without using manual transaction control?
In my application, I'm running into problems with IntegrityErrors that I want to recover from, that screw up later database activity, leaving me with:
DatabaseError: current transaction is aborted, commands ignored until end of transaction block
for all database activity after ignoring IntegrityErrors.
This block of code should reproduce the error I'm seeing:
from django.db import IntegrityError, transaction
try:
    MyModel.save()  # Do a bad save that will raise IntegrityError
except IntegrityError:
    pass
MyModel.objects.all()  # raises DatabaseError: current transaction is aborted, commands ignored until end of transaction block
According to the docs, the solution to recover from an IntegrityError is by rolling back the transaction. But the following code results in a TransactionManagementError.
from django.db import IntegrityError, transaction
try:
    MyModel.save()
except IntegrityError:
    transaction.rollback()  # raises TransactionManagementError: This code isn't under transaction management
MyModel.objects.all()  # Should work
EDIT: I'm confused by the message from the TransactionManagementError, because if in my except I do a:
connection._cursor().connection.rollback()
instead of the django transaction.rollback(), the MyModel.objects.all() succeeds, which doesn't make sense if my code "isn't under transaction management". It also doesn't make sense that code that isn't under transaction management (which I assume means it's using autocommit), can have transactions that span multiple queries.
EDIT #2: I'm aware of using manual transaction control to be able to recover from these errors, but shouldn't I be able to recover without manual transaction control? My understanding is that if I'm using autocommit, there should only be one write per transaction, so it should not affect later database activity.
EDIT #3: This is a couple years later, but in django 1.4 (not sure about later versions), another issue here was that Model.objects.bulk_create() doesn't honor autocommit behavior.
Versions:
Django: 1.4 (TransactionMiddleware is not enabled)
Python: 2.7
Postgres: 9.1

Django's default commit mode is AutoCommit. In order to do rollback, you need to wrap the code doing the work in a transaction. [docs]
with transaction.commit_on_success():
    # Your code here. Errors will auto-rollback.
To get database level autocommit, you will require the following option in your DATABASES settings dictionary.
'OPTIONS': {'autocommit': True,}
Alternately, you can use explicit savepoints to roll back to. [docs]
@transaction.commit_manually
def viewfunc(request):
    a.save()
    # open transaction now contains a.save()
    sid = transaction.savepoint()
    b.save()
    # open transaction now contains a.save() and b.save()
    if want_to_keep_b:
        transaction.savepoint_commit(sid)
        # open transaction still contains a.save() and b.save()
    else:
        transaction.savepoint_rollback(sid)
        # open transaction now contains only a.save()
    transaction.commit()
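The savepoint pattern can also be tried outside Django. What follows is only a minimal sketch using Python's standard-library sqlite3 module, not Django's transaction API; the item table and the insert_with_savepoints helper are hypothetical names made up for this illustration. Note that SQLite, unlike PostgreSQL, does not put the whole transaction into an aborted state after a failed statement, so the sketch shows only the shape of the pattern: wrap each risky write in a savepoint and roll back to it when an IntegrityError is raised.

```python
import sqlite3


def insert_with_savepoints(conn, names):
    """Insert each name; on a constraint violation undo only that insert.

    Hypothetical helper for illustration. Returns the names that were skipped.
    """
    skipped = []
    conn.execute("BEGIN")
    for i, name in enumerate(names):
        savepoint = "sp_{}".format(i)
        conn.execute("SAVEPOINT {}".format(savepoint))
        try:
            conn.execute("INSERT INTO item (name) VALUES (?)", (name,))
        except sqlite3.IntegrityError:
            # Undo only the work done since the savepoint was taken.
            conn.execute("ROLLBACK TO SAVEPOINT {}".format(savepoint))
            skipped.append(name)
        # Drop the savepoint; the outer transaction stays open.
        conn.execute("RELEASE SAVEPOINT {}".format(savepoint))
    conn.execute("COMMIT")
    return skipped


# isolation_level=None disables the module's implicit BEGIN,
# so the transaction boundaries are exactly the ones written above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE item (name TEXT UNIQUE NOT NULL)")

skipped = insert_with_savepoints(conn, ["a", "b", "a", "c"])
rows = [r[0] for r in conn.execute("SELECT name FROM item ORDER BY name")]
print(skipped, rows)
```

The duplicate value is skipped while the other inserts in the same transaction are still committed, which is the behaviour the savepoint approach is meant to give.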

Related

SQLAlchemy/Postgres: Intermittent Error Serializing Object After Commit

I have a Flask application that uses SQLAlchemy (with some Marshmallow for serialization and deserialization).
I'm currently encountering some intermittent issues when trying to dump an object post-commit.
To give an example, let's say I have implemented a (multi-tenant) system for tracking system faults of some sort. This information is contained in a fault table:
class Fault(Base):
    __tablename__ = "fault"
    fault_id = Column(BIGINT, primary_key=True)
    workspace_id = Column(Integer, ForeignKey('workspace.workspace_id'))
    local_fault_id = Column(Integer)
    name = Column(String)
    description = Column(String)
I've removed a number of columns in the interest of simplicity, but this is the core of the model. The columns should be largely self explanatory, with workspace_id effectively representing tenant, and local_fault_id representing a tenant-specific fault sequence number, which is handled via a separate fault_sequence table.
This fault_sequence table holds a counter against workspace, and is updated by means of a simple on_fault_created() function that is executed by a trigger:
CREATE TRIGGER fault_created
    AFTER INSERT
    ON "fault"
    FOR EACH ROW
    EXECUTE PROCEDURE on_fault_created();
So - the problem:
I have a Flask endpoint for fault creation, where we create an instance of a Fault entity, add this via a scoped session (session.add(fault)), then call session.commit().
It seems that this is always successful in creating the desired entities in the database, executing the sequence update trigger etc. However, when I then try to interrogate the fault object for updated fields (after commit()), around 10% of the time I find that each key/field just points to an Exception:
psycopg2.errors.InFailedSqlTransaction: current transaction is aborted, commands ignored until end of transaction block
Which seems to boil down to the following:
(psycopg2.errors.InvalidTextRepresentation) invalid input syntax for integer: ""
[SQL: SELECT fault.fault_id AS fault_fault_id, fault.workspace_id AS fault_workspace_id, fault.local_fault_id AS fault_local_fault_id, fault.name as fault_name, fault.description as fault_description
FROM fault
WHERE fault.fault_id = %(param_1)s]
[parameters: {'param_1': 166}]
(Background on this error at: http://sqlalche.me/e/13/2j8
My question, then, is what do we think could be causing this?
I think it smells like a race condition, with the update trigger not being complete before SQLAlchemy has tried to get the updated data; perhaps local_fault_id is null, and this is resulting in the invalid input syntax error.
That said, I have very low confidence on this. Any guidance here would be amazing, as I could really do with retrieving that sequence number that's incremented/handled by the update trigger.
Thanks
Edit 1:
Some more info:
I have tried removing the update trigger, in the hope of eliminating that as a suspect. This behaviour is still intermittently evident, so I don't think it's related to that.
I have tried adopting usage of flush and refresh before the commit, and this allows me to get the values that I need - though commit still appears to 'break' the fault object.
Edit 2:
So it really seems to be more postgres than anything else. When I interrogate my database logs, this is the weirdest thing. I can copy and paste the command it says is failing, and I struggle to see how this integer value in the WHERE clause is possibly evaluating to an empty string.
This same error is reproducible with SELECT ... FROM fault WHERE fault.fault_id = '', which in no way seems to be the query making to the DB.
I am stumped.

Your sentence "This same error is reproducible with SELECT ... FROM fault WHERE fault.fault_id = '', which in no way seems to be the query making to the DB." seems to indicate that you are trying to access an object that does not have the database primary key "fault_id".
I guess, given that you did not provide the code, that you are adding the object to your session (session.add), committing (session.commit) and then using the object. As fault_id is autogenerated by the database, the fault object in the session (in memory) does not have fault_id.
I believe you can correct this with:
session.add(fault)
session.commit()
session.refresh(fault)
The refresh needs to be AFTER commit to refresh the fault object and retrieve fault_id.
If you are using async, you need
session.add(fault)
await session.commit()
await session.refresh(fault)

How to fetch warnings from a django mysql query?

I'm trying to get a list of warnings after a MySQL query, using Django admin. I can see from the documentation here that it's possible to record warnings by setting connection.get_warnings to true. But I can't find anything explaining how to read those warnings.
I do not want to throw an exception - I am deleting items from the database using a DELETE IGNORE statement and want to get all instances of deletions that failed (due to foreign keys, etc.).
I've tried returning the result of the execute function itself (just gave me a number) and calling fetchwarnings() on the cursor (threw a "Cursor object has no attribute fetchwarnings" error).
I'm still new to both Python and Django. I'm looking through all the documentation I can find but can't find anything that works.
from django.db import connection
query = "{query here}"
connection.get_warnings = True
with connection.cursor() as cursor:
    cursor.execute(query)  # <-- Returns a number
    return cursor.fetchwarnings()  # <-- Throws an error

Missing table name in IntegrityError (Django ORM)

I am missing the table name in IntegrityError of Django:
Traceback (most recent call last):
  ...
    return self.cursor.execute(sql, params)
  File ".../django/db/utils.py", line 94, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File ".../django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
IntegrityError: null value in column "manager_slug" violates not-null constraint
DETAIL: Failing row contains (17485, null, 2017-10-10 09:32:19, , 306).
Is there a way to see which table the INSERT/UPDATE is accessing?
We use PostgreSQL 9.6.
This is a generic question: How to get a better error message?
This is not a question about this particular column. I found the relevant table and column very soon. But I want to improve the error message which is from our CI system. The next time I want to see the table name immediately.
I know that I could reveal the missing information easily with a debugger if I see this error during software development. But in my case this happens in production, and I have only the stacktrace like above.

The exception message in this traceback is the original message from the database driver. That is worth knowing, together with the traceback, when you search for the error or report it.
The exception class is the same django.db.utils.IntegrityError for all backends, but the message, or rather the arguments, depend on the backend:
postgres: null value in column "manager_slug" violates not-null constraint\n DETAILS...\n
mysql . . : (1048, "Column 'manager_slug' cannot be null")
sqlite3 . : NOT NULL constraint failed: appname_modelname.manager_slug
The table name is visible only with the sqlite3 backend. Some backends use only a single string argument for the exception, but mysql uses two arguments: a numeric error code and a message. (I take this to be a general question, not only about PostgreSQL.) Authors of some backends expect that the author of the app will know the table name directly or from the SQL, but that does not hold for general ORM packages. There is no preferred and generally accepted way to extend the message, even if it could be done technically perfectly.
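The sqlite3 case can be checked without Django. This is a small sketch that talks to SQLite directly through the standard-library sqlite3 module, so it raises sqlite3.IntegrityError rather than django.db.utils.IntegrityError; the table name is simply copied from the sqlite3 line above, and the exact wording of the message may differ between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE appname_modelname ("
    "id INTEGER PRIMARY KEY, manager_slug TEXT NOT NULL)"
)

message = ""
try:
    # Violates the NOT NULL constraint on manager_slug.
    conn.execute("INSERT INTO appname_modelname (manager_slug) VALUES (NULL)")
except sqlite3.IntegrityError as exc:
    # The driver's message names both the table and the column.
    message = str(exc)

print(message)
```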
Development and debugging are easy:
Much additional information is available in DEBUG mode in development ("SQL" in the last frame or a class name of an object on a line like "myobj.save()")
python manage.py test --debug-sql: "Prints logged SQL queries on failure."
The same error in development/tests with sqlite3 is easier to read.
...but you are probably asking about a run-time error in production.
Since the question is so general, I can only guess which direction would be interesting for you.
A) The most important information in the traceback is usually a few lines above the many lines with ".../django/db/...". It is easy to find for an experienced reader. It will very probably work unless the code is as dynamic and general as the Django admin site, where no code near the myobj.save() call (nor in the parent frames) contains an explicit model name. Example:
# skip some initial universal code in ".../django/..."
...
# our apps start to be interesting... (maybe other installed app)
...
# START HERE: Open this line in the editor. If the function is universal, jump to the previous.
  File ".../me/app/...py", line 47, in my...
    my_obj.save()
# skip many stack frames .../django/db/... below
  File ".../django/db/models/base.py", line 734, in save
    # self.save_base(... # this line 733 is not visible
    force_update=force_update, update_fields=update_fields)
...
# interesting only sql and params, but not visible in production
  File ".../django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
IntegrityError (or DataError similarly)...
B) Catch the information by a common ancestor of your models
class ...(models.Model):
    def save(self, *args, **kwargs):
        try:
            super(..., self).save(*args, **kwargs)
        except django.db.utils.IntegrityError as exc:
            new_message = 'table {}'.format(self._meta.db_table)
            exc.extra_info = new_message
            # this is less compatible, but it doesn't require additional reading support
            # exc.args = exc.args + (new_message,)
            raise
This could complicate debugging with multiple inheritance.
C) An implementation in Django db would be better, but I can not imagine that it will be accepted and not reverted after some issue.
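The idea in option B, catching the error, attaching the table name and re-raising, can be sketched without Django. The example below is an assumption-laden stand-in: it uses the standard-library sqlite3 module instead of a model's save(), and annotate_integrity_error and the teachers table are hypothetical names. It uses the "less compatible" variant from the comment in B, appending to exc.args.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def annotate_integrity_error(table_name):
    """Append the table name to any IntegrityError raised inside the block."""
    try:
        yield
    except sqlite3.IntegrityError as exc:
        exc.args = exc.args + ("table {}".format(table_name),)
        raise  # re-raise the same exception object, now annotated


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teachers (manager_slug TEXT NOT NULL)")

caught = None
try:
    with annotate_integrity_error("teachers"):
        conn.execute("INSERT INTO teachers (manager_slug) VALUES (NULL)")
except sqlite3.IntegrityError as exc:
    caught = exc

print(caught.args)
```

Whatever logs or reports the exception then sees the table name as the last element of args, without needing to know about an extra attribute.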

If you can create an SQL function, you can try this:
Create a function that returns the last value of a sequence, get_sequence_last_value (original post):
CREATE FUNCTION public.get_sequence_last_value(name) RETURNS int4 AS '
DECLARE
    ls_sequence ALIAS FOR $1;
    lr_record RECORD;
    li_return INT4;
BEGIN
    FOR lr_record IN EXECUTE ''SELECT last_value FROM '' || ls_sequence LOOP
        li_return := lr_record.last_value;
    END LOOP;
    RETURN li_return;
END;' LANGUAGE 'plpgsql' VOLATILE;
After that, find the tables whose sequence value is greater than the id in the error message (17485) and that have a manager_slug column:
SELECT table_name, column_name
FROM information_schema.columns
WHERE table_name in (
    SELECT table_name
    FROM (
        SELECT table_name,
               get_sequence_last_value(
                   substr(column_default, 10, strpos(column_default, '::regclass') - 11)
               ) as lv
        FROM information_schema.columns
        WHERE column_default LIKE 'nextval%'
    ) as t_seq_lv
    WHERE lv > 17485
)
AND column_name = 'manager_slug';
I understand that this is not a complete solution, but I hope it can help you anyway.

I would suggest using Sentry (https://sentry.io/welcome/). In Sentry Issues you can observe all local variables for all parts of a stack trace.

The best solution that I have found to your problem is to patch DatabaseErrorWrapper. To do that, go to \django\db\utils.py and on line 86 replace dj_exc_value = dj_exc_type(*exc_value.args) with:
if exc_value.diag:
    a, b = exc_value.args + ("In the table '%s'" % exc_value.diag.table_name,)
    dj_exc_value = dj_exc_type(a + b)
else:
    dj_exc_value = dj_exc_type(*exc_value.args)
If an IntegrityError is then raised, the message should look like this:
django.db.utils.IntegrityError: null value in column "signature" violates not-null constraint
DETAIL: Failing row contains (89, null).
In the table 'teachers_signature'
I am not sure but this should work with these exceptions:
DataError
OperationalError
IntegrityError
InternalError
ProgrammingError
NotSupportedError
It works for me; tell me if it works for you. Remember to edit the file in the Django installation that you are actually working with.

You are not setting a value for the column "manager_slug". You can't store NULL in this column. You should either set a value or remove the not-null constraint.
IntegrityError: null value in column "manager_slug" violates not-null constraint

Is there a way to see which table the INSERT/UPDATE is accessing?
If you're running migrate, you could just scroll up the stdout and check which migration is being executed. Then you can open the migration file and get a better overview of the issue.
If you aren't, I guess you want to take a look at PEP 249 for the optional error handling extensions, as the Django database wrapper is based on the PEP 249 specification.
References on django code within DatabaseErrorWrapper
2nd EDIT:
You could catch the integrity error and access the .messages attribute from the database wrapper.
Pseudo example:
try:
    # operation on database
except IntegrityError as ie:
    print(ie.wrapper.cursor.messages[:])

I tend to use as few dependencies as possible and leave 3rd party libs as they are.
Most of the time, I log every SQL insert/update/delete query. This way I can easily identify which query went wrong without extra effort. It also allows me to track who did what and when in my system.
That happens to meet part of regulatory requirement on actions tracking in my industry.
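As a rough illustration of logging every write query, here is a sketch at the driver level using the standard-library sqlite3 module and its set_trace_callback hook. This is an assumption about how such logging could be wired up, not the setup described above; the item table, the logged list and the log_write function are hypothetical names, and a real system would write to a proper log with user and timestamp information.

```python
import sqlite3

logged = []


def log_write(statement):
    """Record only statements that modify data."""
    if statement.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE")):
        logged.append(statement)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT)")

# From here on, every statement the connection executes is passed to log_write.
conn.set_trace_callback(log_write)

conn.execute("INSERT INTO item (name) VALUES ('a')")
conn.execute("SELECT name FROM item").fetchall()  # traced, but filtered out
conn.execute("DELETE FROM item")
conn.commit()

for statement in logged:
    print(statement)
```

With such a log, the failing statement, including its table name, is available even when the exception message itself does not mention the table.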

Control Atomic Transactions in Django

I have a simple library application. In order to force 3 actions to commit as one action, and rollback if any of the actions fail, I made the following code changes:
In settings.py:
AUTOCOMMIT=False
In forms.py:
from django.db import IntegrityError, transaction
class CreateLoan(forms.Form):
    #Fields...
    def save(self):
        id_book = self.cleaned_data.get('id_book', None)
        id_customer = self.cleaned_data.get('id_customer', None)
        start_date = self.cleaned_data.get('start_date', None)
        book = Book.objects.get(id=id_book)
        customer = Customer.objects.get(id=id_customer)
        new_return = Return(
            book=book,
            start_date=start_date)
        txn=Loan_Txn(
            customer=customer,
            book=book,
            start_date=start_date
        )
        try:
            with transaction.atomic():
                book.update(status="ON_LOAN")
                new_return.save(force_insert=True)
                txn.save(force_insert=True)
        except IntegrityError:
            raise forms.ValidationError("Something occurred. Please try again")
Am I still missing anything with regards to this? I'm using Django 1.9 with Python 3.4.3 and the database is MySQL.

You're using transaction.atomic() correctly (including putting the try ... except outside the transaction) but you should definitely not be setting AUTOCOMMIT = False.
As the documentation states, you set that system-wide setting to False when you want to "disable Django’s transaction management"—but that's clearly not what you want to do, since you're using transaction.atomic()! More from the documentation:
If you do this, Django won’t enable autocommit, and won’t perform any commits. You’ll get the regular behavior of the underlying database library.
This requires you to commit explicitly every transaction, even those started by Django or by third-party libraries. Thus, this is best used in situations where you want to run your own transaction-controlling middleware or do something really strange.
So just don't do that. Django will of course disable autocommit for that atomic block and re-enable it when the block finishes.
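The structure praised above, an all-or-nothing block with the try ... except placed outside it, can be tried without Django. The following is only an analogy under stated assumptions: it uses the standard-library sqlite3 connection as a context manager in place of transaction.atomic(), and the loan table and create_loans function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan (book TEXT NOT NULL, customer TEXT NOT NULL)")
conn.commit()


def create_loans(rows):
    """Insert all rows or none of them; return True on success."""
    try:
        # 'with conn' commits if the block succeeds and rolls back if it raises.
        with conn:
            for book, customer in rows:
                conn.execute(
                    "INSERT INTO loan (book, customer) VALUES (?, ?)",
                    (book, customer),
                )
    except sqlite3.IntegrityError:
        # Handled outside the block, after the rollback has already happened.
        return False
    return True


# The second row violates NOT NULL, so the first insert is rolled back too.
ok = create_loans([("Dune", "alice"), ("Emma", None)])
count = conn.execute("SELECT COUNT(*) FROM loan").fetchone()[0]
print(ok, count)
```

Catching the exception outside the block matters because the rollback is triggered by the exception leaving the block; swallowing it inside would let the block commit the partial work.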

Delete Query in web2py not working

This is the function for deleting a record from database.
def pro_del():
    d = request.get_vars.d
    db(db.products.product_id == d).delete()
    session.flash = "Product Deleted"
    redirect(URL('default','index'))
    #return locals()
The id is successfully passed to the function through get_vars (meaning d gets its value). I checked this by returning locals().
The redirection is also working fine. It's also flashing the message.
Only the query is not working. The record is not deleted from the database.
Note: 'd' is alphanumeric here.

From web2py's DAL documentation:
No create, drop, insert, truncate, delete, or update operation is actually committed until web2py issues the commit command. In models, views and controllers, web2py does this for you, but in modules you are required to do the commit.
Have you tried db.commit() after your .delete() ?
