how to handle exceptions in alembic migrations? - python

I have an alembic migration script and I want to add some exception handling but not sure what is the best practice.
Basically, I have several issues to handle:
A change was already made and is no longer needed (e.g. if I try to add_column and the column already exists, I want the migration to continue).
A table is locked (if I try to perform an operation on a table and it is locked, I want to raise an exception).
Other exceptions?
def upgrade():
    engine = op.get_bind().engine
    op.add_column('t_break_employee', sa.Column('auto', sa.Boolean()))
    op.add_column('t_employee', sa.Column('settings', sa.Text()))
I thought about adding a class to be used with the 'with' statement on every change. Does that sound reasonable?
For example:
def upgrade():
    engine = op.get_bind().engine
    with my_test():
        op.add_column('t_break_employee', sa.Column('auto', sa.Boolean()))
    with my_test():
        op.add_column('t_employee', sa.Column('settings', sa.Text()))
In that case, what are the exceptions I need to handle and how do I know if a table is locked?

I'm not referring to specific issues with the API you use, but I discourage you from this approach.
A migration really has only two acceptable outcomes:
The migration applied successfully.
The database is in the state it was in before the migration (because some kind of error occurred).
The proper way to deal with errors is to fix the migration. Your approach allows a third outcome:
The migration failed, but some changes were applied.
That would result in schema corruption, which is a bad thing!
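If a change may already be present (as with add_column above), a cleaner fix than catching exceptions is to inspect the live schema up front, so the migration stays deterministic. A minimal sketch, assuming the standard Alembic op object and SQLAlchemy's runtime inspector (table and column names taken from the question):

import sqlalchemy as sa
from alembic import op

def upgrade():
    # Ask the database what already exists instead of swallowing errors.
    bind = op.get_bind()
    inspector = sa.inspect(bind)
    existing = {col["name"] for col in inspector.get_columns("t_employee")}
    if "settings" not in existing:
        op.add_column("t_employee", sa.Column("settings", sa.Text()))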

Related

How to stop a "latent" Python Django cursor execute exception with SQLite

I am trying to automate a complete application schema rebuild in Django. Basically: drop all the old tables, delete the migration and cache file(s), delete the migration history, and then makemigrations and migrate. I understand this will leave some people scratching their heads, but I like to keep my test data under configuration control and load what I need from CSV files, and "starting from scratch" is much simpler with this model.
I'm hopeful the automation will work against all the different Django-supported databases, so I'm trying to come up with a generic approach (albeit with some gaps and brute force).
As noted above, the first step is to drop all the existing tables. Under the assumption that most model changes aren't deletes, I'm doing some class introspection to identify all the models and then attempting to discern if the related tables exist:
from django.db import connection, models
from django.db.utils import OperationalError

app_models = get_all_app_models(models.Model, app_label)
# Loop through and remove models that no longer exist by doing a count,
# which should work no matter the SQL DBMS.
with connection.cursor() as cursor:
    for app_model in app_models[:]:
        try:
            cursor.execute(f'select count(*) from {app_model._meta.db_table}')
        except OperationalError:
            app_models.remove(app_model)
<lots more code, doing cursor and other stuff that works OK>
Once the above completes, app_models contains tables that remain and the code then works through dropping them (which itself isn't trivial in a generic way).
The processing is contained in a Django view/form and once complete it attempts to render a simple "okey dokey" page. The problem is that an exception is thrown during the render saying "no such table: xxx" and it refers to the execute statement. There is no other call stack context displayed.
The table mentioned was indeed referenced in the execute; in fact it was the last one in the original app_models. It seems that Django retains error information after the cursor is closed (by the 'with' cleanup) and re-raises it during render processing. I tried executing a "good" SQL statement to "clear" the error, but no luck. I tried to "del" the cursor, also no luck.
Update 3/25: After many hours of trial and error, I have found that if I place the above code (along with the rest of the processing) in a separate function and call it from the view/form function, I no longer get the spurious call-back to the execute statement. I researched SQLite quite a bit, and it seems that OperationalError is handled differently than (say) DataError. I tried other approaches to clearing the error (e.g. closing the connection) to no avail. I suspect the web server is doing something tricky with the stack, treating the OperationalError incorrectly, and the function nesting "hides" it. Cheers.

What to do when database fixture teardown fails in testing?

I am working on testing database accessor methods that use DB-API 2.0 in Python with pytest. Automated testing is new to me, and I can't seem to figure out what should be done when testing a database with fixtures. I would like to check whether getting fields in a table is successful. To get a consistent result, I intend to add a row entry every time I run the tests and delete the row after each test that depends on it. The terms I have heard are 'setUp' and 'tearDown', although I have also read that using yield is the newer syntax.
The conceptual question I would like to answer before writing code is:
What happens when the 'tearDown' portion of the fixture fails? How do I return the database to the same state, without the added row entry? Is there a way of recovering from this? (I still need the rest of the data in the database.)
I read this article [with unittest] that explains what runs when setup and teardown methods fail, but it falls short of answering my question.
A common practice is to run each test in its own database transaction, so that no matter what happens, the changes are rolled back and the database is returned to a clean state.
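A minimal sketch of that pattern with a pytest fixture and a DB-API connection (the driver, file name, and table are illustrative, not from the question):

import pytest
import sqlite3  # stand-in for whatever DB-API 2.0 driver you use

@pytest.fixture
def db_conn():
    conn = sqlite3.connect("test.db")
    try:
        # setUp: add the row the test depends on (inside an open transaction)
        conn.execute("INSERT INTO employee (name) VALUES ('test-row')")
        yield conn
    finally:
        # tearDown: roll the whole transaction back; the database returns to
        # its prior state even if the test body or teardown logic fails
        conn.rollback()
        conn.close()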

How do I drop custom types using SQLAlchemy + PostgreSQL?

I have a script which cleans the database, and this is widely used in our tests.
First, we tried the SQLAlchemy MetaData.drop_all() approach, but it didn't resolve some foreign keys on deletion, which caused errors. Then I found this script from #zzzeek, which does almost the same thing, but in a "smart" way. It handles all the issues with foreign keys, but now there are several issues regarding changed custom types (ENUMs). The question is: how can I drop them all using SQLAlchemy, or must I execute DROP TYPE by hand?
Tables in the database are created with Alembic, and even though the script above deletes all tables successfully, some custom ENUMs are still there, and everything fails on the attempt to recreate them.
Recreating the whole database is not a preferred solution, because default DB user for application shouldn't normally have rights to create databases.
Are you sure your MetaData instance fully describes all the tables?
Try:

metadata.reflect()
metadata.drop_all()
This is an ancient question, but it's still possible to come across this problem if create_type=False is on any postgresql.ENUM definitions.
Per the SQLAlchemy docs on create_type,
When False, no check will be performed and no CREATE TYPE or DROP TYPE is emitted, unless ENUM.create() or ENUM.drop() are called directly.
That means when running tests, while there may be a setup & teardown with create_all() and drop_all(), neither will affect custom enum types.
The solution is simply to remove create_type=False, since True is the default. Then all custom types will be created at the beginning of testing and dropped at the end, resulting in a perfectly clean test database.
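For illustration, a hypothetical column definition showing the pitfall (all names are made up):

import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

# With create_type=False, create_all()/drop_all() emit no CREATE TYPE or
# DROP TYPE, so the "employee_status" type lingers in the database.
# Removing the flag (True is the default) lets SQLAlchemy manage the type.
status = sa.Column(
    "status",
    postgresql.ENUM("active", "on_leave", name="employee_status"),
)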

How to avoid caching in sqlalchemy?

I have a problem with SQLAlchemy - my app runs as a long-lived Python application.
I have a function like this:
def myFunction(self, param1):
    s = select([statsModel.c.STA_ID, statsModel.c.STA_DATE])\
        .select_from(statsModel)
    statsResult = self.connection.execute(s).fetchall()
    return {'result': statsResult, 'calculation': param1}
I think this is a clear example - one result set is fetched from the database, the other value is just passed through as an argument.
The problem is that when I change data in my database, this function still returns the data as if nothing had changed. When I change the input parameter, the returned "calculation" value is correct.
When I restart the app server, things return to normal - fresh data are fetched from MySQL.
I know that there were several questions about SQLAlchemy caching like:
How to disable caching correctly in Sqlalchemy orm session?
How to disable SQLAlchemy caching?
but how else can I describe this situation? It seems SQLAlchemy keeps the data fetched before and does not perform new queries until the application restarts. How can I avoid such behavior?
Calling session.expire_all() will evict all database-loaded data from the session. Any subsequent access of object attributes emits a new SELECT statement and gets fresh data back. Please see http://docs.sqlalchemy.org/en/latest/orm/session_state_management.html#refreshing-expiring for background.
If you still see so-called "caching" after calling expire_all(), then you need to close out transactions as described in my answer linked above.
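A minimal sketch of that call, assuming an ORM Session named session and a mapped class Stats (both illustrative; the question itself uses the Core API):

session.expire_all()               # evict all database-loaded state
rows = session.query(Stats).all()  # emits a fresh SELECT, returns new data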
A few possibilities.
You are reusing your session improperly or at an improper time. Best practice is to throw away your session after you commit, and get a new one at the last possible moment before you use it. The behavior that appears to be caching may in fact be due to a session lifetime being very long in your application.
Objects that survive longer than the session are not being merged into a subsequent session. The session may not be able to update their state if you do not merge them back in. This is more of a concern for SQLAlchemy's ORM API, which you do not appear, so far, to be using.
Your changes are not committed. You say they are, so we'll assume this is not it, but if none of the other avenues explain it you may want to look again.
One general debugging tip: if you want to know exactly what SQLAlchemy is doing in the database, pass echo=True to the create_engine function. The engine will print all queries it runs.
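For example (connection URL is a placeholder):

from sqlalchemy import create_engine

# echo=True logs every SQL statement the engine emits.
engine = create_engine("mysql://user:pass@localhost/mydb", echo=True)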
Also check out this suggestion I made to someone else, who was using ORM and had transactionality problems, which resolved their issue without ever pinpointing it. Maybe it will help you.
You need to change the transaction isolation level to READ COMMITTED.
http://docs.sqlalchemy.org/en/rel_0_9/dialects/mysql.html#mysql-isolation-level
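With SQLAlchemy, the level can be set when the engine is created; a sketch with a placeholder URL:

from sqlalchemy import create_engine

# READ COMMITTED lets each new transaction see rows committed by others,
# instead of the repeatable-read snapshot InnoDB uses by default.
engine = create_engine(
    "mysql://user:pass@localhost/mydb",
    isolation_level="READ COMMITTED",
)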

Using SQLite in a Python program

I have created a Python module that creates and populates several SQLite tables. Now, I want to use it in a program, but I don't really know how to call it properly. All the tutorials I've found are essentially "inline", i.e. they walk through using SQLite in a linear fashion rather than showing how to actually use it in production.
What I'm trying to do is have a method check whether the database is already created. If so, I can use it. If not, an exception is raised and the program creates the database. (Or use if/else statements, whichever is better.)
I created a test script to see if my logic is correct, but it's not working. When I run the try statement, it just creates a new database rather than checking whether one already exists. The next time I run the script, I get an error that the table already exists, even though I tried catching the exception. (I haven't used try/except before but figured this is a good time to learn.)
Are there any good tutorials for using SQLite operationally or any suggestions on how to code this? I've looked through the pysqlite tutorial and others I found but they don't address this.
Don't make this more complex than it needs to be. The big, independent databases have complex setup and configuration requirements. SQLite is just a file you access with SQL; it's much simpler.
Do the following.
Add a table to your database for "Components" or "Versions" or "Configuration" or "Release" or something administrative like that.
CREATE TABLE REVISION(
    RELEASE_NUMBER CHAR(20)
);
In your application, connect to your database normally.
Execute a simple query against the revision table. Here's what can happen.
The query fails to execute: your database doesn't exist, so execute a series of CREATE statements to build it.
The query succeeds but returns no rows or the release number is lower than expected: your database exists, but is out of date. You need to migrate from that release to the current release. Hopefully, you have a sequence of DROP, CREATE and ALTER statements to do this.
The query succeeds, and the release number is the expected value. Do nothing more, your database is configured correctly.
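A sketch of that check in Python (build_schema and migrate_schema are hypothetical helpers holding your CREATE and DROP/CREATE/ALTER statements; the version string is illustrative):

import sqlite3

EXPECTED_RELEASE = "1.2"

def ensure_schema(conn):
    try:
        row = conn.execute("SELECT RELEASE_NUMBER FROM REVISION").fetchone()
    except sqlite3.OperationalError:
        build_schema(conn)       # no REVISION table: database doesn't exist yet
        return
    if row is None or row[0] < EXPECTED_RELEASE:
        migrate_schema(conn)     # database exists but is out of date
    # otherwise the release number matches; nothing more to do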
AFAIK an SQLITE database is just a file.
To check if the database exists, check for file existence.
When you open a SQLITE database, it will automatically create one if the file backing it is not in place.
If you try and open a file as a sqlite3 database that is NOT a database, you will get this:
"sqlite3.DatabaseError: file is encrypted or is not a database"
So check whether the file exists, and also make sure to catch the exception in case the file is not a sqlite3 database.
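Putting those two checks together (the path is illustrative):

import os
import sqlite3

path = "app.db"
if os.path.exists(path):
    try:
        con = sqlite3.connect(path)
        con.execute("SELECT 1 FROM sqlite_master LIMIT 1")
    except sqlite3.DatabaseError:
        print("file exists but is not a SQLite database")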
SQLite automatically creates the database file the first time you try to use it. The SQL statements for creating tables can use IF NOT EXISTS to make the commands take effect only if the table has not been created. This way you don't need to check for the database's existence beforehand: SQLite can take care of that for you.
The main thing I would still be worried about is that executing CREATE TABLE IF NOT EXISTS for every web transaction (say) would be inefficient; you can avoid that by having the program keep an (in-memory) variable saying whether it has created the database in the current run, so it runs the CREATE TABLE script only once per run. This would still allow you to delete the database and start over during debugging.
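For example (the file and table names are illustrative; the schema matches the example further down):

import sqlite3

con = sqlite3.connect("app.db")  # the file is created automatically if missing
# Runs cleanly whether or not the table was created on an earlier run.
con.execute("CREATE TABLE IF NOT EXISTS stocks "
            "(date TEXT, trans TEXT, symbol TEXT, qty REAL, price REAL)")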
As #diciu pointed out, the database file will be created by sqlite3.connect.
If you want to take a special action when the file is not there, you'll have to explicitly check for existence:
import os
import sqlite3

if not os.path.exists(mydb_path):
    # create new DB, create table stocks
    con = sqlite3.connect(mydb_path)
    con.execute('''create table stocks
                   (date text, trans text, symbol text, qty real, price real)''')
else:
    # use existing DB
    con = sqlite3.connect(mydb_path)
...
Sqlite doesn't throw an exception if you create a new database with the same name; it will just connect to it. Since sqlite is a file-based database, I suggest you just check for the existence of the file.
About your second problem: to check whether a table has already been created, just catch the exception. An exception "sqlite3.OperationalError: table TEST already exists" is thrown if the table already exists.
import os
import sqlite3

database_name = "newdb.db"
if os.path.isfile(database_name):
    print("the database already exists")
db_connection = sqlite3.connect(database_name)
db_cursor = db_connection.cursor()
try:
    db_cursor.execute('CREATE TABLE TEST (a INTEGER);')
except sqlite3.OperationalError as msg:
    print(msg)
Writing raw SQL is horrible in any language I've picked up. SQLAlchemy has proven to be the easiest to use, because querying and committing with it is so clean and trouble-free.
Here are some basic steps for actually using SQLAlchemy in your app; better details can be found in the documentation.
provide table definitions and create ORM-mappings
load database
ask it to create tables from the definitions (won't do so if they exist)
create session maker (optional)
create session
After creating a session, you can commit and query from the database.
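A compact sketch of those steps (the model, file name, and URL are illustrative; declarative_base lives in sqlalchemy.orm as of SQLAlchemy 1.4):

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Stock(Base):                           # table definition + ORM mapping
    __tablename__ = "stocks"
    id = Column(Integer, primary_key=True)
    symbol = Column(String)

engine = create_engine("sqlite:///app.db")   # load database
Base.metadata.create_all(engine)             # create tables (skips existing ones)
Session = sessionmaker(bind=engine)          # session maker
session = Session()                          # session: now commit and query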
See this solution at SourceForge which covers your question in a tutorial manner, with instructive source code :
y_serial.py module :: warehouse Python objects with SQLite
"Serialization + persistance :: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful "standard" module for a database to store schema-less data."
http://yserial.sourceforge.net
Yes, I was overcomplicating the problem. All I needed to do was check for the file and catch the IOError if it didn't exist.
Thanks for all the other answers. They may come in handy in the future.
