Preserve data in migrations - python

With this setup:
- Development environment
- Flask
- SQLAlchemy
- Postgres
- Possibly Alembic

I have a database with some tables populated with random data. As far as I know, Flask-Migrate (which uses Alembic) will not preserve the data; it will only keep the models and the database synchronized.
But what is the difference between using Alembic and just dropping and re-creating all the tables?
Something like:
db.create_all()
The second question:
What happens to the data when something changes in the models? Will the data be lost, or can Alembic preserve the previously populated data?
Well, my idea is just to populate the database with some data and then avoid any loss of data when the models change. Is Alembic the solution?
Or do I need to re-import the data, from a .sql file for example, whenever I change the models and the database?

I am the Flask-Migrate author.
You are not correct. Flask-Migrate (through Alembic) will always preserve the data that you have in your database. That is the whole point of working with database migrations: you do not want to lose your data.
It seems you already have a database with data in it and you want to start using migrations. You have two options to incorporate Flask-Migrate into your project:
Option 1: Only track migrations going forward, i.e. leave your initial database schema outside of migration tracking.
For this you really have nothing special to do. Just do manage.py db init to create the migrations repository and when you need to migrate your database do so normally with manage.py db migrate. The disadvantage of this method is that Flask-Migrate/Alembic do not have the initial schema of the database, so it is not possible to recreate a database from scratch.
Option 2: Implement an initial migration that brings your database to your current state, then continue tracking future migrations normally.
This requires a little bit of a trick. Here you want Alembic to record an initial migration that defines your current schema. Since Alembic creates migrations by comparing your models to your database, the trick is to replace your real database with an empty database and then generate a migration. After the initial migration is recorded, you restore your database, and from then on you can continue migrating your database normally.
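For illustration, a rough sketch of that trick using Flask-Migrate's programmatic init/migrate/stamp functions (the equivalents of the manage.py db commands). It assumes Flask-Migrate is set up inside an application factory; create_app, its database_uri parameter and the empty database URI are placeholders for your own project, not part of Flask-Migrate itself. Stamping the real database afterwards is one way to mark it as already being at the baseline revision without touching its data.

from flask_migrate import init, migrate, stamp
from myapp import create_app   # hypothetical application factory that sets up Migrate(app, db)

# Step 1: point the app at a brand-new, empty database and record the
# baseline migration by comparing the models against that empty database.
app = create_app(database_uri='sqlite:///empty_baseline.db')   # placeholder URI
with app.app_context():
    init()                              # create the migrations/ repository
    migrate(message='initial schema')   # autogenerate the initial migration

# Step 2: point the app back at the real, populated database and mark it as
# already being at that baseline revision, so no data is ever touched.
app = create_app()                      # real database URI from your config
with app.app_context():
    stamp(revision='head')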
I hope this helps. Let me know if you have any more questions.

Related

Can two separate sets of Alembic migrations be applied to the same database?

Background: Airflow uses Alembic to apply migrations to the database it uses to store DAG/task metadata. I want to store some other data in this database, and would like to track my schema changes through Alembic migrations. It can be assumed that my migrations will be limited to creating/modifying new tables, without altering any of the tables that Airflow creates and uses.
Will the fact that there are two sets of migrations (one in the Airflow source code, and one in my application code) cause any issues?
Even if you use the same DB server, I suggest using a separate schema/database for the application's tables.
This way, when you pass the connection string in the env.py that runs your migrations, it will use a different alembic_version table, so the two sets of migrations won't collide.
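As a rough sketch of what that could look like in the application's own env.py (the URL, the "app" schema, the version table name and the models import are all placeholders, not anything Airflow or Alembic prescribes):

from alembic import context
from sqlalchemy import engine_from_config, pool
from myapp.models import Base        # hypothetical: your application's models

config = context.config
# Point only at the application's database/schema, never at Airflow's.
config.set_main_option("sqlalchemy.url", "postgresql://user:pass@dbhost/app_db")
target_metadata = Base.metadata

connectable = engine_from_config(
    config.get_section(config.config_ini_section),
    prefix="sqlalchemy.",
    poolclass=pool.NullPool,
)

with connectable.connect() as connection:
    context.configure(
        connection=connection,
        target_metadata=target_metadata,
        version_table="alembic_version_app",   # keep a version table separate from Airflow's
        version_table_schema="app",            # and/or keep it in the application's own schema
    )
    with context.begin_transaction():
        context.run_migrations()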

How do I set up Alembic for a SQLite database attached as a schema?

I've tried many contortions on this problem to try to figure out what's going on.
My SQLAlchemy code specified tables as schema.table. I have a special connection object that connects using the specified connect string if the database is PostgreSQL or Oracle, but if the database is SQLite, it connects to a :memory: database, then attaches the SQLite file-based database using the schema name. This allows me to use schema names throughout my SQLAlchemy code without a problem.
But when I try to set up Alembic to see my database, it fails completely. What am I doing wrong?
I ran into several issues that had to be worked through before I got this working.
Initially, Alembic didn't see my database at all. If I specified it in the alembic.ini file, Alembic would load the SQLite database using the default schema, but my model code specifies a schema, so that didn't work. I had to change alembic/env.py so that run_migrations_online() calls the connection method from my own code instead of using engine_from_config. In my case, I created a database object with a connect() method that returns the engine and the metadata; I call it as connectable, meta = db.connect() and get the schema name with schema = db.schema(). I had to import the db object from my SQLAlchemy code to get access to these.
Now I was getting a migration that would build the entire database from scratch, but I couldn't run that migration because my database already had those changes, so apparently Alembic still wasn't seeing my database. Alembic also kept telling me that my database was out of date. The problem there was that Alembic's alembic_version table was being written to my :memory: database, and as soon as the connection was dropped, so was that database. To get Alembic to remember the migration, I needed that table to be created in my real database, so I added more code to env.py to pass the schema to context.configure using version_table_schema=my_schema.
When I went to generate the migration again, I still got the migration that would build the database from scratch, so Alembic STILL wasn't seeing my database. After lots more Googling, I found that I needed to pass include_schemas=True to context.configure in env.py. But after I added that, I started getting tracebacks from Alembic.
Fortunately, my configuration was set up to provide both the connection and the metadata. By changing the target_metadata=target_metadata line to target_metadata=meta (my local metadata returned from the connection), I got around these tracebacks as well, and Alembic started to behave properly.
So to recap, to get Alembic working with a SQLite database attached as a schema, I had to import the connection script I use for my Flask code. That connection script properly attaches the SQLite database and then reflects the metadata, returning both the engine and the metadata. In env.py I assign the engine to the connectable variable, the metadata to a new local variable meta, and the schema name to a local variable schema.
In the with connectable.connect() as connection: block, I then pass context.configure the additional arguments target_metadata=meta, version_table_schema=schema, and include_schemas=True, where meta and schema are the local variables set above.
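Put together, the relevant part of run_migrations_online() in alembic/env.py ends up looking roughly like this; db is the connection helper sketched earlier, and the myapp.database import path is a placeholder for wherever your own connection script lives:

from alembic import context
from myapp.database import db    # hypothetical import of the connection helper

def run_migrations_online():
    connectable, meta = db.connect()   # attaches the SQLite file as a schema
    schema = db.schema()

    with connectable.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=meta,           # the metadata reflected from the attached db
            version_table_schema=schema,    # keep alembic_version in the file, not in :memory:
            include_schemas=True,           # compare schema-qualified tables
        )
        with context.begin_transaction():
            context.run_migrations()

if not context.is_offline_mode():
    run_migrations_online()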
With all of these changes, I thought I was able to work with SQLite databases attached as schemas. Unfortunately, I continued to run into problems with this, and eventually decided that I simply wouldn't work with SQLite with Alembic. Our rule now is that Alembic migrations are only for non-SQLite databases, and SQLite data has to be migrated to another database before attempting an Alembic migration of the data.
I'm documenting this so that anyone else facing this may be able to follow what I've done and possibly get Alembic working for SQLite.

How do I use an existing MySQL DB, with its migrations written in SQL, in Django? What changes will I need to make in Django to use this MySQL DB?

I already have a MySQL database with a lot of data, whose tables and migrations are written in SQL. Now I want to use the same MySQL database in Django so that I can use the data in it. I expect there will be no need to create migrations, since I am not going to write the models again in Django. What changes/modifications will I have to make, for example in middleware? Can anyone please help me with this?
From what I know, there is no 100% automatic way to achieve that.
You can use the following command:
python manage.py inspectdb
It will generate a list of unmanaged models that you can export to a models.py file and integrate into your Django project.
However, it is not magical and there are a lot of edge cases, so the generated models should be inspected manually before being integrated.
More info here: https://docs.djangoproject.com/en/3.0/ref/django-admin/#django-admin-inspectdb
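For illustration only, the result of that workflow might look something like this; the table, columns and credentials are made up, and the unmanaged model is the kind of thing inspectdb emits after manual cleanup:

# settings.py: point Django at the existing MySQL database
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "legacy_db",
        "USER": "dbuser",
        "PASSWORD": "secret",
        "HOST": "localhost",
        "PORT": "3306",
    }
}

# models.py: an unmanaged model as generated by inspectdb, cleaned up by hand
from django.db import models

class Customer(models.Model):
    name = models.CharField(max_length=255)
    email = models.CharField(max_length=255, blank=True, null=True)

    class Meta:
        managed = False        # Django will not create, alter or drop this table
        db_table = "customer"  # map to the existing table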

I am using Peewee and I need to create a migration file for my database

How should I start here? I am a bit confused.
I have database schemas which are newer than the old database.
I have seen the Arnold package, which has two commands, arnold up and arnold down, and migration files where you can add all your database queries: in up I can add things like CREATE TABLE or ALTER TABLE, and in down I can add DROP TABLE or ALTER TABLE. But then my migration only works for that one database.
I want to understand what a database migration should contain and what it does. I would be very grateful if someone could explain database migrations to me and push me in the right direction for Peewee/PostgreSQL migrations.
We use a tool called peewee-db-evolve that I think you'll find useful. Try this:
1. sudo pip install peewee-db-evolve
2. Add import peeweedbevolve to the top of your models.py file, or anywhere that gets imported before the models are defined.
3. Open up your shell and run db.evolve() on your peewee database object.
For (3), at work we have a script evolve.py that looks like this:
import peeweedbevolve   # importing this adds evolve() to peewee database objects
from config import db   # your peewee database object

if __name__ == '__main__':
    db.evolve()          # compare the models to the current schema and apply the changes
Just to make it easy.
It will look at the models and your existing schema and calculate all the ALTER TABLE statements for you. (No need to manually write the migrations as with Arnold, or peewee's built-in migrations.)
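For illustration, a models.py following the steps above might look like this; the User model and the database object in config.py are made-up examples, not part of peewee-db-evolve itself:

import peeweedbevolve          # must be imported before the models are defined
import peewee
from config import db          # your peewee database object, e.g. a PostgresqlDatabase

class User(peewee.Model):
    name = peewee.CharField()
    email = peewee.CharField(null=True)   # a newly added column

    class Meta:
        database = db

if __name__ == '__main__':
    db.evolve()   # calculates the ALTER TABLE needed to add the new column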
Hope this helps!

Alembic migration acting on different versions of same database

I am trying to use alembic migrations to act on different versions of the same database. An example would be that I have two databases, one live and one for testing. Each of them might be in different states of migration. For one, the test database might not exist at all.
Say live has a table table1 with columns A and B. Now I would like to add column C. I change my model to include C and generate a migration script that has the following code:
op.add_column('table1', sa.Column('C', sa.String(), nullable=True))
This works fine with the existing live database.
If I now run alembic upgrade head against the previously non-existent test database, I get an (OperationalError) duplicate column name... error. I assume this is because my model contains the C column and Alembic/SQLAlchemy creates the full table automatically if it does not exist.
Should I simply trap the error or is there a better way of doing this?
I would suggest that, immediately after your test db is newly created, you stamp it with head:
command.stamp(configs_for_test_db, 'head')
This will insert the head revision number into the appropriate alembic table without actually running any migrations, so that the revision number reflects the state of the db (namely, that your newly created db is up to date with respect to migrations). After the db is stamped, alembic upgrade should behave properly.
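A minimal sketch of that flow, assuming the test database has just been created from the current models; Base, the test URL and alembic.ini are placeholders for your own metadata, connection string and config file:

from alembic import command
from alembic.config import Config
from sqlalchemy import create_engine

from myapp.models import Base   # hypothetical declarative base holding table1 etc.

TEST_DB_URL = "postgresql://user:pass@localhost/test_db"   # placeholder

engine = create_engine(TEST_DB_URL)
Base.metadata.create_all(engine)            # build the schema from the current models

alembic_cfg = Config("alembic.ini")
alembic_cfg.set_main_option("sqlalchemy.url", TEST_DB_URL)
command.stamp(alembic_cfg, "head")          # record head without running any migrations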
