I am trying to use alembic migrations to act on different versions of the same database. An example would be that I have two databases, one live and one for testing. Each of them might be in different states of migration. For one, the test database might not exist at all.
Say live has a table table1 with columns A and B. Now I would like to add column C. I change my model to include C and I generate a migration script that has the following code
op.add_column('table1', sa.Column('C', sa.String(), nullable=True))
This works fine with the existing live database.
If I now call alembic upgrade head against a non-existing test database, I get an (OperationalError) duplicate column name... error. I assume this is because my model contains the C column and alembic/sqlalchemy creates the full table automatically if it does not exist.
Should I simply trap the error or is there a better way of doing this?
I would suggest that immediately after your test db is newly created, you stamp it with head:
command.stamp(configs_for_test_db, 'head')
This will insert the head revision number into the appropriate alembic table without actually running migrations, so that the revision number reflects the state of the db (namely, that your newly created db is up to date with respect to migrations). After the db is stamped, alembic upgrade should behave properly.
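For illustration, a minimal sketch of that flow using Alembic's Python API; the "alembic.ini" path and the database URL are assumptions:

from alembic import command
from alembic.config import Config

# Point an Alembic Config at the test database (names are illustrative).
configs_for_test_db = Config("alembic.ini")
configs_for_test_db.set_main_option("sqlalchemy.url", "postgresql://localhost/test_db")

# After the schema has been created from the models (e.g. via
# metadata.create_all()), record the head revision in the alembic_version
# table without running any migration scripts.
command.stamp(configs_for_test_db, "head")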
Background: Airflow uses Alembic to apply migrations to the database it uses to store DAG/task metadata. I want to store some other data in this database, and would like to track my schema changes through Alembic migrations. It can be assumed that my migrations will be limited to creating/modifying new tables, without altering any of the tables that Airflow creates and uses.
Will the fact that there are two sets of migrations (one in the Airflow source code, and one in my application code) cause any issues?
Even if you use the same DB server, I suggest using a different schema/database for the applicative stuff.
This way, when you pass your own connection string in the env.py that runs the migrations, it will use a different alembic_version table, and therefore the two sets of migrations won't collide.
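As an illustration, here is a sketch of the relevant part of your application's env.py. The connection string, metadata, and version table name are assumptions; version_table is a real parameter of context.configure:

from alembic import context
from sqlalchemy import create_engine

target_metadata = None  # replace with your application models' MetaData

def run_migrations_online():
    # Assumed: your own connection string, separate from Airflow's database.
    engine = create_engine("postgresql://localhost/myapp_db")
    with engine.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=target_metadata,
            # A dedicated version table keeps your migration bookkeeping
            # apart from the alembic_version table Airflow maintains.
            version_table="alembic_version_myapp",
        )
        with context.begin_transaction():
            context.run_migrations()

run_migrations_online()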
The situation is:
I developed a webapp using Django (and especially "django-simple-history").
I have a Postgres database "db01" with a history model "db01_history" which is generated/filled by "django-simple-history".
I accidentally deleted everything from "db01" and, sadly, I don't have any db backup.
My question is:
Is there some way to replay all historical records "db01_history" (up to a specific ID) onto original database "db01" ?
(In other words, is there a way to restore a db using its historical model up to a specific date/ID ?)
That is: db01_history -> db01
Fortunately, django-simple-history keeps your own model's field names and types (though it does not keep some constraints).
The difference is that there are multiple historical records for each of your deleted objects. If you use Django's default primary key (id), it is easy to group the historical table by id and take the latest record as of history_date (the time the history entry was recorded), as in the sketch below.
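A sketch of that idea with the django-simple-history manager; MyModel, the app path, and the cutoff date are assumptions, while history, history_date, history_type and record.instance are part of django-simple-history:

from datetime import datetime
from myapp.models import MyModel  # hypothetical app and model

cutoff = datetime(2023, 1, 1)  # restore state up to this point (assumed)
seen = set()
# Walk historical records newest-first; the first record seen per id is
# the latest state as of the cutoff.
for record in MyModel.history.filter(history_date__lte=cutoff).order_by("-history_date"):
    if record.id in seen:
        continue
    seen.add(record.id)
    if record.history_type != "-":   # "-" marks a deletion; skip those
        record.instance.save()       # write the reconstructed row back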
One exception: if you used more direct database operations such as update() or bulk_create() from model managers, those changes have no history records.
So you can configure your project to use a copy of the historical database that keeps only the latest record for each object, run python manage.py dumpdata > dump.json, then switch the database settings to the new database you like and run python manage.py loaddata dump.json.
To be concise: yes, you may have all your data in your historical database.
With this setup:
- Development environment
- Flask
- SQLAlchemy
- Postgres
- Possibly Alembic
I have a database with some tables populated with random data. As far as I know, Flask-Migrate (which uses Alembic) will not preserve the data; it will only keep the models and the database schema synchronized.
But what is the difference between using Alembic and just dropping and re-creating all the tables?
Something like:
db.create_all()
The second question:
What happens to the data when something changes in the models? Will the data be lost? Or can Alembic preserve the previously populated data?
Well, my idea is just to populate the database with some data, and then avoid any loss of data when the models change. Is Alembic the solution?
Or do I need to import the data from a .sql file, for example, whenever I change the models and the database?
I am the Flask-Migrate author.
You are not correct. Flask-Migrate (through Alembic) will always preserve the data that you have in your database. That is the whole point of working with database migrations: you do not want to lose your data.
It seems you already have a database with data in it and you want to start using migrations. You have two options to incorporate Flask-Migrate into your project:
Only track migrations going forward, i.e. leave your initial database schema outside of migration tracking.
For this you really have nothing special to do. Just run manage.py db init to create the migrations repository, and when you need to migrate your database, do so normally with manage.py db migrate. The disadvantage of this method is that Flask-Migrate/Alembic never sees the initial schema of the database, so it is not possible to recreate a database from scratch.
Implement an initial migration that brings your database to your current state, then continue tracking future migrations normally.
This requires a little bit of a trick. Here you want Alembic to record an initial migration that defines your current schema. Since Alembic creates migrations by comparing your models against your database, the trick is to temporarily replace your real database with an empty one and then generate a migration; the autogenerated script then contains your entire current schema. After the initial migration is recorded, you restore your database, and from then on you can continue migrating your database normally.
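For illustration, the sequence might look like this; db stamp is a standard Flask-Migrate command that records a revision in the database without running any migrations, so the real database gets marked as already at the initial revision:

# 1. Point your app's database URI at a brand new, empty database
manage.py db init      # create the migrations repository
manage.py db migrate   # the autogenerated migration contains the full schema
# 2. Point the URI back at your real database
manage.py db stamp head   # mark the real database as up to date; nothing is run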
I hope this helps. Let me know if you have any more questions.
I dropped my database that I had previously created for Django using:
dropdb <database>
but when I go to the psql prompt and type \d, I still see the relations there:
How do I remove everything from postgres so that I can do everything from scratch?
Most likely, somewhere along the line you created your objects in the template1 database (or, in older versions, the postgres database), and every time you create a new db it has all those objects in it. You can either drop the template1/postgres database and recreate it, or connect to it and drop all those objects by hand.
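If it is template1, a sketch of the drop-and-recreate route, run from psql while connected to another database such as postgres (a template database must be unflagged before it can be dropped):

-- Allow template1 to be dropped, then rebuild it from the pristine template0.
UPDATE pg_database SET datistemplate = false WHERE datname = 'template1';
DROP DATABASE template1;
CREATE DATABASE template1 TEMPLATE template0;
UPDATE pg_database SET datistemplate = true WHERE datname = 'template1';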
Chances are that you never created the tables in the correct schema in the first place. Either that, or your dropdb failed to complete.
Try to drop the database again and see what it says. If that appears to work, then go into postgres and type \l, putting the output here.
One of my Django models is a subclass and I want to change its superclass to one that is very similar to the original one. In particular, the new superclass describes the same object and has the same primary key. How can I make South create the new OneToOne field and copy the values from the old one to the new one?
In South, there are two kinds of migrations: schema migrations and data migrations.
After you've created the schemamigration, create a corresponding data migration:
./manage.py datamigration <app> <migration_name>
Do not run the migration (yet). Instead, open up the migration file you just created.
You'll find a method named forwards(). In it, you define the procedure by which values from the old tables get copied to the new tables.
If you're changing the structure of a given table to a more complex layout, a common method is to sandwich a data migration between two schema migrations: the first schema migration adds fields, the data migration translates the old fields to the new fields, and the second schema migration deletes the old fields. You can do just about anything with the database in the forwards() method, so long as you keep track of which schema (previous or current) you're accessing. Generally, you access the models through the orm reference and read and write them with the traditional Django accessors, as in the sketch below.
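A minimal sketch of such a data migration, assuming a hypothetical myapp.MyModel on which a newly added new_field is populated from an existing old_field:

from south.v2 import DataMigration

class Migration(DataMigration):

    def forwards(self, orm):
        # orm['myapp.MyModel'] reflects the frozen schema at this point in
        # the migration history, so both old_field and new_field exist.
        for obj in orm['myapp.MyModel'].objects.all():
            obj.new_field = obj.old_field
            obj.save()

    def backwards(self, orm):
        # The reverse would be lossy; fail loudly rather than silently.
        raise RuntimeError("Cannot reverse this migration.")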
The South Data Migration Tutorial covers this in some detail. It shows you how to use South's orm reference to access the database using the schema prior to the schema migration and gives access to the database without Django complaining about fields it doesn't understand.
If you're renaming a class, that can be tricky: it involves creating the new table, migrating the data from one to the other, and deleting the old table. South can do it, but it might take more than one pass of alternating schema and data migrations.
South also has the backwards() method, which allows you to return your database tables to a previous state. In some cases this may be impossible; the new table may record information that would be lost in a downgrade. I recommend throwing an exception in backwards() if you're not in DEBUG mode.