How to handle refactor with south (django)?

I have installed South and made some migrations. Now there's a 'migrations' directory in the app folder. My question is: when I refactor models, which entries in the files of the migrations directory must I update to reflect the changes? I think some entries relate directly to the database schema, and others to the code itself. I couldn't find an answer to this in the South docs.

Make the changes to your models then run python manage.py schemamigration yourapp --auto. This will create the migrations for you (you'll see a new file in your migrations directory every time you do this process).
Sometimes you really do need to edit a migration manually, but you should try to avoid it, particularly if you have already run the migration (the South app keeps a record of which migrations have been run, so it knows the state of your database).
South is designed to support moving between different versions of your code without breaking your database. Each migration file in the migrations directory represents a snapshot of your code (specifically a snapshot of your models.py). You migrate from version to version by running python manage.py migrate yourapp version_no, where version_no is the number of the target migration.

Related

How do I transfer test database to product database

I'm working on a Django project, and building and testing with a database on GCP. It's full of test data and kind of a mess.
Now I want to release the app with a fresh, separate database.
How do I migrate to the new database, given everything that is already in the migrations/ folder?
I don't want to delete the folder because development might continue.
Data do not need to be preserved. It's test data only.
Django version is 2.2;
Python 3.7
Thank you.
========= update
After changing the settings.py, python manage.py makemigrations says no changes detected.
Then I did python manage.py migrate, and now it complains about relation does not exist.
=============== update2
The problem seems to be that I had a table named Customer, and I changed it to 'Client'. Now it's complaining about "psycopg2.errors.UndefinedTable: relation "app_customer" does not exist".
How can I fix it, maybe without deleting all files in migrations/?
================ update final
After eliminating all possibilities, I have found out that the "new" database is not new at all. I migrated on that database some months ago.
Now I created a fresh new one and migrate worked like a charm.
Again, thank you all for your suggestions.

The migration folder is your friend. No need to delete it. You can flatten migrations if you feel the migration folder is getting too large with many migration files, but you don't need to.
If you plan to use a new database in GCP, just change the settings.py file, which is typically located in the project_name/project_name/ folder. Update the DATABASES section to reflect the new database credentials.
Once your app is pointing at the new database, run python manage.py migrate. This will build the database schema with the necessary tables to start populating the new database.
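A minimal sketch of what the DATABASES section might look like once it points at the new database. PostgreSQL is assumed because the question mentions psycopg2; the environment variable names and the fallback values are hypothetical and not taken from the original post.

```python
import os

# settings.py fragment (sketch). The PROD_DB_* variable names and the
# fallback values are hypothetical placeholders, not part of the question.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ.get("PROD_DB_NAME", "myapp_prod"),
        "USER": os.environ.get("PROD_DB_USER", "myapp"),
        "PASSWORD": os.environ.get("PROD_DB_PASSWORD", ""),
        "HOST": os.environ.get("PROD_DB_HOST", "127.0.0.1"),
        "PORT": os.environ.get("PROD_DB_PORT", "5432"),
    }
}
```

Before running migrate, python manage.py showmigrations should list every migration as unapplied ([ ]) if the target database really is fresh. Entries already marked [X] would suggest the database was migrated earlier, which is what the asker eventually found.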

You don't need to delete any migrations/ folder; you can use the same migration files with the production database. If you don't need the test data, just delete the database, i.e. db.sqlite3 (I guess you are using the default database).
Note: Migration files by themselves don't affect your database structure (whether you delete them and run makemigrations again or keep the existing ones, the resulting schema remains the same).

Django project: track migrations

While developing a Django project tracking it with git and GitHub, how should I manage migrations?
Sometimes when I deploy a release to production some migrations crash due to files that I delete after this migration.
How can I avoid this?
Thanks.

There are other threads on this, but basically these are the rules I use:
You should definitely track migration files using Git.
Never run makemigrations in the production environment, always in development.
Now, let's say you made a change to one of your models (in development, I hope). You run a normal makemigrations, then run migrate (still in dev) in order to test everything. When you're ready, you commit and push the created files, pull them in prod, and then run migrate there to update the database schema.
This ensures good versioning of your migration files. It will also help you greatly in the long run, because running makemigrations in production and in dev simultaneously will just cause more conflicts between migration files, which can be a pain.

How to force migrations to a DB if some tables already exist in Django?

I have a Python/Django project. Due to some rollbacks and other mixed stuff, we ended up in a kind of odd scenario.
The current scenario is like this:
DB has the correct tables
DB can't be rolled back or dropped
Code is up to date
Migrations folder is behind the DB by one or two migrations. (These migrations were applied from somewhere else and that "somewhere else" doesn't exist anymore)
I add and alter some models
I run makemigrations
New migrations are created, but it's a mix of new tables and some tables that already exist in the DB.
If I run migrate it will complain that some of the tables that I'm trying to create already exist.
What I need:
To be able to run the migrations and kind of "ignore" the existing tables and apply the new ones. Or any alternative way to achieve this.
Is that possible?

When you apply a migration, Django inserts a row in a table called django_migrations. That's the only way Django knows which migrations have been applied already and which have not. So the rows in that table have to match the files in your migrations directory. If you've lost the migration files after they were applied, or done anything else to get things out of sync, you'll have problems, because the migration numbers in your database refer to different migration files than the ones in your project.
So before you do anything else, you need to bring things back into sync by deleting the django_migrations table rows for any migration files that you've lost somehow and can't get back. The table should contain rows for only those migrations that you do have and that were actually applied to the database correctly.
Now you need to deal with any changes in your database that Django Migrations doesn't know about, and for that there are a few options:
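As a rough sketch of that clean-up step, the snippet below compares the names recorded in django_migrations with the migration files you still have and deletes the orphaned rows. It uses an in-memory SQLite database as a stand-in so it can run anywhere; the columns (id, app, name, applied) follow Django's table, while the app label, the migration names and the helper functions are hypothetical. Against a real database you would run the equivalent SELECT and DELETE only after taking a backup.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for the django_migrations table (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE django_migrations ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "app VARCHAR(255) NOT NULL, "
        "name VARCHAR(255) NOT NULL, "
        "applied DATETIME NOT NULL)"
    )
    rows = [
        ("auth", "0001_initial", "2020-01-01 00:00:00"),
        ("myapp", "0001_initial", "2020-01-01 00:00:01"),
        ("myapp", "0002_add_client", "2020-02-01 00:00:00"),
        ("myapp", "0003_lost", "2020-03-01 00:00:00"),
        ("myapp", "0004_also_lost", "2020-03-02 00:00:00"),
    ]
    conn.executemany(
        "INSERT INTO django_migrations (app, name, applied) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def find_orphaned_rows(conn, app_label, files_on_disk):
    """Return migration names recorded for app_label that have no file on disk."""
    rows = conn.execute(
        "SELECT name FROM django_migrations WHERE app = ? ORDER BY id",
        (app_label,),
    ).fetchall()
    on_disk = set(files_on_disk)
    return [name for (name,) in rows if name not in on_disk]


def delete_rows(conn, app_label, names):
    """Delete the given migration records and return how many rows were removed."""
    removed = 0
    for name in names:
        cur = conn.execute(
            "DELETE FROM django_migrations WHERE app = ? AND name = ?",
            (app_label, name),
        )
        removed += cur.rowcount
    conn.commit()
    return removed


conn = build_demo_db()
# Names of the migration files still present in myapp/migrations/ (without .py).
orphans = find_orphaned_rows(conn, "myapp", ["0001_initial", "0002_add_client"])
print("rows without a matching file:", orphans)
```

Listing the orphans first and deleting in a separate step keeps the destructive part explicit, which fits the advice above to touch only rows whose files are really gone.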
If things worked out such that the database changes that were already applied to the database are in different migration files than the ones that weren't, then you can fix it by running your migrations one at a time using the --fake option on any changes that are in reality already in the database. The fake option just writes the row to the django_migrations table marking the migration as done. Only do this if the database does in fact already have all the changes contained in that migration file.
For those migration files that contain only changes which have not been applied to the database, run migrate without the --fake option and Django will apply them. For example:
# database already has it
manage.py migrate myapp 0003 --fake
# need it
manage.py migrate myapp 0004
# database already has it
manage.py migrate myapp 0005 --fake
If you have migration files where some but not all of the changes have been applied, then you have a bigger problem. In that case, there are several ways to go about it (choose ONLY ONE):
Edit the migration files to put changes that have already been applied (whether Django did it or you did it manually does not matter) into lower-numbered migrations, and put everything you still need done into higher-numbered files. Now you can --fake the lower-numbered ones, and run the higher-numbered ones as normal. Let's say you have 10 changes you made to your models, and 5 of those changes are actually in the database already, but Django doesn't know about them. When you run makemigrations, a new migration is created with all 10 changes. This will normally fail because the database server can't, for example, add a column that already exists. Move these already-applied changes out of your new migration file, into the previous (already applied) migration file. Django will then assume that these were applied with the previous migration and will not try to apply them again. You can then migrate as normal and the new changes will be applied.
If you don't want to touch your older migration file, a cleaner way to do this is to first run makemigrations --empty appname to create an empty migration file. Then run makemigrations, which will create another migration with all the changes that Django thinks need to be done. Move the already-applied operations from that file into the empty migration you created, then --fake that one. This will bring Django's understanding of what the database looks like into sync with reality, and you can then migrate as normal, applying the changes in the last migration file.
Get rid of any new migrations you just created using makemigrations. Now, comment out or put back anything in your models that has not been applied to the database, leaving your code matching what's actually in the database. Now you can do makemigrations and migrate appname --fake and you will get things back in sync. Then uncomment your new code and run 'makemigrations' then migrate as normal and the changes will be applied. If the changes are small (for example, adding a few fields), sometimes this is easiest. If the changes are large, it isn't....
You can go ahead and (carefully) make the database changes yourself, bringing the database up to date. Now just run migrate --fake and if you didn't mess up then everything will be ok. Again, this is easy for smaller changes, not as easy for complicated ones.
You can run manage.py sqlmigrate appname migration_name > mychanges.sql (sqlmigrate needs the app label and the migration name). This generates mychanges.sql containing all the SQL Django WOULD have executed against the database. Now edit that file to remove any changes that have already been applied, leaving what needs to be done. Execute that SQL using pgadmin or psql (you're using postgresql I hope). Now the changes have all been made, so you can run manage.py migrate --fake. This will bring Django into sync with reality and you should be all set. If your SQL skills are sufficient, this is probably the most straightforward solution.
I should add two warnings:
First, if you apply a later migration, e.g. 0003_foobar.py, and then things don't work out and you decide to go back to 0002_bazbuz.py, then Django will TAKE STUFF OUT OF YOUR DATABASE. For example, a column you might have added in 0003 will be dropped along with its data. Since you say you can't lose data, be very careful about going back.
Second, do not rush into running --fake migrations. Make sure that the entire migration you are about to fake is actually in the database already. Otherwise it gets very confusing. If you do regret faking migrations and don't want to roll back, you can erase Django's knowledge of the faked migration by deleting that row from the django_migrations table. It is OK to do this if you understand what you are doing and you know that the migration really was not applied.

This blog post really nails it. https://simpleisbetterthancomplex.com/tutorial/2016/07/26/how-to-reset-migrations.html
Let me summarize the steps in his scenario 2 (you have a production database and want to change schema/models in one or more apps). In my case, I had two apps, queue and routingslip, that had model modifications that I needed to apply to a production system. Key was I already had the database, so this is where --fake-initial comes into play.
Here are the steps I followed. As always, back up everything before starting. I work in a VM, so I just took a snapshot before going forward.
1) Remove the migration history for each app.
python manage.py migrate --fake queue zero
python manage.py migrate --fake routingslip zero
2) Blow away any migration files in the entire project within which the app(s) reside.
find . -path "*/migrations/*.py" -not -name "__init__.py" -delete
find . -path "*/migrations/*.pyc" -delete
3) Make migrations
python manage.py makemigrations
4) Apply the migrations, faking initial because the database already exists and we just want the changes:
python manage.py migrate --fake-initial
Worked great for me.
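For step 2, here is a hedged Python equivalent of the two find commands, with a dry-run mode so you can see what would be deleted first. The function name and the dry_run parameter are my own and not part of the blog post. Like the find commands, it also matches migrations directories inside a virtualenv that lives under the project root, so point it at a directory that contains only your own apps.

```python
from pathlib import Path


def remove_migration_files(project_root, dry_run=True):
    """Collect (and optionally delete) migration .py/.pyc files, keeping __init__.py.

    Hypothetical helper mirroring:
      find . -path "*/migrations/*.py" -not -name "__init__.py" -delete
      find . -path "*/migrations/*.pyc" -delete
    """
    root = Path(project_root)
    matched = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Only consider files that sit somewhere below a directory named "migrations".
        if "migrations" not in path.relative_to(root).parent.parts:
            continue
        is_pyc = path.suffix == ".pyc"
        is_migration_py = path.suffix == ".py" and path.name != "__init__.py"
        if is_pyc or is_migration_py:
            matched.append(path)
            if not dry_run:
                path.unlink()
    return matched
```

Calling remove_migration_files(".") only lists the candidates; pass dry_run=False once the list looks right.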

If you don't have any migration files, or you lost the previous files and want to migrate new changes, then you need to follow these steps carefully:
# Create a new migration file before changing the models.
cmd: python manage.py makemigrations app_name
# Fake-migrate it, because the database already has these changes.
cmd: python manage.py migrate app_name 0005 --fake #[0005 is the number of the migration file created just now. It will look like 0005_add_address or something similar.]
# Create a new migration file after changing the models.
cmd: python manage.py makemigrations app_name
# Apply it for real, because the database does not have these changes yet.
cmd: python manage.py migrate app_name 0006 #[0006 is the number of the migration file created just now.]

Should I be adding the Django migration files in the .gitignore file?

I've recently been getting a lot of git issues due to migration conflicts and was wondering if I should be marking migration files as ignore.
If so, how would I go about adding all of the migrations that I have in my apps, and adding them to the .gitignore file?

Quoting from the Django migrations documentation:
The migration files for each app live in a “migrations” directory inside of that app, and are designed to be committed to, and distributed as part of, its codebase. You should be making them once on your development machine and then running the same migrations on your colleagues’ machines, your staging machines, and eventually your production machines.
If you follow this process, you shouldn't be getting any merge conflicts in the migration files.
When merging version control branches, you may still encounter a situation where you have multiple migrations based on the same parent migration, e.g. if two different developers introduced a migration concurrently. One way of resolving this situation is to introduce a merge migration. Often this can be done automatically with the command
./manage.py makemigrations --merge
which will introduce a new migration that depends on all current head migrations. Of course this only works when there is no conflict between the head migrations; if there is one, you will have to resolve the problem manually.
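To make "multiple migrations based on the same parent" concrete, here is a toy model of a single app's migration graph. It is not Django's actual loader, just a sketch: the dictionary maps each migration name to the names it depends on, and every migration name in it is hypothetical. A head is a migration that nothing else depends on; more than one head is the situation a merge migration resolves.

```python
def find_heads(migrations):
    """Return the migrations that no other migration depends on, sorted by name.

    `migrations` maps a migration name to the list of names it depends on.
    """
    depended_on = {dep for deps in migrations.values() for dep in deps}
    return sorted(name for name in migrations if name not in depended_on)


def merge_needed(migrations):
    """True when the graph has more than one head."""
    return len(find_heads(migrations)) > 1


# Two developers each added a migration on top of 0002_add_email.
graph = {
    "0001_initial": [],
    "0002_add_email": ["0001_initial"],
    "0003_add_phone": ["0002_add_email"],
    "0003_add_address": ["0002_add_email"],
}
print(find_heads(graph), merge_needed(graph))

# A merge migration depends on every current head, leaving a single head again.
graph["0004_merge"] = find_heads(graph)
print(find_heads(graph), merge_needed(graph))
```

The second half mirrors what the answer says about makemigrations --merge: the new migration depends on all current heads, so the history has a single end point again.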
Given that some people here suggested that you shouldn't commit your migrations to version control, I'd like to expand on the reasons why you actually should do so.
First, you need a record of the migrations applied to your production systems. If you deploy changes to production and want to migrate the database, you need a description of the current state. You can create a separate backup of the migrations applied to each production database, but this seems unnecessarily cumbersome.
Second, migrations often contain custom, handwritten code. It's not always possible to automatically generate them with ./manage.py makemigrations.
Third, migrations should be included in code review. They are significant changes to your production system, and there are lots of things that can go wrong with them.
So in short, if you care about your production data, please check your migrations into version control.

You can follow the below process.
You can run makemigrations locally and this creates the migration file. Commit this new migration file to repo.
In my opinion you should not run makemigrations in production at all. You can run migrate in production and you will see the migrations are applied from the migration file that you committed from local. This way you can avoid all conflicts.
IN LOCAL ENV, to create the migration files,
python manage.py makemigrations
python manage.py migrate
Now commit these newly created files, something like below.
git add app/migrations/...
git commit -m 'add migration files' app/migrations/...
IN PRODUCTION ENV, run only the below command.
python manage.py migrate

Quote from the 2022 docs, Django 4.0. (two separate commands = makemigrations and migrate)
The reason that there are separate commands to make and apply
migrations is because you’ll commit migrations to your version control
system and ship them with your app; they not only make your
development easier, they’re also useable by other developers and in
production.
https://docs.djangoproject.com/en/4.0/intro/tutorial02/

TL;DR: commit migrations, resolve migration conflicts, adjust your git workflow.
Feels like you'd need to adjust your git workflow, instead of ignoring conflicts.
Ideally, every new feature is developed in a different branch, and merged back with a pull request.
PRs cannot be merged if there's a conflict, so whoever needs to merge a feature must resolve the conflict first, migrations included. This might need coordination between different teams.
It is important though to commit migration files! If a conflict arises, Django might even help you solve those conflicts ;)

I can't imagine why you would be getting conflicts, unless you're editing the migrations somehow? That usually ends badly - if someone misses some intermediate commits then they won't be upgrading from the correct version, and their copy of the database will be corrupted.
The process that I follow is pretty simple - whenever you change the models for an app, you also commit a migration, and then that migration doesn't change - if you need something different in the model, then you change the model and commit a new migration alongside your changes.
In greenfield projects, you can often delete the migrations and start over from scratch with a 0001_ migration when you release, but if you have production code, then you can't (though you can squash migrations down into one).

The usual solution is that, before anything is merged into master, the developer must pull any remote changes. If there's a conflict in migration versions, he should rename his local migration to N+1 (the remote one has been run by other devs and, potentially, in production).
During development it might be okay to just not commit migrations (don't add an ignore though, just don't add them). But once you've gone into production, you'll need them in order to keep the schema in sync with model changes.
After renaming the local migration, you then need to edit the file and change its dependencies to point at the latest remote version.
This works for Django migrations, as well as other similar apps (sqlalchemy+alembic, RoR, etc).
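The renaming step can be sketched with a small helper that picks the next free number. This is a hypothetical illustration, not something Django provides, and it assumes the common NNNN_description naming of migration files (shown here without the .py extension).

```python
import re

# Matches names such as "0005_add_phone": four digits, an underscore, a description.
_NUMBERED = re.compile(r"^(\d{4})_(.+)$")


def next_free_name(local_name, remote_names):
    """Return local_name renumbered to one past the highest remote migration number."""
    match = _NUMBERED.match(local_name)
    if match is None:
        raise ValueError("unexpected migration name: %r" % local_name)
    description = match.group(2)
    remote_matches = (_NUMBERED.match(name) for name in remote_names)
    highest = max((int(m.group(1)) for m in remote_matches if m), default=0)
    return "%04d_%s" % (highest + 1, description)


remote = ["0001_initial", "0002_add_email", "0003_add_address"]
print(next_free_name("0003_add_phone", remote))
```

Renaming the file is only half of the fix: the dependencies list inside the renamed migration still has to point at the latest remote migration (0003_add_address in this made-up example), as the answer describes.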

Gitignore the migrations, if You have separate DBs for Development, Staging and Production environment. For dev. purposes You can use local sqlite DB and play with migrations locally.
I would recommend You to create four additional branches:
Master - Clean fresh code without migrations. Nobody is connected to this branch. Used for code reviews only
Development - daily development. Push/pull accepted. Each developer is working on sqlite DB
Cloud_DEV_env - remote cloud/server DEV environment. Pull only. Keep migrations locally on machine, which is used for the code deployment and remote migrations of Dev database
Cloud_STAG_env - remote cloud/server STAG environment. Pull only. Keep migrations locally on machine, which is used for the code deployment and remote migrations of Stag database
Cloud_PROD_env - remote cloud/server PROD environment. Pull only. Keep migrations locally on machine, which is used for the code deployment and remote migrations of Prod database
Notes:
2, 3, 4 - migrations can be kept in repos, but there should be strict rules for merging pull requests. So we decided to appoint one person responsible for deployments, and he is the only one who has all the migration files: our deployer. He updates the remote DB migrations each time we have any changes in Models.

You should think of migrations as a version control system for your database schema. makemigrations is responsible for packaging up your model changes into individual migration files - analogous to commits - and migrate is responsible for applying those to your database.
The migration files for each app live in a “migrations” directory inside of that app, and are designed to be committed to, and distributed as part of, its codebase. You should be making them once on your development machine and then running the same migrations on your colleagues’ machines, your staging machines, and eventually your production machines.
Golden rule: make once on dev and migrate on all.

Having a bunch of migration files in git is messy. There is only one file in the migrations folder that you should not ignore: the __init__.py file. If you ignore it, Python will no longer look for submodules inside the directory, so any attempts to import the modules will fail. So the question should be how to ignore all migration files except __init__.py.
The solution is:
Add '0*.py' to .gitignore files and it does the job perfectly.
Hope this helps someone.

Committing your migrations is just a recipe for disaster. Migrations are something of a chain that can be traced back, so you may have dependencies from a former migration, e.g. a pip module which you used at some point in your project lifecycle and then stopped using. You might find breadcrumbs of such dependencies in your migration history, and you have to manually remove these imports from the migration files.
Verdict: unless you are a god-tier Django dev, probably avoid adding migrations to your commits.

Short answer
I propose excluding migrations in the repo. After code merge, just run ./manage.py makemigrations and you are all set.
Long answer
I don't think you should put migration files into the repo. It will spoil the migration states in other people's dev environments and in other prod and stage environments (refer to Sugar Tang's comment for examples).
From my point of view, the purpose of Django migrations is to find gaps between previous model states and new model states, and then serialise the gap. If your models change after a code merge, you can simply run makemigrations to find out the gap. Why would you want to manually and carefully merge other people's migrations when you can achieve the same automatically and bug-free? The Django documentation says,
They [migrations] are designed to be mostly automatic
Please keep it that way. To merge migrations manually, you have to fully understand what others have changed and any dependencies of the changes. That's a lot of overhead and error-prone. So tracking the models file is sufficient.
It is a good topic on the workflow. I am open to other options.

South: how to revert migrations in production server?

I want to revert my last migration (0157) by running its Migration.backwards() method. Since I am reverting the migration in production server I want to run it automatically during code deployment. Deployment script executes these steps:
Pull code changes
Run migrations: manage.py migrate <app>
Refresh Apache to use newest code: touch django.wsgi
If I could, I would create new migration file which would tell South to backward migrate to 0156:
migrations/0158_backward__migrate_to_0156.py
This committed migration would be deployed to production and executed during the manage.py migrate <app> command. In this case I wouldn't have to execute the backward migration by hand, as suggested in these answers.
Let's say I have created two data migrations, the first for the user's Payment model, the second for the User model. I have implemented backwards() methods for both migrations in case I have to revert these data migrations. I've deployed these two migrations to production, and suddenly find out that the Payment migration contains an error. I want to revert my two last data migrations as fast as possible. What is the fastest safe way to do it?

Since I am reverting the migration in production server I want to run
it automatically during code deployment.
IMHO the safest path is
run manage.py migrate <app> (i.e. apply all existing migrations, i.e. up to 0157)
undo the changes in your model
run manage.py schemamigration <app> --auto
This will create a new migration 0158 that effectively reverts the previous migration 0157. Then simply apply the new migration by running manage.py migrate <app> again. As I understand it, your code deployment will do just that.

Apparently the codeline has migrations up to #157 and now the developer decided that the last one was not a good idea after all. So the plan is to go back to #156.
Two scenarios:
(a) migration #157 was not released or deployed anywhere yet.
Simply revert the last change from models.py and delete migration #157.py from the source archive. Any deployment will take the system to level 156; "157 was never there".
(b) there have been deployments of the latest software with migration #157.
In this case the previous strategy will obviously not work. So you need to create a migration #158 to undo #157. Revert the change in models.py and run
python manage.py migrate <app> 0157
python manage.py schemamigration <app> --auto
This will auto-generate a new migration #158, which will contain the inverse schema migration compared to #157.
If schemamigration is giving trouble because of django Model validation (something that can happen if you have custom validators which check stuff outside the ORM box), I suggest the following workaround:
<django project>/<app>/management/commands/checkmigrations.py
from south.management.commands import schemamigration

class Command(schemamigration.Command):
    # Skip Django's model validation before the command runs.
    requires_model_validation = False
    help = "schemamigration without model validation"
This command becomes available in manage.py:
python manage.py checkmigrations <app> --auto

There's no silver bullet here. The simplest solution I can think of would be to - in your dev env of course - manually migrate back to 0156, manually update your migration history table (sorry, I can't remember the table's name now) to fool South into thinking you're still at #0158, then run schemamigration again. Not guaranteed to work, but might be worth trying.
