I have a Django app that I would like to package and offer to the community through the PyPI repository. Its only strong dependency is Django. It also integrates nicely with Django CMS and offers additional functionality for it. I plan to offer this additional functionality only to projects with Django CMS installed. (This is what I call a weak dependency: the app will install and work without it, but work even better with it.) Specifically, some models are only defined if the base model CMSPlugin from Django CMS is installed.
Is there a good/right way to manage migrations, though?
I cannot include migrations for models that depend on CMSPlugin in the package, since users without a Django CMS installation will not be able to run them.
If I omit the migrations that depend on CMSPlugin, users with Django CMS will create them on first installation. I fear, however, that on every update of the package those migrations will be lost when pip install --upgrade overwrites the package.
Since these models are optional, it is best to put them in a separate Django app, which can be a sub-app of your app or simply another app shipped in your package. A user who has Django CMS installed can then add this extra app of yours to the INSTALLED_APPS list to use it. This also gives your users the choice of whether to use it or not.
This way you can also easily adapt your views depending on whether this app is installed, by using the app registry's is_installed method [Django docs]:
from django.apps import apps

def some_view(request):
    if apps.is_installed('yourpackage.path.to.weak_dependency_subapp'):
        # Weak dependency is present
        ...
    else:
        # Weak dependency is absent
        ...
Note: Be careful not to import this app's models when it is not installed; otherwise you will get an error, since the app would not be loaded.
Edit: To make the sub-app, you can either do as you did, by cd'ing into the app's directory and running python ../manage.py startapp subappname, or directly run python manage.py startapp subappname <your_app>/subappname (the directory subappname needs to be created first), and then set its app config's name attribute to <your_app>.subappname.
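If a project would rather enable the optional sub-app automatically than list it by hand, its settings could check whether Django CMS is importable. This is only a sketch under assumptions: the sub-app path is the placeholder used above, the helper name optional_apps is made up for illustration, and it assumes Django CMS is importable as the top-level package cms.

```python
from importlib.util import find_spec


def optional_apps(candidates):
    """Return the app paths whose required top-level package is importable.

    `candidates` is a list of (app_path, required_package) pairs.
    """
    return [
        app_path
        for app_path, required_package in candidates
        if find_spec(required_package) is not None
    ]


# Hypothetical settings.py fragment, shortened for illustration.
INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "yourpackage",
]
INSTALLED_APPS += optional_apps([
    # (optional sub-app, top-level package it depends on)
    ("yourpackage.path.to.weak_dependency_subapp", "cms"),
])
```

Note that being importable is not the same as being enabled: the project would still need Django CMS itself in INSTALLED_APPS for CMSPlugin-based models to work.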
Most documentation simply tells you to add the name of each of your apps to the INSTALLED_APPS array in your Django project's settings. What is the benefit/purpose of this? What different functionality will I get if I create 2 apps, but only include the name of one in my INSTALLED_APPS array?
Django uses INSTALLED_APPS as a list of all of the places to look for models, management commands, tests, and other utilities.
If you made two apps (say myapp and myuninstalledapp), but only one was listed in INSTALLED_APPS, you'd notice the following behavior:
The models contained in myuninstalledapp/models.py would never trigger migration changes (or generate initial migrations). You wouldn't be able to interact with them on the database level either because their tables will have never been created.
Static files listed within myapp/static/ would be discovered as part of collectstatic or the test server's staticfiles serving, but myuninstalledapp/static files wouldn't be.
Tests within myapp/tests.py would run but myuninstalledapp/tests.py wouldn't.
Management commands listed in myuninstalledapp/management/commands/ wouldn't be discovered.
So really, you're welcome to have folders within your Django project that aren't installed apps (you can even create them with python manage.py startapp) but just know that certain auto-discovery Django utilities won't work for that application.
I used a field of a specific type from a third-party package in a model in my Django 1.8 project:
class MyModel(models.Model):
    image = third_party_package.SpecificImageField(...)
Then I changed the field type to a standard Django type:
class MyModel(models.Model):
    image = models.ImageField(...)
The database was successfully migrated to the new version of the model:
./manage.py makemigrations
./manage.py migrate
Then I removed the third-party package, because I don't need it any more.
The problem is that the migrations still depend on the third-party package. The makemigrations command can't find the package and fails.
As a workaround I can install the third-party package again and migrate the database, but how can I remove the dependency on the third-party package without data loss?
I've not tested this but I'd imagine you'd be able to squash your migrations together which will consolidate them.
manage.py squashmigrations myapp 0050
You need to pass this the name of your app you want to squash, as well as the number of the migration that you want to squash up to.
What this does is merge your migration files into one "super" migration file that contains all the changes in those migrations, while removing the changes that cancel each other out.
Squashing is the act of reducing an existing set of many migrations down to one (or sometimes a few) migrations which still represent the same changes.
Django does this by taking all of your existing migrations, extracting their Operations and putting them all in sequence, and then running an optimizer over them to try and reduce the length of the list - for example, it knows that CreateModel and DeleteModel cancel each other out, and it knows that AddField can be rolled into CreateModel.
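To make the two reductions mentioned above concrete, here is a toy model of the idea. It is not Django's optimizer: operations are plain tuples invented for this sketch, and only the CreateModel/DeleteModel and AddField/CreateModel cases are handled.

```python
def optimize(operations):
    """Toy reduction of a list of (kind, model, payload) operations.

    kind is 'CreateModel' (payload: tuple of field names),
    'AddField' (payload: one field name) or 'DeleteModel' (payload: None).
    """
    result = []
    for kind, model, payload in operations:
        if kind == "AddField":
            for index, (prev_kind, prev_model, prev_fields) in enumerate(result):
                if prev_kind == "CreateModel" and prev_model == model:
                    # Roll the new field into the earlier CreateModel.
                    result[index] = (prev_kind, prev_model, prev_fields + (payload,))
                    break
            else:
                result.append((kind, model, payload))
        elif kind == "DeleteModel":
            created_here = any(
                k == "CreateModel" and m == model for k, m, _ in result
            )
            # Earlier operations on a deleted model no longer matter.
            result = [op for op in result if op[1] != model]
            if not created_here:
                # The model existed before this list, so the delete must stay.
                result.append((kind, model, payload))
        else:
            result.append((kind, model, payload))
    return result


squashed = optimize([
    ("CreateModel", "Author", ("name",)),
    ("AddField", "Author", "email"),
    ("CreateModel", "Temp", ()),
    ("DeleteModel", "Temp", None),
])
```

Here the four operations collapse into a single CreateModel for Author that already includes the email field, which is the kind of result squashing aims for.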
Since the field type only matters at the software level, you can use a small trick: simply replace third_party_package.SpecificImageField(...) with models.ImageField(...) in your migration files. This works as long as both fields map to the same column type, since it changes nothing at the database level. Otherwise, you will have to edit your migration files manually until the dependency is gone.
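If many migration files reference the old field, the replacement could be scripted. The sketch below is an assumption-laden helper, not a Django feature: the function names are made up, it assumes the package was imported with a plain "import third_party_package" line, and it assumes the field arguments are also valid for models.ImageField. Review the resulting diff before committing.

```python
from pathlib import Path


def strip_third_party_field(
    source,
    old_ref="third_party_package.SpecificImageField",
    new_ref="models.ImageField",
    old_import="import third_party_package",
):
    """Return migration source with the old field reference swapped out."""
    kept = []
    for line in source.splitlines(keepends=True):
        if line.strip() == old_import:
            continue  # drop the now-unused import line
        kept.append(line.replace(old_ref, new_ref))
    return "".join(kept)


def rewrite_migrations(migrations_dir):
    """Rewrite numbered migration files in place; return the changed file names."""
    changed = []
    for path in sorted(Path(migrations_dir).glob("[0-9]*.py")):
        original = path.read_text()
        updated = strip_third_party_field(original)
        if updated != original:
            path.write_text(updated)
            changed.append(path.name)
    return changed
```

Calling rewrite_migrations("myapp/migrations") would then touch only the files that actually mention the old field.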
Should I be adding the Django migration files in the .gitignore file?
I've recently been getting a lot of git issues due to migration conflicts and was wondering if I should be marking migration files as ignore.
If so, how would I go about adding all of the migrations that I have in my apps, and adding them to the .gitignore file?
Quoting from the Django migrations documentation:
The migration files for each app live in a “migrations” directory inside of that app, and are designed to be committed to, and distributed as part of, its codebase. You should be making them once on your development machine and then running the same migrations on your colleagues’ machines, your staging machines, and eventually your production machines.
If you follow this process, you shouldn't be getting any merge conflicts in the migration files.
When merging version control branches, you may still encounter a situation where multiple migrations are based on the same parent migration, e.g. if two different developers introduced a migration concurrently. One way of resolving this situation is to introduce a merge migration. Often this can be done automatically with the command
./manage.py makemigrations --merge
which will introduce a new migration that depends on all current head migrations. Of course, this only works when there is no conflict between the head migrations; if there is one, you will have to resolve the problem manually.
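The "multiple heads" situation can be pictured with a simplified dependency graph. This is only an illustration with made-up migration names and a made-up find_heads helper; Django's own loader is more involved.

```python
def find_heads(dependencies):
    """Return migrations that no other migration depends on.

    `dependencies` maps a migration name to the list of its parents.
    """
    parents = {parent for deps in dependencies.values() for parent in deps}
    return sorted(name for name in dependencies if name not in parents)


# Two developers each added a migration on top of 0001_initial.
graph = {
    "0001_initial": [],
    "0002_add_title": ["0001_initial"],
    "0002_add_slug": ["0001_initial"],
}
heads_before = find_heads(graph)

# A merge migration simply depends on every current head.
graph["0003_merge"] = list(heads_before)
heads_after = find_heads(graph)
```

Before the merge there are two heads; afterwards the merge migration is the only one, so the history is linear again from that point on.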
Given that some people here suggested that you shouldn't commit your migrations to version control, I'd like to expand on the reasons why you actually should do so.
First, you need a record of the migrations applied to your production systems. If you deploy changes to production and want to migrate the database, you need a description of the current state. You can create a separate backup of the migrations applied to each production database, but this seems unnecessarily cumbersome.
Second, migrations often contain custom, handwritten code. It's not always possible to automatically generate them with ./manage.py makemigrations.
Third, migrations should be included in code review. They are significant changes to your production system, and there are lots of things that can go wrong with them.
So in short, if you care about your production data, please check your migrations into version control.
You can follow the below process.
You can run makemigrations locally and this creates the migration file. Commit this new migration file to repo.
In my opinion you should not run makemigrations in production at all. You can run migrate in production and you will see the migrations are applied from the migration file that you committed from local. This way you can avoid all conflicts.
IN LOCAL ENV, to create the migration files,
python manage.py makemigrations
python manage.py migrate
Now commit these newly created files, something like below.
git add app/migrations/...
git commit -m 'add migration files' app/migrations/...
IN PRODUCTION ENV, run only the below command.
python manage.py migrate
Quote from the 2022 docs, Django 4.0. (two separate commands = makemigrations and migrate)
The reason that there are separate commands to make and apply
migrations is because you’ll commit migrations to your version control
system and ship them with your app; they not only make your
development easier, they’re also useable by other developers and in
production.
https://docs.djangoproject.com/en/4.0/intro/tutorial02/
TL;DR: commit migrations, resolve migration conflicts, adjust your git workflow.
Feels like you'd need to adjust your git workflow, instead of ignoring conflicts.
Ideally, every new feature is developed in a different branch, and merged back with a pull request.
PRs cannot be merged if there's a conflict, so whoever wants to merge a feature needs to resolve the conflict, migrations included. This might require coordination between different teams.
It is important though to commit migration files! If a conflict arises, Django might even help you solve those conflicts ;)
I can't imagine why you would be getting conflicts, unless you're editing the migrations somehow? That usually ends badly - if someone misses some intermediate commits then they won't be upgrading from the correct version, and their copy of the database will be corrupted.
The process that I follow is pretty simple - whenever you change the models for an app, you also commit a migration, and then that migration doesn't change - if you need something different in the model, then you change the model and commit a new migration alongside your changes.
In greenfield projects, you can often delete the migrations and start over from scratch with a 0001_ migration when you release, but if you have production code, then you can't (though you can squash migrations down into one).
The solution usually used, is that, before anything is merged into master, the developer must pull any remote changes. If there's a conflict in migration versions, he should rename his local migration (the remote one has been run by other devs, and, potentially, in production), to N+1.
During development it might be okay to just not-commit migrations (don't add an ignore though, just don't add them). But once you've gone into production, you'll need them in order to keep the schema in sync with model changes.
You then need to edit the file, and change the dependencies to the latest remote version.
This works for Django migrations, as well as other similar apps (sqlalchemy+alembic, RoR, etc).
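The rename-to-N+1 step and the dependency change can be sketched as a small helper. The function name, the app label blog and the migration names are all hypothetical; the sketch only computes the new values and does not touch any files.

```python
def renumber_migration(local_name, latest_remote_name, app_label):
    """Return (new_name, dependencies) for a local migration that must
    follow the latest remote one.

    Names are migration module names without '.py', e.g. '0005_add_slug'.
    """
    remote_number = int(latest_remote_name.split("_", 1)[0])
    _, description = local_name.split("_", 1)
    new_name = f"{remote_number + 1:04d}_{description}"
    # The renamed migration should depend on the latest remote migration.
    dependencies = [(app_label, latest_remote_name)]
    return new_name, dependencies


new_name, dependencies = renumber_migration(
    "0005_add_slug", "0005_add_title", "blog"
)
```

The local file would then be renamed to the returned name and its dependencies list edited to the returned value.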
Gitignore the migrations if you have separate DBs for the development, staging and production environments. For dev purposes you can use a local sqlite DB and play with migrations locally.
I would recommend that you create four additional branches:
Master - Clean fresh code without migrations. Nobody is connected to this branch. Used for code reviews only
Development - daily development. Push/pull accepted. Each developer is working on sqlite DB
Cloud_DEV_env - remote cloud/server DEV environment. Pull only. Keep migrations locally on machine, which is used for the code deployment and remote migrations of Dev database
Cloud_STAG_env - remote cloud/server STAG environment. Pull only. Keep migrations locally on machine, which is used for the code deployment and remote migrations of Stag database
Cloud_PROD_env - remote cloud/server PROD environment. Pull only. Keep migrations locally on machine, which is used for the code deployment and remote migrations of Prod database
Notes:
2, 3, 4 - migrations can be kept in repos, but there should be strict rules for merging pull requests, so we decided to appoint one person responsible for deployments. The only person who has all the migration files is our deployer. He maintains the remote DB migrations each time we have any changes in the models.
You should think of migrations as a version control system for your database schema. makemigrations is responsible for packaging up your model changes into individual migration files - analogous to commits - and migrate is responsible for applying those to your database.
The migration files for each app live in a “migrations” directory inside of that app, and are designed to be committed to, and distributed as part of, its codebase. You should be making them once on your development machine and then running the same migrations on your colleagues’ machines, your staging machines, and eventually your production machines.
Golden rule: make once on dev and migrate on all.
Having a bunch of migration files in git is messy. There is only one file in the migrations folder that you should not ignore: the __init__.py file. If you ignore it, Python will no longer look for submodules inside the directory, so any attempt to import the modules will fail. So the question should be: how do you ignore all migration files but __init__.py?
The solution is:
Add '0*.py' to .gitignore files and it does the job perfectly.
Hope this helps someone.
Committing your migrations is just a recipe for disaster. Because migrations form something of a chain that can be traced back, dependencies from a former migration can linger, e.g. a pip module that you used at some point in your project's lifecycle and then stopped using. You might find breadcrumbs of such dependencies in your migration history, and you have to manually remove these imports from the migration files.
Verdict: unless you are a god-tier Django dev, probably avoid adding migrations to your commits.
Short answer
I propose excluding migrations in the repo. After code merge, just run ./manage.py makemigrations and you are all set.
Long answer
I don't think you should put migration files into the repo. They will spoil the migration states in other people's dev environments and in the prod and stage environments. (refer to Sugar Tang's comment for examples).
In my view, the purpose of Django migrations is to find the gaps between previous model states and new model states, and then serialise them. If your models change after a code merge, you can simply run makemigrations to find the gap. Why would you want to manually and carefully merge other people's migrations when you can achieve the same automatically and bug-free? The Django documentation says,
They’re [migrations are] designed to be mostly automatic
Please keep it that way. To merge migrations manually, you have to fully understand what others have changed and any dependencies of those changes. That's a lot of overhead and error-prone. So tracking the models file is sufficient.
It is a good topic on the workflow. I am open to other options.