In our project we have multiple databases and we use alembic for migration.
I know that alembic is supposed to be used only for database structure migration, but we also use it for data migration as it's convenient to have all database migration code in one place.
My problem is that alembic works on one database at a time. So if I have databases DB1 and DB2, alembic will first run all migrations for DB1 and after that all migrations for DB2.
The problems start when we migrate data between databases. For example, if a migration at revision N of DB1 tries to access data in DB2, it can fail because DB2 may still be at revision zero or N-X.
Question: is it possible to run alembic migrations one by one for all databases instead of running all migrations for DB1 and then running all for DB2?
My current env.py migration function:
def run_migrations_online():
    """
    for the direct-to-DB use case, start a transaction on all
    engines, then run all migrations, then commit all transactions.
    """
    engines = {}
    for name in re.split(r',\s*', db_names):
        engines[name] = rec = {}
        cfg = context.config.get_section(name)
        if 'sqlalchemy.url' not in cfg:
            cfg['sqlalchemy.url'] = build_url(name)
        rec['engine'] = engine_from_config(
            cfg,
            prefix='sqlalchemy.',
            poolclass=pool.NullPool)

    for name, rec in engines.items():
        engine = rec['engine']
        rec['connection'] = conn = engine.connect()
        rec['transaction'] = conn.begin()

    try:
        for name, rec in engines.items():
            logger.info("Migrating database %s" % name)
            context.configure(
                connection=rec['connection'],
                upgrade_token="%s_upgrades" % name,
                downgrade_token="%s_downgrades" % name,
                target_metadata=target_metadata.get(name))
            context.run_migrations(engine_name=name)

        for rec in engines.values():
            rec['transaction'].commit()
    except:
        for rec in engines.values():
            rec['transaction'].rollback()
        raise
    finally:
        for rec in engines.values():
            rec['connection'].close()
While I haven't tested this myself, I have been reading https://alembic.sqlalchemy.org/en/latest/api/script.html
It seems feasible that you could use ScriptDirectory to iterate through all the revisions, check if each db needs to apply that revision, and then rather than context.run_migrations you could manually call command.upgrade(config, revision) to apply that one revision.
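A minimal, untested sketch of that idea follows. It assumes a linear revision history and that each database has its own Alembic Config whose env.py run targets only that database (an assumption about your setup; the multidb env.py above migrates every database on each run). `interleave` and `upgrade_in_lockstep` are hypothetical helper names; `ScriptDirectory.from_config`, `walk_revisions` and `command.upgrade` are the Alembic calls.

```python
from typing import Dict, Iterable, Iterator, Tuple


def interleave(revisions: Iterable[str], db_names: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (revision, db_name) pairs so that every database reaches
    revision N before any database moves on to revision N+1."""
    db_names = list(db_names)
    for revision in revisions:
        for db_name in db_names:
            yield revision, db_name


def upgrade_in_lockstep(configs: Dict[str, "alembic.config.Config"]) -> None:
    """Apply revisions one at a time across all databases.

    `configs` maps a database name to an Alembic Config that targets it
    (hypothetical: how you build these depends on your env.py).
    """
    # Imported lazily so the pure helper above is usable without Alembic.
    from alembic import command
    from alembic.script import ScriptDirectory

    any_config = next(iter(configs.values()))
    script = ScriptDirectory.from_config(any_config)
    # walk_revisions() goes from head down to base, so reverse it.
    ordered = [rev.revision for rev in script.walk_revisions()]
    ordered.reverse()

    for revision, db_name in interleave(ordered, configs):
        # Move this one database up to exactly this revision.
        command.upgrade(configs[db_name], revision)
```

With revisions r1, r2 and databases DB1, DB2, the order of upgrades becomes (r1, DB1), (r1, DB2), (r2, DB1), (r2, DB2), so a data migration in r2 can rely on both databases being at r1.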
Related
I have multiple schemas in my database, and several models per schema. Flask-Migrate (which wraps Alembic) is unable to detect changes in any schema besides the public schema. Running
flask db migrate
followed by
flask db upgrade
will yield an error every time because the tables are already created. How can I configure alembic to recognize other schemas besides the public schema?
Modify your env.py file created by Alembic so that the context.configure function is called using the include_schemas=True option. Ensure that this is done in both your offline and online functions.
Here are my modified run_migrations_offline and run_migrations_online functions.
def run_migrations_offline():
    """Run migrations in 'offline' mode.

    This configures the context with just a URL
    and not an Engine, though an Engine is acceptable
    here as well. By skipping the Engine creation
    we don't even need a DBAPI to be available.

    Calls to context.execute() here emit the given string to the
    script output.
    """
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url, target_metadata=get_metadata(), literal_binds=True, include_schemas=True
    )

    with context.begin_transaction():
        context.run_migrations()
def run_migrations_online():
    """Run migrations in 'online' mode.

    In this scenario we need to create an Engine
    and associate a connection with the context.
    """

    # this callback is used to prevent an auto-migration from being generated
    # when there are no changes to the schema
    # reference: http://alembic.zzzcomputing.com/en/latest/cookbook.html
    def process_revision_directives(context, revision, directives):
        if getattr(config.cmd_opts, 'autogenerate', False):
            script = directives[0]
            if script.upgrade_ops.is_empty():
                directives[:] = []
                logger.info('No changes in schema detected.')

    connectable = get_engine()

    with connectable.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=get_metadata(),
            process_revision_directives=process_revision_directives,
            **current_app.extensions['migrate'].configure_args,
            include_schemas=True
        )

        with context.begin_transaction():
            context.run_migrations()
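With include_schemas=True, autogenerate reflects every schema the connection can see. If that picks up schemas you don't manage, Alembic (1.5 and later) accepts an include_name hook in context.configure to limit which schemas are considered. A minimal sketch, assuming hypothetical schema names "billing" and "audit"; the default schema is passed to the hook as None:

```python
# Hypothetical schema names; replace them with the schemas your models use.
MANAGED_SCHEMAS = {None, "billing", "audit"}  # None stands for the default schema


def include_name(name, type_, parent_names):
    """Tell Alembic autogenerate which reflected names to consider."""
    if type_ == "schema":
        return name in MANAGED_SCHEMAS
    # Tables, columns, indexes and so on are left unfiltered here.
    return True
```

It would be passed as include_name=include_name next to include_schemas=True in both context.configure calls.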
I have a Flask application and am trying to make it multi-tenant using multiple schemas in a single database.
When the database needs an alteration, such as adding a column or a table, I need to apply the migration to each schema. I changed my migrations/env.py as shown below:
def run_migrations_online():
    """Run migrations in 'online' mode.

    In this scenario we need to create an Engine
    and associate a connection with the context.
    """
    engine = engine_from_config(
        config.get_section(config.config_ini_section),
        prefix='sqlalchemy.',
        poolclass=pool.NullPool)

    # schemas = set([prototype_schema,None])
    connection = engine.connect()
    context.configure(
        connection=connection,
        target_metadata=target_metadata,
        include_schemas=True,  # schemas,
        # include_object=include_schemas([None,prototype_schema])
        include_object=include_schemas([None])
    )

    try:
        domains = ['public', 'test', 'some_schema_name']
        for domain in domains:
            connection.execute('set search_path to "{}", public'.format(domain))
            with context.begin_transaction():
                context.run_migrations()
    finally:
        connection.close()
The migrations only affect the first schema in the list; here, only public gets migrated. I need to migrate across all schemas.
QUESTION:
How can I exclude the logs migrations from the default database when using multiple databases in Django?
I want this to be automated, so I started by overriding the migrate command.
I use the default database for all models in my application, and I need a new database, logs, for only one model (the model lives in a different app, logs).
I have successfully connected the application to both databases. I am also using a router to control the operations:
class LogRouter:
    route_app_labels = {'logs'}

    def db_for_read(self, model, **hints):
        ...

    def db_for_write(self, model, **hints):
        ...

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        """
        Make sure the logs app only appear in the
        'logs' database.
        """
        if app_label in self.route_app_labels:
            return db == 'logs'
        if db != 'default':
            """
            If the database is not default, do not apply the migrations to the other
            database.
            """
            return False
        return None
With allow_migrate I am faking the logs migrations in the default database, which still updates the django_migrations table with the logs migrations.
Also with
if db != 'default':
    """
    If the database is not default, do not apply the migrations to the other database.
    """
    return False
I am faking the default database's migrations in the logs database, and again the django_migrations table is updated with all of the default database's migrations.
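To make the routing explicit, here is a small, Django-free sketch that replays the allow_migrate logic above for each database and app combination. LogRouterSketch and the "users" app label are hypothetical stand-ins used only for illustration; the real router would still be registered through DATABASE_ROUTERS.

```python
class LogRouterSketch:
    """Same allow_migrate logic as the router above, without Django."""

    route_app_labels = {"logs"}

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label in self.route_app_labels:
            # The logs app may only be migrated on the 'logs' database.
            return db == "logs"
        if db != "default":
            # Every other app is refused on non-default databases.
            return False
        # No opinion: Django falls back to allowing the migration.
        return None


router = LogRouterSketch()
decisions = {
    (db, app): router.allow_migrate(db, app)
    for db in ("default", "logs")
    for app in ("logs", "users")
}
for key, allowed in sorted(decisions.items()):
    print(key, allowed)
```

A False here only stops the schema operations for that app on that database. As described above, running migrate against that database still records the migration in its django_migrations table.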
This is a fine solution, but I want to achieve the following:
The logs migrations are ignored in the default database, including in its django_migrations table.
The default database's migrations are ignored in the logs database, including in its django_migrations table.
To achieve this, I tried overriding the migrate command:
from django.core.management.commands import migrate


class Command(migrate.Command):
    def handle(self, *args, **options):
        super(Command, self).handle(*args, **options)
        # this is equal to python manage.py migrate logs --database=logs
        # This will execute only the logs migrations in the logs database
        options['app_label'] = options['database'] = 'logs'
        super(Command, self).handle(*args, **options)
With this code I am fixing the logs database, but the default still tries to execute the logs migrations (it is writing them down in the django_migrations table)
I'd like to connect to a second external database during my migration to move some of its data into my local database. What's the best way to do this?
Once the second db has been added to the alembic context (which I am not sure how to do), how can I run SQL statements on that db during my migration?
This is what my env.py looks like right now:
from alembic import context
from sqlalchemy import engine_from_config, pool
from logging.config import fileConfig
from migration_settings import database_url
import models

# this is the Alembic Config object, which provides
# access to the values within the .ini file in use.
config = context.config

# Interpret the config file for Python logging.
# This line sets up loggers basically.
fileConfig(config.config_file_name)

# add your model's MetaData object here
# for 'autogenerate' support
target_metadata = models.Base.metadata

# other values from the config, defined by the needs of env.py,
# can be acquired:
# my_important_option = config.get_main_option("my_important_option")
# ... etc.


def run_migrations_offline():
    """Run migrations in 'offline' mode.

    This configures the context with just a URL
    and not an Engine, though an Engine is acceptable
    here as well. By skipping the Engine creation
    we don't even need a DBAPI to be available.

    Calls to context.execute() here emit the given string to the
    script output.
    """
    url = database_url or config.get_main_option("sqlalchemy.url")
    context.configure(url=url, target_metadata=target_metadata, literal_binds=True, version_table_schema='my_schema')

    with context.begin_transaction():
        context.run_migrations()


def run_migrations_online():
    """Run migrations in 'online' mode.

    In this scenario we need to create an Engine
    and associate a connection with the context.
    """
    config_overrides = {'url': database_url} if database_url is not None else {}
    connectable = engine_from_config(
        config.get_section(config.config_ini_section),
        prefix='sqlalchemy.',
        poolclass=pool.NullPool, **config_overrides)

    with connectable.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=target_metadata,
            version_table_schema='my_schema'
        )
        connection.execute('CREATE SCHEMA IF NOT EXISTS my_schema')

        with context.begin_transaction():
            context.run_migrations()


if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
I have created some models, and when I run the python manage.py db migrate command it creates a migration file, which is fine.
The python manage.py db upgrade command also creates the tables in the database.
If I run python manage.py db migrate again, it creates a new migration file for the same models that I have just upgraded.
Can you please help me resolve this?
I had the same problem and resolved it.
In my case, the problem was in getting the current table names
(when get_table_names is called in _autogen_for_tables, in alembic/autogenerate/compare.py).
I am using SQLAlchemy with mysql-connector.
mysql-connector returns table information as bytearray,
so I temporarily changed the following (base.py in sqlalchemy/dialects/mysql):
#reflection.cache
def get_table_names(self, connection, schema=None, **kw):
    """Return a Unicode SHOW TABLES from a given schema."""
    if schema is not None:
        current_schema = schema
    else:
        current_schema = self.default_schema_name

    charset = self._connection_charset
    if self.server_version_info < (5, 0, 2):
        rp = connection.execute(
            "SHOW TABLES FROM %s"
            % self.identifier_preparer.quote_identifier(current_schema)
        )
        return [
            row[0] for row in self._compat_fetchall(rp, charset=charset)
        ]
    else:
        rp = connection.execute(
            "SHOW FULL TABLES FROM %s"
            % self.identifier_preparer.quote_identifier(current_schema)
        )

        return [
            row[0]
            for row in self._compat_fetchall(rp, charset=charset)
            if row[1] == "BASE TABLE"
        ]
to
#reflection.cache
def get_table_names(self, connection, schema=None, **kw):
    """Return a Unicode SHOW TABLES from a given schema."""
    if schema is not None:
        current_schema = schema
    else:
        current_schema = self.default_schema_name

    charset = self._connection_charset
    if self.server_version_info < (5, 0, 2):
        rp = connection.execute(
            "SHOW TABLES FROM %s"
            % self.identifier_preparer.quote_identifier(current_schema)
        )
        return [
            row[0] for row in self._compat_fetchall(rp, charset=charset)
        ]
    else:
        rp = connection.execute(
            "SHOW FULL TABLES FROM %s"
            % self.identifier_preparer.quote_identifier(current_schema)
        )

        return [
            row[0].decode("utf-8")
            for row in self._compat_fetchall(rp, charset=charset)
            if row[1].decode("utf-8") == "BASE TABLE"
        ]
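One caveat with the patch above: calling .decode() unconditionally fails if the driver returns str for some rows. A more tolerant variant is sketched below; to_text is a hypothetical helper, not part of SQLAlchemy, and UTF-8 is an assumption about the connection charset.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as str, decoding bytes-like input if needed."""
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding)
    return value


# Rows shaped like the result of SHOW FULL TABLES: (name, table_type).
rows = [
    (bytearray(b"users"), bytearray(b"BASE TABLE")),
    ("orders", "BASE TABLE"),
    (b"user_view", b"VIEW"),
]
table_names = [to_text(name) for name, kind in rows if to_text(kind) == "BASE TABLE"]
print(table_names)
```

The list comprehension mirrors the one in get_table_names, with both the name and the table type passed through the helper before use.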
I think the problem is in manage.py. If you did it as described on the Flask-Migrate site and stored all your models in this file, Flask-Migrate just picks up these models and generates migrations for them, and will do so every time. You wrapped the standard command in your file, and this is the problem.
To fix it, store the models in another directory (or another file), add them to the app, and use the command flask db migrate. In this case, Flask-Migrate will generate a full migration for the models only the first time; afterwards it will detect changes and generate migrations only for those changes.
But be careful: Flask-Migrate doesn't see all changes. From the site:
The migration script needs to be reviewed and edited, as Alembic currently does not detect every change you make to your models. In particular, Alembic is currently unable to detect table name changes, column name changes, or anonymously named constraints. A detailed summary of limitations can be found in the Alembic autogenerate documentation.