How to disable DDL transaction in an alembic migration - python

I am trying to run an Alembic migration. However, all migrations run in a transaction whenever transactions are supported (see Run alembic upgrade migrations in a transaction). How do I disable the transaction for a specific migration?

Alembic used to have just two modes of using transactions:
- One transaction for the whole migration command. If there are multiple versions to apply, they all run in that single transaction.
- A separate transaction per migration step.
However, as of version 1.2.0 (released September 2019), you can now also switch to the AUTOCOMMIT transaction level by using the MigrationContext.autocommit_block() context manager. When in this transaction mode, each statement is committed immediately. Note that there are caveats to using this feature, see below.
By default a single transaction is used, but you can call context.configure() in your env.py script with transaction_per_migration=True to use a separate transaction per migration step (see the sketch below).
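A minimal sketch of what that looks like in run_migrations_online(); the connection and target_metadata variables come from the surrounding env.py boilerplate that alembic init generates:

def run_migrations_online():
    # ... engine/connection setup as generated by alembic init ...
    context.configure(
        connection=connection,
        target_metadata=target_metadata,
        transaction_per_migration=True,  # one transaction per migration step
    )

    with context.begin_transaction():
        context.run_migrations()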
The first and default option, to use a single transaction, is executed in the env.py file that Alembic generates for you, in the run_migrations_online() function in that file:
try:
    with context.begin_transaction():
        context.run_migrations()
finally:
    connection.close()
You could either just edit that file to remove the with context.begin_transaction(): context manager, or use the context.get_x_argument() feature to toggle transactions on the basis of a command-line switch:
try:
    # Python 3.7+
    from contextlib import nullcontext
except ImportError:
    # Earlier Python versions
    from contextlib import contextmanager

    @contextmanager
    def nullcontext():
        yield

# ...

def run_migrations_online():
    # ...
    if context.get_x_argument(as_dictionary=True).get('no-transaction', False):
        transaction_cm = nullcontext()
    else:
        transaction_cm = context.begin_transaction()

    try:
        with transaction_cm:
            context.run_migrations()
    finally:
        connection.close()
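You would then trigger the switch from the command line with Alembic's -x option, along these lines (the exact key just has to match what env.py checks for):

alembic -x no-transaction=1 upgrade head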
To disable a transaction per migration step or for specific operations, you can use the aforementioned autocommit_block(), which is intended to be used for DDL statements that the database requires to be run outside of a transaction context:
def upgrade():
    with op.get_context().autocommit_block():
        op.execute("ALTER TYPE mood ADD VALUE 'soso'")
The above example (taken from the documentation) uses the Operations.get_context() method to get access to the migration context. Inside the block, each statement is committed immediately rather than running in a transaction.
The caveat is that any transaction currently in progress is committed first. If the statements before and after such a block belong together and should not be executed without one another, avoid placing an autocommit_block() in between. You also probably want to set transaction_per_migration=True and use autocommit_block() for entire migration steps, as sketched below. That way you can at least minimise the damage if a migration step fails halfway through.
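A hypothetical migration step run entirely in autocommit mode could look like this (the index and table names are made up; CREATE INDEX CONCURRENTLY is a typical PostgreSQL statement that refuses to run inside a transaction):

def upgrade():
    # the whole step runs in AUTOCOMMIT; combine with transaction_per_migration=True
    with op.get_context().autocommit_block():
        op.execute("CREATE INDEX CONCURRENTLY ix_my_table_col ON my_table (col)")


def downgrade():
    with op.get_context().autocommit_block():
        op.execute("DROP INDEX CONCURRENTLY ix_my_table_col")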
Before version 1.2.0, it was not easy to disable transactions per migration step. You'd have to disable transactions entirely (just don't use context.begin_transaction() in env.py), then explicitly use a transaction per upgrade() or downgrade() step:
def run_migrations_online():
    # ...
    try:
        # no with context.begin_transaction() here
        context.run_migrations()
    finally:
        connection.close()
and in each migration step:
def upgrade():
    with context.begin_transaction():
        # ### commands auto generated by Alembic - please adjust! ###
        op.create_table(
            # ...
        )
        # etc.

This can be done using an autocommit block:
with op.get_context().autocommit_block():
    op.execute(...)
https://alembic.sqlalchemy.org/en/latest/api/runtime.html#alembic.runtime.migration.MigrationContext.autocommit_block
This special directive is intended to support the occasional database DDL or system operation that specifically has to be run outside of any kind of transaction block. The PostgreSQL database platform is the most common target for this style of operation, as many of its DDL operations must be run outside of transaction blocks, even though the database overall supports transactional DDL.
Note that there are some caveats:
Warning: As is necessary, the database transaction preceding the block is unconditionally committed. This means that the run of migrations preceding the operation will be committed, before the overall migration operation is complete.
It is recommended that when an application includes migrations with “autocommit” blocks, that EnvironmentContext.transaction_per_migration be used so that the calling environment is tuned to expect short per-file migrations whether or not one of them has an autocommit block.

Related

psycopg2 WITHOUT transaction

Sometimes I have a need to execute a query from psycopg2 that is not in a transaction block.
For example:
cursor.execute('create index concurrently on my_table (some_column)')
Doesn't work:
InternalError: CREATE INDEX CONCURRENTLY cannot run inside a transaction block
I don't see any easy way to do this with psycopg2. What am I missing?
I can probably call os.system('psql -c "create index concurrently"') or something similar to get it to run from my python code, however it would be much nicer to be able to do it inside python and not rely on psql to actually be in the container.
Yes, I have to use the concurrently option for this particular use case.
Another time I've explored this and not found an obvious answer is when I have a set of SQL commands that I'd like to call with a single execute(), where the first one briefly locks a resource. When I do this, that resource remains locked for the entire duration of the execute() rather than only while the first statement in the SQL string is running, because they all run together in one big happy transaction.
In that case I could break the query up into a series of execute() statements - each became its own transaction, which was ok.
It seems like there should be a way, but I seem to be missing it. Hopefully this is an easy answer for someone.
EDIT: Add code sample:
#!/usr/bin/env python3.10
import psycopg2 as pg2

# Set the standard psql environment variables to specify which database this should connect to.
# We have to set these to 'None' explicitly to get psycopg2 to use the env variables.
connDetails = {'database': None, 'host': None, 'port': None, 'user': None, 'password': None}

with (pg2.connect(**connDetails) as conn, conn.cursor() as curs):
    conn.set_session(autocommit=True)
    curs.execute("""
        create index concurrently if not exists my_new_index on my_table (my_column);
    """)
Throws:
psycopg2.errors.ActiveSqlTransaction: CREATE INDEX CONCURRENTLY cannot run inside a transaction block
Per psycopg2 documentation:
It is possible to set the connection in autocommit mode: this way all the commands executed will be immediately committed and no rollback is possible. A few commands (e.g. CREATE DATABASE, VACUUM, CALL on stored procedures using transaction control…) require to be run outside any transaction: in order to be able to run these commands from Psycopg, the connection must be in autocommit mode: you can use the autocommit property.
Hence on the connection:
conn.set_session(autocommit=True)
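A minimal working sketch (the table, column and index names are placeholders): set autocommit right after connecting, before the first statement runs, so CREATE INDEX CONCURRENTLY is never wrapped in a transaction:

import psycopg2

conn = psycopg2.connect()  # reads the usual PG* environment variables / libpq defaults
conn.autocommit = True     # every statement commits immediately; no transaction block
with conn.cursor() as curs:
    curs.execute("create index concurrently if not exists my_new_index on my_table (my_column)")
conn.close()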
Further resources from psycopg2 documentation:
transactions-control
connection.autocommit

Pytest-django: cannot delete db after tests

I have a Django application, and I'm trying to test it using pytest and pytest-django. However, quite often, when the tests finish running, I get the error that the database failed to be deleted: DETAIL: There is 1 other session using the database.
Basically, the minimum test code that I could narrow it down to is:
@pytest.fixture
def make_bundle():
    a = MyUser.objects.create(key_id=uuid.uuid4())
    return a


class TestThings:
    def test_it(self, make_bundle):
        all_users = list(MyUser.objects.all())
        assert_that(all_users, has_length(1))
Every now and again the tests will fail with the above error. Is there something I am doing wrong? Or how can I fix this?
The database that I am using is PostgreSQL 9.6.
I am posting this as an answer because I need to post a chunk of code and because this worked. However, this looks like a dirty hack to me, and I'll be more than happy to accept anybody else's answer if it is better.
Here's my solution: basically, add the raw sql that kicks out all the users from the given db to the method that destroys the db. And do that by monkeypatching. To ensure that the monkeypatching happens before tests, add that to the root conftest.py file as an autouse fixture:
import pytest
from django.db.backends.base import creation  # where BaseDatabaseCreation lives in recent Django versions


def _destroy_test_db(self, test_database_name, verbosity):
    """
    Internal implementation - remove the test db tables.
    """
    # Remove the test database to clean up after
    # ourselves. Connect to the previous database (not the test database)
    # to do so, because it's not allowed to delete a database while being
    # connected to it.
    with self.connection._nodb_connection.cursor() as cursor:
        cursor.execute(
            "SELECT pg_terminate_backend(pg_stat_activity.pid) "
            "FROM pg_stat_activity "
            "WHERE pg_stat_activity.datname = '{}' "
            "AND pid <> pg_backend_pid();".format(test_database_name)
        )
        cursor.execute("DROP DATABASE %s"
                       % self.connection.ops.quote_name(test_database_name))


@pytest.fixture(autouse=True)
def patch_db_cleanup():
    creation.BaseDatabaseCreation._destroy_test_db = _destroy_test_db
Note that the kicking-out code may depend on your database engine, and the method that needs monkeypatching may be different in different Django versions.

testing postgres db python

I don't understand how to test my repositories.
I want to be sure that I really saved object with all of it parameters into database, and when I execute my SQL statement I really received what I am supposed to.
But I cannot put "CREATE TABLE test_table" in the setUp method of a unittest case, because it would be created multiple times (tests of the same test case are run in parallel). So as soon as I create two methods in the same class which need to work on the same table, it won't work (name clash of tables).
Likewise, I cannot put "CREATE TABLE test_table" in setUpModule, because now the table is created once, but since tests are run in parallel there is nothing that prevents inserting the same object multiple times into my table, which breaks the uniqueness constraint of some field.
Likewise, I cannot "CREATE SCHEMA some_random_schema_name" in every method, because I need to globally "SET search_path TO ..." for a given database, so every method run in parallel will be affected.
The only way I see is to "CREATE DATABASE" for each test, with a unique name, and establish an individual connection to each database. This looks extremely wasteful. Is there a better way?
Also, I cannot use SQLite in memory because I need to test PostgreSQL.
The best solution for this is to use the testing.postgresql module. This fires up a db in user-space, then deletes it again at the end of the run. You can put the following in a unittest suite - either in setUp, setUpClass or setUpModule - depending on what persistence you want:
import testing.postgresql


def setUp(self):
    self.postgresql = testing.postgresql.Postgresql(port=7654)
    # Get the url to connect to with psycopg2 or equivalent
    print(self.postgresql.url())


def tearDown(self):
    self.postgresql.stop()
If you want the database to persist between/after tests, you can run it with the base_dir option to set a directory - which will prevent its removal after shutdown:
name = "testdb"
port = "5678"
path = "/tmp/my_test_db"
testing.postgresql.Postgresql(name=name, port=port, base_dir=path)
Outside of testing it can also be used as a context manager, where it will automatically clean up and shut down when the with block is exited:
with testing.postgresql.Postgresql(port=7654) as psql:
    # do something here
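For example, a small sketch of what "something" might be, connecting to the temporary server with psycopg2 via the dsn() helper that testing.postgresql provides:

import psycopg2
import testing.postgresql

with testing.postgresql.Postgresql() as psql:
    # dsn() returns a dict of connection parameters for the throwaway server
    conn = psycopg2.connect(**psql.dsn())
    with conn.cursor() as curs:
        curs.execute("SELECT 1")
        print(curs.fetchone())
    conn.close()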

How to executescript in sqlite3 from Python transactionally? [duplicate]

Context
So I am trying to figure out how to properly override the auto-transaction when using SQLite in Python. When I try and run
cursor.execute("BEGIN;")
.....an assortment of insert statements...
cursor.execute("END;")
I get the following error:
OperationalError: cannot commit - no transaction is active
Which I understand is because SQLite in Python automatically opens a transaction on each modifying statement, which in this case is an INSERT.
Question:
I am trying to speed my insertion by doing one transaction per several thousand records.
How can I overcome the automatic opening of transactions?
As @CL. said, you have to set the isolation level to None. Code example:
import sqlite3

s = sqlite3.connect("./data.db")
s.isolation_level = None
try:
    c = s.cursor()
    c.execute("begin")
    ...
    c.execute("commit")
except Exception:
    c.execute("rollback")
The documentation says:
You can control which kind of BEGIN statements sqlite3 implicitly executes (or none at all) via the isolation_level parameter to the connect() call, or via the isolation_level property of connections.
If you want autocommit mode, then set isolation_level to None.
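To tie this back to the goal of batching several thousand inserts into a single transaction, a sketch along these lines should work (the table and data are hypothetical):

import sqlite3

conn = sqlite3.connect("./data.db")
conn.isolation_level = None  # disable the implicit BEGIN handling
cur = conn.cursor()
rows = [(i, "name-%d" % i) for i in range(5000)]  # hypothetical data

cur.execute("begin")
try:
    cur.executemany("insert into my_table (id, name) values (?, ?)", rows)
    cur.execute("commit")
except Exception:
    cur.execute("rollback")
    raise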

How to force django to print each executed sql query

I have a function written in Python. I want to know all the SQL queries that were executed within this function. Is there a way to code something like:
def f():
    start_to_print_queries()
    # ...
    # many many python code
    # ...
    stop_to_print_queries()
?
You can use django testing tools to capture queries on a connection. Assuming the default connection, something like this should work:
from django.db import connection
from django.test.utils import CaptureQueriesContext


def f():
    with CaptureQueriesContext(connection) as queries:
        # ...
        # many many python code
        # ...
        print(len(queries.captured_queries))
Note that this will only work in debug mode (settings.DEBUG = True), because it relies on the engine capturing the queries. If you are using more than one connection, simply substitute the connection you are interested in.
If you are interested in the detail of queries, queries.captured_queries contains detailed information: the sql code, the params and the timings of each request.
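Each entry in captured_queries is a dict, so at the end of (or after) the with block you can dump the raw SQL and timing of every query:

for query in queries.captured_queries:
    print(query['time'], query['sql'])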
Also, if you need to count queries while building test cases, you can simply assert the number, like this:
def test_the_function_queries(self):
    with self.assertNumQueries(42):  # check the_function does 42 queries.
        the_function()
If the test fails, Django will print all the queries for you to examine.
I would recommend the excellent django-debug-toolbar package. It allows you to interactively examine the SQL statements executed in a view, and even provides profiling information.
You can get it from pip:
pip install django-debug-toolbar
Include it in your settings.INSTALLED_APPS:
INSTALLED_APPS = (
    # ...
    'django.contrib.staticfiles',
    # ...
    'debug_toolbar',
)
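Depending on the version you install, you will typically also need to add the toolbar's middleware, include its URLs and whitelist your IP, roughly as follows (paths per the django-debug-toolbar documentation):

# settings.py
MIDDLEWARE = [
    # ...
    'debug_toolbar.middleware.DebugToolbarMiddleware',
]

INTERNAL_IPS = ['127.0.0.1']

# urls.py
from django.urls import include, path

urlpatterns = [
    # ...
    path('__debug__/', include('debug_toolbar.urls')),
]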
When running your project with DEBUG=True you should see a DjDT button in the top right corner.
Expanding the SQL tab will give you a detailed list of the sql queries.
