SQLAlchemy Rollback Deletes Committed Entries In Unit Test - python

I am writing a unit test for a new method, and it executes the expected lines in the expected order. However, when it hits a SQLAlchemy rollback command, it undoes not only the changes made in the sub-method being tested but also deletes all of the testing entries that I had created. Does anyone know why this might be happening?
EDIT: I have determined that the problem is that in my testing suite, db_session is a scoped_session, not the primary session. To get exactly the behavior I want, I would need to mock the db_session.rollback method to call the rollback of a nested session, something like the lines shown below. However, when I attempt to use those lines, I get an error. Does anyone have an idea of how I could do it instead?
nested_sess = self.db_session.begin_nested()
with patch.object(self.db_session, "rollback", nested_sess.rollback):
    ...
The code of the unit test is shown below (although it mostly consists of calls to other custom methods):
# Create the mock admin user
admin_id = random.randint(1, 90000)
test_utils.create_user_v2(
    db_session=self.db_session, user_args={"id": admin_id}
)
# Create a test site, user and active sub
test_items: dict = self.create_needed_entries(
    site_commissioned=True,
    active_sub_count=1,
    future_sub_count=0,
)
# Pull the IDs of the test items
test_item_ids: dict = self.pull_test_entry_ids(test_items)
# Get the needed IDs
site_id: int = test_items[self.SITE_KEY].id
# Mock the admin list constant
with patch(self.ADMIN_USER_LIST_PATCH, [admin_id]):
    # Test the handling method with the admin user
    response: MigrationResponse = handle_subscription_migration(
        db_session=self.db_session,
        site_id=site_id,
        user_id=admin_id,
        required_access="admin",
        cancel_subscription=True,
    )
    # Assert that the cancellation failed; Lifetimes not cancelled
    self.assert_unchanged_on_fail(response, test_item_ids)
There is a self.db_session.commit() statement in the method self.create_needed_entries, and everything works as expected (verified with the VSCode debugging tools) up until the last non-return line of handle_subscription_migration, which is a db_session.rollback() call. After the rollback executes, SQLAlchemy methods like Query.first and Query.all start returning None rather than the entries created.
I have other unit tests that share mostly the same logic but work as intended. The only salient difference I can see between them and the code shown above is that the others don't have the patch line.
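For reference, the SQLAlchemy documentation describes an external-transaction/SAVEPOINT recipe ("Joining a Session into an External Transaction") aimed at exactly this situation, where fixtures must survive a rollback() issued by the code under test. Below is only a minimal sketch of that recipe, assuming SQLAlchemy 2.0 and a hypothetical engine; older versions use an event-listener variant of the same idea, and the real test suite's session setup will differ.

import unittest
from sqlalchemy import create_engine
from sqlalchemy.orm import Session

# Hypothetical engine; the real suite's engine/session factory will differ.
engine = create_engine("postgresql://user:password@localhost/testdb")

class MigrationTestCase(unittest.TestCase):
    def setUp(self):
        # Outer transaction owned by the fixture; it is never committed.
        self.connection = engine.connect()
        self.trans = self.connection.begin()
        # "create_savepoint" makes the session run on SAVEPOINTs, so a
        # rollback() inside the code under test only rolls back to the
        # savepoint instead of wiping the fixtures created by the test.
        self.db_session = Session(
            bind=self.connection, join_transaction_mode="create_savepoint"
        )

    def tearDown(self):
        self.db_session.close()
        self.trans.rollback()   # discard everything the test touched
        self.connection.close()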

Related

Calling stored function or procedure won't insert and persist changes

So I am very confused about this weird behaviour I have with SQLAlchemy and PostgreSQL. Let's say I have a table:
create table staging.my_table(
    id integer DEFAULT nextval(...),
    name text,
    ...
);
and a stored function:
create or replace function staging.test()
returns void
language plpgsql
as $function$
begin
    insert into staging.my_table (name) values ('yay insert');
end;
$function$;
What I want to do now is call this function in Python with SQLAlchemy like this:
from sqlalchemy import create_engine
engine = create_engine('postgresql+psycopg2://foo:bar@localhost:5432/baz')
engine.execute('select staging.test()')
When I run this Python code, nothing gets inserted into my database. That's odd, because when I replace the function call with select 1 and add .fetchall() to it, the statement executes and I see the result in the console when I print it.
Say I run this code twice: nothing happens, but the code runs successfully without errors.
If I now switch to the database, run select staging.test(); and then select from my_table, I get: id: 3; name: yay insert.
So the sequence is actually advancing when I run my Python file, but no data ends up in my table.
What am I doing wrong? Am I missing something? I googled but didn't find anything.
This particular use case is singled out in "Understanding Autocommit":
Full control of the “autocommit” behavior is available using the generative Connection.execution_options() method provided on Connection, Engine, Executable, using the “autocommit” flag which will turn on or off the autocommit for the selected scope. For example, a text() construct representing a stored procedure that commits might use it so that a SELECT statement will issue a COMMIT:
engine.execute(text("SELECT my_mutating_procedure()").execution_options(autocommit=True))
The way SQLAlchemy autocommit detects data changing operations is that it matches the statement against a pattern, looking for things like UPDATE, DELETE, and the like. It is impossible for it to detect if a stored function/procedure performs mutations, and so explicit control over autocommit is provided.
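Applied to the example above, that would look roughly like the following. This is a sketch using the legacy 1.x engine.execute() API from the question, with an engine.begin() block as an alternative that also works on SQLAlchemy 1.4/2.0, where library-level autocommit has been removed:

from sqlalchemy import create_engine, text

engine = create_engine('postgresql+psycopg2://foo:bar@localhost:5432/baz')

# 1.x style: flag the statement as data-changing so SQLAlchemy issues a COMMIT.
engine.execute(
    text('select staging.test()').execution_options(autocommit=True)
)

# Version-agnostic alternative: run the call in an explicit transaction block,
# which commits on successful exit.
with engine.begin() as conn:
    conn.execute(text('select staging.test()'))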
The sequence is incremented even though the insert is discarded, because nextval() and setval() calls are never rolled back.
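A quick way to see that behaviour; note that the sequence name below is only a guess standing in for the elided nextval(...) argument:

from sqlalchemy import create_engine, text

engine = create_engine('postgresql+psycopg2://foo:bar@localhost:5432/baz')

with engine.connect() as conn:
    trans = conn.begin()
    # 'staging.my_table_id_seq' is a guess at the sequence behind nextval(...)
    first = conn.execute(text("select nextval('staging.my_table_id_seq')")).scalar()
    trans.rollback()  # the transaction is discarded, the sequence advance is not
    second = conn.execute(text("select nextval('staging.my_table_id_seq')")).scalar()
    print(first, second)  # second == first + 1 despite the rollback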

Peewee mid-air collision detection with SQLite

I have a couple models that I want to update at the same time. First I get their data from the db with a simple:
s = Store.get(Store.id == store_id)
new_book = Book.get(Book.id == data['book_id'])
old_book = Book.get(Book.id == s.books.id)
The actual schema is irrelevant here. Then I do some updates to these models and at the end I save all three of them with:
s.save()
new_book.save()
old_book.save()
The function that handles these operations uses the @db.atomic() decorator, so the writes are bunched into a single transaction. The problem is: what if, between the point where I get() the data from the DB and the point where I save the modified data, another process has already changed something in those rows? Is there a way to execute those writes (the .save() operations) only if the underlying DB rows have not been changed? I could read their last_changed value, but again, is there a way to do this and update at the same time, and simply throw an exception if the data has been changed?
It turns out there is a solution for this in the official docs, called Optimistic Locking.
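A minimal sketch of that pattern, assuming a version IntegerField on the model; the Book and ConflictDetected names here are illustrative rather than taken from the docs. Each save is a compare-and-set UPDATE that only succeeds if the version read earlier is still current:

from peewee import SqliteDatabase, Model, TextField, IntegerField

db = SqliteDatabase('books.db')  # hypothetical database file

class ConflictDetected(Exception):
    """Another process saved a newer version of this row."""

class Book(Model):
    title = TextField()
    version = IntegerField(default=1)  # bumped on every successful save

    class Meta:
        database = db

    def save_optimistic(self):
        # Compare-and-set: only update the row if it still has the version
        # we originally read, and bump the version in the same statement.
        rows = (Book
                .update(title=self.title, version=Book.version + 1)
                .where((Book.id == self.id) & (Book.version == self.version))
                .execute())
        if rows == 0:
            raise ConflictDetected('Book %s was changed by another process' % self.id)
        self.version += 1
        return rows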

neo4j: Deleting nodes within transaction causes exception

I have put together a simple test case:
account = EmailAccount()
account.email = "some@mail"
assert db.account_by_mail("some@mail") == []
db.add_node(account)
assert db.account_by_mail(account.email) == [account]
db.delete_node(account)
assert db.account_by_mail("some@mail") == []
All goes well until the last line, where an exception is thrown:
Neo.DatabaseError.Statement.ExecutionFailure: Node 226 has been deleted
The statement executed by the last line is as follows:
MATCH (account:Account) WHERE account.email = {mail} RETURN account, id(account), head(labels(account))
with parameters
{
    'mail': "some@mail"
}
All of the statements are executed within the same transaction (we use the py2neo Transaction class for that, wrapped in a session wrapper, db). The behavior isn't exactly in line with the delete semantics (link here), as the transaction hasn't been committed and the statement is a read, not a write. Are there some other hidden constraints? Is this the default behavior, and if so, can it be changed (since I assume most other DBMSs don't behave this way)?
The latest version of py2neo uses a lazy approach to node commits to avoid unnecessary amounts of network traffic (see http://py2neo.org/2.0/intro.html#nodes-relationships). As a result, you have to directly commit your changes to the graph in order for them to be persisted. Without seeing your db code more explicitly, if I understand correctly, you are not committing your transactions to the graph and, as such, you are simply modifying the state of objects not yet persisted.
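For illustration, here is a rough sketch of what an explicitly committed transaction looks like with py2neo 2.0's Cypher transaction API; the connection details and the Account label are assumptions, and your db wrapper may expose this differently:

from py2neo import Graph

graph = Graph()  # hypothetical connection; defaults to the local server

# Nothing appended to a transaction is persisted until commit() is called.
tx = graph.cypher.begin()
tx.append("CREATE (a:Account {email: {mail}})", {"mail": "some@mail"})
tx.commit()  # the node only exists in the graph after this point

# The delete likewise has to be committed before later reads reflect it.
tx = graph.cypher.begin()
tx.append("MATCH (a:Account {email: {mail}}) DELETE a", {"mail": "some@mail"})
tx.commit()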
This behavior is in line with other ORM implementations such as SQLAlchemy (Python) or Hibernate (Java), and closely mirrors the actual transactional demarcation that takes place at the DBMS level.

How to delete rows from a table using an SQLAlchemy query without ORM?

I'm writing a quick and dirty maintenance script to delete some rows and would like to avoid having to bring my ORM classes/mappings over from the main project. I have a query that looks similar to:
address_table = Table('address', metadata, autoload=True)
addresses = session.query(address_table).filter(address_table.c.retired == 1)
According to everything I've read, if I were using the ORM (not 'just' tables) and passed in something like:
addresses = session.query(Addresses).filter(addresses_table.c.retired == 1)
I could add a .delete() to the query, but when I try to do this using only tables I get a complaint:
File "/usr/local/lib/python2.6/dist-packages/sqlalchemy/orm/query.py", line 2146, in delete
target_cls = self._mapper_zero().class_
AttributeError: 'NoneType' object has no attribute 'class_'
Which makes sense, as it's a table, not a class. I'm quite green when it comes to SQLAlchemy; how should I be going about this?
Looking through some code where I did something similar, I believe this will do what you want.
d = addresses_table.delete().where(addresses_table.c.retired == 1)
d.execute()
Calling delete() on a table object gives you a sql.expression (if memory serves) that you then execute. I've assumed above that the table is bound to a connection, which means you can call execute() on it directly. If not, you can pass d to a connection's execute() method, i.e. conn.execute(d).
See docs here.
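If the table isn't bound to an engine, or you are on a modern SQLAlchemy release where implicit execution has been removed, roughly the same thing on an explicit connection looks like this (the connection URL is a placeholder):

from sqlalchemy import create_engine, MetaData, Table

engine = create_engine('postgresql://user:password@localhost/mydb')  # placeholder URL
metadata = MetaData()

# Reflect the table; autoload_with replaces autoload=True on SQLAlchemy 1.4+.
address_table = Table('address', metadata, autoload_with=engine)

stmt = address_table.delete().where(address_table.c.retired == 1)
with engine.begin() as conn:  # begin() commits on successful exit
    conn.execute(stmt)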
When you call delete() on a query object, SQLAlchemy performs a bulk deletion, and you need to choose a strategy for removing the matched objects from the session. See the documentation here.
If you do not choose a strategy for the removal of matched objects from the session, then SQLAlchemy will try to evaluate the query’s criteria in Python straight on the objects in the session. If evaluation of the criteria isn’t implemented, an error is raised.
This is what is happening with your deletion.
If you only want to delete the records and do not care about the records in the session after the deletion, you can choose the strategy that ignores the session synchronization:
address_table = Table('address', metadata, autoload=True)
addresses = session.query(address_table).filter(address_table.c.retired == 1)
addresses.delete(synchronize_session=False)

SQLAlchemy - MapperExtension.before_delete not called

I have a question regarding SQLAlchemy. I have a database which contains Items; every Item has multiple Records assigned to it (1:n), and each Record is partially stored in the database but also has an assigned file (1:1) on the filesystem.
What I want to do is to delete the assigned file when the Record is removed from the database. So I wrote the following MapperExtension:
class _StoredRecordEraser(MapperExtension):
    def before_delete(self, mapper, connection, instance):
        instance.erase()
The following code creates an experimental setup (full code is here: test.py):
session = Session()
i1 = Item(id='item1')
r11 = Record(id='record11', attr='1')
i1.records.append(r11)
r12 = Record(id='record12', attr='2')
i1.records.append(r12)
session.add(i1)
session.commit()
And finally, my problem: the following code works OK and the old.erase() method is called:
session = Session()
i1 = session.query(Item).get('item1')
old = i1.records[0]
new = Record(id='record13', attr='3')
i1.records.remove(old)
i1.records.append(new)
session.commit()
But when I change the id of the new Record to record11, which is already in the database (though it is not the same record, since attr='3'), old.erase() is not called. Does anybody know why?
Thanks
A delete + insert of two records that ultimately have the same primary key within a single flush is converted into a single update right now. This is not the best behavior; it really should delete then insert, so that the various events assigned to those activities are triggered as expected (not just mapper extension methods, but database-level defaults too). But the flush() process is hardwired to perform inserts/updates first, then deletes. As a workaround, you can issue a flush() after the remove/delete operation, then a second flush for the add/insert, as sketched below.
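Applied to the example above, the workaround would look roughly like this; it is a sketch rather than the original test.py, and it assumes the records relationship is configured so that removal marks the old Record for deletion:

session = Session()
i1 = session.query(Item).get('item1')
old = i1.records[0]

# First flush: emit the DELETE on its own, so before_delete (and erase())
# actually fires for the old record.
i1.records.remove(old)
session.flush()

# Second step: the INSERT that re-uses the primary key goes out separately.
new = Record(id='record11', attr='3')
i1.records.append(new)
session.commit()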
As far as the flush's current behavior goes, I've looked into trying to break this out, but it gets very complicated: inserts which depend on deletes would have to execute after the deletes, but updates which depend on inserts would have to execute beforehand. Ultimately, the unitofwork module would have to be rewritten (big time) to consider all inserts/updates/deletes in a single stream of dependent actions that would be topologically sorted against each other. This would simplify the methods used to execute statements in the correct order, although all-new systems for synchronizing data between rows based on server-level defaults would have to be devised, and it's possible that complexity would be re-introduced if it turned out the "simpler" method spent too much time naively sorting insert statements that are known at the ORM level to not require any sorting against each other. The topological sort works at a more coarse-grained level than that right now.
