SQLAlchemy+Postgres - using generated sequential id without a flush - python

I am using SQLAlchemy with Postgres and need to create "external ids" that depend on the Postgres-generated (sequential) id. The code looks like this:
# session is passed to the constructor as described in http://docs.sqlalchemy.org/en/latest/orm/session_basics.html#when-do-i-construct-a-session-when-do-i-commit-it-and-when-do-i-close-it
def make_merchant(merchant_data, session):
    merchant = Merchant(**merchant_data)
    session.add(merchant)
    session.flush()  # Is this needed to have Postgres generate the sequential id?
    merchant.external_id = 'merchant_' + convert_to_external_id(merchant.id)  # Uses the Postgres-generated id
My issue: the flush seems necessary to get the Postgres-generated id, but it introduces a potentially annoying side effect as it flushes not only the newly created merchant but also other objects in the session.
My questions:
1. Is there a way to access the Postgres-generated id without a flush?
2. If not, is there a way to flush only this one object? Creating a separate session just for this construction seems like an anti-pattern.
Thanks!
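A minimal sketch of one possible answer, assuming the documented objects parameter of Session.flush(), which restricts the flush to the given collection of instances:
def make_merchant(merchant_data, session):
    merchant = Merchant(**merchant_data)
    session.add(merchant)
    # Flush only this object; other pending objects in the session are untouched.
    session.flush(objects=[merchant])
    merchant.external_id = 'merchant_' + convert_to_external_id(merchant.id)
    return merchant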

Related

Understanding session with fastApi dependency

I am new to Python and was studying FastAPI and SQLModel.
Reference link: https://sqlmodel.tiangolo.com/tutorial/fastapi/session-with-dependency/#the-with-block
Here, they have something like this:
def create_hero(*, session: Session = Depends(get_session), hero: HeroCreate):
    db_hero = Hero.from_orm(hero)
    session.add(db_hero)
    session.commit()
    session.refresh(db_hero)
    return db_hero
Here I am unable to understand this part:
session.add(db_hero)
session.commit()
session.refresh(db_hero)
What is it doing and how does it work?
I also couldn't understand this part of the tutorial:
In fact, you could think that all that block of code inside of the create_hero() function is still inside a with block for the session, because this is more or less what's happening behind the scenes.
But now, the with block is not explicitly in the function, but in the dependency above:
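For context, a hedged sketch of what that dependency looks like in the SQLModel tutorial (assuming an engine created elsewhere in the module):
from sqlmodel import Session

def get_session():
    # The with block opens the session before the request handler runs
    # and closes it once the handler returns.
    with Session(engine) as session:  # `engine` assumed created at module level
        yield session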
Here is the explanation from the SQLAlchemy docs of what a session is:
In the most general sense, the Session establishes all conversations
with the database and represents a “holding zone” for all the objects
which you’ve loaded or associated with it during its lifespan. It
provides the interface where SELECT and other queries are made that
will return and modify ORM-mapped objects. The ORM objects themselves
are maintained inside the Session, inside a structure called the
identity map - a data structure that maintains unique copies of each
object, where “unique” means “only one object with a particular
primary key”.
So:
# This line simply creates a Python object
# that SQLAlchemy can "understand".
db_hero = Hero.from_orm(hero)

# This line adds the object `db_hero` to the "holding zone".
session.add(db_hero)

# This line takes all objects from the "holding zone" and writes them
# to the database. In our case we have only one object in this zone,
# but it is possible to have several.
session.commit()

# This line fetches the freshly created row from the database and updates
# the object with it. That means the object can gain new attributes, for
# example the id that the database assigned to the new row.
session.refresh(db_hero)
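A short illustration of why the refresh matters, assuming SQLAlchemy's default expire-on-commit behavior:
hero = Hero(name="Deadpond")
session.add(hero)
session.commit()       # INSERT happens here; the instance's attributes are expired
session.refresh(hero)  # SELECT reloads the row, including the database-generated id
print(hero.id)         # now populated without any further implicit queries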

Detect if a SQLAlchemy model has pending write operations associated with it

I have an application in which I query against a SQL database and end up with a SQLAlchemy object representing a given row. Then, based on user input and a series of if/then statements I may perform an update on the SQLA object.
i.e.,
if 'fooinput1' in payload:
    sqla_instance.foo1 = validate_foo1(fooinput1)
if 'fooinput2' in payload:
    sqla_instance.foo2 = validate_foo2(fooinput2)
...
I now need to add modified_at and modified_by data to this system. Is it possible to check something on the SQLA instance like sqla_instance.was_modified or sqla_instance.update_pending to determine if a modification was performed?
(I recognize that I could maintain my own was_modified boolean, but since there are many of these if/then clauses that would lead to a lot of boilerplate which I'd like to avoid if possible.)
FWIW: this is a Python 2.7 Pyramid app reading from a MySQL db in the context of a web request.
The Session object that the SQLAlchemy ORM provides has two members that can help with what you are trying to do:
1) Session.is_modified()
2) Session.dirty
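A minimal sketch of both, assuming session is the Session that sqla_instance belongs to:
from datetime import datetime

# is_modified(): True if this particular instance has pending attribute changes.
if session.is_modified(sqla_instance, include_collections=False):
    sqla_instance.modified_at = datetime.now()

# Session.dirty: the set of all persistent objects with pending changes.
has_pending_writes = bool(session.dirty)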
Depending on the number of fields whose modifications you want to track, you may achieve what you need using ORM events:
from datetime import datetime
import sqlalchemy as sa

@sa.event.listens_for(Thing.foo, 'set')
def foo_changed(target, value, oldvalue, initiator):
    target.last_modified = datetime.now()

@sa.event.listens_for(Thing.baz, 'set')
def baz_changed(target, value, oldvalue, initiator):
    target.last_modified = datetime.now()
The session object also provides events, which may help you. In particular, review the object's state transition from transient to pending within the session.
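For completeness, a hedged sketch of centralizing the bookkeeping with a session-level event instead of per-attribute listeners (assuming every mapped class has a last_modified column):
from datetime import datetime
import sqlalchemy as sa
from sqlalchemy.orm import Session

@sa.event.listens_for(Session, 'before_flush')
def stamp_last_modified(session, flush_context, instances):
    # Stamp every persistent object that has pending changes in this flush.
    for obj in session.dirty:
        if session.is_modified(obj, include_collections=False):
            obj.last_modified = datetime.now()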

Add trigger to SQLAlchemy Base Class

I'm making a SQLAlchemy base class for a new Postgres database and want to have bookkeeping fields incorporated into it. Specifically, I want to have two columns for modified_at and modified_by that are updated automatically. I was able to find out how to do this for individual tables, but it seems like making this part of the base class is trickier.
My first thought was to try and leverage the declared_attr functionality, but I don't actually want to make the triggers an attribute in the model so that seems incorrect. Then I looked at adding the trigger using event.listen:
modified_trigger = """
CREATE TRIGGER update_{table_name}_modified
BEFORE UPDATE ON {table_name}
FOR EACH ROW EXECUTE PROCEDURE update_modified_columns()
"""

def create_modified_trigger(target, connection, **kwargs):
    if hasattr(target, 'name'):
        connection.execute(modified_trigger.format(table_name=target.name))

Base = declarative_base()
event.listen(Base.metadata, 'after_create', create_modified_trigger)
I thought I could find table_name using the target parameter as shown in the docs but when used with Base.metadata it returns MetaData(bind=None) rather than a table.
I would strongly prefer to have this functionality as part of the Base rather than including it in migrations or externally to reduce the chance of someone forgetting to add the triggers. Is this possible?
I was able to sort this out with the help of a coworker. The returned MetaData object did in fact have a list of tables. Here is the working code:
modified_trigger = """
CREATE TRIGGER update_{table_name}_modified
BEFORE UPDATE ON {table_name}
FOR EACH ROW EXECUTE PROCEDURE update_modified_columns()
"""
def create_modified_trigger(target, connection, **kwargs):
    """
    This is used to add bookkeeping triggers after a table is created. It hooks
    into the SQLAlchemy event system. It expects the target to be an instance of
    MetaData.
    """
    for key in target.tables:
        table = target.tables[key]
        connection.execute(modified_trigger.format(table_name=table.name))
Base = declarative_base()
event.listen(Base.metadata, 'after_create', create_modified_trigger)
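One detail both snippets assume is that the update_modified_columns() procedure already exists in the database before the triggers are created. A minimal sketch of such a procedure (an assumption, not part of the original answer; it only stamps modified_at, since modified_by needs application context), wired up the same way:
create_modified_function = """
CREATE OR REPLACE FUNCTION update_modified_columns()
RETURNS TRIGGER AS $$
BEGIN
    NEW.modified_at = now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql
"""

def install_modified_function(target, connection, **kwargs):
    connection.execute(create_modified_function)

# Install the function before the tables (and thus the triggers) are created.
event.listen(Base.metadata, 'before_create', install_modified_function)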

Modify data as part of an alembic upgrade

I would like to modify some database data as part of an alembic upgrade.
I thought I could just add any code in the upgrade of my migration, but the following fails:
def upgrade():
    ### commands auto generated by Alembic - please adjust! ###
    op.add_column('smsdelivery', sa.Column('sms_message_part_id', sa.Integer(), sa.ForeignKey('smsmessagepart.id'), nullable=True))
    ### end Alembic commands ###

    from volunteer.models import DBSession, SmsDelivery, SmsMessagePart

    for sms_delivery in DBSession.query(SmsDelivery).all():
        message_part = DBSession.query(SmsMessagePart).filter(SmsMessagePart.message_id == sms_delivery.message_id).first()
        if message_part is not None:
            sms_delivery.sms_message_part = message_part
with the following error:
sqlalchemy.exc.UnboundExecutionError: Could not locate a bind configured on mapper Mapper|SmsDelivery|smsdelivery, SQL expression or this Session
I don't really understand this error. How can I fix it, or are operations like this not possible?
It is difficult to understand exactly what you are trying to achieve from the code excerpt you provided, so I'll try to guess. The following answer is based on that guess.
In the import line you bring in objects (DBSession, SmsDelivery, SmsMessagePart) from your own modules and then try to operate with them the way you do in your application.
The error shows that SmsDelivery is a mapper object, so it points to some table. Mapper objects need to be bound to a valid SQLAlchemy connection.
This tells me that you skipped the initialization of DB objects (creating the connection and binding it to the mapper objects) that you normally do in your application code.
DBSession looks like a SQLAlchemy session object; it needs a connection bind too.
Alembic already has a connection ready and open, which it uses for the schema changes you request with the op.* methods.
So there should be a way to get at that connection.
According to the Alembic manual, op.get_bind() will return the current Connection bind:
For full interaction with a connected database, use the “bind” available from the context:
from alembic import op
connection = op.get_bind()
So you may use this connection to run your queries against the db.
PS. I assume you wanted to perform some modification to the data in your table. You may try to formulate that modification as one update query. Alembic has a special method for executing such changes, so you would not need to deal with the connection at all:
alembic.operations.Operations.execute
execute(sql, execution_options=None)
Execute the given SQL using the current migration context.
In a SQL script context, the statement is emitted directly to the output stream. There is no return result, however, as this function is oriented towards generating a change script that can run in “offline” mode.
Parameters: sql – Any legal SQLAlchemy expression, including:
- a string
- a sqlalchemy.sql.expression.text() construct
- a sqlalchemy.sql.expression.insert() construct
- a sqlalchemy.sql.expression.update(), sqlalchemy.sql.expression.insert(), or sqlalchemy.sql.expression.delete() construct
Pretty much anything that’s “executable” as described in the SQL Expression Language Tutorial.
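A hedged sketch of collapsing the loop from the question into a single UPDATE executed through op.execute (table and column names are taken from the question; the LIMIT 1 subquery mirrors the original .first() call):
from alembic import op

def upgrade():
    # Backfill sms_message_part_id in one statement instead of an ORM loop.
    op.execute("""
        UPDATE smsdelivery
        SET sms_message_part_id = (
            SELECT id FROM smsmessagepart
            WHERE smsmessagepart.message_id = smsdelivery.message_id
            LIMIT 1
        )
    """)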
It's worth noting that if you do query through your ORM models inside a migration, you probably want to freeze a copy of the model inside the migration, like this:
class MyType(Base):
    __tablename__ = 'existing_table'
    __table_args__ = {'extend_existing': True}
    id = Column(Integer, ...)
    ...

def upgrade():
    Base.metadata.bind = op.get_bind()
    for item in Session.query(MyType).all():
        ...
Otherwise you'll inevitably end up in a situation where your ORM model changes and previous migrations no longer work.
Note in particular that you want to extend Base, not the real base type itself (app.models.MyType), because your type might go away at some point, and once again your migrations would fail.
You also need to import Base, and then
Base.metadata.bind = op.get_bind()
and after this you can use your models as usual without errors.

Django: thread-safe update or create

We know that update is a thread-safe operation.
It means that when you do:
SomeModel.objects.filter(id=1).update(some_field=100)
instead of:
sm = SomeModel.objects.get(id=1)
sm.some_field = 100
sm.save()
your application is relatively thread-safe, and the operation SomeModel.objects.filter(id=1).update(some_field=100) will not rewrite data in other model fields.
My question is: is there any way to do
SomeModel.objects.filter(id=1).update(some_field=100)
but with creation of the object if it does not exist?
from django.db import IntegrityError
def update_or_create(model, filter_kwargs, update_kwargs):
    if not model.objects.filter(**filter_kwargs).update(**update_kwargs):
        kwargs = filter_kwargs.copy()
        kwargs.update(update_kwargs)
        try:
            model.objects.create(**kwargs)
        except IntegrityError:
            if not model.objects.filter(**filter_kwargs).update(**update_kwargs):
                raise  # re-raise IntegrityError
I think the code provided in the question is not very demonstrative: who wants to set the id of a model by hand?
Let's assume we need this, and that we have simultaneous operations:
def thread1():
    update_or_create(SomeModel, {'some_unique_field': 1}, {'some_field': 1})

def thread2():
    update_or_create(SomeModel, {'some_unique_field': 1}, {'some_field': 2})
With the update_or_create function, depending on which thread comes first, the object will be created and updated with no exception. This is thread-safe, but obviously of little use: depending on the race, the value of SomeModel.objects.get(some_unique_field=1).some_field could end up as 1 or 2.
Django provides F objects, so we can upgrade our code:
from django.db.models import F
def thread1():
    update_or_create(SomeModel,
                     {'some_unique_field': 1},
                     {'some_field': F('some_field') + 1})

def thread2():
    update_or_create(SomeModel,
                     {'some_unique_field': 1},
                     {'some_field': F('some_field') + 2})
You want Django's select_for_update() method (and a backend that supports row-level locking, such as PostgreSQL) in combination with manual transaction management.
try:
    with transaction.commit_on_success():
        SomeModel.objects.create(pk=1, some_field=100)
except IntegrityError:  # unique id already exists, so update instead
    with transaction.commit_on_success():
        obj = SomeModel.objects.select_for_update().get(pk=1)
        obj.some_field = 100
        obj.save()
Note that if some other process deletes the object between the two queries, you'll get a SomeModel.DoesNotExist exception.
Django 1.7 and above also has atomic operation support and a built-in update_or_create() method.
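For reference, a minimal sketch of the built-in method (field names taken from the earlier examples):
obj, created = SomeModel.objects.update_or_create(
    some_unique_field=1,
    defaults={'some_field': 100},
)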
You can use Django's built-in get_or_create, but note that it operates on the model's manager rather than on a queryset, and that it returns a tuple of the instance and a created flag.
You can use it like this:
me, created = SomeModel.objects.get_or_create(id=1)
me.some_field = 100
me.save()
If you have multiple threads, your app will need to determine which instance of the model is correct. Usually what I do is refresh the model from the database, make the changes, and then save it, so the object doesn't spend a long time in a disconnected state.
It's impossible to do such an upsert with update alone in Django. But the queryset update method returns the number of rows it matched, so you can do:
from django.db import router, connections, transaction
class MySuperManager(models.Manager):
    def _lock_table(self, lock='ACCESS EXCLUSIVE'):
        cursor = connections[router.db_for_write(self.model)].cursor()
        cursor.execute(
            'LOCK TABLE %s IN %s MODE' % (self.model._meta.db_table, lock)
        )

    def create_or_update(self, id, **update_fields):
        with transaction.commit_on_success():
            self._lock_table()
            if not self.get_query_set().filter(id=id).update(**update_fields):
                self.model(id=id, **update_fields).save()
This example is for Postgres. You can do it without raw SQL, but then the update-or-insert operation will not be atomic. If you take a lock on the table, you can be sure that two objects will not be created by two other threads.
I think if you have critical demands on atomic operations, you'd better design them at the database level instead of the Django ORM level.
The Django ORM focuses on convenience rather than on performance and safety, and you sometimes have to optimize the automatically generated SQL.
Transactions in most production databases provide locking and rollback well.
In hybrid (mashup) systems, or when your system includes third-party components such as logging or statistics, applications in other frameworks or even languages may access the database at the same time; making Django thread-safe is not enough in that case.
