Short Version
In SQLAlchemy's ORM column declaration, how can I use server_default=sa.FetchedValue() on one dialect, and default=somePythonFunction on another, so that my real DBMS can populate things with triggers, and my test code can be written against sqlite?
Background
I'm using SQLAlchemy's declarative ORM to work with a Postgres database, but trying to write unit tests against an sqlite:///:memory:, and running into a problem with columns that have computed defaults on their primary keys. For a minimal example:
CREATE TABLE test_table(
id VARCHAR PRIMARY KEY NOT NULL
DEFAULT (lower(hex(randomblob(16))))
)
SQLite itself is quite happy with this table definition (sqlfiddle) but SQLAlchemy seems unable to work out the ID of newly created rows.
class TestTable(Base):
    __tablename__ = 'test_table'
    id = sa.Column(
        sa.VARCHAR,
        primary_key=True,
        server_default=sa.FetchedValue())
Definitions like this work just fine in postgres, but die in sqlite (as you can see on Ideone) with a FlushError when I call Session.commit:
sqlalchemy.orm.exc.FlushError: Instance <TestTable at 0x7fc0e0254a10> has a NULL identity key. If this is an auto-generated value, check that the database table allows generation of new primary key values, and that the mapped Column object is configured to expect these generated values. Ensure also that this flush() is not occurring at an inappropriate time, such as within a load() event.
The documentation for FetchedValue warns us that this can happen on dialects that don't support the RETURNING clause on INSERT:
For special situations where triggers are used to generate primary key
values, and the database in use does not support the RETURNING clause,
it may be necessary to forego the usage of the trigger and instead
apply the SQL expression or function as a “pre execute” expression:
t = Table('test', meta,
    Column('abc', MyType, default=func.generate_new_value(),
           primary_key=True)
)
func.generate_new_value is not defined anywhere else in SQLAlchemy, so it seems the intent is that I either generate defaults in Python or write a separate function that runs a SQL query to generate the default value in the DBMS. I can do that, but the problem is that I only want to do it for SQLite, since FetchedValue does exactly what I want on Postgres.
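For reference, the "pre execute" approach from the docs applied to this table would look roughly like the sketch below; generate_new_value here is my own stand-in (it is not a real SQLAlchemy function) that mimics lower(hex(randomblob(16))) in Python:
import uuid
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

def generate_new_value():
    # Stand-in for the trigger's logic: a random 32-character lowercase hex
    # string, roughly what lower(hex(randomblob(16))) produces in SQLite.
    return uuid.uuid4().hex

class TestTable(Base):
    __tablename__ = 'test_table'
    # Python-side ("pre execute") default: the value is computed before the
    # INSERT, so the ORM knows the primary key even without RETURNING.
    id = sa.Column(sa.VARCHAR, primary_key=True, default=generate_new_value)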
Dead Ends
Subclassing Column probably won't work. Nothing I can find in the sources ever tells the Column what dialect is being used, and the behavior of the default and server_default fields is defined outside the class.
Writing a Python function that calls the triggers by hand on the real DBMS creates a race condition. Avoiding the race condition by changing the isolation level creates a deadlock.
My Current Workaround
This is bad because it breaks integration tests that connect to a real Postgres.
import sys
import sqlalchemy as sa

def trigger_column(*a, **kw):
    python_default = kw.pop('python_default')
    if 'unittest' in sys.modules:
        return sa.Column(*a, default=python_default, **kw)
    else:
        return sa.Column(*a, server_default=sa.FetchedValue(), **kw)
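A variant of the same workaround that keys off the configured database URL instead of sniffing sys.modules would at least leave integration tests against a real Postgres on the FetchedValue path; DATABASE_URL and the startswith check are assumptions about how the app is configured:
import os
import sqlalchemy as sa

# Assumed configuration: the database URL is known before models are defined.
DATABASE_URL = os.environ.get('DATABASE_URL', 'sqlite:///:memory:')
USE_PYTHON_DEFAULTS = DATABASE_URL.startswith('sqlite')

def trigger_column(*a, **kw):
    python_default = kw.pop('python_default')
    if USE_PYTHON_DEFAULTS:
        return sa.Column(*a, default=python_default, **kw)
    return sa.Column(*a, server_default=sa.FetchedValue(), **kw)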
Not a direct answer to your question, but hopefully helpful to someone.
My problem was wanting to change the collation depending on the dialect; this was my solution:
from sqlalchemy import Unicode
from sqlalchemy.ext.compiler import compiles

@compiles(Unicode, 'sqlite')
def compile_unicode(element, compiler, **kw):
    element.collation = None
    return compiler.visit_unicode(element, **kw)
This drops the collation for all Unicode columns, but only on SQLite.
Here's some documentation: http://docs.sqlalchemy.org/en/latest/core/custom_types.html#overriding-type-compilation
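For illustration, with the hook above registered, a column like the one in this sketch should render its COLLATE clause in PostgreSQL DDL but not in SQLite DDL (the demo table and the 'en_US' collation name are made up):
import sqlalchemy as sa
from sqlalchemy import Unicode
from sqlalchemy.dialects import postgresql, sqlite
from sqlalchemy.schema import CreateTable

metadata = sa.MetaData()
demo = sa.Table(
    'demo', metadata,
    sa.Column('name', Unicode(255, collation='en_US')),
)

# The COLLATE clause appears for PostgreSQL but is dropped for SQLite,
# because compile_unicode() clears element.collation for that dialect.
print(CreateTable(demo).compile(dialect=postgresql.dialect()))
print(CreateTable(demo).compile(dialect=sqlite.dialect()))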
Related
So I have a requirement to replace the individual column select that SQLAlchemy automatically does in its queries with a general star select (SELECT table.* FROM ...) whenever the whole table or mapped class is provided to Session.query(). It shouldn't change how InstrumentedAttribute objects are compiled, though.
So for example, for the ORM classes Table1 and Table2:
str(s.query(Table1, Table2.col1, Table2.col2).select_from(Table1).join(Table2))
Should display as:
SELECT table1.*, table2.col1, table2.col2
FROM table1
JOIN table2 ON table1.some_col = table2.some_other_col
I've read the SQLAlchemy documentation on the @compiles decorator factory, and I've tried to use it on the following expression constructs with a simple breakpoint:
sqlalchemy.sql.expression.ColumnCollection
sqlalchemy.sql.expression.ClauseList
sqlalchemy.sql.expression.ColumnElement
sqlalchemy.sql.expression.TableClause
So, for example:
@compiles(sqlalchemy.sql.expression.ColumnCollection)
def star_select(element, compiler, **kwargs):
    breakpoint()
But no matter which of these I pass to @compiles, the flow of execution never actually goes into the function I've decorated, and I can't find any other constructs that would seem to do what I want.
Does anyone have any idea which SQLAlchemy class I'd have to pass to @compiles to override just the column-select-list portion of an ORM select query?
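For what it's worth, a literal_column sketch can at least produce the star select without touching the compiler, reusing s, Table1, and Table2 from above; note this is not an answer to the @compiles question itself, and it returns plain rows for table1 rather than Table1 instances:
import sqlalchemy as sa

# Put a literal star into the column list instead of the mapped class.
q = (
    s.query(sa.literal_column('table1.*'), Table2.col1, Table2.col2)
     .select_from(Table1)
     .join(Table2)
)
# Roughly: SELECT table1.*, table2.col1, table2.col2 FROM table1 JOIN table2 ...
# (the ORM may add AS labels to the named columns).
print(q)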
I feel like I'm going crazy with this.
I want to add a new INT column to my MySQL DB, so that in the SQLAlchemy ORM it will be exposed as an Enum.
For example, let's say I have this enum:
class employee_type(Enum):
    Full_time = 1
    Part_time = 2
    Student = 3
I want to keep those values (1, 2, 3, ...) in the DB, but when developers write code that involves this model, they should just use the Enum, without having to go through getter and setter functions.
So they will be able to do instance_of_model.employee_type and get back an Enum, and new_instance = model_name(employee_type=employee_type.Full_time).
How should I define my SQLAlchemy model so this will work? (I've heard of hybrid types, but I'm not sure they apply here.)
Thanks!
Apparently the answer is super simple(!): there is nothing special we need to do, SQLAlchemy supports it by itself.
Meaning you can set the specific column to be INT in the DB but an Enum in the model, and when querying the DB SQLAlchemy will convert it by itself; the same goes when inserting into the DB :)
I used it with a declarative enum, meaning its values are strings (and it has a from_string() function).
So once I used VARCHAR columns, it worked like a charm!
SQLAlchemy checks whether the DB has a native enum type available.
If the DB does not support one, the default type is VARCHAR.
So you can use:
Enum(MyEnum, native_enum=False)
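To make that concrete, here is a minimal sketch (the model and enum names are illustrative); note that the standard Enum type stores the member name (e.g. 'Full_time') rather than the integer value:
import enum
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class EmployeeType(enum.Enum):
    Full_time = 1
    Part_time = 2
    Student = 3

class Employee(Base):
    __tablename__ = 'employees'
    id = sa.Column(sa.Integer, primary_key=True)
    # Stored as VARCHAR holding the member name when native_enum=False.
    employee_type = sa.Column(sa.Enum(EmployeeType, native_enum=False))

# Developers work with the Python enum directly.
emp = Employee(employee_type=EmployeeType.Full_time)
assert emp.employee_type is EmployeeType.Full_time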
I have to use a SQLAlchemy Core expression to fetch objects because the ORM can't do "update and returning" (the ORM's update() doesn't have returning()).
from sqlalchemy import update
class User(ORMBase):
    ...

# pure SQL expression, the object returned is not an ORM object.
# the object is a RowProxy.
object = update(User) \
    .values({'name': 'Wayne'}) \
    .where(User.id == subquery.as_scalar()) \
    .returning() \
    .fetchone()
When I call
db_session.add(object)
it reports UnmappedInstanceError: Class 'sqlalchemy.engine.result.RowProxy' is not mapped.
How do I put that RowProxy object from the SQL expression into the ORM's identity map?
I'm not sure there is a straightforward way to do what you're describing, which is essentially to build an ORM object that maps directly to a database entry without performing the query through the ORM.
My intuition is that the naive approach (just initialize the ORM object with the values in the database) would create another row with the same values (or fail because of uniqueness constraints).
The more standard way to do what you are asking would be to query the row through the ORM first and then update the database from that ORM object.
user = User.query.filter(User.user_attribute == 'foo').one()
user.some_value = 'bar'
session.add(user)
session.commit()
I'm not sure if you have some constraint on your end that prevents you from using that pattern, though. The documentation works through similar examples.
Simple case:
Possible quick solution: construct the object from kwargs of your RowProxy, since those are object-like.
Given:
rowproxy = update(User) \
    .values({'name': 'Wayne'}) \
    .where(User.id == subquery.as_scalar()) \
    .returning() \
    .fetchone()
We might be able to do:
user = User(**dict(rowproxy.items()))
rowproxy.items() returns tuples of key-value pairs; dict(...) converts the tuples into actual key-value pairs; and User(...) takes kwargs for the model attribute names.
More difficult case:
But what if you have a model where one of the attribute names isn't quite the same as the SQL table column name? E.g. something like:
class User(ORMBase):
    # etc...
    user_id = Column(name='id', etc)
When we try to unpack our rowproxy into the User class, we'll likely get an error along the lines of: TypeError: 'id' is an invalid keyword argument for User (because it's expecting user_id instead).
Now it gets dirty: we should have lying around a mapper for how to get from the table attributes to the model attributes and vice versa:
kw_map = {a.key: a.class_attribute.name for a in User.__mapper__.attrs}
Here, a.key is the model attribute (and kwarg), and a.class_attribute.name is the table attribute. This gives us something like:
{
    "user_id": "id"
}
Well, we want to actually provide the values we got back from our rowproxy, which besides allowing object-like access also allows dict-like access:
kwargs = {a.key: rowproxy[a.class_attribute.name] for a in User.__mapper__.attrs}
And now we can do:
user = User(**kwargs)
Errata:
You may want to session.commit() right after calling update().returning(), to avoid a long delay between making your changes and permanently storing them in the database. There's no need to session.add(user) later; you already ran the update() and just need to commit() that transaction.
object is a built-in in Python, so try not to stomp on it; you could get some very bizarre behavior doing that. That's why I renamed it to rowproxy.
I'm making a SQLAlchemy base class for a new Postgres database and want to have bookkeeping fields incorporated into it. Specifically, I want to have two columns for modified_at and modified_by that are updated automatically. I was able to find out how to do this for individual tables, but it seems like making this part of the base class is trickier.
My first thought was to try to leverage the declared_attr functionality, but I don't actually want to make the triggers an attribute of the model, so that seems incorrect. Then I looked at adding the trigger using event.listen:
modified_trigger = """
CREATE TRIGGER update_{table_name}_modified
BEFORE UPDATE ON {table_name}
FOR EACH ROW EXECUTE PROCEDURE update_modified_columns()
"""

def create_modified_trigger(target, connection, **kwargs):
    if hasattr(target, 'name'):
        connection.execute(modified_trigger.format(table_name=target.name))

Base = declarative_base()
event.listen(Base.metadata, 'after_create', create_modified_trigger)
I thought I could find table_name using the target parameter as shown in the docs, but when used with Base.metadata the target is MetaData(bind=None) rather than a table.
I would strongly prefer to have this functionality as part of the Base rather than including it in migrations or externally to reduce the chance of someone forgetting to add the triggers. Is this possible?
I was able to sort this out with the help of a coworker. The returned MetaData object did in fact have a list of tables. Here is the working code:
modified_trigger = """
CREATE TRIGGER update_{table_name}_modified
BEFORE UPDATE ON {table_name}
FOR EACH ROW EXECUTE PROCEDURE update_modified_columns()
"""

def create_modified_trigger(target, connection, **kwargs):
    """
    This is used to add bookkeeping triggers after a table is created. It hooks
    into the SQLAlchemy event system. It expects the target to be an instance of
    MetaData.
    """
    for key in target.tables:
        table = target.tables[key]
        connection.execute(modified_trigger.format(table_name=table.name))

Base = declarative_base()
event.listen(Base.metadata, 'after_create', create_modified_trigger)
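One gap worth noting: the trigger assumes an update_modified_columns() function already exists in the database. A sketch of creating it through the same event system might look like this; the PL/pgSQL body and the modified_at/modified_by column names are assumptions based on the question:
create_modified_function = """
CREATE OR REPLACE FUNCTION update_modified_columns()
RETURNS TRIGGER AS $$
BEGIN
    -- Column names are assumptions taken from the question's bookkeeping fields.
    NEW.modified_at := now();
    NEW.modified_by := current_user;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql
"""

def create_modified_function_listener(target, connection, **kwargs):
    connection.execute(create_modified_function)

# Install the function before the per-table triggers that reference it.
event.listen(Base.metadata, 'before_create', create_modified_function_listener)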
I'm using SQLAlchemy and am trying to integrate Alembic for database migrations.
My database currently exists and has a number of ForeignKeys defined without names. I would like to add a naming convention to allow for migrations that affect ForeignKey columns.
I've added the naming convention given here to the top of my models.py file:
SQLAlchemy Naming Constraints
convention = {
    "ix": 'ix_%(column_0_label)s',
    "uq": "uq_%(table_name)s_%(column_0_name)s",
    "ck": "ck_%(table_name)s_%(constraint_name)s",
    "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
    "pk": "pk_%(table_name)s"
}

DeclarativeBase = declarative_base()
DeclarativeBase.metadata = MetaData(naming_convention=convention)
def db_connect():
    return create_engine(URL(**settings.DATABASE))

def create_reviews_table(engine):
    DeclarativeBase.metadata.create_all(engine)

class Review(DeclarativeBase):
    __tablename__ = 'reviews'
    id = Column(Integer, primary_key=True)
    review_id = Column('review_id', String, primary_key=True)
    resto_id = Column('resto_id', Integer, ForeignKey('restaurants.id'),
                      nullable=True)
    url = Column('url', String)
    resto_name = Column('resto_name', String)
I've set up alembic/env.py as per the tutorial instructions, feeding my model's metadata into target_metadata.
When I run
$: alembic current
I get the following error:
sqlalchemy.exc.InvalidRequestError: Naming convention including %(constraint_name)s token requires that constraint is explicitly named.
In the docs they say that "This same feature [generating names for columns using a naming convention] takes effect even if we just use the Column.unique flag", so I'm thinking that there shouldn't be a problem (they go on to give an example using a ForeignKey that isn't named, too).
Do I need to go back and give all my constraints explicit names, or is there a way to do it automatically?
Just modify the "ck" entry in convention to "ck": "ck_%(table_name)s_%(column_0_name)s". It works for me.
Refer to the SQLAlchemy docs.
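Applied to the convention dict from the question, that change looks like:
convention = {
    "ix": "ix_%(column_0_label)s",
    "uq": "uq_%(table_name)s_%(column_0_name)s",
    # column_0_name can be derived for unnamed constraints, whereas
    # constraint_name requires the constraint to be explicitly named.
    "ck": "ck_%(table_name)s_%(column_0_name)s",
    "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
    "pk": "pk_%(table_name)s"
}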
What this error message is telling you is that you should name your constraints explicitly. The constraints it's referring to are the ones generated for Boolean, Enum, etc., not foreign keys or primary keys.
So go through your tables and, wherever you have a Boolean or Enum, add a name to it. For example:
is_active = Column(Boolean(name='is_active'))
That's what you need to do.
This does not aim to be a definitive answer, and it fails to answer your immediate technical question, but could it be a "philosophical" problem? Either your SQLAlchemy code is the source of truth as far as the database is concerned, or the RDBMS is. Faced with a mixed situation, where each of the two holds part of it, I would see two avenues:
The one that you are exploring: you modify the database's schema to match the SQLAlchemy model and make your Python code the master. This is the most intuitive approach, but it may not always be possible, for both technical and administrative reasons.
Accepting that the RDBMS has information that SQLAlchemy doesn't have, which is fortunately not relevant for day-to-day work. Your best chance is to use another migration tool (ETL) that will reverse engineer the database before migrating it. After the migration is complete, you could give control of the new instance back to SQLAlchemy (which may require some adjustments to the new DB or to the model).
There is no way to tell in advance which approach will work, since both have their own challenges, but I would give some thought to the second method.
I've had some luck altering naming_convention back to {} in each older migration so that they run with the correct historical context.
I'm still entirely unsure what kind of interesting side effects this might have.