I'm making a SQLAlchemy base class for a new Postgres database and want to have bookkeeping fields incorporated into it. Specifically, I want to have two columns for modified_at and modified_by that are updated automatically. I was able to find out how to do this for individual tables, but it seems like making this part of the base class is trickier.
My first thought was to try to leverage the declared_attr functionality, but I don't actually want to make the triggers an attribute of the model, so that seemed like the wrong tool. Then I looked at adding the trigger using event.listen:
trigger = """
CREATE TRIGGER update_{table_name}_modified
BEFORE UPDATE ON {table_name}
FOR EACH ROW EXECUTE PROCEDURE update_modified_columns()
"""
def create_modified_trigger(target, connection, **kwargs):
if hasattr(target, 'name'):
connection.execute(modified_trigger.format(table_name=target.name))
Base = declarative_base()
event.listen(Base.metadata,'after_create', create_modified_trigger)
I thought I could find table_name through the target parameter as shown in the docs, but when used with Base.metadata it returns MetaData(bind=None) rather than a table.
I would strongly prefer to have this functionality as part of the Base rather than including it in migrations or externally to reduce the chance of someone forgetting to add the triggers. Is this possible?
I was able to sort this out with the help of a coworker. The returned MetaData object did in fact have a list of tables. Here is the working code:
modified_trigger = """
CREATE TRIGGER update_{table_name}_modified
BEFORE UPDATE ON {table_name}
FOR EACH ROW EXECUTE PROCEDURE update_modified_columns()
"""
def create_modified_trigger(target, connection, **kwargs):
"""
This is used to add bookkeeping triggers after a table is created. It hooks
into the SQLAlchemy event system. It expects the target to be an instance of
MetaData.
"""
for key in target.tables:
table = target.tables[key]
connection.execute(modified_trigger.format(table_name=table.name))
Base = declarative_base()
event.listen(Base.metadata, 'after_create', create_modified_trigger)
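For completeness, the trigger above assumes a function named update_modified_columns() already exists in the database. A minimal sketch of such a function (assuming modified_at and modified_by columns, and that current_user is an acceptable value for modified_by) could be installed the same way, using a 'before_create' listener so the function exists before the triggers are created:

create_modified_function = """
CREATE OR REPLACE FUNCTION update_modified_columns() RETURNS TRIGGER AS $$
BEGIN
    NEW.modified_at = now();
    NEW.modified_by = current_user;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql
"""

def create_modified_function_listener(target, connection, **kwargs):
    # Install the trigger function before any tables (and triggers) exist.
    connection.execute(create_modified_function)

event.listen(Base.metadata, 'before_create', create_modified_function_listener)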
I had previously asked How to create instance of a table entry but not added in ponyorm?, where I wanted to create an instance of a class defined as a Pony ORM table representation without immediately adding it. With SQLAlchemy, where an explicit add on a session instance is needed, I think I succeeded with the following code.
I first create a class called AddInstance, which has an add method, and then inherit from it in all my table definitions. This seems to work (i.e. I can create an instance of the class and add it only if I want to, relatively easily), but I'm not sure whether there are any unintended side effects or whether this is far from best practice.
from sqlalchemy import create_engine, Column, ForeignKey, Integer, String, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship

engine = create_engine('sqlite:///:memory:')
Base = declarative_base()
Session = sessionmaker(bind=engine)

class AddInstance:
    def add(self):
        session = Session()
        session.add(self)
        session.commit()

# Association table for the many-to-many relationship
pizza_toppings = Table(
    'pizza_toppings', Base.metadata,
    Column('pizza_id', Integer, ForeignKey('pizzas.id'), primary_key=True),
    Column('topping_id', Integer, ForeignKey('toppings.id'), primary_key=True),
)

class Pizza(Base, AddInstance):
    __tablename__ = 'pizzas'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    toppings = relationship('Topping', secondary=pizza_toppings,
                            back_populates='pizzas')

class Topping(Base, AddInstance):
    __tablename__ = 'toppings'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    pizzas = relationship('Pizza', secondary=pizza_toppings,
                          back_populates='toppings')

Base.metadata.create_all(engine)
There will be no side effects; SQLAlchemy explicitly supports adding methods to your models. It doesn't matter whether those methods are defined on a mixin class or directly on the class derived from Base.
Quoting from the SQLAlchemy Object Relational Tutorial documentation section:
Outside of what the mapping process does to our class, the class remains otherwise mostly a normal Python class, to which we can define any number of ordinary attributes and methods needed by our application.
(bold emphasis mine).
There are plenty of code bases that do exactly this. Flask-SQLAlchemy, for example, provides a Model base class that adds a query attribute (set when the declarative base is created), which lets you write Topping.query.filter(...) directly on a model class.
Your add() method does have a downside: it creates a new Session() instance just to add and commit your object. This keeps the object outside of the normal session state management, and if you wanted to do anything else with your newly created object you'd have to merge it into an existing session.
The normal best practice for code involving SQLAlchemy objects is to create a session to manage a transaction: a set of operations that together must succeed or fail. That includes creating objects; in many real-world applications you want to avoid creating extra rows in the database when other operations that rely on those rows fail. Your .add() method unconditionally commits each object in a separate transaction, so you may want to revisit this pattern.
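To illustrate, here is a minimal sketch of that session-per-transaction pattern, reusing the Pizza and Topping models from the question:

session = Session()
try:
    margherita = Pizza(name='Margherita')
    margherita.toppings.append(Topping(name='Basil'))
    session.add(margherita)  # the topping is cascaded in with the pizza
    session.commit()         # both rows are written, or neither is
except Exception:
    session.rollback()
    raise
finally:
    session.close()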
I'm wondering if there's a way to merge new mappings with database data, as with session.merge, but without updating the database. It's like doing a pull with git: you get a local state that is a merge of the remote and the previous local state (which might contain unpushed commits), but without updating the remote state. Here, I want a local "view" of the state that would result from a session.merge.
Maybe making a savepoint (with session.begin_nested), then doing a session.merge and later a session.rollback would accomplish this, but is there a way that doesn't require this kind of transaction management (which can imply actual undo operations on the db, so it's not terribly efficient for my purposes)?
Would using session.no_autoflush do the trick?
Example code for what I want to do:
local_merge = session.merge_local(Model(...))
# do stuff with local_merge...
remotes = session.query(Model).all()
# remotes should remain "old" db versions, as no data was pushed
return something
Edit: I think I may be wrong about the rollback method being inefficient. At least as long as no commits are emitted, there shouldn't be expensive undo operations, only throwing out the transaction.
Merge will only update the database because of the autoflush. You can turn that off temporarily using the session.no_autoflush context manager, by setting autoflush=False on your session, or by passing autoflush=False to your sessionmaker.
One thing to watch out for is how the results of the session.query(Model).all() will interact with the unflushed, changed, local objects.
Because the session maintains a record of unique objects (against primary keys) in an Identity Map, you will not be able to have two versions of the same object in the same session.
Here's an example which shows how local changes interact with autoflush=False:
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

engine = create_engine('sqlite:///:memory:', echo=True)
Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    def __repr__(self):
        return "<User(name='%s')>" % self.name

Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()

ed_user = User(name='ed')
session.add(ed_user)
session.commit()

ed_again = session.query(User).get(1)
ed_again.name = "not ed"

with session.no_autoflush:
    ed_three = session.query(User).get(1)
    all_eds = session.query(User).all()
    print(ed_again, id(ed_again))
    print(ed_three, id(ed_three))
    print(all_eds, id(all_eds[0]))
<User(name='not ed')> 139664151068624
<User(name='not ed')> 139664151068624
[<User(name='not ed')>] 139664151068624
Yep, it's not able to get the original Ed from the database, even with no_autoflush. This is to be expected for get(), which checks the identity map first and won't bother querying the DB if it finds the object there. With query.all(), the session does query the database, but on finding that one of the rows it gets back is already in the identity map, it returns that existing reference instead, so as to maintain the uniqueness of objects in the session (which was also my hunch, though I couldn't find this explicitly spelled out in the docs).
You could do something like expunging objects from the session, but I think the easiest way to have both an old and a new copy of the merged objects is to use two separate sessions: one where the changes are merged and possibly committed, and one you can use to check the existing state of objects in the database.
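A rough sketch of the two-session approach, assuming the User model from the example above (the variable names here are illustrative):

work = Session()   # session where local changes are merged
fresh = Session()  # separate session for reading committed db state

merged = work.merge(User(id=1, name='maybe ed'))  # pending change, not committed
db_user = fresh.query(User).get(1)                # still the value committed to the db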
Short Version
In SQLAlchemy's ORM column declaration, how can I use server_default=sa.FetchedValue() on one dialect, and default=somePythonFunction on another, so that my real DBMS can populate things with triggers, and my test code can be written against sqlite?
Background
I'm using SQLAlchemy's declarative ORM to work with a Postgres database, but trying to write unit tests against an sqlite:///:memory:, and running into a problem with columns that have computed defaults on their primary keys. For a minimal example:
CREATE TABLE test_table(
id VARCHAR PRIMARY KEY NOT NULL
DEFAULT (lower(hex(randomblob(16))))
)
SQLite itself is quite happy with this table definition (sqlfiddle), but SQLAlchemy seems unable to work out the ID of newly created rows.
class TestTable(Base):
    __tablename__ = 'test_table'
    id = sa.Column(
        sa.VARCHAR,
        primary_key=True,
        server_default=sa.FetchedValue())
Definitions like this work just fine in postgres, but die in sqlite (as you can see on Ideone) with a FlushError when I call Session.commit:
sqlalchemy.orm.exc.FlushError: Instance <TestTable at 0x7fc0e0254a10> has a NULL identity key. If this is an auto-generated value, check that the database table allows generation of new primary key values, and that the mapped Column object is configured to expect these generated values. Ensure also that this flush() is not occurring at an inappropriate time, such as within a load() event.
The documentation for FetchedValue warns us that this can happen on dialects that don't support the RETURNING clause on INSERT:
For special situations where triggers are used to generate primary key
values, and the database in use does not support the RETURNING clause,
it may be necessary to forego the usage of the trigger and instead
apply the SQL expression or function as a “pre execute” expression:
t = Table('test', meta,
    Column('abc', MyType, default=func.generate_new_value(),
           primary_key=True)
)
func.generate_new_value is not defined anywhere in SQLAlchemy, so it seems they intend that I either generate defaults in Python, or else write a separate function that runs a SQL query to generate the default value in the DBMS. I can do that, but the problem is that I only want to do it for SQLite, since FetchedValue does exactly what I want on postgres.
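(For reference, a Python-side default roughly equivalent to the SQLite expression above could look like the following sketch; it assumes the server default does nothing more than generate 16 random bytes as lowercase hex:)

import uuid
import sqlalchemy as sa

id = sa.Column(sa.VARCHAR, primary_key=True,
               default=lambda: uuid.uuid4().hex)  # 32 lowercase hex chars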
Dead Ends
Subclassing Column probably won't work. Nothing I can find in the sources ever tells the Column which dialect is being used, and the behavior of the default and server_default fields is defined outside the class.
Writing a python function that calls the triggers by hand on the real DBMS creates a race condition. Avoiding the race condition by changing the isolation level creates a deadlock.
My Current Workaround
Bad because it breaks integration tests that connect to a real postgres.
import sys
import sqlalchemy as sa

def trigger_column(*a, **kw):
    python_default = kw.pop('python_default')
    if 'unittest' in sys.modules:
        return sa.Column(*a, default=python_default, **kw)
    else:
        return sa.Column(*a, server_default=sa.FetchedValue(), **kw)
Not a direct answer to your question, but hopefully helpful to someone.
My problem was wanting to change the collation depending on the dialect, this was my solution:
from sqlalchemy import Unicode
from sqlalchemy.ext.compiler import compiles

@compiles(Unicode, 'sqlite')
def compile_unicode(element, compiler, **kw):
    element.collation = None
    return compiler.visit_unicode(element, **kw)
This changes the collation for all Unicode columns only for sqlite.
Here's some documentation: http://docs.sqlalchemy.org/en/latest/core/custom_types.html#overriding-type-compilation
I'm using python sqlite3 api to create a database.
In all the examples I saw in the documentation, table names and column names are hardcoded inside the queries. But this could be a potential problem if I reuse the same table multiple times (i.e., creating the table, inserting records into it, reading data from it, altering it, and so on), because if the table is ever modified I need to change the hardcoded names in multiple places, and that is not good programming practice.
How can I solve this problem?
I thought of creating a class with just a constructor method in order to store all these string names, and using it inside the class that operates on the database. But as I'm not an expert Python programmer, I would like to share my thoughts...
class TableA(object):
    def __init__(self):
        self.table_name = 'tableA'
        self.name_col1 = 'first_column'
        self.type_col1 = 'INTEGER'
        self.name_col2 = 'second_column'
        self.type_col2 = 'TEXT'
        self.name_col3 = 'third_column'
        self.type_col3 = 'BLOB'
and then, inside the DB class:
table_A = TableA()

def insert_table(self):
    conn = sqlite3.connect(self._db_name)
    query = 'INSERT INTO ' + table_A.table_name + ..... <SNIP>
    conn.execute(query)
Is this a proper way to proceed?
I don't know what's proper, but I can tell you that it's not conventional.
If you really want to structure tables as classes, you could consider an object relational mapper like SQLAlchemy. Otherwise, with the way you're going about it, how do you know how many column variables you have? What about storing a list of two-item lists, or a list of dictionaries?
self.column_list = []
self.column_list.append({'name': 'first', 'type': 'integer'})
The way you're doing it sounds pretty novel. Check out an ORM's code, such as SQLAlchemy's, and see how they do it.
If you are going to start using classes to provide an abstraction layer for your database tables, you might as well start using an ORM. Some examples are SQLAlchemy and SQLObject, both of which are extremely popular.
Here's a taste of SQLAlchemy:
from sqlalchemy import Column, Integer, String
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class TableA(Base):
    __tablename__ = 'tableA'
    id = Column(Integer, primary_key=True)
    first_column = Column(Integer)
    second_column = Column(String)
    # etc...

engine = create_engine('sqlite:///test.db')
Base.metadata.bind = engine
Base.metadata.create_all()  # create the tables if they don't exist
session = sessionmaker(bind=engine)()

ta = TableA(first_column=123, second_column='Hi there')
session.add(ta)
session.commit()
Of course you would choose semantic names for the table and columns, but you can see that declaring a table is something along the lines of what you were proposing in your question, i.e. using a class. Inserting records is simplified by creating instances of that class.
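Querying is similarly concise. A quick sketch, assuming the session and model from the example above:

rows = session.query(TableA).filter_by(first_column=123).all()
for row in rows:
    print(row.second_column)  # prints 'Hi there'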
I personally don't like to use libraries and frameworks without a proper reason, so if I had such a reason I would write a thin wrapper around sqlite.
class Column(object):
    def __init__(self, col_name="FOO", col_type="INTEGER"):
        # standard initialization
        self.col_name = col_name
        self.col_type = col_type
And then a table class that encapsulates operations on the database:
class Table(object):
    def __init__(self, list_of_columns, cursor):
        # initialization
        ...

    # create-update-delete commands go here
In the table class you can encapsulate all the database operations you want.
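For illustration, a sketch of what such a thin wrapper might look like, using the Column class sketched above (the names and methods here are illustrative, not from the original answer):

import sqlite3

class Table(object):
    def __init__(self, name, columns, conn):
        self.name = name        # table name used in generated SQL
        self.columns = columns  # list of Column objects
        self.conn = conn

    def create(self):
        cols = ', '.join('%s %s' % (c.col_name, c.col_type)
                         for c in self.columns)
        self.conn.execute('CREATE TABLE %s (%s)' % (self.name, cols))

    def insert(self, values):
        # Placeholders keep the values parameterized; the table and
        # column names still come from the one place they are defined.
        marks = ', '.join('?' for _ in self.columns)
        self.conn.execute(
            'INSERT INTO %s VALUES (%s)' % (self.name, marks), values)

conn = sqlite3.connect(':memory:')
t = Table('tableA', [Column('first_column', 'INTEGER'),
                     Column('second_column', 'TEXT')], conn)
t.create()
t.insert((1, 'hello'))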
I'm using SA 0.6.6, Declarative style, against Postgres 8.3, to map Python objects to a database. I have a table that is self-referencing, and I'm trying to make a relationship property for its children. No matter what I try, I end up with a NoReferencedTableError.
My code looks exactly like the sample code from the SA website for how to do this very thing.
Here's the class.
class FilterFolder(Base):
    __tablename__ = 'FilterFolder'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    isShared = Column(Boolean, default=False)
    isGlobal = Column(Boolean, default=False)
    parentFolderId = Column(Integer, ForeignKey('FilterFolder.id'))

    childFolders = relationship("FilterFolder",
                                backref=backref('parentFolder', remote_side=id))
Here's the error I get:
NoReferencedTableError: Foreign key assocated with column 'FilterFolder.parentFolderId' could not find table 'FilterFolder' with which to generate a foreign key to target column 'id'
Any ideas what I'm doing wrong here?
This was a foolish mistake on my part. I typically specify my FKs using the entity type, not a string. I am using different schemas, so when defining the FK as a string I also need to include the schema.
Broken:
parentFolderId = Column(Integer,ForeignKey('FilterFolder.id'))
Fixed:
parentFolderId = Column(Integer,ForeignKey('SchemaName.FilterFolder.id'))
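For context, a sketch of how that fits into the model when the table lives in a non-default schema ('SchemaName' is a placeholder here):

class FilterFolder(Base):
    __tablename__ = 'FilterFolder'
    __table_args__ = {'schema': 'SchemaName'}

    id = Column(Integer, primary_key=True)
    # With an explicit schema, the string form of the ForeignKey must be
    # schema-qualified as well:
    parentFolderId = Column(Integer, ForeignKey('SchemaName.FilterFolder.id'))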
I checked your code with SQLAlchemy 0.6.6 and sqlite. I was able to create the tables, add a parent and child combination, and retrieve them again using a session.query.
As far as I can tell, the exception you mentioned (NoReferencedTableError) is thrown in schema.py (in the SQLAlchemy source) exclusively, and is not database specific.
Some questions: Do you see the same bug if you use an sqlite URL instead of the Postgres one? How are you creating your schema? Mine looks something like this:
engine = create_engine(db_url)
FilterFolder.metadata.create_all(engine)