Populating related table in SqlAlchemy ORM - python

So I have two table in a one-to-many relationship. When I make a new row of Table1, I want to populate Table2 with the related rows. However, this population actually involves computing the Table2 rows, using data in other related tables.
What's a good way to do that using the ORM layer? That is, assuming that that the Table1 mappings are created through the ORM, where/how should I call the code to populate Table2?
I thought about using the after_insert hook, but i want to have a session to pass to the population method.
Thanks.

You can use the before_flush or after_flush hook, it provides a session. You then check session.new objects for newly created models (tip: use isinstance(object, ModelClass)) and do your work here.
In fact, SQLAlchemy recommends before_flush for general on flush changes.
Mapper-level flush events only allow very limited operations, on attributes local to the row being operated upon only, as well as allowing any SQL to be emitted on the given Connection. Please read fully the notes at Mapper-level Events for guidelines on using these methods; generally, the SessionEvents.before_flush() method should be preferred for general on-flush changes.

After asking around in #sqlalchemy IRC, it was pointed out that this could be done using ORM-level relationships in an before_flush event listener.
It was explained that when you add a mapping through a relationship, the foreign key is automatically filled on flush, and the appropriate insert statement generated by the ORM.

Related

SQLAlchemy MetaData.reflect() vs. automap_base.prepare()

It seems to me that MetaData.reflect() and sqlalchemy.ext.automap.prepare() tables should be able to be used interchangeably for many use cases, but they can't be.
The metadata.tables['mytable'] into conn.execute(select(...)) returns a sqlalchemy.engine.cursor.CursorResult and your iterator gets the columns directly (eg x.columnA).
But automap_base().classes.mytable into the same conn.execute(select(...)) returns a sqlalchemy.engine.result.ChunkedIteratorResult and you need x.mytable.columnA to get at the column.
The sqlalchemy.engine.Result() documention says as much:
New in version 1.4: The Result object provides a completely updated
usage model and calling facade for SQLAlchemy Core and SQLAlchemy ORM.
In Core, it forms the basis of the CursorResult object which replaces
the previous ResultProxy interface. When using the ORM, a higher level
object called ChunkedIteratorResult is normally used.
Can I generically convert one to the other? That is, some wrapper that works for every table without needing the table name?
What's the best futureproof way to do this? I want my code to be forward-looking to sqlalchemy 2.0. Does that mean I should move away from either automap or MetaData?
sqlalchemy 1.4.35
This is the difference between the Core and the ORM.
select() from a Table vs. ORM class
While the SQL generated in these examples looks the same whether we
invoke select(user_table) or select(User), in the more general case
they do not necessarily render the same thing, as an ORM-mapped class
may be mapped to other kinds of “selectables” besides tables. The
select() that’s against an ORM entity also indicates that ORM-mapped
instances should be returned in a result, which is not the case when
SELECTing from a Table object.
Don't hesitate to use the ORM. It's higher level, pythonic, cool, and automap is ORM.

How do I drop custom types using SQLAlchemy + PostgreSQL?

I have a script which cleans the database, and this is widely used in our tests.
First, we tried to use SQLAlchemy Metadata.drop_all() thing, but it didn't resolve some foreign keys on deletion, which caused errors. Then, I found this script from #zzzeek, which does almost the same, but in a "smart" way. It handles all the issues with foreign keys, but now there are several issues regarding changed custom types (ENUMs). The question is, how can I drop them all them using SQLAlchemy? Execute DROP TYPE by hand only?
Tables in database are created with Alembic, and even while script above deletes all tables successfully, some custom ENUMs are still there and everything fails on attempt to recreate them.
Recreating the whole database is not a preferred solution, because default DB user for application shouldn't normally have rights to create databases.
Are you sure your Metadata instance fully describes all the tables?
Try:
Metadata.reflect()
Metadata.drop_all()
This is an ancient question, but it's still possible to come across this problem if create_type=False is on any postgresql.ENUM definitions.
Per the SQLAlchemy docs on create_type,
When False, no check will be performed and no CREATE TYPE or DROP TYPE is emitted, unless ENUM.create() or ENUM.drop() are called directly.
That means when running tests, while there may be a setup & teardown with create_all() and drop_all(), neither will affect custom enum types.
The solution is simply to remove create_type=False, since True is the default. Then all custom types will be created at the beginning of testing and dropped at the end, resulting in a perfectly clean test database.

Does Python Django support custom SQL and denormalized databases with no Foreign Key relationships?

I've just started learning Python Django and have a lot of experience building high traffic websites using PHP and MySQL. What worries me so far is Python's overly optimistic approach that you will never need to write custom SQL and that it automatically creates all these Foreign Key relationships in your database. The one thing I've learned in the last few years of building Chess.com is that its impossible to NOT write custom SQL when you're dealing with something like MySQL that frequently needs to be told what indexes it should use (or avoid), and that Foreign Keys are a death sentence. Percona's strongest recommendation was for us to remove all FKs for optimal performance.
Is there a way in Django to do this in the models file? create relationships without creating actual DB FKs? Or is there a way to start at the database level, design/create my database, and then have Django reverse engineer the models file?
If you don't want foreign keys, then avoid using
models.ForeignKey(),
models.ManyToManyField(), and
models.OneToOneField().
Django will automatically create an auto-increment int field named id that you can use to refer to individual records, or you can override that by marking a field as primary_key=True.
There is also documentation on running raw SQL queries on the database.
Raw SQL is as easy as this :
for obj in MyModel.objects.raw('SELECT * FROM myapp_mymodel'):
print obj
Denormalizing a database is up to you at model definition time.
You can use non-relational databases (MongoDB, ...) too with Django NonRel
django-admin inspectdb allows you to reverse engineer a models file from existing tables. That is only a very partial response to your question ;)
You can just create the model.py and avoid having SQL Alchemy automatically create the tables leaving it up to you to define the actual tables as you please. So although there are foreign key relationships in the model.py this does not mean that they must exist in the actual tables. This is a very good thing considering how ludicrously foreign key constraints are implemented in MySQL - MyISAM just ignores them and InnoDB creates a non-optional index on every single one regardless of whether it makes sense.
I concur with the 'no foreign keys' advice (with the disclaimer: I also work for Percona).
The reason why it is is recommended is for concurrency / reducing locking internally.
It can be a difficult "optimization" to sell, but if you consider that the database has transactions (and is more or less ACID compliant) then it should only be application-logic errors that cause foreign-key violations. Not to say they don't exist, but if you enable foreign keys in development hopefully you should find at least a few bugs.
In terms of whether or not you need to write custom SQL:
The explanation I usually give is that "optimization rarely decreases complexity". I think it is okay to stick with an ORM by default, but if in a profiler it looks like one particular piece of functionality is taking a lot more time than you suspect it would when written by hand, then you need to be prepared to fix it (assuming the code is called often enough).
The real secret here is that you need good instrumentation / profiling in order to be frugal with your complexity adding optimization(s).

Work with Postgres/PostGIS View in SQLAlchemy

Two questions:
i want to generate a View in my PostGIS-DB. How do i add this View to my geometry_columns Table?
What i have to do, to use a View with SQLAlchemy? Is there a difference between a Table and View to SQLAlchemy or could i use the same way to use a View as i do to use a Table?
sorry for my poor english.
If there a questions about my question, please feel free to ask so i can try to explain it in another way maybe :)
Nico
Table objects in SQLAlchemy have two roles. They can be used to issue DDL commands to create the table in the database. But their main purpose is to describe the columns and types of tabular data that can be selected from and inserted to.
If you only want to select, then a view looks to SQLAlchemy exactly like a regular table. It's enough to describe the view as a Table with the columns that interest you (you don't even need to describe all of the columns). If you want to use the ORM you'll need to declare for SQLAlchemy that some combination of the columns can be used as the primary key (anything that's unique will do). Declaring some columns as foreign keys will also make it easier to set up any relations. If you don't issue create for that Table object, then it is just metadata for SQLAlchemy to know how to query the database.
If you also want to insert to the view, then you'll need to create PostgreSQL rules or triggers on the view that redirect the writes to the correct location. I'm not aware of a good usage recipe to redirect writes on the Python side.

Discovering referers to SQLAlchemy object

I have a lot of model classes with ralations between them with a CRUD interface to edit. The problem is that some objects can't be deleted since there are other objects refering to them. Sometimes I can setup ON DELETE rule to handle this case, but in most cases I don't want automatic deletion of related objects till they are unbound manually. Anyway, I'd like to present editor a list of objects refering to currently viewed one and highlight those that prevent its deletion due to FOREIGN KEY constraint. Is there a ready solution to automatically discover referers?
Update
The task seems to be quite common (e.g. django ORM shows all dependencies), so I wonder that there is no solution to it yet.
There are two directions suggested:
Enumerate all relations of current object and go through their backref. But there is no guarantee that all relations have backref defined. Moreover, there are some cases when backref is meaningless. Although I can define it everywhere I don't like doing this way and it's not reliable.
(Suggested by van and stephan) Check all tables of MetaData object and collect dependencies from their foreign_keys property (the code of sqlalchemy_schemadisplay can be used as example, thanks to stephan's comments). This will allow to catch all dependencies between tables, but what I need is dependencies between model classes. Some foreign keys are defined in intermediate tables and have no models corresponding to them (used as secondary in relations). Sure, I can go farther and find related model (have to find a way to do it yet), but it looks too complicated.
Solution
Below is a method of base model class (designed for declarative extention) that I use as solution. It is not perfect and doesn't meet all my requirements, but it works for current state of my project. The result is collected as dictionary of dictionaries, so I can show them groupped by objects and their properties. I havn't decided yet whether it's good idea, since the list of referers sometimes is huge and I'm forced to limit it to some reasonable number.
def _get_referers(self):
db = object_session(self)
cls, ident = identity_key(instance=self)
medatada = cls.__table__.metadata
result = {}
# _mapped_models is my extension. It is collected by metaclass, so I didn't
# look for other ways to find all model classes.
for other_class in medatada._mapped_models:
queries = {}
for prop in class_mapper(other_class).iterate_properties:
if not (isinstance(prop, PropertyLoader) and \
issubclass(cls, prop.mapper.class_)):
continue
query = db.query(prop.parent)
comp = prop.comparator
if prop.uselist:
query = query.filter(comp.contains(self))
else:
query = query.filter(comp==self)
count = query.count()
if count:
queries[prop] = (count, query)
if queries:
result[other_class] = queries
return result
Thanks to all who helped me, especially stephan and van.
SQL: I have to absolutely disagree with S.Lott' answer.
I am not aware of out-of-the-box solution, but it is definitely possible to discover all the tables that have ForeignKey constraints to a given table. One needs to use properly the INFORMATION_SCHEMA views such as REFERENTIAL_CONSTRAINTS, KEY_COLUMN_USAGE, TABLE_CONSTRAINTS, etc. See SQL Server example. With some limitations and extensions, most versions of new relational databases support INFORMATION_SCHEMA standard. When you have all the FK information and the object (row) in the table, it is a matter of running few SELECT statements to get all other rows in other tables that refer to given row and prevent it from being deleted.
SqlAlchemy: As noted by stephan in his comment, if you use orm with backref for relations, then it should be quite easy for you to get the list of parent objects that keep reference to the object you are trying to delete, because those objects are basically mapped properties of your object (child1.Parent).
If you work with Table objects of sql alchemy (or not always use backref for relations), then you would have to get values of foreign_keys for all the tables, and then for all those ForeignKeys call references(...) method, providing your table as a parameter. In this way you will find all the FKs (and tables) that have reference to the table your object maps to. Then you can query all the objects that keep reference to your object by constructing the query for each of those FKs.
In general, there's no way to "discover" all of the references in a relational database.
In some databases, they may use declarative referential integrity in the form of explicit Foreign Key or Check constraints.
But there's no requirement to do this. It can be incomplete or inconsistent.
Any query can include a FK relationship that is not declared. Without the universe of all queries, you can't know the relationships which are used but not declared.
To find "referers" in general, you must actually know the database design and have all queries.
For each model class, you can easily see if all its one-to-many relations are empty simply by asking for the list in each case and seeing how many entries it contains. (There is probably a more efficient way implemented in terms of COUNT, too.) If there are any foreign keys relating to the object, and you have your object relations set up correctly, then at least one of these lists will be non-zero in length.

Categories

Resources