SQLAlchemy MetaData.reflect() vs. automap_base.prepare()

It seems to me that tables from MetaData.reflect() and classes from sqlalchemy.ext.automap.prepare() should be usable interchangeably for many use cases, but they can't be.
Passing metadata.tables['mytable'] into conn.execute(select(...)) returns a sqlalchemy.engine.cursor.CursorResult, and iterating over it gives you the columns directly (e.g. x.columnA).
But passing automap_base().classes.mytable into the same conn.execute(select(...)) returns a sqlalchemy.engine.result.ChunkedIteratorResult, and you need x.mytable.columnA to get at the column.
The sqlalchemy.engine.Result documentation says as much:
New in version 1.4: The Result object provides a completely updated
usage model and calling facade for SQLAlchemy Core and SQLAlchemy ORM.
In Core, it forms the basis of the CursorResult object which replaces
the previous ResultProxy interface. When using the ORM, a higher level
object called ChunkedIteratorResult is normally used.
Can I generically convert one to the other? That is, some wrapper that works for every table without needing the table name?
What's the best future-proof way to do this? I want my code to be forward-compatible with SQLAlchemy 2.0. Does that mean I should move away from either automap or MetaData?
sqlalchemy 1.4.35

This is the difference between the Core and the ORM.
select() from a Table vs. ORM class
While the SQL generated in these examples looks the same whether we
invoke select(user_table) or select(User), in the more general case
they do not necessarily render the same thing, as an ORM-mapped class
may be mapped to other kinds of “selectables” besides tables. The
select() that’s against an ORM entity also indicates that ORM-mapped
instances should be returned in a result, which is not the case when
SELECTing from a Table object.
Don't hesitate to use the ORM. It's higher level, Pythonic, cool, and automap is part of the ORM.
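For example, here is a minimal sketch of both access patterns, assuming Base is an automap_base() that has already been prepared against engine and that the reflected table has placeholder columns columnA and columnB:
from sqlalchemy import select
from sqlalchemy.orm import Session

MyTable = Base.classes.mytable

with Session(engine) as session:
    # Selecting the entity: each result element wraps an ORM instance,
    # so .scalars() hands you the objects and you read columns directly.
    for obj in session.execute(select(MyTable)).scalars():
        print(obj.columnA)

    # Selecting individual columns of the ORM class yields plain Row objects,
    # the same access pattern as a Core select against metadata.tables['mytable'].
    for row in session.execute(select(MyTable.columnA, MyTable.columnB)):
        print(row.columnA)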

Related

Populating related table in SqlAlchemy ORM

So I have two tables in a one-to-many relationship. When I make a new row in Table1, I want to populate Table2 with the related rows. However, this population actually involves computing the Table2 rows, using data in other related tables.
What's a good way to do that using the ORM layer? That is, assuming that the Table1 mappings are created through the ORM, where/how should I call the code to populate Table2?
I thought about using the after_insert hook, but I want to have a session to pass to the population method.
Thanks.
You can use the before_flush or after_flush hook; both provide a session. You then check the session.new objects for newly created models (tip: use isinstance(obj, ModelClass)) and do your work there.
In fact, SQLAlchemy recommends before_flush for general on-flush changes:
Mapper-level flush events only allow very limited operations, on attributes local to the row being operated upon only, as well as allowing any SQL to be emitted on the given Connection. Please read fully the notes at Mapper-level Events for guidelines on using these methods; generally, the SessionEvents.before_flush() method should be preferred for general on-flush changes.
After asking around in #sqlalchemy IRC, it was pointed out that this could be done using ORM-level relationships in a before_flush event listener.
It was explained that when you add a mapping through a relationship, the foreign key is automatically filled in on flush, and the appropriate INSERT statement is generated by the ORM.
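A rough sketch of that combination; Table1, Table2, the children relationship and compute_children() are placeholders for your own models and domain logic:
from sqlalchemy import event
from sqlalchemy.orm import Session

@event.listens_for(Session, "before_flush")
def populate_table2(session, flush_context, instances):
    for obj in session.new:
        if isinstance(obj, Table1):
            for child in compute_children(obj, session):
                # Appending through the relationship lets the ORM fill in the
                # foreign key and emit the INSERT as part of the same flush.
                obj.children.append(child)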

SQLAlchemy doesn't map reflected class

I have this code:
def advertiser_table(engine):
    return Table('advertiser', metadata, autoload=True, autoload_with=engine)
And later I try this:
advertisers = advertiser_table(engine)
...
session.bulk_insert_mappings(
    advertisers.name,
    missing_advetisers.to_dict('records'),
)
where missing_advetisers is a pandas DataFrame (but that's not important for this question).
The error this gives me is:
sqlalchemy.orm.exc.UnmappedClassError: Class ''advertiser'' is not mapped
From reading the documentation I could scrape together enough to ask the question, but not much more than that... What is a Mapper, and why is its absence so detrimental to the functioning of this library? Why isn't "the class" mapped? Obviously, what am I to do to "map" it to whatever this library wants it mapped to?
A Mapper is the M in ORM. It is the thing that maps your table (advertisers in this case) to instances of a class (which you are missing in this case) in order for you to operate on it.
The reason it's confusing for you is because SQLAlchemy is actually two libraries in one -- one is called SQLAlchemy Core, and the other is the SQLAlchemy ORM. Core provides the ability to work with tables and to construct queries that return rows, while the ORM builds on top of Core to provide the ability to work with instances of classes and their relationships as an abstraction. Core roughly corresponds to things you can do on Connection and Engine, while ORM roughly corresponds to things you can do on Session.
So, all of that is to say, session.bulk_insert_mappings is an ORM functionality, and you cannot use it without having a mapped class.
What can you do instead? Use the equivalent Core functionality:
query = advertisers.insert().values(missing_advetisers.to_dict('records'))
engine.execute(query) # or session.execute(query)
Or even use the pandas-provided to_sql function:
missing_advetisers.to_sql("advertiser", engine, if_exists="append")
If you insist on using the ORM, you need to declare a mapped class for your table. The easiest way when using reflection is to use automap. The linked documentation has many examples, so I won't go into detail here.
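For completeness, a minimal automap sketch, assuming the advertiser table has a primary key (automap only maps tables that do); the prepare() call shown is the classic 1.x form, newer versions use Base.prepare(autoload_with=engine):
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

Base = automap_base()
Base.prepare(engine, reflect=True)  # reflect the schema and generate mapped classes
Advertiser = Base.classes.advertiser

session = Session(engine)
session.bulk_insert_mappings(Advertiser, missing_advetisers.to_dict("records"))
session.commit()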

Is this a good approach to avoid using SQLAlchemy/SQLObject?

Rather than use an ORM (SQLObject/SQLAlchemy), I am considering the following approach with Python and MySQL. I would like some feedback on whether this is likely to have negative long-term consequences, since in the short-term view it seems fine from what I can tell.
Rather than translate a row from the database into an object:
each table is represented by a class
a row is retrieved as a dict
an object representing a cursor provides access to a table like so:
cursor.mytable.get_by_ids(low, high)
removing means setting the time_of_removal to the current time
So essentially this does away with the need for an ORM since each table has a class to represent it and within that class, a separate dict represents each row.
Type mapping is trivial because each dict (row), being a first-class object in python/blub, lets you know the class of the object, and besides, the low-level database library in Python handles the conversion of types at the field level into their appropriate application-level types.
If you see any potential problems with going down this road, please let me know. Thanks.
That doesn't do away with the need for an ORM. That is an ORM. In which case, why reinvent the wheel?
Is there a compelling reason you're trying to avoid using an established ORM?
You would still be using SQLAlchemy under this scheme. A ResultProxy already gives you dictionary-like rows once you call .fetchmany() or similar.
Use SQLAlchemy as a tool that makes managing connections easier, as well as executing statements. Documentation is pretty much separated in sections, so you will be reading just the part that you need.
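For instance, a minimal sketch of using SQLAlchemy purely as a connection and statement layer, with no ORM; the URL and table name are placeholders, and row._mapping is the 1.4-style dict-like accessor:
from sqlalchemy import create_engine, text

engine = create_engine("mysql+pymysql://user:password@localhost/mydb")
with engine.connect() as conn:
    result = conn.execute(
        text("SELECT * FROM mytable WHERE id BETWEEN :low AND :high"),
        {"low": 1, "high": 100},
    )
    for row in result:
        print(row._mapping)  # each row exposes a read-only, dict-like mapping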
web.py has a decent db abstraction too (not an ORM).
Queries are written in SQL (not specific to any RDBMS), but your code remains compatible with any of the supported databases (sqlite, mysql, postgresql, and others).
from http://webpy.org/cookbook/select:
myvar = dict(name="Bob")
results = db.select('mytable', myvar, where="name = $name")

Discovering referers to SQLAlchemy object

I have a lot of model classes with relations between them and a CRUD interface to edit them. The problem is that some objects can't be deleted since there are other objects referring to them. Sometimes I can set up an ON DELETE rule to handle this case, but in most cases I don't want automatic deletion of related objects until they are unbound manually. Anyway, I'd like to present the editor with a list of objects referring to the currently viewed one and highlight those that prevent its deletion due to a FOREIGN KEY constraint. Is there a ready solution to automatically discover referers?
Update
The task seems to be quite common (e.g. the Django ORM shows all dependencies), so I'm surprised that there is no solution to it yet.
Two directions have been suggested:
1. Enumerate all relations of the current object and go through their backrefs. But there is no guarantee that all relations have a backref defined. Moreover, there are some cases where a backref is meaningless. Although I could define one everywhere, I don't like doing it that way and it's not reliable.
2. (Suggested by van and stephan) Check all tables of the MetaData object and collect dependencies from their foreign_keys property (the code of sqlalchemy_schemadisplay can be used as an example, thanks to stephan's comments). This will catch all dependencies between tables, but what I need is dependencies between model classes. Some foreign keys are defined in intermediate tables and have no models corresponding to them (they are used as secondary in relations). Sure, I can go further and find the related model (I have yet to find a way to do that), but it looks too complicated.
Solution
Below is a method of the base model class (designed for the declarative extension) that I use as a solution. It is not perfect and doesn't meet all my requirements, but it works for the current state of my project. The result is collected as a dictionary of dictionaries, so I can show the referers grouped by objects and their properties. I haven't decided yet whether that's a good idea, since the list of referers is sometimes huge and I'm forced to limit it to some reasonable number.
def _get_referers(self):
    db = object_session(self)
    cls, ident = identity_key(instance=self)
    metadata = cls.__table__.metadata
    result = {}
    # _mapped_models is my extension. It is collected by a metaclass, so I didn't
    # look for other ways to find all model classes.
    for other_class in metadata._mapped_models:
        queries = {}
        for prop in class_mapper(other_class).iterate_properties:
            if not (isinstance(prop, PropertyLoader) and
                    issubclass(cls, prop.mapper.class_)):
                continue
            query = db.query(prop.parent)
            comp = prop.comparator
            if prop.uselist:
                query = query.filter(comp.contains(self))
            else:
                query = query.filter(comp == self)
            count = query.count()
            if count:
                queries[prop] = (count, query)
        if queries:
            result[other_class] = queries
    return result
Thanks to all who helped me, especially stephan and van.
SQL: I have to absolutely disagree with S.Lott's answer.
I am not aware of an out-of-the-box solution, but it is definitely possible to discover all the tables that have ForeignKey constraints to a given table. One needs to use the INFORMATION_SCHEMA views properly, such as REFERENTIAL_CONSTRAINTS, KEY_COLUMN_USAGE, TABLE_CONSTRAINTS, etc. See the SQL Server example. With some limitations and extensions, most modern relational databases support the INFORMATION_SCHEMA standard. When you have all the FK information and the object (row) in the table, it is a matter of running a few SELECT statements to get all the other rows in other tables that refer to the given row and prevent it from being deleted.
SqlAlchemy: As noted by stephan in his comment, if you use the ORM with backref for relations, then it should be quite easy to get the list of parent objects that keep a reference to the object you are trying to delete, because those objects are basically mapped properties of your object (child1.Parent).
If you work with SQLAlchemy Table objects (or don't always use backref for relations), then you would have to get the foreign_keys of all the tables, and then for each of those ForeignKeys call the references(...) method, providing your table as a parameter. In this way you will find all the FKs (and tables) that reference the table your object maps to. Then you can query all the objects that keep a reference to your object by constructing a query for each of those FKs.
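A rough sketch of that approach, assuming metadata is an already-populated MetaData and target_table is the Table your object maps to:
referring = []
for table in metadata.tables.values():
    for fk in table.foreign_keys:
        if fk.references(target_table):   # does this FK point at our table?
            referring.append((table, fk))

for table, fk in referring:
    print(table.name, fk.parent.name, "->", fk.column.table.name, fk.column.name)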
In general, there's no way to "discover" all of the references in a relational database.
In some databases, they may use declarative referential integrity in the form of explicit Foreign Key or Check constraints.
But there's no requirement to do this. It can be incomplete or inconsistent.
Any query can include a FK relationship that is not declared. Without the universe of all queries, you can't know the relationships which are used but not declared.
To find "referers" in general, you must actually know the database design and have all queries.
For each model class, you can easily see if all its one-to-many relations are empty simply by asking for the list in each case and seeing how many entries it contains. (There is probably a more efficient way implemented in terms of COUNT, too.) If there are any foreign keys relating to the object, and you have your object relations set up correctly, then at least one of these lists will be non-zero in length.

Select statement with SqlAlchemy

Yes, very basic question.
I've successfully created my db using declarative_base, and can perform inserts into the db too. I just have a few questions about SqlAlchemy sql statements.
I've created a table called Location.
A few issues/questions (see code below):
For statement, "print row", I have to specify each column name that I want to have output. i.e. "print row.name, row.lat, etc" Why? (Otherwise the print statement outputs "<classname.Location at <...>>"
Also, what is the preferred way to interact with a db and perform queries (select, insert, update, etc.)- there seem to be a bunch of options: using sqlalchemy.orm.select for example, or engine.text(<sql query>).execute().fetchall(), or even conn.execute(<select>). Options are great, but right now they're all just confusing me.
Thanks so much for the tips!
Here's my code:
from sqlalchemy import create_engine
from sqlalchemy.sql import select
from location_db_setup import *
db_path = "sqlite:////volumes/users/shared/programming/python/web/map.db"
engine = create_engine(db_path, echo= True)
Session = sessionmaker(bind= engine)
session = Session()
session.query(Location).fetchall()
for row in locations:
    print row
Your code sample is incomplete and has errors, so it's impossible to say for sure what Location is here. I assume it's a mapped class, so you are requesting a list of all Location objects, not rows. When you print an object you get its string representation, which can be changed by defining a custom __str__ method.
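For point 1, a minimal sketch of a declarative class with a custom __str__; the table and column definitions are assumed here, based only on the attributes mentioned in the question:
from sqlalchemy import Column, Integer, String, Float
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Location(Base):
    __tablename__ = "location"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    lat = Column(Float)

    def __str__(self):
        # controls what "print row" shows for a Location object
        return "<Location name=%r lat=%r>" % (self.name, self.lat)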
Although the ORM is the most important part of SQLAlchemy, it's not the only one; SQLAlchemy also exposes a lot of functionality not directly related to the ORM. When you work with objects, the preferred way to create queries is the corresponding Session methods. But sometimes you need selectable objects not bound to a particular session (they are not executed directly, but are used in expressions passed to session methods); that's why there are also functions in the sqlalchemy.orm package.
The preferred way to interact with a db when using an ORM is not to write queries but to use the objects that correspond to the tables you are manipulating, typically in conjunction with the session object. SELECT queries become get() or find() calls in some ORMs, query() calls in others. INSERT becomes creating a new object of the type you want (and maybe explicitly adding it, e.g. session.add() in SQLAlchemy). UPDATE becomes editing such an object, and DELETE becomes deleting an object (e.g. session.delete()). The ORM is meant to handle the hard work of translating these operations into SQL for you.
Have you read the tutorial?
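A small sketch of that workflow with the Location class from the question (the name and lat columns are taken from the question; the values are made up):
loc = Location(name="Home", lat=52.52)    # INSERT: create an object...
session.add(loc)                          # ...and register it with the session
session.commit()

home = session.query(Location).filter_by(name="Home").first()   # SELECT
home.lat = 48.14                          # UPDATE: mutate the object, flushed on commit
session.commit()

session.delete(home)                      # DELETE
session.commit()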
Denis and Kylotan gave you good answers. I'm just gonna focus on point 2.
Sometimes it depends on your taste. There are times when you need database-specific features that an ORM can't express; that's a case where you should use session.execute(<sql here>) or conn.execute(<sql here>). Another case is when you have a very complex query that is beyond you, and you can't find a suitable ORM expression.
Usually, ORM features like select([...]).where(...) or Session.query(<Model here>).filter(...) (declarative base) are enough. Almost every SQL query has an ORM equivalent.
