Dynamic relationship class loading with sqlalchemy - python

I have a library via which I dynamically load classes. It's exposed as
mylib.registry.<className>
registry is an instance of a class that holds a dictionary mapping class names (strings) to module names, plus a __getattr__ hook that dynamically loads a class when it is first requested. A user can thus refer to any class without having to deal with module locations (there is a global namespace for class names, but not for module names).
For example, the entries:
{'X': 'mylib.sublib.x',
 'Y': 'mylib.sublib.y'}
could then be used like:
import mylib
x = mylib.registry.X()
y = mylib.registry.Y()
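For illustration, a minimal registry along these lines might look like this (a sketch, not the actual library code; importlib does the lazy loading):

import importlib

class Registry:
    # maps class names to the modules that define them
    _modules = {'X': 'mylib.sublib.x', 'Y': 'mylib.sublib.y'}

    def __getattr__(self, name):
        try:
            module = importlib.import_module(self._modules[name])
        except KeyError:
            raise AttributeError(name)
        cls = getattr(module, name)
        setattr(self, name, cls)  # cache, so __getattr__ only fires once per class
        return cls

registry = Registry()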
That's the background. On top of this, these objects are SQLAlchemy ORM classes which have relationships to one another. Let's assume here that X has a one-to-many with Y.
So assume this definition:
class X(Base):
    __tablename__ = 'x'
    id = Column(Integer, primary_key=True)
    y_id = Column(Integer, ForeignKey('y.id'))
    y = relationship('Y')

class Y(Base):
    __tablename__ = 'y'
    id = Column(Integer, primary_key=True)
    xs = relationship('X')
These are in separate files, and each imports the top-level registry.
So here's the issue -- how do I actually get this to resolve without loading every class up front?
The example above doesn't work, because if I import only X via the registry, then Y isn't in SQLAlchemy's class registry, and thus the relationship breaks.
If I import the registry itself and then refer to the classes directly, then the modules don't load because of interdependencies.
I tried using a lambda to defer loading, but this too fails with an error about a missing 'strategy'.
What approaches have others used here? If I'm missing something obvious, let me know. It's been a long day.
Thanks.

You should never use relationships on two sides that don't know about each other. Normally, this is avoided by using backref, but in your case that creates problems, because you want each side to declare its half of the relationship itself. The trick here is the back_populates keyword, offered by relationship:
y = relationship("Y", back_populates="xs")
and
xs = relationship("X", back_populates="y")
Applying these will make them aware of each other. However, it will not solve your importing problem. Normally, you could now just from x import X on the Y side. But the other way around won't work because it will create a circular import.
The trick is simple: put the import after the class that uses it. Because the strings in relationship are evaluated lazily, you can import the class after the relationship has been defined. So for X do this:
class X(Base):
    __tablename__ = 'x'
    id = Column(Integer, primary_key=True)
    y_id = Column(ForeignKey('y.id'))
    y = relationship("Y", back_populates="xs")

# Deliberately placed *after* the class definition to avoid the
# circular import -- do not move this to the top of the file.
from y import Y
And the same the other way around for Y (this is not required, but makes things more symmetric; a sketch follows). I'd also put a comment there explaining why the import sits at the bottom, so nobody moves it to the top and breaks the program.
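For reference, the Y side could then look like this (a sketch; the module holding the shared declarative Base is an assumption):

# y.py
from sqlalchemy import Column, Integer
from sqlalchemy.orm import relationship
from base import Base  # assumed location of the shared declarative Base

class Y(Base):
    __tablename__ = 'y'
    id = Column(Integer, primary_key=True)
    xs = relationship("X", back_populates="y")

# Deliberately placed after the class definition to avoid the circular
# import; the string "X" above is resolved lazily, so X only has to be
# importable by the time the mappers are configured.
from x import X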

Related

SQLAlchemy multiple backrefs causing problems

I'm using SQLAlchemy with Python (connecting to a MySQL database) and am having a little design issue, which has presented itself as a problem with a backref.
So the situation is this. I have a SearchGroup which contains TargetObjects and SearchObjects. These are both many-to-many relationships, so the SearchGroup table comes with two association tables, one for each. The SearchObject is the same type for any SearchGroup, but the TargetObject varies. So far so good. The whole idea here is that a SearchObject is simply a string with a few other variables; a SearchGroup compares them all to a given string and then, if there's a match, supplies the target objects.
Now for some code: the declaration of these three classes, although with the parent logic hidden for brevity:
class AssocTable_GroupCMClassesGrades_Grades(AssociationTable_Group_TargetObjectsParent, med.DeclarativeBase):
    __tablename__ = 'AssocTable_GroupCMClassesGrades_Grades'
    _groupsTableName = 'SearchGroup_CMClasses_Grades'
    _targetObjectsTableName = 'Grades'

class AssocTable_GroupCMClassesGrades_SearchObjects(AssociationTable_Group_SearchObjectsParent, med.DeclarativeBase):
    __tablename__ = 'AssocTable_GroupCMClassesGrades_SearchObjects'
    _groupsTableName = 'SearchGroup_CMClasses_Grades'
    _searchObjectsTableName = 'SearchObjects'

class SearchGroup_CMClasses_Grades(SearchObjectGroupParent, med.DeclarativeBase):
    __tablename__ = 'SearchGroup_CMClasses_Grades'
    targetAssociatedTargetObjectTableName = 'AssocTable_GroupCMClassesGrades_Grades'
    targetAssociatedSearchObjectTableName = 'AssocTable_GroupCMClassesGrades_SearchObjects'
    targetClassName = 'Grade'
    myClassName = 'SearchGroup_CMClasses_Grades'
    searchObjectClassName = 'SearchObject'
    searchObjectChildrenBackRefName = 'Groups'
The top two are the association tables and the bottom is the main class. The strings are used to set up various foreign keys and relationships and such.
Let's look at a specific example, which is crucial to the question:
@declared_attr
def searchObject_childen(cls):
    return relationship(
        f'{cls.searchObjectClassName}',
        secondary=f'{cls.targetAssociatedSearchObjectTableName}',
        backref=f'{cls.searchObjectChildrenBackRefName}',
    )
This is inside the SearchObjectGroupParent class and, as you can see, is for the 'children' of the SearchGroup, which are SearchObjects.
So now to the problem.
That all works rather well, except for one thing. If I could direct your attention back to the large bit of code above, and to this line:
searchObjectChildrenBackRefName = 'Groups'
This, as seen in the second posted piece of code (the declared_attr one), sets up a backref: a property on the target class, which it creates and then populates. I'm not an expert at this by any means, so I won't pretend to be. The point is this: if I create another SearchObjectGroupParent-derived class like the one above, with its association tables, I can't put another 'Groups' property into SearchObject - in fact it will throw an error telling me as much:
sqlalchemy.exc.ArgumentError: Error creating backref 'Groups' on relationship 'SearchGroup_CMClasses_Grades.searchObject_childen': property of that name exists on mapper 'mapped class SearchObject->SearchObjects'
There is a rather unsatisfying way to solve this, which is to simply change that name each time, but then SearchObject won't have a single common list of SearchGroups; instead it will contain a separate 'Groups' property for every SearchGroup. This will work, but it will be messy and I'd rather not do it. What I would like is to say 'okay, if this backref already exists, just use that one'. I don't know if that's possible, but I think such a thing would solve my problem.
Edit: I thought an image might help explain better.
Figure 1 (image omitted): what I have now. The more of these objects derived from SearchObjectsGroupParent I have, the messier it will get (SearchObject will contain Groups03, Groups04, Groups05, etc.).
Figure 2 (image omitted): what I want.

SQLAlchemy; getting list of related tables/classes

Say I have a Thing class that is related to some other classes, Foo and Bar.
class Thing(Base):
    FooKey = Column('FooKey', Integer,
                    ForeignKey('FooTable.FooKey'), primary_key=True)
    BarKey = Column('BarKey', Integer,
                    ForeignKey('BarTable.BarKey'), primary_key=True)
    foo = db.relationship('Foo')
    bar = db.relationship('Bar')
I want to get a list of the classes/tables related to Thing by the relationship() calls, e.g. [Foo, Bar]. Is there any way to do this?
This is a closely related question:
SQLAlchemy, Flask: get relationships from a db.Model. That identifies the string names of the relationships, but not the target classes.
Context:
I'm building unit tests for my declarative base mapping of a SQL database. A lot of dev work is going into it and I want robust checks in place.
Using the Mapper as described in that other question gets you on the right path. As mentioned in the docs [0], you get a collection of sqlalchemy.orm.relationships.RelationshipProperty objects, and you can then use class_ on the mapper associated with each RelationshipProperty to get to the target class:
from sqlalchemy.inspection import inspect
rels = inspect(Thing).relationships
clss = [rel.mapper.class_ for rel in rels]
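For example, in a unit test the result can be asserted directly (a small sketch, assuming the Foo and Bar classes from the question are importable):

from sqlalchemy.inspection import inspect

def test_thing_relationship_targets():
    # collect the mapped classes that Thing's relationships point to
    targets = {rel.mapper.class_ for rel in inspect(Thing).relationships}
    assert targets == {Foo, Bar}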

SQLAlchemy creating two databases in one file with two different models

I want to initialize two databases with totally different models in my database.py file.
database.py
engine1 = create_engine(uri1)
engine2 = create_engine(uri2)

session1 = scoped_session(sessionmaker(autocommit=False, autoflush=False, bind=engine1))
session2 = scoped_session(sessionmaker(autocommit=False, autoflush=False, bind=engine2))

Base = declarative_base(name='Base')
Base.query = session1.query_property()

LogBase = declarative_base(name='LogBase')
LogBase.query = session2.query_property()
and the two model structures:
models.py
class MyModel(Base):
    pass

models2.py

class MyOtherModel(LogBase):
    pass
Back in database.py, I want to create/initialize the databases after importing the models:
# this does init the database correctly
def init_db1():
    import models
    Base.metadata.create_all(bind=engine1)

# this init function does not work properly
def init_db2():
    import models2
    LogBase.metadata.create_all(bind=engine2)
If I change the import in the second init function, it does work:
def init_db2():
    from models2 import *
    LogBase.metadata.create_all(bind=engine2)
but there is a warning:
database.py:87: SyntaxWarning: import * only allowed at module level
Everything does work properly, I have the databases initialized, but the warning tells me that there is something wrong with it.
If someone can explain to me why the first attempt isn't correct, I would be grateful. Thanks.
You are indeed discouraged from using the from ... import * syntax inside functions, because it makes it impossible for Python to determine what the local names of that function are, breaking scoping rules. For Python to make things work anyway, certain optimizations have to be disabled, and name lookup becomes a lot slower as a result.
I cannot reproduce your problem otherwise. Importing just models2 makes sure that everything defined in that module is executed so that the LogBase class has a registry of all declarations. There is no reason for that path to fail while the models.py declarations for Base do work.
For the purposes of SQLAlchemy and declarative table metadata, there is no difference between import models2 and from models2 import *; only their effect on the local namespace differs. In both cases the top-level code of models2 is run and the classes are defined; in the latter case the module's top-level names are additionally added to the local namespace as direct references, rather than just a reference to the module object.
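To see this for yourself, a plain import is enough to populate the metadata (a sketch; the printed table names depend on what models2 actually defines):

def init_db2():
    import models2  # noqa: F401 -- imported only for its side effects
    # the top-level class definitions in models2 have executed by now,
    # so their tables are already registered on LogBase.metadata
    print(list(LogBase.metadata.tables))  # e.g. ['my_other_model']
    LogBase.metadata.create_all(bind=engine2)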

Instantiating Multiple AbstractConcreteBase Issue

I'm getting an error I don't understand with AbstractConcreteBase
in my_enum.py
class MyEnum(AbstractConcreteBase, Base):
    pass
in enum1.py
class Enum1(MyEnum):
    years = Column(SmallInteger, default=0)

# class MyEnums1:
#     NONE = Enum1()
#     Y1 = Enum1(years=1)
in enum2.py
class Enum2(MyEnum):
    class_name_python = Column(String(50))
in test.py
from galileo.copernicus.basic_enum.enum1 import Enum1
from galileo.copernicus.basic_enum.enum2 import Enum2
#...
If I uncomment the three lines in enum1.py I get the following error on the second import.
AttributeError: type object 'MyEnum' has no attribute '__table__'
but without MyEnums1 it works fine, or with MyEnums1 in a separate file it works fine. Why would this instantiation affect the import? Is there any way I can keep MyEnums1 in the same file?
The purpose of AbstractConcreteBase is to apply a non-standard order of operations to the standard mapping procedure. Normally, mapping works like this (a classical-API sketch of these steps follows the list):
1. define a class to be mapped
2. define a Table
3. map the class to the Table using mapper()
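Spelled out with the classical (pre-2.0, non-declarative) API, those three steps look like this (a generic sketch, not code from the question):

from sqlalchemy import Table, Column, Integer, String, MetaData
from sqlalchemy.orm import mapper

metadata = MetaData()

class User:  # 1. define a class to be mapped
    pass

user_table = Table(  # 2. define a Table
    'user', metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String(50)),
)

mapper(User, user_table)  # 3. map the class to the Table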
Declarative essentially combines these three steps, but that's what it does.
When using an abstract concrete base, we have this totally special step that needs to happen - the base class needs to be mapped to a union of all the tables that the subclasses are mapped to. So if you have enum1 and enum2, the "Base" needs to map to essentially "select * from enum1 UNION ALL select * from enum2".
This mapping to a UNION can't happen piecemeal; the MyEnum base class has to present itself to mapper() with the full UNION of every sub-table at once. So AbstractConcreteBase performs the complex task of rearranging how declarative works such that the base MyEnum is not mapped at all until the mapper configuration occurs, which among other places occurs when you first instantiate a mapped class. It then inserts itself as the mapped base for all the existing mapped subclasses.
So basically, by instantiating an Enum1() object at the class level like that, you're invoking configure_mappers() way too early, such that by the time Enum2 comes along the AbstractConcreteBase is already baked and the process fails.
All of that aside, it's not at all correct to instantiate a mapped class like Enum1() at the class level like that. ORM-mapped objects are the complete opposite of global objects and must always be created local to a specific Session.
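That means creating instances inside a session's scope, along these lines (a sketch, assuming a configured engine and the 1.4-style Session):

from sqlalchemy.orm import Session

with Session(engine) as session:  # engine is assumed to exist
    session.add(Enum1(years=1))   # created locally, inside a unit of work
    session.commit()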
Edit: also, those classes are supposed to have __mapper_args__ = {"concrete": True} on them, which is part of why you're getting this message. I'm trying to see if the message can be improved.
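Applied to the classes in the question, that would look roughly like this (a sketch; the table name is an assumption):

from sqlalchemy import Column, Integer, SmallInteger

class Enum1(MyEnum):
    __tablename__ = 'enum1'  # assumed table name
    __mapper_args__ = {'concrete': True}

    id = Column(Integer, primary_key=True)
    years = Column(SmallInteger, default=0)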
Edit 2: yeah, the mechanics here are weird. I've committed something that skips this particular error message, though it will now fail differently and not much better. Getting this to fail more gracefully would require a little more work.

Google App Engine base and subclass gets

I want to have a base class called MBUser that has some predefined properties, ones that I don't want to be changed. If the client wants to add properties to MBUser, it is advised that MBUser be subclassed, and any additional properties be put in there.
The API code won't know if the client actually subclasses MBUser or not, but it shouldn't matter. The thinking went that we could just get MBUser by id. So I expected this to work:
def test_CreateNSUser_FetchMBUser(self):
    from nsuser import NSUser
    id = create_unique_id()
    user = NSUser(id=id)
    user.put()
    # changing MBUser.get.. to NSUser.get makes this test succeed
    get_user = MBUser.get_by_id(id)
    self.assertIsNotNone(get_user)
Here NSUser is a subclass of MBUser. The test fails.
Why can't I do this?
What's a workaround?
Models are defined by their "kind", and a subclass is a different kind, even if it seems the same.
The point of subclassing is not to share values, but to share the "schema" you've created for a given "kind".
A kind map is created on the base class ndb.Model (it seems you're using ndb, since you mentioned get_by_id), and each kind is looked up there when you do queries like this.
For subclasses, the kind is just defined as the class name:
@classmethod
def _get_kind(cls):
    return cls.__name__
I just discovered GAE has a solution for this. It's called PolyModel:
https://developers.google.com/appengine/docs/python/ndb/polymodelclass
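With PolyModel, all subclass entities are stored under the root class's kind, so a fetch through the base class finds them (a sketch; the properties are made up):

from google.appengine.ext import ndb
from google.appengine.ext.ndb import polymodel

class MBUser(polymodel.PolyModel):
    created = ndb.DateTimeProperty(auto_now_add=True)  # hypothetical base property

class NSUser(MBUser):
    nickname = ndb.StringProperty()  # hypothetical client-added property

# Both classes now share the kind 'MBUser', so an entity saved via
# NSUser(id=some_id).put() can be fetched with MBUser.get_by_id(some_id).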
