I'm using sqlalchemy to model the following relationship:
There are stops, and many stops can have the same name.
There are translations, and multiple translations use the same stop name (but different languages). So that one stop name could be translated to many languages.
Since stop_name is not unique among stops, SQLAlchemy and Postgres reject my attempt to create a one-to-many relationship (see below). But this is not exactly one-to-many. What I want, when I access stop.translations, is to get all of the translations that match this query: SELECT * FROM translation WHERE translation.stop_name = stop.stop_name. So I accept that this is really a many-to-many relationship, but I want to hide that from my users and make it look like one-to-many.
I thought of using hybrid attributes, but they seem to be scalar only, so that's not really an option. I also tried prefilling a many-to-many relationship, but I probably did a bad job of it, because that took forever and timed out.
Some context: this is part of pygtfs, but here is the minimal example of when this goes wrong. When I run the following script:
import sqlalchemy
import sqlalchemy.orm
from sqlalchemy import Column
from sqlalchemy.types import Unicode
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Stop(Base):
    __tablename__ = 'stop'
    stop_id = Column(Unicode, primary_key=True)
    stop_name = Column(Unicode)
    # What I'd like:
    translations = sqlalchemy.orm.relationship('Translation', viewonly=True,
        primaryjoin="stop.c.stop_name==translation.c.stop_name")

class Translation(Base):
    __tablename__ = 'translation'
    stop_name = Column(Unicode, primary_key=True)
    lang = Column(Unicode, primary_key=True)
    translation = Column(Unicode)

if __name__ == "__main__":
    engine = sqlalchemy.create_engine("postgresql://postgres@localhost:5432")
    Session = sqlalchemy.orm.sessionmaker(bind=engine)
    session = Session()
    session.add(Stop(stop_id="hrld", stop_name="Herald Square"))
I get:
[...]
sqlalchemy.exc.ArgumentError: Could not locate any relevant foreign key columns for primary join condition 'stop.stop_name = translation.stop_name' on relationship Stop.translations. Ensure that referencing columns are associated with a ForeignKey or ForeignKeyConstraint, or are annotated in the join condition with the foreign() annotation.
What can I do to map this in SQLAlchemy?
[edit after a comment]:
If I add a ForeignKey, it fails, because stop_name is not unique (and I don't want it to be unique!):
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) there is no unique constraint matching given keys for referenced table "translation"
[SQL: '\nCREATE TABLE stop (\n\tstop_id VARCHAR NOT NULL, \n\tstop_name VARCHAR, \n\tPRIMARY KEY (stop_id), \n\tFOREIGN KEY(stop_name) REFERENCES translation (stop_name)\n)\n\n'] (Background on this error at: http://sqlalche.me/e/f405)
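The first error message points at one possible direction: annotating the referencing column with foreign() inside primaryjoin, so that the ORM knows how to join without any database-level FOREIGN KEY (and therefore without a unique constraint). The following is only a minimal sketch of that idea, not a tested fix for pygtfs. It assumes an in-memory SQLite database instead of Postgres, and the sample rows are invented for illustration.

```python
from sqlalchemy import Column, Unicode, create_engine
from sqlalchemy.orm import Session, relationship

try:  # SQLAlchemy 1.4 and later
    from sqlalchemy.orm import declarative_base
except ImportError:  # older releases
    from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Stop(Base):
    __tablename__ = "stop"
    stop_id = Column(Unicode, primary_key=True)
    stop_name = Column(Unicode)
    # foreign() marks translation.stop_name as the referencing side for the
    # ORM only; no FOREIGN KEY constraint is emitted in the DDL.
    translations = relationship(
        "Translation",
        primaryjoin="Stop.stop_name == foreign(Translation.stop_name)",
        viewonly=True,
        uselist=True,
    )


class Translation(Base):
    __tablename__ = "translation"
    stop_name = Column(Unicode, primary_key=True)
    lang = Column(Unicode, primary_key=True)
    translation = Column(Unicode)


# Assumption: in-memory SQLite stands in for the Postgres database.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all([
    Stop(stop_id="hrld", stop_name="Herald Square"),
    Stop(stop_id="hrld2", stop_name="Herald Square"),  # same name, different stop
    Translation(stop_name="Herald Square", lang="es", translation="Plaza Herald"),
    Translation(stop_name="Herald Square", lang="fr", translation="Place Herald"),
    Translation(stop_name="Other Stop", lang="es", translation="Otra parada"),
])
session.commit()

# Both stops named "Herald Square" see the same set of translations.
langs_by_stop = {
    stop.stop_id: sorted(t.lang for t in stop.translations)
    for stop in session.query(Stop)
}
print(langs_by_stop)
```

Because the relationship is viewonly, translations would still have to be added through the Translation class directly; the attribute only reads.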
As background: I'm creating an ORM based on the schema of an already existing database, because the Python application won't be the "owner" of said database.
In this database there is a table called "task" and a table called "task_notBefore__task_relatedTasks". The latter is a many-to-many relation between different entries in the "task" table.
automap_base() can detect these relationships automatically, as described here. However, this fails in my case, and no relationship is built.
I then try to manually create the relationship:
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.ext.automap import generate_relationship
from sqlalchemy.orm import sessionmaker, interfaces, relationship
from sqlalchemy import create_engine

class DBConnection:
    def __init__(self, connection_url, **kwargs):
        self.engine = create_engine(connection_url, **kwargs)
        self._Base = automap_base()
        self._Base.prepare(self.engine, reflect=True)
        self.Task = self._Base.classes.task
        self.Order = self._Base.classes.order
        self.Poller = self._Base.classes.poller
        rel = generate_relationship(self._Base, interfaces.MANYTOMANY, relationship, 'related', self.Task, self.Task,
                                    secondary=self._Base.classes.task_notBefore__task_relatedTasks, backref='notBefore')
        self._Session = sessionmaker()
        self._Session.configure(bind=self.engine)
        self.session = self._Session()
However this still doesn't "do" anything: it doesn't add anything to the self.Task "class".
How would one do this?
The primary problem in this case is not just the many-to-many relationship, but the fact that it's a self-referential, many-to-many relationship. Because automap is simply translating the mapped class names to relationship names, it constructs the same name, e.g. task_collection, for both directions of the relationship, and the naming collision generates the error. This shortcoming of automap feels significant in that self-referential, many-to-many relationships are not uncommon.
Explicitly adding the relationships you want, using your own names, won't solve the problem because automap will still try to create the task_collection relationships. To deal with this issue, we need to override task_collection.
If you're okay with keeping the name task_collection for the forward direction of the relationship, we can simply pre-define the relationship--specifying whatever name we want for the backref. If automap finds the expected property already in place, it will assume the relationship is being overridden and not try to add it.
Here's a stripped-down example, along with an SQLite database for testing.
Sqlite Database
CREATE TABLE task (
    id INTEGER,
    name VARCHAR,
    PRIMARY KEY (id)
);
CREATE TABLE task_task (
    tid1 INTEGER,
    tid2 INTEGER,
    FOREIGN KEY(tid1) REFERENCES task(id),
    FOREIGN KEY(tid2) REFERENCES task(id)
);
-- Some sample data
INSERT INTO task VALUES (0, 'task_0');
INSERT INTO task VALUES (1, 'task_1');
INSERT INTO task VALUES (2, 'task_2');
INSERT INTO task VALUES (3, 'task_3');
INSERT INTO task VALUES (4, 'task_4');
INSERT INTO task_task VALUES (0, 1);
INSERT INTO task_task VALUES (0, 2);
INSERT INTO task_task VALUES (2, 4);
INSERT INTO task_task VALUES (3, 4);
INSERT INTO task_task VALUES (3, 0);
Putting it into a file called setup_self.sql, we can do:
sqlite3 self.db < setup_self.sql
Python Code
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, Integer, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

DeclBase = declarative_base()

task_task = Table('task_task', DeclBase.metadata,
                  Column('tid1', Integer, ForeignKey('task.id')),
                  Column('tid2', Integer, ForeignKey('task.id')))

Base = automap_base(DeclBase)

class Task(Base):
    __tablename__ = 'task'
    task_collection = relationship('Task',
                                   secondary=task_task,
                                   primaryjoin='Task.id==task_task.c.tid1',
                                   secondaryjoin='Task.id==task_task.c.tid2',
                                   backref='backward')

engine = create_engine("sqlite:///self.db")
Base.prepare(engine, reflect=True)
session = Session(engine)

task_0 = session.query(Task).filter_by(name='task_0').first()
task_4 = session.query(Task).filter_by(name='task_4').first()
print("task_0.task_collection = {}".format([x.name for x in task_0.task_collection]))
print("task_4.backward = {}".format([x.name for x in task_4.backward]))
Results
task_0.task_collection = ['task_1', 'task_2']
task_4.backward = ['task_2', 'task_3']
Using a Different Name
If you want to have a name other than task_collection, you need to use automap's function for overriding collection-relationship names:
name_for_collection_relationship(base, local_cls, referred_cls, constraint)
The arguments local_cls and referred_cls are instances of the mapped table classes. For a self-referential, many-to-many relationship, these are both the same class. We can use the arguments to build a key that allows us to identify overrides.
Here is an example implementation of this approach.
from sqlalchemy.ext.automap import automap_base, name_for_collection_relationship
from sqlalchemy.orm import Session
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, Integer, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

DeclBase = declarative_base()

task_task = Table('task_task', DeclBase.metadata,
                  Column('tid1', Integer, ForeignKey('task.id')),
                  Column('tid2', Integer, ForeignKey('task.id')))

Base = automap_base(DeclBase)

class Task(Base):
    __tablename__ = 'task'
    forward = relationship('Task',
                           secondary=task_task,
                           primaryjoin='Task.id==task_task.c.tid1',
                           secondaryjoin='Task.id==task_task.c.tid2',
                           backref='backward')

# A dictionary that maps relationship keys to a relationship name
OVERRIDES = {
    'Task_Task': 'forward'
}

def _name_for_collection_relationship(base, local_cls, referred_cls, constraint):
    # Build the key
    key = '{}_{}'.format(local_cls.__name__, referred_cls.__name__)
    # Did we have an override name?
    if key in OVERRIDES:
        # Yes, return it
        return OVERRIDES[key]
    # Default to the standard automap function
    return name_for_collection_relationship(base, local_cls, referred_cls, constraint)

engine = create_engine("sqlite:///self.db")
Base.prepare(engine, reflect=True, name_for_collection_relationship=_name_for_collection_relationship)
Note that the overriding of name_for_collection_relationship simply changes the name that automap uses for the relationship. In our case, the relationship is still being pre-defined by Task. But, the override tells automap to look for forward instead of task_collection, which it finds and therefore discontinues defining the relationship.
Other Approaches Considered
Under some circumstances, it would be nice if we could override the relationship names without having to pre-define the actual relationship. On first consideration, this should be possible using name_for_collection_relationship. However, I could not get this approach to work for self-referential, many-to-many relationships, due to a combination of two reasons.
name_for_collection_relationship and the related generate_relationship are called twice, once for each direction of the many-to-many relationship. In both cases, local_cls and referred_cls are the same, because of the self-referentiality. Moreover, the other arguments of name_for_collection_relationship are effectively equivalent. Therefore, we cannot, from the context of the function call, determine which direction we are overriding.
Here is the even-more surprising part of the problem. It appears we cannot even count on one direction happening before the other. In other words, the two calls to name_for_collection_relationship and generate_relationship are very similar. The argument that actually determines the directionality of the relationship is constraint, which is one of the two foreign-key constraints for the relationship; these constraints are loaded, from Base.metadata, into a variable called m2m_const. Herein lies the problem. The order that the constraints end up in m2m_const is nondeterministic, i.e. sometimes it will be one order; other times it will be the opposite (at least when using sqlite3). Because of this, the directionality of the relationship is nondeterministic.
On the other hand, when we pre-define the relationship, the following arguments create the necessary determinism.
primaryjoin='Task.id==task_task.c.tid1',
secondaryjoin='Task.id==task_task.c.tid2',
Of particular note, I actually tried to create a solution that simply overrode the relationship names without pre-defining it. It exhibited the described nondeterminism.
Final Thoughts
If you have a reasonable number of database tables that do not change often, I would suggest just using Declarative Base. It might be a little more work to set up, but it gives you more control.
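To make that suggestion concrete, here is a rough sketch of the same self-referential, many-to-many mapping written with plain Declarative and no automap. It is an illustration under assumptions rather than a drop-in replacement: it uses an in-memory SQLite database instead of self.db, declares the columns by hand instead of reflecting them, and rebuilds the sample links through the ORM.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, relationship

try:  # SQLAlchemy 1.4 and later
    from sqlalchemy.orm import declarative_base
except ImportError:  # older releases
    from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

# Association table linking task rows to other task rows.
task_task = Table(
    "task_task", Base.metadata,
    Column("tid1", Integer, ForeignKey("task.id")),
    Column("tid2", Integer, ForeignKey("task.id")),
)


class Task(Base):
    __tablename__ = "task"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    # primaryjoin/secondaryjoin fix the direction: tid1 -> tid2 is "forward".
    forward = relationship(
        "Task",
        secondary=task_task,
        primaryjoin=id == task_task.c.tid1,
        secondaryjoin=id == task_task.c.tid2,
        backref="backward",
    )


engine = create_engine("sqlite://")  # assumption: in-memory database
Base.metadata.create_all(engine)
session = Session(engine)

# Same links as the sample data: 0->1, 0->2, 2->4, 3->4, 3->0.
tasks = [Task(name="task_{}".format(i)) for i in range(5)]
tasks[0].forward = [tasks[1], tasks[2]]
tasks[2].forward = [tasks[4]]
tasks[3].forward = [tasks[4], tasks[0]]
session.add_all(tasks)
session.commit()

forward_names = sorted(t.name for t in tasks[0].forward)
backward_names = sorted(t.name for t in tasks[4].backward)
print(forward_names, backward_names)
```

With everything declared explicitly there is no name generation step, so the naming collision and the ordering problem described above do not come into play.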
Executing this command:
sqlacodegen <connection-url> --outfile db.py
The db.py contains generated tables:
t_table1 = Table(...)
and classes too:
class Table2(Base):
    __tablename__ = 'table2'
The problem is that each table is generated in one form only: either as a Table object or as a class.
I would like it to generate models (classes) only, but I couldn't find such an option among the provided flags. Any ideas?
It looks like what you're describing is a feature. sqlacodegen will not always generate class models.
It will only form model classes for tables that have a primary key and are not association tables, as you can see in the source code:
# Only form model classes for tables that have a primary key and are not association tables
if noclasses or not table.primary_key or table.name in association_tables:
    model = self.table_model(table)
else:
    model = self.class_model(table, links[table.name], self.inflect_engine, not nojoined)
    classes[model.name] = model
Furthermore, in the documentation it is stated that
A table is considered an association table if it satisfies all of the
following conditions:
has exactly two foreign key constraints
all its columns are involved in said constraints
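To see how those two conditions play out on a concrete table, the check can be approximated in a few lines. This is a sketch for illustration only: looks_like_association_table is a hypothetical helper written for this answer, not sqlacodegen's actual implementation, and the two tables are made-up examples.

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, String, Table


def looks_like_association_table(table):
    """Hypothetical helper: apply the two quoted conditions to a Table."""
    # Condition 1: exactly two foreign key constraints.
    has_two_fk_constraints = len(table.foreign_key_constraints) == 2
    # Condition 2: every column takes part in a foreign key.
    all_columns_in_fks = all(column.foreign_keys for column in table.columns)
    return has_two_fk_constraints and all_columns_in_fks


metadata = MetaData()

task = Table(
    "task", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
)

task_task = Table(
    "task_task", metadata,
    Column("tid1", Integer, ForeignKey("task.id")),
    Column("tid2", Integer, ForeignKey("task.id")),
)

is_task_association = looks_like_association_table(task)
is_link_association = looks_like_association_table(task_task)
print(is_task_association, is_link_association)
```

A table that fails the check but also lacks a primary key is still emitted as a Table object, because the primary-key test in the source excerpt is evaluated independently of the association test.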
That said, you can try a quick and dirty hack. Locate those lines in the source code (something like /.../lib/python2.7/site-packages/sqlacodegen/codegen.py), comment out the first three code lines, and fix the indentation:
# Only form model classes for tables that have a primary key and are not association tables
# if noclasses or not table.primary_key or table.name in association_tables:
#     model = self.table_model(table)
# else:
model = self.class_model(table, links[table.name], self.inflect_engine, not nojoined)
classes[model.name] = model
I have tried this for one specific table that was generated as a table model. It went from
t_Admin_op = Table(
    'Admin_op', metadata,
    Column('id_admin', Integer, nullable=False),
    Column('id_op', Integer, nullable=False)
)
to
class AdminOp(Base):
    __tablename__ = 'Admin_op'
    id_admin = Column(Integer, nullable=False)
    id_op = Column(Integer, nullable=False)
You can also open an issue about this as a feature request, in the official tracker.
Just in case, if you want the opposite (only table models), you could do so with the --noclasses flag.
One of my models has the following relationship:
class User(Base):
    account = relationship("Account")
I would like to set the account id manually.
My first attempt was this:
class User(Base):
    account = relationship("Account")
    accounts_id = Column(Integer, ForeignKey("accounts.id"), nullable=True)

    @classmethod
    def from_json(cls, json):
        appointment = Appointment()
        appointment.account_id = json["account_id"]
        return appointment
The above doesn't work. We can't refer to this column because SQLAlchemy throws a fit. This is the exception:
sqlalchemy.exc.InvalidRequestError: Implicitly combining column users.accounts_id with column users.accounts_id under attribute 'accounts_id'. Please configure one or more attributes for these same-named columns explicitly.
I've tried hunting through the docs and experimented with getting to the attribute in numerous ways, but I haven't been able to find it, much less set it.
print(self.account.account_id)
print(self.account.relationhip)
print(self.account.properties)
print(self.account.primaryjoin)
Any ideas?
[Edit- added exception above]
Use the Account class to define the relationship, and add the backref keyword argument:
from sqlalchemy.orm import relationship
class User(Base):
    accounts_id = Column(Integer, ForeignKey('account.id'))

class Account(Base):
    users = relationship('User', backref='account')
Using the backref keyword on a single relationship is exactly the same as creating the two relationships individually, User.account and Account.users, with back_populates on each.
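For a fuller picture, here is a small self-contained sketch of that layout in which the foreign key column is set by hand, as the question asks. Several details are assumptions made for the example: the table names "accounts" and "users", the primary key columns, the in-memory SQLite engine, and a from_json that builds a User rather than an Appointment.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, relationship

try:  # SQLAlchemy 1.4 and later
    from sqlalchemy.orm import declarative_base
except ImportError:  # older releases
    from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Account(Base):
    __tablename__ = "accounts"  # assumed table name
    id = Column(Integer, primary_key=True)
    # The backref adds a User.account attribute automatically.
    users = relationship("User", backref="account")


class User(Base):
    __tablename__ = "users"  # assumed table name
    id = Column(Integer, primary_key=True)
    # Declared exactly once; the relationship is derived from this column.
    accounts_id = Column(Integer, ForeignKey("accounts.id"), nullable=True)

    @classmethod
    def from_json(cls, data):
        user = cls()
        user.accounts_id = data["account_id"]  # set the id manually
        return user


engine = create_engine("sqlite://")  # assumption: in-memory database
Base.metadata.create_all(engine)
session = Session(engine)

session.add(Account(id=7))
session.commit()

user = User.from_json({"account_id": 7})
session.add(user)
session.commit()

# After the commit, the relationship resolves from the manually set id.
linked_account_id = user.account.id
print(linked_account_id)
```

Note that user.account is only populated once the row has been flushed and reloaded; setting accounts_id on a pending object does not update the relationship attribute immediately.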
I am trying to set One-To-Many relationship on an existing database.
Simplified DDL is :
create table accnt (
code varchar(20) not null
, def varchar(100)
, constraint pk_accnt primary key (code)
);
commit;
create table slorder (
code varchar(20) not null
, def varchar(100)
, dt date
, c_accnt varchar(20) not null
, constraint pk_slorder primary key (code)
, constraint fk_slorder_accnt foreign key (c_accnt)
references accnt (code)
on update cascade on delete cascade
);
commit;
SqlAlchemy Code :
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import *

engine = create_engine('firebird://sysdba:masterkey@127.0.0.1/d:\\prj\\db2\\makki.fdb?charset=WIN1254', echo=False)
Base = declarative_base()
Base.metadata.bind = engine

class Accnt(Base):
    __tablename__ = 'accnt'
    __table_args__ = {'autoload': True}
    defi = Column('def', String(100))

class SlOrder(Base):
    __tablename__ = 'slorder'
    __table_args__ = {'autoload': True}
    defi = Column("def", String(100))
    accnt = relationship('Accnt', backref='slorders')
gives this error:
sqlalchemy.exc.ArgumentError: Could not determine join condition between parent/child tables on relationship SlOrder.accnt. Specify a 'primaryjoin' expression. If 'secondary' is present, 'secondaryjoin' is needed as well.
My possible solutions to this problem are :
1
class SlOrder(Base):
    __tablename__ = 'slorder'
    __table_args__ = {'autoload': True}
    defi = Column("def", String(100))
    c_accnt = Column("c_accnt", String(20), ForeignKey('accnt.code'))
    accnt = relationship('Accnt', backref='slorders')
But this approach requires me to add every foreign key column manually, which makes reflection useless. (I have many columns that reference other tables.)
2
class SlOrder(Base):
    __table__ = Table('slorder', Base.metadata, autoload=True, autoload_with=engine)
    accnt = relationship('Accnt', backref='slorders', primaryjoin=(__table__.c.c_accnt == Accnt.code))
This approach has another consequence (please see my previous question).
So what am I missing? What is the best way to define a relationship both using reflection and declarative syntax?
EDIT :
I've figured out that SQLAlchemy finds and builds the relationship if the child table has only one reference to the parent table.
But if the child table has more than one reference, as in:
create table slorder (
code varchar(20) not null
, def varchar(100)
, dt date
, c_accnt varchar(20) not null
, c_accnt_ref varchar(20)
, constraint pk_slorder primary key (code)
, constraint fk_slorder_accnt foreign key (c_accnt)
references accnt (code)
on update cascade on delete cascade
, constraint fk_slorder_accnt_ref foreign key (c_accnt_ref)
references accnt (code)
on update cascade on delete no action
);
the above error occurs.
So is it expected behavior for SQLAlchemy to raise an error if there is more than one relation between two tables?
I think you have to add a ForeignKey in the child table.
By defining a ForeignKey, you can assign a value to c_accnt directly and also assign a parent object to accnt.
Internally, SQLAlchemy issues the query described by primaryjoin. If there is no foreign key, the model can't tell which field to run the query on.
You can use either approach, but I personally prefer a ForeignKey plus a relationship built on it. You have to write a bit more code, but it gives you the flexibility to assign either the value or the object directly.
I think your code should automatically reflect the ForeignKey and use it for the relationship without any changes.
Here are some ideas for exploring the issue, though:
Make sure you do not have multiple ForeignKeys to the same parent table; otherwise you must specify the join condition using the primaryjoin parameter, because SA cannot automatically decide which one to use.
Make sure you actually have a ForeignKey defined in the slorder table (as shown in the code example).
Check whether a non-default schema is defined for some tables; you may need to declare it in your tables with __table_args__ = {'schema': 'my_schema'}. (Just guessing, as I do not know Firebird and have no idea about its schema support.)
Check/debug the reflection step: sqlalchemy/dialects/firebird/base.py has get_foreign_keys. Check the SQL statement fkqry and execute it directly on your database to see whether it returns your ForeignKey. If not, try to find out why.
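The first idea in the list, two foreign keys pointing at the same parent table, is the case from the question's EDIT. A minimal sketch of how each relationship can be told which column to use follows. It assumes explicit column declarations and an in-memory SQLite database instead of Firebird reflection, uses the foreign_keys argument as an alternative to spelling out primaryjoin, and the names accnt_ref and ref_slorders are invented for the example.

```python
from sqlalchemy import Column, ForeignKey, String, create_engine
from sqlalchemy.orm import Session, relationship

try:  # SQLAlchemy 1.4 and later
    from sqlalchemy.orm import declarative_base
except ImportError:  # older releases
    from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Accnt(Base):
    __tablename__ = "accnt"
    code = Column(String(20), primary_key=True)


class SlOrder(Base):
    __tablename__ = "slorder"
    code = Column(String(20), primary_key=True)
    c_accnt = Column(String(20), ForeignKey("accnt.code"), nullable=False)
    c_accnt_ref = Column(String(20), ForeignKey("accnt.code"))
    # Two join paths exist, so each relationship names its own column.
    accnt = relationship("Accnt", foreign_keys=[c_accnt], backref="slorders")
    accnt_ref = relationship(
        "Accnt", foreign_keys=[c_accnt_ref], backref="ref_slorders"
    )


engine = create_engine("sqlite://")  # assumption: in-memory database
Base.metadata.create_all(engine)
session = Session(engine)

buyer = Accnt(code="A1")
referrer = Accnt(code="A2")
order = SlOrder(code="S1", accnt=buyer, accnt_ref=referrer)
session.add_all([buyer, referrer, order])
session.commit()

order_fk_values = (order.c_accnt, order.c_accnt_ref)
buyer_orders = [o.code for o in buyer.slorders]
referrer_orders = [o.code for o in referrer.ref_slorders]
print(order_fk_values, buyer_orders, referrer_orders)
```

Without foreign_keys (or an explicit primaryjoin), SQLAlchemy cannot choose between the two constraints on its own, which matches the error described in the question.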
How do I make dynamic queries in the SQLAlchemy ORM (if that is the correct name for them)?
I have used SQLAlchemy as a database abstraction, with queries written in Python code, but what if I need to generate these queries dynamically, rather than only setting query parameters such as "id"?
For example, I need to generate a query from a list (table names, column names, joined columns) that links three tables like "organisation", "people", "staff". How can I do it properly?
For example, I mean a list like this:
[{'table':'organisation', 'column':'staff_id'},
{'table':'staff', 'column':'id'}]
And output for example may contain:
organisation.id, organisation.name, organisation.staff_id, staff.id, staff.name
(The name column appears only in the output because I need a simple example: all columns of the tables are returned, and the array only has to define the joins.)
You can use mapper on the result of a call to sqlalchemy.sql.join and/or sqlalchemy.select. This is roughly equivalent to using mapper on a database view; you can query against such classes naturally, but not necessarily create new records. You can also use sqlalchemy.orm.column_property to map computed values to object attributes. As I read your question, a combination of these three techniques should meet your needs.
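As a complement to the mapper-based approach, the join itself can be assembled from a list like the one in the question using plain Core constructs. The following is a sketch under assumptions, not the technique described above: the table definitions, the sample rows, the in-memory SQLite engine, and the helper name build_join_query are all invented for illustration, and only a two-table spec is handled.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

metadata = MetaData()

organisation = Table(
    "organisation", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
    Column("staff_id", Integer),
)

staff = Table(
    "staff", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
)


def build_join_query(metadata, spec):
    """Hypothetical helper: turn a two-entry join spec into a SELECT."""
    left, right = spec
    # Look tables and columns up by name, so nothing is hard-coded.
    left_table = metadata.tables[left["table"]]
    right_table = metadata.tables[right["table"]]
    on_clause = left_table.c[left["column"]] == right_table.c[right["column"]]
    # Selects every column of both tables, left table first.
    return left_table.join(right_table, on_clause).select()


spec = [
    {"table": "organisation", "column": "staff_id"},
    {"table": "staff", "column": "id"},
]

engine = create_engine("sqlite://")  # assumption: in-memory database
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(organisation.insert(), [{"id": 1, "name": "Acme", "staff_id": 10}])
    conn.execute(staff.insert(), [{"id": 10, "name": "Ann"}, {"id": 11, "name": "Bob"}])
    rows = [tuple(row) for row in conn.execute(build_join_query(metadata, spec))]

print(rows)
```

Each row carries the columns in the order organisation.id, organisation.name, organisation.staff_id, staff.id, staff.name, which is the output shape the question asks for. A longer spec could be folded into a chain of join() calls in the same way.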
I haven't tested this, but with the SQLAlchemy ORM you can link tables together like this:
from sqlalchemy import create_engine, Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Engine = create_engine('mysql+mysqldb://user:password@localhost:3306/mydatabase', pool_recycle=3600)
Base = declarative_base()
# sessionmaker replaces the original's project-specific Session import
Session = sessionmaker(bind=Engine)
session = Session()

class DBOrganization(Base):
    __tablename__ = 'table_organization'
    id = Column(Integer(), primary_key=True)
    # String(255) stands in for the original's project-specific column type
    name = Column(String(255))

class DBEmployee(Base):
    __tablename__ = 'table_employee'
    id = Column(Integer(), primary_key=True)
    name = Column(String(255))
    organization_id = Column(Integer(), ForeignKey('table_organization.id'))
    # The backref below will be a list unless you specify uselist=False
    organization = relationship(DBOrganization, backref='employees')

Base.metadata.create_all(Engine)

# From here, you can query:
rs = session.query(DBEmployee).join(DBEmployee.organization).filter(DBOrganization.name == 'my organization')
for employee in rs:
    print('{0} works for {1}'.format(employee.name, employee.organization.name))