In a legacy database of ours there is an unusual data structure: two many-to-many relations join the same two tables, companies and paymentschedules.
There is a many-to-many relation using an association table called companies_paymentschedules, and a second many-to-many relation using an association table called companies_comp_paymentschedules.
Both relations serve different purposes.
companies_paymentschedules stores the paymentschedules for which the company has a discount; companies_comp_paymentschedules stores the paymentschedules that are linked to the company.
(I know that this could be simplified by replacing these tables with a single lookup table, but that is not an option in this legacy database.)
The problem is that I need to join both types of companies (discounted and linked) in the same query. SQLAlchemy joins both association tables without problems, but it also joins the companies table twice and exposes it as "companies" both times, which leads to a SQL error (we are using MSSQL).
This is the query:
q = Paymentschedules.query
# join companies if a company is specified
if company is not None:
    q = q.join(Paymentschedules.companies)
    q = q.join(Paymentschedules.companies_with_reduction)
The many-to-many relations are both defined in our companies model, and look like this:
paymentschedules_with_reduction = relationship("Paymentschedules", secondary=companies_paymentschedules, backref="companies_with_reduction")
paymentschedules = relationship("Paymentschedules", secondary=companies_comp_paymentschedules, backref="companies")
The problem is that these joins make SQLAlchemy generate a SQL statement that looks like this:
FROM paymentschedules
JOIN companies_comp_paymentschedules AS companies_comp_paymentschedules_1 ON paymentschedules.pmsd_id = companies_comp_paymentschedules_1.pmsd_id
JOIN companies ON companies.comp_id = companies_comp_paymentschedules_1.comp_id
JOIN companies_paymentschedules AS companies_paymentschedules_1 ON paymentschedules.pmsd_id = companies_paymentschedules_1.pmsd_id
JOIN companies ON companies.comp_id = companies_paymentschedules_1.comp_id
The two lookup tables have different names, but the related companies table is called "companies" in both cases, causing a SQL error:
[SQL Server Native Client 11.0][SQL Server]The objects "companies" and "companies" in the FROM clause have the same exposed names. Use correlation names to distinguish them. (1013) (SQLExecDirectW); ...]
I have been looking for a way to alias a join, or perhaps alias one of the relations from my lookup-tables to the companies table, but I was unable to do so.
Is there a way to alias a joined many-to-many table?
update:
Based on the suggestion by @IljaEverilä I found this: http://docs.sqlalchemy.org/en/latest/orm/query.html?highlight=onclause (see "Joins to a Target with an ON Clause") as a method to alias a joined table, but the example only shows how to alias a one-to-many type join. In my case I need to alias the other side of my lookup table, so I can't apply the example code to my situation.
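For reference, the statement the query has to end up as gives each join to companies its own correlation name, which is what the error message asks for. Below is a minimal sketch of that target SQL, run through Python's sqlite3 so it can be tried without the MSSQL database. The toy rows and the alias names companies_linked and companies_discounted are made up for illustration; the table and column names are taken from the generated statement above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (comp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE paymentschedules (pmsd_id INTEGER PRIMARY KEY);
    CREATE TABLE companies_paymentschedules (comp_id INTEGER, pmsd_id INTEGER);
    CREATE TABLE companies_comp_paymentschedules (comp_id INTEGER, pmsd_id INTEGER);

    INSERT INTO companies VALUES (1, 'Linked Co'), (2, 'Discount Co');
    INSERT INTO paymentschedules VALUES (10);
    INSERT INTO companies_comp_paymentschedules VALUES (1, 10);  -- linked
    INSERT INTO companies_paymentschedules VALUES (2, 10);       -- discounted
""")

# Each join to companies gets its own correlation name, so the two
# occurrences no longer share the exposed name "companies".
rows = conn.execute("""
    SELECT paymentschedules.pmsd_id,
           companies_linked.name,
           companies_discounted.name
    FROM paymentschedules
    JOIN companies_comp_paymentschedules AS ccp
        ON paymentschedules.pmsd_id = ccp.pmsd_id
    JOIN companies AS companies_linked
        ON companies_linked.comp_id = ccp.comp_id
    JOIN companies_paymentschedules AS cp
        ON paymentschedules.pmsd_id = cp.pmsd_id
    JOIN companies AS companies_discounted
        ON companies_discounted.comp_id = cp.comp_id
""").fetchall()

print(rows)
```

On the ORM side, sqlalchemy.orm.aliased() is the construct that produces such an alias of a mapped class. How the alias is passed to join() along a relationship differs between SQLAlchemy versions, so check the documentation for the version in use.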
We have a hand-written SQL query for proof of concept and hope to implement the function with the Django framework.
Specifically, Django's QuerySet usually implements a join by matching a foreign key with the primary key of the referenced table. However, in our SQL we need additional matching conditions besides the foreign key, such as eav_postalcode.attribute_id = 122 in the snippet below.
...
left outer join eav_value as eav_postalcode
    on t.id = eav_postalcode.entity_id and eav_postalcode.attribute_id = 122
...
Questions:
We wonder if there is a way to do it with Python, Django framework, or libraries.
We also wonder if other programming languages have any mature toolkits we can refer to as a design pattern. So we highly appreciate any hints and suggestions.
Background and Technical Details:
The scenario is a report that consists of transactions with customized columns by Django-EAV. This library implements the eav_value table consisting of columns of different data types, e.g. value_text, value_date, value_float, etc.
We forked an internal repository of Django-EAV and upgraded it to Python 3, so we can use any up-to-date Python features, although we are not using Django-EAV2. As far as we know, the new version, EAV2, follows the same database schema design.
So, the application defines a product with attributes of specific data types, which we refer to as metadata in this question, e.g.:
attribute_id | slug       | datatype
122          | postalcode | text
123          | phone      | text
...          | ...        | e.g. date, float, etc. ...
One transaction is one entity, and the eav_value table contains multiple records with the matching entity_id corresponding to the different customized attributes. And we want to build a dynamic QuerySet according to the metadata to assemble the customized columns with left outer join similar to the sample SQL query below.
select
    t.id, t.create_ts
    , eav_postalcode.value_text as postalcode
    , eav_phone.value_text as phone
from
    (
        select * from transactions
        where product_id = __PRODUCT_ID__
    ) as t
    left outer join eav_value as eav_postalcode
        on t.id = eav_postalcode.entity_id and eav_postalcode.attribute_id = 122
    left outer join eav_value as eav_phone
        on t.id = eav_phone.entity_id and eav_phone.attribute_id = 123
;
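To make the "dynamic" part concrete, here is a minimal sketch that assembles one left outer join per metadata row and runs the result with Python's sqlite3. It is plain SQL generation, not Django ORM code. The function name build_report_sql, the toy rows, and the simplified eav_value schema (only value_text) are assumptions for illustration.

```python
import re
import sqlite3

# Hypothetical metadata rows: (attribute_id, slug, datatype), as in the table above.
METADATA = [(122, "postalcode", "text"), (123, "phone", "text")]


def build_report_sql(metadata, product_id):
    """Return (sql, params) with one LEFT OUTER JOIN per EAV attribute."""
    select_cols = ["t.id", "t.create_ts"]
    joins = []
    params = [product_id]  # the subquery placeholder comes first in the SQL text
    for attribute_id, slug, datatype in metadata:
        # Slug and datatype become part of identifiers, which cannot be bound
        # as parameters, so reject anything that is not a plain identifier.
        if not re.fullmatch(r"[a-z_][a-z0-9_]*", slug):
            raise ValueError(f"unsafe slug: {slug!r}")
        if not re.fullmatch(r"[a-z]+", datatype):
            raise ValueError(f"unsafe datatype: {datatype!r}")
        alias = f"eav_{slug}"
        select_cols.append(f"{alias}.value_{datatype} AS {slug}")
        joins.append(
            f"LEFT OUTER JOIN eav_value AS {alias} "
            f"ON t.id = {alias}.entity_id AND {alias}.attribute_id = ?"
        )
        params.append(attribute_id)
    sql = (
        f"SELECT {', '.join(select_cols)} "
        f"FROM (SELECT * FROM transactions WHERE product_id = ?) AS t "
        f"{' '.join(joins)} ORDER BY t.id"
    )
    return sql, params


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (id INTEGER PRIMARY KEY, product_id INTEGER, create_ts TEXT);
    CREATE TABLE eav_value (entity_id INTEGER, attribute_id INTEGER, value_text TEXT);
    INSERT INTO transactions VALUES (1, 7, '2024-01-01'), (2, 7, '2024-01-02');
    INSERT INTO eav_value VALUES (1, 122, '12345'), (1, 123, '555-0100'), (2, 122, '67890');
""")

sql, params = build_report_sql(METADATA, product_id=7)
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Transaction 2 has no phone value, so the left outer join leaves that column empty instead of dropping the row.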
We followed @NickODell's hint on FilteredRelation, and our tentative solution looks like the snippet below:
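from django.db.models import F, FilteredRelation, Q

# Alias the eav_values relation, restricted to the postalcode attribute (id 122).
transaction_eav = transaction.annotate(
    eav_postalcode=FilteredRelation('eav_values', condition=Q(eav_values__attribute_id=122))
)
# Expose the joined value as a column on each transaction.
transaction_eav = transaction_eav.annotate(
    value_postalcode=F('eav_postalcode__value_text')
)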
We are new to the Django ORM, so please point out any inefficient or non-standard usage in the sample code above.
Many thanks to all for the great suggestions!
I am using the simple-salesforce library for Python to query Salesforce.
I am looking at two different objects in Salesforce: Account and Opportunity (parent-child). The Opportunity object has an AccountId field.
I am trying to perform an inner join between the two and select the results (fields from both objects).
A normal SQL statement would look like this:
SELECT acc.Name, opp.StageName
FROM Account AS acc
JOIN Opportunity AS opp ON acc.Id = opp.AccountId
I am not sure how to translate this kind of query into SOQL.
A couple notes about SOQL:
No aliases allowed
No explicit joins allowed
But with that in mind, it's still possible to get the outcome you want by directly using the desired field names as an "attribute" of the object relationship. Example:
SELECT account.Name, Name, StageName FROM Opportunity
Which will grab the related account name, the opportunity name, and the opportunity stage name in one query.
As long as the field on your base object is of a type Lookup or Master-Detail, you can use this type of relationship. In the case of custom fields, you switch the __c over to __r though.
Example:
Opportunity has a relationship to custom object Address__c and we want to know what city & country these opportunities are in:
SELECT Address__r.Country__c, Address__r.City__c, Name, StageName FROM Opportunity
Salesforce doesn't allow arbitrary joins. You must write relationship queries to traverse predefined relationships in the Salesforce schema.
Here, you'd do something like
SELECT Name, (SELECT StageName FROM Opportunities)
FROM Account
No explicit join logic is required, or indeed permitted. Note too that your return values will be structured, nested JSON objects - Salesforce does not return flat rows like a SQL query would.
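If flat rows are what you want in the end, you can flatten the nested result yourself. This is a minimal sketch in plain Python. The response dictionary is hypothetical, shaped like the nested result of the parent-to-child query above with invented values, and the None for an account without opportunities is an assumption about the response shape.

```python
# Hypothetical response, shaped like the nested result of the
# parent-to-child query above; the values are invented for illustration.
response = {
    "totalSize": 2,
    "done": True,
    "records": [
        {
            "attributes": {"type": "Account"},
            "Name": "Acme",
            "Opportunities": {
                "totalSize": 2,
                "done": True,
                "records": [
                    {"attributes": {"type": "Opportunity"}, "StageName": "Prospecting"},
                    {"attributes": {"type": "Opportunity"}, "StageName": "Closed Won"},
                ],
            },
        },
        {
            "attributes": {"type": "Account"},
            "Name": "No Deals Inc",
            "Opportunities": None,  # assumed shape when an account has no children
        },
    ],
}


def flatten(response):
    """Yield one (account name, stage name) pair per opportunity."""
    for account in response["records"]:
        # Treat a missing or empty child set as "no rows", like an inner join.
        children = account.get("Opportunities") or {}
        for opportunity in children.get("records", []):
            yield account["Name"], opportunity["StageName"]


rows = list(flatten(response))
print(rows)
```

Accounts without opportunities produce no rows here, which matches the inner join in the question.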
I want to map a class to a join between two tables, selecting (mapping) all the columns from one table and only one column from the joined table.
join_table = join(table1, table2, table1.c.description == table2.c.description)
model_table_join = select([table1, table2.c.description]).select_from(join_table).alias()
Am I doing this right?
If all you want to do is pull in one extra column from a JOIN, I'd not muck about with an arbitrary select mapping. As the documentation points out:
The practice of mapping to arbitrary SELECT statements, especially complex ones as above, is almost never needed; it necessarily tends to produce complex queries which are often less efficient than that which would be produced by direct query construction. The practice is to some degree based on the very early history of SQLAlchemy where the mapper() construct was meant to represent the primary querying interface; in modern usage, the Query object can be used to construct virtually any SELECT statement, including complex composites, and should be favored over the “map-to-selectable” approach.
You'd just either select that extra column in your application:
session.query(Table1Model, Table2Model.description).join(Table2Model)
or you can register a relationship on the Table1Model and an association property that always pulls in the extra column:
class Table1Model(Base):
    # ...
    _table2 = relationship('Table2Model', lazy='joined')
    description = association_proxy('_table2', 'description')
The association property manages the Table2Model.description column of the joined row as you interact with it on Table1Model instances.
That said, if you must stick with a join() query as the base, then you could just exclude the extra, duplicated columns from the join, with an exclude_properties mapper argument:
join_table = join(table1, table2, table1.c.description == table2.c.description)
class JoinedTableModel(Base):
    __table__ = join_table
    __mapper_args__ = {
        'exclude_properties': [table1.c.description]
    }
The new model then uses all the columns from the join to create attributes with the same names, except for those listed in exclude_properties.
Or you can keep using duplicated column names in the model simply by giving them a new name:
join_table = join(table1, table2, table1.c.description == table2.c.description)
class JoinedTableModel(Base):
    __table__ = join_table
    table1_description = table1.c.description
You can rename any column from the join this way, at which point they will no longer conflict with other columns with the same base name from the other table.
I am working with a database that does not have relationships created between tables, and changing schema is not an option for me.
I'm trying to describe in the ORM how to join two tables without declaring foreign keys. To make things worse, I need a custom ON clause in my SQL.
Here is my ORM (more or less):
class Table1(Base):
    __tablename__ = "table1"
    id1 = Column(String)
    id2 = Column(String)
class Table2(Base):
    __tablename__ = "table2"
    id1 = Column(String)
    id2 = Column(String)
Goal
What I'm trying to create is relationship that joins tables like this:
.....
FROM Table1
JOIN Table2 ON (Table1.id1 = Table2.id1 OR Table1.id2 = Table2.id2)
My Attempt
I tried adding the following to Table1, but the documentation does not explain in terms I can understand why it is wrong:
table2 = relationship("Table2",
primaryjoin=or_(foreign(id1) == remote(Table2.id1),
foreign(id2) == remote(Table2.id2)))
But when I tested this, I got the wrong SQL query back (I expected the SQL to contain the join I described above):
str(query(Table1,Table2))
SELECT "table1".id1, "table1".id2, "table2".id1, "table2".id2
FROM "table1","table2"
Note
I don't really understand what remote and foreign do, but I tried to infer from the documentation where they belong. Without them I get an error on import saying:
ArgumentError: Could not locate any relevant foreign key columns for primary join condition 'my full primaryjoin code' on relationship Table1.other_table. Ensure that referencing columns are associated with a ForeignKey or ForeignKeyConstraint, or are annotated in the join condition with the foreign() annotation.
I don't think that I can use ForeignKey or ForeignKeyConstraint because none of my columns are constrained to another table's values.
The expression
str(query(Table1,Table2))
produces a cross join between the 2 tables, as you've observed. This is the expected behaviour. If you want to use inner joins etc., you'll have to be explicit about it:
str(query(Table1, Table2).join(Table1.table2))
This joins along the relationship attribute table2. The attribute indicates how this join should happen.
Documentation on foreign() and remote() is a bit scattered to my own taste as well, but it is established in "Adjacency List Relationships" and "Non-relational Comparisons / Materialized Path" that when foreign and remote annotations are on different sides of the expression (in the ON clause), the relationship is considered to be many-to-one. When they are on the same side or remote is omitted it is considered one-to-many. So your relationship is considered to be many-to-one.
They are just an alternative to foreign_keys and remote_side parameters.
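To see the difference at the SQL level, here is a minimal sketch using Python's sqlite3 rather than SQLAlchemy. It runs the comma-separated FROM from the question and the explicit OR join side by side; the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id1 TEXT, id2 TEXT);
    CREATE TABLE table2 (id1 TEXT, id2 TEXT);
    INSERT INTO table1 VALUES ('a', 'x'), ('b', 'y');
    INSERT INTO table2 VALUES ('a', 'q'), ('c', 'y'), ('d', 'z');
""")

# Listing both tables without a join condition pairs every row with every row.
cross = conn.execute("SELECT * FROM table1, table2").fetchall()

# An explicit JOIN with the OR condition keeps only the matching pairs.
joined = conn.execute("""
    SELECT table1.id1, table1.id2, table2.id1, table2.id2
    FROM table1
    JOIN table2 ON (table1.id1 = table2.id1 OR table1.id2 = table2.id2)
    ORDER BY table1.id1
""").fetchall()

print(len(cross), joined)
```

The cross join returns all 2 x 3 combinations, while the explicit join returns one pair matched on id1 and one matched on id2.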
I have two tables, Table A and Table B. I have added one column to Table A, record_id. Table B has record_id and the primary ID for Table A, table_a_id. I am looking to deprecate Table B.
Relationships exist between Table B's table_a_id and Table A's id, if that helps.
Currently, my solution is:
db.execute("""
    UPDATE table_a t
    SET record_id = b.record_id
    FROM table_b b
    WHERE t.id = b.table_a_id
""")
This is my first time using this ORM -- I'd like to see if there is a way I can use my Python models and the actual functions SQLAlchemy gives me to be more 'Pythonic' rather than just dumping a Postgres statement that I know works in an execute call.
My solution ended up being as follows:
(db.query(TableA)
.filter(TableA.id == TableB.table_a_id,
TableA.record_id.is_(None))
.update({TableA.record_id: TableB.record_id}, synchronize_session=False))
This leverages the ability of PostgreSQL to do updates based on implicit references to other tables, which I set up in my .filter() call (this is analogous to a WHERE in a JOIN query). The solution was deceptively simple.
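For databases that lack UPDATE ... FROM, a correlated subquery can express the same backfill. Note that this is a swapped-in technique, not what the SQLAlchemy call above emits. A minimal sketch with Python's sqlite3 and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, record_id INTEGER);
    CREATE TABLE table_b (table_a_id INTEGER, record_id INTEGER);
    INSERT INTO table_a VALUES (1, NULL), (2, 99), (3, NULL);
    INSERT INTO table_b VALUES (1, 100), (2, 200);
""")

# Fill record_id only where it is still empty and table_b has a matching row.
conn.execute("""
    UPDATE table_a
    SET record_id = (
        SELECT b.record_id FROM table_b AS b WHERE b.table_a_id = table_a.id
    )
    WHERE record_id IS NULL
      AND EXISTS (SELECT 1 FROM table_b AS b WHERE b.table_a_id = table_a.id)
""")

rows = conn.execute("SELECT id, record_id FROM table_a ORDER BY id").fetchall()
print(rows)
```

Row 2 keeps its existing value and row 3 stays empty because table_b has no match for it, mirroring the record_id.is_(None) filter above.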