I am trying to join several tables with an outer join.
This is my code:
products = (
    db.session.query(Offers, Products, Brand, Categories, ProductImages)
    .outerjoin(
        Offers,
        Offers.product_id == Products.id,
        Offers.brand_id == Brand.id,
        Offers.category_id == Categories.id,
        Offers.product_id == ProductImages.product_id,
    )
    .filter(and_(now >= Offers.start_date), (now <= Offers.end_date))
    .order_by(Offers.product_name)
    .all()
)
I am getting this error:
sqlalchemy.exc.InvalidRequestError: Don't know how to join to <class 'app.models.Offers'>; please use an ON clause to more clearly establish the left side of this join
But I assumed that by mentioning "Offers" at the beginning of the join I was stating a join ON the "Offers" table.
No relationships are defined between these tables, and my tech lead has told me not to define them at the moment. How can I do the join without defining the relationships?
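One way to do this without relationships is to name the left side with select_from() and give each outerjoin() its own target and its own ON clause. The following is only a minimal sketch, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database; the trimmed-down models, their columns, and the sample rows are made up for illustration and are not the real app.models classes.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# Hypothetical, trimmed-down models: no relationship() and no ForeignKey.
class Products(Base):
    __tablename__ = "products"
    id = Column(Integer, primary_key=True)
    title = Column(String)

class Brand(Base):
    __tablename__ = "brand"
    id = Column(Integer, primary_key=True)
    name = Column(String)

class Offers(Base):
    __tablename__ = "offers"
    id = Column(Integer, primary_key=True)
    product_name = Column(String)
    product_id = Column(Integer)
    brand_id = Column(Integer)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Products(id=1, title="Kettle"),
        Brand(id=1, name="Acme"),
        Offers(id=1, product_name="Kettle", product_id=1, brand_id=1),
        # This offer points at a product that does not exist.
        Offers(id=2, product_name="Toaster", product_id=99, brand_id=1),
    ])
    session.commit()

    rows = (
        session.query(Offers.product_name, Products.title, Brand.name)
        .select_from(Offers)  # Offers is the left side of every join
        .outerjoin(Products, Offers.product_id == Products.id)
        .outerjoin(Brand, Offers.brand_id == Brand.id)
        .order_by(Offers.product_name)
        .all()
    )
    result = [tuple(row) for row in rows]
    print(result)
```

Because both joins are outer joins from Offers, the offer without a matching product is still returned, with None in the product column.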
I'm having a lot of trouble converting my sql query to sqlalchemy. I haven't been able to find any resources doing what I am trying to do.
The query I am trying to convert is:
SELECT
COALESCE(d.manager_name, e.name) AS name,
COALESCE(d.department_name, e.department_name) AS department
FROM employee e
LEFT JOIN department d ON e.id = d.id
WHERE e.date = '2018-11-05'
In sqlalchemy I came up with:
query = self.session.query(
    func.coalesce(Department.manager_name, Employee.name),
    func.coalesce(Department.department_name, Employee.department_name),
).join(
    Department,
    Employee.id == Department.id,
).filter(
    Employee.date == '2018-11-05',
)
But I keep getting the error:
sqlalchemy.exc.InvalidRequestError: Can't join table/selectable 'Department' to itself.
WHY?! The statements are exact!
Since Department is the leftmost item in your query, joins take place against it. To control what is considered the first – or the "left" – entity in the join use Query.select_from():
query = self.session.query(
        func.coalesce(Department.manager_name, Employee.name),
        func.coalesce(Department.department_name, Employee.department_name)).\
    select_from(Employee).\
    outerjoin(Department, Employee.id == Department.id).\
    filter(Employee.date == '2018-11-05')
This behaviour is also explained in the ORM tutorial under "Querying with Joins", and Query.join(): "Controlling what to Join From".
Your query construct was also using Query.join(), though the raw SQL had LEFT JOIN. In that case Query.outerjoin() or join(..., isouter=True) should be used.
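The fix can be tried end to end against a small in-memory SQLite database. This is only a sketch, assuming SQLAlchemy 1.4 or newer; the model definitions and sample rows are guesses at what Employee and Department might look like, since the question does not show them.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# Hypothetical models; the real column types are not shown in the question.
class Employee(Base):
    __tablename__ = "employee"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    department_name = Column(String)
    date = Column(String)  # kept as text to keep the sketch short

class Department(Base):
    __tablename__ = "department"
    id = Column(Integer, primary_key=True)
    manager_name = Column(String)
    department_name = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Employee(id=1, name="Ann", department_name="Sales", date="2018-11-05"),
        Employee(id=2, name="Bob", department_name="Ops", date="2018-11-05"),
        Department(id=1, manager_name="Carol", department_name="Sales HQ"),
    ])
    session.commit()

    query = (
        session.query(
            func.coalesce(Department.manager_name, Employee.name).label("name"),
            func.coalesce(
                Department.department_name, Employee.department_name
            ).label("department"),
        )
        .select_from(Employee)  # make Employee the left side of the join
        .outerjoin(Department, Employee.id == Department.id)
        .filter(Employee.date == "2018-11-05")
        .order_by(Employee.id)
    )
    sql = str(query)  # the rendered statement, useful for checking the join type
    result = [tuple(row) for row in query.all()]
    print(result)
```

Employee 2 has no matching Department row, so both COALESCE calls fall back to the Employee columns for that row.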
I am trying to build a compound SQL query that builds a table from a join I have previously performed (using SQLAlchemy Core with Python 3 and PostgreSQL 9.4).
I include here the relevant part of my python3 code. I first create "in_uuid_set" using a select with a group_by. Then I join "in_uuid_set" with "in_off_messages" to get "jn_in".
Finally, I try to build a new table "incoming" from "jn_in" by selecting and generating the wanted columns:
in_uuid_set = \
    sa.select([in_off_messages.c.src_uuid.label('remote_uuid')])\
    .select_from(in_off_messages)\
    .where(in_off_messages.c.dst_uuid == local_uuid)\
    .group_by(in_off_messages.c.src_uuid)\
    .alias()

jn_in = in_uuid_set.join(in_off_messages,\
    and_(\
        in_off_messages.c.src_uuid == in_uuid_set.c.remote_uuid,\
        in_off_messages.c.dst_uuid == local_uuid,\
    ))\
    .alias()

incoming = sa.select([\
        in_off_messages.c.msg_uuid.label('msg_uuid'),\
        in_uuid_set.c.remote_uuid.label('remote_uuid'),\
        in_off_messages.c.msg_type.label('msg_type'),\
        in_off_messages.c.date_sent.label('date_sent'),\
        in_off_messages.c.content.label('content'),\
        in_off_messages.c.was_read.label('was_read'),\
        true().label('is_incoming')]
    )\
    .select_from(jn_in)
Surprisingly, "incoming" has more rows than "jn_in": "incoming" has 12 rows, while "jn_in" has only 2. I expected "incoming" to have the same number of rows (2) as "jn_in".
Here is the SQL that SQLAlchemy generates for "incoming":
SELECT in_off_messages.msg_uuid AS msg_uuid,
       anon_1.remote_uuid AS remote_uuid,
       in_off_messages.msg_type AS msg_type,
       in_off_messages.date_sent AS date_sent,
       in_off_messages.content AS content,
       in_off_messages.was_read AS was_read,
       1 AS is_incoming
FROM in_off_messages,
     (SELECT in_off_messages.src_uuid AS remote_uuid
      FROM in_off_messages
      WHERE in_off_messages.dst_uuid = :dst_uuid_1
      GROUP BY in_off_messages.src_uuid) AS anon_1,
     (SELECT anon_1.remote_uuid AS anon_1_remote_uuid,
             in_off_messages.msg_uuid AS in_off_messages_msg_uuid,
             in_off_messages.orig_src_uuid AS in_off_messages_orig_src_uuid,
             in_off_messages.src_uuid AS in_off_messages_src_uuid,
             in_off_messages.dst_uuid AS in_off_messages_dst_uuid,
             in_off_messages.msg_type AS in_off_messages_msg_type,
             in_off_messages.date_sent AS in_off_messages_date_sent,
             in_off_messages.content AS in_off_messages_content,
             in_off_messages.was_read AS in_off_messages_was_read
      FROM (SELECT in_off_messages.src_uuid AS remote_uuid
            FROM in_off_messages
            WHERE in_off_messages.dst_uuid = :dst_uuid_1
            GROUP BY in_off_messages.src_uuid) AS anon_1
      JOIN in_off_messages
        ON in_off_messages.src_uuid = anon_1.remote_uuid
       AND in_off_messages.dst_uuid = :dst_uuid_2) AS anon_2
Something doesn't look right to me in this SQL output, mostly because GROUP BY appears too many times. I would have expected it to show up once, but it shows up twice here.
My guess is that some parentheses ended up in the wrong place in the generated SQL. I also suspect that I did something wrong with alias(), though I'm not sure about it.
What should I do to get the wanted result (the same number of rows for "jn_in" and "incoming")?
After playing with the code for a while, I found a way to fix it.
The answer was eventually related to the alias().
In order to make this work, the second alias() (Of jn_in) should be omitted, like this:
in_uuid_set = \
    sa.select([in_off_messages.c.src_uuid.label('remote_uuid')])\
    .select_from(in_off_messages)\
    .where(in_off_messages.c.dst_uuid == local_uuid)\
    .group_by(in_off_messages.c.src_uuid)\
    .alias()

jn_in = in_uuid_set.join(in_off_messages,\
    and_(\
        in_off_messages.c.src_uuid == in_uuid_set.c.remote_uuid,\
        in_off_messages.c.dst_uuid == local_uuid,\
    ))
# <<< The alias() is gone >>>

incoming = sa.select([\
        in_off_messages.c.msg_uuid.label('msg_uuid'),\
        in_uuid_set.c.remote_uuid.label('remote_uuid'),\
        in_off_messages.c.msg_type.label('msg_type'),\
        in_off_messages.c.date_sent.label('date_sent'),\
        in_off_messages.c.content.label('content'),\
        in_off_messages.c.was_read.label('was_read'),\
        true().label('is_incoming')]
    )\
    .select_from(jn_in)
It seems, however, that the first alias() (of in_uuid_set) cannot be omitted. If I try to omit it, I get this error message:
E subquery in FROM must have an alias
E LINE 2: FROM (SELECT in_off_messages.src_uuid AS remote_uuid
E ^
E HINT: For example, FROM (SELECT ...) [AS] foo.
As a generalization: if you have a select that you want to use as a clause somewhere else, you probably want to alias() it; if you have a join that you want to use as a clause, you should not alias() it.
For the sake of completeness, I include here the resulting SQL of the new code:
SELECT in_off_messages.msg_uuid AS msg_uuid,
       anon_1.remote_uuid AS remote_uuid,
       in_off_messages.msg_type AS msg_type,
       in_off_messages.date_sent AS date_sent,
       in_off_messages.content AS content,
       in_off_messages.was_read AS was_read,
       1 AS is_incoming
FROM (SELECT in_off_messages.src_uuid AS remote_uuid
      FROM in_off_messages
      WHERE in_off_messages.dst_uuid = :dst_uuid_1
      GROUP BY in_off_messages.src_uuid) AS anon_1
JOIN in_off_messages
  ON in_off_messages.src_uuid = anon_1.remote_uuid
 AND in_off_messages.dst_uuid = :dst_uuid_2
Much shorter than the one in the question.
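The extra rows in the question come from the FROM list: the selected columns refer to in_off_messages and anon_1 directly rather than to the aliased join, so the statement ends up with three comma-separated FROM items and no condition linking them, which is a cross product. The sketch below reproduces both SQL shapes with the standard-library sqlite3 module instead of SQLAlchemy and PostgreSQL; the table is cut down to three columns and the sample rows are invented, so the counts differ from the 12 and 2 in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE in_off_messages (msg_uuid TEXT, src_uuid TEXT, dst_uuid TEXT)"
)
conn.executemany(
    "INSERT INTO in_off_messages VALUES (?, ?, ?)",
    [("m1", "A", "L"), ("m2", "A", "L"), ("m3", "B", "X")],  # invented rows
)

# The grouped subquery (anon_1 in the generated SQL); 'L' plays local_uuid.
anon_1 = """(SELECT src_uuid AS remote_uuid FROM in_off_messages
             WHERE dst_uuid = 'L' GROUP BY src_uuid)"""

# Shape of the first statement: three unrelated FROM items.
aliased_join = f"""
    SELECT m.msg_uuid, a1.remote_uuid
    FROM in_off_messages AS m,
         {anon_1} AS a1,
         (SELECT a.remote_uuid, j.msg_uuid
          FROM {anon_1} AS a
          JOIN in_off_messages AS j
            ON j.src_uuid = a.remote_uuid AND j.dst_uuid = 'L') AS a2
"""

# Shape of the fixed statement: one join, selected from directly.
plain_join = f"""
    SELECT m.msg_uuid, a1.remote_uuid
    FROM {anon_1} AS a1
    JOIN in_off_messages AS m
      ON m.src_uuid = a1.remote_uuid AND m.dst_uuid = 'L'
"""

rows_aliased = conn.execute(aliased_join).fetchall()
rows_plain = conn.execute(plain_join).fetchall()
print(len(rows_aliased), len(rows_plain))  # 6 2
```

Here the cross product is 3 table rows times 1 group times 2 joined rows, giving 6, while the plain join returns the 2 rows that were actually wanted.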
I'm working with a database that has a relationship that looks like:
class Source(Model):
    id = Identifier()

class SourceA(Source):
    source_id = ForeignKey('source.id', nullable=False, primary_key=True)
    name = Text(nullable=False)

class SourceB(Source):
    source_id = ForeignKey('source.id', nullable=False, primary_key=True)
    name = Text(nullable=False)

class SourceC(Source, ServerOptions):
    source_id = ForeignKey('source.id', nullable=False, primary_key=True)
    name = Text(nullable=False)
What I want to do is join all the tables (Source, SourceA, SourceB, SourceC) and then order_by name.
Sounds easy to me, but I've been banging my head on this for a while now and my head's starting to hurt. I'm also not very familiar with SQL or SQLAlchemy, so there's been a lot of browsing the docs, but to no avail. Maybe I'm just not seeing it. This seems to be close, albeit related to a newer version than what I have available (see versions below).
I feel close, not that that means anything. Here's my latest attempt, which seems good up until the order_by call.
Sources = [SourceA, SourceB, SourceC]
# list of join on Source
joins = [session.query(Source).join(source) for source in Sources]
# union the list of joins
query = joins.pop(0).union_all(*joins)
query seems right at this point as far as I can tell, i.e. query.all() works. So now I try to apply order_by, which doesn't throw an error until .all() is called.
Attempt 1: I just use the attribute I want
query.order_by('name').all()
# throws sqlalchemy.exc.ProgrammingError: (ProgrammingError) column "name" does not exist
Attempt 2: I just use the defined column attribute I want
query.order_by(SourceA.name).all()
# throws sqlalchemy.exc.ProgrammingError: (ProgrammingError) missing FROM-clause entry for table "SourceA"
Is it obvious? What am I missing? Thanks!
versions:
sqlalchemy.version = '0.8.1'
(PostgreSQL) 9.1.3
EDIT
I'm dealing with a framework that wants a handle to a query object. I have a bare query that appears to accomplish what I want but I would still need to wrap it in a query object. Not sure if that's possible. Googling ...
select = """
select s.*, a.name from Source s inner join SourceA a on s.id = a.Source_id
union
select s.*, b.name from Source s inner join SourceB b on s.id = b.Source_id
union
select s.*, c.name from Source s inner join SourceC c on s.id = c.Source_id
ORDER BY "name";
"""
selectText = text(select)
result = session.execute(selectText)
# how to put result into a query. maybe Query(selectText)? googling...
result.fetchall()
Assuming that the coalesce function is good enough, the examples below should point you in the right direction. One option builds the list of children automatically, while the other is explicit.
This is not the query you specified in your edit, but it does let you sort (your original request):
from sqlalchemy import func
from sqlalchemy.orm import with_polymorphic


def test_explicit():
    # specify all children tables to be queried
    Sources = [SourceA, SourceB, SourceC]
    AllSources = with_polymorphic(Source, Sources)
    name_col = func.coalesce(*(_s.name for _s in Sources)).label("name")
    query = session.query(AllSources).order_by(name_col)
    for x in query:
        print(x)


def test_implicit():
    # get all children tables in the query
    from sqlalchemy.orm import class_mapper
    _map = class_mapper(Source)
    Sources = [_smap.class_
               for _smap in _map.self_and_descendants
               if _smap != _map  # note: exclude base class, it has no `name`
               ]
    AllSources = with_polymorphic(Source, Sources)
    name_col = func.coalesce(*(_s.name for _s in Sources)).label("name")
    query = session.query(AllSources).order_by(name_col)
    for x in query:
        print(x)
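For anyone who wants to run the explicit variant, here is a self-contained sketch. It assumes SQLAlchemy 1.4 or newer with plain declarative joined-table inheritance; the kind discriminator column, the table names, and the sample rows are invented, since the question's models come from a framework that is not shown. It also reaches the subclass columns through the with_polymorphic entity (AllSources.SourceA.name) rather than through the classes directly, and only uses two subclasses to stay short.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base, with_polymorphic

Base = declarative_base()

class Source(Base):
    __tablename__ = "source"
    id = Column(Integer, primary_key=True)
    kind = Column(String)  # hypothetical discriminator column
    __mapper_args__ = {"polymorphic_on": kind, "polymorphic_identity": "source"}

class SourceA(Source):
    __tablename__ = "source_a"
    source_id = Column(Integer, ForeignKey("source.id"), primary_key=True)
    name = Column(String, nullable=False)
    __mapper_args__ = {"polymorphic_identity": "a"}

class SourceB(Source):
    __tablename__ = "source_b"
    source_id = Column(Integer, ForeignKey("source.id"), primary_key=True)
    name = Column(String, nullable=False)
    __mapper_args__ = {"polymorphic_identity": "b"}

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all(
        [SourceA(name="zeta"), SourceB(name="alpha"), SourceA(name="mid")]
    )
    session.commit()

    # One query over the base table, outer-joined to every listed subclass table.
    AllSources = with_polymorphic(Source, [SourceA, SourceB])
    # Each row has exactly one non-NULL name, so COALESCE picks that one.
    name_col = func.coalesce(AllSources.SourceA.name, AllSources.SourceB.name)
    query = session.query(AllSources).order_by(name_col)
    names = [obj.name for obj in query]
    print(names)
```

The query returns SourceA and SourceB instances mixed together, sorted by whichever name column is populated for each row.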
Your first attempt sounds like it isn't working because there is no name in Source, which is the root table of the query. In addition, there will be multiple name columns after your joins, so you will need to be more specific. Try
query.order_by('SourceA.name').all()
As for your second attempt, what is ServerA?
query.order_by(ServerA.name).all()
Probably a typo, but not sure if it's for SO or your code. Try:
query.order_by(SourceA.name).all()
I have the following SQL query (it gets the largest value of a certain column per group, with three columns to group by):
select p1.Name, p1.nvr, p1.Arch, d1.repo, p1.Date
from Packages as p1 inner join
     Distribution as d1
     on p1.rpm_id = d1.rpm_id inner join (
        select Name, Arch, repo, max(Date) as Date
        from Packages inner join Distribution
             on Packages.rpm_id = Distribution.rpm_id
        where Name like 'p%' and repo not like '%staging'
        group by Name, Arch, repo
     ) as sq
     on p1.Name = sq.Name and p1.Arch = sq.Arch and d1.repo = sq.repo and p1.Date = sq.Date
order by p1.nvr
And I'm trying to convert it to SQLAlchemy. This is what I have so far:
p1 = aliased(Packages)
d1 = aliased(Distribution)

sq = session.\
    query(
        Packages.Name,
        Packages.Arch,
        Distribution.repo,
        func.max(Packages.Date).label('Date')).\
    select_from(Packages).\
    join(Distribution).\
    filter(queryfilter).\
    filter(not_(Distribution.repo.like('%staging'))).\
    group_by(
        Packages.Name,
        Packages.Arch,
        Distribution.repo).subquery()

result = session.\
    query(p1, d1.repo).\
    select_from(p1).\
    join(d1).\
    join(
        sq,
        p1.Name==sq.c.Name,
        p1.Arch==sq.c.Arch,
        d1.repo==sq.c.repo,
        p1.Date==sq.c.Date).\
    order_by(p1.nvr).all()
The problem arises when I do the join on the subquery. I get an error stating that there is no FROM clause to join from. This is strange, because I specify one right after the subquery as an argument to the join function. Any idea what I'm doing wrong? Perhaps I need to alias something and do a select_from again?
EDIT: Exact error
Could not find a FROM clause to join from. Tried joining to SELECT "Packages"."Name", "Packages"."Arch", "Distribution".repo, max("Packages"."Date") AS "Date" FROM "Packages" JOIN "Distribution" ON "Packages".rpm_id = "Distribution".rpm_id WHERE "Packages"."Name" LIKE :Name_1 AND "Distribution".repo NOT LIKE :repo_1 GROUP BY "Packages"."Name", "Packages"."Arch", "Distribution".repo, but got: Can't find any foreign key relationships between 'Join object on %(139953254400272 Packages)s(139953254400272) and %(139953256322768 Distribution)s(139953256322768)' and '%(139953257005520 anon)s'.
It's trying to join, but it says it doesn't know where to make the join. Is there something wrong with my syntax? I think it's correct based on what's in the join function.
Apparently you need to add an and_() around multiple join conditions.
    join(
        sq,
        and_(p1.Name==sq.c.Name,
             p1.Arch==sq.c.Arch,
             d1.repo==sq.c.repo,
             p1.Date==sq.c.Date)).\
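The ON clause of a join is a single expression, which is why the separate conditions have to be combined into one with and_(). Below is a runnable sketch of the same greatest-per-group pattern, assuming SQLAlchemy 1.4 or newer and SQLite; it uses one simplified, hypothetical Package model instead of the Packages and Distribution tables, so the grouping is by name and arch only.

```python
from sqlalchemy import Column, Integer, String, and_, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# Hypothetical, simplified stand-in for the Packages table.
class Package(Base):
    __tablename__ = "packages"
    rpm_id = Column(Integer, primary_key=True)
    name = Column(String)
    arch = Column(String)
    date = Column(Integer)  # an integer stands in for a real timestamp

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Package(rpm_id=1, name="pkg", arch="x86", date=1),
        Package(rpm_id=2, name="pkg", arch="x86", date=5),
        Package(rpm_id=3, name="pkg", arch="arm", date=3),
        Package(rpm_id=4, name="pip", arch="x86", date=2),
    ])
    session.commit()

    # Latest date per (name, arch) group.
    sq = (
        session.query(
            Package.name, Package.arch, func.max(Package.date).label("date")
        )
        .group_by(Package.name, Package.arch)
        .subquery()
    )

    # All three conditions form ONE ON clause, combined with and_().
    newest = (
        session.query(Package)
        .join(sq, and_(Package.name == sq.c.name,
                       Package.arch == sq.c.arch,
                       Package.date == sq.c.date))
        .order_by(Package.rpm_id)
        .all()
    )
    newest_ids = [p.rpm_id for p in newest]
    print(newest_ids)
```

Row 1 is dropped because row 2 is a newer build of the same name and arch; the other rows are each the newest in their group.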
I need a little help.
I have the following query and I'm curious how to represent it in terms of sqlalchemy.orm. Currently I'm executing it with session.execute. It's not critical for me; I'm just curious. What I don't know is how to put a subquery in the FROM clause (nested view) without doing any join.
select g_o.group_ from (
    select distinct regexp_split_to_table(g.group_name, E',') group_
    from (
        select array_to_string(groups, ',') group_name
        from company
        where status='active'
        and array_to_string(groups, ',') like :term
        limit :limit
    ) g
) g_o
where g_o.group_ like :term
order by 1
limit :limit
I need the subquery because of a speed issue: without the limit in the innermost query, regexp_split_to_table starts to parse all the data and the limit is applied only after that. My table is huge and I cannot afford that.
If something is not clear, please ask and I'll do my best.
I presume this is PostgreSQL.
To create a subquery, use the subquery() method. The resulting object can be used as if it were a Table object. Here's how your query would look in SQLAlchemy:
subq1 = session.query(
    func.array_to_string(Company.groups, ',').label('group_name')
).filter(
    (Company.status == 'active') &
    (func.array_to_string(Company.groups, ',').like(term))
).limit(limit).subquery()

subq2 = session.query(
    func.regexp_split_to_table(subq1.c.group_name, ',')
        .distinct()
        .label('group')
).subquery()

q = session.query(subq2.c.group).\
    filter(subq2.c.group.like(term)).\
    order_by(subq2.c.group).\
    limit(limit)
However, you could avoid one subquery by using the unnest function instead of converting the array to a string with array_to_string and then splitting it with regexp_split_to_table:
subq = session.query(
    func.unnest(Company.groups).label('group')
).filter(
    (Company.status == 'active') &
    (func.array_to_string(Company.groups, ',').like(term))
).limit(limit).subquery()

q = session.query(subq.c.group.distinct()).\
    filter(subq.c.group.like(term)).\
    order_by(subq.c.group).\
    limit(limit)
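The subquery-in-FROM pattern itself does not depend on PostgreSQL arrays, so it can be tried on SQLite. This is a rough sketch, assuming SQLAlchemy 1.4 or newer; the Company model here has a plain text group_name column instead of an array, and the rows are invented. It only illustrates that the inner LIMIT is applied before the outer filter.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# Hypothetical stand-in: one group name per row instead of an array column.
class Company(Base):
    __tablename__ = "company"
    id = Column(Integer, primary_key=True)
    group_name = Column(String)
    status = Column(String)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Company(id=1, group_name="alpha", status="active"),
        Company(id=2, group_name="beta", status="active"),
        Company(id=3, group_name="apple", status="active"),
        Company(id=4, group_name="avocado", status="inactive"),
    ])
    session.commit()

    # Inner query: filtered and limited first, then turned into a FROM item.
    inner = (
        session.query(Company.group_name.label("group_"))
        .filter(Company.status == "active")
        .order_by(Company.id)
        .limit(2)
        .subquery()
    )

    # Outer query: selects only from the subquery, no join involved.
    q = (
        session.query(inner.c.group_)
        .filter(inner.c.group_.like("a%"))
        .order_by(inner.c.group_)
    )
    groups = [row[0] for row in q.all()]
    print(groups)
```

"apple" also matches the outer filter, but it never reaches it because the inner LIMIT 2 only lets "alpha" and "beta" through.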