How to query multiple tables using join in sqlalchemy - python

select count(DISTINCT(a.cust_id)) as count ,b.code, b.name from table1 as a inner join table2 as b on a.par_id = b.id where a.data = "present" group by a.par_id order by b.name asc;
How can I write this in SQLAlchemy so that it returns the same results?
The query above is written in plain SQL; I need the equivalent SQLAlchemy expression.
Thanks for any input.

Hope this works...
from sqlalchemy import distinct, func

session.query(
    func.count(distinct(table1.cust_id)).label('count'),
    table2.code,
    table2.name
).join(
    table2,
    table1.par_id == table2.id
).filter(
    table1.data == "present"
).group_by(
    table1.par_id
).order_by(
    table2.name.asc()
).all()
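To try the query end to end, here is a minimal sketch against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or newer; the model classes `Table1` and `Table2`, their column types, and the sample rows are hypothetical stand-ins for the asker's real tables. It also adds an explicit `select_from()` so the left side of the join is unambiguous, which the answer above does not do.

```python
from sqlalchemy import (
    Column, ForeignKey, Integer, String, create_engine, distinct, func,
)
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Table2(Base):
    __tablename__ = "table2"
    id = Column(Integer, primary_key=True)
    code = Column(String)
    name = Column(String)


class Table1(Base):
    __tablename__ = "table1"
    id = Column(Integer, primary_key=True)
    cust_id = Column(Integer)
    par_id = Column(Integer, ForeignKey("table2.id"))
    data = Column(String)


# In-memory SQLite, so nothing needs to be installed or configured.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Made-up sample data: customer 100 appears twice under parent 1.
    session.add_all([
        Table2(id=1, code="A1", name="Alpha"),
        Table2(id=2, code="B1", name="Beta"),
        Table1(cust_id=100, par_id=1, data="present"),
        Table1(cust_id=100, par_id=1, data="present"),
        Table1(cust_id=101, par_id=1, data="present"),
        Table1(cust_id=102, par_id=2, data="present"),
        Table1(cust_id=103, par_id=2, data="absent"),
    ])
    session.commit()

    rows = (
        session.query(
            func.count(distinct(Table1.cust_id)).label("count"),
            Table2.code,
            Table2.name,
        )
        .select_from(Table1)  # explicit left side of the join
        .join(Table2, Table1.par_id == Table2.id)
        .filter(Table1.data == "present")
        .group_by(Table1.par_id)
        .order_by(Table2.name.asc())
        .all()
    )
    result = [tuple(row) for row in rows]

print(result)
```

Each result row holds the distinct customer count followed by the code and name of the joined `table2` row.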

Related

Using CTE in Python with Postgresql and psycopg2

I'm trying to create a query using CTE where I am creating 2 subtables and then the select statement. I believe the following syntax would work for full SQL, but it isn't working in this situation using psycopg2 in Python.
The idea is that I should be able to pull a query that shows the Name of all events (E.Event), the E.EDate, the E.ETemp and SmithTime. So it should have the full list of Events but the time column only shows times recorded for Smith (not in all Events).
query = ("""WITH cte AS (SELECT E.Event, O.Time AS "SmithTime"
FROM event E JOIN outcome O ON E.EventID = O.EventID
JOIN name N ON N.ID = O.ID
WHERE Name = 'Smith'),
WITH cte2 AS (SELECT E.Event, O.Time, E.EDate, E.ETemp
FROM event E JOIN outcome O ON E.EventID = O.EventID
JOIN name N ON N.ID = O.ID)
SELECT cte2.Event, cte2.EDate, cte2.ETemp, cte.SmithTime
FROM cte JOIN cte2 ON cte.Event = cte2.Event
ORDER BY 2 ASC""")
query = pd.read_sql(query, conn)
print(query)
This is just my latest iteration, I'm not sure what else to try. It is currently generating a DatabaseError:
DatabaseError: Execution failed on sql 'WITH cte AS (SELECT E.Event, O.Time AS "SmithTime"
FROM event E JOIN outcome O ON E.EventID = O.EventID
JOIN name N ON N.ID = O.ID
WHERE Name = 'Smith'),
WITH cte2 AS (SELECT E.Event, O.Time, E.EDate, E.ETemp
FROM event E JOIN outcome O ON E.EventID = O.EventID
JOIN name N ON N.ID = O.ID)
SELECT cte2.Event, cte2.EDate, cte2.ETemp, cte.SmithTime
FROM cte JOIN cte2 ON cte.Event = cte2.Event
ORDER BY 2 ASC': syntax error at or near "WITH"
LINE 6: WITH cte2 AS (SELECT E.Event, O.Time, E.EDate, E.ETemp
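I have no idea whether your current query is even logically correct, but we can get around the SQL error by inlining the common table expressions. Note that the alias was created as the quoted identifier "SmithTime", so PostgreSQL requires the same quoting when it is referenced:
SELECT cte2.Event, cte2.EDate, cte2.ETemp, cte."SmithTime"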
FROM (
SELECT E.Event, O.Time AS "SmithTime"
FROM event E
INNER JOIN outcome O ON E.EventID = O.EventID
INNER JOIN name N ON N.ID = O.ID
WHERE Name = 'Smith'
) cte
INNER JOIN (
SELECT E.Event, O.Time, E.EDate, E.ETemp
FROM event E
INNER JOIN outcome O ON E.EventID = O.EventID
INNER JOIN name N ON N.ID = O.ID
) cte2
ON cte.Event = cte2.Event
ORDER BY 2;
It's an SQL syntax error, nothing specific to psycopg2.
There's only one WITH in a CTE query. It should be WITH cte AS (...), cte2 AS (...) SELECT ..., not WITH cte AS (...), WITH cte2 AS (...) SELECT ....
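The single-WITH form can be checked with a small runnable sketch. It uses Python's built-in sqlite3 module instead of PostgreSQL and psycopg2 so that it runs without a server, and the table contents are made up for illustration. The alias SmithTime is left unquoted here to sidestep case-folding differences between databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data shaped like the tables in the question.
conn.executescript("""
CREATE TABLE event (EventID INTEGER, Event TEXT, EDate TEXT, ETemp INTEGER);
CREATE TABLE name (ID INTEGER, Name TEXT);
CREATE TABLE outcome (EventID INTEGER, ID INTEGER, Time REAL);
INSERT INTO event VALUES (1, '100m', '2020-01-01', 20), (2, '200m', '2020-01-02', 22);
INSERT INTO name VALUES (1, 'Smith'), (2, 'Jones');
INSERT INTO outcome VALUES (1, 1, 10.5), (2, 2, 21.9);
""")

# One WITH keyword, then the CTEs separated by commas.
query = """
WITH cte AS (
    SELECT E.Event, O.Time AS SmithTime
    FROM event E
    JOIN outcome O ON E.EventID = O.EventID
    JOIN name N ON N.ID = O.ID
    WHERE N.Name = 'Smith'
),
cte2 AS (
    SELECT E.Event, O.Time, E.EDate, E.ETemp
    FROM event E
    JOIN outcome O ON E.EventID = O.EventID
    JOIN name N ON N.ID = O.ID
)
SELECT cte2.Event, cte2.EDate, cte2.ETemp, cte.SmithTime
FROM cte JOIN cte2 ON cte.Event = cte2.Event
ORDER BY 2 ASC
"""

rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

With an inner join between the two CTEs, only events that Smith took part in come back. If every event should be listed, a LEFT JOIN from cte2 to cte is probably closer to what the question describes.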

Python filter one list based on values that do not exist in another list

Trying to filter results of a query on a Table A by 2 values not found in a Table B. What would be the proper syntax and approach?
import pyodbc
MDB = 'C:/db/db1.mdb'; DRV = '{Microsoft Access Driver (*.mdb)}'; PWD = 'pw'
con = pyodbc.connect('DRIVER={};DBQ={};PWD={}'.format(DRV,MDB,PWD))
cur = con.cursor()
SQLA = 'SELECT * FROM TABLE1;' # your query goes here
SQLB = 'SELECT * FROM TABLE2;' # your query goes here
rows1 = cur.execute(SQLA).fetchall()
rows2 = cur.execute(SQLB).fetchall()
cur.close()
con.close()
for rit in rows1:
    for git in rows2:
        if (rit[1] and rit[2]) not in (git[1] and git[2]):
            print ((rit[1]) (rit[2]))
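Simply use a pure SQL solution with the familiar LEFT JOIN... IS NULL / NOT EXISTS / NOT IN patterns. The queries below all run in MS Access and return rows in TableA that have no counterpart in TableB based on col1 and col2. The first two are equivalent; the NOT IN version tests each column separately, so it only returns rows whose col1 and col2 values both appear nowhere in TableB.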
LEFT JOIN...IS NULL
SELECT a.*
FROM TABLEA a
LEFT JOIN TABLEB b
ON a.col1 = b.col1 AND a.col2 = b.col2
WHERE b.col1 IS NULL AND b.col2 IS NULL
NOT EXISTS
SELECT a.*
FROM TABLEA a
WHERE NOT EXISTS
(SELECT 1 FROM TABLEB b
WHERE a.col1 = b.col1 AND a.col2 = b.col2)
NOT IN
SELECT a.*
FROM TABLEA a
WHERE a.col1 NOT IN (SELECT col1 FROM TABLEB)
AND a.col2 NOT IN (SELECT col2 FROM TABLEB)
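The three patterns can be compared on a toy dataset. The sketch below uses Python's built-in sqlite3 module rather than MS Access and pyodbc, and the rows are invented for illustration. It also shows that checking col1 and col2 with two separate NOT IN subqueries is stricter than matching on the pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: only the pair ('a', 'x') exists in both tables.
conn.executescript("""
CREATE TABLE TABLEA (id INTEGER, col1 TEXT, col2 TEXT);
CREATE TABLE TABLEB (id INTEGER, col1 TEXT, col2 TEXT);
INSERT INTO TABLEA VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z');
INSERT INTO TABLEB VALUES (10, 'a', 'x'), (11, 'c', 'q');
""")

left_join = conn.execute("""
    SELECT a.*
    FROM TABLEA a
    LEFT JOIN TABLEB b
      ON a.col1 = b.col1 AND a.col2 = b.col2
    WHERE b.col1 IS NULL AND b.col2 IS NULL
    ORDER BY a.id
""").fetchall()

not_exists = conn.execute("""
    SELECT a.*
    FROM TABLEA a
    WHERE NOT EXISTS
      (SELECT 1 FROM TABLEB b
       WHERE a.col1 = b.col1 AND a.col2 = b.col2)
    ORDER BY a.id
""").fetchall()

# Each column is tested on its own, so row 3 is dropped because
# 'c' occurs somewhere in TABLEB.col1, even though ('c', 'z') does not.
not_in = conn.execute("""
    SELECT a.*
    FROM TABLEA a
    WHERE a.col1 NOT IN (SELECT col1 FROM TABLEB)
      AND a.col2 NOT IN (SELECT col2 FROM TABLEB)
    ORDER BY a.id
""").fetchall()

print(left_join, not_exists, not_in)
conn.close()
```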
The SQL statements offered by Parfait are the preferred solution, but if you really wanted to use your double-loop approach it would need to be more like this:
for rit in rows1:
    match_found = False
    for git in rows2:
        if (rit[1] == git[1]) and (rit[2] == git[2]):
            match_found = True
            break
    if not match_found:
        print(rit)
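If the filtering does have to happen in Python, a set of key tuples avoids the nested loop. This is a minimal sketch using plain tuples as stand-ins for the pyodbc rows; the sample values are hypothetical, and it assumes the compared columns sit at index 1 and 2 as in the question.

```python
# Stand-ins for the results of the two fetchall() calls.
rows1 = [(1, "a", "x"), (2, "b", "y"), (3, "c", "z")]
rows2 = [(10, "a", "x"), (11, "c", "q")]

# Collect every (col1, col2) pair present in the second result set.
seen = {(row[1], row[2]) for row in rows2}

# Keep only the rows from the first result set whose pair is absent.
unmatched = [row for row in rows1 if (row[1], row[2]) not in seen]

for row in unmatched:
    print(row)
```

Set membership is a constant-time lookup on average, so this scales with the sum of the two row counts rather than their product.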

Sqlalchemy: subquery in FROM must have an alias

How can I structure this sqlalchemy query so that it does the right thing?
I've given everything I can think of an alias, but I'm still getting:
ProgrammingError: (psycopg2.ProgrammingError) subquery in FROM must have an alias
LINE 4: FROM (SELECT foo.id AS foo_id, foo.version AS ...
Also, as IMSoP pointed out, it seems to be trying to turn it into a cross join, but I just want it to join a table with a group by subquery on that same table.
Here is the sqlalchemy:
(Note: I've rewritten it to be a standalone file that is as complete as possible and can be run from a python shell)
from sqlalchemy import create_engine, func, select
from sqlalchemy import Column, BigInteger, DateTime, Integer, String, SmallInteger
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
engine = create_engine('postgresql://postgres:########localhost:5435/foo1234')
session = sessionmaker()
session.configure(bind=engine)
session = session()
Base = declarative_base()
class Foo(Base):
    __tablename__ = 'foo'
    __table_args__ = {'schema': 'public'}
    id = Column('id', BigInteger, primary_key=True)
    time = Column('time', DateTime(timezone=True))
    version = Column('version', String)
    revision = Column('revision', SmallInteger)
foo_max_time_q = select([
func.max(Foo.time).label('foo_max_time'),
Foo.id.label('foo_id')
]).group_by(Foo.id
).alias('foo_max_time_q')
foo_q = select([
Foo.id.label('foo_id'),
Foo.version.label('foo_version'),
Foo.revision.label('foo_revision'),
foo_max_time_q.c.foo_max_time.label('foo_max_time')
]).join(foo_max_time_q, foo_max_time_q.c.foo_id == Foo.id
).alias('foo_q')
thing = session.query(foo_q).all()
print thing
generated sql:
SELECT foo_id AS foo_id,
foo_version AS foo_version,
foo_revision AS foo_revision,
foo_max_time AS foo_max_time,
foo_max_time_q.foo_max_time AS foo_max_time_q_foo_max_time,
foo_max_time_q.foo_id AS foo_max_time_q_foo_id
FROM (SELECT id AS foo_id,
version AS foo_version,
revision AS foo_revision,
foo_max_time_q.foo_max_time AS foo_max_time
FROM (SELECT max(time) AS foo_max_time,
id AS foo_id GROUP BY id
) AS foo_max_time_q)
JOIN (SELECT max(time) AS foo_max_time,
id AS foo_id GROUP BY id
) AS foo_max_time_q
ON foo_max_time_q.foo_id = id
and here is the toy table:
CREATE TABLE foo (
id bigint ,
time timestamp with time zone,
version character varying(32),
revision smallint
);
The SQL I was expecting to get (the desired SQL) would be something like this:
SELECT foo.id AS foo_id,
foo.version AS foo_version,
foo.revision AS foo_revision,
foo_max_time_q.foo_max_time AS foo_max_time
FROM foo
JOIN (SELECT max(time) AS foo_max_time,
id AS foo_id GROUP BY id
) AS foo_max_time_q
ON foo_max_time_q.foo_id = foo.id
Final note:
I'm hoping to get an answer using select() instead of session.query() if possible. Thank you
You are almost there. Make a "selectable" subquery and join it with the main query via join():
foo_max_time_q = select([func.max(Foo.time).label('foo_max_time'),
Foo.id.label('foo_id')
]).group_by(Foo.id
).alias("foo_max_time_q")
foo_q = session.query(
Foo.id.label('foo_id'),
Foo.version.label('foo_version'),
Foo.revision.label('foo_revision'),
foo_max_time_q.c.foo_max_time.label('foo_max_time')
).join(foo_max_time_q,
foo_max_time_q.c.foo_id == Foo.id)
print(str(foo_q))
Prints (prettified manually):
SELECT
foo.id AS foo_id,
foo.version AS foo_version,
foo.revision AS foo_revision,
foo_max_time_q.foo_max_time AS foo_max_time
FROM
foo
JOIN
(SELECT
max(foo.time) AS foo_max_time,
foo.id AS foo_id
FROM
foo
GROUP BY foo.id) AS foo_max_time_q
ON
foo_max_time_q.foo_id = foo.id
The complete working code is available in this gist.
Cause
subquery in FROM must have an alias
This error means the subquery (on which we're trying to perform a join) has no alias.
Even if we call .alias('t') on it just to satisfy this requirement, we then get the next error:
missing FROM-clause entry for table "foo"
That's because the join's ON clause (... == Foo.id) refers to Foo, which is not part of the join.
The join only knows its "left" and "right" sides: t (the subquery) and foo_max_time_q.
Solution
Instead, select_from a join of Foo and foo_max_time_q.
Method 1
Replace .join(B, on_clause) with .select_from(B.join(A, on_clause)), that is, change
]).join(foo_max_time_q, foo_max_time_q.c.foo_id == Foo.id
to
]).select_from(foo_max_time_q.join(Foo, foo_max_time_q.c.foo_id == Foo.id)
This works here because A INNER JOIN B is equivalent to B INNER JOIN A.
Method 2
To preserve the order of joined tables:
from sqlalchemy import join
and replace .join(B, on_clause) with .select_from(join(A, B, on_clause)), that is, change
]).join(foo_max_time_q, foo_max_time_q.c.foo_id == Foo.id
to
]).select_from(join(Foo, foo_max_time_q, foo_max_time_q.c.foo_id == Foo.id)
Alternatives to session.query() can be found here.
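For reference, Method 2 can be sketched as a self-contained select() that only compiles to a string, so no database is needed. This assumes SQLAlchemy 1.4 or newer, where select() takes columns positionally and .subquery() names the derived table. It also swaps the declarative Foo class for a Core Table named foo to keep the sketch short; the schema argument is omitted.

```python
from sqlalchemy import (
    BigInteger, Column, DateTime, MetaData, SmallInteger, String, Table,
    func, join, select,
)

metadata = MetaData()

# Core stand-in for the Foo model from the question.
foo = Table(
    "foo", metadata,
    Column("id", BigInteger, primary_key=True),
    Column("time", DateTime(timezone=True)),
    Column("version", String),
    Column("revision", SmallInteger),
)

# Latest time per id, as a named subquery.
foo_max_time_q = (
    select(
        func.max(foo.c.time).label("foo_max_time"),
        foo.c.id.label("foo_id"),
    )
    .group_by(foo.c.id)
    .subquery("foo_max_time_q")
)

# Select from an explicit join of the table and the subquery.
foo_q = select(
    foo.c.id.label("foo_id"),
    foo.c.version.label("foo_version"),
    foo.c.revision.label("foo_revision"),
    foo_max_time_q.c.foo_max_time,
).select_from(
    join(foo, foo_max_time_q, foo_max_time_q.c.foo_id == foo.c.id)
)

sql = str(foo_q)
print(sql)
```

Because the FROM clause is given as one explicit join, the statement has a single FROM entry and the subquery keeps its alias.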

sqlalchemy exists() - how to avoid extra From

An exists() that contains another exists() results in an extra FROM entry.
model.session.query(Table1.id).\
filter(~ exists().\
where(Table2.table1_id==Table1.id).\
where(~ exists().\
where(Table3.contract_id==Table2.contract_id).\
where(Table3.session_id==Table1.session_id))
)
This generates:
SELECT table1.id AS table1_id FROM table1
WHERE NOT (EXISTS (SELECT * FROM table2
WHERE table2.table1_id = table1.id
AND NOT (EXISTS (SELECT * FROM table3, table1
WHERE table3.contract_id = table2.contract_id
AND table3.session_id = table1.session_id))))
Here, "FROM table1" in the last "exists" is not required because table1 is already in the topmost query. How can I force sqlalchemy not to add this extra "FROM table1"?
What I really want is:
SELECT table1.id AS table1_id FROM table1
WHERE NOT (EXISTS (SELECT * FROM table2
WHERE table2.table1_id = table1.id
AND NOT (EXISTS (SELECT * FROM table3
WHERE table3.contract_id = table2.contract_id
AND table3.session_id = table1.session_id))))
I wonder how to achieve that.
Can somebody help me please?
Using SQLAlchemy 0.7.9.
q = (session.query(Table1.id)
.filter(~exists(
select([Table2.id])
.where(Table2.table1_id == Table1.id)
.where(~exists(
# changing exists to be implicit enables the 'important' below
select([Table3.id])
.where(Table3.contract_id == Table2.contract_id)
.where(Table3.session_id == Table1.session_id)
# this is important
.correlate(Table1)
.correlate(Table2)
))
)))

Update a Joined Table with SQLAlchemy Core

I have a MySQL db with tables set up like this:
Table1    Table2
------    ------
id        id, fk to Table1.id
name      name
I want to update Table1 and set Table1.id = Table2.id if Table1.name = Table2.name. Or, in SQL:
UPDATE table1 t1
INNER JOIN table2 t2
ON t1.name = t2.name
SET t1.id = t2.id;
How can I accomplish an equivalent statement using the SQLAlchemy Core API?
I can call table1.join(table2, table1.c.name == table2.c.name) to create the join, but how can I update this joined table?
upd = table1.update()\
.values(id=table2.c.id)\
.where(table1.c.name == table2.c.name)
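should do it: on MySQL, SQLAlchemy renders a multi-table UPDATE when the values or WHERE clause refer to a second table. If you really have all those foreign keys, though, changing the id values this way might fail with constraint errors.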
