I'm fairly new to using sqlalchemy and having some issues generating the sql code that I am looking for.
Ultimately, I'm trying to join two different subsets of table2 to table1 by using the following SQL query:
SELECT table1.date, a1.id AS name1_id, a2.id AS name2_id
FROM table1
LEFT JOIN table2 as a1
ON table1.name1 = table2.label AND table2.lookup_id = 1000
LEFT JOIN table2 as a2
ON table1.name2 = table2.label AND table2.lookup_id = 2000
Here's what I have so far using sqlalchemy:
q_generate = (
select([table1.c.date,
a1.id.label('name1_id'),
a2.id.label('name2_id')])
.select_from(table1
.outerjoin(table2.alias(name='a1'),
and_(
table2.c.lookup_id == 1000,
table1.c.name1 == table2.c.label
))
.outerjoin(table2.alias(name='a2'),
and_(
table2.c.lookup_id == 2000,
table1.c.name2== table2.c.label
))
)
)
which produces the following errors:
*NameError: name 'a1' is not defined*
Is there a special way that aliased table names must be referenced? What am I missing here? I think the error has something to do with these lines but I can't figure out how exactly to get this to work:
...
a1.id.label('name1_id'),
a2.id.label('name2_id')])
...
Thank you!
Yes, do this:
a1 = table2.alias(name='a1')
a2 = table2.alias(name='a2')
q_generate = (
select([table1.c.date,
a1.c.id.label('name1_id'),
a2.c.id.label('name2_id')])
.select_from(table1.outerjoin(a1, ...).outerjoin(a2, ...)))
Related
I have 2 queries which both get a subset of a tables, let's call them Table1 and Table2
I would like to join these 2 child tables on their id.
I tried something like this, but it throws unhelpful errors:
table1: List[Table1] = db.session.execute(query1)
table2: List[Table2] = db.session.execute(query2)
db.session.query(table1).join(table2, table1.id == table2.id).all()
You have to join the tables before executing the query.
Maybe you wanted to do like this?
results = db.session.query(Table1).filter(
{whatever the query1 does},
{whatever the query2 does}
).join(Table2, Table1.id==Table2.id)
I have a two tables,
Here is my first table,
ID SUBST_ID CREATED_ID
1 031938 TEST123
2 930111 COOL123
3 000391 THIS109
4 039301 BRO1011
5 123456 COOL938
... ... ...
This is my second table,
ID SERIAL_ID BRANCH_ID
1 039301 NULL
2 000391 NULL
3 123456 NULL
... ... ...
I need to some how update all rows within my second table using data from my first table.
It would need to do this all in one update query.
Both SUBST_ID and SERIAL_ID match, it needs to grab the created_id from the first table and insert it into the second table.
So the second table would become the following,
ID SERIAL_ID BRANCH_ID
1 039301 BRO1011
2 000391 THIS109
3 123456 COOL938
... ... ...
Thank you for your help and guidance.
UPDATE TABLE2
JOIN TABLE1
ON TABLE2.SERIAL_ID = TABLE1.SUBST_ID
SET TABLE2.BRANCH_ID = TABLE1.CREATED_ID;
In addition to Tom's answer if you need to repeat the operation frequently and want to save time you can do:
UPDATE TABLE1
JOIN TABLE2
ON TABLE1.SUBST_ID = TABLE2.SERIAL_ID
SET TABLE2.BRANCH_ID = TABLE1.CREATED_ID
WHERE TABLE2.BRANCH_ID IS NULL
UPDATE TABLE2
JOIN TABLE1
ON TABLE1.SUBST_ID = TABLE2.SERIAL_ID
SET TABLE2.BRANCH_ID = TABLE1.CREATED_ID
WHERE TABLE2.BRANCH_ID IS NULL or TABLE2.BRANCH_ID='';
I think this should work
UPDATE secondTable
JOIN firsTable ON secondTable.SERIAL_ID = firsTable.SUBST_ID
SET BRANCH_ID = CREATED_ID
It is very simple to update using Inner join query in SQL .You can do it without using FROM clause. Here is an example :
UPDATE customer_table c
INNER JOIN
employee_table e
ON (c.city_id = e.city_id)
SET c.active = "Yes"
WHERE c.city = "New york";
Using INNER JOIN:
UPDATE TABLE1
INNER JOIN TABLE2 ON TABLE1.SUBST_ID = TABLE2.SERIAL_ID
SET TABLE2.BRANCH_ID = TABLE1.CREATED_ID;
Another alternative solution like below: Here I am using WHERE clause instead of JOIN
UPDATE
TABLE1,
TABLE2
WHERE
TABLE1.SUBST_ID = TABLE2.SERIAL_ID
SET
TABLE2.BRANCH_ID = TABLE1.CREATED_ID;
You can use this too:
update TABLE1 set BRANCH_ID = ( select BRANCH_ID from TABLE2 where TABLE1.SUBST_ID = TABLE2.SERIAL_ID)
but with my experience I can say that this way is so slow and not recommend it!
This is a problem that took me a long time to solve, and I wanted to share my solution. Here's the problem.
We have 2 pandas DataFrames that need to be outer joined on a very complex condition. Here was mine:
condition_statement = """
ON (
A.var0 = B.var0
OR (
A.var1 = B.var1
AND (
A.var2 = B.var2
OR A.var3 = B.var3
OR A.var4 = B.var4
OR A.var5 = B.var5
OR A.var6 = B.var6
OR A.var7 = B.var7
OR (
A.var8 = B.var8
AND A.var9 = B.var9
)
)
)
)
"""
Doing this in pandas would be a nightmare.
I like to do most of my DataFrame massaging with the pandasql package. It lets you run SQL queries on top of the DataFrames in your local environment.
The problem with pandasql is it runs on a SQLite engine, so you can't do RIGHT or FULL OUTER joins.
So how do you approach this problem?
Well you can achieve a FULL OUTER join with two LEFT joins, a condition, and a UNION.
First, declare a snippet with the columns you want to retrieve:
select_statement = """
SELECT
A.var0
, B.var1
, COALESCE(A.var2, B.var2) as var2
"""
Next, build a condition that represents all values in A being NULL. I built mine using the columns in my DataFrame:
where_a_is_null_statement = f"""
WHERE
{" AND ".join(["A." + col + " is NULL" for col in A.columns])}
"""
Now, do the 2-LEFT-JOIN-with-a-UNION trick using all of these snippets:
sqldf(f"""
{select_statement}
FROM A
LEFT JOIN B
{condition_statement}
UNION
{select_statement}
FROM B
LEFT JOIN A
{condition_statement}
{where_a_is_null_statement}
""")
I'm trying to formulate a SQLAlchemy query that uses a CTE to build a table-like structure of an input list of tuples, and JOIN it with one of my tables (backend DB is Postgres). Conceptually, it would look like:
WITH to_compare AS (
SELECT * FROM (
VALUES
(1, 'flimflam'),
(2, 'fimblefamble'),
(3, 'pigglywiggly'),
(4, 'beepboop')
-- repeat for a couple dozen or hundred rows
) AS t (field1, field2)
)
SELECT b.field1, b.field2, b.field3
FROM my_model b
JOIN to_compare c ON (c.field1 = b.field1) AND (c.field2 = b.field2)
The goal is to see what field3 for the pair (field1, field2) in the table if it is, for a medium-sized list of (field1, field2) pairs.
In SQLAlchemy I'm trying to do it like this:
stmts = [
sa.select(
[
sa.cast(sa.literal(field1), sa.Integer).label("field1"),
sa.cast(sa.literal(field2), sa.Text).label("field2"),
]
)
if idx == 0
else sa.select([sa.literal(field1), sa.literal(field2)])
for idx, (field1, field2) in enumerate(list_of_tuples)
]
cte = sa.union_all(*stmts).cte(name="temporary_table")
already_in_db_query = db.session.query(MyModel)\
.join(cte,
cte.c.field1 == MyModel.field1,
cte.c.field2 == MyModel.field2,
).all()
But it seems like CTEs and JOINs don't play well together: the error is on the join, saying:
sqlalchemy.exc.InvalidRequestError: Don't know how to join to ; please use an ON clause to more clearly establish the left side of this join
And if I try to print the cte, it does look like a non-SQL entity:
$ from pprint import pformat
$ print(pformat(str(cte)), flush=True)
> ''
Is there a way to do this? Or a better way to achieve my goal?
The second argument to Query.join() should in this case be the full ON clause, but instead you pass 3 arguments to join(). Use and_() to combine the predicates, as is done in the raw SQL:
already_in_db_query = db.session.query(MyModel)\
.join(cte,
and_(cte.c.field1 == MyModel.field1,
cte.c.field2 == MyModel.field2),
).all()
I'm trying to understand the following code snip:
improt pandasql
data_sql = data[['account_id', 'id', 'date', 'amount']]
# data_sql is a table has the above columns
data_sql.loc[:, 'date_hist_min'] = data_sql.date.apply(lambda x: x + pd.DateOffset(months=-6))
# add one more column, 'date_hist_min', it is from the column 'data' with the month minus 6
sqlcode = '''
SELECT t1.id,
t1.date,
t2.account_id as "account_id_hist",
t2.date as "date_hist",
t2.amount as "amount_hist"
FROM data_sql as t1 JOIN data_sql as t2
ON (cast(strftime('%s', t2.date) as integer) BETWEEN
(cast(strftime('%s', t1.date_hist_min) as integer))
AND (cast(strftime('%s', t1.date) as integer)))
AND (t1.{0} == t2.{0})
'''
# perform the SQL query on the table with sqlcode:
newdf = pandasql.sqldf(sqlcode.format(column), locals())
The code is with Python pandasql. It manipulates dataframe as SQL table. You can assume
the above dataframe as SQL table.
The definition of the table is in the comments.
What's the meaning of t1.{0} == t2.{0} ? What does {0} stand for in the context?
sqlcode.format(column) is going format the string and inject the columns into {0}
The 0 means format will use the first parameter.
print("This {1} a {0}".format("string", "is")) would print "This is a string"