I have the following query:
select t2.col as tag, count(*)
from my_table t1 JOIN TABLE(SPLIT(tags,'/')) as t2
where t2.col != ''
group by tag
(TABLE(SPLIT(tags,'/')) creates a temporary table by splitting the tags field.)
The query works just fine when run directly on the database, but I'm having trouble building it with this join clause using SQLAlchemy.
How can I perform a join with a table that is created on the fly and uses functions that aren't defined in SQLAlchemy?
Thanks.
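One possible approach, as a sketch only: since TABLE(SPLIT(...)) has no built-in SQLAlchemy construct, the FROM clause can be passed through as a raw text() fragment inside an otherwise normal select(). This assumes SQLAlchemy 1.4+ and an existing session, and is untested against this particular database:
from sqlalchemy import select, func, text, literal_column

tag = literal_column("t2.col").label("tag")
stmt = (
    select(tag, func.count())
    # the table-valued function join is raw SQL; everything else is expression language
    .select_from(text("my_table t1 JOIN TABLE(SPLIT(tags,'/')) AS t2"))
    .where(literal_column("t2.col") != '')
    .group_by(literal_column("tag"))
)
rows = session.execute(stmt).fetchall()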
I'm currently executing this query in one process:
SELECT DISTINCT ON (c.api_key, worker_id) worker_id, c.api_key, a.updated_at, b.user_id, a.country
FROM TABLE_A a
INNER JOIN TABLE_B b ON (b.id = a.user)
INNER JOIN TABLE_C c ON (b.owner = c.id)
WHERE 1=1
AND a.platform = 'x'
AND a.country = 'y'
AND a.access_token is not NULL
ORDER BY c.api_key, worker_id, a.updated_at desc
I'm currently wrapping it with from sqlalchemy import text and then simply executing:
query_results = db.execute(query).fetchall()
list_dicts = [r._asdict() for r in query_results]
df = pd.DataFrame(list_dicts)
and it works, but I would really like to see if it's possible to have it in the other notation, like:
db.query(TABLE_A).filter().join()... etc
Yes, it's possible.
But the exact way to do it will depend on your SQLAlchemy version and how you've set up your SQLAlchemy project and models.
You may want to check out the SQLAlchemy ORM querying guide and the Expression Language Tutorial to see which one fits your case better.
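For this particular query, a rough ORM equivalent might look like the sketch below. It assumes mapped models TableA, TableB and TableC for TABLE_A/TABLE_B/TABLE_C, that worker_id lives on TableA, and a PostgreSQL backend (DISTINCT ON is PostgreSQL-specific); untested:
results = (
    db.query(
        TableA.worker_id,
        TableC.api_key,
        TableA.updated_at,
        TableB.user_id,
        TableA.country,
    )
    .join(TableB, TableB.id == TableA.user)
    .join(TableC, TableB.owner == TableC.id)
    .filter(
        TableA.platform == 'x',
        TableA.country == 'y',
        TableA.access_token.isnot(None),
    )
    .distinct(TableC.api_key, TableA.worker_id)  # rendered as DISTINCT ON by the PostgreSQL dialect
    .order_by(TableC.api_key, TableA.worker_id, TableA.updated_at.desc())
    .all()
)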
I am querying a variety of different tables in a MySQL database with SQLAlchemy, using raw SQL query code.
My issue right now is renaming some of the columns being joined. The queries are all coming into one dataframe.
SELECT *
FROM original
LEFT JOIN table1
on original.id = table1.t1key
LEFT JOIN table2
on original.id = table2.t2key
LEFT JOIN table3
on original.id = table3.t3key;
All I actually want to get from those tables is a single column added to my query. Each table has a column with the same name. My approach to using an alias has been as below,
table1.columnchange AS 'table1columnchange'
table2.columnchange AS 'table2columnchange'
table3.columnchange AS 'table3columnchange'
But every way I've tried to implement this ends up with annoying errors.
I am querying around 20 different tables as well, so using SELECT * at the beginning, while inefficient, is ideal for the sake of ease.
The output I'm looking for is a dataframe that has each of the columns I need in it (which I then want to filter and build models from in Python). I am fine with managing the query through SQLAlchemy into pandas; the aliasing is what is giving me grief right now.
Thanks in advance
You can use nested queries:
SELECT
original.column1 AS somename,
table1.column1 AS somename1,
table2.column1 AS somename2
FROM
(SELECT
id,
column1
FROM
original
) original
LEFT JOIN (
SELECT
t1key,
column1
FROM
table1
) table1 ON original.id = table1.t1key
LEFT JOIN (
SELECT
t2key,
column1
FROM
table2
) table2 ON original.id = table2.t2key
Note that the join keys (id, t1key, t2key) have to be selected inside each subquery so the ON clauses can reference them.
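Whichever form of the query you end up with, reading it into a DataFrame could then look something like this (the connection string is a placeholder):
import pandas as pd
import sqlalchemy

engine = sqlalchemy.create_engine("mysql+pymysql://user:password@host/dbname")  # placeholder credentials

query = """
SELECT
    original.column1 AS somename,
    table1.column1 AS somename1,
    table2.column1 AS somename2
FROM (SELECT id, column1 FROM original) original
LEFT JOIN (SELECT t1key, column1 FROM table1) table1 ON original.id = table1.t1key
LEFT JOIN (SELECT t2key, column1 FROM table2) table2 ON original.id = table2.t2key
"""

df = pd.read_sql_query(query, engine)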
I have two tables, Table A and Table B. I have added one column to Table A, record_id. Table B has record_id and the primary ID for Table A, table_a_id. I am looking to deprecate Table B.
Relationships exist between Table B's table_a_id and Table A's id, if that helps.
Currently, my solution is:
db.execute("UPDATE table_a t
SET record_id = b.record_id
FROM table_b b
WHERE t.id = b.table_a_id")
This is my first time using this ORM -- I'd like to see if there is a way I can use my Python models and the actual functions SQLAlchemy gives me to be more 'Pythonic' rather than just dumping a Postgres statement that I know works in an execute call.
My solution ended up being as follows:
(db.query(TableA)
.filter(TableA.id == TableB.table_a_id,
TableA.record_id.is_(None))
.update({TableA.record_id: TableB.record_id}, synchronize_session=False))
This leverages the ability of PostgreSQL to do updates based on implicit references to other tables, which I did in my .filter() call (this is analogous to a WHERE clause in a JOIN query). The solution was deceptively simple.
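For reference, roughly the same statement can also be written with the Core update() construct (SQLAlchemy 1.4+ style; an untested sketch using the same models), which PostgreSQL renders as an UPDATE ... FROM just like the raw version:
from sqlalchemy import update

stmt = (
    update(TableA)
    .where(TableA.id == TableB.table_a_id)
    .where(TableA.record_id.is_(None))
    .values(record_id=TableB.record_id)
)
db.execute(stmt)
db.commit()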
Can a pandas read_sql query handle a SQL script with multiple SELECT statements?
I have an MSSQL query that is performing different tasks, but I don't want to have to write an individual query for each case. I would like to write just the one query and pull in the multiple tables.
I want the multiple queries in the same script because the queries are related, and it makes updating the script easier.
For example:
SELECT ColumnX_1, ColumnX_2, ColumnX_3
FROM Table_X
INNER JOIN (Etc etc...)
----------------------
SELECT ColumnY_1, ColumnY_2, ColumnY_3
FROM Table_Y
INNER JOIN (Etc etc...)
Which leads to two separate query results.
The subsequent python code is:
scriptFile = open('.../SQL Queries/SQLScript.sql','r')
script = scriptFile.read()
engine = sqlalchemy.create_engine("mssql+pyodbc://UserName:PW!#Table")
connection = engine.connect()
df = pd.read_sql_query(script,connection)
connection.close()
Only the first table from the query is brought in.
Is there any way I can pull in both query results (maybe with a dictionary) that will prevent me from having to separate the query into multiple scripts?
You could do the following:
queries = """
SELECT ColumnX_1, ColumnX_2, ColumnX_3
FROM Table_X
INNER JOIN (Etc etc...)
---
SELECT ColumnY_1, ColumnY_2, ColumnY_3
FROM Table_Y
INNER JOIN (Etc etc...)
""".split("---")
Now you can query each table and concat the result:
df = pd.concat([pd.read_sql_query(q, connection) for q in queries])
Another option is to use UNION on the two results, i.e. do the concat in SQL.
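If you would rather keep the two result sets separate (the dictionary idea from the question) instead of concatenating them, a small variation is to key each DataFrame, for example:
# reuse the queries list from above; the keys here are just positional
dfs = {i: pd.read_sql_query(q, connection) for i, q in enumerate(queries)}
df_x = dfs[0]  # result of the Table_X query
df_y = dfs[1]  # result of the Table_Y query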
When using mclient it is possible to list all tables in the database by issuing the command '\d'. I'm using the python-monetdb package and I don't know how the same can be accomplished. I've seen examples like "SELECT * FROM TABLES;" but I get an error that the "tables" table does not exist.
In your query you need to specify that you are looking for the tables table that belongs to the default sys schema, or sys.tables. The SQL query that returns the names of all non-system tables in MonetDB is:
SELECT t.name FROM sys.tables t WHERE t.system=false
In Python this should look something like:
import monetdb.sql
connection = monetdb.sql.connect(username='<username>', password='<password>', hostname='<hostname>', port=50000, database='<database>')
cursor = connection.cursor()
cursor.execute('SELECT t.name FROM sys.tables t WHERE t.system=false')
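To actually pull the table names back into Python, you would then fetch the rows from the cursor, e.g.:
tables = [row[0] for row in cursor.fetchall()]
print(tables)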
If you are looking for tables only in a specific schema, you will need to extend your query, specifying the schema:
SELECT t.name FROM sys.tables t WHERE t.system=false AND t.schema_id IN (SELECT s.id FROM sys.schemas s WHERE name = '<schema-name>')
where the <schema-name> is your schema, surrounded by single quotes.