I wrote some code to connect SQLAlchemy to ClickHouse. The problem is that when I print the query statement, the table name appears twice in it. What exactly does SQLAlchemy do that causes this? Has anyone had this problem before?
My code:
session.query(className)
query.statement:
select a,b from table_name, table_name
I'm trying to dump a pandas DataFrame into an existing Snowflake table (via a jupyter notebook).
When I run the code below no errors are raised, but no data is written to the destination SF table (df has ~800 rows).
import os

from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL

sf_engine = create_engine(
    URL(
        user=os.environ['SF_PROD_EID'],
        password=os.environ['SF_PROD_PWD'],
        account=account,
        warehouse=warehouse,
        database=database,
    )
)
df.to_sql(
    "test_table",
    con=sf_engine,
    schema=schema,
    if_exists="append",
    index=False,
    chunksize=16000,
)
If I check the SF History, I can see that the queries apparently ran without issue:
If I pull the query from the SF History UI and run it manually in the Snowflake UI the data shows up in the destination table.
If I try to use locopy I run into the same issue.
If the table does not exist beforehand, the same code above creates the table and inserts the rows without a problem.
Here's where it gets weird. When I run pd.to_sql to append and then drop the destination table, a subsequent select count(*) from destination_table still finds a table with that name, and it contains (only) the data that I've been trying to insert. Could this be a case-sensitive table naming situation?
Any insight is appreciated :)
Try adding role="<role>" and schema="<schema>" in URL.
engine = create_engine(URL(
    account=os.getenv("SNOWFLAKE_ACCOUNT"),
    user=os.getenv("SNOWFLAKE_USER"),
    password=os.getenv("SNOWFLAKE_PASSWORD"),
    role="<role>",
    warehouse="<warehouse>",
    database="<database>",
    schema="<schema>"
))
The issue was due to how I set up the database connection and the case-sensitivity of the table name. It turns out that I was writing to a table called DB.SCHEMA."db.schema.test_table" (note that the db.schema slug becomes part of the table name). Don't be like me, kids. Use upper-case table names in Snowflake!
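To illustrate the pitfall described above: a dotted string passed as the table name can end up as one quoted identifier instead of database, schema and table. A minimal sketch of one way to guard against that is to split the qualified name first and pass only the last part as the table name. The helper split_table_name is hypothetical (it is not part of pandas or the Snowflake connector) and it assumes the identifiers themselves contain no dots.

```python
def split_table_name(qualified_name):
    """Split 'db.schema.table' into (database, schema, table).

    Hypothetical helper: missing leading parts are returned as None.
    Assumes the identifiers themselves contain no dots.
    """
    parts = qualified_name.split(".")
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError("expected [[database.]schema.]table, got %r" % qualified_name)
    # Left-pad with None so the table name is always the last element.
    return tuple([None] * (3 - len(parts)) + parts)


database, schema, table = split_table_name("db.schema.test_table")
print(database, schema, table)
```

The bare table value would then go to the name argument of df.to_sql, with schema passed through the schema argument and the database set in the connection URL.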
I'm querying the latest entry from a table like this:
data = dbsession.query(db.mytable).order_by(db.mytable.timestamp.desc()).with_entities(db.mytable.timestamp).first()
On startup this is fine, but if new entries are added by the same dbsession during runtime, the query above doesn't pick them up.
But the following code without SQLAlchemy works as expected:
sql_query="SELECT timestamp FROM mytable ORDER BY timestamp DESC LIMIT 1"
data = cursor.execute(sql_query)
How do I get SQLAlchemy to work in this case?
I had a similar issue once. I don't recall exactly why SQLAlchemy behaves this way, but you need to commit the session before the select to refresh the data:
session.commit()
I have a query in a python script that creates a materialized view after some tables get created.
Script is something like this:
from sqlalchemy import create_engine, text
sql = '''CREATE MATERIALIZED VIEW schema1.view1 AS
SELECT t1.a,
t1.b,
t1.c,
t2.x AS d
FROM schema1.t1 t1
LEFT JOIN schema1.t2 t2 ON t1.f = t2.f
UNION ALL
SELECT t3.a,
t3.b,
t3.c,
t3.d
FROM schema1.t3 t3;'''
con=create_engine(db_conn)
con.execute(sql)
The query successfully executes when I run on the database directly.
But when running the script in python, I get an error:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.SyntaxError) syntax error at or near "CREATE MATERIALIZED VIEW schema"
I can't for the life of me figure out what it has an issue with - any ideas?
This was the weirdest thing. I had copied my query text into VS Code out of another tool that I use to navigate around my pg DB. The last part of the answer by @EOhm gave me the idea to just type the whole thing out in VS Code instead of copy/pasting.
And everything worked,
even though the pasted text and what I typed appear identical in every way. So apparently there was some invisible formatting causing this issue.
I don't know whether SQLAlchemy supports materialized view creation, but if it does, it should work similarly or be done with specific metadata functions (https://docs.sqlalchemy.org/en/13/core/schema.html).
The text function is designed for database-independent DML, not DDL. Maybe it works for DDL (I don't know about SQLAlchemy), but by design the syntax is different from what you would execute directly on the database, as SQLAlchemy is meant to abstract the details of databases from the user.
If SQLAlchemy does not offer a convenient way to do this and you nevertheless have valid reasons to use SQLAlchemy at all, you can just execute the plain SQL statement in the dialect the database backend understands. In other words, omit SQLAlchemy's text function for the SQL statement, like this:
from sqlalchemy import create_engine, text
sql = '''CREATE MATERIALIZED VIEW schema.view1 AS
SELECT t1.a,
t1.b,
t1.c,
t2.x AS d
FROM schema.t1 t1
LEFT JOIN schema.t2 t2 ON t1.f = t2.f
UNION ALL
SELECT t3.a,
t3.b,
t3.c,
t3.d
FROM schema.t3 t3;'''
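engine = create_engine(db_conn)
raw_conn = engine.raw_connection()
raw_conn.cursor().execute(sql)
raw_conn.commit()  # psycopg2 opens a transaction implicitly; without a commit the view is not persisted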
(But of course you then have to take care of the backend type yourself, as opposed to using SQLAlchemy-wrapped statements.)
I tested on my pg server without any issues using psycopg2 directly.
postgres=# create schema schema;
CREATE SCHEMA
postgres=# create table schema.t1 (a varchar, b varchar, c varchar, f integer);
CREATE TABLE
postgres=# create table schema.t2 (x varchar, f integer);
CREATE TABLE
postgres=# create table schema.t3 (a varchar, b varchar, c varchar, d varchar);
CREATE TABLE
postgres=# commit;
With the following script:
#!/usr/bin/python3
import psycopg2
conn = psycopg2.connect("dbname=postgres")
cur = conn.cursor()
cur.execute("""
CREATE MATERIALIZED VIEW schema.view1 AS
SELECT t1.a,
t1.b,
t1.c,
t2.x AS d
FROM schema.t1 t1
LEFT JOIN schema.t2 t2 ON t1.f = t2.f
UNION ALL
SELECT t3.a,
t3.b,
t3.c,
t3.d
FROM schema.t3 t3;
""")
conn.commit()
cur.close()
conn.close()
I tested with fairly current versions of Python (3.7 and 2.7), the psycopg2 module and the client libraries (I have the 11.5 pg client and psycopg2 2.8.3) from PGDG, installed on a fairly recent Linux. Can you try to execute directly on psycopg2 like I did?
Also, did you make sure your dots are plain ASCII dots, as all the other characters in the statement should be in this case? (Keep in mind that Unicode has invisible code points that can cause this sort of problem.) If you are on Python, you can try encoding your string to ASCII bytes and decoding it back to a Unicode string. If .encode('ASCII') does not raise an error, it should be clean.
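The ASCII check suggested above can be taken one step further by reporting where the offending characters are. The following is a minimal sketch using only the standard library. The function name find_non_ascii and the sample string with a non-breaking space are made up for illustration; the thread does not say which character was actually in the pasted query.

```python
import unicodedata


def find_non_ascii(sql):
    """Return (index, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, "U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(sql)
        if ord(ch) > 127
    ]


# Hypothetical pasted text: the space after CREATE is a non-breaking space.
pasted = "CREATE\u00a0MATERIALIZED VIEW schema1.view1 AS SELECT 1;"
typed = "CREATE MATERIALIZED VIEW schema1.view1 AS SELECT 1;"

for index, codepoint, name in find_non_ascii(pasted):
    print(index, codepoint, name)
```

An empty result for a statement means .encode('ASCII') would succeed on it as well.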
I am using SQLAlchemy to extract data from a SQL Server DB into a Pandas Dataframe:
q: Query = self._session(db).query(tbl_obj)
return pd.read_sql(
    str(q),
    db.conn()
)
tbl_obj is a SQLAlchemy Table object that has been autoloaded from an existing table in the DB.
My problem is that the query that's being created automatically aliases the column names to 'TABLE_NAME_COLUMN_NAME,' when I just want them to be 'COLUMN_NAME.'
I figure this is a fairly simple solution, but I haven't figured it out yet. Any thoughts?
Posting this because I figured it out while I was typing up the question. The problem was that I was calling str(q) when I should have been using q.statement.
This code works as expected, because the 'statement' attribute doesn't include the aliasing:
q: Query = self._session(db).query(tbl_obj)
return pd.read_sql(
    q.statement,
    db.conn()
)
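The difference between the two forms can be inspected without a database by printing both. This is a sketch only: the orders table is a hypothetical stand-in for the autoloaded tbl_obj, the session has no engine bound, and the exact SQL text may vary between SQLAlchemy versions.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.orm import Session

metadata = MetaData()
# Hypothetical table standing in for the autoloaded tbl_obj.
orders = Table(
    "orders",
    metadata,
    Column("order_id", Integer, primary_key=True),
    Column("customer", String),
)

session = Session()  # no engine bound; nothing is executed here
q = session.query(orders)

labeled_sql = str(q)          # Query stringification labels columns as table_column
plain_sql = str(q.statement)  # the underlying SELECT keeps the plain column names

print(labeled_sql)
print(plain_sql)
```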
I'm new to sqlalchemy and am wondering how to do a union of two tables that have the same columns. I'm doing the following:
table1_and_table2 = sql.union_all(self.tables['table1'].alias("table1_subquery").select(),
self.tables['table2'].alias("table2_subquery").select())
I'm seeing this error:
OperationalError: (OperationalError) (1248, 'Every derived table must have its own alias')
(Note that self.tables['table1'] returns a sqlalchemy Table with name table1.)
Can someone point out the error or suggest a better way to combine the rows from both tables?
First, can you output the SQL generated that creates the problem? You should be able to do this by setting echo=True in your create_engine statement.
Second, and this is just a hunch, try rearranging your subqueries to this:
table1_and_table2 = sql.union_all(self.tables['table1'].select().alias("table1_subquery"),
self.tables['table2'].select().alias("table2_subquery"))
If my hunch is right, it's creating aliases, then running a query, and the resulting query results are re-aliased and clashing.
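For comparison, here is a sketch of another way to combine the rows: build the union from plain selects and give a single alias to the combined result, only if it is going to be used as a subquery. The thread does not confirm that this resolves the 1248 error. The table definitions and the alias name are made up for illustration, and the statements are only compiled to strings, not executed.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, union_all

metadata = MetaData()
# Hypothetical tables with identical columns.
table1 = Table("table1", metadata, Column("id", Integer), Column("name", String))
table2 = Table("table2", metadata, Column("id", Integer), Column("name", String))

# Union of two plain SELECTs; neither side is aliased.
combined = union_all(table1.select(), table2.select())
combined_sql = str(combined)

# Alias the union once, then select from it as a derived table.
combined_alias = combined.alias("table1_and_table2")
outer_sql = str(combined_alias.select())

print(combined_sql)
print(outer_sql)
```

Setting echo=True on the engine, as suggested above, shows the same SQL at execution time.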