Error while deleting duplicates from table using sqlalchemy python - python

I am trying to delete duplicates from a Redshift table.
My code:
from sqlalchemy import create_engine
# A long string that contains the necessary Postgres login information
postgres_str = f'postgresql://{redshift_username}:{redshift_password}@{redshift_address}:{redshift_port}/{redshift_dbname}'
# Create the connection
cnx = create_engine(postgres_str)
delstatmt = '''WITH CTE AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY org_country_code,dest_country_code,postcode,zone,kg,value,carrier,version
ORDER BY org_country_code,dest_country_code,postcode,zone,kg,value,carrier,version ) AS RN
FROM d.axis
)
DELETE FROM d.axis transformed WHERE RN<>1'''
cnx.execute(delstatmt)
Error
ProgrammingError: (psycopg2.errors.SyntaxError) syntax error at or near "DELETE"
LINE 8: DELETE FROM d.axis transformed WHERE ...
What is wrong with my code? Any help is appreciated.

Related

Add and then query temp table from pandas with Snowflake python connector

I am trying to create a temporary table from a pandas DataFrame and then use it in a SQL statement:
import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas
with snowflake.connector.connect(
    account='snoflakewebsite',
    user='username',
    authenticator='externalbrowser',
    database='db',
    schema='schema'
) as con:
    success, nchunks, nrows, _ = write_pandas(
        conn=con,
        df=df,
        table_name='temp_table',
        auto_create_table=True,
        table_type='temporary',
        overwrite=True,
        database='db',
        schema='schema'
    )
    cur = con.cursor()
    cur.execute('select * from temp_table')
The error I get:
ProgrammingError: 002003 (42S02): SQL compilation error:
Object 'TEMP_TABLE' does not exist or not authorized.
write_pandas() creates the table with the letter case exactly as it is passed in table_name=, whereas cur.execute() sends the whole query string to Snowflake, and Snowflake upper-cases object names unless they are written in double quotes. The query therefore looks for TEMP_TABLE, while the table that exists is temp_table.
So either create the table with an upper-case name, table_name='TEMP_TABLE',
or query it using double quotes:
cur.execute('select * from "temp_table"')
With the second option, the table keeps its lower-case name, and you always need double quotes to refer to it.
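The case rule described above can be illustrated without a Snowflake connection. The following is only a simplified pure-Python sketch of the folding behaviour, not Snowflake's implementation; the helper name resolve_identifier is hypothetical.

```python
def resolve_identifier(name: str) -> str:
    """Simplified model of how an identifier in SQL text is resolved."""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Double-quoted identifier: case is preserved exactly.
        return name[1:-1]
    # Unquoted identifier: folded to upper case.
    return name.upper()


# Name of the table as created in the question (lower case).
stored_name = "temp_table"

# Unquoted reference resolves to TEMP_TABLE, which does not match.
unquoted_match = resolve_identifier("temp_table") == stored_name
# Quoted reference keeps its case and matches.
quoted_match = resolve_identifier('"temp_table"') == stored_name

print(unquoted_match, quoted_match)
```

In this model the unquoted lookup fails and the quoted one succeeds, which mirrors the "does not exist or not authorized" error and its fix.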

“Associated statement not prepared” caused by pypyodbc?

Error message: ('HY007', '[HY007][ODBC SQL Server Driver] Associated statement is not prepared')
I downloaded ODBC to better diagnose this error, based on other posts, but it is still throwing the same error.
What is the actual error here, and what is the way around it?
import requests
import pandas as pd
import pypyodbc
import matplotlib.pyplot as plt
Conn1 = pypyodbc.connect("Driver={SQL Server};" "Server = DESKTOP-KOOxxx;" "Database = Horsesxx;" "Trusted_Connection=yes;", autocommit=True)
Mycursor = Conn1.cursor()
Mycursor.execute('Drop table #temptable SELECT * into #temptable FROM (SELECT HorseName, DayCalender FROM horses WHERE Place = 1) AS T1 Inner Join (SELECT runnerName, day, WIN_ODDS_BSP FROM betfairdata) AS T3 ON T1.HorseName = T3.runnerName AND T1.DayCalender = T3.day SELECT WIN_ODDS_BSP FROM #temptable')
Conn1.commit()
This statement works within SQL but not within VS.
The statement also works if I remove the temp table components.

Error while trying to execute the query in Denodo using Python SQLAlchemy

I'm trying to get a table from Denodo using Python and the SQLAlchemy library. Here is my code:
from sqlalchemy import create_engine
import os
sql = """SELECT * FROM test_table LIMIT 10 """
engine = create_engine('mssql+pyodbc://DenodoODBC', encoding='utf-8')
con = engine.connect().connection
cursor = con.cursor()
cursor.execute(sql)
df = cursor.fetchall()
cursor.close()
con.close()
When I run it for the first time, I get the following error:
DBAPIError: (pyodbc.Error) (' \x10#', "[ \x10#] ERROR: Function 'schema_name' with arity 0 not found\njava.sql.SQLException: Function 'schema_name' with arity 0 not found;\nError while executing the query (7) (SQLExecDirectW)")
[SQL: SELECT schema_name()]
I think the problem might be with create_engine, because when I run the code a second time without creating the engine again, everything is fine.
I hope somebody can explain to me what is going on. Thanks :)

"No results. Previous SQL was not a query" when trying to query DeltaDNA with Python

I'm currently trying to query a deltaDNA database. Their Direct SQL Access guide states that any PostgreSQL ODBC-compliant tool should be able to connect without issue. Using the guide, I set up an ODBC data source in Windows.
I have tried adding SET NOCOUNT ON, tried various formats for the connection string, and changed the table name to (account).(system).(tablename), all to no avail. The simple query works in Excel, and I have cross-referenced how Excel formats everything, so it is all the more strange that I get the "not a query" problem.
import pyodbc
conn_str = 'DSN=name'
query1 = 'select eventName from table_name limit 5'
conn = pyodbc.connect(conn_str)
conn.setdecoding(pyodbc.SQL_CHAR,encoding='utf-8')
query1_cursor = conn.cursor().execute(query1)
row = query1_cursor.fetchone()
print(row)
The result is: ProgrammingError: No results. Previous SQL was not a query.
Try it like this:
import pyodbc
conn_str = 'DSN=name'
query1 = 'select eventName from table_name limit 5'
conn = pyodbc.connect(conn_str)
conn.setdecoding(pyodbc.SQL_CHAR,encoding='utf-8')
query1_cursor = conn.cursor()
query1_cursor.execute(query1)
row = query1_cursor.fetchone()
print(row)
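The change is to create the cursor and execute the query in two separate statements instead of on the same line, so that query1_cursor is certain to refer to the cursor object that ran the query. Note that pyodbc documents Cursor.execute() as returning the cursor itself, so the chained form should normally behave the same way; if the error persists after this change, the cause probably lies elsewhere, such as in the driver or the DSN configuration.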
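The two-step pattern can be tried without an ODBC data source. The sketch below uses the standard-library sqlite3 module as a stand-in for pyodbc, which is an assumption made purely so the example runs anywhere; the events table and its rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for the ODBC data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (eventName TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO events (eventName) VALUES (?)",
    [("gameStarted",), ("levelUp",)],
)

# Step 1: create the cursor and keep a reference to it.
query_cursor = conn.cursor()
# Step 2: execute the query on that cursor; keep the return value to inspect it.
returned = query_cursor.execute(
    "SELECT eventName FROM events ORDER BY eventName LIMIT 5"
)

# Fetch from the cursor that ran the query.
row = query_cursor.fetchone()
print(row)

conn.close()
```

In sqlite3, execute() returns the same cursor object it was called on, so `returned` and `query_cursor` are interchangeable here; keeping the two steps separate simply makes it obvious which object holds the result set.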

Python SQLAlchemy: psycopg2.ProgrammingError relation already exists?

I have repeatedly tried to create a table MYTABLENAME with SQLAlchemy in Python. I deleted all tables through my SQL client DBeaver, but I still get an error that a relation already exists:
Traceback (most recent call last):
File "/home/hhh/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
context)
File "/home/hhh/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
cursor.execute(statement, parameters)
psycopg2.ProgrammingError: relation "ix_MYTABLENAME_index" already exists
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) relation "ix_MYTABLENAME_index" already exists
[SQL: 'CREATE INDEX "ix_MYTABLENAME_index" ON "MYTABLENAME" (index)']
Creating the tables and inserting into them succeeds the first time with a unique name, but the second time I get the error, despite having deleted the tables in DBeaver.
Small example:
from datetime import date
from sqlalchemy import create_engine
import numpy as np
import pandas as pd
def storePandasDF2PSQL(myDF_):
    # Store results as a Pandas DataFrame to a PostgreSQL database.
    #
    # Example
    # df = pd.DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])
    # dbName = date.today().strftime("%Y%m%d")+"_TABLE"
    # engine = create_engine('postgresql://hhh:yourPassword@localhost:1234/hhh')
    # df.to_sql(dbName, engine)
    df = myDF_
    dbName = date.today().strftime("%Y%m%d")+"_TABLE"
    engine = create_engine('postgresql://hhh:yourPassword@localhost:1234/hhh')
    # ERROR: NameError: name 'table' is not defined
    # table.declarative_base.metadata.drop_all(engine)  # Drop all tables
    # TODO: This step is causing errors because SQLAlchemy thinks the
    # TODO: table still exists even though deleted
    df.to_sql(dbName, engine)
What is the proper way to clean up the backend, such as a dangling index, so that the table can be recreated with fresh data? In other words, how do I solve the error?
The issue might be on the SQLAlchemy side: it may still believe the index exists because it was never notified that the tables were deleted. There is a SQLAlchemy way of deleting the tables:
table.declarative_base.metadata.drop_all(engine)
This should keep SQLAlchemy informed about the deletions.
This answer does not address reusing the same table names, and hence does not cover cleaning up the SQLAlchemy metadata.
Instead of reusing the table names, append the execution time to the end of the table name, like this:
import time
dbName = date.today().strftime("%Y%m%d")+"_TABLE_"+str(time.time())
dbTableName = dbName
This way, your SQL development environment, such as a SQL client locking up the connection or specific tables, does not matter as much. Closing DBeaver while running the Python code with SQLAlchemy can also help.
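The naming idea from this answer can be wrapped in a small helper. This is a minimal sketch rather than the answerer's exact code: the function name unique_table_name is hypothetical, and it uses time.time_ns() instead of str(time.time()) as a deliberate variation so the generated name contains no dot.

```python
import time
from datetime import date


def unique_table_name(suffix: str = "_TABLE") -> str:
    """Build a date-prefixed table name with a timestamp appended."""
    # Integer nanoseconds since the epoch; avoids the "." that str(time.time()) produces.
    stamp = time.time_ns()
    return f"{date.today().strftime('%Y%m%d')}{suffix}_{stamp}"


name = unique_table_name()
print(name)
```

The resulting string could then be passed to df.to_sql(name, engine) in place of dbName. Names built this way never collide with a leftover index from an earlier run, at the cost of accumulating tables that have to be cleaned up separately.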
