Python SQLAlchemy: psycopg2.ProgrammingError relation already exists?

I have repeatedly tried to create a table MYTABLENAME with SQLAlchemy in Python. I deleted all tables through my SQL client DBeaver, but I still get an error that the relation already exists:
Traceback (most recent call last):
File "/home/hhh/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
context)
File "/home/hhh/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
cursor.execute(statement, parameters)
psycopg2.ProgrammingError: relation "ix_MYTABLENAME_index" already exists
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) relation "ix_MYTABLENAME_index" already exists
[SQL: 'CREATE INDEX "ix_MYTABLENAME_index" ON "MYTABLENAME" (index)']
I succeed in creating the tables and inserting into them when I use a unique name, but the second time around I get the error above even though I have deleted the tables in DBeaver.
Small example:
from datetime import date

from sqlalchemy import create_engine
import numpy as np
import pandas as pd

def storePandasDF2PSQL(myDF_):
    # Store a pandas DataFrame in the PostgreSQL database.
    #
    # Example:
    #   df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
    #   dbName = date.today().strftime("%Y%m%d") + "_TABLE"
    #   engine = create_engine('postgresql://hhh:yourPassword@localhost:1234/hhh')
    #   df.to_sql(dbName, engine)
    df = myDF_
    dbName = date.today().strftime("%Y%m%d") + "_TABLE"
    engine = create_engine('postgresql://hhh:yourPassword@localhost:1234/hhh')

    # ERROR: NameError: name 'table' is not defined
    # table.declarative_base.metadata.drop_all(engine)  # Drop all tables
    # TODO: This step causes errors because SQLAlchemy thinks the
    # TODO: table still exists even though it was deleted.
    df.to_sql(dbName, engine)
What is the proper way to clean up the backend, such as a hanging index, in order to recreate the table with fresh data? In other words, how do I solve the error?

The issue might be on the SQLAlchemy side: it still believes the index exists because the deletion done in the SQL client was never reflected in SQLAlchemy's metadata. There is a SQLAlchemy way of dropping the tables, via the declarative base's metadata:
Base.metadata.drop_all(engine)
where Base is your declarative base. This keeps SQLAlchemy informed about the deletions.
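If the tables were created by df.to_sql rather than through declarative models, there is no Base to call drop_all() on. A minimal sketch of an alternative (assuming the connection URL from the question): reflect whatever currently exists in the database into a fresh MetaData and drop it, indexes included, before writing again.
from sqlalchemy import MetaData, create_engine

engine = create_engine('postgresql://hhh:yourPassword@localhost:1234/hhh')

# Load the current state of the database into SQLAlchemy metadata,
# then drop everything it found (tables together with their indexes).
meta = MetaData()
meta.reflect(bind=engine)
meta.drop_all(bind=engine)

# After this, df.to_sql(dbName, engine) can recreate the table and its
# "ix_..._index" index without hitting "relation already exists".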

This answer does not address reusing the same table names, and hence is not about cleaning up the SQLAlchemy metadata.
Instead of reusing the table names, append the execution time to the end of the table name like this:
import time
dbName = date.today().strftime("%Y%m%d")+"_TABLE_"+str(time.time())
dbTableName = dbName
so that your SQL development environment, e.g. an SQL client locking up the connection or specific tables, does not matter that much. Closing DBeaver while running the Python/SQLAlchemy code can also help.

Related

How can I drop this table using SQLAlchemy?

I am trying to drop a table called 'New'. I currently have the following code:
import pandas as pd
import sqlalchemy
sqlcon = sqlalchemy.create_engine('mssql://ABSECTDCS100TL/AdventureWorks?driver=ODBC+Driver+17+for+SQL+Server')
df = pd.read_sql_query('SELECT * FROM DimReseller', sqlcon)
df.to_sql('New',sqlcon,if_exists='append', index=False)
sqlalchemy.schema.New.drop(bind=None, checkfirst=False)
I am receiving the error:
AttributeError: module 'sqlalchemy.schema' has no attribute 'New'
Any ideas on what I'm missing here? Thanks.
You can reflect the table into a Table object and then call its drop method:
from sqlalchemy import Table, MetaData
tbl = Table('New', MetaData(), autoload_with=sqlcon)
tbl.drop(sqlcon, checkfirst=False)
If you want to delete the table using raw SQL, you can do this:
from sqlalchemy import text
with sqlcon.connect() as conn:
    # Follow the identifier quoting convention for your RDBMS
    # to avoid problems with mixed-case names.
    conn.execute(text("""DROP TABLE "New" """))
    # Commit if necessary
    conn.commit()
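If committing manually is a concern, a variant sketch using engine.begin(), which opens a transaction and commits it automatically when the block exits:
with sqlcon.begin() as conn:
    conn.execute(text('DROP TABLE "New"'))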

Error while deleting duplicates from table using sqlalchemy python

I am trying to delete duplicates from a Redshift table.
My code:
from sqlalchemy import create_engine
# A long string that contains the necessary Postgres login information
postgres_str = f'postgresql://{redshift_username}:{redshift_password}@{redshift_address}:{redshift_port}/{redshift_dbname}'
# Create the connection
cnx = create_engine(postgres_str)
delstatmt = '''WITH CTE AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY org_country_code,dest_country_code,postcode,zone,kg,value,carrier,version
ORDER BY org_country_code,dest_country_code,postcode,zone,kg,value,carrier,version ) AS RN
FROM d.axis
)
DELETE FROM d.axis transformed WHERE RN<>1'''
cnx.execute(delstatmt)
Error
ProgrammingError: (psycopg2.errors.SyntaxError) syntax error at or near "DELETE"
LINE 8: DELETE FROM d.axis transformed WHERE ...
What is wrong in the code? Any help appreciated.
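The error suggests the cluster rejects a DELETE attached to a WITH clause, and the DELETE here never references the CTE or its RN column anyway. A common workaround is to materialise the de-duplicated rows first; here is a hedged sketch, assuming the PARTITION BY columns in the question cover every column of d.axis, so that duplicates are identical whole rows:
from sqlalchemy import text

# Sketch: copy one instance of each row into a temp table, empty the
# original table, then copy the rows back. engine.begin() keeps the whole
# sequence on one connection/transaction, which the session-scoped temp
# table requires.
with cnx.begin() as conn:
    conn.execute(text("CREATE TEMP TABLE axis_dedup AS SELECT DISTINCT * FROM d.axis"))
    conn.execute(text("DELETE FROM d.axis"))
    conn.execute(text("INSERT INTO d.axis SELECT * FROM axis_dedup"))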

Pandas mysql last inserted id sqlalchemy import create_engine

I am trying to insert pandas DataFrame data into a MySQL database using SQLAlchemy. The table table_test has one AUTO_INCREMENT column. I want to get the value of that AUTO_INCREMENT column and use it in a later part of the program.
Below is my code:
from sqlalchemy import create_engine
mysqldb = create_engine("mysql://test:password@localhost/test")
df.to_sql(con=mysqldb, name='table_test',index=False, if_exists='append')
print (mysqldb.insert_id())
However, the last print line gives me an error:
File "test.py", line 208, in StagingCountLesserThanTarget
mysqldb.insert_id() AttributeError: 'Engine' object has no attribute 'insert_id'
How do I get the last inserted id in MySQL using SQLAlchemy?
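The SQLAlchemy Engine has no insert_id() method, and df.to_sql does not return the generated keys. One hedged sketch (reusing df, table_test, and the connection URL from the question): run the insert and a LAST_INSERT_ID() lookup on the same explicit connection.
from sqlalchemy import create_engine, text

mysqldb = create_engine("mysql://test:password@localhost/test")

# LAST_INSERT_ID() is per-connection, so the lookup must run on the same
# connection that performed the insert; for a multi-row INSERT it reports
# the id generated for the first inserted row.
with mysqldb.begin() as conn:
    df.to_sql(con=conn, name='table_test', index=False, if_exists='append')
    last_id = conn.execute(text("SELECT LAST_INSERT_ID()")).scalar()

print(last_id)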

Pandas 0.24 read_sql operational errors

I just upgraded to Pandas 0.24.0 from 0.23.4 (Python 2.7.12), and many of my pd.read_sql queries are breaking. It looks like something related to MySQL, but it's strange that these errors only occur after updating my pandas version. Any ideas what's going on?
Here's my MySQL table:
CREATE TABLE `xlations_topic_update_status` (
`run_ts` datetime DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Here's my query:
import pandas as pd
from sqlalchemy import create_engine
db_engine = create_engine('mysql+mysqldb://<><>/product_analytics', echo=False)
pd.read_sql('select max(run_ts) from product_analytics.xlations_topic_update_status', con = db_engine).values[0][0]
And here's the error:
OperationalError: (_mysql_exceptions.OperationalError) (1059, "Identifier name 'select max(run_ts) from product_analytics.xlations_topic_update_status;' is too long") [SQL: 'DESCRIBE `select max(run_ts) from product_analytics.xlations_topic_update_status;`']
I've also gotten this for other more complex queries, but won't post them here.
According to the documentation, the first argument is either a string (a table name) or a SQLAlchemy Selectable (a select or text object). In other words, pd.read_sql() is delegating to pd.read_sql_table() and treating the entire query string as a table identifier.
Wrap your query string in a text() construct first:
stmt = text('select max(run_ts) from product_analytics.xlations_topic_update_status')
pd.read_sql(stmt, con = db_engine).values[0][0]
This way pd.read_sql() will delegate to pd.read_sql_query() instead. Another option is to call it directly.
Try using pd.read_sql_query(sql, con), instead of pd.read_sql(...).
So:
pd.read_sql_query('select max(run_ts) from product_analytics.xlations_topic_update_status', con = db_engine).values[0][0]

Python using mysql connector list databases LIKE and then use those databases in order and run query

I'm trying to write a script using Python and the mysql-connector library. The script should connect to the MySQL server, run "SHOW DATABASES LIKE 'pdns_%'", and then, for each database returned by that query, switch to it and run another query against it.
Here is the code
import datetime
import mysql.connector
from mysql.connector import errorcode

cnx = mysql.connector.connect(user='user', password='thepassword',
                              host='mysql.server.com', buffered=True)
cursor = cnx.cursor()
query = ("show databases like 'pdns_%'")
cursor.execute(query)
databases = query
for (databases) in cursor:
    cursor.execute("USE %s",(databases[0],))
    hitcounts = ("SELECT Monthname(hitdatetime) AS 'Month', Count(hitdatetime) AS 'Hits' WHERE hitdatetime >= Date_add(Last_day(Date_sub(Curdate(), interval 4 month)), interval 1 day) AND hitdatetime < Date_add(Last_day(Date_sub(Curdate(), interval 1 month)), interval 1 day) GROUP BY Monthname(hitdatetime) ORDER BY Month(hitdatetime)")
    cursor.execute(hitcounts)
    print(hitcounts)
cursor.close()
cnx.close()
When running the script, it stops with the following error output:
Traceback (most recent call last):
File "./mysql-test.py", line 18, in <module>
cursor.execute("USE %s",(databases[0],))
File "/usr/lib/python2.6/site-packages/mysql/connector/cursor.py", line 491, in execute
self._handle_result(self._connection.cmd_query(stmt))
File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 635, in cmd_query
statement))
File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 553, in _handle_result
raise errors.get_exception(packet)
mysql.connector.errors.ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''pdns_382'' at line 1
Based on the error, I'm guessing there is an issue with how it's using the database name from the first query. Any pointers in the right direction would be very helpful, as I'm very much a beginner. Thank you very much.
Alas, the two-argument form of execute does not support "meta" parameters such as names of databases, tables, or fields (roughly, think of the identifiers you wouldn't quote as string literals if writing the query out manually). So, the failing statement:
cursor.execute("USE %s",(databases[0],))
needs to be re-coded as:
cursor.execute("USE %s" % (databases[0],))
i.e., the single-argument form of execute, with string interpolation. Fortunately, this particular case does not expose you to SQL injection risks, since you're only interpolating DB names coming straight from the DB engine.
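Putting it together, a sketch of the corrected loop (cursor and the hitcounts query are as defined in the question; the database names are fetched up front so the same cursor can be reused for the per-database queries):
cursor.execute("SHOW DATABASES LIKE 'pdns_%'")
db_names = [row[0] for row in cursor.fetchall()]

for db_name in db_names:
    # A database name is an identifier, not a value, so interpolate it;
    # backticks guard against unusual names.
    cursor.execute("USE `%s`" % db_name)
    cursor.execute(hitcounts)  # the per-database query from the question
    for row in cursor.fetchall():
        print(row)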

Categories

Resources