I am trying to drop a table called 'New'. I currently have the following code:
import pandas as pd
import sqlalchemy
sqlcon = sqlalchemy.create_engine('mssql://ABSECTDCS100TL/AdventureWorks?driver=ODBC+Driver+17+for+SQL+Server')
df = pd.read_sql_query('SELECT * FROM DimReseller', sqlcon)
df.to_sql('New',sqlcon,if_exists='append', index=False)
sqlalchemy.schema.New.drop(bind=None, checkfirst=False)
I am receiving the error:
AttributeError: module 'sqlalchemy.schema' has no attribute 'New'
Any ideas on what I'm missing here? Thanks.
You can reflect the table into a Table object and then call its drop method:
from sqlalchemy import Table, MetaData
tbl = Table('New', MetaData(), autoload_with=sqlcon)
tbl.drop(sqlcon, checkfirst=False)
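If you want to confirm the drop afterwards, SQLAlchemy's inspector can check for the table; a minimal sketch, reusing the sqlcon engine from the question:
from sqlalchemy import inspect
print(inspect(sqlcon).has_table('New'))  # False once the table is gone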
If you want to delete the table using raw SQL, you can do this:
from sqlalchemy import text
with sqlcon.connect() as conn:
    # Follow the identifier quoting convention for your RDBMS
    # to avoid problems with mixed-case names.
    conn.execute(text("""DROP TABLE "New" """))
    # Commit if necessary
    conn.commit()
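For example, since the engine in the question points at SQL Server, the same drop can be written with the bracket quoting native to that RDBMS (double quotes also work as long as QUOTED_IDENTIFIER is ON, which it normally is for ODBC connections):
with sqlcon.connect() as conn:
    conn.execute(text("DROP TABLE [New]"))
    conn.commit()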
Related
I am trying to import some data from the database (PostgreSQL) to work with it in Python. I tried the code below, which seems quite similar to the examples I've found on the internet.
import psycopg2
import sqlalchemy as db
import pandas as pd
engine = db.create_engine('database specifications')
connection = engine.connect()
metadata = db.MetaData()
data = db.Table(tabela, metadata, schema=shema, autoload=True, autoload_with=engine)
query = db.select([data])
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
df = pd.DataFrame(ResultSet)
However, it returns data without column names. What did I forget?
It turned out the only thing needed was adding:
columns = data.columns.keys()
df.columns = columns
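Put together with the code from the question, the fix looks roughly like this (a sketch reusing the data, connection, and query objects defined above):
ResultSet = connection.execute(query).fetchall()
df = pd.DataFrame(ResultSet)
df.columns = data.columns.keys()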
There is a great debate about that in this thread.
I just upgraded to Pandas 0.24.0 from 0.23.4 (Python 2.7.12), and many of my pd.read_sql queries are breaking. It looks like something related to MySQL, but it's strange that these errors only occur after updating my pandas version. Any ideas what's going on?
Here's my MySQL table:
CREATE TABLE `xlations_topic_update_status` (
`run_ts` datetime DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Here's my query:
import pandas as pd
from sqlalchemy import create_engine
db_engine = create_engine('mysql+mysqldb://<><>/product_analytics', echo=False)
pd.read_sql('select max(run_ts) from product_analytics.xlations_topic_update_status', con = db_engine).values[0][0]
And here's the error:
OperationalError: (_mysql_exceptions.OperationalError) (1059, "Identifier name 'select max(run_ts) from product_analytics.xlations_topic_update_status;' is too long") [SQL: 'DESCRIBE `select max(run_ts) from product_analytics.xlations_topic_update_status;`']
I've also gotten this for other more complex queries, but won't post them here.
According to the documentation, the first argument is either a string (a table name) or a SQLAlchemy Selectable (a select() or text() object). In other words, pd.read_sql() is delegating to pd.read_sql_table() here and treating the entire query string as a table identifier, hence the DESCRIBE statement in the error.
Wrap your query string in a text() construct first:
from sqlalchemy import text

stmt = text('select max(run_ts) from product_analytics.xlations_topic_update_status')
pd.read_sql(stmt, con=db_engine).values[0][0]
This way pd.read_sql() will delegate to pd.read_sql_query() instead. Another option is to call pd.read_sql_query() directly.
Try using pd.read_sql_query(sql, con), instead of pd.read_sql(...).
So:
pd.read_sql_query('select max(run_ts) from product_analytics.xlations_topic_update_status', con = db_engine).values[0][0]
I want to copy a table from an Oracle database to a PostgreSQL database using SQLAlchemy.
After setting up the connections and engines for Oracle and PostgreSQL and reflecting the tables into the sourceMeta metadata, I try to create them on destEngine, but it gives me an error saying it can't render an element of type...
for t in sourceMeta.sorted_tables:
    newtable = Table(t.name, sourceMeta, autoload=True)
    newtable.metadata.create_all(destEngine)
It seems what you are looking for is SQLAlchemy's @compiles decorator.
Here is an example of how it worked for me when trying to copy tables from a MS SQL Server db to a PostgreSQL db.
from sqlalchemy import create_engine, Table, MetaData
from sqlalchemy.schema import CreateTable
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.dialects.mssql import TINYINT, DATETIME, VARCHAR
@compiles(TINYINT, 'postgresql')
def compile_TINYINT_mssql_int(element, compiler, **kw):
    """ Handles mssql TINYINT datatype as INT in postgresql """
    return 'INTEGER'
# add a function for each datatype that causes an error
table_name = '<table_name>'
# create engine, reflect existing columns, and create table object for oldTable
srcEngine = create_engine('mssql+pymssql://<user>:<password>@<host>/<db>')
srcEngine._metadata = MetaData(bind=srcEngine)
srcEngine._metadata.reflect(srcEngine) # get columns from existing table
srcTable = Table(table_name, srcEngine._metadata)
# create engine and table object for newTable
destEngine = create_engine('postgresql+psycopg2://<user>:<password>@<host>/<db>')
destEngine._metadata = MetaData(bind=destEngine)
destTable = Table(table_name.lower(), destEngine._metadata)
# copy schema and create newTable from oldTable
for column in srcTable.columns:
    dstCol = column.copy()
    destTable.append_column(dstCol)
    # maybe change column name etc.
print(CreateTable(destTable).compile(destEngine)) # <- check the query that will be used to create the table
destTable.create()
check the docs:
https://docs.sqlalchemy.org/en/13/core/compiler.html
and maybe also this example:
https://gist.github.com/methane/2972461
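For example, if the source table also has mssql DATETIME columns that trigger the same kind of rendering error, an analogous override could map them to TIMESTAMP on the PostgreSQL side (a sketch; TIMESTAMP is assumed to be an acceptable target type for your data):
@compiles(DATETIME, 'postgresql')
def compile_DATETIME_mssql_timestamp(element, compiler, **kw):
    """ Handles mssql DATETIME datatype as TIMESTAMP in postgresql """
    return 'TIMESTAMP'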
I am trying to work out why the schema of a dropped table comes back when I attempt to create a table using a different set of column names.
After dropping the table, I can confirm in an SQLite explorer that the table has disappeared. When I then try to load the new file via odo, it returns the error "Column names of incoming data don't match column names of existing SQL table". After that I can see the same table re-created in the database, using the previously dropped schema! I attempted a VACUUM statement after dropping the table, but the issue remains.
I can create the table fine using a different table name, but I am totally confused as to why I can't reuse the previously dropped table name I want to use.
import sqlite3
import pandas as pd
from odo import odo, discover, resource, dshape
conn = sqlite3.connect(dbfile)
c = conn.cursor()
c.execute("DROP TABLE <table1>")
c.execute("VACUUM")
importfile = pd.read_csv(csvfile)
odo(importfile, 'sqlite:///<db_path>::<table1>')
ValueError: Column names of incoming data don't match column names of existing SQL table Names in SQL table:
import sqlite3

conn = sqlite3.connect('test.db')
cursor = conn.cursor()

table = """ CREATE TABLE IF NOT EXISTS TABLE1 (
    id integer PRIMARY KEY,
    name text NOT NULL
); """

cursor.execute(table)
conn.commit()  # Save table into database.

cursor.execute(''' DROP TABLE TABLE1 ''')
conn.commit()  # Save that table has been dropped.

cursor.execute(table)
conn.commit()  # Save that table has been created.

conn.close()
I wrote a little script to copy a table between SQL servers.
It works, but one of the columns changed type from varchar to text...
How do I make it copy the table with the same column types?
import pymssql
import pandas as pd
from sqlalchemy import create_engine
db_server = "1.2.3.4\\r2"
db_database="Test_DB"
db_user="vaf"
db_password="1234"
local_db_server="1.1.1.1\\r2"
local_db_database="Test_DB"
local_db_user="vaf"
local_db_password="1234"
some_query=("""
select * from some_table
""")
def main():
    conn = pymssql.connect(server=local_db_server, user=local_db_user, password=local_db_password, database=local_db_database, charset='UTF-8')
    data = pd.io.sql.read_sql(some_query, conn)
    connection_string = 'mssql+pymssql://{}:{}@{}/{}'.format(db_user, db_password, db_server, db_database)
    engine = create_engine(connection_string)
    data.to_sql(name="some_table", con=engine, if_exists='replace', index=False)

if __name__ == "__main__":
    main()
Thanks
Consider three approaches:
SPECIFY TYPES (proactive, applied before the migration)
Using the dtype argument of pandas.DataFrame.to_sql, pass a dictionary of sqlalchemy types for named columns.
import sqlalchemy

data.to_sql(name="some_table", con=engine, if_exists='replace', index=False,
            dtype={'datefld': sqlalchemy.DateTime(),
                   'intfld': sqlalchemy.types.INTEGER(),
                   'strfld': sqlalchemy.types.VARCHAR(length=255),
                   'floatfld': sqlalchemy.types.Float(precision=3, asdecimal=True),
                   'booleanfld': sqlalchemy.types.Boolean})
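To verify the resulting column types after the write, SQLAlchemy's inspector can be used (a sketch reusing the engine from above):
from sqlalchemy import inspect
for col in inspect(engine).get_columns('some_table'):
    print(col['name'], col['type'])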
DELETE DATA (proactive, applied before the migration)
Clean out the table with a DELETE query, then append only the data from pandas, leaving the table structure untouched (the to_sql replace argument would re-create the table). This approach assumes the dataframe is always consistent with the database table (no new columns or changed types).
def main():
    connection_string = 'mssql+pymssql://{}:{}@{}/{}'\
        .format(db_user, db_password, db_server, db_database)
    engine = create_engine(connection_string)

    # IMPORT DATA INTO DATA FRAME
    data = pd.read_sql(some_query, engine)

    # SQL DELETE (CLEAN OUT TABLE) VIA TRANSACTION
    with engine.begin() as conn:
        conn.execute("DELETE FROM some_table")

    # MIGRATE DATA INTO SQL TABLE (APPEND NOT REPLACE)
    data.to_sql(name='some_table', con=engine, if_exists='append', index=False)
MODIFY COLUMN (reactive, fixes the type after the fact)
Alter the column after migration with a DDL SQL statement.
def main():
    connection_string = 'mssql+pymssql://{}:{}@{}/{}'\
        .format(db_user, db_password, db_server, db_database)
    engine = create_engine(connection_string)

    # IMPORT DATA INTO DATA FRAME
    data = pd.read_sql(some_query, engine)

    # MIGRATE DATA INTO SQL TABLE
    data.to_sql(name="some_table", con=engine, if_exists='replace', index=False)

    # ALTER COLUMN TYPE (ASSUMING USER HAS RIGHTS/PRIVILEGES)
    with engine.begin() as conn:
        conn.execute("ALTER TABLE some_table ALTER COLUMN mytextcolumn VARCHAR(255);")
I recommend the second approach, as I believe databases should be agnostic to application code like Python and pandas. Hence, the initial build or re-build of a table's schema should be a planned, manual process, and no script should structurally change a database on the fly, only interact with data.