How to send Excel data to MySQL using pandas and PyMySQL? - python

I'm having issues importing data with Python into a table in my database directly from an Excel file.
I have this code:
import os
import pandas as pd
import pymysql
if os.path.exists("env.py"):
    import env
print(os.environ)
# Open the database connection
db = pymysql.connect(
    host=os.environ.get("MY_DATABASE_HOST"),
    user=os.environ.get("MY_USERNAME"),
    password=os.environ.get("MY_PASSWORD"),
    database=os.environ.get("MY_DATABASE_NAME")
)
##################################################
################# READ EXCEL #####################
tabla_azul = "./excelFiles/tablaAzul.xlsx"
dAzul = pd.read_excel(tabla_azul, sheet_name='Órdenes')
dAzul.to_sql(con=db, name='tablaazul', if_exists='append', schema='str')
#print(type(dAzul))
tabla_verde = "./excelFiles/tablaVerde.xlsx"
dVerde = pd.read_excel(tabla_verde, sheet_name='Órdenes')
dVerde.to_sql(con=db, name='tablaverde', if_exists='append', schema='str')
1. I'm not sure what table name I have to put into the name argument.
2. Do I have to use SQLAlchemy?
3. If the answer to question 2 is yes: is it possible to connect SQLAlchemy with PyMySQL?
4. If the answer to question 3 is no: how do I use the .env variables, such as host, with a SQLAlchemy connection?
Thank you!
When I run the code above, it gives me this error:
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': not all arguments converted during string formatting

As stated in the pandas documentation, for any database other than SQLite .to_sql() requires a SQLAlchemy Connectable object, which is either an Engine object or a Connection object. You can create an Engine object for PyMySQL like so:
import sqlalchemy as sa
connection_url = sa.engine.URL.create(
    "mysql+pymysql",
    username=os.environ.get("MY_USERNAME"),
    password=os.environ.get("MY_PASSWORD"),
    host=os.environ.get("MY_DATABASE_HOST"),
    database=os.environ.get("MY_DATABASE_NAME")
)
engine = sa.create_engine(connection_url)
Then you can call .to_sql() and pass it the engine:
dVerde.to_sql(con=engine, name='tablaverde', if_exists='append', schema='str')
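If the connection URL is assembled as a plain string rather than with URL.create(), special characters in the username or password (such as @ or /) need to be percent-encoded, otherwise the URL is parsed incorrectly. The following is a minimal sketch using only the standard library; the helper name build_mysql_url and the sample values are hypothetical, and the environment variable names are the ones from the question.

```python
import os
from urllib.parse import quote


def build_mysql_url(env=os.environ):
    """Assemble a mysql+pymysql URL string from environment-style variables."""
    # Percent-encode credentials so characters like '@' or '/' cannot break the URL.
    user = quote(env.get("MY_USERNAME", ""), safe="")
    password = quote(env.get("MY_PASSWORD", ""), safe="")
    host = env.get("MY_DATABASE_HOST", "localhost")
    database = env.get("MY_DATABASE_NAME", "")
    return f"mysql+pymysql://{user}:{password}@{host}/{database}"


# Hypothetical values, used only to show the escaping.
sample_env = {
    "MY_USERNAME": "app_user",
    "MY_PASSWORD": "p@ss/word",
    "MY_DATABASE_HOST": "localhost",
    "MY_DATABASE_NAME": "shop",
}
url = build_mysql_url(sample_env)
print(url)
```

The resulting string could be passed to sa.create_engine() in place of the URL object. URL.create() avoids the manual escaping step, which is one reason to prefer it.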

Related

How to use variable in SQL Alchemy

I have successfully connected to SQL Server using SQLAlchemy and pyodbc; updating the database and deleting records both work fine.
Now I want to use a variable to supply a value in the SQL statement:
#import library
import pandas as pd
import os
from sqlalchemy import create_engine
from sqlalchemy.engine import URL
import pyodbc
#prepare for the connection
SERVER = r'IT\SQLEXPRESS'
DATABASE = 'lab'
DRIVER = 'SQL Server Native Client 11.0'
USERNAME = 'sa'
PASSWORD = 'Welcome1'
DATABASE_CONNECTION = f'mssql://{USERNAME}:{PASSWORD}@{SERVER}/{DATABASE}?driver={DRIVER}'
engine = create_engine(DATABASE_CONNECTION)
connection = engine.connect()
#prepare SQL query
year_delete = 2019
sql_delete = ("DELETE FROM [dbo].table1 where dbo.table1.[Year Policy] = 2019")
result=connection.execute(sql_delete)
How could I use year_delete instead of manually typing 2019 in the code?
As Larnu points out in their comment, using f-strings or other string formatting techniques exposes an application to SQL injection attacks, and in any case can be error-prone.
SQLAlchemy supports parameter substitution, allowing values to be safely inserted into SQL statements.
from sqlalchemy import text
# Make a dictionary of values to be inserted into the statement.
values = {'year': 2019}
# Make the statement text into a text instance, with a placeholder for the value.
stmt = text('DELETE FROM [dbo].table1 where dbo.table1.[Year Policy] = :year')
# Execute the query.
result = connection.execute(stmt, values)
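The same bound-parameter idea can be tried without a SQL Server instance. The sketch below uses the standard library's sqlite3 module, which accepts the same :name placeholder style; the table contents are made up for illustration, and the column is quoted with double quotes because SQLite is used here instead of SQL Server's square brackets.

```python
import sqlite3

# In-memory database with made-up rows, standing in for dbo.table1.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 ("Year Policy" INTEGER, amount REAL)')
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(2018, 10.0), (2019, 20.0), (2019, 30.0), (2020, 40.0)],
)

year_delete = 2019
# The value is passed separately from the SQL text, so it is never
# interpolated into the statement string.
cursor = conn.execute(
    'DELETE FROM table1 WHERE "Year Policy" = :year',
    {"year": year_delete},
)
deleted = cursor.rowcount
remaining = [
    row[0]
    for row in conn.execute('SELECT "Year Policy" FROM table1 ORDER BY 1')
]
print(deleted, remaining)
```

Changing year_delete is all that is needed to delete a different year; the statement text stays the same.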
Alternatively, you can use an f-string (the standard Python technique for inserting expressions or variables into a string):
sql_delete = f"delete .... where dbo.table1.[Year Policy] = {year_delete}"

Importing a .sql file in python

I have just started learning SQL and I'm having some difficulty importing my .sql file in Python.
The .sql file is on my desktop, as is my .py file.
This is what I have tried so far:
import codecs
from codecs import open
import pandas as pd
sqlfile = "countries.sql"
sql = open(sqlfile, mode='r', encoding='utf-8-sig').read()
pd.read_sql_query("SELECT name FROM countries")
But I got the following error message:
TypeError: read_sql_query() missing 1 required positional argument: 'con'
I think I have to create some kind of connection, but I can't find a way to do that. Converting my data to an ordinary pandas DataFrame would help me a lot.
Thank you
This code snippet, taken from https://www.dataquest.io/blog/python-pandas-databases/, should help.
import pandas as pd
import sqlite3
conn = sqlite3.connect("flights.db")
df = pd.read_sql_query("select * from airlines limit 5;", conn)
Do not read a database as an ordinary file. It has a specific binary format, and a dedicated client should be used.
With such a client you can create a connection that handles SQL queries and can be passed to read_sql_query.
Refer to the documentation often: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql_query.html
You need a database connection. I don't know which SQL flavor you are using, but suppose you want to run your query in SQL Server:
import pyodbc
con = pyodbc.connect(driver='{SQL Server}', server='yourserverurl', database='yourdb', trusted_connection='yes')
Then pass the connection instance to pandas:
pd.read_sql_query("SELECT name FROM countries", con)
more about pyodbc here
And if you want to query an SQLite database
import sqlite3
con = sqlite3.connect('pathto/example.db')
More about sqlite here
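If countries.sql is a plain-text SQL script (CREATE TABLE and INSERT statements, as in a typical dump) rather than a database file, one option is to execute it into an in-memory SQLite database and query that. This is a minimal sketch, assuming the script uses SQLite-compatible syntax; the inline script below is a hypothetical stand-in for the text read from the file.

```python
import sqlite3

# Hypothetical stand-in for the text read from countries.sql.
sql_script = """
CREATE TABLE countries (name TEXT, continent TEXT);
INSERT INTO countries VALUES ('Brazil', 'South America');
INSERT INTO countries VALUES ('Japan', 'Asia');
"""

con = sqlite3.connect(":memory:")
con.executescript(sql_script)  # runs every statement in the script

names = [
    row[0]
    for row in con.execute("SELECT name FROM countries ORDER BY name")
]
print(names)
```

The same con object can then be passed as the second argument to pd.read_sql_query to get a DataFrame.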

How to insert the DataFrame output into MySQL

import pymysql
import pandas as pd
db = pymysql.connect(host='localhost', user='testuser', password='test123', database='world')
df1 = pd.read_sql('select * from country limit 5', db)
df1
I need to create a table named country2 and write the contents of df1 to it.
Use Pandas to_sql (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html). This should work for you:
import pymysql
from sqlalchemy import create_engine
sql_table_name = 'country2'
engine = create_engine("mysql+pymysql://testuser:test123@localhost:0/world")  # create engine
df1.to_sql(sql_table_name, engine)  # add to table
Definitely check out SQLAlchemy, which lets Python connect to the database. One approach is to write a MySQL interaction class with it: encode your DataFrame into an upsert SQL string, then use cursor.execute(query_string) to perform the upsert.
import sqlalchemy
engine = sqlalchemy.create_engine(
    'mysql+mysqlconnector://user:pwd@hostname/db_name',
    connect_args={'auth_plugin': 'mysql_native_password'})
sample_sql_database = df.to_sql('table_name', con=engine)
There is also an option to "append" the contents of the DataFrame or to "replace" the table:
sample_sql_database = df.to_sql('table_name', engine, if_exists='replace')
sample_sql_database = df.to_sql('table_name', engine, if_exists='append')
Reference :
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html

Df to sql to Teradata in python

I'm trying to load a CSV file into a Teradata table with the df.to_sql method.
So far, with the Teradata Python modules, I was able to connect, but I can't manage to load my CSV file.
Here is my code :
import teradata
import pandas as pd
global udaExec
global session
global host
global username
global password
def Connexion_Teradata(usernames, passwords):
    host = 'FTGPRDTD'
    udaExec = teradata.UdaExec(appName="TEST", version="1.0", logConsole=False)
    session = udaExec.connect(method="odbc", system=host, username=usernames, password=passwords, driver="Teradata")
    print('connection ok')
    df = pd.read_csv(r'C:/Users/c92434/Desktop/Load.csv')
    print('DataFrame loaded ok')
    df.to_sql(name='DB_FTG_SRS_DATALAB.mdc_load', con=session, if_exists="replace", index=False)
    print('done')
Connexion_Teradata("******", "****")
When I run my script, all I get is:
DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': (3707, "[42000] [Teradata][ODBC Teradata Driver][Teradata Database] Syntax error, expected something like '(' between the 'type' keyword and '='. ")
What can I do?
Please try this:
from teradataml import create_context
from teradataml.dataframe.copy_to import copy_to_sql
from sqlalchemy import create_engine
import pandas as pd
# user, passwd and host are your own credentials and server name
sqlalchemy_engine = create_engine(f'teradatasql://{user}:{passwd}@{host}')
td_context = create_context(tdsqlengine=sqlalchemy_engine)
df = pd.read_csv(r'/Users/abc/test.csv')
Use the copy_to_sql() function to create a table in Vantage from a teradataml DataFrame or a pandas DataFrame:
copy_to_sql(df, 'testtable', if_exists='replace')

List names of all available MS SQL databases on server using python

Trying to list the names of the databases on a remote MS SQL server using Python (Just like the Object Explorer in MS SQL Server Management Studio).
Current solution: The required query is SELECT name FROM sys.databases;. The current solution uses SQLAlchemy and pandas, and works fine, as shown below.
import pandas
from sqlalchemy import create_engine
#database='master'
engine = create_engine('mssql+pymssql://user:password@server:port/master')
query = "select name FROM sys.databases;"
data = pandas.read_sql(query, engine)
output:
name
0 master
1 tempdb
2 model
3 msdb
Question: How can I list the names of the databases on the server using SQLAlchemy's inspect(engine), similar to listing table names under a database? Or is there a simpler way that does not require importing pandas?
from sqlalchemy import inspect
# trial 1: with no database name
engine = create_engine('mssql+pymssql://user:password@server:port')
# this engine has no database name
inspector = inspect(engine)
inspector.get_table_names() #returns []
inspector.get_schema_names() #returns [u'dbo', u'guest',...,u'INFORMATION_SCHEMA']
#trial 2: with database name 'master', same result
engine = create_engine('mssql+pymssql://user:password@server:port/master')
inspector = inspect(engine)
inspector.get_table_names() #returns []
inspector.get_schema_names() #returns [u'dbo', u'guest',...,u'INFORMATION_SCHEMA']
If all you really want to do is avoid importing pandas then the following works fine for me:
from sqlalchemy import create_engine, text
engine = create_engine('mssql+pymssql://sa:saPassword@localhost:52865/myDb')
with engine.connect() as conn:
    # text() is required for raw SQL strings in SQLAlchemy 2.0
    rows = conn.execute(text("select name FROM sys.databases;"))
    for row in rows:
        print(row.name)
producing
master
tempdb
model
msdb
myDb
It is also possible to list the tables in a specific schema by executing a single query with pymssql, a DB-API interface to Microsoft SQL Server for Python.
pip install pymssql
import pymssql
# Connect to the database
conn = pymssql.connect(server='127.0.0.1', user='root', password='root', database='my_database')
# Create a Cursor object
cur = conn.cursor()
# Execute the query to get the names of the tables in my_database
# (optionally add: where table_schema = 'tableowner')
cur.execute("select table_name from information_schema.tables")
# Read and print the table names
for row in cur.fetchall():
    print(row[0])
output:
my_table_name_1
my_table_name_2
my_table_name_3
...
my_table_name_x
I believe the following snippet will list the names of the tables in whatever database you choose to connect to. It returns a JSON object that will be displayed in your browser. This question is a bit old, but I hope this helps anyone curious who stops by.
from flask import Flask, request
from flask_restful import Resource, Api
from sqlalchemy import create_engine, inspect
from flask_jsonpify import jsonify
engine = create_engine('mssql+pymssql://user:password@server:port/master')
app = Flask(__name__)
api = Api(app)
class AllTables(Resource):
    def get(self):
        conn = engine.connect()
        inspector = inspect(conn)
        tableList = [item for item in inspector.get_table_names()]
        result = {'data': tableList}
        return jsonify(result)
api.add_resource(AllTables, '/alltables')
app.run(port=8080)
Here is another solution, which fetches row by row:
import pymssql
connect = pymssql.connect(server, user, password, database)
cursor = connect.cursor(as_dict=True)
# A query must be executed before fetching; this one is an assumed example
cursor.execute("SELECT name FROM sys.databases")
row = cursor.fetchone()
while row:
    for r in row.items():
        print(r[0], r[1])
    row = cursor.fetchone()
