I'm trying to insert information from an MS Access database (MDB file) into SQL Server; unfortunately, I don't know how to separate the columns from the database table with Python.
I'm getting the error
ValueError: Shape of passed values is (109861, 1), indices imply (3,1)
and the code I'm using is:
import os
import shutil
import pyodbc
import pandas as pd
import csv
from datetime import datetime
conn = pyodbc.connect(
    r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};'
    r'DBQ=C:\Users\sguerra\Desktop\Python\Measurements-2020-12-15.mdb;'
)
cursor = conn.cursor()
cursor.execute('select * from Measurements')
new = cursor.fetchall()
columns = ['Prod_Date','Prod_Time','CCE_SKU']
df = pd.DataFrame(new,columns)
for row in df.itertuples():
    cursor.execute('''
        insert into MITSF_1.dbo.MeasurementsTest ([Prod_Date],[Prod_Time],[CCE_SKU])
        VALUES (?,?,?)
        ''',
        row.Prod_Date,
        row.Prod_Time,
        row.CCE_SKU
    )
conn.commit()
You are using the same cursor (and therefore the same Access connection) to execute both the SELECT and the INSERT, so both statements are operating on the same database — the INSERT never reaches SQL Server. To keep things simple, use pandas' read_sql_query() to read the required columns from Access, then use to_sql() to write them to SQL Server:
df = pd.read_sql_query(
    "SELECT [Prod_Date], [Prod_Time], [CCE_SKU] FROM Measurements",
    conn,
)

from sqlalchemy import create_engine
engine = create_engine(
    "mssql+pyodbc://scott:tiger@192.168.0.199/MITSF_1"
    "?driver=ODBC+Driver+17+for+SQL+Server",
    fast_executemany=True,
)
df.to_sql("MeasurementsTest", engine, schema="dbo",
          index=False, if_exists="append",
)
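As an aside, the original ValueError comes from pd.DataFrame(new, columns): the second positional argument of the DataFrame constructor is index, not columns, so pandas tried to line up 109861 rows against a 3-element index. If you do want to keep the cursor-based approach, passing the column names by keyword is the minimal fix. A sketch using plain tuples in place of the pyodbc rows:

```python
import pandas as pd

# stand-in for cursor.fetchall(); pyodbc returns Row objects,
# which from_records also accepts
new = [("2020-12-15", "08:00", "SKU-1"), ("2020-12-15", "08:05", "SKU-2")]
columns = ["Prod_Date", "Prod_Time", "CCE_SKU"]

# columns must be passed by keyword; the second positional argument is `index`
df = pd.DataFrame.from_records(new, columns=columns)
print(df.shape)  # (2, 3)
```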
I am using the following code to read a table from an access db as a pandas dataframe:
import pyodbc
import pandas as pd
connStr = (
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\Users\A\Documents\Database3.accdb;"
)
cnxn = pyodbc.connect(connStr)
sql = "Select * From Table1"
data = pd.read_sql(sql,cnxn) # without parameters [non-prepared statement]
# with a prepared statement, use list/tuple/dictionary of parameters depending on DB
#data = pd.read_sql(sql=sql, con=cnxn, params=query_params)
I plan to make some transformations and then write the dataframe back into the database in a similar way. Does anyone know how I can do this?
Thank you
When writing with pandas' to_sql() to a database other than SQLite, we need to use SQLAlchemy. In this case, we would use the sqlalchemy-access dialect.
(I am currently the maintainer.)
Example:
import pandas as pd
import sqlalchemy as sa
connection_string = (
    r"DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
    r"DBQ=C:\Users\Public\test\sqlalchemy-access\sqlalchemy_test.accdb;"
    r"ExtendedAnsiSQL=1;"
)
connection_url = sa.engine.URL.create(
    "access+pyodbc",
    query={"odbc_connect": connection_string},
)
engine = sa.create_engine(connection_url)
df = pd.DataFrame([(1, "foo"), (2, "bar")], columns=["id", "txt"])
df.to_sql("my_table", engine, index=False, if_exists="append")
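Reading the table back follows the same pattern with pd.read_sql. The sketch below uses an in-memory SQLite engine purely so it is self-contained; with the Access engine above, the to_sql / read_sql calls are identical:

```python
import pandas as pd
import sqlalchemy as sa

# in-memory SQLite engine as a stand-in for the Access engine above
engine = sa.create_engine("sqlite://")

df = pd.DataFrame([(1, "foo"), (2, "bar")], columns=["id", "txt"])
df.to_sql("my_table", engine, index=False, if_exists="append")

# round-trip: read the table back into a DataFrame
df2 = pd.read_sql("SELECT id, txt FROM my_table", engine)
print(df2["id"].tolist(), df2["txt"].tolist())  # [1, 2] ['foo', 'bar']
```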
I get DatabaseError: ORA-00904: "DAT_ULT_ALT": invalid identifier when I try to insert a datetime into a TIMESTAMP column in Oracle using to_sql from pandas with a SQLAlchemy engine. My code:
import sqlalchemy as sa
import datetime
import itertools
...
oracle_db = sa.create_engine('oracle://username:password@host:port/database')
connection= oracle_db.connect()
...
dat_ult_alt = datetime.datetime.now()
df_plano['DAT_ULT_ALT'] = pd.Series(list(itertools.repeat(dat_ult_alt, max)))
df_plano.to_sql('table_name', connection, if_exists='append', index=False)
This code works for fields of type DATE but does not work with fields of type TIMESTAMP. Do you know what I need to do to convert dat_ult_alt to a timestamp?
Check out the Oracle Data Types section of the SQLAlchemy documentation. In the module sqlalchemy.dialects.oracle you can find the datatype TIMESTAMP. Import it and set it as the dtype for the relevant column like this:
from sqlalchemy.dialects.oracle import TIMESTAMP
df_plano.to_sql('table_name', connection, if_exists='append', index=False,
                dtype={'DAT_ULT_ALT': TIMESTAMP})
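The dtype argument accepts a mapping of column name to any SQLAlchemy type, so the same idea carries over to other dialects. A dialect-neutral sketch using SQLite and the generic sa.DateTime (TIMESTAMP from sqlalchemy.dialects.oracle is the Oracle-specific equivalent):

```python
import datetime
import pandas as pd
import sqlalchemy as sa

engine = sa.create_engine("sqlite://")  # stand-in for the Oracle engine
df = pd.DataFrame({"DAT_ULT_ALT": [datetime.datetime.now()]})

# force a DATETIME/TIMESTAMP-style column type instead of pandas' default
df.to_sql("table_name", engine, if_exists="append", index=False,
          dtype={"DAT_ULT_ALT": sa.DateTime})

cols = sa.inspect(engine).get_columns("table_name")
print(cols[0]["name"], cols[0]["type"])  # DAT_ULT_ALT DATETIME
```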
Not sure about SQLAlchemy, as I have never used it with Oracle. Here's a sample using cx_Oracle that works.
create table test ( tstamp TIMESTAMP);
import cx_Oracle
import datetime
conn = cx_Oracle.connect('usr/pwd@//host:1521/db')
cur = conn.cursor()
dtime=datetime.datetime.now()
cur.prepare("INSERT INTO test (tstamp) VALUES (:ts)")
cur.setinputsizes(ts=cx_Oracle.TIMESTAMP)
cur.execute(None, {'ts': dtime})
conn.commit()
conn.close()
select * from test;
TSTAMP
------------------------------
22-11-18 09:14:19.422278000 PM
I'm trying to run a SQL query through mysql.connector that requires a SET command in order to query a specific table:
import mysql.connector
import pandas as pd
cnx = mysql.connector.connect(host=ip,
                              port=port,
                              user=user,
                              passwd=pwd,
                              database="")
sql="""SET variable='Test';
SELECT * FROM table """
df = pd.read_sql(sql, cnx)
when I run this I get the error "Use multi=True when executing multiple statements". But where do I put multi=True?
pandas' read_sql() cannot do this directly: its params argument holds query bind parameters, which are handed to the driver's execute() as values, so params={'multi': True} is treated as a query parameter rather than being forwarded as the multi=True keyword. The SET statement has to be executed on its own cursor before running the SELECT.
After many hours of experimenting, I figured out how to do this. Forgive me if this is not the most succinct way, but it's the best I could come up with:
import mysql.connector
import pandas as pd
cnx = mysql.connector.connect(host=ip,
                              port=port,
                              user=user,
                              passwd=pwd,
                              database="")
sql1="SET variable='Test';"
sql2="""SELECT * FROM table """
cursor=cnx.cursor()
cursor.execute(sql1)
cursor.close()
df = pd.read_sql(sql2, cnx)
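The pattern here — run the setup statement on its own cursor, then hand the still-open connection to pandas — is not specific to MySQL. A self-contained sketch of the same two-step flow using sqlite3, with a table-creation statement standing in for the SET:

```python
import sqlite3
import pandas as pd

cnx = sqlite3.connect(":memory:")

# step 1: run the setup statement(s) on a plain cursor
cur = cnx.cursor()
cur.execute("CREATE TABLE t (x INTEGER)")
cur.execute("INSERT INTO t VALUES (1), (2)")
cur.close()

# step 2: pass the same live connection to pandas for the SELECT
df = pd.read_sql("SELECT x FROM t", cnx)
print(df["x"].tolist())  # [1, 2]
```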
I'm new to sqlite3 and trying to understand how to create a table in a SQL environment by using my existing dataframe. I already have a database that I created as "pythonsqlite.db"
#import my csv to python
import pandas as pd
my_data = pd.read_csv("my_input_file.csv")
## connect to database
import sqlite3
conn = sqlite3.connect("pythonsqlite.db")
##push the dataframe to sql
my_data.to_sql("my_data", conn, if_exists="replace")
##create the table
conn.execute(
"""
create table my_table as
select * from my_data
""")
However, when I navigate to SQLiteStudio and check the tables under my database, I cannot see the table I've created. I'd really appreciate it if someone could tell me what I'm missing here.
I replaced just one part of the code: instead of read_csv I create a small dataframe (see below). I think the issue may be with the name of your script (for example, pandas.py), which would shadow the pandas package.
import pandas as pd
# my_data = pd.read_csv("my_input_file.csv")
columns = ['a','b']
my_data = pd.DataFrame([[1, 2], [3, 4]], columns=columns)
## connect to database
import sqlite3
conn = sqlite3.connect("pythonsqlite.db")
##push the dataframe to sql
my_data.to_sql("my_data", conn, if_exists="replace")
##create the table
conn.execute(
"""
create table my_table as
select * from my_data
""")
I ran it and I don't seem to have a problem.
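One more thing worth checking when a table created via conn.execute() does not show up in SQLiteStudio: Python's sqlite3 module runs statements inside an implicit transaction, so the CREATE TABLE may never be committed to the file. Calling conn.commit() (and closing the connection) before inspecting the file in another tool usually fixes it. A sketch, using a temporary file in place of your pythonsqlite.db:

```python
import os
import sqlite3
import tempfile
import pandas as pd

db_path = os.path.join(tempfile.mkdtemp(), "pythonsqlite.db")

conn = sqlite3.connect(db_path)
my_data = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
my_data.to_sql("my_data", conn, if_exists="replace")
conn.execute("CREATE TABLE my_table AS SELECT * FROM my_data")
conn.commit()  # without this, the new table may not be visible to other tools
conn.close()

# reopen the file, as SQLiteStudio would, and confirm the table persisted
conn2 = sqlite3.connect(db_path)
names = [r[0] for r in conn2.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
print(sorted(names))  # ['my_data', 'my_table']
conn2.close()
```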
I used to read the data from a CSV file. I have now imported all the CSV data into a SQL database, but I am having difficulty extracting that data from SQL with Python.
My original CSV-reading code is:
import pandas as pd
stock_data = pd.read_csv(filepath_or_buffer='stock_data_w.csv', parse_dates=[u'date'], encoding='gbk')
stock_data[u'change_weekly'] = stock_data.groupby(u'code')[u'change'].shift(-1)
Now I want to read the data from SQL. Here is my code, but it doesn't work and I am not sure how to fix it:
import pandas as pd
import MySQLdb
db = MySQLdb.connect(host='localhost', user='root', passwd='232323', db='test', port=3306)
cur = db.cursor()
cur.execute("SELECT * FROM stock_data_w")
stock_data = pd.DataFrame(data=cur.fetchall(), columns=[i[0] for i in cur.description])
stock_data[u'change_weekly'] = stock_data.groupby(u'code')[u'change'].shift(-1)
The error is:
pandas.core.common.PandasError: DataFrame constructor not properly called!
Use the approach below to create the data frame from your cursor object:
stock_data = pd.DataFrame(data=list(cur.fetchall()),
                          columns=[i[0] for i in cur.description])
print(stock_data)
(Note that cursor.keys() only exists on SQLAlchemy result objects; with a DB-API driver such as MySQLdb, the column names come from cur.description, as above.)
Or make your connection with SQLAlchemy and use:
stock_data = pd.read_sql("SELECT * FROM stock_data_w",
                         con=cnx, parse_dates=['date'])
I'm not sure whether mysql.connector is supported in pandas read_sql(). You can give it a try and let us know :)
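Since cursor.description is part of the DB-API standard rather than anything MySQLdb-specific, the fetchall()-plus-description pattern can be checked with any DB-API driver. A self-contained sketch using sqlite3; note that fetchall() is wrapped in list(), which sidesteps the old "constructor not properly called" error raised when a driver returns a tuple of rows:

```python
import sqlite3
import pandas as pd

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE stock_data_w (code TEXT, change REAL)")
cur.execute("INSERT INTO stock_data_w VALUES ('A', 1.5), ('A', -0.5)")

cur.execute("SELECT * FROM stock_data_w")
# column names come from the DB-API cursor.description tuples
stock_data = pd.DataFrame(data=list(cur.fetchall()),
                          columns=[i[0] for i in cur.description])
print(stock_data.columns.tolist())  # ['code', 'change']
```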