Python to group SQL data

I used to read the data from a CSV file, but I have now imported all of the CSV data into a SQL database, and I am having difficulty extracting it from SQL with Python.
My original CSV-reading code looks like this:
import pandas as pd
stock_data = pd.read_csv(filepath_or_buffer='stock_data_w.csv', parse_dates=[u'date'], encoding='gbk')
stock_data[u'change_weekly'] = stock_data.groupby(u'code')[u'change'].shift(-1)
Now I want to read the data from SQL instead. Here is my code, but it doesn't work and I am not sure how to fix it:
import pandas as pd
import MySQLdb
db = MySQLdb.connect(host='localhost', user='root', passwd='232323', db='test', port=3306)
cur = db.cursor()
cur.execute("SELECT * FROM stock_data_w")
stock_data = pd.DataFrame(data=cur.fetchall(), columns=[i[0] for i in cur.description])
stock_data[u'change_weekly'] = stock_data.groupby(u'code')[u'change'].shift(-1)
The error is: "raise PandasError('DataFrame constructor not properly called!') pandas.core.common.PandasError: DataFrame constructor not properly called!"

Use the approach below to build a data frame from your cursor object; note the list() around fetchall(), since a bare tuple of tuples is what triggers the constructor error you're seeing.
stock_data = pd.DataFrame(data=list(cursor.fetchall()), index=None,
                          columns=cursor.keys())
print stock_data
With MySQLdb the cursor has no keys() method, so use columns=[i[0] for i in cursor.description] instead.
or
Or make your connection with SQLAlchemy and use:
stock_data = pd.read_sql("SELECT * FROM stock_data_w",
                         con=cnx, parse_dates=['date'])
I'm not sure whether mysql.connector is supported by pandas' read_sql(). You can give it a try and let us know :)
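Putting it all together, here is a minimal sketch of the MySQLdb route using the connection settings from the question; the list() around fetchall() is the key point, since older pandas raises "DataFrame constructor not properly called!" when handed a bare tuple of tuples:
import pandas as pd
import MySQLdb

db = MySQLdb.connect(host='localhost', user='root', passwd='232323',
                     db='test', port=3306)
cur = db.cursor()
cur.execute("SELECT * FROM stock_data_w")

# list() avoids the constructor error; cur.description supplies the column names
stock_data = pd.DataFrame(data=list(cur.fetchall()),
                          columns=[i[0] for i in cur.description])

# same weekly-change logic as the CSV version
stock_data[u'change_weekly'] = stock_data.groupby(u'code')[u'change'].shift(-1)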

Related

Missing column names when importing data from database (Python + PostgreSQL)

I am trying to import some data from the database (PostgreSQL) to work with in Python. I tried the code below, which seems quite similar to the examples I've found on the internet.
import psycopg2
import sqlalchemy as db
import pandas as pd
engine = db.create_engine('database specifications')
connection = engine.connect()
metadata = db.MetaData()
data = db.Table(tabela, metadata, schema=shema, autoload=True, autoload_with=engine)
query = db.select([data])
ResultProxy = connection.execute(query)
ResultSet = ResultProxy.fetchall()
df = pd.DataFrame(ResultSet)
However, it returns data without column names. What did I forget?
It turned out the only thing needed was adding:
columns = data.columns.keys()
df.columns = columns
There is a great debate about that in this thread.
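For what it's worth, pandas can also do this in one step; a sketch, reusing the engine from the question (the table name here is a placeholder):
import pandas as pd

# read_sql picks up the column names from the result set automatically
df = pd.read_sql("SELECT * FROM my_table", con=engine)  # my_table is a placeholder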

Unable to create data frame from MS Access query results

I'm trying to insert information from an MS Access database (.mdb file) into SQL Server; unfortunately, I don't know how to separate the columns from the database table with Python.
I'm getting the error
ValueError: Shape of passed values is (109861, 1), indices imply (3,1)
and the code I'm using is:
import os
import shutil
import pyodbc
import pandas as pd
import csv
from datetime import datetime
conn = pyodbc.connect(r'Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:\Users\sguerra\Desktop\Python\Measurements-2020-12-15.mdb;')
cursor = conn.cursor()
cursor.execute('select * from Measurements')
new = cursor.fetchall()
columns = ['Prod_Date','Prod_Time','CCE_SKU']
df = pd.DataFrame(new, columns)
for row in df.itertuples():
    cursor.execute('''
        insert into MITSF_1.dbo.MeasurementsTest ([Prod_Date],[Prod_Time],[CCE_SKU])
        VALUES (?,?,?)
        ''',
        row.Prod_Date,
        row.Prod_Time,
        row.CCE_SKU
    )
conn.commit()
You are using the same cursor to try and execute both the select and the insert, so both of those statements would be operating on the same database. To keep things simple, you should use pandas' read_sql_query() to read the required columns from Access and then use to_sql() to write them to SQL Server:
df = pd.read_sql_query(
    "SELECT [Prod_Date],[Prod_Time],[CCE_SKU] FROM Measurements",
    conn,
)
from sqlalchemy import create_engine
engine = create_engine(
    "mssql+pyodbc://scott:tiger@192.168.0.199/MITSF_1"
    "?driver=ODBC+Driver+17+for+SQL+Server",
    fast_executemany=True,
)
df.to_sql("MeasurementsTest", engine, schema="dbo",
          index=False, if_exists="append",
)
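The scott:tiger credentials and the 192.168.0.199 address are placeholders carried over from the answer. As a quick sanity check, assuming the same engine, you could count the rows that landed in the target table:
import pandas as pd

check = pd.read_sql_query("SELECT COUNT(*) AS n FROM dbo.MeasurementsTest", engine)
print(check["n"].iloc[0])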

Importing a .sql file in python

I have just started learning SQL and I'm having some difficulty importing my .sql file in Python.
The .sql file is on my desktop, as is my .py file.
This is what I have tried so far:
import codecs
from codecs import open
import pandas as pd
sqlfile = "countries.sql"
sql = open(sqlfile, mode='r', encoding='utf-8-sig').read()
pd.read_sql_query("SELECT name FROM countries")
But I got the following message error:
TypeError: read_sql_query() missing 1 required positional argument: 'con'
I think I have to create some kind of connection, but I can't find a way to do that. Converting my data to an ordinary pandas DataFrame would help me a lot.
Thank you
This is a code snippet taken from https://www.dataquest.io/blog/python-pandas-databases/ that should help:
import pandas as pd
import sqlite3
conn = sqlite3.connect("flights.db")
df = pd.read_sql_query("select * from airlines limit 5;", conn)
Do not read the database as an ordinary file. It has a specific binary format, and a dedicated client should be used.
With the client you can create a connection that can handle SQL queries and be passed to read_sql_query.
Refer to the documentation often: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql_query.html
You need a database connection. I don't know what SQL flavor you are using, but suppose you want to run your query on SQL Server:
import pyodbc
con = pyodbc.connect(driver='{SQL Server}', server='yourserverurl', database='yourdb', trusted_connection='yes')
then pass the connection instance to pandas
pd.read_sql_query("SELECT name FROM countries", con)
More about pyodbc here.
And if you want to query an SQLite database
import sqlite3
con = sqlite3.connect('pathto/example.db')
More about sqlite here
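Since the question actually starts from a countries.sql script rather than an existing database file, one way to tie the pieces together is to execute the script into a throwaway SQLite database first and then query that. This is only a sketch, and it assumes the script holds ordinary CREATE TABLE/INSERT statements that SQLite understands:
import sqlite3
import pandas as pd

# read the dump and run every statement in it against an in-memory database
sql = open("countries.sql", mode='r', encoding='utf-8-sig').read()
con = sqlite3.connect(":memory:")
con.executescript(sql)

# now the usual pandas route works
df = pd.read_sql_query("SELECT name FROM countries", con)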

How do I insert my Python dictionary into my SQL Server database table?

I have a dictionary with 3 keys which correspond to field names in a SQL Server table. The values of these keys come from an excel file and I store this dictionary in a dataframe which I now need to insert into a SQL table. This can all be seen in the code below:
import pandas as pd
import pymssql
df=[]
fp = "file path"
data = pd.read_excel(fp, sheetname="CRM View")
row_date = data.loc[3, ]
row_sita = "ABZPD"
row_event = data.iloc[12, :]
df = pd.DataFrame({'date': row_date,
                   'sita': row_sita,
                   'event': row_event
                   }, index=None)
df = df[4:]
df = df.fillna("")
print(df)
My question is how do I insert this dictionary into a SQL table now?
Also, as a side note, this code is part of a loop which needs to go through several excel files one by one, insert the data into the dictionary, then into SQL, then clear the dictionary and start again with the next excel file.
You could try something like this:
import MySQLdb
# connect (host, user, password, database)
conn = MySQLdb.connect("127.0.0.1", "username", "password", "database")
x = conn.cursor()
# write -- parameterized, so values are quoted and escaped correctly
x.execute('INSERT INTO table_name (row_date, sita, event) VALUES (%s, %s, %s)',
          (row_date, sita, event))
# close
conn.commit()
conn.close()
You might have to change it a little based on your SQL restrictions, but should give you a good start anyway.
For the pandas dataframe, you can use the built-in to_sql method to store it in the db. Here is how to use it:
import urllib  # on Python 3, use urllib.parse.quote_plus instead
import sqlalchemy as sa
params = urllib.quote_plus("DRIVER={};SERVER={};DATABASE={};Trusted_Connection=True;".format("{SQL Server}",
                                                                                             "<db_server_url>",
                                                                                             "<db_name>"))
conn_str = 'mssql+pyodbc:///?odbc_connect={}'.format(params)
engine = sa.create_engine(conn_str)
df.to_sql(<table_name>, engine, schema=<schema_name>, if_exists="append", index=False)
For this method you will need to install the sqlalchemy package.
pip install sqlalchemy
You will also need to set up the MSSQL DSN on the machine.
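To cover the side note in the question about looping over several excel files: a rough sketch, where the frame-building steps come from the question, the engine is the one from the answer above, and the folder, table, and schema names are assumptions:
import glob
import pandas as pd

for fp in glob.glob("excel_files/*.xlsx"):  # hypothetical folder of workbooks
    data = pd.read_excel(fp, sheet_name="CRM View")
    row_date = data.loc[3, ]
    row_sita = "ABZPD"
    row_event = data.iloc[12, :]
    df = pd.DataFrame({'date': row_date, 'sita': row_sita, 'event': row_event})
    df = df[4:].fillna("")
    # append this file's rows; df is rebuilt from scratch on the next iteration
    df.to_sql("my_table", engine, schema="dbo", if_exists="append", index=False)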

create a table in sqlite by using a dataframe

I'm new to sqlite3 and trying to understand how to create a table in the SQL environment using my existing dataframe. I already have a database that I created as "pythonsqlite.db".
#import my csv to python
import pandas as pd
my_data = pd.read_csv("my_input_file.csv")
## connect to database
import sqlite3
conn = sqlite3.connect("pythonsqlite.db")
##push the dataframe to sql
my_data.to_sql("my_data", conn, if_exists="replace")
##create the table
conn.execute(
    """
    create table my_table as
    select * from my_data
    """)
However, when I navigate to SQLite Studio and check the tables under my database, I cannot see the table I've created. I'd really appreciate it if someone could tell me what I'm missing here.
I replaced just one part of the code: instead of read_csv I create a small dataframe (see below). I think the issue is likely the name of your script (for example, pandas.py), which would shadow the pandas package:
import pandas as pd
# my_data = pd.read_csv("my_input_file.csv")
columns = ['a','b']
my_data = pd.DataFrame([[1, 2], [3, 4]], columns=columns)
## connect to database
import sqlite3
conn = sqlite3.connect("pythonsqlite.db")
##push the dataframe to sql
my_data.to_sql("my_data", conn, if_exists="replace")
##create the table
conn.execute(
    """
    create table my_table as
    select * from my_data
    """)
I ran it and I don't seem to have a problem.
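Two other things worth ruling out, as assumptions on my part rather than anything from the answer above: that the changes were committed before inspecting the file, and that SQLite Studio is open on the very file Python wrote (sqlite3.connect with a relative path creates the database in the current working directory):
import os
import sqlite3

conn = sqlite3.connect("pythonsqlite.db")
# ... my_data.to_sql(...) and the create table statement go here ...

conn.commit()  # flush anything still sitting in an open transaction

# list the tables sqlite itself can see in this file
print(conn.execute("select name from sqlite_master where type='table'").fetchall())
conn.close()

# the file Python actually wrote; open this exact path in SQLite Studio
print(os.path.abspath("pythonsqlite.db"))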
