I want to save my dataframe to SQL Server with pyodbc. The table should hold about 300 rows and be refreshed every month. The problem is that every time I run the .py file, the rows get appended instead of replacing the existing data. Previously I was using SQLAlchemy, where I could do this with if_exists="replace". Now that I'm using pyodbc, I don't know how to do it. This is what I have:
col_names = ["month", "price", "change"]
df = pd.read_csv("sawit.csv", sep=',', quotechar='\'', encoding='utf8', names=col_names, skiprows=1)
for index, row in df.iterrows():
    cursor.execute("update dbo.sawit set month = ?, price = ?, change =? ;", (row.month, row.price, row.change))
cnxn.commit()
cursor.close()
cnxn.close()
But the result I get is that every row is replaced with the last record. What should I do? Thank you in advance.
There's a much simpler way to do this kind of thing.
import pandas as pd
import pyodbc
from fast_to_sql import fast_to_sql as fts
# Test Dataframe for insertion
df = pd.DataFrame(your_dataframe_here)
# Create a pyodbc connection
conn = pyodbc.connect(
"""
Driver={ODBC Driver 17 for SQL Server};
Server=localhost;
Database=my_database;
UID=my_user;
PWD=my_pass;
"""
)
# If a table is created, the generated sql is returned
create_statement = fts.fast_to_sql(df, "my_great_table", conn, if_exists="replace")
# Commit upload actions and close connection
conn.commit()
conn.close()
Main Function:
fts.fast_to_sql(df, name, conn, if_exists="append", custom=None, temp=False)
Here is a slightly different way to do essentially the same thing.
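from sqlalchemy import create_engine
# Query-string parameters are joined with "&"; spaces in the driver name are written as "+"
engine = create_engine("mssql+pyodbc://server_name/db_name?driver=SQL+Server+Native+Client+11.0&trusted_connection=yes")
# your dataframe is here; use if_exists='replace' to overwrite the table instead of appending
df.to_sql("table_name", engine, if_exists='append', index=True, chunksize=100000)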
NOTE: pandas (through SQLAlchemy) will create the table with appropriately typed columns if it does not already exist.
Your SQL query does not say which row should be replaced. There is neither a WHERE clause to select the matching row for each entry, nor a primary key. So in every iteration of the loop, all rows are overwritten with the current entry. The last iteration uses the last entry, so every row ends up holding the last entry.
You can add a WHERE clause that looks for the month to be replaced.
Something equivalent to this:
    cursor.execute("update dbo.sawit set price = ?, change = ? where month = ?;", (row.price, row.change, row.month))
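The effect of the WHERE clause can be reproduced with a small sketch. It uses Python's built-in sqlite3 module and an in-memory table named sawit as stand-ins for pyodbc and dbo.sawit (both accept ? placeholders); the sample rows are made up for illustration. It also shows one possible way to get the if_exists="replace" behaviour by hand: delete the old rows and insert the new ones before committing.

```python
import sqlite3

# Stand-in for the pyodbc connection; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sawit (month TEXT, price REAL, change REAL)")
old_rows = [("2023-01", 100.0, 0.0), ("2023-02", 110.0, 10.0)]
cur.executemany("INSERT INTO sawit VALUES (?, ?, ?)", old_rows)

new_rows = [
    ("2023-01", 101.0, 1.0),
    ("2023-02", 111.0, 10.0),
    ("2023-03", 120.0, 9.0),
]

# Targeted update: the WHERE clause limits each statement to one month.
# Months that are not in the table yet are simply not touched.
for month, price, change in new_rows:
    cur.execute(
        "UPDATE sawit SET price = ?, change = ? WHERE month = ?",
        (price, change, month),
    )
after_update = cur.execute("SELECT * FROM sawit ORDER BY month").fetchall()

# Full replace: empty the table, then insert every row of the new data.
cur.execute("DELETE FROM sawit")
cur.executemany("INSERT INTO sawit VALUES (?, ?, ?)", new_rows)
conn.commit()
after_replace = cur.execute("SELECT * FROM sawit ORDER BY month").fetchall()

print(after_update)
print(after_replace)
```

Note that the targeted update never adds the new month, which is why the delete-and-insert variant is closer to what if_exists="replace" did.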
Related
I have a pretty simple code block that iterates through the rows of a DataFrame and checks whether the new data matches the corresponding values in an SQL table. If it does, I run a fetchone() to get the id, which is then used to update the existing row in SQL; otherwise, all the data is inserted as a new row.
The problem I'm having is that the fetchone() query executes and returns the right id. However, in the if clause, I can't get that query to execute. The code compiles and runs but nothing updates in the database.
When I debug, the query variable is as follows:
query={TextClause}UPDATE projects SET Lead_MD='Stephen', Primary_Deal_Type='Debt', Secondary_Deal_Type='1', Start_Date='2022-06-01' WHERE id=2
I've tried copying that clause into MySQL Workbench, and it updates the table correctly, which leaves me even more perplexed. Any help would be appreciated!
Here's my code:
from sqlalchemy import create_engine, text
from sqlupdate import data_frame_from_xlsx_range
df = data_frame_from_xlsx_range(fileloc,'projects_info')
user = 'root'
pw = 'test!*'
db = 'hcftest'
engine = create_engine("mysql+pymysql://{user}:{pw}@localhost:3306/{db}"
                       .format(user=user, pw=pw, db=db),
                       echo=True)
# Check if each row in the Excel data already exists in the MySQL table
connection = engine.connect()
for i, row in df.iterrows():
    query = text("SELECT id FROM projects WHERE Project_Name='{}' and Client_Name='{}'".format(row["Project_Name"], row["Client_Name"]))
    result = connection.execute(query).fetchone()
    # If the row already exists, update the remaining columns with the Excel data
    if result:
        query = text("UPDATE projects SET Lead_MD='{}', Primary_Deal_Type='{}', Secondary_Deal_Type='{}', Start_Date='{}' WHERE id={}".format(row["Lead_MD"], row["Primary_Deal_Type"], row["Secondary_Deal_Type"], row["Start_Date"], result[0]))
        connection.execute(query)
    # If the row does not exist, insert the Excel data into the MySQL table
    else:
        query = text("INSERT INTO table_name (Project_Name, Client_Name, Lead_MD, Primary_Deal_Type, Secondary_Deal_Type, Start_Date) VALUES ('{}', '{}', '{}', '{}', '{}', '{}')".format(row["Project_Name"], row["Client_Name"], row["Lead_MD"], row["Primary_Deal_Type"], row["Secondary_Deal_Type"], row["Start_Date"]))
        connection.execute(query)
connection.close()
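A minimal sketch of the same check-then-update-or-insert flow, written with bound parameters and an explicit commit. It uses sqlite3 with an in-memory projects table as a stand-in for the MySQL database, and a reduced, hypothetical set of columns and rows. Whether a missing commit is the cause here is an assumption: in SQLAlchemy 2.0 a Connection does not autocommit, so changes made through engine.connect() are rolled back on close unless connection.commit() is called or the work is wrapped in engine.begin().

```python
import sqlite3

# Stand-in database; table layout and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects ("
    "id INTEGER PRIMARY KEY, Project_Name TEXT, Client_Name TEXT, Lead_MD TEXT)"
)
conn.execute(
    "INSERT INTO projects (Project_Name, Client_Name, Lead_MD) VALUES (?, ?, ?)",
    ("Alpha", "Acme", "Old lead"),
)


def upsert_project(conn, row):
    """Update the row matching (Project_Name, Client_Name), or insert it."""
    found = conn.execute(
        "SELECT id FROM projects WHERE Project_Name = ? AND Client_Name = ?",
        (row["Project_Name"], row["Client_Name"]),
    ).fetchone()
    if found:
        conn.execute(
            "UPDATE projects SET Lead_MD = ? WHERE id = ?",
            (row["Lead_MD"], found[0]),
        )
    else:
        conn.execute(
            "INSERT INTO projects (Project_Name, Client_Name, Lead_MD) "
            "VALUES (?, ?, ?)",
            (row["Project_Name"], row["Client_Name"], row["Lead_MD"]),
        )


rows = [
    {"Project_Name": "Alpha", "Client_Name": "Acme", "Lead_MD": "Stephen"},
    {"Project_Name": "Beta", "Client_Name": "Globex", "Lead_MD": "Maria"},
]
for row in rows:
    upsert_project(conn, row)
conn.commit()  # make the changes permanent before the connection is closed

result = conn.execute(
    "SELECT Project_Name, Lead_MD FROM projects ORDER BY id"
).fetchall()
print(result)
```

Passing the values as parameters also avoids the quoting problems that string formatting runs into when a name contains an apostrophe.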
I'm currently trying to query a deltaDNA database. Their Direct SQL Access guide states that any PostgreSQL ODBC-compliant tool should be able to connect without issue. Following the guide, I set up an ODBC data source in Windows.
I have tried adding SET NOCOUNT ON, tried various formats for the connection string, and changed the table name to (account).(system).(tablename), all to no avail. The same simple query works in Excel, and I have cross-checked how Excel formats everything, so it is all the more strange that I get the "not a query" error.
import pyodbc
conn_str = 'DSN=name'
query1 = 'select eventName from table_name limit 5'
conn = pyodbc.connect(conn_str)
conn.setdecoding(pyodbc.SQL_CHAR,encoding='utf-8')
query1_cursor = conn.cursor().execute(query1)
row = query1_cursor.fetchone()
print(row)
Result is ProgrammingError: No results. Previous SQL was not a query.
Try it like this:
import pyodbc
conn_str = 'DSN=name'
query1 = 'select eventName from table_name limit 5'
conn = pyodbc.connect(conn_str)
conn.setdecoding(pyodbc.SQL_CHAR,encoding='utf-8')
query1_cursor = conn.cursor()
query1_cursor.execute(query1)
row = query1_cursor.fetchone()
print(row)
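Create the cursor and call execute() in separate statements, so that query1_cursor is the cursor object itself. The DB-API specification (PEP 249) leaves the return value of execute() undefined, so assigning the result of conn.cursor().execute(...) is not a portable way to get a cursor you can fetch from.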
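The two-step pattern can be tried out without an ODBC data source. This is a sketch that uses the standard-library sqlite3 module as a stand-in; the events table and its rows are made up.

```python
import sqlite3

# Stand-in for the DSN connection; table name and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (eventName TEXT)")
conn.executemany("INSERT INTO events VALUES (?)", [("login",), ("purchase",)])

cursor = conn.cursor()  # keep a reference to the cursor itself
cursor.execute("SELECT eventName FROM events ORDER BY eventName LIMIT 5")
row = cursor.fetchone()  # fetch from the same cursor that ran the query
print(row)
```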
I have a dictionary with 3 keys which correspond to field names in a SQL Server table. The values for these keys come from an Excel file, and I store the dictionary in a dataframe which I now need to insert into a SQL table. This can all be seen in the code below:
import pandas as pd
import pymssql
df=[]
fp = "file path"
data = pd.read_excel(fp, sheet_name="CRM View")
row_date = data.loc[3, ]
row_sita = "ABZPD"
row_event = data.iloc[12, :]
df = pd.DataFrame({'date': row_date,
                   'sita': row_sita,
                   'event': row_event
                   }, index=None)
df = df[4:]
df = df.fillna("")
print(df)
My question is how do I insert this dictionary into a SQL table now?
Also, as a side note, this code is part of a loop that goes through several Excel files one by one: it loads the data into the dictionary, inserts it into SQL, clears the dictionary, and starts again with the next Excel file.
You could try something like this:
import MySQLdb
# connect
conn = MySQLdb.connect("127.0.0.1", "username", "password", "database_name")
x = conn.cursor()
# write (the driver fills the %s placeholders from the tuple; each value must be a scalar)
x.execute('INSERT INTO table_name (row_date, sita, event) VALUES (%s, %s, %s)', (row_date, sita, event))
# close
conn.commit()
conn.close()
You might have to change it a little based on your SQL restrictions, but should give you a good start anyway.
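When each file yields several rows, a parameterized executemany call avoids building SQL strings by hand. The sketch below assumes each Excel file has already been reduced to a list of (date, sita, event) tuples; sqlite3 and the crm_events table are stand-ins for the real database, and the file names and values are invented.

```python
import sqlite3

# Stand-in database; the table name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crm_events (date TEXT, sita TEXT, event TEXT)")

# Hypothetical per-file data, as it might look after reading each workbook.
files = {
    "file_a.xlsx": [("2018-01-01", "ABZPD", "Launch"), ("2018-01-02", "ABZPD", "")],
    "file_b.xlsx": [("2018-02-01", "ABZPD", "Sale")],
}

for name, rows in files.items():
    # One statement, many parameter tuples; values are never pasted into the SQL.
    conn.executemany(
        "INSERT INTO crm_events (date, sita, event) VALUES (?, ?, ?)", rows
    )
    conn.commit()  # commit once per file

total = conn.execute("SELECT COUNT(*) FROM crm_events").fetchone()[0]
print(total)
```

With a DataFrame, df.itertuples(index=False, name=None) yields plain tuples of this shape. The placeholder style depends on the driver: sqlite3 and pyodbc use ?, while pymssql and MySQLdb use %s.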
For the pandas dataframe, you can use the pandas built-in method to_sql to store in db. Following is the way to use it.
from urllib.parse import quote_plus  # Python 3 location of quote_plus
import sqlalchemy as sa
params = quote_plus("DRIVER={};SERVER={};DATABASE={};Trusted_Connection=yes;".format("{SQL Server}",
                                                                                 "<db_server_url>",
                                                                                 "<db_name>"))
conn_str = 'mssql+pyodbc:///?odbc_connect={}'.format(params)
engine = sa.create_engine(conn_str)
df.to_sql("<table_name>", engine, schema="<schema_name>", if_exists="append", index=False)
For this method you will need to install the sqlalchemy package.
pip install sqlalchemy
You will also need to set up the SQL Server ODBC driver (or a DSN) on the machine.
After coding with pyodbc for a couple of days, I seem to have run into a roadblock. My SQL UPDATE will not work, even after putting autocommit=True in the connection statement. Nothing changes in the database at all. All my code is provided below. Please help. (I am using the 2016 version of MS Access; the code runs with no errors; Python and Access are both 32-bit.)
import pyodbc
# Connect to the Microsoft Access Database
conn_str = (
r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};'
r'DBQ=C:\Users\User_Name\Desktop\Databse\CPLM.accdb'
)
cnxn = pyodbc.connect(conn_str, autocommit=True)
crsr = cnxn.cursor()
crsr2 = cnxn.cursor()
# SQL code used for the for statement
SQL = "SELECT NameProject, Type, Date, Amount, ID FROM InvoiceData WHERE Type=? OR Type=? OR Type IS NULL AND ID > ?"
# Defining variables
date = ""
projectNumber = 12.04
numberDate = []
# Main Code, for each row in the SQL query, update the table
for row in crsr.execute(SQL, "Invoice", "Deposit", "1"):
    print (projectNumber)
    if row.NameProject is not None:
        crsr2.execute("UPDATE Cimt SET LastInvoice='%s' WHERE Num='%s'" % (date, projectNumber))
        cnxn.commit()
        # Just used to find where to input certain data.
        # I also know all the code in this if statement completes due to outside testing
        projectNumber = row.NameProject[:5]
        numberDate.append([projectNumber, date])
    else:
        date = row.Date
print(numberDate)
crsr.commit()
cnxn.commit()
cnxn.close()
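One way to narrow this down, offered as a diagnostic sketch rather than a diagnosis: pass the values as parameters and inspect cursor.rowcount after the UPDATE. A count of 0 means the WHERE clause matched no row, which would also look like "no errors, no changes". The example uses sqlite3 and a simplified Cimt table with made-up values; pyodbc cursors expose a rowcount attribute as well.

```python
import sqlite3

# Stand-in for the Access database; the column types and values are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Cimt (Num TEXT, LastInvoice TEXT)")
cur.execute("INSERT INTO Cimt VALUES (?, ?)", ("12.04", None))


def set_last_invoice(cur, num, date):
    """Run the UPDATE with bound parameters and report how many rows changed."""
    cur.execute("UPDATE Cimt SET LastInvoice = ? WHERE Num = ?", (date, num))
    return cur.rowcount


missed = set_last_invoice(cur, "12.0", "2016-05-01")    # no row has Num = '12.0'
matched = set_last_invoice(cur, "12.04", "2016-05-01")  # matches the stored row
conn.commit()
print(missed, matched)
```

If the count is 0 for values that should match, the next thing to compare is the type and exact formatting of the key on both sides of the comparison.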
I have the following python code, it reads through a text file line by line and takes characters x to y of each line as the variable "Contract".
import os
import pyodbc
cnxn = pyodbc.connect(r'DRIVER={SQL Server};CENSORED;Trusted_Connection=yes;')
cursor = cnxn.cursor()
claimsfile = open('claims.txt','r')
for line in claimsfile:
    #ldata = claimsfile.readline()
    contract = line[18:26]
    print(contract)
    cursor.execute("USE calms SELECT XREF_PLAN_CODE FROM calms_schema.APP_QUOTE WHERE APPLICATION_ID = "+str(contract))
    print(cursor.fetchall())
When including the line cursor.fetchall(), the following error is returned:
Programming Error: Previous SQL was not a query.
The query runs in SSMS, and if I replace str(contract) with the actual value of the variable, the results are returned as expected.
Based on the data, the query returns a single value of type NVARCHAR(4).
Most other examples have variables declared before the loop, and the proposed solution is to SET NOCOUNT ON. That does not apply to my problem, so I am slightly lost.
P.S. I have also put the query in its own standalone file, without the loop over the text file, in case the loop was causing the problem. That did not help.
In your SQL query, you are actually making two commands: USE and SELECT and the cursor is not set up with multiple statements. Plus, with database connections, you should be selecting the database schema in the connection string (i.e., DATABASE argument), so TSQL's USE is not needed.
Consider the following adjustment with parameterization where APPLICATION_ID is assumed to be integer type. Add credentials as needed:
constr = 'DRIVER={SQL Server};SERVER=CENSORED;Trusted_Connection=yes;' \
'DATABASE=calms;UID=username;PWD=password'
cnxn = pyodbc.connect(constr)
cur = cnxn.cursor()
with open('claims.txt','r') as f:
    for line in f:
        contract = line[18:26]
        print(contract)
        # EXECUTE QUERY
        cur.execute("SELECT XREF_PLAN_CODE FROM APP_QUOTE WHERE APPLICATION_ID = ?",
                    [int(contract)])
        # FETCH ROWS ITERATIVELY
        for row in cur.fetchall():
            print(row)
cur.close()
cnxn.close()
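The same single-statement, parameterized lookup can be tried without a SQL Server instance. This sketch uses sqlite3 and an in-memory text buffer as stand-ins for the database and claims.txt; the sample line, contract number, and plan code are invented.

```python
import io
import sqlite3

# Stand-in database with one hypothetical row.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE APP_QUOTE (APPLICATION_ID INTEGER, XREF_PLAN_CODE TEXT)")
cur.execute("INSERT INTO APP_QUOTE VALUES (?, ?)", (12345678, "PL01"))

# Hypothetical fixed-width line: characters 18 to 25 hold the contract number.
claims = io.StringIO("X" * 18 + "12345678" + "REST\n")

codes = []
for line in claims:
    contract = line[18:26]
    # One SELECT per execute call; the value is bound, not concatenated.
    cur.execute(
        "SELECT XREF_PLAN_CODE FROM APP_QUOTE WHERE APPLICATION_ID = ?",
        (int(contract),),
    )
    codes.extend(row[0] for row in cur.fetchall())

print(codes)
conn.close()
```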