Update SQL Records Based on Pandas DataFrame - python

I have established a connection to SQL Server using the code below, extracted the data from a SQL table into a DataFrame, and run my predictive model. I have the output generated and want to write only the values of the output column back to the database, matched on the Unique ID column.
server = 'Servername'
database = 'DBName'
username = 'username'
password = 'password'
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
sql ='SELECT * FROM TableName'
DF= pd.read_sql(sql,cnxn)
My DataFrame 'DF', retrieved from the database, has the columns 'UniqueID', 'Description', 'Date', and 'Predicted'. The model output is in the 'Predicted' column. I need to overwrite only the 'Predicted' column in the database, matched on UniqueID.
Please let me know if there is a way to do this, or whether I should just overwrite the complete DataFrame to the database table.

The best method I've found is to take advantage of an SQL inner join and temporary tables to update the values. This works well if you need to update many records in SQL.
Apologies if there are any errors here as I'm borrowing this from a class I've written.
SQL Cursor
cursor = cnxn.cursor()
# reduce number of calls to server on inserts
cursor.fast_executemany = True
Insert Values into a Temporary Table
# insert only the key and the updated values
subset = DF[['UniqueID', 'Predicted']]
# the temporary table must exist before inserting (column types are illustrative)
cursor.execute("CREATE TABLE #temp_TableName (UniqueID INT, Predicted FLOAT)")
# form the SQL insert statement
columns = ", ".join(subset.columns)
values = '(' + ', '.join(['?'] * len(subset.columns)) + ')'
# insert
statement = "INSERT INTO #temp_TableName (" + columns + ") VALUES " + values
insert = [tuple(x) for x in subset.values]
cursor.executemany(statement, insert)
Update Values in Main Table from Temporary Table
statement = '''
UPDATE t
SET t.Predicted = u.Predicted
FROM TableName AS t
INNER JOIN #temp_TableName AS u
    ON u.UniqueID = t.UniqueID;
'''
cursor.execute(statement)
cnxn.commit()
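Putting the steps together, the same staged-update pattern can be sketched end to end against an in-memory SQLite database, so it runs without a SQL Server instance. SQLite needs a correlated subquery where SQL Server allows the UPDATE ... FROM join shown above, and the table, column names, and values here are illustrative:

```python
import sqlite3

# stand-ins for the SQL Server table and the model output
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (UniqueID INTEGER PRIMARY KEY, Predicted REAL)")
conn.executemany("INSERT INTO TableName VALUES (?, ?)", [(1, 0.0), (2, 0.0), (3, 0.0)])

# rows as you would get from DF[['UniqueID', 'Predicted']].itertuples(index=False, name=None)
updates = [(1, 0.9), (3, 0.4)]

# stage the new values in a scratch table, then update through the key
conn.execute("CREATE TEMP TABLE temp_updates (UniqueID INTEGER, Predicted REAL)")
conn.executemany("INSERT INTO temp_updates (UniqueID, Predicted) VALUES (?, ?)", updates)
conn.execute("""
    UPDATE TableName
    SET Predicted = (SELECT u.Predicted
                     FROM temp_updates AS u
                     WHERE u.UniqueID = TableName.UniqueID)
    WHERE UniqueID IN (SELECT UniqueID FROM temp_updates)
""")
conn.commit()
print(conn.execute("SELECT UniqueID, Predicted FROM TableName ORDER BY UniqueID").fetchall())
# → [(1, 0.9), (2, 0.0), (3, 0.4)]
```

Only the two staged rows are touched; row 2 keeps its original value, which is the point of joining through the key rather than rewriting the whole table.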

Related

How to update a column based on counts from another table?

I need to update the column numCharacters in the table actor, depending on how many times each actor's actorID shows up in the characters table.
I have the following code:
cursor = connection.cursor()
statement = 'UPDATE actor SET numCharacters = (SELECT count(*) FROM characters GROUP BY actorID)';
cursor.execute(statement);
connection.commit()
Does anyone know how I could complete it?
I think the problem is that your subquery returns multiple rows, so the UPDATE statement won't know which row to update. Try updating your query to this:
UPDATE actor a
SET a.numCharacters = (
    SELECT COUNT(*)
    FROM characters c
    WHERE c.actorID = a.actorID
);
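The correlated subquery computes one count per actor row, which is what makes the update unambiguous. It can be verified with a small runnable sketch; SQLite is used here so it runs standalone, and unlike MySQL it does not allow an alias on the UPDATE target, so the table name is written out:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (actorID INTEGER PRIMARY KEY, numCharacters INTEGER)")
conn.execute("CREATE TABLE characters (charID INTEGER PRIMARY KEY, actorID INTEGER)")
conn.executemany("INSERT INTO actor VALUES (?, 0)", [(1,), (2,)])
conn.executemany("INSERT INTO characters (actorID) VALUES (?)", [(1,), (1,), (2,)])

# correlated subquery: COUNT(*) is evaluated once per actor row
conn.execute("""
    UPDATE actor
    SET numCharacters = (SELECT COUNT(*)
                         FROM characters c
                         WHERE c.actorID = actor.actorID)
""")
conn.commit()
print(conn.execute("SELECT actorID, numCharacters FROM actor ORDER BY actorID").fetchall())
# → [(1, 2), (2, 1)]
```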

Put cursor.fetchone to specific row?

I'm working on a project where I need to get data from my SQL Server, but there is a catch. In total there are around 100,000 rows in the specific column I need the data out of, but I only need the last 20,000-30,000 rows of it.
I use the usual connection string and a stored procedure, but is there a way to select a specific row to start from? (For example, let it start at row 70,000.)
try:
    CONNECTION_STRING = 'DRIVER='+driver+';SERVER='+server+';DATABASE='+databaseName+';UID='+username+';PWD='+password
    conn = pyodbc.connect(CONNECTION_STRING)
    cursor = conn.cursor()
    storedproc = "*"
    cursor.execute(storedproc)
    row = cursor.fetchone()
    while row:
        OID = int(row[1])
        print(OID)
        row = cursor.fetchone()  # advance, otherwise the loop never ends
So my question: is there a way (for example) to make cursor.fetchone start at row 70,000 instead of row 1? Or is there another way to do that?
Thanks in advance!
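One approach (assuming SQL Server 2012 or later) is to let the server skip the rows with an ORDER BY ... OFFSET ... FETCH clause, rather than fetching and discarding rows in Python; pyodbc cursors also have a skip(count) method. A runnable sketch of the server-side idea against an in-memory SQLite database, which spells the same thing LIMIT ... OFFSET (table and row counts are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (OID INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (OID) VALUES (?)", [(i,) for i in range(1, 101)])

# skip the first 70 rows server-side; SQL Server spells this
# "ORDER BY OID OFFSET 70 ROWS FETCH NEXT 30 ROWS ONLY"
cursor = conn.execute("SELECT OID FROM t ORDER BY OID LIMIT 30 OFFSET 70")
rows = cursor.fetchall()
print(rows[0], rows[-1])  # → (71,) (100,)
```

An ORDER BY is needed in either dialect, because without it the "first 70,000 rows" are not a well-defined set.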

In a Python script I have an insert query, but when I want to insert multiple columns in the same query it gives an error

In a Python script I have an insert query, but when I want to insert multiple columns in the same query it gives an error.
With a single column it works perfectly.
Below is my code.
My database is AWS S3.
A = []
for score_row in score:
    A.append(score_row[2])
print("A=", A)
B = []
for day_row in score:
    B.append(day_row[1])
print("B=", B)
for x, y in zip(A, B):
    sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)"""
    cursor.executemany(sql, (x,), (y,))
When I replace the above query with the following insert statement, it works perfectly.
sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?)"""
cursor.executemany(sql, (x,))
Fix your code like this:
sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)"""
cursor.execute(sql, (x,y,)) #<-- here
Because it is just one insert (not several inserts).
Explanation
I guess you are confusing the number of inserts (rows) with the number of parameters (fields to insert in each row). When you want to insert several rows, use executemany; for just one row you should use execute. The second parameter of execute is the "list" (or sequence) of values to be inserted in that row.
Alternative
You can also change the approach and insert all the data in one shot:
values = list(zip(A, B))  # instead of the "for" loop
sql = """INSERT INTO calculated_corr_coeff(date,Day) VALUES (?,?)"""
cursor.executemany(sql, values)
Notice this approach doesn't use a for statement. This means all the data is sent to the database in one call, which is more efficient.
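As a quick check, the one-shot executemany pattern can be exercised against an in-memory SQLite database (the table name matches the question; the values are illustrative):

```python
import sqlite3

A = ["2021-01-01", "2021-01-02", "2021-01-03"]  # illustrative dates
B = ["Mon", "Tue", "Wed"]                       # illustrative day names

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calculated_corr_coeff (date TEXT, Day TEXT)")

# executemany takes one sequence of parameters per row; zip(A, B) yields
# exactly that: ("2021-01-01", "Mon"), ("2021-01-02", "Tue"), ...
sql = "INSERT INTO calculated_corr_coeff(date, Day) VALUES (?, ?)"
conn.executemany(sql, zip(A, B))
conn.commit()
print(conn.execute("SELECT COUNT(*) FROM calculated_corr_coeff").fetchone()[0])  # → 3
```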

psycopg2 interpolate table name in executemany statement

I am trying to insert data into a table. The table is determined at the beginning of the program and remains constant throughout. How do I interpolate the table name in an executemany statement like the one below?
tbl = 'table_name'
rows = [{'this':x, 'that': x+1} for x in range(10)]
cur.executemany("""INSERT INTO %(tbl)s
VALUES(
%(this)s,
%(that)s
)""", rows)
As stated in the official documentation: "Only query values should be bound via this method: it shouldn’t be used to merge table or field names to the query. If you need to generate dynamically an SQL query (for instance choosing dynamically a table name) you can use the facilities provided by the psycopg2.sql module."
It has the following syntax:
from psycopg2 import sql

tbl = 'table_name'
rows = [{'this': x, 'that': x + 1} for x in range(10)]
cur.executemany(
    sql.SQL("INSERT INTO {} VALUES (%(this)s, %(that)s);")
       .format(sql.Identifier(tbl)),
    rows)
More on http://initd.org/psycopg/docs/sql.html#module-psycopg2.sql

Retrieving and selecting binary values from Mysql with Python 3

I'm trying to select data from one table, and perform a query on another table using the returned values from the first table.
Both tables are case-sensitive, using the utf8_bin collation.
When I perform my first select, I am returned a tuple of binary values:
query = """SELECT id FROM table1"""
results = (b'1234', b'2345', b'3456')
I'd then like to perform a query on table2 using the ids returned from table1:
query = """SELECT element FROM table2 WHERE id IN (%s) """ % results
Is this the right way to do this?
You need to create the query so that it can be properly parameterized:
query = """SELECT element FROM table2 WHERE id IN (%s) """ % ",".join(['%s'] * len(results))
This will transform the query to:
query = """SELECT element FROM table2 WHERE id IN (%s,%s,%s) """
Then you can just pass query and results to the execute() (or appropriate) method so that results are properly parameterized.
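For completeness, here is the whole pattern run against an in-memory SQLite database, which uses ? placeholders where MySQL drivers use %s; the table contents are illustrative:

```python
import sqlite3

results = (b'1234', b'2345', b'3456')  # ids returned from table1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table2 (id BLOB, element TEXT)")
conn.executemany("INSERT INTO table2 VALUES (?, ?)",
                 [(b'1234', 'a'), (b'2345', 'b'), (b'9999', 'c')])

# build one placeholder per id, then pass the ids as parameters
placeholders = ",".join(["?"] * len(results))          # "?,?,?"
query = "SELECT element FROM table2 WHERE id IN (%s)" % placeholders
rows = conn.execute(query, results).fetchall()
print(sorted(rows))  # → [('a',), ('b',)]
```

Only the placeholder list is built with string formatting; the binary values themselves always travel as parameters, so they are never spliced into the SQL text.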
