Upsert to MySQL using Python and data from Excel
I'm working on populating a MySQL database using Python.
The data is stored in Excel sheets.
Because the database is supposed to be used for monitoring "projects", there's a possibility of repeated primary keys. In that case the row needs to be updated instead of inserted, because a project can have many stages.
Also, there's a value to be inserted into the table that can't be read from the spreadsheet. I'm wondering whether this value must be inserted with a separate query or whether there's a way to insert it in the same query. The value is the supplier ID, and it needs to go between id_ops and cif_store.
Finally, I need to perform an inner join to import the store_id, using the store_cif, from another table called store. I know how to do it, but I'm wondering whether it must also be executed as a separate query or can be done in the same one.
So far, I have done this:
import xlrd
import MySQLdb

def insert():
    book = xlrd.open_workbook(r"C:\Users\DevEnviroment\Desktop\OPERACIONES.xlsx")
    sheet = book.sheet_by_name("Sheet1")
    database = MySQLdb.connect(host="localhost", user="pytest", passwd="password", db="opstest1")
    cursor = database.cursor()
    query = """INSERT INTO operation (id_ops, cif_store, date, client,
        time_resp, id_area_service) VALUES (%s, %s, %s, %s, %s, %s)"""
    for r in range(1, sheet.nrows):
        id_ops = sheet.cell(r,0).value
        cif_store = sheet.cell(r,1).value
        date = sheet.cell(r,2).value
        client = sheet.cell(r,3).value
        time_resp = sheet.cell(r,4).value
        id_area_service = sheet.cell(r,5).value
        values = (id_ops, cif_store, date, client, time_resp, id_area_service)
        cursor.execute(query, values)
    # Close the cursor
    cursor.close()
    # Commit the transaction
    database.commit()
    # Close the database connection
    database.close()
    # Print results
    print ("")
    print ("")
    columns = str(sheet.ncols)
    rows = str(sheet.nrows)
    print ("Imported", columns,"columns and", rows, "rows. All Done!")

insert()
What you are looking for is INSERT ... ON DUPLICATE KEY UPDATE ...
Take a look here https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html
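A minimal sketch of how that statement might be assembled for the operation table from the question, assuming id_ops is the primary key (or carries a unique index). The helper name build_upsert is hypothetical; it only produces the SQL text, which would then be passed to cursor.execute together with the row values. Newer MySQL 8.0 releases mark VALUES() in the update clause as deprecated in favor of row aliases, though it is still accepted.

```python
def build_upsert(table, columns, key_columns):
    """Return an INSERT ... ON DUPLICATE KEY UPDATE statement for MySQL.

    Hypothetical helper: table and column names are interpolated directly,
    so they must come from trusted code, never from user input.
    """
    col_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    # On a key collision, overwrite every non-key column with the new value.
    updates = ", ".join(
        "{0} = VALUES({0})".format(col)
        for col in columns
        if col not in key_columns
    )
    return (
        "INSERT INTO {} ({}) VALUES ({}) "
        "ON DUPLICATE KEY UPDATE {}"
    ).format(table, col_list, placeholders, updates)


query = build_upsert(
    "operation",
    ["id_ops", "cif_store", "date", "client", "time_resp", "id_area_service"],
    key_columns=["id_ops"],
)
print(query)
```

The loop in the question would stay the same: cursor.execute(query, values) inserts a new row, or updates the existing one when id_ops is already present.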
Regarding the extra data: if it's a static value for all rows, you can hard-code it right into the INSERT query. If it's dynamic, you'll have to write some additional logic.
For example:
query = """INSERT INTO operation (id_ops, hard_coded_value, cif_store, date, client,
    time_resp, id_area_service) VALUES (%s, 'my hard coded value', %s, %s, %s, %s, %s)"""
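The store lookup can probably be folded into the same statement with INSERT ... SELECT. The sketch below assumes the store table has the columns store_id and store_cif mentioned in the question, and that operation has columns named id_supplier and id_store; those two names, the SUPPLIER_ID constant, and the sample row are guesses for illustration only.

```python
SUPPLIER_ID = 7  # hypothetical constant; the value is not in the spreadsheet

# Column names id_supplier and id_store are assumptions about the schema.
INSERT_WITH_LOOKUP = """
INSERT INTO operation
    (id_ops, id_supplier, cif_store, id_store, date, client,
     time_resp, id_area_service)
SELECT %s, %s, s.store_cif, s.store_id, %s, %s, %s, %s
FROM store AS s
WHERE s.store_cif = %s
"""


def row_params(row, supplier_id=SUPPLIER_ID):
    """Order one spreadsheet row to match the placeholders above."""
    id_ops, cif_store, date, client, time_resp, id_area_service = row
    return (id_ops, supplier_id, date, client, time_resp,
            id_area_service, cif_store)


params = row_params((101, "B12345678", "2019-05-01", "ACME", 48, 3))
print(params)
```

Each row would then be sent with cursor.execute(INSERT_WITH_LOOKUP, row_params(values)). Note that a row whose cif_store has no match in store is silently skipped, because the SELECT returns nothing for it. An ON DUPLICATE KEY UPDATE clause can be appended after the WHERE clause if updates are needed as well.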
I am attempting to insert a value into my table: I select an ID by matching it with its name and then insert it into a second table. However, when I print the selected value, it prints [(1,)], which is a list. So when I try to insert the value into the table, I get an error saying that I'm trying to insert a list, when I just want the value 1.
The code is in Python and is shown below:
def createaudit():
    sitename2_info = sitename.get()
    print(sitename2_info, "testing2")
    name2_info = name2.get()
    print(name2_info)
    name3_info = name3.get()
    print(name3_info)
    # Sql code for writing the data that was written in the regsitering page.
    cursor = cnn.cursor()
    # the site query matches the inputted username with the corresponding userID and inserts the userID into userID_fk
    siteIDQuery = "SELECT siteID FROM Sites WHERE siteName = %s"
    cursor.execute(siteIDQuery, [sitename2_info])
    siteID_fetch = cursor.fetchall()
    print(siteID_fetch)
    sitequery = "INSERT INTO `audit`(`siteID_fk`, `auditor1`, `auditor2`) VALUES (%s, %s, %s)"
    sitequery_vals = (siteID_fetch, name2_info, name3_info)
    cursor.execute(sitequery, sitequery_vals)
    # prints how many rows were inserted to make sure values are put into the database
    print(cursor.rowcount)
    cnn.commit()
    cursor.close()
    cnn.close()
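fetchall() returns a list of row tuples, which is why [(1,)] is printed. One likely fix is to read a single row with fetchone() and take its first element. The sketch below uses sqlite3 only so that it runs without a server; sqlite3 uses ? placeholders where the MySQL drivers use %s, and the sample site and auditor names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Sites (siteID INTEGER PRIMARY KEY, siteName TEXT)")
cur.execute("CREATE TABLE audit (siteID_fk INTEGER, auditor1 TEXT, auditor2 TEXT)")
cur.execute("INSERT INTO Sites (siteID, siteName) VALUES (1, 'Depot')")

cur.execute("SELECT siteID FROM Sites WHERE siteName = ?", ("Depot",))
row = cur.fetchone()          # a single tuple such as (1,), or None
if row is None:
    raise LookupError("no site with that name")
site_id = row[0]              # unwrap the scalar value

cur.execute(
    "INSERT INTO audit (siteID_fk, auditor1, auditor2) VALUES (?, ?, ?)",
    (site_id, "Ann", "Bob"),
)
conn.commit()
inserted = cur.execute("SELECT * FROM audit").fetchall()
print(inserted)
```

If fetchall() is kept, the equivalent would be siteID_fetch[0][0], after checking that the list is not empty.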
I have a table in MySQL and I am inserting data into it from a Python client.
I am using an INSERT query to insert data into the table.
Code:
import mysql.connector

sql_insert_query = """ INSERT INTO Data
    (`deviceID`,`date`,`timestamp`,`counter`,`rssi`,
    `CO2 Sensor Value`,
    `Supply DPT`,
    `block`,
    `floor`)
    VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)"""
connection = mysql.connector.connect(host='localhost',database='minniedb',user='',
                                     password='',auth_plugin='mysql_native_password')
cursor = connection.cursor()
"""
create 'insert_tuple' based on some api calls
"""
cursor.execute(sql_insert_query,insert_tuple)
connection.commit()
cursor.close()
connection.close()
print('inserted in db')
This works fine when everything is static. In my case, though, the table has around 60-70 columns, and the parameters I get from the API are a subset of those columns (around 10-15) that can change every time. The API returns the column name and the value.
A sample return from the API can be of the form
{
    'deviceID': 20,
    'counter': 61,
    'block': 'A'
}
or it can be
{
    'deviceID': 25,
    'CO2 Sensor Value': 600,
    'floor': 5
}
How do I write a query in this case that inserts whatever data I received from the API into the respective columns and leaves the others as NULL?
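You can build the column list and the placeholders from the dictionary keys, and pass the values as parameters:
sensor_data = {
    'deviceID': 20,
    'counter': 61,
    'block': 'A'
}
# Backtick-quote the column names; bind the values instead of formatting them in
columns = ', '.join('`{}`'.format(name) for name in sensor_data)
placeholders = ', '.join(['%s'] * len(sensor_data))
sql_insert_query = "INSERT INTO Data ({}) VALUES ({})".format(columns, placeholders)
cursor.execute(sql_insert_query, tuple(sensor_data.values()))
P.S.: For sensor data I'd suggest using Google Firebase :)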
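Because column names cannot be bound as parameters, it is probably worth checking them against a known list before building the statement. A sketch under that assumption; build_insert and ALLOWED_COLUMNS are hypothetical names, and the whitelist here only contains the nine columns shown in the question.

```python
# Hypothetical whitelist: the real one would hold all 60-70 column names.
ALLOWED_COLUMNS = {
    "deviceID", "date", "timestamp", "counter", "rssi",
    "CO2 Sensor Value", "Supply DPT", "block", "floor",
}


def build_insert(table, data):
    """Return (sql, params) inserting only the columns present in data."""
    unknown = set(data) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError("unexpected columns: {}".format(sorted(unknown)))
    names = list(data)
    columns = ", ".join("`{}`".format(name) for name in names)
    placeholders = ", ".join(["%s"] * len(names))
    sql = "INSERT INTO `{}` ({}) VALUES ({})".format(table, columns, placeholders)
    return sql, tuple(data[name] for name in names)


sql, params = build_insert(
    "Data", {"deviceID": 25, "CO2 Sensor Value": 600, "floor": 5}
)
print(sql)
print(params)
```

The pair can be passed straight to cursor.execute(sql, params). Columns that are not named in the statement receive their default, which is NULL unless the table defines something else.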
I have thousands of related CSVs and I want to write their contents to a Postgres table in a way that includes metadata about where each row came from.
I am not clear on how to write the variables I created near the top of my script into the table.
Can anyone advise?
target_directory = Path(sys.argv[1]).resolve()
# FOR THE WAC AND RAC DATASETS
for file in target_directory.rglob('*.csv'):
    print(str(file.stem).split('_'))
    state = str(file.stem).split('_')[0]
    data_category = str(file.stem).split('_')[1]
    workforce_segment = str(file.stem).split('_')[2] # THIS IS DIFFERENT FROM THE O-D DATASETS
    job_type = str(file.stem).split('_')[3]
    year = str(file.stem).split('_')[4]
    print('Writing: ' + str(file.name))
    # MAKE SURE THIS IS THE RIGHT TABLE FOR THE FILES
    cur.execute(create_table_WAC)
    with open(file,'r') as file_in:
        # INSERT THE DATA IN USING THE COLUMN NAMES....SO YOU CAN ADD YOUR SPLIT STRING INFO ABOVE.....
        # MAKE SURE THIS HAS THE RIGHT TABLE NAME IN THE COPY STATEMENT
        cur.execute("INSERT INTO opendata_uscensus_usa_lodes_wac (serial_id, state_name, data_category, workforce_segment, job_type, year, w_geocode, C000, CA01, CA02, CA03, CE01, CE02) \
        VALUES (%s, state_name, data_category, workforce_segment, job_type, year, %s, %s, %s, %s, %s, %s)")
    conn.commit()
conn.close()
As per PEP 249 (the Python Database API Specification), which most DB-API drivers follow, including pymssql, cx_Oracle, ibm_db, pymysql, sqlite3, and pyodbc, the variables to be bound as query parameters in psycopg2 go into the second argument of cur.execute(query, params).
Specifically, combine your file-level variables with the CSV values during iteration and pass them as a list or tuple of parameters to the execute call. The code below uses csv.DictReader, which builds a dictionary from every row of CSV data.
NOTE: the query below leaves out the primary key, serial_id, which should be populated by a sequence in the Postgres table.
import csv

for file in target_directory.rglob('*.csv'):
    print(str(file.stem).split('_'))
    # FILE LEVEL VARIABLES
    state_name = str(file.stem).split('_')[0]
    data_category = str(file.stem).split('_')[1]
    workforce_segment = str(file.stem).split('_')[2]
    job_type = str(file.stem).split('_')[3]
    year = str(file.stem).split('_')[4]
    # PARAMETERIZED QUERY
    sql = """INSERT INTO opendata_uscensus_usa_lodes_wac
             (state_name, data_category, workforce_segment,
              job_type, year, w_geocode, C000, CA01, CA02, CA03, CE01, CE02)
             VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"""
    with open(file,'r') as file_in:
        # ITERATE THROUGH FOR CSV VARIABLES
        reader = csv.DictReader(file_in)
        for row in reader:
            cur.execute(sql, (state_name,data_category,workforce_segment,job_type,year,
                              row['w_geocode'], row['C000'], row['CA01'],
                              row['CA02'], row['CA03'], row['CE01'], row['CE02'])
                        )
    conn.commit()
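With thousands of files, it may be worth collecting the parameter tuples per file and sending them in one call. The sketch below shows only the part that can run without a database: splitting the file name once and prefixing every CSV row with that metadata. The function names, the sample file name, and the sample values are assumptions for illustration.

```python
import csv
import io
from pathlib import Path


def file_metadata(path):
    """Split a stem like 'ca_wac_S000_JT00_2015' into its five parts."""
    state_name, data_category, workforce_segment, job_type, year = (
        Path(path).stem.split("_")
    )
    return (state_name, data_category, workforce_segment, job_type, year)


def rows_with_metadata(path, file_in, columns):
    """Yield one parameter tuple per CSV row, prefixed with file metadata."""
    meta = file_metadata(path)
    for row in csv.DictReader(file_in):
        yield meta + tuple(row[col] for col in columns)


# In-memory stand-in for a real CSV file; the values are made up.
sample = io.StringIO("w_geocode,C000,CA01\n060010001,12,3\n060010002,8,1\n")
params = list(
    rows_with_metadata("ca_wac_S000_JT00_2015.csv", sample,
                       ["w_geocode", "C000", "CA01"])
)
print(params)
```

The resulting list could be passed to cur.executemany(sql, params) with a matching number of placeholders. A file name that does not split into exactly five parts raises ValueError here, which makes mislabeled files visible early.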
I have a MySQL table named TBLTEST with two columns, ID and qSQL. Each qSQL value holds a SQL query.
I have another table, FACTRESTTBL.
There are 10 rows in TBLTEST.
For example, in TBLTEST let's take id = 4 and qSQL = "select id, city, state from ABC".
How can I insert into FACTRESTTBL from TBLTEST using Python, maybe using a dictionary?
Thx!
You can use MySQLdb for Python.
Sample code (you'll need to debug it as I have no way of running it here):
#!/usr/bin/python
import MySQLdb

# Open database connection
db = MySQLdb.connect("localhost","testuser","test123","TESTDB" )
# prepare a cursor object using cursor() method
cursor = db.cursor()
# Select qSQL with id=4.
cursor.execute("SELECT qSQL FROM TBLTEST WHERE id = 4")
# Fetch a single row using fetchone() method.
results = cursor.fetchone()
qSQL = results[0]
cursor.execute(qSQL)
# Fetch all the rows in a list of lists.
qSQLresults = cursor.fetchall()
for row in qSQLresults:
    id = row[0]
    city = row[1]
    #SQL query to INSERT a record into the table FACTRESTTBL.
    cursor.execute('''INSERT into FACTRESTTBL (id, city)
                      values (%s, %s)''',
                   (id, city))
# Commit your changes in the database
db.commit()
# disconnect from server
db.close()
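To cover every row of TBLTEST rather than only id = 4, the same idea can be put in a loop. The sketch below uses sqlite3 and made-up sample rows purely so that it runs without a server; with MySQLdb the placeholders would be %s instead of ?. It assumes every stored query returns id and city as its first two columns, and that the contents of TBLTEST are trusted, since the stored SQL is executed as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE ABC (id INTEGER, city TEXT, state TEXT);
    CREATE TABLE TBLTEST (ID INTEGER PRIMARY KEY, qSQL TEXT);
    CREATE TABLE FACTRESTTBL (id INTEGER, city TEXT);
    INSERT INTO ABC VALUES (1, 'Austin', 'TX'), (2, 'Boise', 'ID');
    INSERT INTO TBLTEST VALUES (4, 'select id, city, state from ABC');
""")

# Read every stored query first, then run them one by one.
stored = cur.execute("SELECT ID, qSQL FROM TBLTEST ORDER BY ID").fetchall()
for query_id, q_sql in stored:
    rows = cur.execute(q_sql).fetchall()
    # Keep only the first two columns (id, city) of each result row.
    cur.executemany(
        "INSERT INTO FACTRESTTBL (id, city) VALUES (?, ?)",
        [(row[0], row[1]) for row in rows],
    )
conn.commit()

result = cur.execute("SELECT id, city FROM FACTRESTTBL ORDER BY id").fetchall()
print(result)
```

A dictionary is not strictly needed here; if the stored queries return different columns, a mapping from query ID to the target column list would be one way to handle that.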
I need to take data from a CSV file and import it into two MySQL tables within the same database.
CSV file:
username,password,path
FP_Baby,7tO0Oj/QjRSSs16,FP_Baby
lukebryan,uu00U62SKhO.sgE,lukebryan
saul,r320QdyLJEXKEsQ,saul
jencarlos,LOO07D5ZxpyzMAg,jencarlos
abepark,HUo0/XGUeJ28jaA,abepark
From the CSV file
username and password go into the USERS table
path goes into VFS_PERMISSIONS table
The USERS table looks like
INSERT INTO `USERS` (`userid`, `username`, `password`, `server_group`) VALUES
(23, 'username', 'password', 'MainUsers'),
INSERT INTO `VFS_PERMISSIONS` (`userid`, `path`, `privs`) VALUES
(23, '/path/', '(read)(write)(view)(delete)(resume)(share)(slideshow)(rename)(makedir)(deletedir)'),
If possible, I'd like to start the userid in both tables at 24 and increment it by 1 for each row in the CSV.
So far I can read the CSV file, but I can't figure out how to insert into the two MySQL tables.
#!/usr/bin/env python
import csv
import sys
import MySQLdb

conn = MySQLdb.connect(host= "localhost",
                       user="crushlb",
                       passwd="password",
                       db="crushlb")
x = conn.cursor()
f = open(sys.argv[1], 'rt')
try:
    reader = csv.reader(f)
    for row in reader:
        ## mysql stuff goes here right?
        pass
finally:
    f.close()
You can reduce the number of calls to cursor.execute by preparing the arguments in advance (in the loop), and calling cursor.executemany after the loop has completed:
cursor = conn.cursor()
user_args = []
perm_args = []
perms = '(read)(write)(view)(delete)(resume)(share)(slideshow)(rename)(makedir)(deletedir)'
with open(sys.argv[1], 'rt') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row (username,password,path)
    for userid, row in enumerate(reader, start=24):
        username, password, path = row
        user_args.append((userid, username, password, 'MainUsers'))
        perm_args.append((userid, path, perms))
insert_users = '''
    INSERT IGNORE INTO `USERS`
    (`userid`, `username`, `password`, `server_group`)
    VALUES (%s, %s, %s, %s)
    '''
insert_vfs_permissions = '''
    INSERT IGNORE INTO `VFS_PERMISSIONS`
    (`userid`, `path`, `privs`)
    VALUES (%s, %s, %s)
    '''
cursor.executemany(insert_users, user_args)
cursor.executemany(insert_vfs_permissions, perm_args)
conn.commit()  # MySQLdb does not autocommit by default
INSERT IGNORE tells MySQL to try to insert rows into the table, but to skip any row that causes a conflict. For example, if userid is the PRIMARY KEY and there is already a row with the same userid, INSERT IGNORE will not insert the new row, since that would create two rows with the same PRIMARY KEY.
Without IGNORE, a conflicting row would make cursor.executemany raise an exception and fail to insert any rows.
I used INSERT IGNORE so you can run the code more than once without cursor.executemany raising an exception.
There is also an INSERT ... ON DUPLICATE KEY UPDATE command, which tells MySQL to try to insert a row but update the existing one if there is a conflict. I'll leave it at this unless you want to know more about ON DUPLICATE KEY.
Since you already know the SQL statements that you want to execute, it should be more or less straightforward to use the cursor.execute method:
offset = 24  # first userid to assign
next(reader)  # skip the header row (username,password,path)
for row_number, row in enumerate(reader):
    username, password, path = row
    x.execute("INSERT INTO `USERS` (`userid`, `username`, `password`, `server_group`) "
              "VALUES (%s, %s, %s, 'MainUsers')", (row_number+offset, username, password))
    x.execute("INSERT INTO `VFS_PERMISSIONS` (`userid`, `path`, `privs`) "
              "VALUES (%s, %s, '(read)(write)(view)(delete)(resume)(share)(slideshow)(rename)(makedir)(deletedir)')", (row_number+offset, path))
conn.commit()  # persist both inserts