CSV data to two MySQL tables using Python

I need to take data from a csv file and import it into two mysql tables within the same database.
CSV file:
username,password,path
FP_Baby,7tO0Oj/QjRSSs16,FP_Baby
lukebryan,uu00U62SKhO.sgE,lukebryan
saul,r320QdyLJEXKEsQ,saul
jencarlos,LOO07D5ZxpyzMAg,jencarlos
abepark,HUo0/XGUeJ28jaA,abepark
From the CSV file:
username and password go into the USERS table
path goes into the VFS_PERMISSIONS table
The existing inserts for the two tables look like:
INSERT INTO `USERS` (`userid`, `username`, `password`, `server_group`) VALUES
(23, 'username', 'password', 'MainUsers'),
INSERT INTO `VFS_PERMISSIONS` (`userid`, `path`, `privs`) VALUES
(23, '/path/', '(read)(write)(view)(delete)(resume)(share)(slideshow)(rename)(makedir)(deletedir)'),
If possible I'd like to start the userid in both tables at 24 and increment it by 1 for each row in the CSV.
So far I can read the CSV file, but I can't figure out how to insert into two MySQL tables.
#!/usr/bin/env python
import csv
import sys
import MySQLdb

conn = MySQLdb.connect(host="localhost",
                       user="crushlb",
                       passwd="password",
                       db="crushlb")
x = conn.cursor()

f = open(sys.argv[1], 'rt')
try:
    reader = csv.reader(f)
    for row in reader:
        pass  ## mysql stuff goes here right?
finally:
    f.close()

You can reduce the number of calls to cursor.execute by preparing the arguments in advance (in the loop), and calling cursor.executemany after the loop has completed:
cursor = conn.cursor()

user_args = []
perm_args = []
perms = '(read)(write)(view)(delete)(resume)(share)(slideshow)(rename)(makedir)(deletedir)'

with open(sys.argv[1], 'rt') as f:
    for id, row in enumerate(csv.reader(f), start=24):
        username, password, path = row
        user_args.append((id, username, password, 'MainUsers'))
        perm_args.append((id, path, perms))

insert_users = '''
    INSERT IGNORE INTO `USERS`
    (`userid`, `username`, `password`, `server_group`)
    VALUES (%s, %s, %s, %s)
'''
insert_vfs_permissions = '''
    INSERT IGNORE INTO `VFS_PERMISSIONS`
    (`userid`, `path`, `privs`)
    VALUES (%s, %s, %s)
'''

cursor.executemany(insert_users, user_args)
cursor.executemany(insert_vfs_permissions, perm_args)
conn.commit()  # MySQLdb does not autocommit by default, so persist both batches
INSERT IGNORE tells MySQL to try to insert rows into the MySQL table, but ignore the command if there is a conflict. For example, if userid is the PRIMARY KEY, and there is already a row with the same userid, then the INSERT IGNORE SQL will ignore the command to insert a new row since that would create two rows with the same PRIMARY KEY.
Without the IGNORE, the cursor.executemany command would raise an exception and fail to insert any rows.
I used INSERT IGNORE so you can run the code more than once without cursor.executemany raising an exception.
There is also an INSERT ... ON DUPLICATE KEY UPDATE command, which tells MySQL to try to insert a row but update it if there is a conflict; I'll leave it at this unless you want to know more about ON DUPLICATE KEY.
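For reference, a minimal sketch of the ON DUPLICATE KEY UPDATE form for the USERS table (assuming userid is the PRIMARY KEY and you want conflicting rows refreshed rather than skipped):

upsert_users = '''
    INSERT INTO `USERS`
    (`userid`, `username`, `password`, `server_group`)
    VALUES (%s, %s, %s, %s)
    ON DUPLICATE KEY UPDATE
    `username` = VALUES(`username`),
    `password` = VALUES(`password`)
'''
cursor.executemany(upsert_users, user_args)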

Since you already know the SQL statements that you want to execute, it should be more or less straightforward to use the cursor.execute method:
offset = 24  # the question asks for userids to start at 24
for row_number, row in enumerate(reader):
    username, password, path = row
    x.execute("INSERT INTO `USERS` (`userid`, `username`, `password`, `server_group`) "
              "VALUES (%s, %s, %s, 'MainUsers')",
              (row_number + offset, username, password))
    x.execute("INSERT INTO `VFS_PERMISSIONS` (`userid`, `path`, `privs`) "
              "VALUES (%s, %s, '(read)(write)(view)(delete)(resume)(share)(slideshow)(rename)(makedir)(deletedir)')",
              (row_number + offset, path))
conn.commit()  # persist the inserts

Related

Importing data from CSV to MySQL using python

I am currently working on a school project, and I'm trying to import data from a CSV file to MySQL using Python. This is my code so far:
import mysql.connector
import csv

mydb = mysql.connector.connect(host='127.0.0.1', user='root', password='abc123!', db='jd_university')
cursor = mydb.cursor()

with open('C:/Users/xxxxxx/Downloads/Students.csv') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    for row in reader:
        cursor.execute('INSERT INTO Student (First_Name, Last_Name, DOB, Username, Password, Phone_nr,'
                       'Email, StreetName_nr, ZIP) '
                       'VALUES("%s", "%s", "%s", "%s", "%s", "%s", "%s", "%s", "%s")',
                       row)
mydb.commit()
cursor.close()
When I run this, I get this error: "mysql.connector.errors.DataError: 1292 (22007): Incorrect date value: '%s' for column 'DOB' at row 1"
The date format used in the CSV file is yyyy-mm-dd.
Any tips on this would help greatly!
You don't need to quote the %s placeholders.
Since you're using DictReader, you will need to name the columns in your row expression (or not use DictReader and hope for the correct order, which I'd not do).
Try this:
import mysql.connector
import csv

mydb = mysql.connector.connect(
    host="127.0.0.1", user="root", password="abc123!", db="jd_university"
)
cursor = mydb.cursor()

with open("C:/Users/xxxxxx/Downloads/Students.csv") as csvfile:
    reader = csv.DictReader(csvfile, delimiter=",")
    for row in reader:
        values = [
            row["First_Name"],
            row["Last_Name"],
            row["DOB"],
            row["Username"],
            row["Password"],
            row["Phone_nr"],
            row["Email"],
            row["StreetName_nr"],
            row["ZIP"],
        ]
        cursor.execute(
            "INSERT INTO Student (First_Name, Last_Name, DOB, Username, Password, Phone_nr,"
            "Email, StreetName_nr, ZIP) "
            "VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)",
            values,
        )
mydb.commit()
cursor.close()
Validate the datatype of the DOB field in your data file and the database column. It could be a data issue or a table definition issue.
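On the data side, a quick way to check is to parse each DOB before attempting the insert. A sketch assuming the file and column names from the question:

import csv
from datetime import datetime

with open("C:/Users/xxxxxx/Downloads/Students.csv") as csvfile:
    reader = csv.DictReader(csvfile, delimiter=",")
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            datetime.strptime(row["DOB"], "%Y-%m-%d")  # expects yyyy-mm-dd
        except ValueError:
            print("Bad DOB on line", line_no, ":", repr(row["DOB"]))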

MySQL Insert / update from python, data from excel spreadsheet

Upsert to MySQL using Python and data from Excel.
I'm working on populating a MySQL DB using Python.
The data is stored in Excel sheets.
Because the DB is supposed to be used for monitoring "projects", there's a possibility of repeated PKs, in which case the row needs to be updated instead of inserted, because a project can have many stages.
Also, there's a value to be inserted in the DB table that can't be taken from the spreadsheet. So I'm wondering whether the insert of this value must be done with a separate query, or if there's a way to insert it in the same query. The value is the supplier ID and needs to be inserted between id_ops and cif_store.
And to finish, I need to perform an inner join to import the store_id using the store_cif, from another table called store. I know how to do it, but I'm wondering if it also must be executed as a separate query or can be performed in the same one.
So far, I have done this:
import xlrd
import MySQLdb

def insert():
    book = xlrd.open_workbook(r"C:\Users\DevEnviroment\Desktop\OPERACIONES.xlsx")
    sheet = book.sheet_by_name("Sheet1")

    database = MySQLdb.connect(host="localhost", user="pytest", passwd="password", db="opstest1")
    cursor = database.cursor()

    query = """INSERT INTO operation (id_ops, cif_store, date, client,
               time_resp, id_area_service) VALUES (%s, %s, %s, %s, %s, %s)"""

    for r in range(1, sheet.nrows):
        id_ops = sheet.cell(r, 0).value
        cif_store = sheet.cell(r, 1).value
        date = sheet.cell(r, 2).value
        client = sheet.cell(r, 3).value
        time_resp = sheet.cell(r, 4).value
        id_area_service = sheet.cell(r, 5).value

        values = (id_ops, cif_store, date, client, time_resp, id_area_service)
        cursor.execute(query, values)

    # Close the cursor
    cursor.close()

    # Commit the transaction
    database.commit()

    # Close the database connection
    database.close()

    # Print results
    print("")
    print("")
    columns = str(sheet.ncols)
    rows = str(sheet.nrows)
    print("Imported", columns, "columns and", rows, "rows. All Done!")

insert()
What you are looking for is INSERT ... ON DUPLICATE KEY UPDATE ...
Take a look here https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html
Regarding the extraneous data, if it's a static value for all rows you can just hard-code it right into the INSERT query. If it's dynamic you'll have to write some additional logic.
For example:
query = """INSERT INTO operation (id_ops, hard_coded_value, cif_store, date, client,
time_resp, id_area_service) VALUES (%s, "my hard coded value", %s, %s, %s, %s, %s)"""
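And a sketch of the upsert itself, assuming id_ops is the primary key and the remaining columns should be refreshed when a later stage of the same project is loaded:

query = """INSERT INTO operation (id_ops, cif_store, date, client,
           time_resp, id_area_service) VALUES (%s, %s, %s, %s, %s, %s)
           ON DUPLICATE KEY UPDATE
           cif_store = VALUES(cif_store), date = VALUES(date),
           client = VALUES(client), time_resp = VALUES(time_resp),
           id_area_service = VALUES(id_area_service)"""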

Python: Insert data to database from CSV and then selecting a generated UUID from the table

I have an Excel sheet that is to be inserted into a database. I wrote a Python script, which takes an Excel file, converts it into a CSV, and then inserts it into the database.
The problem is that the database contains two tables, where one of them has a unique ID which is auto-generated and set when the data is inserted into the table. The other table uses this as a foreign key.
This is how my tables are created:
create table table (
    id uuid DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    foo1 varchar(255),
    foo2 varchar(255),
    foo3 varchar(255),
    foo4 varchar(255)
);

create table another_table (
    id uuid PRIMARY KEY references table (id),
    foo1 varchar(255),
    foo2 varchar(255)
);
This is the code I use to insert the data into the database:
with open(csv_file, 'rb') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_NONE)
    next(reader)
    for row in reader:
        cur.execute(
            "INSERT INTO table (foo1, foo2, foo3, foo4) VALUES (%s, %s, %s, %s); "
            "INSERT INTO another_table (foo1, foo2) VALUES (%s, %s)",
            row
        )
conn.commit()
This will insert data into the database, but the ID field in another_table will be empty. Does anyone know how I can acquire this ID and put it into the second table?
I was able to solve this myself without many tweaks to my code. I had to solve another problem first: several values in the CSV file were null values, but converting to CSV made them look like empty strings instead. By using pandas I was able to set all null values to the string "None" during the Excel-to-CSV conversion (a sketch of that step follows the code below), and afterwards I clean each row before inserting it into the database:
with open(csv_file, 'rb') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_NONE)
    next(reader)
    for row in reader:
        clean_row = []
        for x in row:
            if x == "None":
                clean_row.append(None)
            else:
                clean_row.append(x)
        cur.execute(
            "INSERT INTO table (foo1, foo2, foo3, foo4) VALUES (%s, %s, %s, %s); "
            "INSERT INTO another_table (foo1, foo2) VALUES (%s, %s)",
            clean_row
        )
conn.commit()
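The pandas step mentioned above might look roughly like this (excel_file and csv_file are placeholder paths, and the blanket fillna is an assumption about the data):

import pandas as pd

# Hypothetical conversion step: read the Excel sheet and write a CSV where
# missing cells become the literal string "None" instead of empty strings
df = pd.read_excel(excel_file)   # excel_file: assumed path to the source workbook
df = df.fillna("None")
df.to_csv(csv_file, index=False)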
The values from the CSV are now put into a list which I can use in my query to ask table for its id, like this:
with open(csv_file, 'rb') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_NONE)
    next(reader)
    for row in reader:
        clean_row = []
        for x in row:
            if x == "None":
                clean_row.append(None)
            else:
                clean_row.append(x)
        cur.execute(
            "INSERT INTO table (foo1, foo2, foo3, foo4) VALUES (%s, %s, %s, %s); "
            "INSERT INTO another_table (foo1, foo2, id) VALUES (%s, %s, "
            "(SELECT id FROM table WHERE foo1 = '" + clean_row[0] + "' AND foo2 = '" + clean_row[1] + "'))",
            clean_row
        )
conn.commit()
This will acquire the ID and put it into another_table, and can be done as long as you have unique values in table.
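Worth noting: since this is Postgres, the lookup can be skipped entirely by asking the first INSERT to hand back the generated id with a RETURNING clause, which psycopg2 reads back with fetchone. A sketch with the same placeholder table names (the clean_row slicing is an assumption about the column order):

for row in reader:
    clean_row = [None if x == "None" else x for x in row]
    cur.execute(
        "INSERT INTO table (foo1, foo2, foo3, foo4) "
        "VALUES (%s, %s, %s, %s) RETURNING id",
        clean_row[:4]
    )
    new_id = cur.fetchone()[0]  # the uuid generated by the first insert
    cur.execute(
        "INSERT INTO another_table (foo1, foo2, id) VALUES (%s, %s, %s)",
        (clean_row[4], clean_row[5], new_id)   # assumed positions of the remaining columns
    )
conn.commit()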

Using external variables in psycopg2 / postgres command

I have thousands of related CSVs and I want to write their contents to a Postgres table in a way that includes metadata about where each row came from.
I am not clear on how to write the variables I created near the top of my script into the table.
Can anyone advise?
target_directory = Path(sys.argv[1]).resolve()

# FOR THE WAC AND RAC DATASETS
for file in target_directory.rglob('*.csv'):
    print(str(file.stem).split('_'))
    state = str(file.stem).split('_')[0]
    data_category = str(file.stem).split('_')[1]
    workforce_segment = str(file.stem).split('_')[2]  # THIS IS DIFFERENT FROM THE O-D DATASETS
    job_type = str(file.stem).split('_')[3]
    year = str(file.stem).split('_')[4]

    print('Writing: ' + str(file.name))

    # MAKE SURE THIS IS THE RIGHT TABLE FOR THE FILES
    cur.execute(create_table_WAC)

    with open(file, 'r') as file_in:
        # INSERT THE DATA IN USING THE COLUMN NAMES....SO YOU CAN ADD YOUR SPLIT STRING INFO ABOVE.....
        # MAKE SURE THIS HAS THE RIGHT TABLE NAME IN THE COPY STATEMENT
        cur.execute("INSERT INTO opendata_uscensus_usa_lodes_wac (serial_id, state_name, data_category, workforce_segment, job_type, year, w_geocode, C000, CA01, CA02, CA03, CE01, CE02) \
                     VALUES (%s, state_name, data_category, workforce_segment, job_type, year, %s, %s, %s, %s, %s, %s)")
        conn.commit()

conn.close()
As per PEP 249 (the Python Database API Specification), which most DB-APIs adhere to, including pymssql, cx_Oracle, ibm_db, pymysql, sqlite3, and pyodbc, in psycopg2 the variables to be bound as parameters in a prepared statement go into the second argument of cur.execute(query, params).
Specifically, combine your file-level variables with the CSV variables during iteration and pass them as a list or tuple of parameters to the execute call. Below uses the csv.DictReader method, which builds a dictionary from every row of the CSV data.
NOTE: the query below leaves out the primary key, serial_id, which should be populated via a sequence in the Postgres table.
for file in target_directory.rglob('*.csv'):
    print(str(file.stem).split('_'))

    # FILE LEVEL VARIABLES
    state_name = str(file.stem).split('_')[0]
    data_category = str(file.stem).split('_')[1]
    workforce_segment = str(file.stem).split('_')[2]
    job_type = str(file.stem).split('_')[3]
    year = str(file.stem).split('_')[4]

    # PREPARED STATEMENT
    sql = """INSERT INTO opendata_uscensus_usa_lodes_wac
             (state_name, data_category, workforce_segment,
              job_type, year, w_geocode, C000, CA01, CA02, CA03, CE01, CE02)
             VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"""

    with open(file, 'r') as file_in:
        # ITERATE THROUGH FOR CSV VARIABLES
        reader = csv.DictReader(file_in)
        for row in reader:
            cur.execute(sql, (state_name, data_category, workforce_segment, job_type, year,
                              row['w_geocode'], row['C000'], row['CA01'],
                              row['CA02'], row['CA03'], row['CE01'], row['CE02']))

conn.commit()

python dictionary mysql not insert

I obtained a dictionary 'p' from the following code, but I cannot manage to insert it into the MySQL database. Please help me insert the data into the database.
The dictionary is: [('Casssandraw', 'Cooking'), ('Archanea', 'Playing'), ('Adarshan', 'Programming'), ('Leelal', 'Baking')]
It should be stored in the Names and Hobby fields:
Name         Hobby
Cassandraw   Cooking
Archanea     Playing
...          ...
Program:
import MySQLdb
import re

db = MySQLdb.connect(host="localhost",  # your host, usually localhost
                     user="root",       # your username
                     passwd="mysql",    # your password
                     db="sakila")

with open('qwer2.txt', 'r') as file, db as cursor:
    f = open('qwer2.txt', 'r')
    lines = f.readlines()
    for x in lines:
        p = re.findall(r'(?:name is|me)\s+(\w+).*?(?:interest|hobby)\s+is\s+(\w+)', x, re.I)
        print p
        cursor.execute(
            '''INSERT INTO Details (Names, Hobby)
               VALUES (%s, %s)''',
            (name, hobby))  # <- do not know what to provide
    db.commit()
It looks like you have a list of tuples containing name/hobby, not a dict.
You can unpack the two and insert:
for name, hobby in p:  # p is the list you posted in your question
    cursor.execute(
        '''INSERT INTO Details (Names, Hobby)
           VALUES (%s, %s)''',
        (name, hobby))

Unpacking p gives exactly the pairs you listed:
for name, hobby in p:
    print name, hobby

Casssandraw Cooking
Archanea Playing
Adarshan Programming
Leelal Baking
