cx_oracle reading from CSV - python

I have a cx_Oracle connection and I want to run a 'batch' of sorts that looks up IDs by the first and last names in a CSV file. With the code below I get the error cx_Oracle.DatabaseError: ORA-01756: quoted string not properly terminated.
It is pointing to the line
and spriden_change_ind is null'''.format(lname,fname)
However I know this is working as you will see my commented code uses the format in this way and it works just fine. The rows_to_dict_list is a nice function I found here sometime ago to basically add the column names to the output.
Any direction would be nice! thank you
import csv, cx_Oracle

def rows_to_dict_list(cursor):
    columns = [i[0] for i in cursor.description]
    new_list = []
    for row in cursor:
        row_dict = dict()
        for col in columns:
            row_dict[col] = row[columns.index(col)]
        new_list.append(row_dict)
    return new_list

connection = cx_Oracle.connect('USERNAME','PASSWORD','HOSTNAME:PORTNUMBER/SERVICEID')
cur = connection.cursor()
printHeader = True
with open('nopnumber_names.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        lname = row['Last']
        fname = row['First']
        cur.execute('''select spriden_pidm as PIDM,
                              spriden_last_name as Last,
                              spriden_first_name as First,
                              spriden_mi as Middle,
                              spriden_ID as ID
                       from spriden
                       where upper(spriden_last_name) = '{0}'
                       and upper(spriden_first_name) = '{1}'
                       and spriden_change_ind is null'''.format(lname,fname)
                    )
        # THIS RECORD RUNS FINE
        # cur.execute('''select spriden_pidm as PIDM,
        #                       spriden_ID as ID,
        #                       spriden_last_name as Last,
        #                       spriden_first_name as First
        #                from spriden
        #                where spriden_pidm = '{}'
        #                and spriden_change_ind is null'''.format(99999)
        #             )
        data = rows_to_dict_list(cur)
        for row in data:
            print row
cur.close()
connection.close()

My best guess is that a first name or surname somewhere in your CSV file has a ' character in it.
You really shouldn't be building SQL by concatenating strings or using string formatting. You are at risk of SQL injection if you do so. What happens if someone has put a record in your CSV file with surname X' OR 1=1 --?
Instead, use bind parameters to send the values of your variables to the database. Try the following:
cur.execute('''select spriden_pidm as PIDM,
spriden_last_name as Last,
spriden_first_name as First,
spriden_mi as Middle,
spriden_ID as ID
from spriden
where upper(spriden_last_name) = :lname
and upper(spriden_first_name) = :fname
and spriden_change_ind is null''',
{"lname": lname, "fname": fname}
)
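To see why the apostrophe matters, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for cx_Oracle, purely so it runs without an Oracle server. The `people` table and its contents are hypothetical. sqlite3 accepts the same `:name` placeholder style with a dict of values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for spriden.
conn.execute("CREATE TABLE people (last_name TEXT, first_name TEXT)")
conn.execute("INSERT INTO people VALUES ('O''BRIEN', 'PAT')")

lname, fname = "O'Brien", "Pat"  # values as they might appear in the CSV

# String formatting: the apostrophe in the name ends the SQL literal early,
# so the statement no longer parses.
formatted_sql = "SELECT * FROM people WHERE last_name = '{0}'".format(lname.upper())
try:
    conn.execute(formatted_sql)
    formatted_ok = True
except sqlite3.OperationalError:
    formatted_ok = False

# Bind parameters: the value is sent separately from the SQL text,
# so no quoting or escaping is involved.
bound_rows = conn.execute(
    "SELECT last_name, first_name FROM people "
    "WHERE last_name = :lname AND first_name = :fname",
    {"lname": lname.upper(), "fname": fname.upper()},
).fetchall()
conn.close()
```

Note that the sketch upper-cases the bound values. The query in the question compares upper(column) against the CSV value as-is, so mixed-case names in the CSV would need the same treatment to match.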

Related

SQLITE3: Find which one are the double records when importing from excel

I'm using Python to import information from Excel into my sqlite3 database. I need to avoid adding the same record twice, but I also need to know which records were added successfully and which ones were already present in the database. This is how I'm importing with pandas:
def import_file(self):  # renamed: "import" is a reserved word in Python
    file = filedialog.askopenfilename(filetypes=(("Excel File", "*.csv"), ("All Files","*.*")), title="Select Excel")
    with open(file, "r", encoding='utf-8') as f:
        conn = sqlite3.connect('database/save.db')
        double = []
        ok = []
        c = conn.cursor()
        c.execute("SELECT rowid, * FROM record")
        r = c.fetchall()
        imported = pd.read_csv(f, keep_default_na=False, sep=';')
        imported.to_sql('record', conn, if_exists='replace', index = False)
        for row in r:
            if row in imported:
                double.append(row)
            else:
                ok.append(row)
        conn.commit()
        conn.close()
Using the for loop I'm able to save the duplicate rows to a list, but the list of correctly inserted rows is always empty.
Thanks for the help!
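One possible approach, sketched here under assumptions rather than taken from the question: let SQLite itself reject duplicates through a UNIQUE constraint, insert row by row, and sort each row into one of the two lists depending on whether the insert succeeds. The column names `code` and `name` and the sample rows are hypothetical, and the rows are given as plain tuples instead of being read from the CSV.

```python
import sqlite3

def import_rows(conn, rows):
    """Insert rows one by one; return (inserted, duplicates)."""
    inserted, duplicates = [], []
    for row in rows:
        try:
            conn.execute("INSERT INTO record (code, name) VALUES (?, ?)", row)
            inserted.append(row)
        except sqlite3.IntegrityError:
            # The UNIQUE constraint rejected a row that is already stored.
            duplicates.append(row)
    conn.commit()
    return inserted, duplicates

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a row counts as a duplicate when both columns match.
conn.execute("CREATE TABLE record (code TEXT, name TEXT, UNIQUE (code, name))")
conn.execute("INSERT INTO record VALUES ('A1', 'Alice')")

ok, double = import_rows(conn, [("A1", "Alice"), ("B2", "Bob")])
conn.close()
```

This keeps the existing table intact, whereas to_sql with if_exists='replace' drops and recreates it on every import.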

Pandas read_table columns with multiple lines

I am working with a text file (ClassTest.txt) and pandas. The text file has 3, tab-separated columns: Title, Description, and Category - Title and Description are normal strings and Category is a (non-zero) integer.
I was gathering the data as follows:
data = pd.read_table('ClassTest.txt')
feature_names = ['Title', 'Description']
X = data[feature_names]
y = data['Category']
However, because values in the Description column can themselves contain newlines, the parsed data ends up with too many rows: most Description values span multiple lines. I tried to get around this by regenerating the file with '|' as the record terminator instead of a newline, and reading it with:
data = pd.read_table('ClassTest.txt', lineterminator='|')
X = data[feature_names]
y = data['Category']
This time, I get the error:
pandas.errors.ParserError: Error tokenizing data. C error: Expected 3 fields in line 20, saw 5
Can anyone help me with this issue?
EDIT: Adding previous code
con = lite.connect('JobDetails.db')
cur = con.cursor()
cur.execute('''SELECT Title, Description, Category FROM ReviewJobs''')
results = [list(each) for each in cur.fetchall()]
cur.execute('''SELECT Title, Description, Category FROM Jobs''')
for each in cur.fetchall():
    results.append(list(each))
a = open('ClassTest.txt', 'ab')
newLine = "|"
a.write(u''.join(c for c in 'Title\tDescription\tCategory' + newLine).encode('utf-8'))
for r in results:
    toWrite = "".encode('utf-8')
    title = u''.join(c for c in r[0].replace("\n", " ")).encode('utf-8') + "\t".encode('utf-8')
    description = u''.join(c for c in r[1]).encode('utf-8') + "\t".encode('utf-8')
    toWrite += title + description
    toWrite += str(r[2]).encode('utf-8') + newLine.encode('utf-8')
    a.write(toWrite)
a.close()
pandas.read_table() was deprecated in pandas 0.24 (and reinstated in 0.25); read_csv() with an explicit delimiter does the same job. Better still, write real CSV instead of hand-rolling a similar format that can't cope with record or field delimiters inside fields. The csv module in the Python standard library handles the quoting for you.
Opening the file as text file and passing the encoding to open() spares you from encoding everything yourself in different places.
#!/usr/bin/env python3
from contextlib import closing
import csv
import sqlite3

def main():
    with sqlite3.connect("JobDetails.db") as connection:
        with closing(connection.cursor()) as cursor:
            #
            # TODO Having two tables with the same columns for essentially
            # the same kind of records smells like a broken DB design.
            #
            rows = list()
            for table_name in ["reviewjobs", "jobs"]:
                cursor.execute(
                    f"SELECT title, description, category FROM {table_name}"
                )
                rows.extend(cursor.fetchall())
    # newline="" lets the csv module control line endings itself.
    with open("ClassTest.txt", "a", encoding="utf8", newline="") as csv_file:
        writer = csv.writer(csv_file, delimiter="\t")
        writer.writerow(["Title", "Description", "Category"])
        for title, description, category in rows:
            writer.writerow([title.replace("\n", " "), description, category])

if __name__ == "__main__":
    main()
And then in the other program:
data = pd.read_csv("ClassTest.txt", delimiter="\t")
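As a small illustration of why the csv module copes with newlines and tabs inside a field, here is a self-contained round trip through an in-memory buffer. The sample row is made up, and io.StringIO stands in for ClassTest.txt.

```python
import csv
import io

rows = [
    ["Title", "Description", "Category"],
    # Hypothetical record whose description contains a newline and a tab.
    ["Engineer", "Line one\nLine two\twith a tab", 3],
]

# Write tab-delimited CSV; fields containing the delimiter or a line
# break are wrapped in double quotes automatically.
buffer = io.StringIO(newline="")
csv.writer(buffer, delimiter="\t").writerows(rows)

# Read it back: the quoted field is reassembled into a single value.
buffer.seek(0)
parsed = list(csv.reader(buffer, delimiter="\t"))
```

The reader returns two records, not three, and the description comes back unchanged. Every value is read back as a string, so Category is "3" here; pd.read_csv honours the same double-quote convention by default and infers the integer type itself.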

pysqlite get column names from csv file

I have a CSV file from which I am trying to load data into a pysqlite database. I can't find a way to take the first row of the file and use it automatically as the column headers of a table. I have to enter the names "manually" in the code itself, which is OK for 1-2 columns but becomes cumbersome with tens or hundreds of columns. Here is my code:
import sqlite3
import csv
f_n = 'test_data_1.csv'
f = open( f_n , 'r' )
csv_reader = csv.reader(f)
header = next( csv_reader )
sqlite_file = 'survey_test_db.sqlite'
table_name01 = 'test_survey_1'
field_01 = 'analyst_name'
field_type_01 = 'text'
field_02 = 'question_name'
field_type_02 = 'text'
conn = sqlite3.connect( sqlite_file )
c = conn.cursor()
c.execute('CREATE TABLE {tn}({nf_01} {ft_01},{nf_02} {ft_02})'\
.format(tn = table_name01 , nf_01 = field_01 , ft_01 = field_type_01, nf_02 = field_02 , ft_02 = field_type_02 ))
for row in csv_reader:
    c.execute("INSERT INTO test_survey_1 VALUES (?,?)",row)
f.close()
for row in c.execute('SELECT * FROM test_survey_1'):
    print(row)
conn.commit()
conn.close()
You can build the column list from the header row you have already read:
c.execute('CREATE TABLE {tn}({fieldlist})'.format(
    tn=table_name01,
    fieldlist=', '.join('{} TEXT'.format(name) for name in header),
))
Or use an ORM, which is designed to make this sort of thing easy. SQLAlchemy example:
t = Table(table_name01, meta, *(Column(name, String()) for name in header))
t.create()
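Putting the header-driven CREATE TABLE together with the inserts, a minimal end-to-end sketch might look like this. It assumes every column is stored as TEXT, uses an in-memory database, and reads hypothetical CSV content from a string instead of test_data_1.csv. The `quote_identifier` helper is my own addition so that headers containing spaces or punctuation still form valid column names.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content standing in for test_data_1.csv.
csv_text = "analyst_name,question_name,score\nAnn,Q1,5\nBob,Q2,3\n"

def quote_identifier(name):
    """Quote a column name for SQLite, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)  # first row holds the column names

# One "name TEXT" entry and one "?" placeholder per header field.
columns = ", ".join(quote_identifier(name) + " TEXT" for name in header)
placeholders = ", ".join("?" for _ in header)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_survey_1 ({})".format(columns))
# The remaining rows of the reader are consumed by executemany.
conn.executemany(
    "INSERT INTO test_survey_1 VALUES ({})".format(placeholders), reader
)
conn.commit()

cursor = conn.execute("SELECT * FROM test_survey_1")
column_names = [d[0] for d in cursor.description]
stored = cursor.fetchall()
conn.close()
```

Only the identifiers are formatted into the SQL text; the data values still go through `?` placeholders.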
You can use pandas to read your CSV file into a DataFrame and then export it to SQLite.
import sqlite3
import pandas as pd
sqlite_file = 'survey_test_db.sqlite'
table_name01 = 'test_survey_1'
conn = sqlite3.connect(sqlite_file)
pd.read_csv('test_data_1.csv').to_sql(table_name01, con=conn)

I seem to only be able to create a 3 column table with python-docx

I want to create a word file using python-docx.
This word file will have a table where cells are populated by a sqlite3 query.
It all works fine with 1-3 columns, but when I try to add a fourth, the document no longer saves. I can add a fourth heading, but as soon as I populate the cells under that fourth heading, it stops saving. I have tried looking up this problem but could not find anything. So my question is: is my code correct? Am I overlooking something? If so, what?
from docx import Document
import sqlite3
document = Document()
conn = sqlite3.connect('Database.db')
c = conn.cursor()
table = document.add_table(1, 4)
heading_cells = table.rows[0].cells
heading_cells[0].text = 'Description'
heading_cells[1].text = 'Hours/units'
heading_cells[2].text = 'Rate'
heading_cells[3].text = 'Total' #all works well when this is active
Records = (c.execute("SELECT * FROM My_List")).fetchall()
for item in Records:
    cells = table.add_row().cells
    cells[0].text = str(item[1])
    cells[1].text = str(item[2])
    cells[2].text = str(item[3])
    cells[3].text = str(item[4]) #nothing works when this is active
document.save('Demo.docx')

Better ways to print out column names when using cx_Oracle

I found an example using cx_Oracle that shows all the information in Cursor.description.
import cx_Oracle
from pprint import pprint
connection = cx_Oracle.Connection("%s/%s@%s" % (dbuser, dbpasswd, oracle_sid))
cursor = cx_Oracle.Cursor(connection)
sql = "SELECT * FROM your_table"
cursor.execute(sql)
data = cursor.fetchall()
print "(name, type_code, display_size, internal_size, precision, scale, null_ok)"
pprint(cursor.description)
pprint(data)
cursor.close()
connection.close()
What I wanted to see was the list of Cursor.description[0](name), so I changed the code:
import cx_Oracle
import pprint
connection = cx_Oracle.Connection("%s/%s@%s" % (dbuser, dbpasswd, oracle_sid))
cursor = cx_Oracle.Cursor(connection)
sql = "SELECT * FROM your_table"
cursor.execute(sql)
data = cursor.fetchall()
col_names = []
for i in range(0, len(cursor.description)):
    col_names.append(cursor.description[i][0])
pp = pprint.PrettyPrinter(width=1024)
pp.pprint(col_names)
pp.pprint(data)
cursor.close()
connection.close()
I think there must be better ways to print the column names. Please suggest alternatives suitable for a Python beginner. :-)
You can use list comprehension as an alternative to get the column names:
col_names = [row[0] for row in cursor.description]
Since cursor.description is a list of 7-element tuples, you can take the 0th element of each, which is the column name.
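The same column-name list can be paired with each row using zip to get one dict per record. The sketch below uses the standard-library sqlite3 module as a stand-in for cx_Oracle so that it runs without a database server; sqlite3 cursors also expose description as 7-element tuples with the name first. The helper name `rows_to_dicts` and the sample table contents are hypothetical.

```python
import sqlite3

def rows_to_dicts(cursor):
    """Return the remaining rows of an executed cursor as dicts keyed by column name."""
    col_names = [col[0] for col in cursor.description]
    return [dict(zip(col_names, row)) for row in cursor]

conn = sqlite3.connect(":memory:")
# Hypothetical table and data for illustration only.
conn.execute("CREATE TABLE your_table (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?)", [(1, "a"), (2, "b")])

cursor = conn.execute("SELECT id, name FROM your_table ORDER BY id")
records = rows_to_dicts(cursor)
conn.close()
```

Using zip avoids looking up each column's index by name for every value in every row.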
Here is the code.
import csv
import sys
import cx_Oracle
db = cx_Oracle.connect('user/pass@host:1521/service_name')
SQL = "select * from dual"
print(SQL)
cursor = db.cursor()
# raw string so the backslash in the Windows path is taken literally
f = open(r"C:\dual.csv", "w")
writer = csv.writer(f, lineterminator="\n", quoting=csv.QUOTE_NONNUMERIC)
r = cursor.execute(SQL)
# this takes the column names
col_names = [row[0] for row in cursor.description]
writer.writerow(col_names)
for row in cursor:
    writer.writerow(row)
f.close()
The SQLAlchemy source code is a good starting point for robust methods of database introspection. Here is how SQLAlchemy reflects table names from Oracle:
SELECT table_name FROM all_tables
WHERE nvl(tablespace_name, 'no tablespace') NOT IN ('SYSTEM', 'SYSAUX')
AND OWNER = :owner
AND IOT_NAME IS NULL
