Inserting UNICODE into sqlite3 - python

I am trying to read a file, parse the data using python 2.7, and import the data into sqlite3. However, I'm running into a problem when inserting the data. After I parse a line from the file, the é in my string is replaced with \xe9. After I split the line from my file, I want a list that contains [73,'Misérables, Les'] but instead I'm getting [73,'Mis\xe9rables, Les'] which is screwing up the SQL INSERT statement. How can I fix this?
#!/usr/bin/python
# -*- coding: latin-1 -*-
import sqlite3
line = '73::Misérables, Les'.decode('latin-1')
vals = line.split("::")
con = sqlite3.connect('myDb.db')
cur = con.cursor()
cur.execute("DROP TABLE IF EXISTS movie")
cur.execute('CREATE TABLE movie (id INT, title TEXT)')
sql = 'INSERT INTO movie VALUES (?,?)'
cur.execute(sql,tuple(vals))
cur.execute('SELECT * FROM movie')
for record in cur:
    print record

Your program inserts data into the db perfectly. It subsequently retrieves the correct data. Your problem is when you display the result.
When you print a tuple, the system displays the repr() of each item, not the str() of each item. Thus you see \xe9 instead of é in the output.
To get what you want, try replacing the loop at the end of your program:
for record in cur:
    print record[0], record[1]
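The same point can be checked with a small round trip. This is only a sketch, written for Python 3 rather than the Python 2.7 of the question, and it assumes an in-memory database instead of myDb.db. Python 3's repr() no longer escapes é, so ascii() is used here to imitate what Python 2 showed when printing the whole tuple.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # in-memory database, assumed for illustration
cur = con.cursor()
cur.execute("CREATE TABLE movie (id INT, title TEXT)")
cur.execute("INSERT INTO movie VALUES (?, ?)", (73, "Misérables, Les"))
record = cur.execute("SELECT * FROM movie").fetchone()

escaped = ascii(record)  # escapes non-ASCII characters, like Python 2's repr() of a tuple
title = record[1]        # the stored text itself comes back unchanged

print(escaped)
print(title)
con.close()
```

The escaped form is only a display artifact; the value held in the database and in record[1] still contains the real é.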

Related

SQL concatenating with python

I am trying to update a field in my MySQL database by concatenation. I have to read my file line by line and append each loaded line to the existing string. I have to do it this way because my goal is to insert a 3 GB whitespace-separated text file into a single longtext field, and MySQL can only handle 1 GB of text per insert.
The problem with my code is that if I put the field name directly into the concat function, like seq=concat(seq, %s), I get a SQL syntax error, but when I pass the field name as a variable, Python treats it as a string.
In short, with this input file:
aaa
bbb
ccc
I want to have an updated mysql field like this:
aaabbbccc
But I get this: seqccc
Any idea how I should handle the field name to get this to work?
import mysql.connector
connection = mysql.connector.connect(host='localhost',
                                     database='sys',
                                     user='Pannka',
                                     password='???')
cursor = connection.cursor()
with open('test.txt', 'r') as f:
    for line in f:
        sql = "update linedna set seq=concat(%s, %s) where id=1"
        val=('seq', line.rstrip())
        print(line.rstrip())
        cursor.execute(sql, val)
        connection.commit()
cursor.close()
connection.close()
f.close()
print(0)
I think that you want:
sql = "update linedna set seq = concat(seq, %s) where id=1"
val = (line.rstrip(),)  # the trailing comma makes this a one-element tuple
cursor.execute(sql, val)
connection.commit()
This will append each new line at the end of the already existing database value in column seq.
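The pattern can be tried without a MySQL server. The sketch below swaps in the standard-library sqlite3 module, so it is not the asker's setup: SQLite's || operator stands in for MySQL's concat(), ? stands in for %s, and the list of lines is a stand-in for reading test.txt. The idea it illustrates is the same: the column name is written into the SQL text, and only the new chunk is passed as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
cur = conn.cursor()
cur.execute("CREATE TABLE linedna (id INTEGER PRIMARY KEY, seq TEXT)")
# Start from an empty string: concatenating onto NULL would yield NULL.
cur.execute("INSERT INTO linedna (id, seq) VALUES (1, '')")

lines = ["aaa\n", "bbb\n", "ccc\n"]  # stands in for iterating over test.txt
for line in lines:
    # seq is an identifier in the statement; the appended text is the only parameter.
    cur.execute("UPDATE linedna SET seq = seq || ? WHERE id = 1", (line.rstrip(),))
conn.commit()

result = cur.execute("SELECT seq FROM linedna WHERE id = 1").fetchone()[0]
print(result)
conn.close()
```

Passing 'seq' as a parameter instead makes the driver quote it as a string literal, which is why the original code stored seqccc.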

Fail to run 'execute(sqlcmd)' of sqlite3 using Python,Ubuntu

I want to use sqlite3 from Python on Ubuntu to process some data, but it always fails with errors. The database-related code is as follows:
sqlite = "%s.db" % name
#connnect to the database
conn = sqlite3.connect(sqlite)
print "Opened database successfully"
c = conn.cursor()
#set default separator to "\t" in database
c.execute(".separator "\t"")
print "Set separator of database successfully"
#create table data_node
c.execute('''create table data_node(Time int,Node Text,CurSize int,SizeVar int,VarRate real,Evil int);''')
print "Table data_node created successfully"
node_info = "%s%s.txt" % (name,'-PIT-node')
c.execute(".import %\"s\" data_node") % node_info
print "Import to data_node successfully"
#create table data_face
data_info = "%s%s.txt" % (name,'-PIT-face')
c.execute('''create table data_face(Time int,Node Text,TotalEntry real,FaceId int,FaceEntry real,Evil int);''')
c.execute(".import \"%s\" data_face") % face_info
#get the final table : PIT_node
c.execute('''create table node_temp as select FIRST.Time,FIRST.Node,ROUND(FIRST.PacketsRaw/SECOND.PacketsRaw,4) as SatisRatio from tracer_temp FIRST,tracer_temp SECOND WHERE FIRST.Time=SECOND.Time AND FIRST.Node=SECOND.Node AND FIRST.Type='InData' AND SECOND.Type='OutInterests';''')
c.execute('''create table PIT_node as select A.Time,A.Node,B.SatisRatio,A.CurSize,A.SizeVar,A.VarRate,A.Evil from data_node A,node_temp B WHERE A.Time=B.Time AND A.Node=B.Node;''')
#get the final table : PIT_face
c.execute('''create table face_temp as select FIRST.Time,FIRST.Node,FIRST.FaceId,ROUND(FIRST.PacketsRaw/SECOND.PacketsRaw,4) as SatisRatio,SECOND.Packets from data_tracer FIRST,data_tracer SECOND WHERE FIRST.Time=SECOND.Time AND FIRST.Node=SECOND.Node AND FIRST.FaceId=SECOND.FaceId AND FIRST.Type='OutData' AND SECOND.Type='InInterests';''')
c.execute('''create table PIT_face as select A.Time,A.Node,A.FaceId,B.SatisRatio,B.Packets,ROUND(A.FaceEntry/A.TotalEntry,4),A.Evil from data_face as A,face_temp as B WHERE A.Time=B.Time AND A.Node=B.Node AND A.FaceId = B.FaceId;''')
conn.commit()
conn.close()
The SQL commands themselves are correct. When I run the code, it always fails with sqlite3.OperationalError: near ".": syntax error. How should I change my code, and are there errors in the other commands, such as create table?
You have many problems in your code as posted, but the one you're asking about is:
c.execute(".separator "\t"")
This isn't valid Python syntax. But, even if you fix that, it's not valid SQL.
The "dot-commands" are special commands to the sqlite3 command line shell. It intercepts them and uses them to configure itself. They mean nothing to the actual database, and cannot be used from Python.
And most of them don't make any sense outside that shell anyway. For example, you're trying to set the column separator here. But the database doesn't return strings, it returns row objects—similar to lists. There is nowhere for a separator to be used. If you want to print the rows out with tab separators, you have to do that in your own print statements.
So, the simple fix is to remove all of those dot-commands.
However, there is a problem—at least one of those dot-commands actually does something:
c.execute(".import %\"s\" data_node") % node_info
You will have to replace that with valid calls to the library that do the same thing as the .import dot-command. Read what it does, and it should be easy to understand. (You basically want to open the file, parse the columns for each row, and do an executemany on an INSERT with the rows.)
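A minimal sketch of that replacement follows, in Python 3. The helper name import_tsv and the sample rows are made up for illustration, and the input is assumed to be tab-separated with one value per table column; a real file would be opened with open(path, newline='') instead of the io.StringIO used here.

```python
import csv
import io
import sqlite3

def import_tsv(conn, table, fileobj, n_columns):
    """Insert tab-separated rows from fileobj into table (hypothetical helper)."""
    reader = csv.reader(fileobj, delimiter="\t")
    placeholders = ",".join("?" * n_columns)
    # A table name cannot be a bound parameter, so it must come from trusted code.
    sql = "INSERT INTO {} VALUES ({})".format(table, placeholders)
    # Skip blank lines; every other row becomes one INSERT.
    conn.executemany(sql, (row for row in reader if row))
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("create table data_node(Time int, Node Text, CurSize int, "
             "SizeVar int, VarRate real, Evil int)")

# Invented sample data standing in for the -PIT-node text file.
sample = io.StringIO("1\tA\t10\t2\t0.5\t0\n2\tB\t12\t3\t0.25\t1\n")
import_tsv(conn, "data_node", sample, 6)

rows = conn.execute(
    "SELECT Time, Node, VarRate FROM data_node ORDER BY Time").fetchall()
print(rows)
conn.close()
```

The csv module hands back strings, and SQLite's column type affinity converts them to integers or reals where the declared column type calls for it.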

pymysql returns perplexing value error

I am attempting to write a simple Python script to import data from a text file into a MySQL database, and I encounter a perplexing error.
Windows 10, Mysql 5.7.18, Python 3.6, pymysql
The contents of the text file:
nickname|fullname|cell|email|updatedt
andrew|Andrew Jones|+12395551172|arj#domain.com|2017-05-04 13:26:10
laurelai|Laurelai Smith||lsmith#domain.net|2017-05-04 13:27:47
I read in the data to construct a sql string:
insert into contacts (nickname,fullname,cell,email,updatedt) values(%,%,%,%,%)
The field values to be inserted are read in as follows:
['andrew', 'Andrew Jones', '+12395551172', 'arj#domain.com', '2017-05-04 13:26:10']
This is of course a Python list object. I have tried converting it to a tuple, but with the same result.
The routine to insert the row into the table is as follows:
def insert(sql, values):
    #insert a single row of data from the input file
    connection = getconn()
    with connection.cursor() as cursor:
        try:
            cursor.execute(sql, values)
        #except ValueError:
            #print('Value error from pymysql')
        finally:
            cursor.close()
The following ValueError is returned:
ValueError: unsupported format character ',' (0x2c) at index 69
If, however, I extract the data values and insert them into the SQL string by concatenation, I get:
insert into contacts (nickname,fullname,cell,email,updatedt) values('laurelai','Laurelai Smith','','lsmith#domain.net','2017-05-04 13:27:47')
This inserts the rows without error
What causes the ValueError?
Change your insert query to %s instead of %:
insert into contacts (nickname,fullname,cell,email,updatedt) values(%s,%s,%s,%s,%s)
Refer to doc.
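The wording of the error is the one Python's own % string formatting produces, which suggests the driver substitutes the escaped values into the statement that way. Under that assumption, the failure can be reproduced without any database. The sketch below uses a shortened column list and values quoted by hand purely for the demonstration; in real code the values should always be passed to cursor.execute and never formatted into the string yourself.

```python
bad_sql = "insert into contacts (nickname,fullname) values(%,%)"
good_sql = "insert into contacts (nickname,fullname) values(%s,%s)"
values = ("'andrew'", "'Andrew Jones'")  # pre-quoted by hand only for this demo

try:
    bad_sql % values  # "%," is not a valid format specifier
    error = None
except ValueError as exc:
    error = str(exc)

rendered = good_sql % values  # "%s" is a valid placeholder

print(error)
print(rendered)
```

A bare % followed by a comma is read as the start of a format specifier with conversion character ',', hence "unsupported format character ','".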

Sqlite3 cannot correctly query a UTF-8 string?

I'm having a lot of trouble using Python's sqlite3 library with UTF-8 strings. I need this encoding because I am working with people's names in my database.
My SQL schema for the desired table is:
CREATE TABLE senators (id integer, name char);
I would like to do the following in Python (ignore the very ugly way I wrote the select statement. I did it this way for debugging purposes):
statement = u"select * from senators where name like '" + '%'+row[0]+'%'+"'"
c.execute(statement)
row[0] is the name of each row in a file that has this type of entry:
Dário Berger,1
Edison Lobão,1
Eduardo Braga,1
While I have a non empty result for names like Eduardo Braga, any time my string has UTF-8 characters, I get a null result.
I have checked that my file has in fact been saved with UTF-8 encoding (Microsoft Notepad). On an Apple Mac, in the terminal, I used the PRAGMA command in the sqlite3 shell to check the encoding:
sqlite> PRAGMA encoding;
UTF-8
Does anybody have an idea what I can do here?
EDIT - Complete example:
Python script that creates the databases, and populates with initial data from senators.csv (file):
# -*- coding: utf-8 -*-
import sqlite3
import csv
conn = sqlite3.connect('senators.db')
c = conn.cursor()
c.execute('''CREATE TABLE senators (id integer, name char)''')
c.execute('''CREATE TABLE polls (id integer, senator char, vote integer, FOREIGN KEY(senator) REFERENCES senators(name))''')
with open('senators.csv', encoding='utf-8') as f:
    f_csv = csv.reader(f)
    for row in f_csv:
        c.execute(u"INSERT INTO senators VALUES(?,?)", (row[1], row[0]))
conn.commit()
conn.close()
Script that populates the polls table, using Q1.txt (file).
import csv
import sqlite3
import re
import glob
conn = sqlite3.connect('senators.db')
c = conn.cursor()
POLLS = {
    'senator': 'votes/senator/Q*.txt',
    'deputee': 'votes/deputee/Q*.txt',
}
s_polls = glob.glob(POLLS['senator'])
d_polls = glob.glob(POLLS['deputee'])
for poll in s_polls:
    m = re.match('.*Q(\d+)\.txt', poll)
    poll_id = m.groups(0)
    with open(poll, encoding='utf-8') as p:
        f_csv = csv.reader(p)
        for row in f_csv:
            c.execute(u'SELECT id FROM senators WHERE name LIKE ?', ('%'+row[0]+'%',))
            data = c.fetchone()
            print(data) # I should not get None results here, but I do, exactly when the query has UTF-8 characters.
Note the file paths, if you want to test these scripts out.
Ok guys,
After a lot of trouble, I found out that although both the database and my input were UTF-8, they used different Unicode normalization forms. The names in the database were stored in decomposed form (ã = a + ~), while my input was in precomposed form (a single code point for the ã character).
To fix it, I had to convert all my input data to the decomposed form.
from unicodedata import normalize
with open(poll, encoding='utf-8') as p:
    f_csv = csv.reader(p)
    for row in f_csv:
        name = normalize("NFD",row[0])
        c.execute(u'SELECT id FROM senators WHERE name LIKE ?', ('%'+name+'%',))
See this article, for some excellent information on the subject.
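The mismatch is easy to see in isolation. As a small sketch, the two strings below spell the same surname from the sample data, once with ã as a single code point and once as a plus a combining tilde; the variable names are chosen only for this illustration.

```python
from unicodedata import normalize

precomposed = "Lob\u00e3o"   # ã as one code point (NFC form)
decomposed = "Loba\u0303o"   # a followed by a combining tilde (NFD form)

# The strings render identically but compare unequal code point by code point.
same_before = precomposed == decomposed
# After normalizing both to the same form, they compare equal.
same_after = normalize("NFD", precomposed) == normalize("NFD", decomposed)

print(same_before, same_after, len(precomposed), len(decomposed))
```

SQLite compares the stored text as it is, without normalizing it, so both sides of the comparison need to be brought to the same form before querying.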
From the SQLite docs:
Important Note: SQLite only understands upper/lower case for ASCII characters by default. The LIKE operator is case sensitive by default for unicode characters that are beyond the ASCII range. For example, the expression 'a' LIKE 'A' is TRUE but 'æ' LIKE 'Æ' is FALSE.
Also, use query parameters. Your query is vulnerable to SQL injection.
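As a rough sketch of what query parameters buy you, the example below builds the LIKE pattern in Python and binds it with ?. It assumes an in-memory database, and the second name is invented to include an apostrophe, which would break the string-concatenated version of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE senators (id integer, name char)")
conn.executemany(
    "INSERT INTO senators VALUES (?, ?)",
    [(1, "Eduardo Braga"), (2, "O'Neil Example")],  # second name is hypothetical
)

def find_ids(conn, fragment):
    """Return ids of senators whose name contains fragment (illustrative helper)."""
    pattern = "%" + fragment + "%"  # wildcards go into the bound value, not the SQL
    cursor = conn.execute("SELECT id FROM senators WHERE name LIKE ?", (pattern,))
    return [row[0] for row in cursor]

ids_plain = find_ids(conn, "Braga")
ids_quote = find_ids(conn, "O'Neil")  # the quote is data here, not SQL syntax
print(ids_plain, ids_quote)
conn.close()
```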

Portuguese characters in DBsqlite and python parsing not recognised

I have a database in DBsqlite. In this database I have records containing Portuguese text like "Hiper-radiação simétrica periocular bem delimitada, homogênea."
Characters like ç ã é ê don't parse correctly in my Python script,
while normal English text works perfectly.
In my terminal window (I use a mac) the
I know it has something to do with the encoding, but the code still doesn't recognise Portuguese.
my sample code:
# -*- coding: UTF-8 -*-
import xml.etree.ElementTree as ET
import sqlite3
#open a database connection to the database translateDB.sqlite
conn = sqlite3.connect('translateDB.sqlite')
#prepare a cursor object using cursor() method
cursor = conn.cursor()
#test input
# this doesn't work
text = ('Hiper-radiação simétrica periocular bem delimitada, homogênea')
# this does work in english
#text = ('Well delimited, homogeneous symmetric periocular hyper- radiation.')
# Execute SQL query using execute() method.
cursor.execute('SELECT * FROM translate WHERE L2_portugese=?', (text,))
# Fetch a single row using fetchone() method and display it.
print cursor.fetchone()
# Disconnect from server
conn.close()
any tips & tricks are greatly appreciated. Ron
