Python, insert XML data to sqlite3

I'm trying to pull some XML from a URL, parse it, and store the entries in an sqlite3 database. I've tried numerous things and all are failing. Code so far:
#!/usr/bin/env python
from urllib2 import urlopen
import gc
import xml.etree.ElementTree as ET
import sqlite3
rosetta_url = ("https://boinc.bakerlab.org/rosetta/team_email_list.php?teamid=12575&account_key=Y&xml=1")
root = ET.parse(urlopen(rosetta_url)).getroot()
cpids = [el.text for el in root.findall('.//user/cpid')]
print cpids
conn = sqlite3.connect("GridcoinTeam.db")
c = conn.cursor()
c.execute('''CREATE TABLE IF NOT EXISTS GRIDCOINTEAM (cpid TEXT)''')
c.executemany("INSERT INTO GRIDCOINTEAM VALUES (?);", cpids)
conn.commit()
conn.close()
conn = sqlite3.connect("GridcoinTeam.db")
c = conn.cursor()
cpids = c.execute('select cpid from GRIDCOINTEAM').fetchall()
conn.close()
print cpids
gc.collect()
I'm getting the error:
Incorrect number of bindings supplied. The current statement uses 1, and there are 32 supplied.
I tried making the inserted values tuples by changing the call to
c.executemany("INSERT INTO GRIDCOINTEAM VALUES (?);", (cpids, ))
but that just gives:
Incorrect number of bindings supplied. The current statement uses 1, and there are 3289 supplied.
The XML extract is in the form ['5da243d1f47b7852d372c51d6ee660d7', '5a6d18b942518aca60833401e70b75b1', '527ab53f75164864b74a89f3db6986b8'], but there are several thousand entries.
Thanks.

You need to insert these as multiple rows instead of multiple columns in one row:
cpids = [el.text for el in root.findall('.//user/cpid')]
cpids = zip(*[iter(cpids)]*1)
print cpids
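For illustration, this is roughly what that produces (a small sketch using the three cpid values quoted in the question):
cpids = ['5da243d1f47b7852d372c51d6ee660d7', '5a6d18b942518aca60833401e70b75b1', '527ab53f75164864b74a89f3db6986b8']
# zip(*[iter(cpids)]*1) groups the flat list into 1-tuples, one per row:
rows = list(zip(*[iter(cpids)] * 1))
print(rows)  # [('5da243d1f47b7852d372c51d6ee660d7',), ('5a6d18b942518aca60833401e70b75b1',), ('527ab53f75164864b74a89f3db6986b8',)]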

The problem lies in
c.executemany("INSERT INTO GRIDCOINTEAM VALUES (?);", cpids)
executemany expects a sequence of tuples (one set of parameters per row), but you pass a list of strings. What the code effectively does is:
for entry in cpids:
    c.execute("INSERT INTO GRIDCOINTEAM VALUES (?);", *entry)
Note the star before entry, which unpacks the string into individual characters and gives you 32+ parameters where you only want one.
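To make that concrete, here is a minimal sketch (an in-memory database, just to show where the 32 in the error message comes from, i.e. the length of one cpid string):
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('CREATE TABLE GRIDCOINTEAM (cpid TEXT)')
try:
    # A bare string is treated as a sequence of 32 one-character parameters.
    c.execute("INSERT INTO GRIDCOINTEAM VALUES (?);", '5da243d1f47b7852d372c51d6ee660d7')
except sqlite3.ProgrammingError as e:
    print(e)  # Incorrect number of bindings supplied. The current statement uses 1, and there are 32 supplied.
# Wrapping the string in a tuple binds the whole value to the single placeholder.
c.execute("INSERT INTO GRIDCOINTEAM VALUES (?);", ('5da243d1f47b7852d372c51d6ee660d7',))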
In order to fix that we'd need to know your GRIDCOINTEAM table schema. If you have a table with only one column and you want to insert into it, you could do this:
for entry in cpids:
    c.execute("INSERT INTO GRIDCOINTEAM VALUES (?)", (entry,))
Note that execute also expects a sequence of parameters, so each string still has to be wrapped in a one-element tuple; a bare string would again be unpacked into individual characters.
Alternatively you can stick with executemany and wrap each of your strings in a tuple up front:
c.executemany("INSERT INTO GRIDCOINTEAM VALUES (?);", [(i,) for i in cpids])
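Putting it together, a minimal sketch of the corrected insert (assuming cpids is the flat list of strings parsed from the XML, as in the question):
conn = sqlite3.connect("GridcoinTeam.db")
c = conn.cursor()
c.execute('CREATE TABLE IF NOT EXISTS GRIDCOINTEAM (cpid TEXT)')
# One 1-tuple per row, so each cpid lands in its own row.
c.executemany("INSERT INTO GRIDCOINTEAM VALUES (?);", [(cpid,) for cpid in cpids])
conn.commit()
print(c.execute('SELECT COUNT(*) FROM GRIDCOINTEAM').fetchone()[0])  # should match len(cpids)
conn.close()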

Related

Insert record from list if not exists in table

cHandler = myDB.cursor()
cHandler.execute('select UserId,C1,LogDate from DeviceLogs_12_2019')  # data from remote sql server database
curs = connection.cursor()
curs.execute("""select * from biometric""")  # data from my database table
lst = []
result = cHandler.fetchall()
for row in result:
    lst.append(row)
lst2 = []
result2 = curs.fetchall()
for row in result2:
    lst2.append(row)
t = []
r = [elem for elem in lst if not elem in lst2]
for i in r:
    print(i)
    t.append(i)
for i in t:
    frappe.db.sql("""Insert into biometric(UserId,C1,LogDate) select '%s','%s','%s' where not exists(select * from biometric where UserID='%s' and LogDate='%s')""",(i[0],i[1],i[2],i[0],i[2]),as_dict=1)
I am trying the above code to insert data into my table if the record does not exist, but I am getting this error:
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '1111'',''in'',''2019-12-03 06:37:15'' where not exists(select * from biometric ' at line 1")
Is there anything I am doing wrong or any other way to achieve this?
It appears you have potentially four problems:
There is a from clause missing between select and where not exists.
When using a prepared statement you do not enclose your placeholder arguments, %s, within quotes.
Your loop:
t = []
r = [elem for elem in lst if not elem in lst2]
for i in r:
    print(i)
    t.append(i)
If you are trying to only include rows from the remote site that will not be duplicates, then you should explicitly check the two fields that matter, i.e. UserId and LogDate. But what is the point, since your SQL is already taking care of excluding these duplicate rows? Also, what is the point of copying everything from r to t?
Your SQL should therefore be:
Insert into biometric(UserId,C1,LogDate) select %s,%s,%s from DUAL where not exists(select * from biometric where UserID=%s and LogDate=%s)
But even with the above SQL there is a subtlety:
If the not exists clause is false, then the select %s,%s,%s from DUAL ... returns no rows, so nothing is inserted for that record.
If your concern is getting an error due to duplicate keys because (UserId, LogDate) is either a UNIQUE or PRIMARY KEY, then add the IGNORE keyword on the INSERT statement and then if a row with the key already exists, the insertion will be ignored. But there is no way of knowing since you have not provided this information:
for i in t:
    frappe.db.sql("Insert IGNORE into biometric(UserId,C1,LogDate) values(%s,%s,%s)", (i[0], i[1], i[2]))
If you do not want multiple rows with the same (UserId, LogDate) combination, then you should define a UNIQUE KEY on these two columns and then the above SQL should be sufficient. There is also an ON DUPLICATE KEY UPDATE ... variation of the INSERT statement where, if the key exists, you can do an update instead (look this up).
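For illustration, a sketch of that variation, assuming a UNIQUE key on (UserId, LogDate) and the same frappe.db.sql call style as above (here it simply refreshes C1 when the key already exists):
for i in t:
    frappe.db.sql(
        "Insert into biometric(UserId,C1,LogDate) values(%s,%s,%s) "
        "ON DUPLICATE KEY UPDATE C1 = VALUES(C1)",
        (i[0], i[1], i[2]))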
If you don't have a UNIQUE KEY defined on these two columns or you need to print out those rows which are being updated, then you do need to test for the presence of the existing keys. But this would be the way to do it:
cHandler = myDB.cursor()
cHandler.execute('select UserId,C1,LogDate from DeviceLogs_12_2019')  # data from remote sql server database
rows = cHandler.fetchall()
curs = connection.cursor()
for row in rows:
    curs.execute("select UserId from biometric where UserId=%s and LogDate=%s", (row[0], row[2]))  # row already in biometric table?
    biometric_row = curs.fetchone()
    if biometric_row is None:  # no, it is not
        print(row)
        frappe.db.sql("Insert into biometric(UserId,C1,LogDate) values(%s, %s, %s)", (row[0], row[1], row[2]))

How to use psycopg2 to retrieve a certain key's value from a postgres table which has key-value pairs

I'm new to Python and trying to use psycopg2 to read from Postgres.
I am reading from a database table called deployment, which has three fields, id, Key and Value, and I am trying to get the Value for a particular Key.
import json
import psycopg2
conn = psycopg2.connect(host="localhost",database=database, user=user, password=password)
cur = conn.cursor()
cur.execute("SELECT \"Value\" FROM deployment WHERE (\"Key\" = 'DUMPLOCATION')")
records = cur.fetchall()
print(json.dumps(records))
This prints:
[["newdrive"]]
I want this to be just "newdrive" so that I can do a string comparison in the next line to check whether it's "newdrive" or not.
I tried json.loads on the json.dumps output, but that didn't work:
>>> a=json.loads(json.dumps(records))
>>> print(a)
[['newdrive']]
I also tried to print just the records without json.dumps:
>>> print(records)
[('newdrive',)]
The result of fetchall() is a sequence of tuples. You can loop over the sequence and print the first (index 0) element of each tuple:
cur.execute("SELECT \"Value\" FROM deployment WHERE (\"Key\" = 'DUMPLOCATION')")
records = cur.fetchall()
for record in records:
    print(record[0])
Or, simpler, if you are sure the query returns no more than one row, use fetchone(), which gives a single tuple representing the returned row (or None if there is no row), e.g.:
cur.execute("SELECT \"Value\" FROM deployment WHERE (\"Key\" = 'DUMPLOCATION')")
row = cur.fetchone()
if row:  # check whether the query returned a row
    print(row[0])
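If the goal is the string comparison mentioned in the question, you can then unpack that single value (a small sketch reusing the same query and cursor):
cur.execute("SELECT \"Value\" FROM deployment WHERE \"Key\" = 'DUMPLOCATION'")
row = cur.fetchone()
dump_location = row[0] if row else None  # plain string, e.g. 'newdrive', or None if no match
if dump_location == 'newdrive':
    print('DUMPLOCATION points at newdrive')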

How to insert a dictionary into a PostgreSQL table with psycopg2

How do I insert a Python dictionary into a PostgreSQL table using psycopg2? I keep getting the following error, so my query is not formatted correctly:
Error syntax error at or near "To" LINE 1: INSERT INTO bill_summary VALUES(To designate the facility of...
import psycopg2
import json
import psycopg2.extras
import sys

with open('data.json', 'r') as f:
    data = json.load(f)

con = None
try:
    con = psycopg2.connect(database='sanctionsdb', user='dbuser')
    cur = con.cursor(cursor_factory=psycopg2.extras.DictCursor)
    cur.execute("CREATE TABLE bill_summary(title VARCHAR PRIMARY KEY, summary_text VARCHAR, action_date VARCHAR, action_desc VARCHAR)")
    for d in data:
        action_date = d['action-date']
        title = d['title']
        summary_text = d['summary-text']
        action_date = d['action-date']
        action_desc = d['action-desc']
        q = "INSERT INTO bill_summary VALUES(" + str(title) + str(summary_text) + str(action_date) + str(action_desc) + ")"
        cur.execute(q)
    con.commit()
except psycopg2.DatabaseError, e:
    if con:
        con.rollback()
    print 'Error %s' % e
    sys.exit(1)
finally:
    if con:
        con.close()
You should use the dictionary as the second parameter to cursor.execute(). See the example code after this statement in the documentation:
Named arguments are supported too using %(name)s placeholders in the query and specifying the values into a mapping.
So your code may be as simple as this:
with open('data.json', 'r') as f:
    data = json.load(f)
print(data)
""" above prints something like this:
{'title': 'the first action', 'summary-text': 'some summary', 'action-date': '2018-08-08', 'action-desc': 'action description'}
use the json keys as named parameters:
"""
cur = con.cursor()
q = "INSERT INTO bill_summary VALUES(%(title)s, %(summary-text)s, %(action-date)s, %(action-desc)s)"
cur.execute(q, data)
con.commit()
Note also this warning (from the same page of the documentation):
Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
q = "INSERT INTO bill_summary VALUES(" +str(title)+str(summary_text)+str(action_date)+str(action_desc)+")"
You're writing your query the wrong way, by concatenating the values; they should rather be comma-separated, like this:
q = "INSERT INTO bill_summary VALUES({0},{1},{2},{3})".format(str(title), str(summary_text), str(action_date), str(action_desc))
Since you're not specifying the column names, I assume they are in the same order as the values you have written in your insert query. There are basically two ways of writing an insert query in PostgreSQL. One is by specifying the column names and their corresponding values, like this:
INSERT INTO TABLE_NAME (column1, column2, column3,...columnN)
VALUES (value1, value2, value3,...valueN);
The other way is to omit the column names from the SQL query, which works if you are adding values for all the columns of the table. However, make sure the values are in the same order as the columns in the table. This is the form you have used in your query:
INSERT INTO TABLE_NAME VALUES (value1,value2,value3,...valueN);
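In practice, though, it is safer to keep the values as %s placeholders and let psycopg2's execute() handle the quoting, in line with the documentation warning quoted above. A sketch of the first (column-list) form, using the column names from the CREATE TABLE statement in the question:
q = ("INSERT INTO bill_summary (title, summary_text, action_date, action_desc) "
     "VALUES (%s, %s, %s, %s)")
cur.execute(q, (title, summary_text, action_date, action_desc))
con.commit()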

Python SQLite3: want to iteratively fetch rows, but code is pulling every other row

I'm trying to accomplish a very simple task:
Create a table in SQLite
Insert several rows
Query a single column in the table and pull back each row
Code to create the table:
import sqlite3
sqlite_file = '/Users/User/Desktop/DB.sqlite'
conn = sqlite3.connect(sqlite_file)
c = conn.cursor()
c.execute('''CREATE TABLE ListIDTable(ID numeric, Day numeric, Month
numeric, MonthTxt text, Year numeric, ListID text, Quantity text)''')
values_to_insert = [
(1,16,7,"Jul",2015,"XXXXXXX1","Q2"),
(2,16,7,"Jul",2015,"XXXXXXX2","Q2"),
(3,14,7,"Jul",2015,"XXXXXXX3","Q1"),
(4,14,7,"Jul",2015,"XXXXXXX4","Q1")] #Entries continue similarly
c.executemany("INSERT INTO ListIdTable (ID, Day, Month, MonthTxt, "
              "Year, ListID, Quantity) values (?,?,?,?,?,?,?)", values_to_insert)
conn.commit()
conn.close()
When I look at this table in SQLite DB Browser, everything looks fine.
Here's my code to try and query the above table:
import sqlite3
sqlite_file = '/Users/User/Desktop/DB.sqlite'
conn = sqlite3.connect(sqlite_file)
conn.row_factory = sqlite3.Row
c = conn.cursor()
for row in c.execute('select * from ListIDTable'):
    r = c.fetchone()
    ID = r['ID']
    print (ID)
I should get a print out of 1, 2, 3, 4.
However, I only get 2 and 4.
My code actually uploads 100 entries to the table, but still, when I query, I only get ID printouts of even numbers (i.e. 2, 4, 6, 8 etc.).
Thanks for any advice on fixing this.
You don't need to fetchone in the loop -- The loop is already fetching the values (one at a time). If you fetchone while you're iterating, you'll only see half the data because the loop fetches one and then you immediately fetch the next one (without ever looking at the one that was fetched by the loop):
for r in c.execute('select * from ListIDTable'):
    ID = r['ID']
    print (ID)
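Alternatively, if you prefer to pull everything first and loop afterwards, fetchall() gives you the complete result set (a small sketch; row_factory is assumed to be sqlite3.Row, as in the question):
c.execute('select * from ListIDTable')
rows = c.fetchall()  # list of sqlite3.Row objects, one per table row
for r in rows:
    print(r['ID'])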

Python Sqlite3 insert operation with a list of column names

Normally, if I want to insert values into a table, I will do something like this (assuming that I know which columns the values I want to insert belong to):
conn = sqlite3.connect('mydatabase.db')
conn.execute("INSERT INTO MYTABLE (ID,COLUMN1,COLUMN2)\
VALUES(?,?,?)",[myid,value1,value2])
But now I have a list of columns (the length of the list may vary) and a list of values, one for each column in the list.
For example, if I have a table with 10 columns (namely column1, column2, ..., column10), I might have a list of columns that I want to update, say [column3, column4], and a list of values for those columns, [value for column3, value for column4].
How do I insert the values in the list into the individual columns they each belong to?
As far as I know the parameter list in conn.execute works only for values, so we have to use string formatting like this:
import sqlite3
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (a integer, b integer, c integer)')
col_names = ['a', 'b', 'c']
values = [0, 1, 2]
conn.execute('INSERT INTO t (%s, %s, %s) values(?,?,?)'%tuple(col_names), values)
Please note this is a risky approach, since strings interpolated into SQL should always be checked for injection attacks. However, you could pass the list of column names through some validation step before building the statement.
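One simple way to do that check (a sketch, assuming you can list the legal column names up front; the whitelist here just mirrors the example table t above):
ALLOWED_COLUMNS = {'a', 'b', 'c'}  # hypothetical set of columns known to exist in table t

def checked(col_names):
    # Only identifiers from the whitelist may be formatted into the SQL string.
    unknown = set(col_names) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError('unknown column(s): %s' % ', '.join(sorted(unknown)))
    return col_names

conn.execute('INSERT INTO t (%s, %s, %s) values(?,?,?)' % tuple(checked(col_names)), values)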
EDITED:
For variables with various length you could try something like
exec_text = 'INSERT INTO t (' + ','.join(col_names) + ') values(' + ','.join(['?'] * len(values)) + ')'
conn.execute(exec_text, values)
# as long as len(col_names) == len(values)
Of course string formatting will work, you just need to be a bit cleverer about it.
col_names = ','.join(col_list)
col_spaces = ','.join(['?'] * len(col_list))
sql = 'INSERT INTO t (%s) values(%s)' % (col_names, col_spaces)
conn.execute(sql, values)
I was looking for a solution to create columns based on a list of unknown / variable length and found this question. However, I managed to find a nicer solution (for me anyway), that's also a bit more modern, so thought I'd include it in case it helps someone:
import sqlite3

def create_sql_db(my_list):
    file = 'my_sql.db'
    table_name = 'table_1'
    init_col = 'id'
    col_type = 'TEXT'
    conn = sqlite3.connect(file)
    c = conn.cursor()
    # CREATE TABLE (IF IT DOESN'T ALREADY EXIST)
    c.execute('CREATE TABLE IF NOT EXISTS {tn} ({nf} {ft})'.format(
        tn=table_name, nf=init_col, ft=col_type))
    # CREATE A COLUMN FOR EACH ITEM IN THE LIST
    for new_column in my_list:
        c.execute('ALTER TABLE {tn} ADD COLUMN "{cn}" {ct}'.format(
            tn=table_name, cn=new_column, ct=col_type))
    conn.close()

my_list = ["Col1", "Col2", "Col3"]
create_sql_db(my_list)
All my data is of the type text, so I just have a single variable "col_type" - but you could for example feed in a list of tuples (or a tuple of tuples, if that's what you're into):
my_other_list = [("ColA", "TEXT"), ("ColB", "INTEGER"), ("ColC", "BLOB")]
and change the CREATE A COLUMN step to:
for tupl in my_other_list:
    new_column = tupl[0]  # "ColA", "ColB", "ColC"
    col_type = tupl[1]  # "TEXT", "INTEGER", "BLOB"
    c.execute('ALTER TABLE {tn} ADD COLUMN "{cn}" {ct}'.format(
        tn=table_name, cn=new_column, ct=col_type))
As a noob, I can't comment on the very succinct, updated solution @ron_g offered. While testing, though, I had to frequently delete the sample database itself, so for any other noobs using this to test, I would advise adding:
c.execute('DROP TABLE IF EXISTS {tn}'.format(
    tn=table_name))
Prior to the 'CREATE TABLE ...' portion.
It appears there are multiple instances of
.format(
    tn=table_name ....)
in both 'CREATE TABLE ...' and 'ALTER TABLE ...', so I am trying to figure out whether it's possible to use a single instance (similar to, or included in, the def section).
