I ran into a problem while using sqlite3 in Python.
def getEntryId(self, table, field, value, createNew=True):
    cur = self.con.execute("select rowid from %s where %s = '%s'" % (table, field, value))
    res = cur.fetchone()
    if res is None:
        cur = self.con.execute("insert into %s (%s) values('%s') " % (table, field, value))
        return cur.lastrowid
    else:
        return res[0]
However, I got this error:
OperationalError: unrecognized token: "'''". It seems that the second line of my code is incorrect.
I cannot figure out why, so I ran the same query by hand:
cu.execute("select rowid from urllist where %s = '%s'" % ('url', 'yes'))
It ran without an error. Why? How can I fix it?
You should parameterize the query. Table and field names cannot be parameterized, though. You can use string formatting to insert them into the query, but make sure you either trust the source or validate the values properly. Note that sqlite3 uses ? as its placeholder:
query = "select rowid from {table} where {field} = ?".format(table=table, field=field)
cur = self.con.execute(query, (value, ))
res = cur.fetchone()
Parameterization not only helps prevent SQL injection attacks, it also handles data type conversion and escapes the parameters properly, which may fix your current problem as well.
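As a minimal sketch of the advice above, the function from the question could look like the following. The checked_identifier helper and its regular expression are assumptions made for illustration (one possible way to validate table and field names), not something sqlite3 provides. The value itself goes through a ? placeholder, so a value containing quote characters no longer breaks the statement.

```python
import re
import sqlite3

# Hypothetical helper: accept only plain identifiers (letters, digits, underscore).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def checked_identifier(name):
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError("unsafe SQL identifier: %r" % (name,))
    return name


def get_entry_id(con, table, field, value, create_new=True):
    table = checked_identifier(table)
    field = checked_identifier(field)
    # Identifiers are formatted in; the value goes through a "?" placeholder.
    cur = con.execute(
        "select rowid from {} where {} = ?".format(table, field), (value,)
    )
    row = cur.fetchone()
    if row is not None:
        return row[0]
    if not create_new:
        return None
    cur = con.execute(
        "insert into {} ({}) values (?)".format(table, field), (value,)
    )
    return cur.lastrowid


# Usage with an in-memory database and a value that contains quotes.
con = sqlite3.connect(":memory:")
con.execute("create table urllist (url text)")
first_id = get_entry_id(con, "urllist", "url", "it's a 'quoted' value")
second_id = get_entry_id(con, "urllist", "url", "it's a 'quoted' value")
```

The second call finds the row created by the first one, so both return the same rowid.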
I am new to Python and currently learning. I am using the psycopg2 library to run PostgreSQL queries and have an insert command like this:
cursor.execute(
    "INSERT INTO test_case (run_id, listener_id, tc_name, elapsed_time_in_ms, elapsed_time_in_hr_min_sec, "
    "epoch_time, status, message, tags) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s);",
    (
        run_id_from_test_run,
        attributes.get("id"),
        attributes.get("originalname"),
        attributes.get("elapsedtime"),
        convert_ms(attributes.get("elapsedtime")),
        math.trunc(time.time()),
        attributes.get("status"),
        attributes.get("message"),
        attributes.get("tags"),
    ),
)
Now I want to write the update query in the same style. It is currently written like this:
cursor.execute(
    f"UPDATE test_run SET status='{test_run_status}', "
    f"project_name='{project_name_metadata}', "
    f"total_time_in_ms={total_execution_time}, "
    f"total_time_in_hr_min_sec='{convert_ms(total_execution_time)}' "
    f"WHERE run_id={run_id_from_test_run}"
)
I would like it to use placeholders like the insert query does. I have tried many permutations and combinations but could not get what I was looking for. I have already searched Stack Overflow but could not find a suitable answer. If this has been answered before, please link me to it.
Edit: This is what I tried:
sql = "UPDATE test_run SET status = %s, project_name = %s, total_time_in_ms = %d, total_time_in_hr_min_sec = %s " \
      "WHERE run_id = %d"
val = ('{test_run_status}', '{project_name_metadata}', {total_execution_time}, '{convert_ms(total_execution_time)}',
       {run_id_from_test_run})
cursor.execute(sql, val)
And I got this error:
ProgrammingError: can't adapt type 'set'
Something like below:
cursor.execute(
    """UPDATE
        test_run
    SET (status, project_name, total_time_in_ms, total_time_in_hr_min_sec)
        = (%s, %s, %s, %s)
    WHERE
        run_id = %s""",
    [status, project_name, total_time_in_ms, total_time_in_hr_min_sec, run_id]
)
See UPDATE for the update form used.
Also take a look at execute_values() under Fast Execution Helpers (https://www.psycopg.org/docs/extras.html#fast-execution-helpers) for another way.
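The "can't adapt type 'set'" error in the attempt above comes from the braces: outside an f-string, {total_execution_time} is a Python set literal, not a substitution. psycopg2 also only accepts %s as a placeholder, even for integers, so %d should not be used. The sketch below shows the shape of the arguments only. The function name build_test_run_update and the sample values are made up for illustration, and no database connection is involved.

```python
def build_test_run_update(status, project_name, total_ms, total_hms, run_id):
    # Only %s placeholders: psycopg2 does not accept %d, even for integers.
    sql = (
        "UPDATE test_run SET status = %s, project_name = %s, "
        "total_time_in_ms = %s, total_time_in_hr_min_sec = %s "
        "WHERE run_id = %s"
    )
    # Plain values: no braces and no extra quotes around them.
    params = (status, project_name, total_ms, total_hms, run_id)
    return sql, params


total_execution_time = 5321
# Outside an f-string, braces build a set, which psycopg2 cannot adapt.
accidental = {total_execution_time}

sql, params = build_test_run_update(
    "PASS", "demo", total_execution_time, "00:00:05", 42
)
# With a real psycopg2 cursor this would be passed as: cursor.execute(sql, params)
```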
I am having trouble designing a good algorithm that follows the psycopg2 library's conventions described here
I want to build a dynamic query equal to this string:
SELECT ST_GeomFromText('POLYGON((0.0 0.0,20.0 0.0,20.0 20.0,0.0 20.0,0.0 0.0))');
As you can see, my POLYGON object contains multiple points, read from a simple CSV file, some.csv, which contains:
0.0;0.0
20.0;0.0
20.0;20.0
0.0;20.0
0.0;0.0
So I build the query dynamically, depending on the number of lines in the CSV file.
Here is my program that generates the SQL query string to execute:
import psycopg2
import csv
# list of points
lXy = []
DSN = "dbname='testS' user='postgres' password='postgres' host='localhost'"
conn = psycopg2.connect(DSN)
curs = conn.cursor()
def genPointText(curs, x, y):
    generatedPoint = "%s %s" % (x, y)
    return generatedPoint
# Read the CSV file
polygonFile = open('some.csv', newline='')
readerCSV = csv.reader(polygonFile, delimiter=';')
for coordinates in readerCSV:
    lXy.append(genPointText(curs, float(coordinates[0]), float(coordinates[1])))
# join the list items with a separator
def convert(myList, separator):
    return separator.join([str(i) for i in myList])
# construct simple query with psycopg
def genPolygonText(curs, l):
    # http://initd.org/psycopg/docs/usage.html#python-types-adaptation
    generatedPolygon = "POLYGON((%s))" % convert(l, ",")
    return generatedPolygon
def executeWKT(curs, geomObject, srid):
    try:
        # geometry ST_GeomFromText(text WKT, integer srid);
        finalWKT = "SELECT ST_GeomFromText('%s');" % (geomObject)
        print(finalWKT)
        curs.execute(finalWKT)
    except psycopg2.ProgrammingError as err:
        print("ERROR = ", err)
polygonQuery = genPolygonText(curs, lXy)
executeWKT(curs, polygonQuery, 4326)
As you can see, this works, but the approach is not correct because it bypasses the conversion between Python objects and PostgreSQL types.
In the documentation I only see examples that feed and convert data for static queries. Do you know an "elegant" way to build a correct string with correct types for a dynamically built query?
UPDATE 1:
When I use psycopg's type adaptation on this simple example, I get an error like this:
query = "ST_GeomFromText('POLYGON(( 52.146542 19.050557, 52.148430 19.045527, 52.149525 19.045831, 52.147400 19.050780, 52.147400 19.050780, 52.146542 19.050557))',4326)"
name = "my_table"
try:
    curs.execute('INSERT INTO %s(name, url, id, point_geom, poly_geom) VALUES (%s);', (name, query))
except psycopg2.ProgrammingError as err:
    print("ERROR = ", err)
The error is:
ERROR = ERROR: syntax error at or near "E'my_table'"
LINE 1: INSERT INTO E'my_table'(name, poly_geom) VALUES (E'ST_GeomFr...
UPDATE 2:
Final code, which works thanks to Stack Overflow users!
# lib info: http://www.initd.org/psycopg/docs/
import psycopg2
# lib info: http://docs.python.org/2/library/csv.html
import csv
# list of points
lXy = []
DSN = "dbname='testS' user='postgres' password='postgres' host='localhost'"
print("Opening connection using DSN:", DSN)
conn = psycopg2.connect(DSN)
curs = conn.cursor()
def genPointText(curs, x, y):
    generatedPoint = "%s %s" % (x, y)
    return generatedPoint
# Read the CSV file
polygonFile = open('some.csv', newline='')
readerCSV = csv.reader(polygonFile, delimiter=';')
for coordinates in readerCSV:
    lXy.append(genPointText(curs, float(coordinates[0]), float(coordinates[1])))
# join the list items with a separator
def convert(myList, separator):
    return separator.join([str(i) for i in myList])
# construct simple query with psycopg
def genPolygonText(l):
    # http://initd.org/psycopg/docs/usage.html#python-types-adaptation
    generatedPolygon = "POLYGON((%s))" % convert(l, ",")
    return generatedPolygon
def generateInsert(curs, tableName, name, geomObject):
    curs.execute('INSERT INTO binome1(name,geom) VALUES (%s, %s);', (name, geomObject))
def create_db_binome(conn, name):
    curs = conn.cursor()
    SQL = (
        "CREATE TABLE %s"
        " ("
        " polyname character varying(15),"
        " geom geometry,"
        " id serial NOT NULL,"
        " CONSTRAINT id_key PRIMARY KEY (id)"
        " )"
        " WITH ("
        " OIDS=FALSE"
        " );"
        " ALTER TABLE %s OWNER TO postgres;"
    ) % (name, name)
    try:
        # print(SQL)
        curs.execute(SQL)
    except psycopg2.ProgrammingError as err:
        conn.rollback()
        dropQuery = "ALTER TABLE %s DROP CONSTRAINT id_key; DROP TABLE %s;" % (name, name)
        curs.execute(dropQuery)
        curs.execute(SQL)
    conn.commit()
def insert_geometry(polyname, tablename, geometry):
    # double any embedded double quotes so the name is safe inside a quoted identifier
    escaped_name = tablename.replace('"', '""')
    try:
        test = 'INSERT INTO "%s"(polyname, geom) VALUES(%%s, ST_GeomFromText(%%s,%%s))' % (escaped_name)
        curs.execute(test, (polyname, geometry, 4326))
        conn.commit()
    except psycopg2.ProgrammingError as err:
        print("ERROR = ", err)
################
# PROGRAM MAIN #
################
polygonQuery = genPolygonText(lXy)
srid = 4326
table = "binome1"
create_db_binome(conn, table)
insert_geometry("Berlin", table, polygonQuery)
insert_geometry("Paris", table, polygonQuery)
polygonFile.close()
conn.close()
You are trying to pass a table name as a parameter. You probably could've seen this immediately if you'd just looked at the PostgreSQL error log.
The table name you're trying to pass through psycopg2 as a parameter is being escaped, producing a query like:
INSERT INTO E'my_table'(name, url, id, point_geom, poly_geom) VALUES (E'ST_GeomFromText(''POLYGON(( 52.146542 19.050557, 52.148430 19.045527, 52.149525 19.045831, 52.147400 19.050780, 52.147400 19.050780, 52.146542 19.050557))'',4326)');
This isn't what you intended and won't work; you can't escape a table name like a literal. You must use normal Python string interpolation to construct dynamic SQL. Parameterized statement placeholders can only be used for actual literal values.
params = ('POLYGON(( 52.146542 19.050557, 52.148430 19.045527, 52.149525 19.045831, 52.147400 19.050780, 52.147400 19.050780, 52.146542 19.050557))', 4326)
escaped_name = name.replace('"', '""')
curs.execute('INSERT INTO "%s"(poly_geom) VALUES (ST_GeomFromText(%%s,%%s));' % escaped_name, params)
See how I've interpolated the name directly to produce the query string:
INSERT INTO "my_table"(poly_geom) VALUES (ST_GeomFromText(%s,%s));
(%% gets converted to a plain % by % substitution.) Then I'm using that query with the string defining the POLYGON and the other argument to ST_GeomFromText as query parameters. I've listed only poly_geom so that the column list matches the single value being inserted.
I haven't tested this, but it should give you the right idea and help explain what's wrong.
BE EXTREMELY CAREFUL when doing string interpolation like this; it's an easy avenue for SQL injection. I've done very crude quoting in the code shown above, but I'd want to use a proper identifier quoting function if your client library offers one.
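To make the split between interpolated identifiers and parameterized values concrete, here is a small sketch that builds the WKT string from CSV rows and prepares the statement without touching a database. The helper names quote_identifier, polygon_wkt and build_insert are hypothetical, the quoting is the same crude doubling of double quotes described above, and the table and column names are taken from the question.

```python
import csv
import io


def quote_identifier(name):
    # Crude quoting sketch: wrap in double quotes and double any embedded ones.
    return '"' + name.replace('"', '""') + '"'


def polygon_wkt(rows):
    # rows: iterable of (x, y) string pairs, as produced by csv.reader.
    points = ["%s %s" % (float(x), float(y)) for x, y in rows]
    return "POLYGON((%s))" % ",".join(points)


def build_insert(table, polyname, wkt, srid):
    # The table name is interpolated; the three values stay as %s placeholders
    # for the driver to fill in.
    sql = (
        "INSERT INTO %s (polyname, geom) "
        "VALUES (%%s, ST_GeomFromText(%%s, %%s))" % (quote_identifier(table),)
    )
    return sql, (polyname, wkt, srid)


# The CSV content from the question, read from memory instead of some.csv.
sample_csv = "0.0;0.0\n20.0;0.0\n20.0;20.0\n0.0;20.0\n0.0;0.0\n"
rows = list(csv.reader(io.StringIO(sample_csv), delimiter=";"))
wkt = polygon_wkt(rows)
sql, params = build_insert("binome1", "Berlin", wkt, 4326)
# With a psycopg2 cursor this would be passed as: curs.execute(sql, params)
```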
Now that psycopg2 2.7 is on PyPI, here is an example of a dynamic query.
In this example I'll assume the polygon is a dictionary built from your CSV file. The keys could be name, url, id, point_geom, poly_geom as mentioned above, but they don't really matter as long as the table has columns with the same names.
There's probably a way to shorten this, but I hope it clarifies the use of the sql module, namely sql.SQL, sql.Identifier, and sql.Placeholder, and how to join a list of composables with sql.SQL('..').join(list()).
from psycopg2 import sql
table = 'my_table'
polygon = Polygon.from_file()  # or something
column_list = list()
value_list = list()
# Convert the dictionary to lists
for column, value in polygon.items():
    column_list.append(sql.Identifier(column))  # Convert to identifiers
    value_list.append(value)
# Build the query, values will be inserted later
insert_query = sql.SQL("INSERT INTO {} ({}) VALUES ({}) ON CONFLICT DO NOTHING").format(
    sql.Identifier(table),
    sql.SQL(', ').join(column_list),  # already sql.Identifier
    sql.SQL(', ').join([sql.Placeholder()] * len(value_list)))
# Execute the cursor
with postgres.cursor() as p_cursor:
    # pass the values as a sequence (a tuple here)
    p_cursor.execute(insert_query, tuple(value_list))
Reference: http://initd.org/psycopg/docs/sql.html
The proper way is to use psycopg2 2.7's new sql module which includes an Identifier object. This allows you to dynamically specify SQL identifiers in a safe way.
Unfortunately 2.7 is not on PyPI yet (2.6.2 as of writing).
Until then, the psycopg2 documentation covers this under the heading "How can I pass field/table names to a query?"
http://initd.org/psycopg/docs/faq.html#problems-with-type-conversions
You can pass SQL identifiers in along with data values to the execute function by using the AsIs function.
Note: this provides NO security. It is as good as using a format string, which is not recommended.
The only real advantage of this is that you encourage future code to follow the execute + data style. You can also easily search for AsIs in the future.
from psycopg2.extensions import AsIs
<snip>
with transaction() as cur:
    # WARNING: not secure
    cur.execute('SELECT * from %(table)s', {'table': AsIs('mytable')})
How can I avoid inserting duplicate data? I only want to insert data that does not already exist. I have written the following queries, but they are not working properly. I'm using PostgreSQL.
title_exits = cursor.execute("SELECT title,pageid FROM movie_movie WHERE title = %s AND pageid = %s;", (title, pageid))
if title_exits == 0:
    cursor.execute("INSERT INTO movie_movie (title,pageid,slug,language) values (%s,%s,%s,%s);", (title, pageid, slug, id))
    db.commit()
Update: I tried result = cursor.fetchone("SELECT count(*) FROM movie_movie WHERE title = %s AND pageid = %s;", (title, pageid)), but I'm getting an error message: TypeError: fetchone() takes no arguments (2 given).
Answer related to your update:
fetchone() takes no arguments. Pass the query and its parameters to execute(), then call fetchone() to read the row:
cursor.execute("SELECT count(*) FROM movie_movie WHERE title = %s AND pageid = %s;", (title, pageid))
result = cursor.fetchone()
Update:
As @no_freedom said in the comments, the better approach is to keep the values as query parameters, as above, rather than formatting them into the string with the % operator.
Try defining the title field as unique (in some databases the column must then be a varchar with a fixed maximum length). Then simply try to insert the title: if it already exists, the database returns an error; otherwise the row is inserted.
As I suspected (and @tony points out), cursor.execute does not return the number of rows. It always returns None.
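The two suggestions (check with execute() followed by fetchone(), and let a unique constraint reject duplicates) can be combined. The sketch below uses the standard-library sqlite3 module instead of a PostgreSQL driver so that it runs without a server, which is why the placeholder is ? rather than %s. The table layout and the insert_movie helper are assumptions based on the column names in the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE movie_movie (title TEXT, pageid INTEGER, slug TEXT, "
    "language TEXT, UNIQUE (title, pageid))"
)


def insert_movie(con, title, pageid, slug, language):
    cur = con.cursor()
    # execute() runs the query; fetchone() then reads the single count row.
    cur.execute(
        "SELECT count(*) FROM movie_movie WHERE title = ? AND pageid = ?",
        (title, pageid),
    )
    (count,) = cur.fetchone()
    if count:
        return False
    try:
        cur.execute(
            "INSERT INTO movie_movie (title, pageid, slug, language) "
            "VALUES (?, ?, ?, ?)",
            (title, pageid, slug, language),
        )
    except sqlite3.IntegrityError:
        # Another writer inserted the same row between the check and the insert.
        con.rollback()
        return False
    con.commit()
    return True


first = insert_movie(con, "Alien", 1, "alien", "en")
second = insert_movie(con, "Alien", 1, "alien", "en")
total = con.execute("SELECT count(*) FROM movie_movie").fetchone()[0]
```

The unique constraint is what actually guarantees no duplicates; the count check only avoids raising the error in the common case.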
How do I do this correctly:
I want to do a query like this:
query = """SELECT * FROM sometable
order by %s %s
limit %s, %s;"""
conn = app_globals.pool.connection()
cur = conn.cursor()
cur.execute(query, (sortname, sortorder, limit1, limit2) )
results = cur.fetchall()
Everything works fine, except that the order by %s %s part is not substituted correctly: the two values are inserted with quotes around them.
So it ends up like:
ORDER BY 'somecol' 'DESC'
which is wrong; it should be:
ORDER BY somecol DESC
Any help greatly appreciated!
paramstyle
Parameter placeholders can only be used to insert column values. They can not be used for other parts of SQL, such as table names, statements, etc.
%s placeholders inside the query string are reserved for parameters. The %s in 'order by %s %s' are not parameters. You should build the query string in two steps:
query = """SELECT * FROM sometable order by %s %s limit %%s, %%s;"""
query = query % ('somecol', 'DESC')
conn = app_globals.pool.connection()
cur = conn.cursor()
cur.execute(query, (limit1, limit2) )
results = cur.fetchall()
DO NOT FORGET to validate the values used in the first substitution to prevent SQL injection.
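One way to do that validation is to check the sort column and direction against fixed sets of allowed values before formatting them into the query, while the limit values remain real parameters. This is only a sketch: the allowed column names are made up, and it uses the standard-library sqlite3 module (placeholder ?) so that it runs without a database server.

```python
import sqlite3

ALLOWED_SORT_COLUMNS = {"id", "somecol"}  # hypothetical column names
ALLOWED_SORT_ORDERS = {"ASC", "DESC"}


def build_sorted_query(sortname, sortorder):
    sortorder = sortorder.upper()
    if sortname not in ALLOWED_SORT_COLUMNS:
        raise ValueError("unexpected sort column: %r" % (sortname,))
    if sortorder not in ALLOWED_SORT_ORDERS:
        raise ValueError("unexpected sort order: %r" % (sortorder,))
    # Column and direction are formatted in after validation;
    # the two limit values remain "?" placeholders.
    return "SELECT * FROM sometable ORDER BY {} {} LIMIT ?, ?".format(
        sortname, sortorder
    )


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sometable (id INTEGER, somecol TEXT)")
con.executemany(
    "INSERT INTO sometable VALUES (?, ?)", [(1, "a"), (2, "c"), (3, "b")]
)
query = build_sorted_query("somecol", "desc")
results = con.execute(query, (0, 2)).fetchall()
```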
Not all parts of an SQL query can be parameterized. The column name and the DESC keyword, for example, are not
parameters. After checking sortname and sortorder against the values you allow, try
query = """SELECT * FROM sometable
order by """ + sortname + " " + sortorder + """
limit %s, %s"""
cur.execute(query, (limit1, limit2) )
You could try this alternatively...
query = """SELECT * FROM sometable
order by {0} {1}
limit {2}, {3};"""
sortname = 'somecol'
sortorder = 'DESC'
limit1 = 'limit1'
limit2 = 'limit2'
print(query.format(sortname, sortorder, limit1, limit2))
I'd like to use placeholders as seen in this example:
cursor.execute ("""
UPDATE animal SET name = %s
WHERE name = %s
""", ("snake", "turtle"))
Except I'd like to have the query be its own variable as I need to insert a query into multiple databases, as in:
query = """UPDATE animal SET name = %s
WHERE name = %s
""", ("snake", "turtle"))
cursor.execute(query)
cursor2.execute(query)
cursor3.execute(query)
What would be the proper syntax for doing something like this?
query = """UPDATE animal SET name = %s
WHERE name = %s
"""
values = ("snake", "turtle")
cursor.execute(query, values)
cursor2.execute(query, values)
or, if you want to group them together:
arglist = [query, values]
cursor.execute(*arglist)
cursor2.execute(*arglist)
but it's probably more readable to do it the first way.