Python 3.5.2, stdlib sqlite3.
I'm trying to issue a SQL query with a dynamic criterion in an IN operator in the WHERE clause:
bad code (doesn't work)
def top_item_counts_in_installations(cursor, installation_names):
    return cursor.execute(
        "SELECT itm.name, COUNT(*)"
        " FROM Installations ins INNER JOIN Items itm ON itm.id=ins.itmid"
        " WHERE ins.name IN (?)"
        " GROUP BY itm.name"
        " ORDER BY COUNT(*) DESC"
        " LIMIT 3",
        (installation_names,))
installation_names is a tuple, but the above obviously does not work; it fails with the error
sqlite3.InterfaceError: Error binding parameter 0 - probably unsupported type.
So what I currently do is prepare a sequence of parameter holders based on the length of installation_names:
kludge (working but ugly)
def top_item_counts_in_installations(cursor, installation_names):
    param_holders = ",".join("?" for i in installation_names)
    return cursor.execute(
        "SELECT itm.name, COUNT(*)"
        " FROM Installations ins INNER JOIN Items itm ON itm.id=ins.itmid"
        " WHERE ins.name IN (%s)"
        " GROUP BY itm.name"
        " ORDER BY COUNT(*) DESC"
        " LIMIT 3" % param_holders,
        installation_names)
Is there a proper way to parameterize the right term of IN instead of my kludge? I haven't managed to find something in the documentation.
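As far as I know, sqlite3 cannot bind a whole sequence to a single ?, so generating one placeholder per value is the usual approach. Here is a minimal, self-contained sketch of that idea. The helper name in_placeholders and the toy rows are made up for illustration, and the extra ORDER BY column is only there to make the demo output deterministic:

```python
import sqlite3


def in_placeholders(values):
    """Return '?, ?, ?' with one qmark placeholder per value (hypothetical helper)."""
    return ", ".join("?" for _ in values)


def top_item_counts_in_installations(cursor, installation_names):
    # Only the placeholders are formatted into the SQL text;
    # the values themselves are still bound by sqlite3.
    sql = (
        "SELECT itm.name, COUNT(*)"
        " FROM Installations ins INNER JOIN Items itm ON itm.id = ins.itmid"
        " WHERE ins.name IN ({})"
        " GROUP BY itm.name"
        " ORDER BY COUNT(*) DESC, itm.name"
        " LIMIT 3"
    ).format(in_placeholders(installation_names))
    return cursor.execute(sql, tuple(installation_names)).fetchall()


# Toy in-memory data, invented for this example only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Items (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE Installations (name TEXT, itmid INTEGER)")
cur.executemany("INSERT INTO Items VALUES (?, ?)", [(1, "bolt"), (2, "nut")])
cur.executemany(
    "INSERT INTO Installations VALUES (?, ?)",
    [("north", 1), ("north", 1), ("south", 2), ("east", 2)],
)

result = top_item_counts_in_installations(cur, ("north", "south"))
```

The string formatting here never touches user data, which is why it is generally considered acceptable even though it looks like a kludge.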
While reading this question: SQL Multiple Updates vs single Update performance
I was wondering how I could dynamically implement an update for several variables at the same time using a connector like MariaDB's. Reading the official documentation, I did not find anything similar.
This question is similar, and it has helped me to understand how to use parametrized queries with custom connectors but it does not answer my question.
Let's suppose that, from one of the views of the project, we receive a dictionary.
This dictionary has the following structure (simplified example):
{'form-0-input_file_name': 'nofilename', 'form-0-id': 'K0944', 'form-0-gene': 'GJXX', 'form-0-mutation': 'NM_0040(p.Y136*)', 'form-0-trix': 'ZSSS4'}
Assuming that each key in the dictionary corresponds to a column in a table of the database, if I'm not mistaken we would have to iterate over the dictionary and build the query in each iteration.
Something like this (semi pseudo-code, probably it's not correct):
query = "UPDATE `db-dummy`.info "
for key in a_dict:
    query += "SET key = a_dict[key]"
It is not clear to me how to construct said query within a loop.
What is the most pythonic way to achieve this?
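This could work, although it interpolates the values directly into the SQL text. Note that SET appears only once, followed by comma-separated assignments:
query = "UPDATE `db-dummy`.info SET "
for index, key in enumerate(a_dict):
    query = query + (", " if index != 0 else "") + "{0} = '{1}'".format(key, a_dict[key])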
You should consider parameterized queries for safety and security. Moreover, a dynamic dictionary may raise other concerns; it may be best to verify or filter on a set of agreed keys before attempting such an operation.
query = "UPDATE `db-dummy`.info SET "
for index, key in enumerate(a_dict):
    # SET is written once; each column gets its own placeholder
    query = query + (", " if index != 0 else "") + "{0} = ?".format(key)
# Then execute with your connection/cursor
cursor.execute(query, tuple(a_dict.values()))
This is what I did (inspired by @ggordon's answer):
query = "UPDATE `db-dummy`.info "
for index, key in enumerate(a_dict):
    if index == 0:
        query = query + "SET {0} = ?".format(key)
    else:
        query = query + ", {0} = ?".format(key)
# Bind record_id as a parameter too, instead of concatenating it into the SQL
query += " WHERE record_id = ?"
params = tuple(a_dict.values()) + (record_id,)
And it works!
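Putting the pieces of this thread together (parameterized values, SET written once, keys filtered against an agreed set), a rough sketch could look like the following. It uses the stdlib sqlite3 module instead of the MariaDB connector purely so that it runs standalone; the build_update helper, the ALLOWED_COLUMNS set, the table layout, and the shortened column names (without the form-0- prefix) are all assumptions made for illustration:

```python
import sqlite3

# Hypothetical whitelist: column names cannot be bound as parameters,
# so only names from this agreed set are ever formatted into the SQL.
ALLOWED_COLUMNS = {"input_file_name", "gene", "mutation", "trix"}


def build_update(table, data, key_column, key_value):
    """Return (sql, params) for a parameterized UPDATE (illustrative helper)."""
    columns = [c for c in data if c in ALLOWED_COLUMNS]
    if not columns:
        raise ValueError("no updatable columns in data")
    assignments = ", ".join("{} = ?".format(c) for c in columns)
    sql = "UPDATE {} SET {} WHERE {} = ?".format(table, assignments, key_column)
    # Values follow the same order as their placeholders; the key comes last.
    params = [data[c] for c in columns] + [key_value]
    return sql, params


# Toy in-memory table, invented for this example only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE info (id TEXT PRIMARY KEY, input_file_name TEXT,"
    " gene TEXT, mutation TEXT, trix TEXT)"
)
conn.execute("INSERT INTO info (id) VALUES ('K0944')")

form = {"gene": "GJXX", "mutation": "NM_0040(p.Y136*)", "not_a_column": "ignored"}
sql, params = build_update("info", form, "id", "K0944")
conn.execute(sql, params)
row = conn.execute(
    "SELECT gene, mutation FROM info WHERE id = ?", ("K0944",)
).fetchone()
```

Keys that are not in the whitelist are silently dropped here; raising an error instead may be the better choice depending on where the dictionary comes from.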
I am using pyodbc to update an Access database.
I need the functionality of an UPSERT.
ON DUPLICATE KEY UPDATE doesn't exist in Access SQL, and REPLACE is not an option since I want to keep other fields.
There are a lot of suggestions out there on how to solve that, so this is the solution which I put together:
for table_name in data_source:
    table = data_source[table_name]
    for y in table:
        if table_name == "whatever":
            SQL_UPDATE = "UPDATE {} set [Part Name] = '{}', [value] = {}, [code] = {}, [tolerance] = {} WHERE [Unique Part Number]='{}'".\
                format(table_name,y['PartName'],y['Value'],y['keycode'],y['Tolerance'], y['PartNum'])
            SQL_INSERT = "INSERT INTO {} ([Part Name],[Unique Part Number], [value], [code], [tolerance]) VALUES ('{}','{}',{},{},{});".\
                format(table_name,y['PartName'],y['PartNum'],y['Value'],y['keycode'],y['Tolerance'])
        # elif ....  (9 more tables....)
        res = cursor.execute(SQL_UPDATE)
        if res.rowcount == 0:
            cursor.execute(SQL_INSERT)
I have to say, I am not a Python expert, and I haven't really grasped the fundamental concepts or the magic of SQL,
so I can only Google things together here.
I don't like my solution above because it is very hard to read and difficult to maintain (I have to do this for ~10 different tables). The other point is that I have to use 2 queries because I didn't manage to understand and run any other UPSERT approach I found.
Does anyone have a recommendation for how to do this in a smarter, more maintainable way?
As noted in this question and others, Access SQL does not have an "upsert" statement, so you will need to use a combination of UPDATE and INSERT. However, you can improve your current implementation by
using proper parameters for your query, and
using Python string manipulation to build the SQL command text.
For example, to upsert into a table named [Donor]
Donor ID  Last Name  First Name
--------  ---------  ----------
       1  Thompson   Gord
You can start with a list of the field names. The trick here is to put the key field(s) at the end, so the INSERT and UPDATE statements will refer to the fields in the same order (i.e., the UPDATE will refer to the ID field last because it will be in the WHERE clause).
data_fields = ['Last Name', 'First Name']
key_fields = ['Donor ID']
The parameter values will be the same for both the UPDATE and INSERT cases, e.g.
params = ('Elk', 'Anne', 2)
The UPDATE statement can be constructed like this
update_set = ','.join(['[' + x + ']=?' for x in data_fields])
update_where = ' AND '.join(['[' + x + ']=?' for x in key_fields])
sql_update = "UPDATE [Donor] SET " + update_set + " WHERE " + update_where
print(sql_update)
which shows us
UPDATE [Donor] SET [Last Name]=?,[First Name]=? WHERE [Donor ID]=?
Similarly, the INSERT statement can be constructed like this
insert_fields = ','.join(['[' + x + ']' for x in (data_fields + key_fields)])
insert_placeholders = ','.join(['?' for x in (data_fields + key_fields)])
sql_insert = "INSERT INTO [Donor] (" + insert_fields + ") VALUES (" + insert_placeholders + ")"
print(sql_insert)
which prints
INSERT INTO [Donor] ([Last Name],[First Name],[Donor ID]) VALUES (?,?,?)
So, to perform our upsert, all we need to do is
crsr.execute(sql_update, params)
if crsr.rowcount > 0:
    print('Existing row updated.')
else:
    crsr.execute(sql_insert, params)
    print('New row inserted.')
crsr.commit()
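Since the question mentions about ten tables, the steps above could be wrapped into a pair of reusable functions. This is only a sketch: build_upsert_sql and upsert are hypothetical names, and the cursor is assumed to be any DB-API cursor (such as pyodbc's) that reports rowcount after an UPDATE:

```python
def build_upsert_sql(table, data_fields, key_fields):
    """Return (sql_update, sql_insert) that share one parameter order:
    data fields first, key fields last. Hypothetical helper."""
    def quoted(name):
        # Access-style bracket quoting for identifiers
        return "[" + name + "]"

    update_set = ",".join(quoted(f) + "=?" for f in data_fields)
    update_where = " AND ".join(quoted(f) + "=?" for f in key_fields)
    sql_update = "UPDATE {} SET {} WHERE {}".format(
        quoted(table), update_set, update_where)

    all_fields = data_fields + key_fields
    insert_fields = ",".join(quoted(f) for f in all_fields)
    insert_placeholders = ",".join("?" for _ in all_fields)
    sql_insert = "INSERT INTO {} ({}) VALUES ({})".format(
        quoted(table), insert_fields, insert_placeholders)
    return sql_update, sql_insert


def upsert(crsr, table, data_fields, key_fields, params):
    """Try UPDATE first; INSERT if no row matched. Returns 'updated' or 'inserted'."""
    sql_update, sql_insert = build_upsert_sql(table, data_fields, key_fields)
    crsr.execute(sql_update, params)
    if crsr.rowcount > 0:
        return "updated"
    crsr.execute(sql_insert, params)
    return "inserted"


sql_update, sql_insert = build_upsert_sql(
    "Donor", ["Last Name", "First Name"], ["Donor ID"])
```

With this, each table only needs its own lists of data and key fields, and the caller still decides when to commit.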
Consider using parameterized queries (prepared statements) with ? placeholders. str.format is still needed for identifiers such as table and field names. Then unpack the dictionary items with zip(*dict.items()) and pass the values as parameters in the cursor's execute call: cursor.execute(query, params). The values must be in the same order as the placeholders.
for table_name in data_source:
    table = data_source[table_name]
    for y in table:
        keys, values = zip(*y.items())  # UNPACK DICTIONARY INTO TWO TUPLES
        if table_name == "whatever":
            # Parenthesize so .format() applies to the whole string, not only the last literal
            SQL_UPDATE = ("UPDATE {} set [Part Name] = ?, [value] = ?, [code] = ?,"
                          " [tolerance] = ? WHERE [Unique Part Number]= ?").format(table_name)
            # Same column order as the UPDATE, so both statements can share one tuple
            SQL_INSERT = ("INSERT INTO {} ([Part Name], [value], [code],"
                          " [tolerance], [Unique Part Number]) VALUES (?, ?, ?, ?, ?);").format(table_name)
        # values must be ordered: PartName, Value, keycode, Tolerance, PartNum
        res = cursor.execute(SQL_UPDATE, values)
        if res.rowcount == 0:
            cursor.execute(SQL_INSERT, values)
    ...
I am working in Django 1.8 and doing some raw SQL queries using connection.cursor.
My question is about how to safely supply multiple parameters to the cursor. Here is my code:
cursor = connection.cursor()
query = "SELECT cost, id, date, org_id FROM mytable "
query += " WHERE ("
for i, c in enumerate(codes):
    query += "id=%s "
    if (i != len(codes)-1):
        query += ' OR '
query += ") AND ("
for i, c in enumerate(orgs):
    query += "org_id=%s "
    if (i != len(orgs)-1):
        query += ' OR '
query += ")"
cursor.execute(query, tuple(codes), tuple(orgs))
But this gives me:
TypeError: execute() takes at most 3 arguments (4 given)
I'm trying to follow the PEP documentation on execute. It says that one can use executemany instead, but that doesn't seem to help either:
cursor.executemany(query, [tuple(codes), tuple(orgs)])
I just can't follow the PEP documentation without an example. Could anyone help?
Your problem is that you're passing more arguments to execute than it accepts. What you need is to combine the query's parameters into a single tuple. One way to do that is to use itertools.chain to chain both lists' elements into one iterable that can be used to create a single tuple:
import itertools
cursor.execute(query, tuple(itertools.chain(codes, orgs)))
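For what it's worth, the chain of ORs can also be written with IN, which keeps the query shorter. The sketch below swaps in IN for the OR loops from the question and only builds the SQL text and the parameter tuple, so it runs without a database; build_cost_query is a hypothetical helper name and the sample values are invented:

```python
import itertools


def build_cost_query(codes, orgs):
    """Return (sql, params) with one %s per value, for drivers using %s markers."""
    if not codes or not orgs:
        raise ValueError("codes and orgs must both be non-empty")
    id_marks = ", ".join(["%s"] * len(codes))
    org_marks = ", ".join(["%s"] * len(orgs))
    sql = (
        "SELECT cost, id, date, org_id FROM mytable"
        " WHERE id IN ({}) AND org_id IN ({})"
    ).format(id_marks, org_marks)
    # Parameters must appear in the same order as their placeholders.
    params = tuple(itertools.chain(codes, orgs))
    return sql, params


sql, params = build_cost_query(["A1", "B2"], [10, 20, 30])
# With a DB-API cursor this would then be: cursor.execute(sql, params)
```

Either way, execute receives exactly two arguments: the query string and one flat sequence of parameters.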
I am trying to query a MySQL database from Python, but I am having trouble generating the query because of the SQL wildcard % and Python's %s. As a solution I found using ?, but when I run the following,
query = '''select * from db where name like'Al%' and date = '%s' ''', myDateString
I get an error
cursor.execute(s %'2015_05_21')
ValueError: unsupported format character ''' (0x27) at index 36 (the position of %)
How can I combine Python 2.7 string building and SQL wildcards? (The actual query is a lot longer and involves more variables.)
First of all, you need to escape the percent sign near the Al:
'''select * from db where name like 'Al%%' and date = '%s' '''
Also, follow the best practices and pass the query parameters in the second argument to execute(). This way your query parameters would be escaped and you would avoid sql injections:
query = """select * from db where name like 'Al%%' and date = %s"""
cursor.execute(query, ('2015_05_21', ))
Two things:
Don't use string formatting ('%s' % some_var) in SQL queries. Instead, pass the string as a sequence (like a list or a tuple) to the execute method.
You can escape your % so Python will not expect a format specifier:
q = "SELECT foo FROM bar WHERE zoo LIKE 'abc%%' and id = %s"
cursor.execute(q, (some_var,))
Use the format syntax for Python string building, and %s for SQL interpolation. That way they don't conflict with each other.
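To illustrate that suggestion, here is a small sketch that runs without a database. The table name is a made-up stand-in, and the last line only imitates what a driver using %-style markers does once it has escaped the value itself; real code would pass the raw value to cursor.execute instead:

```python
# Hypothetical table name; identifiers cannot be bound as parameters,
# so they go in via str.format (only from trusted code, never user input).
table = "db"

# str.format fills {table}; the SQL wildcard stays doubled (%%) and the
# parameter marker stays as %s for the driver.
query = "select * from {table} where name like 'Al%%' and date = %s".format(
    table=table)

# Stand-in for the driver's final interpolation of an already escaped value.
# In real code: cursor.execute(query, ('2015_05_21',))
simulated = query % ("'2015_05_21'",)
```

Because str.format ignores percent signs, the two kinds of substitution never collide.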
You are not using the ? correctly.
Here's an example:
command = '''SELECT M.name, M.year
FROM Movie M, Person P, Director D
WHERE M.id = D.movie_id
AND P.id = D.director_id
AND P.name = ?
AND M.year BETWEEN ? AND ?;'''
Execute the command, replacing the placeholders with the values of the variables in the list [dirName, start, end]:
cursor.execute(command, [dirName, start, end])
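So, you want to try the following, passing the parameters as a sequence even when there is only one. Note that ? is the placeholder for drivers with the qmark paramstyle, such as sqlite3; MySQL drivers such as MySQLdb expect %s instead:
cursor.execute(query, ('2015_05_21',))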
I am using mysql-connector with python and have a query like this:
SELECT avg(downloadtime) FROM tb_npp where date(date) between %s and %s and host like %s",(s_date,e_date,"%" + dc + "%")
Now, if my variable 'dc' is a list like this:
dc = ['sjc','iad','las']
Then I have a mysql query like below:
SELECT avg(downloadtime) FROM tb_npp where date(date) = '2013-07-01' and substring(host,6,3) in ('sjc','las');
My question is: how do I write this query in my Python code so that the list in my variable 'dc' is expanded into the IN clause?
I tried the below query but getting error: Failed processing format-parameters; 'MySQLConverter' object has no attribute '_list_to_mysql'
cursor3.execute("SELECT avg(downloadtime) FROM tb_npp where date(date) between %s and %s and substring(host,6,3) in %s",(s_date,e_date,dc))
Can somebody please tell me what I am doing wrong?
Thanks in advance
I'm not familiar with mysql-connector, but its behavior appears to be similar to MySQLdb in this regard. If that's true, you need to use a bit of string formatting:
sql = """SELECT avg(downloadtime) FROM tb_npp where date(date) = %s
and substring(host,6,3) in ({c})""".format(
c=', '.join(['%s']*len(dc)))
args = ['2013-07-01'] + dc
cursor3.execute(sql, args)
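One edge case with this approach: if dc is empty, the query ends up with IN (), which MySQL rejects as a syntax error. A possible way to handle that is sketched below; in_list_sql is a hypothetical helper, and the two dates are placeholder values standing in for s_date and e_date:

```python
def in_list_sql(column_expr, values, marker="%s"):
    """Return (fragment, params) for `column_expr IN (...)`. Hypothetical helper."""
    values = list(values)
    if not values:
        # `IN ()` is a syntax error in MySQL, so fall back to a
        # condition that is never true.
        return "1 = 0", []
    marks = ", ".join([marker] * len(values))
    return "{} IN ({})".format(column_expr, marks), values


dc = ['sjc', 'iad', 'las']
fragment, in_params = in_list_sql("substring(host,6,3)", dc)
sql = (
    "SELECT avg(downloadtime) FROM tb_npp"
    " WHERE date(date) BETWEEN %s AND %s AND " + fragment
)
# Placeholder dates for illustration only
args = ['2013-07-01', '2013-07-31'] + in_params
# With a cursor: cursor3.execute(sql, args)
```

Only the markers are formatted into the SQL text; the host codes themselves are still passed as parameters.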
As an alternative to @unutbu's answer, here is one that is specific to mysql-connector:
cursor.execute(
    "SELECT thing "
    " FROM table "
    " WHERE some_col > %(example_param)s "
    " AND id IN ({li})".format(li=", ".join(list_of_values)),
    params={
        "example_param": 33
    })
If you try to move the joined list into a parameter (like example_param), it may complain because mysql-connector interprets the values as strings. Note that this approach puts the list values directly into the SQL text, so use it only with trusted values.
If your list contains non-string values (such as integers), str.join will raise a TypeError, so in your join call replace list_of_values with:
[str(v) for v in list_of_values]
SELECT avg(downloadtime) FROM tb_npp where date(date) = '2013-07-01' and substring(host,6,3) in %s
That's it, that's it