How to write multi column in clause with sqlalchemy - python

Is there a way to write a multi-column IN clause using SQLAlchemy?
Here is an example of the actual query:
SELECT url FROM pages WHERE (url_crc, url) IN ((2752937066, 'http://members.aye.net/~gharris/blog/'), (3799762538, 'http://www.coxandforkum.com/'));
I have a table with a two-column primary key, and I'm hoping to avoid adding one more key just to be used as an index.
PS: I'm using a MySQL DB.
Update: This query will be used for batch processing, so I would need to put a few hundred pairs into the IN clause. With the IN clause approach I hope to have a known, fixed limit on how many pairs I can put into one query, much as Oracle has a limit of 1000 items by default.
An AND/OR combination might instead be limited by the length of the query in characters, which is variable and less predictable.

Assuming that you have your model defined in Page, here's an example using tuple_:
from sqlalchemy import select, tuple_
keys = [
    (2752937066, 'http://members.aye.net/~gharris/blog/'),
    (3799762538, 'http://www.coxandforkum.com/')
]
# SQLAlchemy 1.x calling style; 1.4+ also accepts select(Page.url)
select([
    Page.url
]).select_from(
    Page
).where(
    tuple_(Page.url_crc, Page.url).in_(keys)
)
Or, using the query API:
session.query(Page.url).filter(tuple_(Page.url_crc, Page.url).in_(keys))
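For the batch case described in the question (a few hundred pairs per query), one option is to split the key list into fixed-size chunks and run one IN query per chunk. This is only a minimal sketch: the helper name chunked and the chunk size of 500 are illustrative assumptions, not part of SQLAlchemy and not a documented MySQL limit.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items (hypothetical helper)."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Illustrative data: 1200 (url_crc, url) pairs.
keys = [(n, "http://example.com/%d" % n) for n in range(1200)]
chunks = list(chunked(keys, 500))
print([len(c) for c in chunks])
```

Each chunk could then be passed to tuple_(Page.url_crc, Page.url).in_(chunk) in place of the full keys list.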

I do not think this is currently possible in SQLAlchemy, and not all RDBMS support this.
You can always transform this into an OR(AND...) condition though:
from sqlalchemy import and_, or_
filter_rows = [
    (2752937066, 'http://members.aye.net/~gharris/blog/'),
    (3799762538, 'http://www.coxandforkum.com/'),
]
qry = session.query(Page)
qry = qry.filter(or_(*(and_(Page.url_crc == crc, Page.url == url) for crc, url in filter_rows)))
print(qry)
should produce something like (for SQLite):
SELECT pages.id AS pages_id, pages.url_crc AS pages_url_crc, pages.url AS pages_url
FROM pages
WHERE pages.url_crc = ? AND pages.url = ? OR pages.url_crc = ? AND pages.url = ?
-- (2752937066L, 'http://members.aye.net/~gharris/blog/', 3799762538L, 'http://www.coxandforkum.com/')
Alternatively, you can combine two columns into just one:
from sqlalchemy import String, func
filter_rows = [
    (2752937066, 'http://members.aye.net/~gharris/blog/'),
    (3799762538, 'http://www.coxandforkum.com/'),
]
qry = session.query(Page)
qry = qry.filter((func.cast(Page.url_crc, String) + '|' + Page.url).in_(["{}|{}".format(*_frow) for _frow in filter_rows]))
print(qry)
which produces the below (for SQLite), so you can use IN:
SELECT pages.id AS pages_id, pages.url_crc AS pages_url_crc, pages.url AS pages_url
FROM pages
WHERE (CAST(pages.url_crc AS VARCHAR) || ? || pages.url) IN (?, ?)
-- ('|', '2752937066|http://members.aye.net/~gharris/blog/', '3799762538|http://www.coxandforkum.com/')
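The OR(AND...) form above can also be produced without an ORM. The sketch below uses the standard-library sqlite3 module as a stand-in database; the helper name pair_filter and the sample rows are assumptions made for illustration.

```python
import sqlite3


def pair_filter(col_a, col_b, pairs):
    """Build '(a = ? AND b = ?) OR ...' plus the flat parameter list.

    Column names are inserted into the SQL text, so they must come from
    trusted code, never from user input. `pairs` must not be empty.
    """
    clause = " OR ".join(
        "({} = ? AND {} = ?)".format(col_a, col_b) for _ in pairs
    )
    params = [value for pair in pairs for value in pair]
    return clause, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url_crc INTEGER, url TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [(1, "http://a.example/"), (2, "http://b.example/"), (3, "http://c.example/")],
)

wanted = [(1, "http://a.example/"), (3, "http://c.example/")]
clause, params = pair_filter("url_crc", "url", wanted)
rows = conn.execute(
    "SELECT url FROM pages WHERE " + clause + " ORDER BY url_crc", params
).fetchall()
print(rows)
```

Only the placeholders are generated by string building; the values themselves always travel as bound parameters.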

I ended up using the text()-based solution: I generated "(a,b) in ((:a1, :b1), (:a2, :b2), ...)" with named bind variables, plus a dictionary holding the bind variables' values.
from sqlalchemy import text
params = {}
enum_pairs = []
for counter, r in enumerate(records):
    a_param = "a%s" % counter
    params[a_param] = r['a']
    b_param = "b%s" % counter
    params[b_param] = r['b']
    pair_text = "(:%s,:%s)" % (a_param, b_param)
    enum_pairs.append(pair_text)
multicol_in_enumeration = ','.join(enum_pairs)
multicol_in_clause = text(
    " (a,b) in (" + multicol_in_enumeration + ")")
q = session.query(Table.id, Table.a,
                  Table.b).filter(multicol_in_clause).params(params)
Another option I thought about was using MySQL upserts, but that would make the whole thing even less portable to other DB engines than using a multi-column IN clause.
Update: SQLAlchemy has the sqlalchemy.sql.expression.tuple_(*clauses, **kw) construct, which can be used for the same purpose. (I haven't tried it yet.)

Related

PYTHON - Dynamically update multiple columns using a custom MariaDB connector

While reading this question: SQL Multiple Updates vs single Update performance
I was wondering how I could dynamically implement an update for several variables at the same time using a connector like MariaDB's. Reading the official documentation, I did not find anything similar.
This question is similar, and it has helped me to understand how to use parametrized queries with custom connectors but it does not answer my question.
Let's suppose that, from one of the views of the project, we receive a dictionary.
This dictionary has the following structure (simplified example):
{'form-0-input_file_name': 'nofilename', 'form-0-id': 'K0944', 'form-0-gene': 'GJXX', 'form-0-mutation': 'NM_0040(p.Y136*)', 'form-0-trix': 'ZSSS4'}
Assuming that each key in the dictionary corresponds to a column in a table of the database, if I'm not mistaken we would have to iterate over the dictionary and build the query in each iteration.
Something like this (semi pseudo-code, probably it's not correct):
query = "UPDATE `db-dummy`.info "
for key in a_dict:
    query += "SET key = a_dict[key]"
It is not clear to me how to construct said query within a loop.
What is the most pythonic way to achieve this?
Although this could work:
query = "UPDATE `db-dummy`.info SET "
for index, key in enumerate(a_dict):
    # SET appears once; the assignments are separated by commas
    query = query + (", " if index != 0 else "") + "{0} = '{1}'".format(key, a_dict[key])
you should consider parameterized queries for safety and security. Moreover, a dynamic dictionary may raise other concerns; it may be best to verify or filter the keys against an agreed set before attempting such an operation.
query = "UPDATE `db-dummy`.info SET "
for index, key in enumerate(a_dict):
    query = query + (", " if index != 0 else "") + "{0} = ?".format(key)
# Then execute with your connection/cursor
cursor.execute(query, tuple(a_dict.values()))
This is what I did (inspired by @ggordon's answer):
query = "UPDATE `db-dummy`.info "
for index, key in enumerate(a_dict):
    if index == 0:
        query = query + "SET {0} = ?".format(key)
    else:
        query = query + ", {0} = ?".format(key)
# bind record_id as the last parameter instead of concatenating it
query += " WHERE record_id = ?"
cursor.execute(query, tuple(a_dict.values()) + (record_id,))
And it works!
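The advice above about filtering the dictionary against an agreed set of keys can be combined with the parameterized UPDATE in one helper. This is a sketch under stated assumptions: build_update, ALLOWED_COLUMNS and the info table layout are hypothetical, and sqlite3 stands in for the MariaDB connector because both accept ? placeholders.

```python
import sqlite3

# Assumed whitelist of columns that the form is allowed to update.
ALLOWED_COLUMNS = {"gene", "mutation", "trix"}


def build_update(table, values, key_column):
    """Return (sql, params) for a parameterized UPDATE.

    Only keys present in ALLOWED_COLUMNS are used. `table` and
    `key_column` are assumed to come from trusted code.
    """
    columns = [c for c in values if c in ALLOWED_COLUMNS]
    if not columns:
        raise ValueError("no updatable columns supplied")
    assignments = ", ".join("{} = ?".format(c) for c in columns)
    sql = "UPDATE {} SET {} WHERE {} = ?".format(table, assignments, key_column)
    return sql, [values[c] for c in columns]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE info (record_id TEXT, gene TEXT, mutation TEXT, trix TEXT)"
)
conn.execute("INSERT INTO info VALUES ('K0944', 'old', 'old', 'old')")

form = {"gene": "GJXX", "mutation": "NM_0040(p.Y136*)", "trix": "ZSSS4", "bogus": "x"}
sql, params = build_update("info", form, "record_id")
# The key value is appended last, matching the trailing WHERE placeholder.
conn.execute(sql, params + ["K0944"])
row = conn.execute("SELECT gene, mutation, trix FROM info").fetchone()
print(sql)
print(row)
```

The unknown key "bogus" is silently dropped here; raising an error instead would be an equally reasonable design choice.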

Best practice querying the DB using python

Below is the code. While iterating over the dictionary, the code queries the database multiple times. Is it best practice to execute the query, i.e. ping the DB, multiple times?
import cx_Oracle
connDev = 'username/password@hostname:port/service'
connDev = cx_Oracle.connect(connDev)
cursDev = connDev.cursor()
d = {'2006': '20170019201',
     '2006172': '2017000002',
     '200617123': '200003'
     }
for key, value in d.items():
    cursDev.execute('SELECT columnName from tableName where columnName={}'.format(key))
    if len(cursDev.fetchall()) != 0:
        cursDev.execute('UPDATE tableName SET columnName= {0} WHERE columnName= {1} '.format(value, key))
    else:
        continue
connDev.commit()
cursDev.close()
connDev.close()
You could run a single query and get everything:
cursDev.execute(
    'SELECT columnName FROM tableName WHERE columnName IN ({})'.format(
        ','.join(':p{}'.format(n) for n in range(len(d)))
    ),
    {'p{}'.format(n): k for n, k in enumerate(d)}
)
or run the updates directly - they do nothing if the row is not found:
for k, v in d.items():
    cursDev.execute(
        'UPDATE tableName SET columnName = :value WHERE columnName = :key',
        {'value': v, 'key': k}
    )
Note that both examples use parameterized queries: the data is passed separately from the query, and it is the job of the database to do the parameter interpolation. This frees you from quote hell and prevents injection automatically, besides performing better.
The code uses :value named-style parameter placeholders because that is what cx_Oracle uses - see documentation on cx_Oracle.paramstyle.
For a 'batch' update like this, the executemany() call is going to be the most efficient way.
As @nosklo noted, the SELECT calls aren't necessary - they just take time. And with executemany() you don't need to do repeated execute() calls, which is another saving.
From samples/ArrayDMLRowCounts.py:
# delete the following parent IDs only
parentIdsToDelete = [20, 30, 50]
print("Deleting Parent IDs:", parentIdsToDelete)
print()
# enable array DML row counts for each iteration executed in executemany()
cursor.executemany("""
delete from ChildTable
where ParentId = :1""",
[(i,) for i in parentIdsToDelete],
arraydmlrowcounts = True)
# display the number of rows deleted for each parent ID
rowCounts = cursor.getarraydmlrowcounts()
for parentId, count in zip(parentIdsToDelete, rowCounts):
    print("Parent ID:", parentId, "deleted", count, "rows.")
See Efficient and Scalable Batch Statement Execution in Python cx_Oracle for more examples, including those with multiple binds like you will need.
[Never ever (except in specialized cases) build up a SQL statement by concatenating strings. It is a security hole and can also give poor performance. Always use bind variables.]
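Applied to the dictionary from the question, the executemany() idea might look like the sketch below. It runs against the standard-library sqlite3 module rather than cx_Oracle so that it is self-contained; the table contents are made up for illustration, and the arraydmlrowcounts option shown above is specific to cx_Oracle and is not used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (columnName TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?)", [("2006",), ("2006172",), ("999",)]
)

d = {"2006": "20170019201", "2006172": "2017000002", "200617123": "200003"}

# One executemany() call replaces the SELECT-then-UPDATE loop; rows whose
# old value is not present are simply left untouched.
cursor = conn.cursor()
cursor.executemany(
    "UPDATE tableName SET columnName = :new WHERE columnName = :old",
    [{"old": old, "new": new} for old, new in d.items()],
)
conn.commit()

updated = cursor.rowcount
values = sorted(r[0] for r in conn.execute("SELECT columnName FROM tableName"))
print(updated, values)
```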

Avoid python interpreting % as a placeholder in like mysql clause

I'm trying to create a DataFrame through a SQL query with the pandas read_sql_query method. The query has a WHERE clause that includes a LIKE operation, but it also includes an = operation that depends on a variable. The issue is that Python is interpreting the % in the LIKE operation as a placeholder, just like in the = variable operation, which is something I DO want.
Here's an example of it:
sql_string = """ SELECT a,b from table WHERE a = %(variable)s
AND b like '%fixed_chars%' """
params = {'variable':'AA'}
df = pandas.read_sql_query(sql_string, params=params, con=connection)
The error that I get is TypeError: not enough arguments for format string, since it interprets the % that you usually use as a wildcard in MySQL as a placeholder in Python.
In this case, you have to double every % that is not a formatting placeholder:
sql_string = "SELECT a,b from table WHERE a = %(variable)s AND \
b like '%%fixed_chars%%'"
Hope this helps!
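The behaviour can be reproduced without a database, because drivers with the "pyformat" parameter style (for example MySQLdb and PyMySQL) apply Python %-formatting to the query text. The sketch below imitates that step by hand; the table name t and the pre-quoted values are assumptions for illustration only, and the third query shows an alternative not mentioned above: passing the LIKE pattern itself as a parameter.

```python
# A bare % in the query text is read as the start of a placeholder.
broken = "SELECT a, b FROM t WHERE a = %(variable)s AND b LIKE '%fixed_chars%'"
# Doubling the % turns it back into a literal percent sign.
escaped = "SELECT a, b FROM t WHERE a = %(variable)s AND b LIKE '%%fixed_chars%%'"
# Alternative: keep the wildcard out of the SQL text entirely.
parameterized = "SELECT a, b FROM t WHERE a = %(variable)s AND b LIKE %(pattern)s"

# Values are pre-quoted here only to imitate what a driver does after escaping;
# with a real driver you pass the raw values and let it quote them.
args = {"variable": "'AA'", "pattern": "'%fixed_chars%'"}

try:
    broken % args
    broken_error = None
except (TypeError, ValueError) as exc:
    broken_error = exc

print(type(broken_error).__name__)
print(escaped % args)
print(parameterized % args)
```

With the parameterized form, the call would pass params={'variable': 'AA', 'pattern': '%fixed_chars%'} and no doubling is needed.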

Insert tuple in sqlite request [duplicate]

All I want to do is send a query like
SELECT * FROM table WHERE col IN (110, 130, 90);
So I prepared the following statement
SELECT * FROM table WHERE col IN (:LST);
Then I use
sqlite3_bind_text(stmt, 1, "110, 130, 90", -1, SQLITE_STATIC);
Unfortunately this becomes
SELECT * FROM table WHERE col IN ('110, 130, 90');
and is useless (note the two additional single quotes). I already tried putting extra ' in the string but they get escaped. I didn't find an option to turn off the escaping or prevent the text from being enclosed by single quotes. The last thing I can think of is not using a prepared statement, but I'd only take it as last option. Do you have any ideas or suggestions?
Thanks
Edit:
The number of parameters is dynamic, so it might be three numbers, as in the example above, one or twelve.
You can dynamically build a parameterized SQL statement of the form
SELECT * FROM TABLE WHERE col IN (?, ?, ?)
and then call sqlite3_bind_int once for each "?" you added to the statement.
There is no way to directly bind a text parameter to multiple integer (or, for that matter, multiple text) parameters.
Here's pseudo code for what I have in mind:
-- Args is an array of parameter values
for i = Lo(Args) to Hi(Args)
    paramlist = paramlist + ', ?'
-- Mid(paramlist, 3) drops the leading ', ' (the first two characters)
sql = 'SELECT * FROM TABLE WHERE col IN (' + Mid(paramlist, 3) + ')'
for i = Lo(Args) to Hi(Args)
    sql_bind_int(sql, i, Args[i])
-- execute query here.
I just faced this question myself, but answered it by creating a temporary table and inserting all the values into that, so that I could then do:
SELECT * FROM TABLE WHERE col IN (SELECT col FROM temporarytable);
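The temporary-table approach could look like this from Python's standard-library sqlite3 module. It is a sketch rather than the answerer's code: the table names items and wanted_values and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (col INTEGER)")
conn.executemany("INSERT INTO items VALUES (?)", [(90,), (100,), (110,), (130,)])

wanted = [110, 130, 90]

# Load the lookup values into a temporary table, one bound value per row.
conn.execute("CREATE TEMP TABLE wanted_values (col INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted_values VALUES (?)", [(v,) for v in wanted])

# The main query no longer depends on how many values there are.
rows = conn.execute(
    "SELECT col FROM items "
    "WHERE col IN (SELECT col FROM wanted_values) ORDER BY col"
).fetchall()
print(rows)
```

The query text stays constant regardless of the list length, which keeps it reusable as a single prepared statement.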
Even simpler, build your query like this:
"SELECT * FROM TABLE WHERE col IN (" + ",".join(["?"] * len(lst)) + ")"
Depending on your build of sqlite (it's not part of the default build), you may be able to use:
SELECT * FROM table WHERE col IN carray(?42);
and then bind ?42 using (assuming the C API):
int32_t data[] = {110, 130, 90};
sqlite3_carray_bind(
stmtPtr, 42,
data, sizeof(data)/sizeof(data[0]),
CARRAY_INT32, SQLITE_TRANSIENT);
I haven't actually tested that, I just read the docs: https://sqlite.org/carray.html
You cannot pass an array as one parameter, but you can pass each array value as a separate parameter (IN (?, ?, ?)).
The safe way to do this for dynamic number parameters (you should not use string concatenation, .format(), etc. to insert the values themselves into the query, it can lead to SQL injections) is to generate the query string with the needed number of ? placeholders and then bind the array elements. Use array concatenation or spread syntax (* or ... in most languages) if you need to pass other parameters too.
Here is an example for Python 3:
c.execute('SELECT * FROM TABLE WHERE col IN ({}) LIMIT ?'
.format(', '.join(['?'] * len(values))), [*values, limit])
One solution (which I haven't tried yet in code, but only on the SQLite shell) is to use json_each function from SQLite.
So you could do something like:
SELECT * FROM table
WHERE col IN (SELECT value FROM json_each(?));
The caveat is that you'd have to manually assemble a valid JSON array with the values you're trying to bind.
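In Python the JSON array can be assembled with json.dumps, so the whole list travels as one bound text parameter. This sketch assumes the SQLite library behind the sqlite3 module was built with the JSON functions available; the table name items and its rows are illustrative.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (col INTEGER)")
conn.executemany("INSERT INTO items VALUES (?)", [(90,), (100,), (110,), (130,)])

values = [110, 130, 90]

# json.dumps produces '[110, 130, 90]', which json_each expands into rows.
rows = conn.execute(
    "SELECT col FROM items "
    "WHERE col IN (SELECT value FROM json_each(?)) ORDER BY col",
    (json.dumps(values),),
).fetchall()
print(rows)
```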
A much simpler and safer answer is to generate only the mask (the placeholders, as opposed to the data part of the query) and let the driver's parameter binding, which protects against SQL injection, do its job.
Suppose we have some ids in an array, and some cb callback:
/* we need to generate a '?' for each item in our mask */
const mask = Array(ids.length).fill('?').join();
db.get(`
SELECT *
FROM films f
WHERE f.id
IN (${mask})
`, ids, cb);
Working on the same functionality led me to this approach:
(nodejs, es6, Promise)
var deleteRecords = function (tblName, data) {
    return new Promise((resolve, reject) => {
        // one '?' placeholder per id; the ids themselves are bound as parameters
        var mask = data.map(() => '?').join(',');
        this.run(`DELETE FROM ${tblName} WHERE id IN (${mask})`, data, function (err) {
            err ? reject('deleteRecords failed with : ' + err) : resolve();
        });
    });
};
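This works fine as well (JavaScript ES6), as long as the list contains only trusted numeric values, because they are interpolated directly into the SQL text: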
let myList = [1, 2, 3];
`SELECT * FROM table WHERE col IN (${myList.join()});`
You can try this
RSQLite in R:
lst <- c("a", "b", "c")
dbGetQuery(db_con, paste0("SELECT * FROM table WHERE col IN (", paste0(shQuote(lst), collapse=", ") , ");"))
My solution for node (ES6, Promises):
let records = await db.all(`
SELECT * FROM table
WHERE (column1 = ?) and column2 IN ( ${[...val2s].fill('?').join(',')} )
`, [val1, ...val2s])
Works with a variable number of possible values.
This uses sqlite-async but you can modify it for the callback style version trivially.
If you are using Python, the easiest way to handle this in practice is to create a local function that tests against a string value of the list (which can be passed as a bind variable).
I used this when providing "Query By Example" functionality in a Python GUI app.
Pros:
- can use a common approach for parsing and building the SQL across entries, as I would when parsing LIKE xxx and > xxx etc.
- just one extra call to set the function up, either at connection time or when the function call is detected in the created SQL
Cons:
- the function needs to parse the string list for each row, which is bad if the query is running against a large table
- embedded commas, blanks and other similar stuff may be difficult to handle
For example:
1. the user enters IN 18C, 356, 013 into the Account field in the application
2. the application creates SQL with ... WHERE inz( Account , ? ) ...
3. the application creates the string bind value 18C, 356, 013
4. the application issues <sqlite3.connection>.create_function("inz", 2, inz ) to bind the local Python function inz (see below) to the SQLite function inz
5. the application issues the query
The code for the inz function is as follows:
def inz( val , possibles ) :
    """implements the IN list function allowing one bind variable
    use <sqlite3.connection>.create_function("inz", 2, inz )
    and ensure that the bind variable is a comma delimited list
    in string form (without quotes)
    matches can be string or integer but do not allow for leading
    or trailing spaces or contained commas, quotes etc or floating points
    """
    poss = [ x.strip() for x in possibles.split(',') ]
    if val in poss :
        return True
    if isinstance( val, int ) :
        ipos = [ int(x) for x in poss if x.isdecimal() ]
        if val in ipos :
            return True
    return False
For example, if you want the sql query:
select * from table where col in (110, 130, 90)
What about:
my_list = [110, 130, 90]
my_list_str = repr(my_list).replace('[','(').replace(']',')')
cur.execute("select * from table where col in %s" % my_list_str )

What is the correct way to form MySQL queries in python?

I am new to Python; I come here from the land of PHP. I constructed a SQL query like this in Python based on my PHP knowledge, and I get warnings and errors:
cursor_.execute("update posts set comment_count = comment_count + "+str(cursor_.rowcount)+" where ID = " + str(postid))
# rowcount here is int
What is the right way to form queries?
Also, how do I escape strings to form SQL safe ones? like if I want to escape -, ', " etc, I used to use addslashes. How do we do it in python?
Thanks
First of all, it's high time to learn to pass variables to queries safely, using the method Matus described. More explicitly:
params = (foovar, barvar)
cursor.execute("QUERY WHERE foo = ? AND bar = ?", params)
If you only need to pass one variable, you must still make it a tuple: insert a comma at the end to tell Python to treat it as a one-tuple: params = (onevar,). Note that the placeholder style depends on the driver: sqlite3 uses ?, while MySQL drivers such as MySQLdb and mysql-connector-python use %s.
Your example would be of form:
cursor_.execute("update posts set comment_count = comment_count + ? where id = ?",
(cursor_.rowcount, postid))
You can also use named parameters like this:
cursor_.execute("update posts set comment_count = comment_count + :count where id = :id",
{"count": cursor_.rowcount, "id": postid})
This time the parameters aren't a tuple, but a dictionary that is formed in pairs of "key": value.
From the Python manual:
t = (symbol,)
c.execute('select * from stocks where symbol=?', t)
This way you prevent SQL injection (I suppose this is the "SQL safe" you refer to) and also have the formatting solved.
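Both placeholder styles can be tried out with the standard-library sqlite3 module, which is what the manual excerpt refers to. The sketch below is an illustration under assumptions: the posts and notes tables are made up, and a MySQL driver would use %s or %(name)s placeholders instead of ? and :name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, comment_count INTEGER)")
conn.execute("INSERT INTO posts VALUES (1, 5)")

new_comments, post_id = 3, 1

# Positional ("qmark") style: values are passed as a tuple.
conn.execute(
    "UPDATE posts SET comment_count = comment_count + ? WHERE id = ?",
    (new_comments, post_id),
)

# Named style: values are passed as a dictionary.
conn.execute(
    "UPDATE posts SET comment_count = comment_count + :count WHERE id = :id",
    {"count": new_comments, "id": post_id},
)

# A hostile string is stored as plain data rather than executed as SQL,
# so no manual escaping (addslashes and the like) is needed.
conn.execute("CREATE TABLE notes (body TEXT)")
hostile = "x'); DROP TABLE posts; --"
conn.execute("INSERT INTO notes VALUES (?)", (hostile,))

count = conn.execute(
    "SELECT comment_count FROM posts WHERE id = ?", (post_id,)
).fetchone()[0]
stored = conn.execute("SELECT body FROM notes").fetchone()[0]
print(count, stored == hostile)
```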
