Python/ SQL Statement - Use of parameters / injection protection?

I have inherited some code similar to the example below. I understand the concept of passing values to make a query dynamic (in this case field_id), but I don't understand the benefit of putting the passed-in field_id list into a dictionary, parameters = {"logical_field_id": field_id}, and then reading from that dictionary to build the SQL statement. Along the same lines, why return parameters=parameters rather than just listing parameters in the return? I assume this is all meant to make the request more secure, but I would like to better understand why and how, because I need to do a similar task on the slightly more complex query further below.
def get_related_art(self, field_id):
    parameters = {"logical_field_id": field_id}
    sql = (
        "SELECT a.id AS id,"
        " a.name AS name,"
        " a.description AS description,"
        " a.type AS type,"
        " a.subtype AS subtype "
        " FROM ArtclTbl AS a INNER JOIN ("
        " SELECT article_id AS id FROM LogFldArtclTbl"
        " WHERE logical_field_id = %(logical_field_id)s"
        " ORDER BY a.name"
    )
    return self.query(sql, parameters=parameters)
My reason for asking this question is that I was asked to parameterize this:
def get_group_fields(self, ebytes=None):
    parameters = {}
    where_clause = (
        f"WHERE eig_eb.ebyte in ({', '.join(str(e) for e in ebytes)})" if ebytes else ""
    )
    sql = (
        "SELECT l.id AS id, "
        " eig_eb.ebyte AS ebyte, "
        " eig.id AS instrument_group_id, "
        " eig_lf.relationship_type AS relationship "
        ....
        f" {where_clause}"
    )
I started to modify the code to build the parameters by iterating and then reference that value in the original location. This 'works', except that the query string now contains ([ebyte1, ebyte2] instead of (ebyte1, ebyte2). I could modify the string to work around this, but I really wanted to understand the why of this first.
parameters = {"exbytes": ', '.join(str(e) for e in exbytes)}
...
where_clause = (
    f"WHERE eig_eb.exbyte in " + str(exbytes) if exbytes else ""
)

The benefit of named parameter placeholders is that you can pass the parameter values as a dict, and you can add values to that dict in any order. There's no benefit in the first example you show, because you only have one entry in the dict.
There's no benefit in the second example either, because the parameters are part of an IN() list, and there are no other parameterized parts of the query. The order of values in an IN() list is irrelevant. So you could just use positional parameters instead of named parameters.
where_clause = (
    f"WHERE eig_eb.ebyte in ({', '.join('%s' for _ in ebytes)})" if ebytes else ""
)
Then you don't need a dict at all, you can just pass the ebytes list as the parameters.
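A minimal sketch of that idea, kept free of any database connection so it runs on its own. The helper name build_in_clause is hypothetical, and the %s placeholder style is assumed to match whatever driver sits behind self.query():

```python
def build_in_clause(column, values):
    """Return (sql_fragment, params) for `column IN (...)` with one %s per value."""
    values = list(values)
    if not values:
        # No filter requested: empty fragment and no parameters.
        return "", []
    placeholders = ", ".join(["%s"] * len(values))
    return f"WHERE {column} IN ({placeholders})", values


where_clause, params = build_in_clause("eig_eb.ebyte", [3, 7, 11])
print(where_clause)  # WHERE eig_eb.ebyte IN (%s, %s, %s)
print(params)        # [3, 7, 11]
```

Only the placeholders end up in the SQL text; the values travel separately in params. The ([ebyte1, ebyte2] output in the question comes from calling str() on a list, which always includes the square brackets.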
The syntax parameters=parameters is a keyword argument in a Python function call. I don't know the function self.query() in your example, but I suppose it accepts keyword arguments to implement optional arguments. The fact that your local variable has the same name as the keyword argument is a coincidence.
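To illustrate the keyword-argument point, here is a small sketch. The query() function below is a hypothetical stand-in that only echoes its arguments; the real self.query() signature is not shown in the question, so the parameter names are assumptions:

```python
def query(sql, parameters=None, fetch_all=True):
    # Hypothetical stand-in for self.query(): it just reports what it received.
    return {"sql": sql, "parameters": parameters, "fetch_all": fetch_all}


parameters = {"logical_field_id": 42}
sql = "SELECT id FROM LogFldArtclTbl WHERE logical_field_id = %(logical_field_id)s"

# Left of "=" is the function's argument name, right of "=" is the local variable.
by_keyword = query(sql, parameters=parameters)
by_position = query(sql, parameters)

print(by_keyword == by_position)  # True
```

Passing by keyword is mainly a readability and robustness choice: the call keeps working if the function later gains more optional arguments in between.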

Related

PYTHON - Dynamically update multiple columns using a custom MariaDB connector

While reading this question: SQL Multiple Updates vs single Update performance
I was wondering how I could dynamically implement an update for several columns at the same time using a connector like MariaDB's. Reading the official documentation, I did not find anything similar.
This question is similar, and it has helped me understand how to use parametrized queries with custom connectors, but it does not answer my question.
Let's suppose that, from one of the views of the project, we receive a dictionary.
This dictionary has the following structure (simplified example):
{'form-0-input_file_name': 'nofilename', 'form-0-id': 'K0944', 'form-0-gene': 'GJXX', 'form-0-mutation': 'NM_0040(p.Y136*)', 'form-0-trix': 'ZSSS4'}
Assuming that each key in the dictionary corresponds to a column in a table of the database, if I'm not mistaken we would have to iterate over the dictionary and build the query in each iteration.
Something like this (semi pseudo-code, probably it's not correct):
query = "UPDATE `db-dummy`.info "
for key in a_dict:
    query += "SET key = a_dict[key]"
It is not clear to me how to construct said query within a loop.
What is the most pythonic way to achieve this?

Although this could work:
query = "UPDATE `db-dummy`.info SET "
for index, key in enumerate(a_dict):
    # SET appears once; later assignments are separated by commas
    query = query + (", " if index != 0 else "") + "{0} = '{1}'".format(key, a_dict[key])
You should consider parameterized queries for safety and security. Moreover, a dynamic dictionary may raise other concerns: it may be best to verify or filter the keys against an agreed set before attempting such an operation.
query = "UPDATE `db-dummy`.info SET "
for index, key in enumerate(a_dict):
    query = query + (", " if index != 0 else "") + "{0} = ?".format(key)
# Then execute with your connection/cursor
cursor.execute(query, tuple(a_dict.values()))
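The key-filtering advice can be sketched as follows. This uses the standard-library sqlite3 module so that it runs as-is; sqlite3 happens to use the same ? placeholder. The table layout, the ALLOWED_COLUMNS set and the build_update helper are all assumptions made for illustration, not part of the original project:

```python
import sqlite3

# Hypothetical whitelist: only these keys may become column names in the SQL text.
ALLOWED_COLUMNS = {"gene", "mutation", "trix"}


def build_update(table, data, key_column, key_value):
    """Build a parameterized UPDATE from the whitelisted keys of `data`."""
    columns = [c for c in data if c in ALLOWED_COLUMNS]
    if not columns:
        raise ValueError("no updatable columns supplied")
    assignments = ", ".join(f"{c} = ?" for c in columns)
    sql = f"UPDATE {table} SET {assignments} WHERE {key_column} = ?"
    params = [data[c] for c in columns] + [key_value]
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE info (record_id TEXT PRIMARY KEY, gene TEXT, mutation TEXT, trix TEXT)"
)
conn.execute("INSERT INTO info VALUES ('K0944', 'old', 'old', 'old')")

# "is_admin" is not whitelisted, so it never reaches the SQL text.
form = {"gene": "GJXX", "mutation": "NM_0040(p.Y136*)", "trix": "ZSSS4", "is_admin": "1"}
sql, params = build_update("info", form, "record_id", "K0944")
conn.execute(sql, params)

row = conn.execute(
    "SELECT gene, mutation, trix FROM info WHERE record_id = ?", ("K0944",)
).fetchone()
print(sql)  # UPDATE info SET gene = ?, mutation = ?, trix = ? WHERE record_id = ?
print(row)  # ('GJXX', 'NM_0040(p.Y136*)', 'ZSSS4')
```

Placeholders can only stand in for values, never for column or table names, which is why the column names are checked against a fixed set before being formatted into the statement.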

This is what I did (inspired by @ggordon's answer):
query = "UPDATE `db-dummy`.info "
for index, key in enumerate(a_dict):
    if index == 0:
        query = query + "SET {0} = ?".format(key)
    else:
        query = query + ", {0} = ?".format(key)
# Bind record_id as a parameter too, instead of concatenating it into the string
query += " WHERE record_id = ?"
cursor.execute(query, tuple(a_dict.values()) + (record_id,))
And it works!

Dynamic binding on WHERE for query string on sql python

My query-string parameters arrive as a dict that is used to filter data in the WHERE clause.
parameters = {
    "manufacturerId": "1",
    "fileName": "abc1234 ",
    "categoryName": "normal"
}
And the SQL query is:
fileSql = """select * from file_table as a
    left join category_table as b
    on a.fId = b.fId
    left join manufacturer_table as c
    on c.mId = a.mId
    where c.manufacturerId = %(manufacturerId)s and
    a.file_name = %(fileName)s and
    b.name = %(categoryName)s ;"""
cursor.execute(fileSql, parameters)
This works well: the parametrized query binds each dict value to the SQL by key.
But this approach is not flexible. If my query string changes to
{
    "manufacturerId": "1",
    "fileName": "abc1234 "
}
then the code fails, because categoryName is missing.
Only manufacturerId is required; the other key-value pairs are optional and narrow the filter further.
How can I improve the code to handle this?

The simple, obvious answer is to build your query dynamically, i.e.:
fileSql = """
select * from file_table as a
left join category_table as b on a.fId = b.fId
left join manufacturer_table as c on c.mId = a.mId
where c.manufacturerId = %(manufacturerId)s
"""
if "fileName" in parameters:
    fileSql += " and a.file_name = %(fileName)s "
if "categoryName" in parameters:
    fileSql += " and b.name = %(categoryName)s "
Note that this is still not optimal, since we keep the join on category_table even when we don't need it. This can be solved in a similar way by dynamically building the "from" clause too, and that's OK if you only have a couple of such cases in your project. But database-driven apps often require a lot of dynamic queries, and building them by hand from plain strings quickly becomes tedious and error-prone, so you may want to check what an ORM (Peewee comes to mind) can do for you.
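One way to keep the hand-built version manageable is to drive the optional conditions from a small mapping. The sketch below is an assumption-laden illustration: it uses the standard-library sqlite3 module with :name placeholders (instead of the %(name)s style above) so that it runs without a database server, and the OPTIONAL_FILTERS mapping, the build_file_query helper and the sample rows are all made up:

```python
import sqlite3

# Hypothetical mapping: optional parameter name -> column it filters on.
OPTIONAL_FILTERS = {
    "fileName": "a.file_name",
    "categoryName": "b.name",
}


def build_file_query(parameters):
    """Add one condition per optional key that is actually present."""
    conditions = ["c.manufacturerId = :manufacturerId"]  # always required
    for key, column in OPTIONAL_FILTERS.items():
        if key in parameters:
            conditions.append(f"{column} = :{key}")
    return (
        "SELECT a.file_name FROM file_table AS a "
        "LEFT JOIN category_table AS b ON a.fId = b.fId "
        "LEFT JOIN manufacturer_table AS c ON c.mId = a.mId "
        "WHERE " + " AND ".join(conditions)
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE file_table (fId INTEGER, mId INTEGER, file_name TEXT);
    CREATE TABLE category_table (fId INTEGER, name TEXT);
    CREATE TABLE manufacturer_table (mId INTEGER, manufacturerId TEXT);
    INSERT INTO file_table VALUES (1, 10, 'abc1234'), (2, 10, 'xyz');
    INSERT INTO category_table VALUES (1, 'normal'), (2, 'special');
    INSERT INTO manufacturer_table VALUES (10, '1');
""")

only_required = {"manufacturerId": "1"}
with_file = {"manufacturerId": "1", "fileName": "abc1234"}

rows_all = conn.execute(build_file_query(only_required), only_required).fetchall()
rows_one = conn.execute(build_file_query(with_file), with_file).fetchall()
print(sorted(rows_all))  # [('abc1234',), ('xyz',)]
print(rows_one)          # [('abc1234',)]
```

The column names come from the fixed mapping, never from user input, so only the values are supplied by the caller and those are always bound as parameters.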

Passing multiple parameters with cursor.executemany?

I am working in Django 1.8 and doing some raw SQL queries using connection.cursor.
My question is about how to safely supply multiple parameters to the cursor. Here is my code:
cursor = connection.cursor()
query = "SELECT cost, id, date, org_id FROM mytable "
query += " WHERE ("
for i, c in enumerate(codes):
    query += "id=%s "
    if (i != len(codes)-1):
        query += ' OR '
# Close the id group before opening the org_id group
query += ") AND ("
for i, c in enumerate(orgs):
    query += "org_id=%s "
    if (i != len(orgs)-1):
        query += ' OR '
query += ")"
cursor.execute(query, tuple(codes), tuple(orgs))
But this gives me:
TypeError: execute() takes at most 3 arguments (4 given)
I'm trying to follow the PEP documentation on execute, it says that one can use executemany instead, but that doesn't seem to help either:
cursor.executemany(query, [tuple(codes), tuple(orgs)])
I just can't follow the PEP documentation without an example. Could anyone help?

Your problem is that you're passing more arguments to execute than it accepts. What you need is to combine the query's parameters into a single tuple. One way to do that is to use itertools.chain to chain both lists' elements into one iterable that can be used to create a single tuple:
import itertools
cursor.execute(query, tuple(itertools.chain(codes, orgs)))
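A runnable sketch of the same approach, using IN lists rather than chains of OR. It relies on the standard-library sqlite3 module, whose placeholder is ? rather than the %s Django uses, so the placeholder is passed in as an argument. The build_query helper and the sample rows are assumptions for illustration, and both lists are assumed to be non-empty:

```python
import itertools
import sqlite3


def build_query(codes, orgs, placeholder="?"):
    """Return (sql, params) with one placeholder per value, in matching order."""
    id_list = ", ".join([placeholder] * len(codes))
    org_list = ", ".join([placeholder] * len(orgs))
    sql = (
        "SELECT cost, id, date, org_id FROM mytable "
        f"WHERE id IN ({id_list}) AND org_id IN ({org_list})"
    )
    # A single flat tuple: all codes first, then all orgs.
    params = tuple(itertools.chain(codes, orgs))
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (cost REAL, id TEXT, date TEXT, org_id TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1.5, "A1", "2015-01-01", "X"),
        (2.5, "B2", "2015-01-02", "Y"),
        (3.5, "C3", "2015-01-03", "X"),
    ],
)

sql, params = build_query(["A1", "C3"], ["X"])
rows = conn.execute(sql, params).fetchall()
print(params)                       # ('A1', 'C3', 'X')
print(sorted(r[1] for r in rows))   # ['A1', 'C3']
```

The order of values in the flat tuple has to follow the order in which the placeholders appear in the SQL text, which is why the codes are chained before the orgs.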

Django Raw SQL give me TypeError not enough arguments

I've got this raw query:
counters = Counter.objects.raw("""
    SELECT id, name FROM building_counter c
    INNER JOIN scope_scope_buildings ssb
    ON c.building_id = ssb.building_id
    AND ssb.scope_id = %s
    WHERE energy_id = %s
    AND parent_id is not NULL
    AND type = "C"
    """, [self.id, energy_id])
The result gave me :
TypeError: not enough arguments for format string
[2, 1L]
I don't understand what's wrong with it :s

The real problem is that you're passing in a list to params, but then you're trying to call repr on the result (I only know this because I got the same problem when running it in ipython). What you need to do is pass in a tuple:
counters = Counter.objects.raw("""
    SELECT id, name FROM building_counter c
    INNER JOIN scope_scope_buildings ssb
    ON c.building_id = ssb.building_id
    AND ssb.scope_id = %s
    WHERE energy_id = %s
    AND parent_id is not NULL
    AND type = 'C'
    """, (self.id, energy_id))
Or you can apply this patch to your django source and it'll turn these into tuples when you're in the shell.
If you don't use the shell for raw queries often, you can just ignore this though, since the rest of django handles list params just fine.
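The error message in the question is consistent with how Python's % string-formatting operator treats a list: as one single value rather than a sequence of arguments. The sketch below reproduces only that plain-Python behaviour; it does not use Django, and the template string is made up:

```python
template = "scope_id = %s AND energy_id = %s"

# A list counts as ONE argument, so the second %s has nothing to consume.
try:
    template % [2, 1]
except TypeError as exc:
    list_error = str(exc)

# A tuple is unpacked into one argument per placeholder.
tuple_result = template % (2, 1)

print(list_error)    # not enough arguments for format string
print(tuple_result)  # scope_id = 2 AND energy_id = 1
```

This only matters where something formats the query text with % for display; the values sent to the database are still bound by the driver.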

You must use '%s' instead of %s
counters = Counter.objects.raw("""
    SELECT id, name FROM building_counter c
    INNER JOIN scope_scope_buildings ssb
    ON c.building_id = ssb.building_id
    AND ssb.scope_id = '%s'
    WHERE energy_id = '%s'
    AND parent_id is not NULL
    AND type = 'C'
    """, [self.id, energy_id])

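You do not need %d for the first placeholder, even though self.id is an integer. In a Django raw query with a list of params, %s is the placeholder for every value regardless of its type, and the database driver converts each value to the proper SQL literal.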

What is the correct way to form MySQL queries in python?

I am new to Python; I come here from the land of PHP. I constructed a SQL query like this in Python based on my PHP knowledge, and I get warnings and errors:
cursor_.execute("update posts set comment_count = comment_count + "+str(cursor_.rowcount)+" where ID = " + str(postid))
# rowcount here is int
What is the right way to form queries?
Also, how do I escape strings to make them safe for SQL? For example, to escape -, ', " and so on, I used addslashes in PHP. How do we do it in Python?
Thanks

First of all, it's high time to learn to pass variables to queries safely, using the method Matus described. More clearly:
params = (foovar, barvar)
cursor.execute("QUERY WHERE foo = ? AND bar = ?", params)
If you only need to pass one variable, you must still make it a tuple: add a comma at the end to tell Python to treat it as a one-tuple: params = (onevar,)
Your example would be of form:
cursor_.execute("update posts set comment_count = comment_count + ? where id = ?",
                (cursor_.rowcount, postid))
You can also use named parameters like this:
cursor_.execute("update posts set comment_count = comment_count + :count where id = :id",
                {"count": cursor_.rowcount, "id": postid})
This time the parameters aren't a tuple, but a dictionary that is formed in pairs of "key": value.
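Both styles can be tried out with the standard-library sqlite3 module, which accepts ? and :name placeholders. Keep in mind that the placeholder syntax depends on the driver: MySQL drivers such as MySQLdb expect %s instead. The posts table and the numbers below are made up for the sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, comment_count INTEGER)")
conn.execute("INSERT INTO posts VALUES (1, 0)")

# Positional placeholders: values are matched by order.
conn.execute(
    "UPDATE posts SET comment_count = comment_count + ? WHERE id = ?",
    (3, 1),
)

# Named placeholders: values are matched by dictionary key.
conn.execute(
    "UPDATE posts SET comment_count = comment_count + :count WHERE id = :id",
    {"count": 2, "id": 1},
)

# A single value still needs a one-element tuple (note the trailing comma).
count = conn.execute(
    "SELECT comment_count FROM posts WHERE id = ?", (1,)
).fetchone()[0]
print(count)  # 5
```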

From the Python manual:
t = (symbol,)
c.execute('select * from stocks where symbol=?', t)
This way you prevent SQL injection (I suppose this is the SQL safety you refer to), and the formatting is handled for you as well.
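The difference is easy to see with a deliberately hostile input. This is a small sketch using the standard-library sqlite3 module; the stocks rows and the malicious string are invented for the demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stocks (symbol TEXT, price REAL)")
conn.executemany(
    "INSERT INTO stocks VALUES (?, ?)", [("IBM", 45.0), ("MSFT", 72.0)]
)

malicious = "IBM' OR '1'='1"

# String concatenation: the input becomes part of the SQL and rewrites the condition.
unsafe_sql = "SELECT symbol FROM stocks WHERE symbol = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Placeholder: the input is compared as a plain value, quotes and all.
safe_rows = conn.execute(
    "SELECT symbol FROM stocks WHERE symbol = ?", (malicious,)
).fetchall()

print(len(unsafe_rows))  # 2
print(safe_rows)         # []
```

With the placeholder, no escaping function like addslashes is needed, because the value never becomes part of the SQL text.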
