This question already has answers here:
How do you escape strings for SQLite table/column names in Python?
I have a wide table in a sqlite3 database, and I wish to dynamically query certain columns in a Python script. I know that it's bad to inject parameters by string concatenation, so I tried to use parameter substitution instead.
I find that, when I use parameter substitution to supply a column name, I get unexpected results. A minimal example:
import sqlite3 as lite

db = lite.connect("mre.sqlite")
c = db.cursor()

# Insert some dummy rows
c.execute("CREATE TABLE trouble (value real)")
c.execute("INSERT INTO trouble (value) VALUES (2)")
c.execute("INSERT INTO trouble (value) VALUES (4)")
db.commit()

# Column name written directly in the SQL text
for row in c.execute("SELECT AVG(value) FROM trouble"):
    print(row)  # Returns (3.0,)

# Column name supplied through a named placeholder
for row in c.execute("SELECT AVG(:name) FROM trouble", {"name": "value"}):
    print(row)  # Returns (0.0,)

db.close()
Is there a better way to accomplish this than simply injecting a column name into a string and running it?
As Rob just indicated in his comment, there was a related SO post that contains my answer. These substitution constructions are called "placeholders," which is why I did not find the answer on SO initially. There is no placeholder pattern for column names, because dynamically specifying columns is not a code safety issue:
It comes down to what "safe" means. The conventional wisdom is that
using normal python string manipulation to put values into your
queries is not "safe". This is because there are all sorts of things
that can go wrong if you do that, and such data very often comes from
the user and is not in your control. You need a 100% reliable way of
escaping these values properly so that a user cannot inject SQL in a
data value and have the database execute it. So the library writers do
this job; you never should.
If, however, you're writing generic helper code to operate on things
in databases, then these considerations don't apply as much. You are
implicitly giving anyone who can call such code access to everything
in the database; that's the point of the helper code. So now the
safety concern is making sure that user-generated data can never be
used in such code. This is a general security issue in coding, and is
just the same problem as blindly execing a user-input string. It's a
distinct issue from inserting values into your queries, because there
you want to be able to safely handle user-input data.
So, the solution is that there is no problem in the first place: inject the column name using string formatting, be happy, and move on with your life.
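To see why the placeholder query in the question returned 0, it may help to ask SQLite what type it sees in each case. The sketch below assumes an in-memory database with the same `trouble` table; the variable names are only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trouble (value real)")
conn.executemany("INSERT INTO trouble (value) VALUES (?)", [(2,), (4,)])

# A bound parameter is always a value; here it is the text 'value',
# not a reference to the column named value.
bound_type = conn.execute(
    "SELECT typeof(:name) FROM trouble LIMIT 1", {"name": "value"}
).fetchone()[0]

# Writing the identifier in the SQL text refers to the column itself.
column_type = conn.execute(
    "SELECT typeof(value) FROM trouble LIMIT 1"
).fetchone()[0]

print(bound_type, column_type)
conn.close()
```

So `AVG(:name)` averages a non-numeric string once per row, which SQLite treats as 0, rather than averaging the column.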
Why not use string formatting?
for row in c.execute("SELECT AVG({name}) FROM trouble".format(**{"name" : "value"})):
    print(row)  # => (3.0,)
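If the column name could ever come from outside the script, one possible precaution is to check it against a fixed set of known columns before formatting it into the query. This is a minimal sketch, not the answerers' method: `ALLOWED_COLUMNS` and `average_of` are hypothetical names, and the table is the `trouble` table from the question.

```python
import sqlite3

# Hypothetical allowlist: the only column names this helper will accept.
ALLOWED_COLUMNS = {"value"}


def average_of(conn, column):
    """Return AVG(column) from trouble, refusing unknown column names."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unexpected column name: {!r}".format(column))
    # Double-quote the identifier, doubling any embedded double quotes.
    quoted = '"' + column.replace('"', '""') + '"'
    sql = "SELECT AVG({}) FROM trouble".format(quoted)
    return conn.execute(sql).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trouble (value real)")
conn.executemany("INSERT INTO trouble (value) VALUES (?)", [(2,), (4,)])

result = average_of(conn, "value")
print(result)
```

The identifier is still formatted into the SQL text, but only after it has matched a name the script already knows about.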
I'm trying to link my code to a MySQL database using pymysql. In general everything has gone smoothly, but I'm having difficulty with the following function to find the minimum of a variable column.
def findmin(column):
    cur = db.cursor()
    sql = "SELECT MIN(%s) FROM table"
    cur.execute(sql, column)
    mintup = cur.fetchone()
    return mintup
If everything went smoothly this would return me a tuple with the minimum, e.g. (1,).
However, if I run the function:
findmin(column_name)
I have to put the column name in quotes (i.e. "column_name"), otherwise Python sees it as an unknown variable. But if I put the quotation marks around column_name, then SQL sees
SELECT MIN("column_name") FROM table
which just returns the column header, not the value.
How can I get around this?
The issue is likely the use of %s for the column name. That means the SQL Driver will try to escape that variable when interpolating it, including quoting, which is not what you want for things like column names, table names, etc.
When using a value in SELECT, WHERE, etc. then you do want to use %s to prevent SQL injections and enable quoting, among other things.
Here, you just want to interpolate using pure Python (assuming a trusted value; please see below for more information). That also means no bindings tuple passed to the execute method.
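def findmin(column):
    cur = db.cursor()
    # Interpolate the (trusted) column name with plain Python formatting
    sql = "SELECT MIN({0}) FROM table".format(column)
    cur.execute(sql)  # no bindings tuple: nothing is left for the driver to substitute
    mintup = cur.fetchone()
    return mintup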
SQL fiddle showing the SQL working:
http://sqlfiddle.com/#!2/e70a41/1
In response to the Jul 15, 2014 comment from Colin Phipps (September 2022):
The relatively recent edit on this post by another community member brought it to my attention, and I wanted to respond to Colin's comment from many years ago.
I totally agree re: being careful about one's input if one interpolates like this. Certainly one needs to know exactly what is being interpolated. In this case, I would say a defined value within a trusted internal script or one supplied by a trusted internal source would be fine. But if, as Colin mentioned, there is any external input, then that is much different and additional precautions should be taken.
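One form the "additional precautions" could take is an allowlist check before the column name is interpolated. The sketch below only builds the SQL string, so it runs without a database; `ALLOWED_COLUMNS`, `build_min_query`, and the table name `measurements` are hypothetical and stand in for whatever the real schema contains.

```python
# Hypothetical allowlist of column names the caller may ask about.
ALLOWED_COLUMNS = frozenset({"price", "quantity"})


def quote_identifier(name):
    """Wrap a MySQL identifier in backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_min_query(column, table="measurements"):
    """Return the SELECT MIN(...) statement for an allowed column."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unexpected column name: {!r}".format(column))
    return "SELECT MIN({}) FROM {}".format(
        quote_identifier(column), quote_identifier(table)
    )


sql = build_min_query("price")
print(sql)
```

The resulting string would then be passed to `cur.execute(sql)` with no bindings, as in the answer above.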
This question already has answers here:
Can PHP PDO Statements accept the table or column name as parameter?
I've used the mysqli_stmt_bind_param function several times. However, if I separate variables that I'm trying to protect against SQL injection I run into errors.
Here's a code sample:
function insertRow( $db, $mysqli, $new_table, $Partner, $Merchant, $ips, $score, $category, $overall, $protocol )
{
    $statement = $mysqli->prepare("INSERT INTO " . $new_table . " VALUES (?,?,?,?,?,?,?);");
    mysqli_stmt_bind_param( $statement, 'sssisss', $Partner, $Merchant, $ips, $score, $category, $overall, $protocol );
    $statement->execute();
}
Is it possible to somehow replace the .$new_table. concatenation with another question mark statement, make another bind parameter statement, or add onto the existing one to protect against SQL injection?
Like this or some form of this:
function insertRow( $db, $mysqli, $new_table, $Partner, $Merchant, $ips, $score, $category, $overall, $protocol )
{
    $statement = $mysqli->prepare("INSERT INTO (?) VALUES (?,?,?,?,?,?,?);");
    mysqli_stmt_bind_param( $statement, 'ssssisss', $new_table, $Partner, $Merchant, $ips, $score, $category, $overall, $protocol );
    $statement->execute();
}
Short answer to your question is "no".
In the strictest sense, at the database level, prepared statements only allow parameters to be bound for "values" bits of the SQL statement.
One way of thinking of this is "things that can be substituted at runtime execution of the statement without altering its meaning". The table name(s) is not one of those runtime values, as it determines the validity of the SQL statement itself (ie, what column names are valid) and changing it at execution time would potentially alter whether the SQL statement was valid.
At a slightly higher level, some database interfaces, such as PDO, emulate prepared statement parameter substitution rather than actually sending prepared statements to the database. Those could conceivably allow a placeholder anywhere, since the placeholder gets replaced before the SQL is sent. Even so, the value of the table placeholder would be a string and would be quoted as such, so SELECT * FROM ? with mytable as the param would end up sending SELECT * FROM 'mytable' to the database, which is invalid SQL.
Your best bet is just to continue with
SELECT * FROM {$mytable}
but you absolutely should have a white-list of tables that you check against first if that $mytable is coming from user input.
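The same restriction can be observed outside PHP. As a rough illustration only, the sketch below uses Python's built-in sqlite3 module rather than mysqli (a stand-in, since it needs no server) and tries to bind a table name with a placeholder; the table name `mytable` follows the answer's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (x)")

try:
    # The table name is part of the statement's structure, so the
    # statement cannot even be prepared with a placeholder in its place.
    conn.execute("SELECT * FROM ?", ("mytable",))
    placeholder_accepted = True
except sqlite3.OperationalError as exc:
    placeholder_accepted = False
    print("rejected:", exc)

conn.close()
```

The statement is rejected when it is prepared, before any value is bound, which matches the point that a table name is not a runtime value.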
The same rule applies when trying to create a "database".
You cannot use a prepared statement to bind a database.
I.e.:
CREATE DATABASE IF NOT EXISTS ?
will not work. Use a safelist instead.
Hello StackEx community.
I am implementing a relational database using SQLite interfaced with Python. My table consists of 5 attributes with around a million tuples.
To avoid a large number of database queries, I wish to execute a single query that updates 2 attributes of multiple tuples. These updated values depend on the tuples' Primary Key value and so are different for each tuple.
I am trying something like the following in Python 2.7:
stmt= 'UPDATE Users SET Userid (?,?), Neighbours (?,?) WHERE Username IN (?,?)'
cursor.execute(stmt, [(_id1, _Ngbr1, _name1), (_id2, _Ngbr2, _name2)])
In other words, I am trying to update the rows that have Primary Keys _name1 and _name2 by substituting the Neighbours and Userid columns with corresponding values. The execution of the two statements returns the following error:
OperationalError: near "(": syntax error
I am reluctant to use executemany() because I want to reduce the number of trips across the database.
I have been struggling with this issue for a couple of hours now but couldn't figure out either the error or an alternative on the web. Please help.
Thanks in advance.
If the column that is used to look up the row to update is properly indexed, then executing multiple UPDATE statements would be likely to be more efficient than a single statement, because in the latter case the database would probably need to scan all rows.
Anyway, if you really want to do this, you can use CASE expressions (and explicitly numbered parameters, to avoid duplicates):
UPDATE Users
SET Userid = CASE Username
               WHEN ?5 THEN ?1
               WHEN ?6 THEN ?2
             END,
    Neighbours = CASE Username
                   WHEN ?5 THEN ?3
                   WHEN ?6 THEN ?4
                 END
WHERE Username IN (?5, ?6);
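The first suggestion above, separate UPDATE statements keyed on the indexed column, can be sketched with `executemany` inside a single transaction. This assumes a simplified `Users` table with only the three columns from the question and made-up sample rows; note that SQLite runs in-process, so each statement is a library call rather than a network round trip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema: Username is the primary key (and so indexed).
conn.execute(
    "CREATE TABLE Users (Username TEXT PRIMARY KEY, Userid INTEGER, Neighbours TEXT)"
)
conn.executemany("INSERT INTO Users (Username) VALUES (?)", [("alice",), ("bob",)])

# One (Userid, Neighbours, Username) tuple per row to update.
updates = [(1, "bob", "alice"), (2, "alice", "bob")]

# The with-block commits once at the end, so all updates share one transaction.
with conn:
    conn.executemany(
        "UPDATE Users SET Userid = ?, Neighbours = ? WHERE Username = ?",
        updates,
    )

rows = conn.execute(
    "SELECT Username, Userid, Neighbours FROM Users ORDER BY Username"
).fetchall()
print(rows)
```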
I used MySQL Connector/Python API, NOT MySQLdb.
I need to dynamically insert values into a sparse table so I wrote the Python code like this:
cur.executemany("UPDATE myTABLE SET %s=%s WHERE id=%s" % data)
where
data=[('Depth', '17.5cm', Decimal('3003')), ('Input_Voltage', '110 V AC', Decimal('3004'))]
But it resulted in an error:
TypeError: not enough arguments for format string
Is there any solution for this problem? Is it possible to use executemany when there is a substitution of a field in query?
Thanks.
Let's start with the original method:
As the error message suggests you have a problem with your SQL syntax (not Python). If you insert your values you are effectively trying to execute
UPDATE myTABLE SET 'Depth'='17.5cm' WHERE id='3003'
You should notice that you are trying to assign a value to a string 'Depth', not a database field. The reason for this is that the %s substitution of the mysql module is only possible for values, not for tables, fields, or other object identifiers.
In the second try you are not using the substitution anymore. Instead you use generic Python string interpolation, which looks similar. This does not work for you because you have one comma and one pair of brackets too many in your code. It should read:
cur.execute("UPDATE myTABLE SET %s=%s WHERE id=%s" % data)
I also replaced executemany with execute because this method will work only for a single row. However your example only has one row, so there is no need to use executemany anyway.
The second method has some drawbacks, however. The substituted text is not guaranteed to be quoted or formatted correctly for the SQL query, which might cause unexpected behaviour for certain inputs and may be a security concern.
I would rather ask why it is necessary to provide the field name dynamically in the first place. This should not be necessary and might cause some trouble.
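If the field name really does have to vary, one possible way to keep `executemany` is to group the rows by field, check each field against a known set, and leave only the values to the driver's %s placeholders. The sketch below just builds the statements so that it runs without a server; `ALLOWED_COLUMNS` and `build_updates` are hypothetical names, and the data is the list from the question.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical allowlist of field names that may be updated.
ALLOWED_COLUMNS = {"Depth", "Input_Voltage"}


def build_updates(data):
    """Group (field, value, id) rows into one parameterised UPDATE per field."""
    grouped = defaultdict(list)
    for column, value, row_id in data:
        if column not in ALLOWED_COLUMNS:
            raise ValueError("unexpected column name: {!r}".format(column))
        grouped[column].append((value, row_id))
    # Only the identifier is formatted in; the values stay as %s placeholders.
    return [
        ("UPDATE myTABLE SET `{}`=%s WHERE id=%s".format(column), params)
        for column, params in grouped.items()
    ]


data = [
    ("Depth", "17.5cm", Decimal("3003")),
    ("Input_Voltage", "110 V AC", Decimal("3004")),
]
statements = build_updates(data)
for sql, params in statements:
    print(sql, params)
```

Each pair could then be run as `cur.executemany(sql, params)`, so the driver still handles quoting of the values.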