PyMySQL Cannot Replace Variable in Statement When Executing SQL [duplicate] - python

This question already has answers here:
Can PHP PDO Statements accept the table or column name as parameter?
(8 answers)
Closed last month.
I've used the mysqli_stmt_bind_param function several times. However, when I separate out the variables I'm trying to protect against SQL injection, I run into errors.
Here's a code sample:
function insertRow( $db, $mysqli, $new_table, $Partner, $Merchant, $ips, $score, $category, $overall, $protocol )
{
    $statement = $mysqli->prepare("INSERT INTO " . $new_table . " VALUES (?,?,?,?,?,?,?);");
    mysqli_stmt_bind_param( $statement, 'sssisss', $Partner, $Merchant, $ips, $score, $category, $overall, $protocol );
    $statement->execute();
}
Is it possible to somehow replace the .$new_table. concatenation with another question-mark placeholder, add another bind-parameter call, or extend the existing one to protect against SQL injection?
Like this or some form of this:
function insertRow( $db, $mysqli, $new_table, $Partner, $Merchant, $ips, $score, $category, $overall, $protocol )
{
    $statement = $mysqli->prepare("INSERT INTO (?) VALUES (?,?,?,?,?,?,?);");
    mysqli_stmt_bind_param( $statement, 'ssssisss', $new_table, $Partner, $Merchant, $ips, $score, $category, $overall, $protocol );
    $statement->execute();
}

The short answer to your question is "no".
In the strictest sense, at the database level, prepared statements only allow parameters to be bound for the "values" parts of the SQL statement.
One way of thinking of this is "things that can be substituted at execution time without altering the meaning of the statement". The table name is not one of those runtime values: it determines the validity of the SQL statement itself (i.e., which column names are valid), and changing it at execution time could change whether the statement is valid at all.
At a slightly higher level, some database interfaces emulate prepared-statement parameter substitution rather than sending true prepared statements to the database (PDO can work this way), and could conceivably allow a placeholder anywhere, since the placeholder is replaced before the SQL is sent. Even there, though, the value bound to a table placeholder would be treated as a string and quoted as such, so SELECT * FROM ? with mytable as the parameter would actually send SELECT * FROM 'mytable' to the database, which is invalid SQL.
Your best bet is just to continue with
SELECT * FROM {$mytable}
but you absolutely should have a white-list of tables that you check against first if that $mytable is coming from user input.
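To make that concrete in Python (since the broader thread is about PyMySQL), here is a minimal sketch of the white-list idea; the table names and the insert_row helper are hypothetical, for illustration only:

import pymysql

# Hypothetical white-list: only identifiers listed here ever reach the SQL text.
ALLOWED_TABLES = {"partner_scores", "merchant_scores"}

def insert_row(conn, table, values):
    if table not in ALLOWED_TABLES:
        raise ValueError("table {0!r} is not white-listed".format(table))
    # The identifier is interpolated only after the check; the values still
    # go through placeholders so the driver escapes and quotes them.
    sql = "INSERT INTO {0} VALUES (%s, %s, %s, %s, %s, %s, %s)".format(table)
    cur = conn.cursor()
    cur.execute(sql, values)
    conn.commit()

The same check works for any identifier you cannot bind, including the database name in the answer below.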

The same rule applies when trying to create a database: you cannot use a prepared statement to bind a database name.
I.e.:
CREATE DATABASE IF NOT EXISTS ?
will not work. Use a safelist instead.

Related

How to choose a column as user input and pass it into database statement? [duplicate]

I'm trying to link my code to a MySQL database using pymysql. In general everything has gone smoothly, but I'm having difficulty with the following function, which should find the minimum of a variable column.
def findmin(column):
    cur = db.cursor()
    sql = "SELECT MIN(%s) FROM table"
    cur.execute(sql, column)
    mintup = cur.fetchone()
If everything went smoothly, this would return a tuple with the minimum, e.g. (1,).
However, if I run the function:
findmin(column_name)
I have to put the column name in quotes (i.e. "column_name"), or else Python sees it as an unknown variable. But if I put quotation marks around column_name, then SQL sees
SELECT MIN("column_name") FROM table
which just returns the column header, not the value.
How can I get around this?
The issue is likely the use of %s for the column name. That means the SQL driver will try to escape and quote that variable when interpolating it, which is not what you want for identifiers such as column or table names.
When supplying a value in a SELECT, WHERE, etc., you do want to use %s, to prevent SQL injection and get proper quoting, among other things.
Here, you just want to interpolate using pure Python (assuming a trusted value; please see below for more information). That also means no bindings tuple is passed to the execute method.
def findmin(column):
    cur = db.cursor()
    sql = "SELECT MIN({0}) FROM table".format(column)
    cur.execute(sql)
    mintup = cur.fetchone()
SQL fiddle showing the SQL working:
http://sqlfiddle.com/#!2/e70a41/1
In response to the Jul 15, 2014 comment from Colin Phipps (September 2022):
The relatively recent edit on this post by another community member brought it to my attention, and I wanted to respond to Colin's comment from many years ago.
I totally agree re: being careful about one's input if one interpolates like this. Certainly one needs to know exactly what is being interpolated. In this case, I would say a defined value within a trusted internal script or one supplied by a trusted internal source would be fine. But if, as Colin mentioned, there is any external input, then that is much different and additional precautions should be taken.
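Building on that caveat, one way to keep the convenience of interpolation while guarding against external input is to check the requested name against the table's actual columns first. This is my own sketch, not part of the original answer; it assumes a pymysql connection db and a hypothetical table called readings:

def findmin_checked(db, column):
    cur = db.cursor()
    # Ask information_schema for the real column names of the table,
    # then interpolate the requested name only if it is one of them.
    cur.execute(
        "SELECT column_name FROM information_schema.columns "
        "WHERE table_schema = DATABASE() AND table_name = %s",
        ("readings",),
    )
    allowed = {row[0] for row in cur.fetchall()}
    if column not in allowed:
        raise ValueError("unrecognised column: {0}".format(column))
    cur.execute("SELECT MIN({0}) FROM readings".format(column))
    return cur.fetchone()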

Is my code susceptible to SQL injection attack? [duplicate]

I have some code in Python that sets a char(80) value in an sqlite DB.
The string is obtained directly from the user through a text input field and sent back to the server with a POST method in a JSON structure.
On the server side I currently pass the string to a method calling the SQL UPDATE operation.
It works, but I'm aware it is not safe at all.
I expect that the client side is unsafe anyway, so any protection has to be put on the server side. What can I do to secure the UPDATE operation against SQL injection?
A function that would "quote" the text so that it can't confuse the SQL parser is what I'm looking for. I expect such a function exists, but I couldn't find it.
Edit:
Here is my current code setting the char field named label:
def setLabel( self, userId, refId, label ):
    self._db.cursor().execute( """
        UPDATE items SET label = ? WHERE userId IS ? AND refId IS ?""", ( label, userId, refId ) )
    self._db.commit()
From the documentation:
con.execute("insert into person(firstname) values (?)", ("Joe",))
This escapes "Joe", so what you want is
con.execute("insert into person(firstname) values (?)", (firstname_from_client,))
The DB-API's .execute() supports parameter substitution, which will take care of escaping for you; it's mentioned near the top of the docs (http://docs.python.org/library/sqlite3.html), just above "Never do this -- insecure".
Noooo... USE BIND VARIABLES! That's what they're there for. See this
Another name for the technique is parameterized sql (I think "bind variables" may be the name used with Oracle specifically).

Substituting column names in Python sqlite3 query [duplicate]

This question already has answers here:
How do you escape strings for SQLite table/column names in Python?
(8 answers)
Closed 7 years ago.
I have a wide table in a sqlite3 database, and I wish to dynamically query certain columns in a Python script. I know that it's bad to inject parameters by string concatenation, so I tried to use parameter substitution instead.
I find that, when I use parameter substitution to supply a column name, I get unexpected results. A minimal example:
import sqlite3 as lite

db = lite.connect("mre.sqlite")
c = db.cursor()

# Insert some dummy rows
c.execute("CREATE TABLE trouble (value real)")
c.execute("INSERT INTO trouble (value) VALUES (2)")
c.execute("INSERT INTO trouble (value) VALUES (4)")
db.commit()

for row in c.execute("SELECT AVG(value) FROM trouble"):
    print row  # Returns 3
for row in c.execute("SELECT AVG(:name) FROM trouble", {"name": "value"}):
    print row  # Returns 0

db.close()
Is there a better way to accomplish this than simply injecting a column name into a string and running it?
As Rob just indicated in his comment, there was a related SO post that contains my answer. These substitution constructions are called "placeholders," which is why I did not find the answer on SO initially. There is no placeholder pattern for column names, because dynamically specifying columns is not a code safety issue:
It comes down to what "safe" means. The conventional wisdom is that
using normal python string manipulation to put values into your
queries is not "safe". This is because there are all sorts of things
that can go wrong if you do that, and such data very often comes from
the user and is not in your control. You need a 100% reliable way of
escaping these values properly so that a user cannot inject SQL in a
data value and have the database execute it. So the library writers do
this job; you never should.
If, however, you're writing generic helper code to operate on things
in databases, then these considerations don't apply as much. You are
implicitly giving anyone who can call such code access to everything
in the database; that's the point of the helper code. So now the
safety concern is making sure that user-generated data can never be
used in such code. This is a general security issue in coding, and is
just the same problem as blindly execing a user-input string. It's a
distinct issue from inserting values into your queries, because there
you want to be able to safely handle user-input data.
So, the solution is that there is no problem in the first place: inject the values using string formatting, be happy, and move on with your life.
Why not use string formatting?
for row in c.execute("SELECT AVG({name}) FROM trouble".format(**{"name" : "value"})):
print row # => (3.0,)
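If the column name can ever come from outside the script, a cautious variant (my own sketch, not part of the answer above) is to check it against the table's schema with PRAGMA table_info before formatting it in:

def avg_of(c, column):
    # PRAGMA table_info returns one row per column; field 1 is the column name.
    known = {row[1] for row in c.execute("PRAGMA table_info(trouble)")}
    if column not in known:
        raise ValueError("no such column: {0}".format(column))
    return c.execute("SELECT AVG({0}) FROM trouble".format(column)).fetchone()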

To convert from Python arrays to PostgreSQL quickly?

This is a follow-up question to: How to cast to int array in PostgreSQL?
I am wondering how to quickly convert a Python array of arrays of signed integers into a PostgreSQL int array:
import numpy as np  # use any Python data format here
event = np.array([[1,2],[3,4]])
where, if done manually, [] would be replaced by {} and the whole literal surrounded by single quotes.
In PostgreSQL, the following is accepted syntax for the datatype:
...
FOR EACH ROW EXECUTE PROCEDURE insaft_function('{{1,2},{3,4}}');
#JohnMee's suggestion
str(event).replace('[','{').replace(']','}').replace('\n ',',')
#ErwinBrandstetter's suggestion
Stick to signed integers because they are supported by the SQL standard.
Map to int, so on the PostgreSQL side just:
TG_ARGV::int[]
I want to stick with Erwin's suggestion.
Test run of a simpler version of #ErwinBrandstetter's answer
I simplified his answer to keep it focused here by removing the table name from the function, keeping just the trigger for one initial table, measurements:
CREATE OR REPLACE FUNCTION f_create_my_trigger(_arg0 text)
  RETURNS void AS
$func$
BEGIN
   EXECUTE format($$
      DROP TRIGGER IF EXISTS insaft_ids ON measurements;
      CREATE TRIGGER insaft_ids
      AFTER INSERT ON measurements
      FOR EACH ROW EXECUTE PROCEDURE insaft_function(%1$L)$$
    , _arg0
   );
END
$func$ LANGUAGE plpgsql;
And I run:
sudo -u postgres psql detector -c "SELECT f_create_my_trigger('[[1,2],[3,4]]');"
But I get empty output:
f_create_my_trigger
---------------------
(1 row)
How can you map to int for PostgreSQL 9.4 in Python?
Setup
You want to create triggers (repeatedly?) using the same trigger function, as outlined in my related answer on dba.SE. You need to pass values to the trigger function to create multiple rows with multiple column values, hence the two-dimensional array. (But we can work with any clearly defined string!)
The only way to pass values to a PL/pgSQL trigger function (other than column values of the triggering row) is via text parameters, which are accessible inside the function as a 0-based array of text in the special array variable TG_ARGV[]. You can pass a variable number of parameters, but we discussed a single string literal representing your 2-dimensional array earlier.
Input comes from a 2-dimensional Python array of signed integers, which fit into the Postgres type integer. Use the Postgres type bigint to cover unsigned integers, as commented.
The text representation in Python looks like this:
[[1,2],[3,4]]
Syntax for a Postgres array literal:
{{1,2},{3,4}}
And you want to automate the process.
Full automation
You can concatenate the string for the CREATE TRIGGER statement in your client or you can persist the logic in a server-side function and just pass parameters.
Here is an example function taking a table name and the string that is passed on to the trigger function. The trigger function insaft_function() is defined in your previous question on dba.SE.
CREATE OR REPLACE FUNCTION f_create_my_trigger(_tbl regclass, _arg0 text)
  RETURNS void
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE format($$
      DROP TRIGGER IF EXISTS insaft_%1$s_ids ON %1$s;
      CREATE TRIGGER insaft_%1$s_ids
      AFTER INSERT ON %1$s
      FOR EACH ROW EXECUTE PROCEDURE insaft_function(%2$L)$$
    , _tbl
    , translate(_arg0, '[]', '{}')
   );
END
$func$;
Call:
SELECT f_create_my_trigger('measurements', '[[1,2],[3,4]]');
Or:
SELECT f_create_my_trigger('some_other_table', '{{5,6},{7,8}}');
db<>fiddle here
Old sqlfiddle
Now you can pass either [[1,2],[3,4]] (with square brackets) or {{1,2},{3,4}} (with curly braces). Both work the same: translate(_arg0, '[]', '{}') transforms the first form into the second.
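The same bracket-to-brace conversion can also be done on the Python side before the value is sent to Postgres. A minimal sketch (the helper name is my own, and it builds the literal explicitly instead of relying on numpy's printed representation):

import numpy as np

def to_pg_array_literal(event):
    # Turn a nested sequence of ints into a Postgres array literal like '{{1,2},{3,4}}'.
    rows = ("{" + ",".join(str(int(v)) for v in row) + "}" for row in event)
    return "{" + ",".join(rows) + "}"

event = np.array([[1, 2], [3, 4]])
print(to_pg_array_literal(event))   # -> {{1,2},{3,4}}

The resulting string can then be passed to f_create_my_trigger() just like the literals in the calls above.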
This function drops a trigger of the same name if it exists, before creating the new one. You may want to drop or keep this line:
DROP TRIGGER IF EXISTS insaft_%1$s_ids ON %1$s;
This runs with the privileges of the calling DB role. You could make it run with superuser (or any other) privileges if need be. See:
Is there a way to disable updates/deletes but still allow triggers to perform them?
There are many ways to achieve this; it depends on your exact requirements.
Explaining format()
format() and the data type regclass help to safely concatenate the DDL command and make SQL injection impossible. See:
Table name as a PostgreSQL function parameter
The first argument is the "format string", followed by arguments to be embedded in the string. I use dollar-quoting, which is not strictly necessary for the example, but generally a good idea for concatenating long strings containing single-quotes: $$DROP TRIGGER ... $$
format() is modeled on the C function sprintf. %1$s is a format specifier of the format() function; it means that the first (1$) argument after the format string is inserted as an unquoted string (%s), hence: %1$s. The first argument to format() is _tbl in the example; the regclass parameter is rendered as a legal identifier automatically, double-quoted if necessary, so format() does not have to do more. Hence just %s, not %I (identifier). Read the linked answer above for details.
The other format specifier in use is %2$L: Second argument as quoted string literal.
If you are new to format(), play with these simple examples to understand:
SELECT format('input -->|%s|<-- here', '[1,2]')
, format('input -->|%s|<-- here', translate('[1,2]', '[]', '{}'))
, format('input -->|%L|<-- here', translate('[1,2]', '[]', '{}'))
, format('input -->|%I|<-- here', translate('[1,2]', '[]', '{}'));
And read the manual.

query of sqlite3 with python using '?'

I have a table of three columns: id, word, essay. I want to do a query using (?). The SQL statement is sql1 = "select id,? from training_data". My code is below:
def dbConnect(db_name, sql, flag):
    conn = sqlite3.connect(db_name)
    cursor = conn.cursor()
    if flag == "danci":
        itm = 'word'
    elif flag == "wenzhang":
        itm = 'essay'
    n = cursor.execute(sql, (itm,))
    res1 = cursor.fetchall()
    return res1
However, when I print dbConnect("data.db", sql1, "danci")
the result I obtain is [(1,'word'), (2,'word'), (3,'word'), ...]. What I really want is [(1, 'the content of the word column'), (2, 'the content of the word column'), ...]. What should I do? Please give me some ideas.
You can't use placeholders for identifiers -- only for literal values.
I don't know what to suggest in this case, as your function takes a database name, an SQL string, and a flag to say how to modify that string. I think it would be better to pass just the first two, and write something like
sql = {
    "danci": "SELECT id, word FROM training_data",
    "wenzhang": "SELECT id, essay FROM training_data",
}
and then call it with one of
dbConnect("data.db", sql['danci'])
or
dbConnect("data.db", sql['wenzhang'])
But a lot depends on why you are asking dbConnect to decide on the columns to fetch based on a string passed in from outside; it's an unusual design.
Update - SQL Injection
The problems with SQL injection and tainted data are well documented, but here is a summary.
The principle is that, in theory, a programmer can write safe and secure programs as long as all the sources of data are under his control. As soon as they use any information from outside the program without checking its integrity, security is under threat.
Such information ranges from the obvious -- the parameters passed on the command line -- to the obscure -- if the PATH environment variable is modifiable then someone could induce a program to execute a completely different file from the intended one.
Perl provides direct help to avoid such situations with Taint Checking, but SQL Injection is the open door that is relevant here.
Suppose you take the value for a database column from an unverified external source, and that value appears in your program as $val. Then, if you write
my $sql = "INSERT INTO logs (date) VALUES ('$val')";
$dbh->do($sql);
then it looks like it's going to be okay. For instance, if $val is set to 2014-10-27 then $sql becomes
INSERT INTO logs (date) VALUES ('2014-10-27')
and everything's fine. But now suppose that our data is being provided by someone less than scrupulous or downright malicious, and your $val, having originated elsewhere, contains this
2014-10-27'); DROP TABLE logs; SELECT COUNT(*) FROM security WHERE name != '
Now it doesn't look so good. $sql is set to this (with added newlines)
INSERT INTO logs (date) VALUES ('2014-10-27');
DROP TABLE logs;
SELECT COUNT(*) FROM security WHERE name != '')
which adds an entry to the logs table as before, and then goes ahead and drops the entire logs table and counts the number of records in the security table. That isn't what we had in mind at all, and it is something we must guard against.
The immediate solution is to use placeholders (?) in a prepared statement, passing the actual values later in a call to execute. This not only speeds things up, because the SQL statement can be prepared (compiled) just once, but also protects the database from malicious data by quoting every supplied value appropriately for its data type and escaping any embedded quotes, so that it is impossible to close one statement and open another.
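For the Python context of this thread, the same defence with sqlite3 looks like the sketch below; it is a self-contained illustration using an in-memory database, not code from the question:

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE logs (date TEXT)")

# The hostile string from the example above is bound as a parameter, so it is
# stored verbatim as a single value instead of being executed as extra SQL.
val = "2014-10-27'); DROP TABLE logs; SELECT COUNT(*) FROM security WHERE name != '"
con.execute("INSERT INTO logs (date) VALUES (?)", (val,))

print(con.execute("SELECT COUNT(*) FROM logs").fetchone())   # (1,)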
This whole concept was humorously illustrated in Randall Munroe's excellent XKCD comic.
