I am unable to execute the following statement; I keep getting SQL syntax errors.
According to all the examples I can find this should work
Any help will be greatly appreciated.
d2 = df.iloc[-1,:]
q = symbol+'_ivol'
query = """SELECT close FROM %s WHERE date = %s"""
VALUES= (q, d2[1])
cursor.execute(query, VALUES)
ivol = cursor.fetchall()
conn.close()
Query parameters in SQL are not just string substitution. You can't use a query parameter for a table identifier. Parameters can only be used where you would normally use a quoted string literal or numeric literal.
Stated another way, all the identifiers must be fixed in the query string before you prepare it, because identifiers must be validated during the prepare phase, to make sure the table is a valid identifier, and that the table exists. You can't pass the name of a table identifier after the query has been prepared.
The Python driver unfortunately makes this more confusing because it uses %s instead of MySQL's own ? symbol for the parameter placeholder. This makes developers naturally think that %s is simply string substitution, like it is for Python string formatting.
So there's %s and there's %s, and they are handled differently. I know, it's confusing.
So you can do a plain string-formatting substitution to put your table into the query string:
query = """SELECT close FROM %s WHERE date = %%s""" % q
But it's more idiomatic for modern Python to use f-string formatting:
query = f"""SELECT close FROM `{q}` WHERE date = %s"""
I put back-ticks around the table name, just in case it's a SQL reserved keyword or something.
Then the other %s is an actual query parameter, because it works as a scalar value in the SQL expression. In this query, there is just one query parameter.
VALUES= [ d2[1] ]
cursor.execute(query, VALUES)
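Because the table name is interpolated as plain text, it may be worth quoting it defensively. The following is only a sketch: `quote_identifier` and `build_close_query` are hypothetical helpers (not part of any MySQL driver), and `"AAPL"` is an assumed symbol. It relies on MySQL's rule that a backtick inside a backtick-quoted identifier is escaped by doubling it.

```python
def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks (hypothetical helper)."""
    if not name or "\x00" in name:
        raise ValueError("invalid identifier")
    # MySQL escapes a backtick inside a quoted identifier by doubling it.
    return "`" + name.replace("`", "``") + "`"


def build_close_query(symbol: str) -> str:
    """Build the query text; the date stays a real %s query parameter."""
    table = quote_identifier(symbol + "_ivol")
    return f"SELECT close FROM {table} WHERE date = %s"


query = build_close_query("AAPL")  # "AAPL" is an assumed example symbol
print(query)
```

The resulting string would then be executed the same way as above, with the date as the only parameter: `cursor.execute(query, [d2[1]])`.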
What is the best way to sanitize a SQL to prevent injection when using python? I'm using mysql-connector. I have read that I should use a structure similar to:
import mysql.connector
connection = mysql.connector.connect(host="", port="", user="", password="", database="")
cursor = connection.cursor( buffered = True )
sql = "INSERT INTO mytable (column1, column2) VALUES (%s, %s)"
val = (myvalue1, myvalue2)
cursor.execute(sql, val)
connection.commit()
However, I don't understand why this prevents injection. Is it sufficient? A user could enter anything as myvalue1 or myvalue2, even values they are not supposed to. Is there a useful library for this?
SQL injection works when untrusted input is interpolated into an SQL query and the input contains characters that change the syntax of the query.
Query parameters are kept separate from the SQL query, never interpolated into it. The values of the parameters are combined with the SQL query after it is parsed, so there is no longer any opportunity to change the syntax. The parameter is guaranteed to be treated as a single scalar value (i.e. as if it's just a string literal in an SQL expression).
This is the way the Python connector works if you use the MySQLCursorPrepared cursor subclass. See https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursorprepared.html
Otherwise, the Python connector "simulates" prepared queries. It actually does interpolate parameters into the SQL query before it is parsed, but it does so safely, by escaping special characters that would cause SQL injection. It is well-tested so it's reliable.
Both cursor types are used the same way, passing an SQL query string with %s placeholders, and another argument with a tuple of parameter values. You are using it correctly.
Re comment from @Learningfrommasters:
Yes, a string stored in your database can be used unsafely in another SQL query, and cause SQL injection. Some people think that only user input must be treated safely, but this is not true. Any variable should be treated as a query parameter, whether the value for that variable comes from user input, or read from a file, or even pulled out of your own database.
Example: Suppose my name is Bill O'Karwin. It has an apostrophe in it, which you know is a special character to SQL because it terminates a string literal.
If my name were stored in the database and then fetched into an application into a variable userlastname, then I could search for other people with the same last name:
sql = f"SELECT * FROM Users WHERE lastname = '{userlastname}'"
That is unsafe because the apostrophe would cause SQL injection. Even though the value didn't come directly from user input, it came from my own database.
So use parameters for all variables. Then you don't have to think about whether the source is safe or not.
sql = "SELECT * FROM Users WHERE lastname = %s"
cur.execute(sql, (userlastname,))
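The difference can be tried out without a MySQL server. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, which is an assumption on my part: its placeholder is `?` rather than `%s`, but the principle is the same. The table and the stored name are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (lastname TEXT)")
conn.execute("INSERT INTO Users (lastname) VALUES (?)", ("O'Karwin",))

# Pretend this value was just read back from our own database.
userlastname = conn.execute("SELECT lastname FROM Users").fetchone()[0]

# Unsafe: the apostrophe ends the string literal early and breaks the SQL.
try:
    conn.execute(f"SELECT * FROM Users WHERE lastname = '{userlastname}'")
    interpolation_failed = False
except sqlite3.OperationalError:
    interpolation_failed = True

# Safe: the value travels as a parameter and is never parsed as SQL.
rows = conn.execute(
    "SELECT * FROM Users WHERE lastname = ?", (userlastname,)
).fetchall()
print(interpolation_failed, rows)
```

Here the interpolated query merely fails with a syntax error; a more carefully crafted value could change what the query does instead.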
I have a query string in Python as follows:
query = "select name from company where id = 13 order by name;"
I want to be able to change the id dynamically. Thus I want to find id = 13 and replace it with a new id.
I can do it as follows:
query.replace("id = 13", "id = {}".format(some_new_id))
But if the query contains id= 13, id=13, id =13, or some other spacing, it will not work.
How to avoid that?
Gluing variables directly into your query leaves you vulnerable to SQL injection.
If you are passing your query to a function to be executed in your database, that function should accept additional parameters.
For instance,
query = "select name from company where id = %s order by name"
cursor.execute(query, params=(some_other_id,))
It is better to use parameterized SQL.
Ex:
query = "select name from company where id = %s order by name;"
cursor.execute(query, (id,))
The usual solution when it comes to dynamically building strings is string formatting, ie
tpl = "Hello {name}, how are you"
for name in ("little", "bobby", "table"):
    print(tpl.format(name=name))
BUT (and that's a BIG "but"): you do NOT want to do this for SQL queries (assuming you want to pass this query to your db using your db's python api).
There are two reasons not to use string formatting here: the first is that correctly handling quoting and escaping is tricky at best; the second and much more important one is that it makes your code vulnerable to SQL injection attacks.
So in this case, the proper solution is to use prepared statements instead:
# assuming MySQL, which uses "%s" as placeholder,
# consult your db-api module's documentation for
# the proper placeholder
sql = "select name from company where id=%s order by name"
cursor = yourdbconnection.cursor()
cursor.execute(sql, [your_id_here])
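As for finding "the proper placeholder": PEP 249 modules expose a module-level `paramstyle` attribute that names the style they use. The sketch below reads it from the built-in sqlite3 module (used here only as a stand-in for a MySQL driver). `placeholder_for` is a hypothetical helper, and it covers only the two positional styles, `qmark` and `format`; the sample table and rows are invented.

```python
import sqlite3

# Placeholder text for the two positional PEP 249 paramstyles handled here.
POSITIONAL_PLACEHOLDERS = {"qmark": "?", "format": "%s"}


def placeholder_for(dbapi_module) -> str:
    """Return the positional placeholder a DB-API module advertises (hypothetical helper)."""
    style = dbapi_module.paramstyle
    try:
        return POSITIONAL_PLACEHOLDERS[style]
    except KeyError:
        raise ValueError(f"unsupported paramstyle: {style}")


mark = placeholder_for(sqlite3)  # sqlite3 advertises "qmark"
sql = f"select name from company where id = {mark} order by name"

conn = sqlite3.connect(":memory:")
conn.execute("create table company (id integer, name text)")
conn.executemany("insert into company values (?, ?)", [(13, "Acme"), (14, "Globex")])

# The id changes per call; the query text never does.
names = conn.execute(sql, [14]).fetchall()
print(names)
```

Only the placeholder marker is formatted into the string; the id itself is always passed separately.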
My objective is to store a JSON object into a MySQL database field of type json, using the mysql.connector library.
import mysql.connector
import json
jsonData = json.dumps(origin_of_jsonData)
cnx = mysql.connector.connect(**config_defined_elsewhere)
cursor = cnx.cursor()
cursor.execute('CREATE DATABASE dataBase')
cnx.database = 'dataBase'
cursor = cnx.cursor()
cursor.execute('CREATE TABLE table (id_field INT NOT NULL, json_data_field JSON NOT NULL, PRIMARY KEY (id_field))')
Now, the code below WORKS just fine, the focus of my question is the use of '%s':
insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES (%s, %s)"
values_to_insert = (1, jsonData)
cursor.execute(insert_statement, values_to_insert)
My problem with that: I am very strictly adhering to the use of '...{}'.format(aValue) (or f'...{aValue}') when combining variable aValue(s) into a string, thus avoiding the use of %s (whatever my reasons for that, let's not debate them here - but it is how I would like to keep it wherever possible, hence my question).
In any case, I am simply unable, whichever way I try, to create something that stores the jsonData into the mySql dataBase using something that resembles the above structure and uses '...{}'.format() (in whatever shape or form) instead of %s. For example, I have (among many iterations) tried
insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES ({}, {})".format(1, jsonData)
cursor.execute(insert_statement)
but no matter how I turn and twist it, I keep getting the following error:
ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[some_content_from_jsonData})]' at line 1
Now my question(s):
1) Is there a way to avoid the use of %s here that I am missing?
2) If not, why? What is it that makes this impossible? Is it the cursor.execute() function, or is it the fact that it is a JSON object, or is it something completely different? Shouldn't {}.format() be able to do everything that %s could do, and more?
First of all: NEVER DIRECTLY INSERT YOUR DATA INTO YOUR QUERY STRING!
Using %s in a MySQL query string is not the same as using it in a python string.
In python, you just format the string and 'hello %s!' % 'world' becomes 'hello world!'. In SQL, the %s signals parameter insertion. This sends your query and data to the server separately. You are also not bound to this syntax. The python DB-API specification specifies more styles for this: DB-API parameter styles (PEP 249). This has several advantages over inserting your data directly into the query string:
Prevents SQL injection
Say you have a query to authenticate users by password. You would do that with the following query (of course you would normally salt and hash the password, but that is not the topic of this question):
SELECT 1 FROM users WHERE username='foo' AND password='bar'
The naive way to construct this query would be:
"SELECT 1 FROM users WHERE username='{}' AND password='{}'".format(username, password)
However, what would happen if someone enters ' OR '1'='1 as the password? The formatted query would then become
SELECT 1 FROM users WHERE username='foo' AND password='' OR '1'='1'
which will always return 1. When using parameter insertion:
cursor.execute('SELECT 1 FROM users WHERE username=%s AND password=%s', (username, password))
this will never happen, as the query will be interpreted by the server separately.
Performance
If you run the same query many times with different data, the performance difference between using a formatted query and parameter insertion can be significant. With parameter insertion, the server only has to compile the query once (as it is the same every time) and execute it with different data, but with string formatting, it will have to compile it over and over again.
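The injection scenario described above can be reproduced locally. This is a sketch using the built-in sqlite3 module as a stand-in for MySQL (an assumption for the sake of a runnable example; its placeholder is `?` instead of `%s`). The table contents and the hostile input are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("foo", "bar"))

username = "foo"
password = "' OR '1'='1"  # hostile input, not the real password

# Formatted query: the input rewrites the WHERE clause, so a row comes back.
unsafe_sql = "SELECT 1 FROM users WHERE username='{}' AND password='{}'".format(
    username, password
)
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Parameterized query: the input is compared as one literal value.
safe_rows = conn.execute(
    "SELECT 1 FROM users WHERE username=? AND password=?", (username, password)
).fetchall()
print(unsafe_rows, safe_rows)
```

The formatted query "authenticates" the attacker, while the parameterized one returns no rows because no stored password equals that literal string.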
In addition to what was said above, I would like to add some details that I did not immediately understand, and that others (newbies like me ;)) may also find helpful:
1) "parameter insertion" is meant only for values; it will not work for table names, column names, etc. For those, Python string substitution works fine in the SQL syntax definition.
2) the cursor.execute function expects the parameter values as a sequence such as a tuple, even when there is only one value (as specified here, albeit not immediately clear, at least to me: https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-execute.html)
EXAMPLE for both in one function:
def checkIfRecordExists(column, table, condition_name, condition_value):
    ...
    sqlSyntax = 'SELECT {} FROM {} WHERE {} = %s'.format(column, table, condition_name)
    cursor.execute(sqlSyntax, (condition_value,))
Note both the use of .format in the initial sql syntax definition and the use of (condition_value,) in the execute function.
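One caveat: identifiers formatted in with .format get no protection from the driver, so if column, table or condition_name could ever come from outside the program, they should be checked first. A minimal sketch, assuming a hypothetical allow-list `ALLOWED` that describes the schema (the table and column names are invented):

```python
# Hypothetical schema description: table name -> columns that may be referenced.
ALLOWED = {
    "users": {"id", "username", "email"},
    "orders": {"id", "user_id", "total"},
}


def build_exists_query(column: str, table: str, condition_name: str) -> str:
    """Return SQL whose identifiers were checked against ALLOWED."""
    if table not in ALLOWED:
        raise ValueError(f"unknown table: {table!r}")
    for name in (column, condition_name):
        if name not in ALLOWED[table]:
            raise ValueError(f"unknown column: {name!r}")
    # Identifiers are now known-good; the value remains a %s parameter.
    return "SELECT {} FROM {} WHERE {} = %s".format(column, table, condition_name)


sql = build_exists_query("id", "users", "username")
print(sql)
```

The string returned here would be passed to cursor.execute together with (condition_value,), exactly as in the function above.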
A Python API is giving back u"'HOPPE'S No. 9'" as a value for a particular product attribute. I'm then looking to insert it into the DB, also using Python (python-mysqldb), with the following query:
INSERT INTO mytable (rating, Name) VALUES('5.0 (7)', 'HOPPE'S No. 9';
MySQL rejects this, and the suggested approach to handling a single quote in MySQL is to escape it first. This I need to do in Python, so I try:
In [5]: u"'HOPPE'S No. 9'".replace("'", "\'")
Out[5]: u"'HOPPE'S No. 9'"
When I incorporate this in my program, MySQL still rejects it. So I double-escape the apostrophe, and then an insert happens successfully. Thing is, it contains the escape character (so what gets written is 'HOPPE\'S No. 9').
If I need the second escape character, but when I add it gets left in, then how can I handle the escaping without having the escape character included in the string that gets inserted?
Edit: Based on theBjorn's suggestion, tried:
actualSQL = "INSERT INTO %s (%s) VALUES(%s);"
#cur.execute(queryString)
cur.execute(actualSQL,
(configData["table"], sqlFieldMappingString, sqlFieldValuesString))
but it looks like I'm back to where I was when I was trying to escape using the single escape with .replace():
Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''mytable' ('rating, Name, Image, mfg, price, URL') VALUES('\'5.0 (3)\', \'AR-1' at line 1
You should never construct sql that way. Use parameterized code instead:
cursor.execute(
"insert into mytable (rating, name) values (%s, %s);",
("5.0 (7)", "HOPPE'S No. 9")
)
Your latest problem is due to the misconception that this is string interpolation, which it isn't (the use of %s is confusing), thus:
actualSQL = "INSERT INTO %s (%s) VALUES(%s);"
will be wrong. It is possible to construct your sql string, but probably easier to do so in two steps so we don't trip over sql parameter markers looking like string interpolation markers. Assuming you have the values in a tuple named field_values:
params = ["%s"] * len(field_values)  # create a list with the correct number of parameter markers
sql = "insert into %s (%s) values (%s)" % (  # here we're using string interpolation, but not with the values
    configData["table"],
    sqlFieldMappingString,
    ', '.join(params)
)
If you print sql it should look like my example above. Now you can execute it with:
cursor.execute(sql, field_values)
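The two steps can also be wrapped in a small function. This is only a sketch: `build_insert` is a hypothetical helper, and it assumes the table and column names come from trusted configuration (as configData does above), since they are interpolated as text.

```python
def build_insert(table: str, row: dict) -> tuple:
    """Return (sql, values) for a parameterized INSERT (hypothetical helper).

    The table and column names are assumed to come from trusted configuration,
    not from user input, because they are interpolated as text.
    """
    columns = list(row)
    markers = ", ".join(["%s"] * len(columns))  # one marker per value
    sql = "insert into {} ({}) values ({})".format(table, ", ".join(columns), markers)
    return sql, tuple(row[c] for c in columns)


sql, field_values = build_insert(
    "mytable", {"rating": "5.0 (7)", "name": "HOPPE'S No. 9"}
)
print(sql)
```

The apostrophe in the product name needs no escaping here, because the value never becomes part of the SQL text.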
I have the following list of providers (in Russian):
providers = [u'\u041e\u041e\u041e "\u041a\u0432\u0430\u0440\u0442\u0430\u043b \u041b\u0435\u043e\u043f\u043e\u043b\u0438\u0441"',
             u'\u0426\u0435\u043d\u0442\u0440\u0430\u043b']
These are obviously in unicode. Previously, to do a SQL SELECT, I was doing:
providers = tuple([str(item) for item in providers])
sql += " WHERE provider IN {} GROUP BY date ORDER BY date ASC".format(repr(providers))
cursor.execute(sql,)
Now, since the list items are in unicode, I run into a UnicodeEncodeError.
How would I correctly do this sql statement?
You should not use .format() to include values in a sql query. Use sql parameters instead:
sql += " WHERE provider IN ({}) GROUP BY date ORDER BY date ASC".format(', '.join(['%s'] * len(providers)))
cursor.execute(sql, providers)
where providers is the original list.
The idea is to generate a SQL query with the in test using SQL parameter syntax matching the number of providers in your list: WHERE provider in (%s, %s) ... for a two-provider list. Yes, the MySQLdb sql parameter syntax echoes the old-style python formatting syntax, but is not the same thing.
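One edge case to watch for is an empty list, since `IN ()` is a syntax error in MySQL. A sketch of a helper that handles it is below; `in_clause` is a hypothetical name, and the column name is assumed to be trusted because it is interpolated as text.

```python
def in_clause(column: str, values) -> tuple:
    """Return (sql_fragment, params) for a parameterized IN test (hypothetical helper)."""
    params = list(values)
    if not params:
        # "IN ()" is a syntax error in MySQL; use a condition that is never true.
        return "1 = 0", []
    markers = ", ".join(["%s"] * len(params))
    return f"{column} IN ({markers})", params


providers = ['ООО "Квартал Леополис"', "Централ"]
fragment, params = in_clause("provider", providers)
print(fragment)
```

The fragment would be appended after WHERE, and `params` passed as the second argument to cursor.execute; the unicode strings are handed to the driver unchanged, so no str() or repr() conversion is involved.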