What is the best way to sanitize SQL to prevent injection when using Python? I'm using mysql-connector. I have read that I should use a structure similar to:
import mysql.connector
connection = mysql.connector.connect(host="", port="", user="", password="", database="")
cursor = connection.cursor( buffered = True )
sql = "INSERT INTO mytable (column1, column2) VALUES (%s, %s)"
val = (myvalue1, myvalue2)
cursor.execute(sql, val)
connection.commit()
However, I don't understand why this prevents injection. Is it sufficient? A user could put anything into myvalue1 or myvalue2, even input that is not supposed to be there. Is there a useful library for this?
SQL injection works when untrusted input is interpolated into an SQL query and the input contains characters that change the syntax of the query.
Query parameters are kept separate from the SQL query, never interpolated into it. The values of the parameters are combined with the SQL query after it is parsed, so there is no longer any opportunity to change the syntax. The parameter is guaranteed to be treated as a single scalar value (i.e. as if it's just a string literal in an SQL expression).
This is the way the Python connector works if you use the MySQLCursorPrepared cursor subclass. See https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursorprepared.html
Otherwise, the Python connector "simulates" prepared queries. It actually does interpolate parameters into the SQL query before it is parsed, but it does so safely, by escaping special characters that would cause SQL injection. It is well-tested so it's reliable.
Both cursor types are used the same way, passing an SQL query string with %s placeholders, and another argument with a tuple of parameter values. You are using it correctly.
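Here is a minimal sketch of that behaviour. It uses the standard-library sqlite3 module so it runs without a MySQL server, which is my assumption for illustration only; sqlite3 uses ? placeholders where mysql-connector uses %s, and the payload value is made up.

```python
import sqlite3

# In-memory SQLite database stands in for the MySQL server (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (column1 TEXT, column2 TEXT)")

# A hostile value that would change the query's syntax if it were interpolated.
payload = "x'); DROP TABLE mytable; --"

# The value is bound as a parameter, so it is stored as plain data.
conn.execute("INSERT INTO mytable (column1, column2) VALUES (?, ?)", (payload, "ok"))
conn.commit()

stored = conn.execute("SELECT column1 FROM mytable").fetchone()[0]
table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'mytable'"
).fetchone()[0]
print(stored == payload, table_count)
```

The payload comes back byte for byte and the table is still there, because the value never took part in parsing the statement.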
Re comment from @Learningfrommasters:
Yes, a string stored in your database can be used unsafely in another SQL query, and cause SQL injection. Some people think that only user input must be treated safely, but this is not true. Any variable should be treated as a query parameter, whether the value for that variable comes from user input, or read from a file, or even pulled out of your own database.
Example: Suppose my name is Bill O'Karwin. It has an apostrophe in it, which you know is a special character to SQL because it terminates a string literal.
If my name were stored in the database and then fetched into an application into a variable userlastname, then I could search for other people with the same last name:
sql = f"SELECT * FROM Users WHERE lastname = '{userlastname}'"
That is unsafe because the apostrophe would cause SQL injection. Even though the value didn't come directly from user input, it came from my own database.
So use parameters for all variables. Then you don't have to think about whether the source is safe or not.
sql = "SELECT * FROM Users WHERE lastname = %s"
cur.execute(sql, (userlastname,))
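To make the apostrophe example concrete, here is a small sketch, again with sqlite3 standing in for MySQL (my assumption; the table layout is invented for the demo). It reads the name back out of the database and then uses it both ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (id INTEGER PRIMARY KEY, lastname TEXT)")
conn.executemany("INSERT INTO Users (lastname) VALUES (?)", [("O'Karwin",), ("Smith",)])

# The value comes from our own database, not directly from a user.
userlastname = conn.execute("SELECT lastname FROM Users WHERE id = 1").fetchone()[0]

# Interpolating it breaks the query: the apostrophe ends the string literal early.
try:
    conn.execute(f"SELECT * FROM Users WHERE lastname = '{userlastname}'")
    interpolation_failed = False
except sqlite3.OperationalError:
    interpolation_failed = True

# Passing it as a parameter works regardless of its content.
rows = conn.execute(
    "SELECT lastname FROM Users WHERE lastname = ?", (userlastname,)
).fetchall()
print(interpolation_failed, rows)
```

Here the interpolated query merely fails with a syntax error; a value crafted on purpose could instead produce a valid but different query.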
I have a database with 2 tables: students, employees and I want to update one of those tables:
import sqlite3
db_file = "school.db"
def update_address(identifier, user_address, user_id):
    with sqlite3.connect(db_file) as conn:
        c = conn.cursor()
        c.execute(f"""
            UPDATE {identifier}
            SET address = ?
            WHERE id = ?;
            """,
            (user_address, user_id))

update_address("students", "204 Sycamore Street", 2)
The above code works. The problem is that, according to the sqlite3 docs, using Python string formatting in an SQL operation can lead to vulnerabilities:
Usually your SQL operations will need to use values from Python variables. You shouldn’t assemble your query using Python’s string operations because doing so is insecure; it makes your program vulnerable to an SQL injection attack (see https://xkcd.com/327/ for humorous example of what can go wrong).
Instead, use the DB-API’s parameter substitution. Put ? as a placeholder wherever you want to use a value, and then provide a tuple of values as the second argument to the cursor’s execute() method.
The placeholder '?' works when it comes to inserting values but not for sql identifiers. Output:
sqlite3.OperationalError: near "?": syntax error
So the question here is: can an SQL injection occur if I use Python string formatting on an SQL identifier, or does it only occur on values?
If it also occurs on identifiers is there a way to format the string in a safe manner?
Yes, if you interpolate any content into an SQL query unsafely, it is an SQL injection vulnerability. It doesn't matter if the content is supposed to be used as a value in the SQL expression, or an identifier, SQL keyword, or anything else.
It's pretty common to format queries from fragments of SQL expressions, if you want to write a query with a variable set of conditions. These are also possible SQL injection risks.
The way to mitigate the SQL injection risk is: don't interpolate untrusted input into your SQL query.
For identifiers, you should make sure the content matches a legitimate name of a table (or column, or other element, if that's what you're trying to make dynamic). I.e. create an "allowlist" of tables known to exist in your database that are permitted to update using your function. If the input doesn't match one of these, then don't run the query.
It's also a good idea to use back-ticks to delimit identifiers, because if one of the table names happens to be a reserved keyword in SQLite, that will allow the table to be used in the SQL query.
if identifier not in ["table1", "table2", "table3"]:
    raise Exception(f"Unknown table name: '{identifier}'")
c.execute(f"""
    UPDATE `{identifier}`
    SET address = ?
    WHERE id = ?;
    """,
    (user_address, user_id))
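Put together as a complete function, the allowlist approach might look like the sketch below. The ALLOWED_TABLES constant and the in-memory database are my own additions for the demo, and I pass the connection in rather than opening school.db. I use double quotes around the identifier, which is the standard SQL form; SQLite accepts back-ticks as well.

```python
import sqlite3

# Hypothetical allowlist: the two tables named in the question.
ALLOWED_TABLES = frozenset({"students", "employees"})

def update_address(conn, identifier, user_address, user_id):
    """Update an address in one of the allowlisted tables."""
    if identifier not in ALLOWED_TABLES:
        raise ValueError(f"Unknown table name: {identifier!r}")
    # Safe to interpolate: identifier is now known to be one of our own constants.
    conn.execute(
        f'UPDATE "{identifier}" SET address = ? WHERE id = ?',
        (user_address, user_id),
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, address TEXT)")
conn.execute("INSERT INTO students (id, address) VALUES (2, 'old address')")

update_address(conn, "students", "204 Sycamore Street", 2)
new_address = conn.execute("SELECT address FROM students WHERE id = 2").fetchone()[0]

# A name that is not on the list never reaches the database.
try:
    update_address(conn, "students; DROP TABLE students", "x", 2)
    rejected = False
except ValueError:
    rejected = True
print(new_address, rejected)
```

The values still go through ? placeholders; only the table name is interpolated, and only after it has matched a constant that lives in the code.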
I originally had this SQL query:
f"SELECT FIELDS(ALL) from xxxx WHERE CreatedDate >= {start_time}"
I wanted to make that query safe from SQL injection attacks, but I don't know how to tell whether I did it right.
This is the new version that should be safe:
f"SELECT FIELDS(ALL) from xxxx WHERE CreatedDate >= %s" % (start_time,)
I'm using it in an API call. The query itself will be executed on the other side (by a third party). I want to send the query as a parameter in the API call.
I would like to get some tips regarding this issue.
Thank you!
Any time you build the query string directly in your code, you are exposing yourself to SQL injection. You want to pass the handling of data off to the DBMS. Using an ORM like SQLAlchemy will handle a lot of that (if you use the ORM and don't pass your SQL in directly). Most libraries for connecting to a database follow Python's DB-API standard. Since you haven't mentioned what you're using, I'll use pyodbc as an example.
Copied from the docs:
Inserting Data
To insert data, pass the insert SQL to Cursor execute(), along with any parameters necessary:
cursor.execute("insert into products(id, name) values ('pyodbc', 'awesome library')")
cnxn.commit()
or, parameterized:
cursor.execute("insert into products(id, name) values (?, ?)", 'pyodbc', 'awesome library')
cnxn.commit()
Notice the parameterized version. This is what you want. Here pyodbc is handing both your query and your data to the DBMS. The DBMS will handle sanitizing the data. This form is called qmark notation (notice the question marks). There are a few other notations but the important part is you are using parameterization and passing the data as separate from your query. With most libraries this looks something like:
cursor.execute(query_string_with_qmark_notation, data_or_tuple_of_data)
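To see why the %-formatted version from the question is no safer than the f-string, here is a small sketch. It uses sqlite3 rather than pyodbc so it runs anywhere (my assumption), replaces FIELDS(ALL) with * because SQLite has no such function, and the hostile start_time value is invented.

```python
import sqlite3

start_time = "2020-01-01 OR 1=1"  # hypothetical hostile input

# Both forms build the very same string, so neither is protection.
with_fstring = f"SELECT * FROM xxxx WHERE CreatedDate >= {start_time}"
with_percent = "SELECT * FROM xxxx WHERE CreatedDate >= %s" % (start_time,)

# A DB-API driver advertises its placeholder style; sqlite3 uses "qmark".
style = sqlite3.paramstyle

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xxxx (CreatedDate TEXT)")
conn.execute("INSERT INTO xxxx VALUES ('2019-06-01')")

# Parameterized: the whole input is compared as one value, so the OR is not SQL.
rows = conn.execute(
    "SELECT * FROM xxxx WHERE CreatedDate >= ?", (start_time,)
).fetchall()
print(with_fstring == with_percent, style, rows)
```

The difference is who does the substitution: the % operator runs in your code before the driver ever sees the query, while the second argument to execute() is handed over separately.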
I am using SQL server and need to run the following SQL via Python script
SELECT DISTINCT LEN(Wav)-CHARINDEX('.', Wav) FROM <>;
I have tried to play with the String but couldn’t figure out how to work around the dot character.
sql = 'SELECT DISTINCT LEN(Wav)-CHARINDEX({}, Wav) FROM xxx'.format('.')
print(sql)
cursor = conn.cursor()
cursor.execute(sql)
Any idea how to resolve this
Thank you
'.' is the string ., you want "'.'", the string '.'
>>> print("{}".format('.'))
.
>>> print("{}".format("'.'"))
'.'
As @Justin Ezequiel's answer notes, do beware of SQL injections here!
Specifically, unfiltered user inputs can and will cause an SQL injection where unanticipated commands can be run against the target database by breaking out of the raw string. These can do anything your connection has permission to do, such as retrieving, modifying, or deleting arbitrary data.
A traditional approach is to use prepared statements.
In Python, you can also use a regex or other test to explicitly reject input that contains control characters (if not re.match(r"^[a-zA-Z\d _+-]+$", s): raise ValueError(s)), or use (and trust) an escaping library to do it for you if you must accept arbitrary strings.
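As a rough sketch of the regex check, something like the following would do. The function name require_safe is made up, and I use fullmatch instead of match with ^ and $, since $ would also accept a trailing newline.

```python
import re

# Same character class as in the text: letters, digits, space, underscore, plus, hyphen.
SAFE_PATTERN = re.compile(r"[a-zA-Z\d _+-]+")

def require_safe(s):
    """Return s unchanged, or raise if it contains anything outside the allowed set."""
    if not SAFE_PATTERN.fullmatch(s):
        raise ValueError(f"Rejected input: {s!r}")
    return s

ok = require_safe("track_01 final")

try:
    require_safe("'; DROP TABLE xxx; --")
    rejected = False
except ValueError:
    rejected = True
print(ok, rejected)
```

Note that this particular pattern also rejects a plain dot, so a check like this is only usable when the set of legitimate inputs is narrow.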
Use parameters to avoid SQL-injection attacks.
sql = 'SELECT DISTINCT LEN(Wav)-CHARINDEX(?, Wav) FROM xxx' # note placeholder (?)
print(sql)
params = ('.',) # tuple
cursor = conn.cursor()
cursor.execute(sql, params)
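If you want to try the same idea without a SQL Server instance, here is a sketch against sqlite3. That swap is my assumption: SQLite spells LEN and CHARINDEX as LENGTH and INSTR, with the arguments of INSTR in the opposite order, and the sample file names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xxx (Wav TEXT)")
conn.executemany(
    "INSERT INTO xxx VALUES (?)", [("a.wav",), ("song.mp3",), ("x.flac",)]
)

# The dot is passed as a parameter, so it needs no quoting inside the SQL text.
sql = "SELECT DISTINCT LENGTH(Wav) - INSTR(Wav, ?) FROM xxx"
params = (".",)  # one-element tuple

extension_lengths = sorted(row[0] for row in conn.execute(sql, params))
print(extension_lengths)
```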
My objective is to store a JSON object into a MySQL database field of type json, using the mysql.connector library.
import mysql.connector
import json
jsonData = json.dumps(origin_of_jsonData)
cnx = mysql.connector.connect(**config_defined_elsewhere)
cursor = cnx.cursor()
cursor.execute('CREATE DATABASE dataBase')
cnx.database = 'dataBase'
cursor = cnx.cursor()
cursor.execute('CREATE TABLE table (id_field INT NOT NULL, json_data_field JSON NOT NULL, PRIMARY KEY (id_field))')
Now, the code below WORKS just fine, the focus of my question is the use of '%s':
insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES (%s, %s)"
values_to_insert = (1, jsonData)
cursor.execute(insert_statement, values_to_insert)
My problem with that: I am very strictly adhering to the use of '...{}'.format(aValue) (or f'...{aValue}') when combining variable aValue(s) into a string, thus avoiding the use of %s (whatever my reasons for that, let's not debate them here - but it is how I would like to keep it wherever possible, hence my question).
In any case, I am simply unable, whichever way I try, to create something that stores the jsonData into the mySql dataBase using something that resembles the above structure and uses '...{}'.format() (in whatever shape or form) instead of %s. For example, I have (among many iterations) tried
insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES ({}, {})".format(1, jsonData)
cursor.execute(insert_statement)
but no matter how I turn and twist it, I keep getting the following error:
ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[some_content_from_jsonData})]' at line 1
Now my question(s):
1) Is there a way to avoid the use of %s here that I am missing?
2) If not, why? What is it that makes this impossible? Is it the cursor.execute() function, or is it the fact that it is a JSON object, or is it something completely different? Shouldn't {}.format() be able to do everything that %s could do, and more?
First of all: NEVER DIRECTLY INSERT YOUR DATA INTO YOUR QUERY STRING!
Using %s in a MySQL query string is not the same as using it in a python string.
In Python, you just format the string, and 'hello %s!' % 'world' becomes 'hello world!'. In cursor.execute(), the %s marks a parameter placeholder: the values are passed to the driver separately from the query text instead of being formatted into it by your code. You are also not bound to this syntax; the Python DB-API specification (PEP 249) defines several parameter styles. This has several advantages over inserting your data directly into the query string:
Prevents SQL injection
Say you have a query to authenticate users by password. You would do that with the following query (of course you would normally salt and hash the password, but that is not the topic of this question):
SELECT 1 FROM users WHERE username='foo' AND password='bar'
The naive way to construct this query would be:
"SELECT 1 FROM users WHERE username='{}' AND password='{}'".format(username, password)
However, what would happen if someone inputs ' OR '1'='1 as the password? The formatted query would then become
SELECT 1 FROM users WHERE username='foo' AND password='' OR '1'='1'
which will always return 1. When using parameter insertion:
cursor.execute('SELECT 1 FROM users WHERE username=%s AND password=%s', (username, password))
this can never happen, because the values are never interpreted as part of the SQL syntax.
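The login bypass is easy to reproduce. The sketch below uses sqlite3 with ? placeholders instead of MySQL with %s (my assumption, so that it runs without a server), and a payload whose quotes are balanced so that the formatted query stays syntactically valid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('foo', 'bar')")

username = "foo"
password = "' OR '1'='1"  # hostile input, not the real password

# String formatting: the input rewrites the WHERE clause and the check passes.
unsafe_sql = "SELECT 1 FROM users WHERE username='{}' AND password='{}'".format(
    username, password
)
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Parameters: the input is compared as a literal password and nothing matches.
safe_rows = conn.execute(
    "SELECT 1 FROM users WHERE username=? AND password=?", (username, password)
).fetchall()
print(unsafe_rows, safe_rows)
```

AND binds tighter than OR, so the formatted query reads as (username matches AND password is empty) OR true, which holds for every row.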
Performance
If you run the same query many times with different data, the performance difference between using a formatted query and parameter insertion can be significant. With parameter insertion, the server only has to compile the query once (as it is the same every time) and execute it with different data, but with string formatting, it will have to compile it over and over again.
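In DB-API terms, running one query with many data sets is what executemany() is for. A minimal sketch with sqlite3 (my choice for a runnable demo; the table and rows are invented) is below. How much compilation is actually saved depends on the driver and the server, so treat the performance benefit as something to measure rather than assume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (sensor TEXT, value REAL)")

rows = [("s1", 20.5), ("s2", 21.0), ("s1", 19.8)]

# One SQL text, many parameter sets.
conn.executemany("INSERT INTO measurements (sensor, value) VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM measurements").fetchone()[0]
print(count)
```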
In addition to what was said above, I would like to add some details that I did not immediately understand, and that other (newbies like me ;)) may also find helpful:
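1) "Parameter insertion" is meant only for values; it will not work for table names, column names, etc. For those, Python string substitution works in the SQL syntax definition, but only build them from names your own code controls (or has checked against a list of known tables and columns), since interpolated identifiers are not protected from injection.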
2) the cursor.execute function requires a tuple to work (as specified here, albeit not immediately clear, at least to me: https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-execute.html)
EXAMPLE for both in one function:
def checkIfRecordExists(column, table, condition_name, condition_value):
    ...
    sqlSyntax = 'SELECT {} FROM {} WHERE {} = %s'.format(column, table, condition_name)
    cursor.execute(sqlSyntax, (condition_value,))
Note both the use of .format in the initial sql syntax definition and the use of (condition_value,) in the execute function.
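One caveat with formatting identifiers into the query: they get no protection from the placeholder mechanism, so they should be checked before use. Here is one possible sketch that validates the table and column names against the schema itself. It uses sqlite3 (my assumption; the sqlite_master lookup is SQLite-specific), and the function and helper names are hypothetical.

```python
import sqlite3

def quote_ident(name):
    """Quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def check_if_record_exists(conn, column, table, condition_name, condition_value):
    """Return True if a row matches; identifiers are checked against the schema first."""
    tables = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in tables:
        raise ValueError(f"Unknown table: {table!r}")
    # The table is now known to exist; read its column names from cursor metadata.
    cur = conn.execute(f"SELECT * FROM {quote_ident(table)} LIMIT 0")
    columns = {d[0] for d in cur.description}
    for name in (column, condition_name):
        if name not in columns:
            raise ValueError(f"Unknown column: {name!r}")
    sql = "SELECT {} FROM {} WHERE {} = ?".format(
        quote_ident(column), quote_ident(table), quote_ident(condition_name))
    return conn.execute(sql, (condition_value,)).fetchone() is not None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
conn.execute("INSERT INTO people VALUES (1, 'Ada')")

found = check_if_record_exists(conn, "id", "people", "name", "Ada")
missing = check_if_record_exists(conn, "id", "people", "name", "Bob")
try:
    check_if_record_exists(conn, "id", "people; DROP TABLE people", "name", "Ada")
    rejected = False
except ValueError:
    rejected = True
print(found, missing, rejected)
```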
I'm trying to insert strings read from a file into an sqlite database in Python. The strings have whitespace (newline, tab characters, and spaces) and also have appearances of single or double quotes. Here's how I try to do it:
import sqlite3
conn = sqlite3.connect('example.db')
c = conn.cursor()
# Create table
c.execute('''CREATE TABLE test
(a text, b text)''')
f = open("foo", "w")
f.write("hello\n\'world\'\n")
f.close()
testfield = open("foo").read()
# Insert a row of data
c.execute("INSERT INTO test VALUES ('%s', 'bar')" %(testfield))
# Save (commit) the changes
conn.commit()
I find that this fails with the error:
c.execute("INSERT INTO test VALUES ('%s', 'bar')" %(testfield))
sqlite3.OperationalError: near "world": syntax error
How can I achieve this? Do the strings need to be escaped before insertion in the db, and if so how? thanks.
You use SQL parameters instead of string formatting:
c.execute("INSERT INTO test VALUES (?, 'bar')", (testfield,))
When using SQL parameters, you let the database library handle the quoting and, even better, give the database the chance to optimize the query and reuse the optimized query plan for multiple executions of the same basic query (with different parameters).
Last but not least, you are much better defended against SQL injection attacks as the database library knows best how to escape dangerous SQL-like values.
To quote the sqlite3 documentation:
Usually your SQL operations will need to use values from Python variables. You shouldn’t assemble your query using Python’s string operations because doing so is insecure; it makes your program vulnerable to an SQL injection attack (see http://xkcd.com/327/ for humorous example of what can go wrong).
Instead, use the DB-API’s parameter substitution. Put ? as a placeholder wherever you want to use a value, and then provide a tuple of values as the second argument to the cursor’s execute() method.
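Applied to the question, a minimal sketch looks like this. I use an in-memory database and a string literal in place of example.db and the foo file (my simplifications), and add a tab and double quotes to the content to cover the other characters mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a TEXT, b TEXT)")

# Newlines, a tab, single quotes and double quotes, as described in the question.
testfield = "hello\n'world'\n\t\"quoted\""

# No escaping is needed: the value is bound, not pasted into the SQL text.
conn.execute("INSERT INTO test VALUES (?, 'bar')", (testfield,))
conn.commit()

stored = conn.execute("SELECT a FROM test").fetchone()[0]
print(stored == testfield)
```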