I use pyodbc in my Python Flask project for the SQLite DB connection.
I know and understand SQL Injections but this is my first time dealing with it.
I tried to execute some
I have a function which concatenates the SQL String in my database.py file:
def open_issue(self, data_object):
    cursor = self.conn.cursor()
    # data_object is the issue I get from the user
    name = data_object["name"]
    text = data_object["text"]
    rating_sum = 0
    # if the user provides an issue
    if name:
        # check if issue is already in db
        test = cursor.execute(f'''SELECT name FROM issue WHERE name = "{name}"''')
        data = test.fetchall()
        # if not in db insert
        if len(data) == 0:
            # insert the issue
            cursor.executescript(f'''INSERT INTO issue (name, text, rating_sum)
            VALUES ("{name}", "{text}", {rating_sum})''')
        else:
            print("nothing inserted!")
In the api.py file the open_issue() function gets called:
@self.app.route('/open_issue')
def insertdata():
    # data sent from client
    # data_object = flask.request.json
    # unit test dictionary
    data_object = {"name": "injection-test-table",
                   "text": "'; CREATE TABLE 'injected_table-1337';--"}
    DB().open_issue(data_object)
The "'; CREATE TABLE 'injected_table-1337';--" SQL injection has not created the injected_table-1337; instead it got inserted normally, like a string, into the text column of the injection-test-table.
So I don't really know whether I am safe from the standard kinds of SQL injection (this project will only be hosted locally, but good security is always welcome).
And secondly: are there ways with pyodbc to check if a string contains SQL syntax or symbols, so that nothing gets inserted in my example, or do I need to check the strings manually?
Thanks a lot
As it turns out, with SQLite you are at much less risk of SQL injection issues because by default neither Python's built-in sqlite3 module nor the SQLite ODBC driver allow multiple statements to be executed in a single .execute call (commonly known as an "anonymous code block"). This code:
thing = "'; CREATE TABLE bobby (id int primary key); --"
sql = f"SELECT * FROM table1 WHERE txt='{thing}'"
crsr.execute(sql)
throws this for sqlite3
sqlite3.Warning: You can only execute one statement at a time.
and this for SQLite ODBC
pyodbc.Error: ('HY000', '[HY000] only one SQL statement allowed (-1) (SQLExecDirectW)')
Still, you should follow best practices and use a proper parameterized query
thing = "'; CREATE TABLE bobby (id int primary key); --"
sql = "SELECT * FROM table1 WHERE txt=?"
crsr.execute(sql, (thing, ))
because this will also correctly handle parameter values that would cause errors if injected directly, e.g.,
thing = "it's good to avoid SQL injection"
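To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name table1 and the column txt follow the snippet above; the helper insert_and_find is a hypothetical name introduced only for illustration.

```python
import sqlite3


def insert_and_find(conn, value):
    """Store a value and look it up again, using only qmark parameters."""
    conn.execute("INSERT INTO table1 (txt) VALUES (?)", (value,))
    row = conn.execute(
        "SELECT txt FROM table1 WHERE txt = ?", (value,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (txt TEXT)")

# Both values are stored verbatim; neither is interpreted as SQL.
payload = "'; CREATE TABLE bobby (id int primary key); --"
quoted = "it's good to avoid SQL injection"
stored_payload = insert_and_find(conn, payload)
stored_quoted = insert_and_find(conn, quoted)

# List the tables that exist after both inserts.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
```

Because the values travel as parameters, the embedded quote and the injection attempt both end up as plain text in the txt column, and no extra table is created.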
Trying to insert into temp tables with SQLAlchemy (plugged into PYODBC/sql server) and inserting more than one row with decimal values and fast_executemany=True throws:
ProgrammingError("(pyodbc.ProgrammingError) ('Converting decimal loses precision', 'HY000')")
This happens only in temp table with fast_executemany=True and multiple rows being inserted at once with one column being decimal. Inserting one at a time, turning fast_executemany off or inserting into a regular table works perfectly.
I've built a simple example:
from decimal import Decimal
import sqlalchemy
CONNSTR = "mssql+pyodbc://user:PASSWORD@SERVER?driver=ODBC Driver 17 for SQL Server&trusted_connection=yes"
def test():
    data = [(1, Decimal('41763.9907359278'), Decimal('227367.1749095026')), (1027, Decimal('3117.1592020142'), Decimal('16970.1139430488'))]
    engine = sqlalchemy.create_engine(CONNSTR, fast_executemany=True, connect_args={'connect_timeout': 10})
    # this will fail
    insert(engine, data, "#temp_table_test")
    # this will work
    insert(engine, data, "regular_table_test")
def insert(engine, data, table_name):
    try:
        with engine.begin() as con:
            con.execute(f"""DROP TABLE IF EXISTS {table_name};""")
            con.execute(f"""
                CREATE TABLE {table_name} (
                    [id_column] INT NULL UNIQUE,
                    [usd_price] DECIMAL(38,20) NULL,
                    [brl_price] DECIMAL(38,20) NULL,
                )
            """)
            sql_insert_prices = f"INSERT INTO {table_name} VALUES (?,?,?)"
            con.execute(sql_insert_prices, data)
            print(f"Insert into {table_name} worked!")
    except Exception as e:
        print(f"{e!r}")
        print(f"Insert into {table_name} failed!")
While this is obviously related to the minimal type conversion done by fast_executemany, I can't work out why it behaves differently depending on the type of table. I think every other question here citing this particular exception is caused by factors not present here, so I'm really at a loss.
EDIT: so the original test with just one decimal column ran fine (I assumed reducing the number of columns wouldn't change the output), but adding another decimal column brings me back to square one with the same error message
fast_executemany=True asks the ODBC driver what the column types are, and the default mechanism used by Microsoft's ODBC drivers for SQL Server is to call a system stored procedure named sp_describe_undeclared_parameters. That stored procedure has some difficulties with #local_temp tables that do not occur with regular tables or ##global_temp tables. Details in this GitHub issue.
As mentioned in the related wiki entry, workarounds include
using Cursor.setinputsizes() to explicitly declare the column types,
using a ##global_temp table instead of a #local_temp table, or
adding UseFMTONLY=Yes to the ODBC connection string.
The easiest way to enable UseFMTONLY with SQLAlchemy is to use a pass-through pyodbc connection string, for example
from sqlalchemy import create_engine
from sqlalchemy.engine import URL
connection_string = (
    "DRIVER=ODBC Driver 17 for SQL Server;"
    "SERVER=192.168.0.199;"
    "DATABASE=test;"
    "UID=scott;PWD=tiger^5HHH;"
    "UseFMTONLY=Yes;"
)
connection_url = URL.create("mssql+pyodbc", query={"odbc_connect": connection_string})
engine = create_engine(connection_url, fast_executemany=True)
I am writing a Python script whose query is deliberately vulnerable to SQL injection.
The table name, the column the constraint applies to, and the constraint value are all given as command-line arguments when the script is run.
Here is the Python file:
import sqlite3
import sys
con = sqlite3.connect("univ1.db")
cur = con.cursor()
table = sys.argv[1]
column = sys.argv[2]
constraint = sys.argv[3]
cur.execute( """SELECT * FROM {} WHERE {} = '%s'""".format(table, column)% constraint)
rows = cur.fetchall()
for row in rows:
    print(','.join([str(val) for val in row]))
This code is supposed to be vulnerable to SQL injection, so executing the following command is expected to drop the specified table from the database, as well as print the details of every classroom whose building is blah.
python3 query.py classroom building "blah'; DROP TABLE INSTRUCTOR; --'"
But since the cursor.execute can execute only one command at a time the program terminates with a warning.
How can I allow executing multiple commands? Also note that the fetchall function should still return the relevant data.
Why am I asking this?
It is a part of an assignment where I am supposed to write both injection disabled as well as injection vulnerable query file.
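As it turns out, Python's sqlite3 module is not vulnerable to stacked-statement attacks such as the DROP TABLE above when you go through cursor.execute(), because execute() runs only one statement per call. (cursor.executescript() does accept several statements, but it returns no rows.)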
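For the assignment, the contrast between the two calls can be sketched with an in-memory database. The table names come from the question; the column layout is a hypothetical stand-in for the real univ1.db schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real univ1.db will differ.
conn.execute("CREATE TABLE classroom (building TEXT, room TEXT)")
conn.execute("CREATE TABLE instructor (name TEXT)")
conn.execute("INSERT INTO classroom VALUES ('blah', '101')")

constraint = "blah'; DROP TABLE instructor; --"
sql = "SELECT * FROM classroom WHERE building = '%s'" % constraint


def table_names(connection):
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]


# execute() rejects the stacked statement: sqlite3.Warning on older
# Python versions, sqlite3.ProgrammingError on newer ones.
try:
    conn.execute(sql)
    execute_rejected = False
except (sqlite3.Warning, sqlite3.ProgrammingError):
    execute_rejected = True
tables_after_execute = table_names(conn)

# executescript() runs every statement but returns no rows, so the
# SELECT result would have to be fetched with a separate call.
conn.executescript(sql)
tables_after_script = table_names(conn)
```

So a deliberately vulnerable version would need executescript() for the injected statement to run, plus a separate execute() call to obtain the rows that fetchall() should return.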
I have a query in a python script that creates a materialized view after some tables get created.
Script is something like this:
from sqlalchemy import create_engine, text
sql = '''CREATE MATERIALIZED VIEW schema1.view1 AS
SELECT t1.a,
t1.b,
t1.c,
t2.x AS d
FROM schema1.t1 t1
LEFT JOIN schema1.t2 t2 ON t1.f = t2.f
UNION ALL
SELECT t3.a,
t3.b,
t3.c,
t3.d
FROM schema1.t3 t3;'''
con=create_engine(db_conn)
con.execute(sql)
The query successfully executes when I run on the database directly.
But when running the script in python, I get an error:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.SyntaxError) syntax error at or near "CREATE MATERIALIZED VIEW schema"
I can't for the life of me figure out what it has an issue with - any ideas?
This was the weirdest thing. I had copied my query text out of another tool that I use to navigate around my pg DB into VS Code. The last part of the answer by @EOhm gave me the idea to just type the whole thing out in VS Code instead of copy/pasting.
And everything worked.
Even though the pasted text and what I typed appear identical in every way. So apparently there was some invisible formatting causing this issue.
I don't know whether SQLAlchemy supports materialized-view creation, but if it does, it would probably be done in a similar way to, or with, the specific metadata functions (https://docs.sqlalchemy.org/en/13/core/schema.html).
The text function is designed for database-independent DML, not DDL. Maybe it works for DDL (I don't know about SQLAlchemy), but by design the syntax is different from what you would execute directly on the database, as SQLAlchemy is meant to abstract the details of the database from the user.
If SQLAlchemy does not offer a convenient way to do that and you nevertheless have valid reasons to use SQLAlchemy at all, you can just execute the plain SQL statement in the dialect the database backend understands, so just omit SQLAlchemy's text function for the SQL statement, like:
from sqlalchemy import create_engine, text
sql = '''CREATE MATERIALIZED VIEW schema.view1 AS
SELECT t1.a,
t1.b,
t1.c,
t2.x AS d
FROM schema.t1 t1
LEFT JOIN schema.t2 t2 ON t1.f = t2.f
UNION ALL
SELECT t3.a,
t3.b,
t3.c,
t3.d
FROM schema.t3 t3;'''
con=create_engine(db_conn)
con.raw_connection().cursor().execute(sql)
(But of course you then have to take care of backend-specific syntax yourself, as opposed to statements wrapped by SQLAlchemy.)
I tested on my pg server without any issues using psycopg2 directly.
postgres=# create schema schema;
CREATE SCHEMA
postgres=# create table schema.t1 (a varchar, b varchar, c varchar, f integer);
CREATE TABLE
postgres=# create table schema.t2 (x varchar, f integer);
CREATE TABLE
postgres=# create table schema.t3 (a varchar, b varchar, c varchar, d varchar);
CREATE TABLE
postgres=# commit;
With the following script:
#!/usr/bin/python3
import psycopg2
conn = psycopg2.connect("dbname=postgres")
cur = conn.cursor()
cur.execute("""
CREATE MATERIALIZED VIEW schema.view1 AS
SELECT t1.a,
t1.b,
t1.c,
t2.x AS d
FROM schema.t1 t1
LEFT JOIN schema.t2 t2 ON t1.f = t2.f
UNION ALL
SELECT t3.a,
t3.b,
t3.c,
t3.d
FROM schema.t3 t3;
""")
conn.commit()
cur.close()
conn.close()
I tested with fairly current versions of Python 3.7/2.7, the psycopg2 module and the client libraries (I have the 11.5 pg client and psycopg2 2.8.3) from pgdg, installed on a fairly recent Linux. Can you try to execute directly on psycopg2 like I did?
Also, did you make sure your dots are plain ASCII dots, as all the other characters in the statement should be in this case? (Keep in mind that there can be invisible code points in Unicode that cause this sort of problem.) Maybe you can convert your string to ASCII bytes and back to a Unicode string if you are on Python. If it does not raise an error on .encode('ASCII'), it should be clean.
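A small sketch of that check, going one step further than .encode('ASCII') by reporting where the offending characters are. The function name find_non_ascii and the sample strings are made up for illustration.

```python
import unicodedata


def find_non_ascii(sql):
    """Return (index, repr, Unicode name) for every non-ASCII character."""
    return [
        (i, repr(ch), unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(sql)
        if ord(ch) > 127
    ]


clean = "CREATE MATERIALIZED VIEW schema1.view1 AS SELECT 1;"
# Same text, but with a no-break space (U+00A0) right after CREATE.
pasted = "CREATE\u00a0MATERIALIZED VIEW schema1.view1 AS SELECT 1;"

clean_hits = find_non_ascii(clean)
pasted_hits = find_non_ascii(pasted)
```

An empty list means the statement is plain ASCII; anything else points at the position of a character that looks harmless on screen but is not what the SQL parser expects.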
My objective is to store a JSON object into a MySQL database field of type json, using the mysql.connector library.
import mysql.connector
import json
jsonData = json.dumps(origin_of_jsonData)
cnx = mysql.connector.connect(**config_defined_elsewhere)
cursor = cnx.cursor()
cursor.execute('CREATE DATABASE dataBase')
cnx.database = 'dataBase'
cursor = cnx.cursor()
cursor.execute('CREATE TABLE table (id_field INT NOT NULL, json_data_field JSON NOT NULL, PRIMARY KEY (id_field))')
Now, the code below WORKS just fine, the focus of my question is the use of '%s':
insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES (%s, %s)"
values_to_insert = (1, jsonData)
cursor.execute(insert_statement, values_to_insert)
My problem with that: I am very strictly adhering to the use of '...{}'.format(aValue) (or f'...{aValue}') when combining variable aValue(s) into a string, thus avoiding the use of %s (whatever my reasons for that, let's not debate them here - but it is how I would like to keep it wherever possible, hence my question).
In any case, I am simply unable, whichever way I try, to create something that stores the jsonData into the mySql dataBase using something that resembles the above structure and uses '...{}'.format() (in whatever shape or form) instead of %s. For example, I have (among many iterations) tried
insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES ({}, {})".format(1, jsonData)
cursor.execute(insert_statement)
but no matter how I turn and twist it, I keep getting the following error:
ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[some_content_from_jsonData})]' at line 1
Now my question(s):
1) Is there a way to avoid the use of %s here that I am missing?
2) If not, why? What is it that makes this impossible? Is it the cursor.execute() function, or is it the fact that it is a JSON object, or is it something completely different? Shouldn't {}.format() be able to do everything that %s could do, and more?
First of all: NEVER DIRECTLY INSERT YOUR DATA INTO YOUR QUERY STRING!
Using %s in a MySQL query string is not the same as using it in a Python string.
In Python, you just format the string, and 'hello %s!' % 'world' becomes 'hello world!'. In a query passed to cursor.execute(), %s marks a parameter placeholder: the driver receives the query and the data separately, and either sends them to the server separately (prepared statements) or escapes and quotes the values itself before sending. You are also not bound to this syntax; the Python DB-API specification (PEP 249) defines several parameter styles. Parameter insertion has several advantages over inserting your data directly into the query string:
Prevents SQL injection
Say you have a query to authenticate users by password. You would do that with the following query (of course you would normally salt and hash the password, but that is not the topic of this question):
SELECT 1 FROM users WHERE username='foo' AND password='bar'
The naive way to construct this query would be:
"SELECT 1 FROM users WHERE username='{}' AND password='{}'".format(username, password)
However, what would happen if someone inputs ' OR '1'='1 as the password? The formatted query would then become
SELECT 1 FROM users WHERE username='foo' AND password='' OR '1'='1'
which will always return 1. When using parameter insertion:
cursor.execute('SELECT 1 FROM users WHERE username=%s AND password=%s', (username, password))
this will never happen, because the values are handled as data and never parsed as part of the SQL statement.
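The effect is easy to reproduce. The following is only a sketch: it uses Python's built-in sqlite3 as a stand-in for MySQL (an assumption made so the example runs without a server), so the placeholder is ? rather than %s, and the function names are hypothetical. It uses a payload such as ' OR '1'='1, which keeps the quotes balanced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('foo', 'bar')")


def login_formatted(username, password):
    # Vulnerable: user input becomes part of the SQL text.
    sql = ("SELECT 1 FROM users WHERE username='{}' AND password='{}'"
           .format(username, password))
    return conn.execute(sql).fetchone() is not None


def login_parameterized(username, password):
    # Safe: values are bound as data and never parsed as SQL.
    sql = "SELECT 1 FROM users WHERE username=? AND password=?"
    return conn.execute(sql, (username, password)).fetchone() is not None


attack = "' OR '1'='1"
formatted_result = login_formatted("foo", attack)
parameterized_result = login_parameterized("foo", attack)
```

With the formatted query the attacker is let in without knowing the password; with bound parameters the same input is just a wrong password.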
Performance
If you run the same query many times with different data, the performance difference between using a formatted query and parameter insertion can be significant. When the driver uses server-side prepared statements, the server only has to compile the query once (as it is the same every time) and execute it with different data, but with string formatting it has to compile it over and over again.
In addition to what was said above, I would like to add some details that I did not immediately understand, and that other newbies like me ;) may also find helpful:
1) "parameter insertion" is meant only for values; it will not work for table names, column names, etc. For those, Python string substitution works fine in the SQL syntax definition (provided the names come from trusted code rather than from user input)
2) the cursor.execute function requires a tuple to work (as specified here, albeit not immediately clear, at least to me: https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-execute.html)
EXAMPLE for both in one function:
def checkIfRecordExists(column, table, condition_name, condition_value):
    ...
    sqlSyntax = 'SELECT {} FROM {} WHERE {} = %s'.format(column, table, condition_name)
    cursor.execute(sqlSyntax, (condition_value,))
Note both the use of .format in the initial sql syntax definition and the use of (condition_value,) in the execute function.
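Since identifiers cannot be passed as parameters, one common precaution is to check them against an allow-list before formatting them into the statement. This is only a sketch under assumptions: the ALLOWED mapping and the function build_exists_query are hypothetical, and sqlite3 stands in for MySQL so that the example runs without a server (hence ? instead of %s).

```python
import sqlite3

# Hypothetical allow-list: table name -> permitted column names.
ALLOWED = {"users": {"id", "username"}}


def build_exists_query(column, table, condition_name):
    """Format identifiers only after checking them against ALLOWED."""
    if table not in ALLOWED:
        raise ValueError("unknown table: {!r}".format(table))
    for name in (column, condition_name):
        if name not in ALLOWED[table]:
            raise ValueError("unknown column: {!r}".format(name))
    # sqlite3 uses ? as its placeholder; mysql.connector uses %s.
    return "SELECT {} FROM {} WHERE {} = ?".format(
        column, table, condition_name)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, username TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

query = build_exists_query("id", "users", "username")
found = conn.execute(query, ("alice",)).fetchone()

try:
    build_exists_query("id", "users; DROP TABLE users", "username")
    rejected = False
except ValueError:
    rejected = True
```

The value still goes through a placeholder; only names that are known in advance ever reach .format().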
I have to call a MS SQLServer stored procedure with a table variable parameter.
/* Declare a variable that references the type. */
DECLARE @TableVariable AS [AList];
/* Add data to the table variable. */
INSERT INTO @TableVariable (val) VALUES ('value-1');
INSERT INTO @TableVariable (val) VALUES ('value-2');
EXEC [dbo].[sp_MyProc]
    @param = @TableVariable
This works well in SQL Server Management Studio. I tried the following in Python using pyodbc:
cursor.execute("declare @TableVariable AS [AList]")
for a in mylist:
    cursor.execute("INSERT INTO @TableVariable (val) VALUES (?)", a)
cursor.execute("{call dbo.sp_MyProc(@TableVariable)}")
This fails with the following error: error 42000: the table variable must be declared. The variable does not survive across the different execute steps.
I also tried:
sql = "DECLARE @TableVariable AS [AList]; "
for a in mylist:
    sql = sql + "INSERT INTO @TableVariable (val) VALUES ('{}'); ".format(a)
sql = sql + "EXEC [dbo].[sp_MyProc] @param = @TableVariable"
cursor.execute(sql)
This fails with the following error: No results. Previous SQL was not a query.
No more luck with
sql = sql + "{call dbo.sp_MyProc(@TableVariable)}"
Does anybody know how to handle this using pyodbc?
Now the root of your problem is that a SQL Server variable has the scope of the batch it was defined in. Each call to cursor.execute is a separate batch, even if they are in the same transaction.
There are a couple of ways you can work around this. The most direct is to rewrite your Python code so that it sends everything as a single batch. (I tested this on my test server and it should work as long as you either add set nocount on or else step over the intermediate results with nextset.)
A more indirect way is to rewrite the procedure to look for a temp table instead of a table variable and then just create and populate the temp table instead of a table variable. A temp table that is not created inside a stored procedure has a scope of the session it was created in.
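The single-batch approach can be sketched as a helper that builds the SQL text and the parameter list together. The names [AList], val, and [dbo].[sp_MyProc] come from the question; build_table_variable_batch is a hypothetical helper, and the batch has not been run against a server here, so treat it as an outline rather than a verified solution.

```python
def build_table_variable_batch(values):
    """Return (sql, params) for one batch that fills a table variable."""
    if not values:
        raise ValueError("at least one value is required")
    inserts = "".join(
        "INSERT INTO @TableVariable (val) VALUES (?);\n" for _ in values
    )
    sql = (
        "SET NOCOUNT ON;\n"
        "DECLARE @TableVariable AS [AList];\n"
        + inserts
        + "EXEC [dbo].[sp_MyProc] @param = @TableVariable;"
    )
    return sql, list(values)


sql, params = build_table_variable_batch(["value-1", "value-2"])
# With an open pyodbc cursor (not created here), the whole batch would
# be sent in one call:
#     cursor.execute(sql, params)
```

Keeping the values as ? placeholders avoids formatting user data into the statement, and SET NOCOUNT ON suppresses the row-count results that would otherwise come before the procedure's result set.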
I believe this error has nothing to do with SQL Server forgetting the table variable. I experienced this recently, and the problem was that pyodbc doesn't know how to get a result set back from the stored procedure if the SP also returns row counts for the statements it runs.
In my case the fix for this was to simply put "SET NOCOUNT ON" at the start of the SP.
I hope this helps.
I am not sure if this works and I can't test it because I don't have MS SQL Server, but have you tried executing everything in a single statement:
cursor.execute("""
    DECLARE @TableVariable AS [AList];
    INSERT INTO @TableVariable (val) VALUES ('value-1');
    INSERT INTO @TableVariable (val) VALUES ('value-2');
    EXEC [dbo].[sp_MyProc] @param = @TableVariable;
""")
I had this same problem, but none of the answers here fixed it. I was unable to get "SET NOCOUNT ON" to work, and I was also unable to get a single batch operation working with a table variable. What did work was to use a temporary table in two batches, but it took all day to find the right syntax. The code which follows creates and populates a temporary table in the first batch; then, in the second, it executes a stored proc using the database name followed by two dots before the stored proc name. This syntax is important for avoiding the error "Could not find stored procedure 'x'. (2812) (SQLExecDirectW))".
def create_incidents(db_config, create_table, columns, tuples_list, upg_date):
    """Executes trackerdb-dev mssql stored proc.
    Args:
        db_config (dict): config .ini file with mssqldb conn.
        create_table (string): temporary table definition to be inserted into 'CREATE TABLE #TempTable ()'
        columns (tuple): columns of the table into which values will be inserted.
        tuples_list (list): list of tuples where each describes a row of data to insert into the table.
        upg_date (string): date on which the items in the list will be upgraded.
    Returns:
        None
    """
    sql_create = """IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL
            DROP TABLE #TempTable;
        CREATE TABLE #TempTable ({});
        INSERT INTO #TempTable ({}) VALUES {};
    """
    columns = '"{}"'.format('", "'.join(item for item in columns))
    # this "params" variable is an egregious offense against security professionals everywhere. Replace it with parameterized queries asap.
    params = ', '.join([str(tupl) for tupl in tuples_list])
    sql_create = sql_create.format(
        create_table
        , columns
        , params)
    # msconn is not defined in this snippet; it is assumed to be an open pyodbc connection built from db_config.
    msconn.autocommit = True
    cur = msconn.cursor()
    try:
        cur.execute(sql_create)
        cur.execute("DatabaseName..TempTable_StoredProcedure ?", upg_date)
    except pyodbc.DatabaseError as err:
        print(err)
    else:
        cur.close()
    return
create_table = """
    int_column int
    , name varchar(255)
    , datacenter varchar(25)
"""
create_incidents(
    db_config = db_config
    , create_table = create_table
    , columns = ('int_column', 'name', 'datacenter')
    , tuples_list = tuples_list
    , upg_date = '2017-09-08')
The stored proc uses IF OBJECT_ID('tempdb..#TempTable') IS NULL syntax to validate the temporary table has been created. If it has, the procedure selects data from it and continues. If the temporary table has not been created, the proc aborts. This forces the stored proc to use a copy of the #TempTable created outside the stored procedure itself but in the same session. The pyodbc session lasts until the cursor or connection is closed and the temporary table created by pyodbc has the scope of the entire session.
IF OBJECT_ID('tempdb..#TempTable') IS NULL
BEGIN
    -- #TempTable gets created here only because SQL Server Management Studio throws errors if it isn't.
    CREATE TABLE #TempTable (
        int_column int
        , name varchar(255)
        , datacenter varchar(25)
    );
    -- This error is thrown so that the stored procedure requires a temporary table created *outside* the stored proc
    THROW 50000, '#TempTable table not found in tempdb', 1;
END
ELSE
BEGIN
    -- the stored procedure has now validated that the temporary table being used is coming from outside the stored procedure
    SELECT * FROM #TempTable;
END;
Finally, note that "tempdb" is not a placeholder, like I thought when I first saw it. "tempdb" is an actual MS SQL Server database system object.
Set connection.autocommit = True and use cursor.execute() only once instead of multiple times. The SQL string that you pass to cursor.execute() must contain all 3 steps:
Declaring the table variable
Filling the table variable with data
Executing the stored procedure that uses that table variable as an input
You don't need semicolons between the 3 steps.
Here's a fully functional demo. I didn't bother with parameter passing since it's irrelevant, but it also works fine with this, for the record.
SQL Setup (execute ahead of time)
CREATE TYPE dbo.type_MyTableType AS TABLE(
    a INT,
    b INT,
    c INT
)
GO
CREATE PROCEDURE dbo.CopyTable
    @MyTable type_MyTableType READONLY
AS
BEGIN
    SET NOCOUNT ON;
    SELECT * INTO MyResultTable FROM @MyTable
END
python
import pyodbc
CONN_STRING = (
    'Driver={SQL Server Native Client 11.0};'
    'Server=...;Database=...;UID=...;PWD=...'
)
class DatabaseConnection(object):
    def __init__(self, connection_string):
        self.conn = pyodbc.connect(connection_string)
        self.conn.autocommit = True
        self.cursor = self.conn.cursor()
    def __enter__(self):
        return self.cursor
    def __exit__(self, *args):
        self.cursor.close()
        self.conn.close()
sql = (
    'DECLARE @MyTable type_MyTableType'
    '\nINSERT INTO @MyTable VALUES'
    '\n(11, 12, 13),'
    '\n(21, 22, 23)'
    '\nEXEC CopyTable @MyTable'
)
with DatabaseConnection(CONN_STRING) as cursor:
    cursor.execute(sql)
If you want to spread the SQL across multiple calls to cursor.execute(), then you need to use a temporary table instead. Note that in that case, you still need connection.autocommit = True.
As Timothy pointed out, the catch is to use nextset().
What I have found is that when you execute() a multi-statement batch, pyodbc only reports the outcome of the first statement; results and errors from the later statements do not surface until you explicitly step through the batch with nextset().
Say your query is:
cursor.execute('select 1 '
               'select 1/0')
print(cursor.fetchall())
Your result is:
[(1, )]
But as soon as you instruct it to move on to the part of the batch that fails, via the command:
cursor.nextset()
there you have it:
pyodbc.DataError: ('22012', '[22012] [Microsoft][ODBC SQL Server Driver][SQL Server]Divide by zero error encountered. (8134) (SQLMoreResults)')
This solved the issue I encountered when working with table variables in a multi-statement query.
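Stepping through a batch can be wrapped in a small loop. This is a sketch that relies only on the DB-API cursor attributes description, fetchall() and nextset(), which pyodbc cursors provide; the function name fetch_all_result_sets is hypothetical.

```python
def fetch_all_result_sets(cursor):
    """Collect rows from every result set of an executed batch.

    Advancing with nextset() is also what surfaces errors raised by
    later statements in the batch.
    """
    result_sets = []
    while True:
        # description is None for statements that return no rows.
        if cursor.description is not None:
            result_sets.append(cursor.fetchall())
        if not cursor.nextset():
            break
    return result_sets
```

Called right after cursor.execute(batch), it returns one list of rows per statement that produced a result set, and any exception from a later statement propagates to the caller instead of being silently skipped.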