Python and Sqlite3, concatenate passed strings

Python and Sqlite3, concatenate passed strings - python

The following is a generalisation of my problem
Col_Index = 'Foo'
data = SQLfunctions.fetchSQL("SELECT Col_" + Col_Index + "FROM MyTable)
Where fetchSQL is a function that returns the values of the SQL statement. I tried to re-write this so as to pass the Col_Index using ? and concatenate the strings i.e.
data = SQLfunctions.fetchSQL("SELECT Col_ || ? FROM MyTable", [Col_Index])
But this did not work as it did not seem to perform the || with the ? (error something like column Col_ does not exist).
How do I change this to make it work?

You cannot use SQL parameters for anything other than values.
You cannot use them for table names or columns or SQL functions or any other part of the SQL grammar, precisely because they are designed to prevent such use.
In other words, if SQL parameters could be used to produce SQL code (including object names), they would be useless for one of their main uses: to prevent SQL injection attacks.
You are stuck with generating that part of your query, I am afraid.
Do be very, very careful with user-provided data, don't ever take Col_Index from external sources without sanitizing and/or severely restricting the range of values it can hold.
You could look at SQLAlchemy to generate the SQL for you, as it ensures that any object names are properly escaped as well:
from sqlalchemy.sql import table, literal_column, select
tbl = table('MyTable')
column = literal_column('Col_' + Col_Index)
session.execute(select([column], '', [tbl]))

Related

mysql query from backend python server syntax error with backticks on table name [duplicate]

Pretty new to sqlite3, so bear with me here..
I'd like to have a function to which I can pass the table name, and the values to update.
I initially started with something like this:
def add_to_table(table_name, string):
cursor.execute('INSERT INTO {table} VALUES ({var})'
.format(
table=table_name,
var=string)
)
Which works A-OK, but further reading about sqlite3 suggested that this was a terribly insecure way to go about things. However, using their ? syntax, I'm unable to pass in a name to specify the variable.
I tried adding in a ? in place of the table, but that throws a syntax error.
cursor.execute('INSERT INTO ? VALUES (?)', ('mytable','"Jello, world!"'))
>> >sqlite3.OperationalError: near "?": syntax error
Can the table in an sql statement be passed in safely and dynamically?

Its not the dynamic string substitution per-se thats the problem. Its dynamic string substitution with an user-supplied string thats the big problem because that opens you to SQL-injection attacks. If you are absolutely 100% sure that the tablename is a safe string that you control then splicing it into the SQL query will be safe.
if some_condition():
table_name = 'TABLE_A'
else:
table_name = 'TABLE_B'
cursor.execute('INSERT INTO '+ table_name + 'VALUES (?)', values)
That said, using dynamic SQL like that is certainly a code smell so you should double check to see if you can find a simpler alternative without the dynamically generated SQL strings. Additionally, if you really want dynamic SQL then something like SQLAlchemy might be useful to guarantee that the SQL you generate is well formed.

Composing SQL statements using string manipulation is odd not only because of security implications, but also because strings are "dumb" objects. Using sqlalchemy core (you don't even need the ORM part) is almost like using strings, but each fragment will be a lot smarter and allow for easier composition. Take a look at the sqlalchemy wiki to get a notion of what I'm talking about.
For example, using sqlsoup your code would look like this:
db = SQLSoup('sqlite://yourdatabase')
table = getattr(db, tablename)
table.insert(fieldname='value', otherfield=123)
db.commit()
Another advantage: code is database independent - want to move to oracle? Change the connection string and you are done.

Python variables in SQL query

I am creating a Python Flask app that interfaces with an SQL database. One of the things it does is take user input and stores it in a database. My current way of doing it looks something like this
mycursor.execute(f"SELECT * FROM privileges_groups WHERE id = {PrivID}")
This is not a good or correct way of doing this. Not only can certain characters such as ' cause errors, it also leaves me susceptible to SQL injection. Could anyone inform me of a good way of doing this?

To protect against injection attacks you should use placeholders for values.
So change
mycursor.execute(f"SELECT * FROM privileges_groups WHERE id = {PrivID}")
to
mycursor.execute("SELECT * FROM privileges_groups WHERE id = ?", (PrivID,))
Placeholders can only store a value of the given type and not an arbitrary SQL fragment. This will help to guard against strange (and probably invalid) parameter values.
However, you can't use placeholders for table names and column names.
Note: trailing comma is required for one-element tuples only but not necessary for multiple-element tuples. The comma disambiguates a tuple from an expression surrounded by parentheses.
Related: How do parameterized queries help against SQL injection?

So, if you want to avoid a sql injection...you have to have a secure query i.e. you don't want your query to doing something it shouldn't be.
queryRun = "SELECT * FROM privileges_groups WHERE id = %s" % (PrivID)
When you use "%s" this variable as a placeholder, you avoid ambiguity as to what the injection can or cannot cause to the overall system.
then..run the .execute() call:
mycursor.execute(queryRun)
Note: this also can be done in one step having all the changes within the .execute() call but you maybe better off splitting into piece-wise approach.
This isn't 100 % but should help a lot.

query of sqlite3 with python using '?'

I have a table of three columnsid,word,essay.I want to do a query using (?). The sql sentence is sql1 = "select id,? from training_data". My code is below:
def dbConnect(db_name,sql,flag):
conn = sqlite3.connect(db_name)
cursor = conn.cursor()
if (flag == "danci"):
itm = 'word'
elif flag == "wenzhang":
itm = 'essay'
n = cursor.execute(sql,(itm,))
res1 = cursor.fetchall()
return res1
However, when I print dbConnect("data.db",sql1,"danci")
The result I obtained is [(1,'word'),(2,'word'),(3,'word')...].What I really want to get is [(1,'the content of word column'),(2,'the content of word column')...]. What should I do ? Please give me some ideas.

You can't use placeholders for identifiers -- only for literal values.
I don't know what to suggest in this case, as your function takes a database nasme, an SQL string, and a flag to say how to modify that string. I think it would be better to pass just the first two, and write something like
sql = {
"danci": "SELECT id, word FROM training_data",
"wenzhang": "SELECT id, essay FROM training_data",
}
and then call it with one of
dbConnect("data.db", sql['danci'])
or
dbConnect("data.db", sql['wenzhang'])
But a lot depends on why you are asking dbConnect to decide on the columns to fetch based on a string passed in from outside; it's an unusual design.
Update - SQL Injection
The problems with SQL injection and tainted data is well documented, but here is a summary.
The principle is that, in theory, a programmer can write safe and secure programs as long as all the sources of data are under his control. As soon as they use any information from outside the program without checking its integrity, security is under threat.
Such information ranges from the obvious -- the parameters passed on the command line -- to the obscure -- if the PATH environment variable is modifiable then someone could induce a program to execute a completely different file from the intended one.
Perl provides direct help to avoid such situations with Taint Checking, but SQL Injection is the open door that is relevant here.
Suppose you take the value for a database column from an unverfied external source, and that value appears in your program as $val. Then, if you write
my $sql = "INSERT INTO logs (date) VALUES ('$val')";
$dbh->do($sql);
then it looks like it's going to be okay. For instance, if $val is set to 2014-10-27 then $sql becomes
INSERT INTO logs (date) VALUES ('2014-10-27')
and everything's fine. But now suppose that our data is being provided by someone less than scrupulous or downright malicious, and your $val, having originated elsewhere, contains this
2014-10-27'); DROP TABLE logs; SELECT COUNT(*) FROM security WHERE name != '
Now it doesn't look so good. $sql is set to this (with added newlines)
INSERT INTO logs (date) VALUES ('2014-10-27');
DROP TABLE logs;
SELECT COUNT(*) FROM security WHERE name != '')
which adds an entry to the logs table as before, end then goes ahead and drops the entire logs table and counts the number of records in the security table. That isn't what we had in mind at all, and something we must guard against.
The immediate solution is to use placeholders ? in a prepared statement, and later passing the actual values in a call to execute. This not only speeds things up, because the SQL statement can be prepared (compiled) just once, but protects the database from malicious data by quoting every supplied value appropriately for the data type, and escaping any embedded quotes so that it is impossible to close one statement and another open another.
This whole concept was humourised in Randall Munroe's excellent XKCD comic

Being that string substitution is frowned upon with forming SQL queries, how do you assign the table name dynamically?

Pretty new to sqlite3, so bear with me here..
I'd like to have a function to which I can pass the table name, and the values to update.
I initially started with something like this:
def add_to_table(table_name, string):
cursor.execute('INSERT INTO {table} VALUES ({var})'
.format(
table=table_name,
var=string)
)
Which works A-OK, but further reading about sqlite3 suggested that this was a terribly insecure way to go about things. However, using their ? syntax, I'm unable to pass in a name to specify the variable.
I tried adding in a ? in place of the table, but that throws a syntax error.
cursor.execute('INSERT INTO ? VALUES (?)', ('mytable','"Jello, world!"'))
>> >sqlite3.OperationalError: near "?": syntax error
Can the table in an sql statement be passed in safely and dynamically?

Its not the dynamic string substitution per-se thats the problem. Its dynamic string substitution with an user-supplied string thats the big problem because that opens you to SQL-injection attacks. If you are absolutely 100% sure that the tablename is a safe string that you control then splicing it into the SQL query will be safe.
if some_condition():
table_name = 'TABLE_A'
else:
table_name = 'TABLE_B'
cursor.execute('INSERT INTO '+ table_name + 'VALUES (?)', values)
That said, using dynamic SQL like that is certainly a code smell so you should double check to see if you can find a simpler alternative without the dynamically generated SQL strings. Additionally, if you really want dynamic SQL then something like SQLAlchemy might be useful to guarantee that the SQL you generate is well formed.

Composing SQL statements using string manipulation is odd not only because of security implications, but also because strings are "dumb" objects. Using sqlalchemy core (you don't even need the ORM part) is almost like using strings, but each fragment will be a lot smarter and allow for easier composition. Take a look at the sqlalchemy wiki to get a notion of what I'm talking about.
For example, using sqlsoup your code would look like this:
db = SQLSoup('sqlite://yourdatabase')
table = getattr(db, tablename)
table.insert(fieldname='value', otherfield=123)
db.commit()
Another advantage: code is database independent - want to move to oracle? Change the connection string and you are done.

How to avoid multiple queries in one execute call

I've just realized that psycopg2 allows multiple queries in one execute call.
For instance, this code will actually insert two rows in my_table:
>>> import psycopg2
>>> connection = psycopg2.connection(database='testing')
>>> cursor = connection.cursor()
>>> sql = ('INSERT INTO my_table VALUES (1, 2);'
... 'INSERT INTO my_table VALUES (3, 4)')
>>> cursor.execute(sql)
>>> connection.commit()
Does psycopg2 have some way of disabling this functionality? Or is there some other way to prevent this from happening?
What I've come so far is to search if the query has any semicolon (;) on it:
if ';' in sql:
# Multiple queries not allowed!
But this solution is not perfect, because it wouldn't allow some valid queries like:
SELECT * FROM my_table WHERE name LIKE '%;'
EDIT: SQL injection attacks are not an issue here. I do want to give to the user full access of the database (he can even delete the whole database if he wants).

If you want a general solution to this kind of problem, the answer is always going to be "parse format X, or at least parse it well enough to handle your needs".
In this case, it's probably pretty simple. PostgreSQL doesn't allow semicolons in the middle of column or table names, etc.; the only places they can appear are inside strings, or as statement terminators. So, you don't need a full parser, just one that can handle strings.
Unfortunately, even that isn't completely trivial, because you have to know the rules for what counts as a string literal in PostgreSQL. For example, is "abc\"def" a string abc"def?
But once you write or find a parser that can identify strings in PostgreSQL, it's easy: skip all the strings, then see if there are any semicolons left over.
For example (this is probably not the correct logic,* and it's also written in a verbose and inefficient way, just to show you the idea):
def skip_quotes(sql):
in_1, in_2 = False, False
for c in sql:
if in_1:
if c == "'":
in_1 = False
elif in_2:
if c == '"':
in_2 = False
else:
if c == "'":
in_1 = True
elif c == '"':
in_2 = True
else:
yield c
Then you can just write:
if ';' in skip_quotes(sql):
# Multiple queries not allowed!
If you can't find a pre-made parser, the first things to consider are:
If it's so trivial that simple string operations like find will work, do that.
If it's a simple, regular language, use re.
If the logic can be explained descriptively (e.g., via a BNF grammar), use a parsing library or parser-generator library like pyparsing or pybison.
Otherwise, you will probably need to write a state machine, or even explicit iterative code (like my example above). But this is very rarely the best answer for anything but teaching purposes.
* This is correct for a dialect that accepts either single- or double-quoted strings, does not escape one quote type within the other, and escapes quotes by doubling them (we will incorrectly treat 'abc''def' as two strings abc and def, rather than one string abc'def, but since all we're doing is skipping the strings anyway, we get the right result), but does not have C-style backslash escapes or anything else. I believe this matches sqlite3 as it actually works, although not sqlite3 as it's documented, and I have no idea whether it matches PostgreSQL.

Allowing users to make arbitrary queries (even single queries) can open your program up to SQL injection attacks and denial-of-service (DOS) attacks. The safest way to deal with potentially malicious users is to enumerate exactly what what queries are allowable and only allow the user to supply parameter values, not the entire SQL query itself.
So for example, you could define
sql = 'INSERT INTO my_table VALUES (%s, %s)'
args = [1, 2] # <-- Supplied by the user
and then safely execute the INSERT statement with:
cursor.execute(sql, args)
This is called parametrized SQL because the sql uses %s as parameter placemarkers, and the cursor.execute statement takes two arguments. The second argument is expected to be a sequence, and the database driver (e.g. psycopg2) will replace the parameter placemarkers with propertly quoted values supplied by args.
This will prevent SQL injection attacks.
The onus is still on you (when you write your allowable SQL) to prevent denial-of-service attacks. You can attempt to protect yourself from DOS attacks by making sure the arguments supplied by the user is in a reasonable range, for instance.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.