How to use complex SQL scripts with python

I could write a stored procedure in MySQL and execute it with a CALL statement, but I would like to write it in Python instead. I got stuck using an SQL script that spans multiple lines.
conn = pyodbc.connect('DSN=MySQL;PWD=xxxx')
csr = conn.cursor()
Sql= 'SELECT something, something
FROM table
WHERE foo=bar
ORDER BY foo '
csr.execute(Sql)
sqld = csr.fetchall()

Heh, I don't mind to make it a proper answer.
String literals in triple quotes can include linebreaks and won't cause syntax errors. Otherwise (with "string" or 'string') you will need to include a backslash before every linebreak to make it work. And from experience, that's easy to screw up. :)
As a minor note, Python variable names usually start with a lowercase letter; names starting with a capital letter are usually given to classes.
So:
sql = """SELECT something, something
FROM table
WHERE foo=bar
ORDER BY foo"""
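To see the triple-quoted approach run end to end, here is a minimal sketch. It uses the standard-library sqlite3 module with an in-memory database in place of the pyodbc/MySQL connection from the question, and the table `items` and its columns are made up for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL DSN.
conn = sqlite3.connect(":memory:")
csr = conn.cursor()
csr.execute("CREATE TABLE items (name TEXT, price INTEGER)")
csr.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("pear", 3), ("apple", 1), ("fig", 2)],
)

# A triple-quoted literal can span several lines without backslashes.
sql = """
    SELECT name, price
    FROM items
    WHERE price >= 2
    ORDER BY price
"""
csr.execute(sql)
rows = csr.fetchall()
print(rows)
conn.close()
```

The extra whitespace and newlines inside the literal are passed to the database as-is, which SQL parsers generally ignore.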

If you don't mind the overhead, take a look at sqlalchemy:
SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL.

Related

Python Postgresql query with text string

In PgAdmin, I can do the following query successfully:
select * from "Faces" where "Face_Name" = 'Alex'
However, when I try to do the exact same query in python, I get endless syntax errors.
I am trying to write the line like this:
cursor.execute('SELECT * from "Faces" where ("Face_Name" = 'Alex')
I understand the table and column names need to be in double quotes, and the whole query needs to be in single quotes. Also seems the string (in this case 'Alex') that I am searching for needs to be in single quotes.
How do I put all this together into a single line?

Assuming you did need to escape the table and column names, you could use double quotes. In that case, just escape the double quotes inside the Python SQL string:
sql = "SELECT * FROM \"Faces\" WHERE \"Face_Name\" = 'Alex'"
cursor.execute(sql)

There are two issues here:
As others already wrote, you need to be careful not to mix up the Python and SQL quotes; depending on the field name you may need to have both in the query, and either escape one of them or use """ for the Python string.
If the name "Alex" comes from a variable in Python, rather than being a constant, you should use a placeholder and pass it separately. This will help avoid security problems (SQL Injection) and is a good habit to get into whether or not it's required in this particular case.
Putting these together, the query should be:
cursor.execute('SELECT * from "Faces" where "Face_Name" = %s', ('Alex',))
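Here is a rough, runnable sketch of the same combination: double-quoted identifiers inside a single-quoted Python string, plus a bound parameter. It uses the standard-library sqlite3 module as a stand-in for PostgreSQL, so the placeholder is `?` rather than psycopg2's `%s`. The `Age` column and the helper `find_face` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
# Double quotes mark SQL identifiers; the Python literal uses single quotes.
cursor.execute('CREATE TABLE "Faces" ("Face_Name" TEXT, "Age" INTEGER)')
cursor.executemany(
    'INSERT INTO "Faces" VALUES (?, ?)',
    [("Alex", 30), ("Sam", 25)],
)


def find_face(cur, name):
    """Hypothetical helper: look up rows by name using a bound parameter."""
    # sqlite3 uses ? as its placeholder; psycopg2 would use %s instead.
    cur.execute('SELECT * FROM "Faces" WHERE "Face_Name" = ?', (name,))
    return cur.fetchall()


matches = find_face(cursor, "Alex")
# A value containing quotes is treated as data, not as SQL.
no_match = find_face(cursor, "Alex' OR '1'='1")
print(matches, no_match)
```

Because the name travels separately from the SQL text, no quoting of the value is needed in the Python string at all.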

Being that string substitution is frowned upon with forming SQL queries, how do you assign the table name dynamically?

Pretty new to sqlite3, so bear with me here..
I'd like to have a function to which I can pass the table name, and the values to update.
I initially started with something like this:
def add_to_table(table_name, string):
    cursor.execute('INSERT INTO {table} VALUES ({var})'
                   .format(
                       table=table_name,
                       var=string)
                   )
Which works A-OK, but further reading about sqlite3 suggested that this was a terribly insecure way to go about things. However, using their ? syntax, I'm unable to pass in a name to specify the variable.
I tried adding in a ? in place of the table, but that throws a syntax error.
cursor.execute('INSERT INTO ? VALUES (?)', ('mytable','"Jello, world!"'))
>>> sqlite3.OperationalError: near "?": syntax error
Can the table in an sql statement be passed in safely and dynamically?

It's not the dynamic string substitution per se that's the problem. It's dynamic string substitution with a user-supplied string that's the big problem, because that opens you up to SQL injection attacks. If you are absolutely 100% sure that the table name is a safe string that you control, then splicing it into the SQL query will be safe.
if some_condition():
    table_name = 'TABLE_A'
else:
    table_name = 'TABLE_B'
cursor.execute('INSERT INTO ' + table_name + ' VALUES (?)', values)
That said, using dynamic SQL like that is certainly a code smell so you should double check to see if you can find a simpler alternative without the dynamically generated SQL strings. Additionally, if you really want dynamic SQL then something like SQLAlchemy might be useful to guarantee that the SQL you generate is well formed.
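One way to keep the table name "a safe string that you control" is to check it against a fixed allowlist before splicing it in. The following is only a sketch under that assumption: the names `ALLOWED_TABLES`, `table_a` and `table_b` are made up, and the values still go through `?` placeholders.

```python
import sqlite3

# Assumption: these are the only tables callers may write to.
ALLOWED_TABLES = {"table_a", "table_b"}


def add_to_table(cursor, table_name, value):
    """Insert one value into a table chosen from a fixed allowlist."""
    if table_name not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table_name!r}")
    # The table name is spliced in only after the allowlist check;
    # the value still goes through a ? placeholder.
    cursor.execute(f'INSERT INTO "{table_name}" VALUES (?)', (value,))


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE table_a (msg TEXT)")
cur.execute("CREATE TABLE table_b (msg TEXT)")

add_to_table(cur, "table_a", "Jello, world!")
stored = cur.execute("SELECT msg FROM table_a").fetchall()

try:
    add_to_table(cur, "table_a; DROP TABLE table_b", "x")
    rejected = False
except ValueError:
    rejected = True
print(stored, rejected)
```

The lookup never lets user text reach the SQL string unless it exactly matches a name you wrote yourself.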

Composing SQL statements using string manipulation is odd not only because of security implications, but also because strings are "dumb" objects. Using sqlalchemy core (you don't even need the ORM part) is almost like using strings, but each fragment will be a lot smarter and allow for easier composition. Take a look at the sqlalchemy wiki to get a notion of what I'm talking about.
For example, using sqlsoup your code would look like this:
db = SQLSoup('sqlite:///yourdatabase')
table = getattr(db, tablename)
table.insert(fieldname='value', otherfield=123)
db.commit()
Another advantage: code is database independent - want to move to oracle? Change the connection string and you are done.

How to avoid multiple queries in one execute call

I've just realized that psycopg2 allows multiple queries in one execute call.
For instance, this code will actually insert two rows in my_table:
>>> import psycopg2
>>> connection = psycopg2.connect(database='testing')
>>> cursor = connection.cursor()
>>> sql = ('INSERT INTO my_table VALUES (1, 2);'
... 'INSERT INTO my_table VALUES (3, 4)')
>>> cursor.execute(sql)
>>> connection.commit()
Does psycopg2 have some way of disabling this functionality? Or is there some other way to prevent this from happening?
What I've come up with so far is to check whether the query contains a semicolon (;):
if ';' in sql:
    raise ValueError('Multiple queries not allowed!')
But this solution is not perfect, because it wouldn't allow some valid queries like:
SELECT * FROM my_table WHERE name LIKE '%;'
EDIT: SQL injection attacks are not an issue here. I do want to give the user full access to the database (they can even delete the whole database if they want).

If you want a general solution to this kind of problem, the answer is always going to be "parse format X, or at least parse it well enough to handle your needs".
In this case, it's probably pretty simple. PostgreSQL doesn't allow semicolons in the middle of column or table names, etc.; the only places they can appear are inside strings, or as statement terminators. So, you don't need a full parser, just one that can handle strings.
Unfortunately, even that isn't completely trivial, because you have to know the rules for what counts as a string literal in PostgreSQL. For example, is "abc\"def" a string abc"def?
But once you write or find a parser that can identify strings in PostgreSQL, it's easy: skip all the strings, then see if there are any semicolons left over.
For example (this is probably not the correct logic,* and it's also written in a verbose and inefficient way, just to show you the idea):
def skip_quotes(sql):
    in_1, in_2 = False, False
    for c in sql:
        if in_1:
            if c == "'":
                in_1 = False
        elif in_2:
            if c == '"':
                in_2 = False
        else:
            if c == "'":
                in_1 = True
            elif c == '"':
                in_2 = True
            else:
                yield c
Then you can just write:
if ';' in skip_quotes(sql):
    raise ValueError('Multiple queries not allowed!')
If you can't find a pre-made parser, the first things to consider are:
If it's so trivial that simple string operations like find will work, do that.
If it's a simple, regular language, use re.
If the logic can be explained descriptively (e.g., via a BNF grammar), use a parsing library or parser-generator library like pyparsing or pybison.
Otherwise, you will probably need to write a state machine, or even explicit iterative code (like my example above). But this is very rarely the best answer for anything but teaching purposes.
* This is correct for a dialect that accepts either single- or double-quoted strings, does not escape one quote type within the other, and escapes quotes by doubling them (we will incorrectly treat 'abc''def' as two strings abc and def, rather than one string abc'def, but since all we're doing is skipping the strings anyway, we get the right result), but does not have C-style backslash escapes or anything else. I believe this matches sqlite3 as it actually works, although not sqlite3 as it's documented, and I have no idea whether it matches PostgreSQL.
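To make the idea concrete, here is a small self-contained sketch that repeats the generator and wraps it in a hypothetical helper, `has_multiple_statements`. Ignoring one trailing semicolon is my own assumption rather than part of the approach described, and the same caveats about PostgreSQL's real quoting rules apply.

```python
def skip_quotes(sql):
    """Yield the characters of sql that lie outside quoted sections."""
    in_single, in_double = False, False
    for c in sql:
        if in_single:
            if c == "'":
                in_single = False
        elif in_double:
            if c == '"':
                in_double = False
        elif c == "'":
            in_single = True
        elif c == '"':
            in_double = True
        else:
            yield c


def has_multiple_statements(sql):
    """Hypothetical helper: True if a semicolon separates two statements."""
    outside = "".join(skip_quotes(sql)).strip()
    # Assumption: a single trailing semicolon is harmless, so drop it.
    if outside.endswith(";"):
        outside = outside[:-1]
    return ";" in outside


print(has_multiple_statements("SELECT * FROM my_table WHERE name LIKE '%;'"))
print(has_multiple_statements(
    "INSERT INTO my_table VALUES (1, 2);INSERT INTO my_table VALUES (3, 4)"
))
print(has_multiple_statements("SELECT 'it''s; fine';"))
```

Only the second call reports multiple statements; the semicolons in the other two sit inside string literals or at the very end.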

Allowing users to make arbitrary queries (even single queries) can open your program up to SQL injection attacks and denial-of-service (DoS) attacks. The safest way to deal with potentially malicious users is to enumerate exactly which queries are allowable and only allow the user to supply parameter values, not the entire SQL query itself.
So for example, you could define
sql = 'INSERT INTO my_table VALUES (%s, %s)'
args = [1, 2] # <-- Supplied by the user
and then safely execute the INSERT statement with:
cursor.execute(sql, args)
This is called parametrized SQL because the SQL uses %s as parameter placeholders, and the cursor.execute statement takes two arguments. The second argument is expected to be a sequence, and the database driver (e.g. psycopg2) will replace the parameter placeholders with properly quoted values supplied by args.
This will prevent SQL injection attacks.
The onus is still on you (when you write your allowable SQL) to prevent denial-of-service attacks. You can attempt to protect yourself from DoS attacks by making sure the arguments supplied by the user are in a reasonable range, for instance.
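As a rough illustration of combining a fixed statement with a range check, here is a sketch using the standard-library sqlite3 module, so the placeholder is `?` instead of `%s`. The helper `insert_pair` and the bounds `low` and `high` are hypothetical; what counts as a reasonable range depends on your application.

```python
import sqlite3

INSERT_SQL = "INSERT INTO my_table VALUES (?, ?)"  # fixed, not user-supplied


def insert_pair(cursor, args, low=0, high=1000):
    """Run the fixed INSERT after checking that both values are in range."""
    # Assumption: two integers in [low, high] is what counts as reasonable.
    if len(args) != 2:
        raise ValueError("expected exactly two values")
    for value in args:
        if not isinstance(value, int) or not low <= value <= high:
            raise ValueError(f"value out of range: {value!r}")
    cursor.execute(INSERT_SQL, tuple(args))


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE my_table (a INTEGER, b INTEGER)")

insert_pair(cur, [1, 2])
try:
    insert_pair(cur, [1, 10**9])
    accepted_large = True
except ValueError:
    accepted_large = False

rows = cur.execute("SELECT a, b FROM my_table").fetchall()
print(rows, accepted_large)
```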

How to quote a string value explicitly (Python DB API/Psycopg2)

For various reasons, I would like to explicitly quote a string value (which becomes part of a constructed SQL query) instead of relying on the implicit quoting that cursor.execute performs on the contents of its second parameter.
By "implicit quotation" I mean:
value = "Unsafe string"
query = "SELECT * FROM some_table WHERE some_char_field = %s;"
cursor.execute( query, (value,) ) # value will be correctly quoted
I would prefer something like that:
value = "Unsafe string"
query = "SELECT * FROM some_table WHERE some_char_field = %s;" % \
    READY_TO_USE_QUOTING_FUNCTION(value)
cursor.execute( query ) # value will be correctly quoted, too
Is such a low-level READY_TO_USE_QUOTING_FUNCTION expected by the Python DB API specification? (I couldn't find such functionality in the PEP 249 document.) If not, maybe Psycopg2 provides such a function? If not, maybe Django provides such a function? I would prefer not to write such a function myself...

Ok, so I was curious and went and looked at the source of psycopg2. Turns out I didn't have to go further than the examples folder :)
And yes, this is psycopg2-specific. Basically, if you just want to quote a string you'd do this:
from psycopg2.extensions import adapt
print(adapt("Hello World'; DROP DATABASE World;"))
But what you probably want to do is write and register your own adapter.
In the examples folder of psycopg2 you will find the file 'myfirstrecipe.py', which shows how to cast and quote a specific type in a special way.
If you have objects for the stuff you want to do, you can create an adapter that conforms to the 'IPsycopgSQLQuote' protocol (see the pydocs for the myfirstrecipe.py example; actually, that's the only reference I can find to that name) that quotes your object, and then register it like so:
from psycopg2.extensions import register_adapter
register_adapter(mytype, myadapter)
Also, the other examples are interesting; esp. 'dialtone.py' and 'simple.py'.

I guess you're looking for the mogrify function.
Example:
>>> cur.mogrify("INSERT INTO test (num, data) VALUES (%s, %s)", (42, 'bar'))
"INSERT INTO test (num, data) VALUES (42, E'bar')"

You should try to avoid doing your own quoting. Not only will it be DB-specific as people have pointed out, but flaws in quoting are the source of SQL injection bugs.
If you don't want to pass around queries and values separately, then pass around a list of the parameters:
def make_my_query():
    # ...
    return sql, (value1, value2)
def do_it():
    query = make_my_query()
    cursor.execute(*query)
(I probably have the syntax of cursor.execute wrong) The point here is that just because cursor.execute takes a number of arguments, that doesn't mean you have to handle them all separately. You can deal with them as one list.
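For what it's worth, the unpacking does work that way. Here is a small runnable sketch with the standard-library sqlite3 module; the `items` table, the query text and the parameters of `make_my_query` are invented for the example.

```python
import sqlite3


def make_my_query(min_price, limit):
    """Hypothetical builder: return the SQL text and its parameters together."""
    sql = "SELECT name FROM items WHERE price >= ? ORDER BY price LIMIT ?"
    return sql, (min_price, limit)


conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE items (name TEXT, price INTEGER)")
cursor.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("pear", 3), ("apple", 1), ("fig", 2)],
)

query = make_my_query(2, 5)
# query is a (sql, params) tuple, so * unpacks it into execute's two arguments.
cursor.execute(*query)
names = [row[0] for row in cursor.fetchall()]
print(names)
```

The query and its values stay together as one object, yet the values are never formatted into the SQL string.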

This'll be database dependent (iirc, MySQL allows \ as an escape character, while something like Oracle expects quotes to be doubled: 'my '' quoted string').
Someone correct me if I'm wrong, but doubling the quote character is the standard method.
It may be worth looking at what other db abstraction libraries do (sqlalchemy, cx_Oracle, sqlite, etc).
I've got to ask - why do you want to inline the values instead of bind them?

This is going to be DB dependent. In the case of MySQLdb, for example, the connection class has a literal method that will convert the value to the correct escaped representation for passing to MySQL (that's what cursor.execute uses).
I imagine Postgres has something similar, but I don't think there is a function to escape values as part of the DB API 2.0 spec.

I don't think you have given sufficient reasoning for avoiding doing this The Right Way. Please use the API as it is designed, and don't try so hard to make your code less readable for the next guy and more fragile.

According to the psycopg extensions docs, your code snippet would look like this:
from psycopg2.extensions import adapt
value = "Unsafe string"
query = "SELECT * FROM some_table WHERE some_char_field = %s;" % \
    adapt(value).getquoted()
cursor.execute( query ) # value will be correctly quoted, too
The getquoted function returns the value as a quoted and escaped string, so you could also write: "SELECT * FROM some_table WHERE some_char_field = " + adapt(value).getquoted().

PyPika is another good option for building SQL statements. Usage example (based on an example on the project's homepage):
>>> from pypika import Order, Query
>>> Query.from_('customers').select('id', 'fname', 'lname', 'phone').orderby('id', order=Order.desc)
SELECT "id","fname","lname","phone" FROM "customers" ORDER BY "id" DESC

If you use Django, you might want to use its quoting function, which is automatically adapted to the currently configured DBMS. Note that quote_name quotes identifiers such as table and column names rather than string values:
from django.db import connection
my_quoted_variable = connection.ops.quote_name(myvar)

import re
def db_quote(s):
    return "\"" + re.escape(s) + "\""
can do the job of simple quoting that works at least with MySQL. What we really need, though, is a cursor.format() function that would work like cursor.execute() except that it would return the resulting query instead of executing it. There are times when you do not want the query to be executed quite yet, e.g. you may want to log it first, or print it out for debugging before you go ahead with it.