Forming an insert statement with single quotes in Python with Postgres - python

I am having a field called comments.
I am effectively trying to read values from one large table into multiple tables.
Hence, my select query fetches the comment field for me.
I am constructing a Python script to do the copying from table to table.
My insert query fails when it encounters a comment field like "Sorry! we can't process your order" because of the single quote.
I have tried using $ quotes but in vain
Here is what I am trying
#!/usr/bin/python
import psycopg2
conn = psycopg2.connect("dbname='postgres' user='postgres' host='localhost' )
mark=conn.cursor()
/* fectching the rows and doing other stuff */
addthis="insert into my_table(something) values("$$"+str(row[8])+"$$")
mark.execute(addthis)
conn.commit()
Appreciate the help!

Your insert statement should use a placeholder. In the case of psycopg2, it is %s.
You should pass the parameter(s) as a second argument to execute(). That way you don't have quoting issues and you guard against SQL-injection attack.
For example:
addthis = "INSERT INTO my_table (something) VALUES (%s);"
mark.execute(addthis, ('a string you wish to insert',))

You could use a placeholder, as suggested by bernie. This is the preferred way.
There are however situations where using a placeholder is not possible. You then have to escape the qoutes manually. This is done by backslashing them:
addthis="insert into my_table(something) values(%s)" % str(row[8]).replace('"', r'\"').replace("'", r"\'")
mark.execute(addthis)

Related

mysql query from backend python server syntax error with backticks on table name [duplicate]

Pretty new to sqlite3, so bear with me here..
I'd like to have a function to which I can pass the table name, and the values to update.
I initially started with something like this:
def add_to_table(table_name, string):
cursor.execute('INSERT INTO {table} VALUES ({var})'
.format(
table=table_name,
var=string)
)
Which works A-OK, but further reading about sqlite3 suggested that this was a terribly insecure way to go about things. However, using their ? syntax, I'm unable to pass in a name to specify the variable.
I tried adding in a ? in place of the table, but that throws a syntax error.
cursor.execute('INSERT INTO ? VALUES (?)', ('mytable','"Jello, world!"'))
>> >sqlite3.OperationalError: near "?": syntax error
Can the table in an sql statement be passed in safely and dynamically?
Its not the dynamic string substitution per-se thats the problem. Its dynamic string substitution with an user-supplied string thats the big problem because that opens you to SQL-injection attacks. If you are absolutely 100% sure that the tablename is a safe string that you control then splicing it into the SQL query will be safe.
if some_condition():
table_name = 'TABLE_A'
else:
table_name = 'TABLE_B'
cursor.execute('INSERT INTO '+ table_name + 'VALUES (?)', values)
That said, using dynamic SQL like that is certainly a code smell so you should double check to see if you can find a simpler alternative without the dynamically generated SQL strings. Additionally, if you really want dynamic SQL then something like SQLAlchemy might be useful to guarantee that the SQL you generate is well formed.
Composing SQL statements using string manipulation is odd not only because of security implications, but also because strings are "dumb" objects. Using sqlalchemy core (you don't even need the ORM part) is almost like using strings, but each fragment will be a lot smarter and allow for easier composition. Take a look at the sqlalchemy wiki to get a notion of what I'm talking about.
For example, using sqlsoup your code would look like this:
db = SQLSoup('sqlite://yourdatabase')
table = getattr(db, tablename)
table.insert(fieldname='value', otherfield=123)
db.commit()
Another advantage: code is database independent - want to move to oracle? Change the connection string and you are done.

Use of '.format()' vs. '%s' in cursor.execute() for mysql JSON field, with Python mysql.connector,

My objective is to store a JSON object into a MySQL database field of type json, using the mysql.connector library.
import mysql.connector
import json
jsonData = json.dumps(origin_of_jsonData)
cnx = mysql.connector.connect(**config_defined_elsewhere)
cursor = cnx.cursor()
cursor.execute('CREATE DATABASE dataBase')
cnx.database = 'dataBase'
cursor = cnx.cursor()
cursor.execute('CREATE TABLE table (id_field INT NOT NULL, json_data_field JSON NOT NULL, PRIMARY KEY (id_field))')
Now, the code below WORKS just fine, the focus of my question is the use of '%s':
insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES (%s, %s)"
values_to_insert = (1, jsonData)
cursor.execute(insert_statement, values_to_insert)
My problem with that: I am very strictly adhering to the use of '...{}'.format(aValue) (or f'...{aValue}') when combining variable aValue(s) into a string, thus avoiding the use of %s (whatever my reasons for that, let's not debate them here - but it is how I would like to keep it wherever possible, hence my question).
In any case, I am simply unable, whichever way I try, to create something that stores the jsonData into the mySql dataBase using something that resembles the above structure and uses '...{}'.format() (in whatever shape or form) instead of %s. For example, I have (among many iterations) tried
insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES ({}, {})".format(1, jsonData)
cursor.execute(insert_statement)
but no matter how I turn and twist it, I keep getting the following error:
ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[some_content_from_jsonData})]' at line 1
Now my question(s):
1) Is there a way to avoid the use of %s here that I am missing?
2) If not, why? What is it that makes this impossible? Is it the cursor.execute() function, or is it the fact that it is a JSON object, or is it something completely different? Shouldn't {}.format() be able to do everything that %s could do, and more?
First of all: NEVER DIRECTLY INSERT YOUR DATA INTO YOUR QUERY STRING!
Using %s in a MySQL query string is not the same as using it in a python string.
In python, you just format the string and 'hello %s!' % 'world' becomes 'hello world!'. In SQL, the %s signals parameter insertion. This sends your query and data to the server separately. You are also not bound to this syntax. The python DB-API specification specifies more styles for this: DB-API parameter styles (PEP 249). This has several advantages over inserting your data directly into the query string:
Prevents SQL injection
Say you have a query to authenticate users by password. You would do that with the following query (of course you would normally salt and hash the password, but that is not the topic of this question):
SELECT 1 FROM users WHERE username='foo' AND password='bar'
The naive way to construct this query would be:
"SELECT 1 FROM users WHERE username='{}' AND password='{}'".format(username, password)
However, what would happen if someone inputs ' OR 1=1 as password. The formatted query would then become
SELECT 1 FROM users WHERE username='foo' AND password='' OR 1=1
which will allways return 1. When using parameter insertion:
execute('SELECT 1 FROM users WHERE username=%s AND password=%s', username, password)
this will never happen, as the query will be interpreted by the server separately.
Performance
If you run the same query many times with different data, the performance difference between using a formatted query and parameter insertion can be significant. With parameter insertion, the server only has to compile the query once (as it is the same every time) and execute it with different data, but with string formatting, it will have to compile it over and over again.
In addition to what was said above, I would like to add some details that I did not immediately understand, and that other (newbies like me ;)) may also find helpful:
1) "parameter insertion" is meant for only for values, it will not work for table names, column names, etc. - for those, the Python string substitution works fine in the sql syntax defintion
2) the cursor.execute function requires a tuple to work (as specified here, albeit not immediately clear, at least to me: https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-execute.html)
EXAMPLE for both in one function:
def checkIfRecordExists(column, table, condition_name, condition_value):
...
sqlSyntax = 'SELECT {} FROM {} WHERE {} = %s'.format(column, table, condition_name)
cursor.execute(sqlSyntax, (condition_value,))
Note both the use of .format in the initial sql syntax definition and the use of (condition_value,) in the execute function.

Being that string substitution is frowned upon with forming SQL queries, how do you assign the table name dynamically?

Pretty new to sqlite3, so bear with me here..
I'd like to have a function to which I can pass the table name, and the values to update.
I initially started with something like this:
def add_to_table(table_name, string):
cursor.execute('INSERT INTO {table} VALUES ({var})'
.format(
table=table_name,
var=string)
)
Which works A-OK, but further reading about sqlite3 suggested that this was a terribly insecure way to go about things. However, using their ? syntax, I'm unable to pass in a name to specify the variable.
I tried adding in a ? in place of the table, but that throws a syntax error.
cursor.execute('INSERT INTO ? VALUES (?)', ('mytable','"Jello, world!"'))
>> >sqlite3.OperationalError: near "?": syntax error
Can the table in an sql statement be passed in safely and dynamically?
Its not the dynamic string substitution per-se thats the problem. Its dynamic string substitution with an user-supplied string thats the big problem because that opens you to SQL-injection attacks. If you are absolutely 100% sure that the tablename is a safe string that you control then splicing it into the SQL query will be safe.
if some_condition():
table_name = 'TABLE_A'
else:
table_name = 'TABLE_B'
cursor.execute('INSERT INTO '+ table_name + 'VALUES (?)', values)
That said, using dynamic SQL like that is certainly a code smell so you should double check to see if you can find a simpler alternative without the dynamically generated SQL strings. Additionally, if you really want dynamic SQL then something like SQLAlchemy might be useful to guarantee that the SQL you generate is well formed.
Composing SQL statements using string manipulation is odd not only because of security implications, but also because strings are "dumb" objects. Using sqlalchemy core (you don't even need the ORM part) is almost like using strings, but each fragment will be a lot smarter and allow for easier composition. Take a look at the sqlalchemy wiki to get a notion of what I'm talking about.
For example, using sqlsoup your code would look like this:
db = SQLSoup('sqlite://yourdatabase')
table = getattr(db, tablename)
table.insert(fieldname='value', otherfield=123)
db.commit()
Another advantage: code is database independent - want to move to oracle? Change the connection string and you are done.

How to insert strings with quotes and newlines into sqlite db with Python?

I'm trying to insert strings read from a file into an sqlite database in Python. The strings have whitespace (newline, tab characters, and spaces) and also have appearances of single or double quotes. Here's how I try to do it:
import sqlite3
conn = sqlite3.connect('example.db')
c = conn.cursor()
# Create table
c.execute('''CREATE TABLE test
(a text, b text)''')
f = open("foo", "w")
f.write("hello\n\'world\'\n")
f.close()
testfield = open("foo").read()
# Insert a row of data
c.execute("INSERT INTO test VALUES ('%s', 'bar')" %(testfield))
# Save (commit) the changes
conn.commit()
I find that this fails with the error:
c.execute("INSERT INTO test VALUES ('%s', 'bar')" %(testfield))
sqlite3.OperationalError: near "world": syntax error
How can I achieve this? Do the strings need to be escaped before insertion in the db, and if so how? thanks.
You use SQL parameters instead of string formatting:
c.execute("INSERT INTO test VALUES (?, 'bar')", (testfield,))
When using SQL parameters you let the database library handle the quoting, and even better, give the database to optimize the query and reuse the optimized query plan for multiple executions of the same basic query (with different parameters).
Last but not least, you are much better defended against SQL injection attacks as the database library knows best how to escape dangerous SQL-like values.
To quote the sqlite3 documentation:
Usually your SQL operations will need to use values from Python variables. You shouldn’t assemble your query using Python’s string operations because doing so is insecure; it makes your program vulnerable to an SQL injection attack (see http://xkcd.com/327/ for humorous example of what can go wrong).
Instead, use the DB-API’s parameter substitution. Put ? as a placeholder wherever you want to use a value, and then provide a tuple of values as the second argument to the cursor’s execute() method.

How to quote a string value explicitly (Python DB API/Psycopg2)

For some reasons, I would like to do an explicit quoting of a string value (becoming a part of constructed SQL query) instead of waiting for implicit quotation performed by cursor.execute method on contents of its second parameter.
By "implicit quotation" I mean:
value = "Unsafe string"
query = "SELECT * FROM some_table WHERE some_char_field = %s;"
cursor.execute( query, (value,) ) # value will be correctly quoted
I would prefer something like that:
value = "Unsafe string"
query = "SELECT * FROM some_table WHERE some_char_field = %s;" % \
READY_TO_USE_QUOTING_FUNCTION(value)
cursor.execute( query ) # value will be correctly quoted, too
Is such low level READY_TO_USE_QUOTING_FUNCTION expected by Python DB API specification (I couldn't find such functionality in PEP 249 document). If not, maybe Psycopg2 provides such function? If not, maybe Django provides such function? I would prefer not to write such function myself...
Ok, so I was curious and went and looked at the source of psycopg2. Turns out I didn't have to go further than the examples folder :)
And yes, this is psycopg2-specific. Basically, if you just want to quote a string you'd do this:
from psycopg2.extensions import adapt
print adapt("Hello World'; DROP DATABASE World;")
But what you probably want to do is to write and register your own adapter;
In the examples folder of psycopg2 you find the file 'myfirstrecipe.py' there is an example of how to cast and quote a specific type in a special way.
If you have objects for the stuff you want to do, you can just create an adapter that conforms to the 'IPsycopgSQLQuote' protocol (see pydocs for the myfirstrecipe.py-example...actually that's the only reference I can find to that name) that quotes your object and then registering it like so:
from psycopg2.extensions import register_adapter
register_adapter(mytype, myadapter)
Also, the other examples are interesting; esp. 'dialtone.py' and 'simple.py'.
I guess you're looking for the mogrify function.
Example:
>>> cur.mogrify("INSERT INTO test (num, data) VALUES (%s, %s)", (42, 'bar'))
"INSERT INTO test (num, data) VALUES (42, E'bar')"
You should try to avoid doing your own quoting. Not only will it be DB-specific as people have pointed out, but flaws in quoting are the source of SQL injection bugs.
If you don't want to pass around queries and values separately, then pass around a list of the parameters:
def make_my_query():
# ...
return sql, (value1, value2)
def do_it():
query = make_my_query()
cursor.execute(*query)
(I probably have the syntax of cursor.execute wrong) The point here is that just because cursor.execute takes a number of arguments, that doesn't mean you have to handle them all separately. You can deal with them as one list.
This'll be database dependent (iirc, mysql allows \ as an escape character, while something like oracle expects quotes to be doubled: 'my '' quoted string').
Someone correct me if i'm wrong, but the double-quoting method is the standard method.
It may be worth looking at what other db abstraction libraries do (sqlalchemy, cx_Oracle, sqlite, etc).
I've got to ask - why do you want to inline the values instead of bind them?
This is going to be DB dependent. In the case of MySQLdb, for example, the connection class has a literal method that will convert the value to the correct escaped representation for passing to MySQL (that's what cursor.execute uses).
I imagine Postgres has something similar, but I don't think there is a function to escape values as part of the DB API 2.0 spec.
I don't think you give any sufficient reasoning behind your avoidance to do this The Right Way. Please, use the APi as it is designed and don't try so hard to make your code less readable for the next guy and more fragile.
Your code snippet would get just like this, according to psycopg extension docs
from psycopg2.extensions import adapt
value = "Unsafe string"
query = "SELECT * FROM some_table WHERE some_char_field = %s;" % \
adapt(value).getquoted()
cursor.execute( query ) # value will be correctly quoted, too
The getquoted function returns the value as a quoted and escaped string, so you could also go: "SELECT * FROM some_table WHERE some_char_field = " + adapt(value).getquoted() .
PyPika in another good option for building SQL statements. Usage example (based on an example on the project's homepage):
>>> from pypika import Order, Query
>>> Query.from_('customers').select('id', 'fname', 'lname', 'phone').orderby('id', order=Order.desc)
SELECT "id","fname","lname","phone" FROM "customers" ORDER BY "id" DESC
If you use django you might want to use the quoting function which is automatically adapted to the currently configured DBMS :
from django.db import backend
my_quoted_variable = backend.DatabaseOperations().quote_name(myvar)
import re
def db_quote(s):
return "\"" + re.escape(s) + "\""
can do the job of simple quoting that works at least with MySQL. What we really need, though is cursor.format() function that would work like cursor.execute() except it would return the resulting query instead of executing it. There are times when you do not want the query to be executed quite yet - e.g you may want to log it first, or print it out for debugging before you go ahead with it.

Categories

Resources