I have a python script that reads raw movie text files into an sqlite database.
I use re.escape(title) to add escape chars into the strings to make them db safe before executing the inserts.
Why does this not work:
In [16]: c.execute("UPDATE movies SET rating = '8.7' WHERE name='\'Allo\ \'Allo\!\"\ \(1982\)'")
---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
/home/rajat/Dropbox/amdb/<ipython console> in <module>()
OperationalError: near "Allo": syntax error
Yet this works (removed \' in two places) :
In [17]: c.execute("UPDATE movies SET rating = '8.7' WHERE name='Allo\ Allo\!\"\ \(1982\)'")
Out[17]: <sqlite3.Cursor object at 0x9666e90>
I can't figure it out. I also can't ditch those leading quotes because they're actually part of the movie title.
Thank you.
You're doing it wrong. Literally. You should be using parameters, like this:
c.execute("UPDATE movies SET rating = ? WHERE name = ?", (8.7, "'Allo 'Allo! (1982)"))
Like that, you won't need to do any quoting at all and (if those values are coming from anyone untrusted) you'll be 100% safe (here) from SQL injection attacks too.
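Here is a minimal, self-contained sketch of that approach. It assumes an in-memory database and a two-column movies table; the schema is a guess for illustration, not taken from the question.

```python
import sqlite3

# Assumption: a minimal schema; the real table likely has more columns.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE movies (name TEXT, rating REAL)")

title = "'Allo 'Allo! (1982)"  # the leading apostrophes are part of the title

# The ? placeholders carry the values separately from the SQL text,
# so the title needs no quoting or escaping at all.
c.execute("INSERT INTO movies (name, rating) VALUES (?, ?)", (title, None))
c.execute("UPDATE movies SET rating = ? WHERE name = ?", (8.7, title))
conn.commit()

stored = c.execute("SELECT name, rating FROM movies").fetchall()
print(stored)
```

The title is stored exactly as it was read, apostrophes included.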
"I use re.escape(title) to add escape chars into the strings to make them db safe"
Note that re.escape makes a string re-safe -- nothing to do with making it db safe. Rather, as @Donal says, what you need is the parameter substitution concept of the Python DB API -- that makes things "db safe" as you need.
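A small sketch of what re.escape actually produces, using the title from the question. Exactly which characters get a backslash varies between Python versions, so the example checks properties rather than the exact text.

```python
import re

title = "'Allo 'Allo! (1982)"
pattern = re.escape(title)

# The escaped text is a regex pattern that matches the title literally...
matches_itself = re.fullmatch(pattern, title) is not None

# ...but it is a different string, so storing it would corrupt the data.
changed = pattern != title

print(pattern)
print(matches_itself, changed)
```

The backslashes only mean something to the regex engine; SQLite would treat them as ordinary characters in the title.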
SQLite doesn't support backslash escape sequences. Apostrophes in string literals are indicated by doubling them: '''Allo ''Allo! (1982)'.
But, like Donal said, you should be using parameters.
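To see the doubling rule in isolation, here is a quick sketch against an in-memory SQLite database. It is only meant to illustrate the literal syntax, not as a replacement for parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Doubled apostrophes inside a SQL string literal stand for one apostrophe.
doubled = conn.execute("SELECT '''Allo ''Allo! (1982)'").fetchone()[0]

# A backslash is an ordinary character to SQLite, not an escape.
backslash = conn.execute(r"SELECT 'a\b'").fetchone()[0]

print(doubled)
print(len(backslash))
```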
I've one simple tip you could use to handle this problem:
When your SQL statement string contains a single quote ('), you can enclose the statement string in double quotes. And when it contains double quotes ("), you can enclose it in single quotes.
E.g.
sqlString = "UPDATE movies SET rating = '8.7' WHERE name = 'Allo Allo! (1982)'"
c.execute(sqlString)
Or,
sqlString = 'UPDATE movies SET rating = "8.7" WHERE name = "Allo Allo! (1982)"'
c.execute(sqlString)
This solution works for me in Python environment.
I have a Python program that inserts data into a table with an INSERT INTO statement. The data comes from web scraping and contains both single and double quotes. MySQL can store both kinds of quotes in a table, so the error does not come from the database; it appears when I use that data in Python.
Whether I use single or double quotes around the values in the INSERT INTO statement string, the error appears either way, because the data itself contains single or double quotes. I use MySQL with Connector/Python, and my script imports mysql. I hope that is clear; sorry about my bad English.
Most likely explanation for the behavior is a SQL Injection vulnerability. (That's just a guess because we are speculating about code we haven't seen; only a description of the behavior.)
The short answer is to use prepared statements with bind placeholders:
https://pynative.com/python-mysql-execute-parameterized-query-using-prepared-statement/
If for some reason that is not possible, then at a bare minimum, any potentially unsafe values included in SQL text must be properly escaped to make them safe for inclusion.
(The single quote in Little Bobby Tables https://xkcd.com/327/ is not escaped.)
As an example, this SQL will throw an error, because the second single quote ends the string literal, and what follows the end of the string literal ("s wrong") is gibberish in terms of SQL:
INSERT INTO mytab (mycol) VALUES ( 'It's wrong' )
                                      ^
But this will work:
INSERT INTO mytab (mycol) VALUES ( 'It''ll work' )
                                      ^^
Because the single quote within the string literal is escaped by preceding it with another single quote.
The OWASP project provides a good overview of SQL Injection.
https://www.owasp.org/index.php/SQL_Injection
https://www.owasp.org/index.php/SQL_Injection_Prevention_Cheat_Sheet
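The difference between the two approaches can be reproduced without a MySQL server. The sketch below uses the standard-library sqlite3 module as a stand-in, which is an assumption made only so the example runs anywhere; with Connector/Python the placeholder is written %s instead of ?, but the idea is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytab (mycol TEXT)")

value = "It's wrong"  # stands in for scraped data containing a quote

# Pasting the value into the SQL text ends the literal at the apostrophe.
try:
    conn.execute("INSERT INTO mytab (mycol) VALUES ('" + value + "')")
    concat_failed = False
except sqlite3.OperationalError:
    concat_failed = True

# Binding the value keeps it out of the SQL text entirely.
conn.execute("INSERT INTO mytab (mycol) VALUES (?)", (value,))
rows = conn.execute("SELECT mycol FROM mytab").fetchall()

print(concat_failed, rows)
```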
I'm looking to escape special characters in string for Python 2.7.
For example, if I have :
str = "You're the best "dog" on earth."
I would have :
str = "You\'re the best \"dog\" on earth."
I want it because I'm inserting strings in SQL database using pymySQL and I can't find a way to do this.
I guess the escaped characters must look like this? (Not really sure.)
I would also like a way to do the reverse: removing the escape characters.
You are approaching this entirely the wrong way. You should never need to escape special characters when inserting a string into a SQL database: always use parametrised SQL queries and any needed escaping will be done for you. If you start trying to escape the strings yourself you are opening your code up to all manner of security problems.
with connection.cursor() as cursor:
    # Create a new record
    sql = "INSERT INTO `mytable` (`thestring`) VALUES (%s)"
    cursor.execute(sql, (str,))
If you ever find yourself building a query string out of data that has come from any outside source stop and reconsider: you should never need to do that.
You don't need to escape values for the purpose of SQL by hand! Let the database API take care of that.
Form a valid string literal in Python source code:
str = "You're the best \"dog\" on earth."
str = 'You\'re the best "dog" on earth.'
str = """You're the best "dog" on earth."""
These are all equivalent, you just need to escape the appropriate quotes that you're using as string literal terminators.
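A quick check that the three spellings really are the same string. The variable names here are arbitrary, chosen to avoid shadowing the built-in str.

```python
a = "You're the best \"dog\" on earth."
b = 'You\'re the best "dog" on earth.'
c = """You're the best "dog" on earth."""

# All three literals produce identical string contents;
# the backslashes exist only in the source code, not in the value.
same = a == b == c
has_backslash = "\\" in a

print(a)
print(same, has_backslash)
```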
Use the database API correctly and don't worry about escaping. From the manual:
sql = "INSERT INTO `users` (`email`, `password`) VALUES (%s, %s)"
cursor.execute(sql, ('webmaster@python.org', 'very-secret'))
Escaping is handled by separating the query and values, not by adding backslashes.
I know that variants of this topic have been discussed elsewhere, but none of the other threads were helpful.
I want to hand over a string from python to sql. It might however happen that apostrophes (') occur in the string. I want to escape them with a backslash.
sql = "update tf_data set authors=\'"+(', '.join(authors).replace("\'","\\\'"))+"\' where tf_data_id="+str(tf_data_id)+";"
However, this will always give \\' in my string. Therefore, the backslash itself is escaped and the sql statement doesn't work.
Can someone help me or give me an alternative to the way I am doing this?
Thanks
Simply don't.
Also don't concatenate sql queries as these are prone to sql injections.
Instead, use a parameterized query:
sql = "update tf_data set authors=%(authors)s where tf_data_id=%(data_id)s"
# or :authors and :data_id, I get confused with all those sql dialects out there
authors = ', '.join(authors)
data_id = str(tf_data_id)
# db or whatever your db instance is called
db.execute(sql, {'authors': authors, 'data_id': data_id})
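Which placeholder style applies depends on the driver's paramstyle (see PEP 249). As a runnable sketch, here is the same update with the :name style, using sqlite3 as a stand-in for the real database; the table layout and the author names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: a minimal version of the tf_data table.
conn.execute("CREATE TABLE tf_data (tf_data_id INTEGER PRIMARY KEY, authors TEXT)")
conn.execute("INSERT INTO tf_data (tf_data_id, authors) VALUES (1, '')")

authors = ["O'Brien", "D'Arcy"]  # hypothetical names containing apostrophes
tf_data_id = 1

# Named placeholders; the values travel separately from the SQL text.
sql = "UPDATE tf_data SET authors = :authors WHERE tf_data_id = :data_id"
conn.execute(sql, {"authors": ", ".join(authors), "data_id": tf_data_id})

stored = conn.execute(
    "SELECT authors FROM tf_data WHERE tf_data_id = 1"
).fetchone()[0]
print(stored)
```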
You're using double-quoted strings, but still escaping the single quotes within them. That's not required, all you need to do is escape the backslash that you want to use in the replace operation.
>>> my_string = "'Hello there,' I said."
>>> print(my_string)
'Hello there,' I said.
>>> print(my_string.replace("'", "\\'"))
\'Hello there,\' I said.
Note that I'm using print. If you just ask Python to show you its representation of the string after the replace operation, you'll see double backslashes because they need to be escaped.
>>> my_string.replace("'", "\\'")
"\\'Hello there,\\' I said."
As others have alluded to, if you are using a Python package to execute your SQL, use the provided methods with parameter placeholders (if available).
My answer addresses the escaping issues mentioned.
Use a String literal with prefix r
print(r"""the\quick\fox\\\jumped\'""")
Output:
the\quick\fox\\\jumped\'
I have a little script that creates a certain INSERT SQL statement for me.
For postgresql I need to wrap the values to be inserted within two single quotes.
Unfortunately some of the value strings to be inserted also contain a single quote, and I need to escape them automatically.
for line in f:
    out.write('(\'' + line[:2] + '\', \'' + line[3:-1] + '\'),\n')
How can I make sure that any single quote (e.g. ' ) inside line[3:-1] is automatically escaped?
Thanks,
UPDATE:
e.g. the line
CI|Cote D'ivoire
fails due to the '
Update 2:
I can't use double quotes in values, e.g.
INSERT INTO "App_country" (country_code, country_name) VALUES ("AF", "Afghanistan")
I get the error message: ERROR: column "AF" does not exist
This however works fine:
INSERT INTO "App_country" (country_code, country_name) VALUES ('AF', 'Afghanistan')
As described in PEP 249, the DB-API is a generic interface to various databases. Different implementations exist for different databases. For Postgres there is psycopg. From the docs:
>>> cur.execute(
...     """INSERT INTO some_table (an_int, a_date, a_string)
...         VALUES (%s, %s, %s);""",
...     (10, datetime.date(2005, 11, 18), "O'Reilly"))
You simply pass your parameters in a tuple. The underlying library escapes them for you. This is much safer and easier than trying to roll your own.
The SQL standard way to escape a quote is to double it:
'This won''t be a problem.'
So replace every quote with two quotes (and use double quotes in Python to stay sane):
out.write("('" + line[:2] + "', '" + line[3:-1].replace("'", "''") + "'),\n")
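If you do have to generate the SQL text yourself, the doubling can be wrapped in a small helper. This is only a sketch: sql_quote, the country table, and the sample lines are made up for the example, and it handles single quotes only. The sanity check at the end uses SQLite, which follows the same doubling rule.

```python
import sqlite3

def sql_quote(value):
    """Return value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"

lines = ["CI|Cote D'ivoire\n", "AF|Afghanistan\n"]  # sample input lines

rows = []
for line in lines:
    code, name = line.rstrip("\n").split("|", 1)
    rows.append("(" + sql_quote(code) + ", " + sql_quote(name) + ")")

statement = (
    "INSERT INTO country (country_code, country_name) VALUES\n"
    + ",\n".join(rows)
)
print(statement)

# Sanity check: the generated statement parses and round-trips the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (country_code TEXT, country_name TEXT)")
conn.execute(statement)
names = [
    r[0]
    for r in conn.execute(
        "SELECT country_name FROM country ORDER BY country_code"
    )
]
```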
Never use generated, roll-your-own escaping for DML. Use the appropriate DB-API as Keith has mentioned. Work has gone into it to make sure that escaping of values from various sources and type conversion happen almost transparently. If you're generating DDL such as CREATE TABLE whatever (...), you can be slightly more slack-handed if you trust your own data source.
Using the data shown in the example:
import sqlite3
text = "CI|Cote D'ivoire"  # no escaping needed in a double-quoted literal; data read from another source needs none either
code, name = text.split('|', 1)
db = sqlite3.connect(':memory:')
db.execute('create table something(code, name)')
db.execute('insert into something(code, name) values(?, ?)', (code, name))
for row in db.execute('select * from something'):
    print(row)
# ('CI', "Cote D'ivoire") on Python 3; Python 2 shows (u'CI', u"Cote D'ivoire")
For a complete solution to add escape characters to a string, use:
re.escape(string)
>>> re.escape('\ a.*$')
'\\\\\\ a\\.\\*\\$'
For more, see: http://docs.python.org/library/re.html
Not sure if there are some SQL related limitations, but you could always use double quotes to surround your string that contains the single quote.
Eg.
print "That's all Folks!"
or single quotes to surround double quotes:
print 'The name of the file is "rosebud".'
This string:
"CREATE USER %s PASSWORD %s", (user, pw)
always gets expanded to:
CREATE USER E'someuser' PASSWORD E'somepassword'
Can anyone tell me why?
Edit:
The expanded string above is the string my database gives me back in the error message. I'm using psycopg2 to access my postgres database. The real code looks like this:
conn=psycopg2.connect(user=adminuser, password=adminpass, host=host)
cur = conn.cursor()
#user and pw are simple standard python strings the function gets as parameter
cur.execute("CREATE USER %s PASSWORD %s", (user, pw))
conn.commit()
To pass identifiers to PostgreSQL through psycopg, use AsIs from the extensions module:
from psycopg2.extensions import AsIs
import psycopg2
connection = psycopg2.connect(database='db', user='user')
cur = connection.cursor()
cur.mogrify(
'CREATE USER %s PASSWORD %s', (AsIs('someuser'), AsIs('somepassword'))
)
'CREATE USER someuser PASSWORD somepassword'
That works also for passing conditions to clauses like order by:
cur.mogrify(
'select * from t order by %s', (AsIs('some_column, another column desc'),)
)
'select * from t order by some_column, another column desc'
As the OP's edit reveals he's using PostgreSQL, the docs for it are relevant, and they say:
PostgreSQL also accepts "escape" string constants, which are an extension to the SQL standard. An escape string constant is specified by writing the letter E (upper or lower case) just before the opening single quote, e.g. E'foo'.
In other words, psycopg is correctly generating escape string constants for your strings, so that, as the docs also say:
Within an escape string, a backslash character (\) begins a C-like backslash escape sequence, in which the combination of backslash and following character(s) represents a special byte value.
(These happen to be the escape conventions of non-raw Python string literals as well.)
The OP's error clearly has nothing to do with that, and, besides the excellent idea of studying PostgreSQL's excellent docs, he should not worry about that E'...' form in this case;-).
Not only the E but the quotes appear to come from whatever type user and pw have. %s simply does what str() does, which may fall back to repr(), both of which have corresponding methods __str__ and __repr__. Also, that isn't the code that generates your result (I'd assumed there was a %, but now see only a comma). Please expand your question with actual code, types and values.
Addendum: Considering that it looks like SQL, I'd hazard a guess that you're seeing escape string constants, likely properly generated by your database interface module or library.
Before attempting something like:
statement = "CREATE USER %s PASSWORD %s" % (user, pw)
Please ensure you read: http://www.initd.org/psycopg/docs/usage.html
Basically the issue is that if you are accepting user input (I assume so as someone is entering in the user & pw) you are likely leaving yourself open to SQL injection.
As the psycopg2 documentation states:
Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
As has been identified, Postgres (or psycopg2) doesn't seem to provide a good answer to escaping identifiers. In my opinion, the best way to resolve this is to provide a 'whitelist' filtering method.
That is, identify which characters are allowed in a 'user' and a 'pw' (perhaps A-Za-z0-9_). Be careful not to include special characters (', ;, etc.), or, if you do, make sure you escape them.
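A minimal sketch of such a whitelist check follows. The function name is_safe_identifier and the exact character set are assumptions for illustration; adjust the pattern to whatever your naming rules allow.

```python
import re

# Assumption: only ASCII letters, digits and underscore are allowed,
# and the name must not start with a digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_safe_identifier(name):
    """Return True when name consists solely of whitelisted characters."""
    return _IDENTIFIER.fullmatch(name) is not None

print(is_safe_identifier("someuser"))
print(is_safe_identifier("bob'; DROP TABLE users; --"))
```

Using fullmatch rather than match means the whole string has to conform, so a valid prefix followed by something nasty is rejected.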