Python: urllib.urlencode is escaping my stuff *twice*

... but it's not escaping it the same way twice.
I'm trying to upload ASCII output from gpg to a website. So, the bit I've got, so far, just queries the table, shows me the data it got, and then shows it to me after it encodes it for a HTTP POST request:
cnx = connect()
sql = ("SELECT Data FROM SomeTable")
cursor = cnx.cursor()
cursor.execute(sql)
for (data) in cursor:
    print "encoding : %s" % data
    postdata = urllib.urlencode( { "payload" : data } )
    print "encoded as %s" % postdata
... but what I get is:
encoding : -----BEGIN PGP MESSAGE-----
Version: GnuPG v1.4.12 (GNU/Linux)
.... etc...
encoded as payload=%28u%27-----BEGIN+PGP+MESSAGE-----%5CnVersion%3A+GnuPG+v1.4.12+... etc ...
The part to notice is that the newlines aren't getting turned into %0A, like I'd expect. Instead, they're somehow getting escaped into "\n", and then the backslashes are escaped to %5C, so a newline becomes "%5Cn". Even stranger, the data gets prepended with %28u%27, which comes out to "(u'".
Oddly, if I just do a basic test with:
data = "1\n2"
print data
print urllib.urlencode( { "payload" : data } )
I get what I expect, newlines turn into %0A...
1
2
payload=1%0A2
So, my hunch is that the data element returned from the mysql query isn't the same kind of string as my literal "1\n2" (maybe a 1-element dict... dunno), but I don't have the Python kung-fu to know how to inspect it.
Anybody know what's going on, here, and how I can fix it? If not, any suggestions for how to POST this via HTTP with everything getting escaped properly?

Assuming connect() is a function from some DB-API 2.0 compatible database interface (like the built-in sqlite3, or the most popular mysql interface), for (data) in cursor: is iterating Row objects, not strings.
When you print it out, you're effectively printing str(data) (by passing it to a %s format). If you want to encode the same thing, you have to encode str(data).
However, a better way to do it is to handle the rows as rows (of one column) in the first place, instead of relying on str to do what you want.
PS, if you were trying to rely on tuple unpacking to make data the first element of each row, you're doing it wrong:
for (data) in cursor:
… is identical to:
for data in cursor:
If you want a one-element tuple, you need a comma:
for data, in cursor:
(You can also add the parens if you want, but they still don't make a difference either way.)
Specifically, iterating the cursor calls the optional __iter__ method, which returns the cursor itself, and then loops, calling the next method on it (which does the same thing as calling fetchone()) until the result set is exhausted. fetchone is documented to return "a single sequence", whose type isn't defined. In most implementations, that's a special row type, like sqlite3.Row, which can be accessed as if it were a tuple but has special semantics for things like printing in tabular format, allowing by-name access, etc.
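To make the point concrete, here is a minimal sketch, using Python 3's urllib.parse.urlencode (the successor to Python 2's urllib.urlencode) and a plain list of 1-tuples standing in for a DB-API cursor:

```python
from urllib.parse import urlencode  # Python 2: urllib.urlencode

# A DB-API cursor yields one sequence per row, even for a
# single-column SELECT; simulate that with a list of 1-tuples.
rows = [("-----BEGIN PGP MESSAGE-----\nVersion: GnuPG v1.4.12\n",)]

for data, in rows:  # the trailing comma unpacks the 1-tuple into a string
    postdata = urlencode({"payload": data})
    print(postdata)  # newlines encode as %0A, not %5Cn
```

With the comma, data is the actual string from the column, so the newlines encode as %0A instead of the row's repr being encoded.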

Related

Best practises when inserting a json variable into a MySQL table column of type json, using Python's pymysql library

I have a Python script, that's using PyMySQL to connect to a MySQL database, and insert rows in there. Some of the columns in the database table are of type json.
I know that in order to insert a json, we can run something like:
my_json = {"key" : "value"}
cursor = connection.cursor()
insert_query = """INSERT INTO my_table (my_json_column) VALUES ('%s')""" % (json.dumps(my_json))
cursor.execute(insert_query)
connection.commit()
The problem in my case is that the JSON is a variable over which I do not have much control (it's coming from an API call to a third-party endpoint), so my script keeps throwing new errors for non-valid JSON variables.
For example, the json could very well contain a stringified json as a value, so my_json would look like:
{"key": "{\"key_str\":\"val_str\"}"}
→ In this case, running the usual insert script would throw a [ERROR] OperationalError: (3140, 'Invalid JSON text: "Missing a comma or \'}\' after an object member." at position 1234 in value for column \'my_table.my_json_column\'.')
Or another example are json variables that contain a single quotation mark in some of the values, something like:
{"key" : "Here goes my value with a ' quotation mark"}
→ In this case, the usual insert script returns an error similar to the below one, unless I manually escape those single quotation marks in the script by replacing them.
[ERROR] ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'key': 'Here goes my value with a ' quotation mark' at line 1")
So my question is the following:
Are there any best practices that I might be missing, and that I can use in order to keep my script from breaking in the 2 scenarios mentioned above, but also in any other potential examples of JSONs that might break the insert query?
I read some existing posts like this one here or this one, where it's recommended to insert the json into a string or a blob column, but I'm not sure if that's a good practice / if other issues (like string length limitations for example) might arise from using a string column instead of json.
Thanks !
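For illustration, a parameterized query sidesteps both failure modes described above, because the driver handles all quoting and json.dumps always emits valid JSON text. A minimal sketch, using the stdlib sqlite3 as a stand-in for pymysql (pymysql uses %s placeholders instead of ?):

```python
import json
import sqlite3  # stand-in for pymysql here; pymysql uses %s placeholders

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (my_json_column TEXT)")

# Both problem payloads from the question:
payloads = [
    {"key": "{\"key_str\":\"val_str\"}"},                   # stringified JSON value
    {"key": "Here goes my value with a ' quotation mark"},  # embedded single quote
]

for p in payloads:
    # The placeholder handles the quoting, so nothing needs manual escaping.
    conn.execute("INSERT INTO my_table (my_json_column) VALUES (?)",
                 (json.dumps(p),))
conn.commit()

for (stored,) in conn.execute("SELECT my_json_column FROM my_table"):
    print(json.loads(stored))
```

Both payloads round-trip unchanged, with no manual escaping of quotes or nested JSON.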

What do SQL queries Parameterized with '%s' look like?

Consider a particular SQL query in the form
cursor.execute(string, array)
Where string is some string containing '%s' and array is some array satisfying len(array) == string.count("%s"), not necessarily containing only strings.
For example:
cursor.execute("INSERT INTO tablename(col_one, col_two, col_three) VALUES (%s,%s,%s)",("text", 123, datetime.time(12,0)))
When I run this, I get an unhelpful error message about 'You have an error in your SQL syntax...' and then a partial text of the query. However, to debug this, I want to know the full text of the query.
When the query cursor.execute(string, array) is run, what is the actual text of the query that the cursor executes?
As you can read here:
Syntax:
cursor.execute(operation, params=None, multi=False)
iterator = cursor.execute(operation, params=None, multi=True)
This method executes the given database operation (query or command).
The parameters found in the tuple or dictionary params are bound to
the variables in the operation. Specify variables using %s or %(name)s
parameter style (that is, using format or pyformat style). execute()
returns an iterator if multi is True.
So when you use %s, it will replace that value with the one in the params list.
In case you want to debug your statement, you can print the last executed query with cursor._last_executed:
try:
    cursor.execute(sql, (arg1, arg2))
    connection.commit()
except:
    print("Error: " + cursor._last_executed)
    raise
finally:
    print(cursor._last_executed)
source
Your string is actually your parameterized query; you pass your elements to fill in each %s.
You can find examples in the MySQL documentation.
Note in there that the parameters are passed not in an array but in a tuple.
Your example becomes :
cursor.execute("INSERT INTO tablename(col_one, col_two, col_three) VALUES (%s,%s,%s)", ('text', 123, datetime.time(12,0)))
I also changed your " to ' as I doubt it liked it too much.
I'm also never sure of the date format, try without a date if you still have trouble (then fix the date format if needed).
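As a rough illustration of what the driver does with %s placeholders, here is a toy renderer. The quote helper is a hypothetical simplification of what e.g. pymysql's cursor.mogrify() produces; real drivers escape far more carefully:

```python
import datetime

def render(query, params):
    """Toy stand-in for a driver's parameter substitution (cf. pymysql's
    cursor.mogrify()); real drivers escape much more thoroughly."""
    def quote(v):
        if isinstance(v, str):
            return "'" + v.replace("'", "''") + "'"
        if isinstance(v, (datetime.date, datetime.time, datetime.datetime)):
            return "'" + str(v) + "'"
        return str(v)  # numbers pass through unquoted
    return query % tuple(quote(p) for p in params)

print(render(
    "INSERT INTO tablename(col_one, col_two, col_three) VALUES (%s,%s,%s)",
    ("text", 123, datetime.time(12, 0)),
))
# INSERT INTO tablename(col_one, col_two, col_three) VALUES ('text',123,'12:00:00')
```

Strings and dates come out quoted, numbers do not, which is roughly the text the server ends up seeing.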

Python strings - strange replacing behavior

I have a function, where I get a string as parameter. I want to save this string to a database. So I have a command like:
sql_command = """INSERT INTO some_table(some_text_row) VALUE (
'{0}');""".format(some_text)
But the parameter can contain characters like '. So I need to replace this sort of characters. I do this with this function:
some_text = given_parameter.replace("'", r"\'")
But now comes the strange behavior: Sometimes, I get a result of \\' and sometimes I get a result of \'. I want to have the second one.
To give you more information: the given_parameter is the HTML code of a webpage. I get the HTML code from the library called requests.
Does anyone have some tips?
Don't construct the query using string formatting - this is unsafe, you are making it vulnerable to SQL injections.
Instead, parameterize the query and let the mysql driver worry about quotes:
sql_command = """
    INSERT INTO
        some_table(some_text_row)
    VALUES
        (%s)"""
cursor.execute(sql_command, (some_text, ))
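A runnable version of the same idea, with the stdlib sqlite3 standing in for the MySQL driver (sqlite3 uses ? where MySQL drivers use %s), showing that quote-heavy HTML needs no manual escaping at all:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (some_text_row TEXT)")

# Text full of characters that break naive string formatting:
some_text = """<a href='x'>it's "quoted" and back\\slashed</a>"""
conn.execute("INSERT INTO some_table (some_text_row) VALUES (?)", (some_text,))
conn.commit()

(stored,) = conn.execute("SELECT some_text_row FROM some_table").fetchone()
print(stored == some_text)  # True: the driver quoted everything for us
```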

Python MySQL insert and retrieve a list in Blob

I'm trying to insert a list of element into a MySQL database (into a Blob column). This is an example of my code is:
myList = [1345,22,3,4,5]
myListString = str(myList)
myQuery = 'INSERT INTO table (blobData) VALUES (%s)'
cursor.execute(myQuery, (myListString,))
Everything works fine and I have my list stored in my database. But, when I want to retrieve my list, because it's now a string I have no idea how to get a real integer list instead of a string.
For example, if now i do :
myQuery = 'SELECT blobData FROM db.table'
cursor.execute(myQuery)
myRetrievedList = cursor.fetch_all()
print myRetrievedList[0]
I ll get :
[
instead of :
1345
Is there any way to transform my string [1345,22,3,4,5] into a list ?
You have to pick a data format for your list, common solutions in order of my preference are:
json -- fast, readable, allows nested data, very useful if your table is ever used by any other system; loading checks that the blob is in a valid format. Use json.dumps() and json.loads() to convert to and from the string/blob representation.
repr() -- fast, readable, works across Python versions; unsafe if someone gets into your db. Use repr() and eval() to get data to and from string/blob format.
pickle -- fast, unreadable, does not work across multiple architectures (afaik); does not check if the blob is truncated. Use cPickle.dumps(..., protocol=cPickle.HIGHEST_PROTOCOL) and cPickle.loads(...) to convert your data.
As per the comments in this answer, the OP has a list of lists being entered as the blob field. In that case, the JSON seems a better way to go.
import json
...
...
myRetrievedList = cursor.fetch_all()
jsonOfBlob = json.loads(myRetrievedList)
integerListOfLists = []
for oneList in jsonOfBlob:
    listOfInts = [int(x) for x in oneList]
    integerListOfLists.append(listOfInts)
return integerListOfLists  # or print, or whatever
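The JSON round trip from the first suggestion, as a self-contained sketch:

```python
import json

myList = [1345, 22, 3, 4, 5]

blob = json.dumps(myList)    # the string to store in the BLOB/TEXT column
restored = json.loads(blob)  # decode it again after SELECTing it back

print(blob)                  # [1345, 22, 3, 4, 5]
print(restored == myList)    # True: real ints, not characters of a string
```

Unlike str()/eval(), this stays safe if the stored data ever comes from an untrusted source, and any other system reading the table can parse it too.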

"ValueError: Unsupported format character ' " ' (0x22) at..." in Python / String

I've seen a couple similar threads, but attempting to escape characters isn't working for me.
In short, I have a list of strings, which I am iterating through, such that I am aiming to build a query that incorporates however many strings are in the list, into a 'Select, Like' query.
Here is my code (Python)
def myfunc(self, cursor, var_list):
    query = "Select var FROM tble_tble WHERE"
    substring = []
    length = len(var_list)
    iter = length
    for var in var_list:
        if (iter != length):
            substring.append(" OR tble_tble.var LIKE %'%s'%" % var)
        else:
            substring.append(" tble_tble.var LIKE %'%s'%" % var)
        iter = iter - 1
    for str in substring:
        query = query + str
    ...
That should be enough. If it wasn't obvious from my previously stated claims, I am trying to build a query which runs the SQL 'LIKE' comparison across a list of relevant strings.
Thanks for your time, and feel free to ask any questions for clarification.
First, your problem has nothing to do with SQL. Throw away all the SQL-related code and do this:
var = 'foo'
" OR tble_tble.var LIKE %'%s'%" % var
You'll get the same error. It's because you're trying to do %-formatting with a string that has stray % signs in it. So, it's trying to figure out what to do with %', and failing.
You can escape these stray % signs like this:
" OR tble_tble.var LIKE %%'%s'%%" % var
However, that probably isn't what you want to do.
First, consider using {}-formatting instead of %-formatting, especially when you're trying to build formatted strings with % characters all over them. It avoids the need for escaping them. So:
" OR tble_tble.var LIKE %'{}'%".format(var)
But, more importantly, you shouldn't be doing this formatting at all. Don't format the values into a SQL string, just pass them as SQL parameters. If you're using sqlite3, use ? parameter markers; for MySQL, %s; for a different database, read its docs. So:
" OR tble_tble.var LIKE %'?'%"
There's nothing that can go wrong here, and nothing that needs to be escaped. When you call execute with the query string, pass [var] as the args.
This is a lot simpler, and often faster, and neatly avoids a lot of silly bugs dealing with edge cases, and, most important of all, it protects against SQL injection attacks.
The sqlite3 docs explain this in more detail:
Usually your SQL operations will need to use values from Python variables. You shouldn’t assemble your query using Python’s string operations… Instead, use the DB-API’s parameter substitution. Put ? as a placeholder wherever you want to use a value, and then provide a tuple of values as the second argument to the cursor’s execute() method. (Other database modules may use a different placeholder, such as %s or :1.) …
Finally, as others have pointed out in comments, with LIKE conditions, you have to put the percent signs inside the quotes, not outside. So, no matter which way you solve this, you're going to have another problem to solve. But that one should be a lot easier. (And if not, you can always come back and ask another question.)
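Putting the answer's advice together, here is a runnable sketch with sqlite3 (? placeholders; a MySQL driver would use %s), where the % wildcards go inside the parameter values rather than into the SQL text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tble_tble (var TEXT)")
conn.executemany("INSERT INTO tble_tble VALUES (?)",
                 [("apple pie",), ("banana",), ("pineapple",)])

var_list = ["apple", "nan"]
# One placeholder per search term, joined with OR; the LIKE wildcards
# live inside the parameter values, so nothing needs escaping.
query = ("SELECT var FROM tble_tble WHERE "
         + " OR ".join("var LIKE ?" for _ in var_list))
params = ["%" + v + "%" for v in var_list]

print(sorted(row[0] for row in conn.execute(query, params)))
# ['apple pie', 'banana', 'pineapple']
```

This builds the variable-length OR chain the question asked for, with no %-formatting and no injection risk.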
You need to escape the % signs, and change the placement of the quotes so that both % wildcards end up inside them, to generate proper SQL:
" OR tble_tble.var LIKE '%%%s%%'"
For example:
var = "abc"
print " OR tble_tble.var LIKE '%%%s%%'" % var
It will be translated to:
OR tble_tble.var LIKE '%abc%'
This is an old question so here is what I had to do to make this work with recent releases of all software mentioned above:
citp = "SomeText" + "%%" # if your LIKE wants database rows that match start text, else ...
citp = "%%" + "SomeQueryText" + "%%"
chek_copies = 'SELECT id, code, null as num from indicator WHERE code LIKE "%s" AND owner = 1 ;'
check_copies = (chek_copies % (citp))
copies_checked = pd.read_sql(check_copies, con=engine)
Works like a charm - but what a load of trial and error
