Flask and jsonify: Escaping characters - python

I have a flask view which returns some JSON formatted data:
def myview():
    entities = get_my_entities()
    return jsonify({'entities': entities})
entities is a list of dictionaries; each dictionary contains a value like http://example.com/get/<user_id>/12345678, where <user_id> is a placeholder that the user should replace with an identifier they have been given (and which should not appear in the JSON result).
The problem is, the <user_id> gets escaped and appears as %3Cuser_id%3E. Is there a way to stop the characters getting escaped?

Thanks to Martijn. I'm using url_for to build the URLs, and it's doing the escaping.
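A minimal sketch of the effect, using only the standard library. Here quote stands in for the percent-encoding that url_for applies (an assumption for illustration; Flask's own internals are not shown), and unquote is one possible way to turn the placeholder back into literal angle brackets before the value goes into the JSON. The path below is a hypothetical stand-in for the generated URL.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing the literal placeholder.
template = "/get/<user_id>/12345678"

# Percent-encoding turns "<" into %3C and ">" into %3E.
escaped = quote(template)

# Decoding restores the placeholder for display purposes.
restored = unquote(escaped)

print(escaped)
print(restored)
```

Decoding the whole URL is only reasonable when nothing else in it relies on percent-encoding; otherwise it may be safer to build the placeholder part of the string by hand.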

Related

Added escaped quotes to JSON in Flask app with mongoDB

I am trying to create an API for my Flask project. My data is stored in MongoDB, and I am using flask_restful to build the API. The problem is that escaped quotes are added to the JSON; I cannot figure out why, and I would rather have my JSON without them.
This is what my get function looks like:
from flask_restful import Resource
import json
from bson import json_util

class Harvests(Resource):
    def get(self):
        json_docs = []
        for doc in db.collection.find():
            json_doc = json.dumps(doc, default=json_util.default)
            json_docs.append(json_doc)
        return json_docs
In app.py it is registered like this:
api = Api(app)
api.add_resource(Harvests, '/api/harvests')
And I get JSON with escaped quotes (in browser or with curl)
[
"{\"_id\": {\"$oid\": \"5c05429cc4247917d66163a7\"},...
]
If I try this outside Flask (printing the JSON from Mongo), it works just fine. I tried using .replace(), which I don't think is the most elegant solution, and it did not work anyway. Any idea how I can get rid of these backslashes?
What you see is exactly what your code should produce, so I think there is a misunderstanding somewhere. Let me explain what you are doing.
You convert each doc (a data structure) into a jsonified version (a string) of that data. Then you gather these strings in a list. When that list is output later, you of course see a list of strings. Each of these strings contains a jsonified version of a data structure (a dictionary with opening braces, keys, and values inside; each key is itself a quoted string, so these quotes are escaped within the jsonified string).
I recommend collecting your documents into a list and then converting that list to JSON instead:
def get(self):
    docs = []
    for doc in db.collection.find():
        docs.append(doc)
    return json.dumps(docs, default=json_util.default)
This way you get one JSON string representing the list of docs.
Your framework may already be serializing the return value to JSON automatically; in that case, skip this step yourself and return the list directly:
    return docs
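The double encoding described above can be reproduced with the standard library alone. In this sketch the documents are hypothetical stand-ins for what db.collection.find() yields, and the outer json.dumps stands in for whatever serialization the framework applies to the return value (an assumption about the cause, not a statement about flask_restful internals).

```python
import json

# Hypothetical documents standing in for the MongoDB results.
docs = [{"_id": "5c05429cc4247917d66163a7", "crop": "wheat"}]

# Each document is encoded to a string, then the list of strings is
# encoded again: the inner quotes must be escaped with backslashes.
double_encoded = json.dumps([json.dumps(doc) for doc in docs])

# The list of dictionaries is encoded exactly once: no escaped quotes.
single_encoded = json.dumps(docs)

print(double_encoded)
print(single_encoded)
```

Decoding double_encoded gives back a list of strings, while decoding single_encoded gives back the original list of dictionaries.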

Generate a list in a URL with Flask's url_for

I am using webargs to parse parameters from request.args with Marshmallow and pass them as arguments to a Flask view. My client uses a comma separated list to represent multiple values for a key:
/queues/generate?queue_id=4&include_ids=1,2,3
To parse this I use the DelimitedList field from webargs.
from marshmallow import Schema, fields
from webargs import use_args
from webargs.fields import DelimitedList

class GenerationParamsSchema(Schema):
    queue_id = fields.Integer(required=True)
    include_ids = DelimitedList(fields.Integer(), required=False)

@queues.route('/generate_content', methods=['GET'])
@use_args(GenerationParamsSchema(strict=True))
def generate_content_count(generation_params):
    ...
However, if I generate the URL with Flask's url_for, it produces duplicate keys for each value:
url_for('queues.generate', queue_id=4, include_ids=[1, 2, 3])
/queues/generate?queue_id=4&include_ids=1&include_ids=2&include_ids=3
Parsing this with a DelimitedList field only captures the first value. Changing to a List field correctly captures the values again. So either my Flask URLs fail, or my client URLs fail.
I can't change how my client generates URLs, so I'd like to stick with parsing using the DelimitedList field. How can I make url_for generate the same style?
There's no standard for specifying multiple values for a key in a query string. Flask, browsers, and many other web technologies use the "repeat key" style that you see with url_for and request.args. Your client has chosen to use a different style.
If you want url_for to generate the delimited style, you'll need to pre-process the values you pass to url_for. Write a wrapper around url_for and use it instead.
from flask import url_for as _url_for

@app.template_global()
def url_for(endpoint, **values):
    for key, value in values.items():
        if isinstance(value, (tuple, list)):
            # Convert items to str first; join() fails on integers.
            values[key] = ','.join(str(item) for item in value)
    return _url_for(endpoint, **values)
Keep in mind that request.args only understands the repeat key style, so you'll have to parse any incoming comma-separated values yourself, with webargs or otherwise. It may be easier to generate the repeat key style from your client instead.
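The two query string styles can be compared with the standard library alone. This is only a sketch of the idea, not Flask's implementation: urlencode stands in for the URL building, and the parameter names are taken from the question.

```python
from urllib.parse import parse_qs, urlencode

params = {"queue_id": 4, "include_ids": [1, 2, 3]}

# Repeat key style: doseq=True emits one key=value pair per list item.
repeated = urlencode(params, doseq=True)

# Delimited style: join list values first; safe="," keeps the comma literal.
delimited_params = {
    key: ",".join(map(str, value)) if isinstance(value, (list, tuple)) else value
    for key, value in params.items()
}
delimited = urlencode(delimited_params, safe=",")

# A generic parser understands the repeat key style directly...
repeated_ids = parse_qs(repeated)["include_ids"]

# ...but sees the delimited style as one value that must be split by hand.
delimited_ids = parse_qs(delimited)["include_ids"][0].split(",")

print(repeated)
print(delimited)
```

Both styles carry the same three values; the difference is only in which side has to do the splitting.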

Escaping characters for instance query matching in webpy

(The title may be in error here, but I believe that the problem is related to escaping characters)
I'm using webpy to create a VERY simple todo list using peewee with Sqlite to store simple, user submitted todo list items, such as "do my taxes" or "don't forget to interact with people", etc.
What I've noticed is that the DELETE request fails on certain inputs that contain specific symbols. For example, while I can add the following entries to my Sqlite database that contains all the user input, I cannot DELETE them:
what?
test#
test & test
This is a test?
I can DELETE any other user input, with any other symbols, without issues. Here's the webpy error message I get in the browser when I try to DELETE the inputs listed above:
<class 'peewee.UserInfoDoesNotExist'> at /del/test
Instance matching query does not exist: SQL: SELECT "t1"."id", "t1"."title" FROM "userinfo" AS t1 WHERE ("t1"."title" = ?) PARAMS: [u'test']
Python /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/peewee.py in get, line 2598
Web POST http://0.0.0.0:7700/del/test
When I view the database file (called todoUserList.db) in sqlitebrowser, I can see that these entries do exist with the symbols, they're all there.
In my main webpy app script, I'm using a regex in the URL mapping to capture the title for the DELETE request; it looks like this:
urls = (
    '/', 'Index',
    '/del/(.*?)', 'Delete'
)
I've tried variations of the regex, such as '/del/(.*)', but still get the same error, so I don't think that's the problem.
Given the error message above, is webpy not "seeing" certain symbols in the user input because they're not being escaped properly?
Confused as to why it seems to only happen with the select symbols listed above.
Depending on how the URL is escaped, the problem may lie in how "?" and "&" are interpreted: in a typical GET-style request, "?" and "&" are special characters that introduce and separate query string parameters, so they are not treated as part of the path.
Instead of passing those values as part of the URL path itself, you should pass them as an escaped query string. As far as I know, no web server is going to treat unescaped values like that as part of a URL path. If they are escaped and put in the query string (or POST body), you'll be fine.
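A small standard-library sketch of why these titles get cut short and how percent-encoding avoids it. The host and the /del/ prefix are taken from the question; using urlsplit to mimic how a URL is divided into path, query, and fragment is an assumption made for illustration, not a description of webpy's internals. Note that "#" behaves like "?" here: it starts the fragment, which is not part of the path either.

```python
from urllib.parse import quote, unquote, urlsplit

base = "http://0.0.0.0:7700/del/"

# Unescaped, "?" starts the query string and "#" starts the fragment,
# so neither character (nor anything after it) remains in the path.
path_question = urlsplit(base + "what?").path
path_hash = urlsplit(base + "test#").path

# Percent-encoding the title keeps every character inside the path.
titles = ["what?", "test#", "test & test", "This is a test?"]
encoded = [quote(title, safe="") for title in titles]

# Decoding the path segment on the server side recovers the full title.
decoded = [
    unquote(urlsplit(base + segment).path[len("/del/"):])
    for segment in encoded
]

print(path_question, path_hash)
print(encoded)
print(decoded)
```

The same quoting applies if the title is sent in the query string or POST body instead of the path.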

Testing request parameters in Django ("+" behaves differently)

I have a Django View that uses a query parameter to do some content filtering. Something like this:
/page/?filter=one+and+two
/page/?filter=one,or,two
I have noticed that Django converts the + to a space (request.GET.get('filter') returns one and two), and I'm OK with that. I just need to adjust the split() function I use in the View accordingly.
But...
When I try to test this View, and I call:
from django.test import Client
client = Client()
client.get('/page/', {'filter': 'one+and+two'})
request.GET.get('filter') returns one+and+two: with plus signs and no spaces. Why is this?
I would like to think that Client().get() mimics the browser behaviour, so what I would like to understand is why calling client.get('/page/', {'filter': 'one+and+two'}) is not like browsing to /page/?filter=one+and+two. For testing purposes it should be the same in my opinion, and in both cases the view should receive a consistent value for filter: be it with + or with spaces.
What I don't get is why there are two different behaviours.
The plusses in a query string are the normal and correct encoding for spaces. This is a historical artifact; the form value encoding for URLs differs ever so slightly from encoding other elements in the URL.
Django is responsible for decoding the query string back to key-value pairs; that decoding includes decoding the URL percent encoding, where a + is decoded to a space.
When using the test client, you pass in unencoded data, so you'd use:
client.get('/page/', {'filter': 'one and two'})
This is then encoded to a query string for you, and subsequently decoded again when you try to access the parameters.
This is because the test client (actually, RequestFactory) runs django.utils.http.urlencode on your data, resulting in filter=one%2Band%2Btwo. Similarly, if you were to use {'filter': 'one and two'}, it would be converted to filter=one+and+two (urlencode writes spaces as +), and would come into your view with spaces.
If you really absolutely must have the pluses in your query string, I believe it may be possible to manually override the query string with something like: client.get('/page/', QUERY_STRING='filter=one+and+two'), but that just seems unnecessary and ugly in my opinion.
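The round trip can be checked without Django. This sketch uses the standard library's urlencode and parse_qs as stand-ins for the encoding done by the test client and the decoding done when the view reads request.GET (an assumption made for illustration; the Django functions themselves are not called here).

```python
from urllib.parse import parse_qs, urlencode

# Passing literal plus signs as data: they are percent-encoded as %2B,
# so they survive decoding as plus signs.
encoded_plus = urlencode({"filter": "one+and+two"})
decoded_plus = parse_qs(encoded_plus)["filter"][0]

# Passing spaces as data: they are written as "+" in the query string
# and decoded back to spaces.
encoded_space = urlencode({"filter": "one and two"})
decoded_space = parse_qs(encoded_space)["filter"][0]

print(encoded_plus, "->", decoded_plus)
print(encoded_space, "->", decoded_space)
```

In both cases the value that comes out is the value that went in, which is why the test should pass 'one and two' to match what a browser request to /page/?filter=one+and+two delivers.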

In Python Flask, how to access complete raw URL prior to un-escaping

I see Flask provides a few parsed fields in Request; however, the URL is given after escapes have been removed. Is there any way to access the URL before Flask un-escapes it?
For example when a rest client makes request for "http://www.example.com/my_url%20is%20here?arg1=2&?arg2=3", Flask provides me with the request.base_url of "http://www.example.com/my_url is here" where %20 is replaced with spaces. I can quote this myself to get the original URL as someone responded, but preferably I would like to access the original URL as it was sent by the client rather than deriving it.
The fields are not URLs, or even URIs, they are IRIs. Use iri_to_uri:
from werkzeug.urls import iri_to_uri
iri_to_uri(request.url)
From werkzeug/wrappers.py:
"""
...
Note that the string returned might contain unicode characters as the
representation is an IRI not an URI. If you need an ASCII only
representation you can use the :func:`~werkzeug.urls.iri_to_uri`
function:
>>> from werkzeug.urls import iri_to_uri
>>> iri_to_uri(get_current_url(env))
'http://localhost/script/?param=foo'
...
"""
One of the nice things about flask and werkzeug is that you can always follow things through in the source code.
You mean %XX escapes?
from urllib.parse import quote
quote(url)
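One caveat when re-quoting a full URL, sketched with the example address from the question: by default quote only leaves "/" unescaped, so the ":" after the scheme is percent-encoded too. Passing a safe argument avoids that; the choice of safe=":/" here is an assumption that fits this particular URL, and a URL with a query string would need more characters in it.

```python
from urllib.parse import quote

decoded_url = "http://www.example.com/my_url is here"

# Default behaviour: the colon after the scheme is escaped as well.
default_quoted = quote(decoded_url)

# Keeping ":" and "/" literal leaves the scheme and path separators intact.
url_quoted = quote(decoded_url, safe=":/")

print(default_quoted)
print(url_quoted)
```

Either way the result is derived from the decoded value, so it is not guaranteed to match byte for byte what the client originally sent.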
