modify flask url before routing - python

My Flask app has url routing defined as
self.add_url_rule('/api/1/accounts/<id_>', view_func=self.accounts, methods=['GET'])
Problem is one of the application making queries to this app adds additional / in url like /api/1//accounts/id. It's not in my control to correct the application which makes such queries so I cant change it.
To resolve this problem currently I have added multiple rules
self.add_url_rule('/api/1/accounts/<id_>', view_func=self.accounts, methods=['GET'])
self.add_url_rule('/api/1//accounts/<id_>', view_func=self.accounts, methods=['GET'])
There are number of such routes and it's ugly workaround. Is there a way in flask to modify URL before it hits the routing logic?

I'd normalise the path before it gets to Flask, either by having the HTTP server that hosts the WSGI container or a proxy server that sits before your stack do this, or by using WSGI middleware.
The latter is easily written:
import re
from functools import partial
class PathNormaliser(object):
_collapse_slashes = partial(re.compile(r'/+').sub, r'/')
def __init__(self, application):
self.application = application
def __call__(self, env, start_response):
env['PATH_INFO'] = self._collapse_slashes(env['PATH_INFO'])
return self.application(env, start_response)
You may want to log that you are applying this transformation, together with diagnostic information like the REMOTE_HOST and HTTP_USER_AGENT entries. Personally, I'd force that specific application to generate non-broken URLs as soon as possible.
Look at your WSGI server documentation to see how to add in extra WSGI middleware components.

Related

Webob / Pyramid query string parameters out of order upon receipt

I am running Pyramid as my API server. Recently we started getting query string parameters out of order when handed to the RESTful API server. For example, a GET to /v1/finishedGoodRequests?exact=true&id=39&join=OR&exact=false&name=39
is logged by the RESTful api module upon init as request.url:
v1/finishedGoodRequests?join=OR&name=39&exact=true&exact=false&id=39
with request.query_string: join=OR&name=39&exact=true&exact=false&id=39
I process the query params in order to qualify the search, in this case id exactly 39 or 39 anywhere in the name. What kind of possible server setting or bug could have crept in to the server code to cause such a thing? It is still a MultiDict...
As a simple example, this works fine for me, and the MultiDict has always preserved the order and so I suspect something is getting rewritten by something you're using in your stack.
from pyramid.config import Configurator
from pyramid.view import view_config
from waitress import serve
#view_config(renderer='json')
def view(request):
return list(request.GET.items())
config = Configurator()
config.scan(__name__)
app = config.make_wsgi_app()
serve(app, listen='127.0.0.1:8080')
$ curl http://localhost:8080\?join=OR\&name=39\&exact=true\&exact=false\&id=39
[["join", "OR"], ["name", "39"], ["exact", "true"], ["exact", "false"], ["id", "39"]]
Depending on which WSGI server you are using, often you can view environ vars to see the original url which may be handy. Waitress does not, so instead just put something high up in the pipeline (wsgi middleware) that can log out the environ['QUERY_STRING'] and see if it doesn't match somewhere lower down in your stack.

Make Flask's url_for use the 'https' scheme in an AWS load balancer without messing with SSLify

I've recently added a SSL certificate to my webapp. It's deployed on Amazon Web Services uses load balancers. The load balancers work as reverse proxies, handling external HTTPS and sending internal HTTP. So all traffic to my Flask app is HTTP, not HTTPS, despite being a secure connection.
Because the site was already online before the HTTPS migration, I used SSLify to send 301 PERMANENT REDIRECTS to HTTP connections. It works despite all connections being HTTP because the reverse proxy sets the X-Forwarded-Proto request header with the original protocol.
The problem
url_for doesn't care about X-Forwarded-Proto. It will use the my_flask_app.config['PREFERRED_URL_SCHEME'] when a scheme isn't available, but during a request a scheme is available. The HTTP scheme of the connection with the reverse proxy.
So when someone connects to https://example.com, it connects to the load balancer, which then connects to Flask using http://example.com. Flask sees the http and assumes the scheme is HTTP, not HTTPS as it originally was.
That isn't a problem in most url_for used in templates, but any url_for with _external=True will use http instead of https. Personally, I use _external=True for rel=canonical since I heard it was recommended practice. Besides that, using Flask.redirect will prepend non-_external urls with http://example.com, since the redirect header must be a fully qualified URL.
If you redirect on a form post for example, this is what would happen.
Client posts https://example.com/form
Server issues a 303 SEE OTHER to http://example.com/form-posted
SSLify then issues a 301 PERMANENT REDIRECT to https://example.com/form-posted
Every redirect becomes 2 redirects because of SSLify.
Attempted solutions
Adding PREFERRED_URL_SCHEME config
https://stackoverflow.com/a/26636880/1660459
my_flask_app.config['PREFERRED_URL_SCHEME'] = 'https'
Doesn't work because there is a scheme during a request, and that one is used instead. See https://github.com/mitsuhiko/flask/issues/1129#issuecomment-51759359
Wrapping a middleware to mock HTTPS
https://stackoverflow.com/a/28247577/1660459
def _force_https(app):
def wrapper(environ, start_response):
environ['wsgi.url_scheme'] = 'https'
return app(environ, start_response)
return wrapper
app = Flask(...)
app = _force_https(app)
As is, this didn't work because I needed that app later. So I used wsgi_app instead.
def _force_https(wsgi_app):
def wrapper(environ, start_response):
environ['wsgi.url_scheme'] = 'https'
return wsgi_app(environ, start_response)
return wrapper
app = Flask(...)
app.wsgi_app = _force_https(app.wsgi_app)
Because wsgi_app is called before any app.before_request handlers, doing this makes SSLify think the app is already behind a secure request and then it won't do any HTTP-to-HTTPS redirects.
Patching url_for
(I can't even find where I got this one from)
from functools import partial
import Flask
Flask.url_for = partial(Flask.url_for, _scheme='https')
This could work, but Flask will give an error if you set _scheme but not _external. Since most of my app url_for are internal, it doesn't work at all.
I was having these same issues with `redirect(url_for('URL'))' behind an AWS Elastic Load Balancer recently & I solved it this using the werkzeug.contrib.fixers.ProxyFix
call in my code.
example:
from werkzeug.contrib.fixers import ProxyFix
app = Flask(__name__)
app.wsgi_app = ProxyFix(app.wsgi_app)
The ProxyFix(app.wsgi_app) adds HTTP proxy support to an application that was not designed with HTTP proxies in mind. It sets REMOTE_ADDR, HTTP_HOST from X-Forwarded headers.
Example:
from werkzeug.middleware.proxy_fix import ProxyFix
# App is behind one proxy that sets the -For and -Host headers.
app = ProxyFix(app, x_for=1, x_host=1)
Digging around Flask source code, I found out that url_for uses the Flask._request_ctx_stack.top.url_adapter when there is a request context.
The url_adapter.scheme defines the scheme used. To make the _scheme parameter work, url_for will swap the url_adapter.scheme temporarily and then set it back before the function returns.
(this behavior has been discussed on github as to whether it should be the previous value or PREFERRED_URL_SCHEME)
Basically, what I did was set url_adapter.scheme to https with a before_request handler. This way doesn't mess with the request itself, only with the thing generating the urls.
def _force_https():
# my local dev is set on debug, but on AWS it's not (obviously)
# I don't need HTTPS on local, change this to whatever condition you want.
if not app.debug:
from flask import _request_ctx_stack
if _request_ctx_stack is not None:
reqctx = _request_ctx_stack.top
reqctx.url_adapter.url_scheme = 'https'
app.before_request(_force_https)

Flask url_for generating http URL instead of https

I am using url_for to generate a redirect URL when a user has logged out:
return redirect(url_for('.index', _external=True))
However, when I changed the page to a https connection, the url_for still gives me http.
I would like to explicitly ask url_for to add https at the beginning of a URL.
Can you point me how to change it? I looked at Flask docs, without luck.
With Flask 0.10, there will be a much better solution available than wrapping url_for. If you look at https://github.com/mitsuhiko/flask/commit/b5069d07a24a3c3a54fb056aa6f4076a0e7088c7, a _scheme parameter has been added. Which means you can do the following:
url_for('secure_thingy',
_external=True,
_scheme='https',
viewarg1=1, ...)
_scheme sets the URL scheme, generating a URL like https://.. instead of http://. However, by default Flask only generates paths (without host or scheme), so you will need to include the _external=True to go from /secure_thingy to https://example.com/secure_thingy.
However, consider making your website HTTPS-only instead. It seems that you're trying to partially enforce HTTPS for only a few "secure" routes, but you can't ensure that your https-URL is not changed if the page linking to the secure page is not encrypted. This is similar to mixed content.
If you want to affect the URL scheme for all server-generated URLs (url_for and redirect), rather than having to set _scheme on every call, it seems that the "correct" answer is to use WSGI middleware, as in this snippet: http://flask.pocoo.org/snippets/35/
(This Flask bug seems to confirm that that is the preferred way.)
Basically, if your WSGI environment has environ['wsgi.url_scheme'] = 'https', then url_for will generate https: URLs.
I was getting http:// URLs from url_for because my server was deployed behind an Elastic Beanstalk load balancer, which communicates with the server in regular HTTP. My solution (specific to Elastic Beanstalk) was like this (simplified from the snippet linked above):
class ReverseProxied(object):
def __init__(self, app):
self.app = app
def __call__(self, environ, start_response):
scheme = environ.get('HTTP_X_FORWARDED_PROTO')
if scheme:
environ['wsgi.url_scheme'] = scheme
return self.app(environ, start_response)
app = Flask(__name__)
app.wsgi_app = ReverseProxied(app.wsgi_app)
The Elastic Beanstalk-specific part of that is HTTP_X_FORWARDED_PROTO. Other environments would have other ways of determining whether the external URL included https. If you just want to always use HTTPS, you could unconditionally set environ['wsgi.url_scheme'] = 'https'.
PREFERRED_URL_SCHEME is not the way to do this. It's ignored whenever a request is in progress.
I tried the accepted answer with an url_for arg but I found it easier to use the PREFERRED_URL_SCHEME config variable and set it to https with:
app.config.update(dict(
PREFERRED_URL_SCHEME = 'https'
))
since you don't have to add it to every url_for call.
If your are accessing your website through a reverse proxy like Nginx, then Flask correctly dectects the scheme being HTTP.
Browser -----HTTPS----> Reverse proxy -----HTTP----> Flask
The easiest solution is to configure your reverse proxy to set the X-Forwarded-Proto header. Flask will automatically detect this header and manage scheme accordingly. There is a more detailed explanation in the Flask documentation under the Proxy Setups section. For example, if you use Nginx, you will have to add the following line in your location block.
proxy_set_header X-Forwarded-Proto $scheme;
As other mentionned, if you can't change the configuration of your proxy, you can either use the werkzeug ProxyFix or build your own fix as described in the documentation:
http://flask.pocoo.org/docs/0.12/deploying/wsgi-standalone/#proxy-setups
Setting _scheme on every url_for() call is extremely tedious, and PREFERRED_URL_SCHEME doesn't seem to work. However, mucking with what the request's supposed scheme is at the WSGI level seems to successfully convince Flask to always construct HTTPS URLs:
def _force_https(app):
def wrapper(environ, start_response):
environ['wsgi.url_scheme'] = 'https'
return app(environ, start_response)
return wrapper
app = Flask(...)
app = _force_https(app)
For anyone ending up here recently there is an official uwsgi fixer for this:
https://stackoverflow.com/a/23504684/13777925
FWIW this still didn't work for me since the header wasn't being set correctly so I augmented the ReversedProxied middleware to prefer https if found thusly:
class ReverseProxied(object):
"""
Because we are reverse proxied from an aws load balancer
use environ/config to signal https
since flask ignores preferred_url_scheme in url_for calls
"""
def __init__(self, app):
self.app = app
def __call__(self, environ, start_response):
# if one of x_forwarded or preferred_url is https, prefer it.
forwarded_scheme = environ.get("HTTP_X_FORWARDED_PROTO", None)
preferred_scheme = app.config.get("PREFERRED_URL_SCHEME", None)
if "https" in [forwarded_scheme, preferred_scheme]:
environ["wsgi.url_scheme"] = "https"
return self.app(environ, start_response)
Called as:
app = flask.Flask(__name__)
app.wsgi_app = ReverseProxied(app.wsgi_app)
This way if you've set the environment var "PREFERRED_URL_SCHEME" explicitly or if the nginx/etc/proxy sets the X_FORWARDED_PROTO, it does the right thing.
I personally could not fix this problem with any of the answers here, but found that simply adding --cert=adhoc to the end of the flask run command, which makes the flask app run with https, solved the issue.
flask run --host=0.0.0.0 --cert=adhoc

How to access wsgi params from a request inside a middleware and a flask request without side effect?

I need to read some values from the wsgi request before my flask app is loaded. If I read the url from the wsgi request I can access the file without any issues once the flask app is loaded (after the middleware runs).
But if I attempt to access the params it seems to remove the post data once the flask app is loaded. I even went to the extreme of wrapping the wsgi request with a special Webob Request to prevent this "read once" problem.
Does anyone know how to access values from the wsgi request in middleware without doing any sort of side effect harm to the request so you can get post data / file data in a flask app?
from webob import Request
class SomeMiddleware(object):
def __init__(self, environ):
self.request = Request(environ)
self.orig_environ = environ
def apply_middleware(self):
print self.request.url #will not do any harm
print self.request.params #will cause me to lose data
Here is my flask view
#app.route('/')
def hello_world():
from flask import request
the_file = request.files['file']
print "and the file is", the_file
From what I can tell, this is a limitation of the way that WSGI works. The stream needs only be consumable once (PEP 333 and 3333 only require that the stream support read* calls, tell does not need to be supported). Once the stream is exhausted it cannot be re-streamed to other WSGI applications further "inward". Take a look at these two sections of Werkzeug's documentation for more information:
http://werkzeug.pocoo.org/docs/request_data/
http://werkzeug.pocoo.org/docs/http/#module-werkzeug.formparser
The way to avoid this issue is to wrap the input stream (wsgi.input) in an object that implements the read and readline methods. Then, only when the final application in the chain actually attempts to exhaust the stream will your methods be run. See Flask's documentation on generating a request checksum for an example of this pattern.
That being said, are you sure a middleware is the best solution to your problems? If you need to perform some action (dispatch, logging, authentication) based on the content of the body of the request you may be better off making it a part of your application, rather than a stand-alone application of its own.

Web interface for a twisted application

I have a application written in Twisted and I want to add a web interface to control and monitor it. I'll need plenty of dynamic pages that show the current status and configuration, so I hoped for a framework that offers at least a templating language with inheritance and some basic routing.
Since I am using Twisted anyways I wanted to use twisted.web - but it's templating language is too basic and it seems that the only framework, Nevow is quite dead (it's on launchpad but the homepage and wiki are down and I can't find any documentation).
So what are my options?
Is there any other twisted.web based framework?
Are there other frameworks that work with twisted's reactor?
Should I just get a web framework (I'm thinking web.py or flask) and run it in a thread?
Thanks for your answers.
Since Nevow is still down and I didn't want to write routing and support for a templating lib myself, I ended up using Flask. It turned out to be quite easy:
# make a Flask app
from flask import Flask, render_template, g
app = Flask(__name__)
#app.route("/")
def index():
return render_template("index.html")
# run in under twisted through wsgi
from twisted.web.wsgi import WSGIResource
from twisted.web.server import Site
resource = WSGIResource(reactor, reactor.getThreadPool(), app)
site = Site(resource)
# bind it etc
# ...
It works flawlessly so far.
You can bind it directly into the reactor like the example below:
reactor.listenTCP(5050, site)
reactor.run()
If you need to add children to a WSGI root visit this link for more details.
Here is an example showing how to combine WSGI Resource with a static child.
from twisted.internet import reactor
from twisted.web import static as Static, server, twcgi, script, vhost
from twisted.web.resource import Resource
from twisted.web.wsgi import WSGIResource
from flask import Flask, g, request
class Root( Resource ):
"""Root resource that combines the two sites/entry points"""
WSGI = WSGIResource(reactor, reactor.getThreadPool(), app)
def getChild( self, child, request ):
# request.isLeaf = True
request.prepath.pop()
request.postpath.insert(0,child)
return self.WSGI
def render( self, request ):
"""Delegate to the WSGI resource"""
return self.WSGI.render( request )
def main():
static = Static.File("/path/folder")
static.processors = {'.py': script.PythonScript,
'.rpy': script.ResourceScript}
static.indexNames = ['index.rpy', 'index.html', 'index.htm']
root = Root()
root.putChild('static', static)
reactor.listenTCP(5050, server.Site(root))
reactor.run()
Nevow is the obvious choice. Unfortunately the divmod web server hardware and the backup server hardware failed at the same time. They are attempting to recover the data and publish it on launchpad, but it may take a while.
You could also use basically any existing template module with twisted.web; Jinja2 comes to mind.

Categories

Resources