Application setup: Flask running in an eventlet WSGI server, using the default (cookie-based) session.
On the client side I am using jQuery to send POST requests to the server from within an asynchronous event handler:
function set_option(option, value) {
    $.post('/api/options/' + option, {'value': value});
}
$('.option').change(function() {
    var element_id = $(this).prop('id');
    var value = $(this).val();
    set_option(element_id, value);
});
On the server side I am adding the option and its value to the Flask session:
from flask import Flask, request, session
app = Flask(__name__)
[...]
@app.route('/api/options/<option>', methods=['POST'])
def set_option(option=None):
    if request.method == 'POST' and option is not None:
        option_value = request.form.get('value')
        if option_value is not None:
            session[option] = option_value
    # A view must return a response; an empty body is enough here.
    return ''
The client-side event handler above can sometimes be called several times in quick succession. This appears to cause a race condition with the Flask session cookie. For example, if the handler fires twice because of two forced .change() calls on different elements, the session ends up reflecting only the last POST request.
Let's say there are two elements being changed to the following values:
option_one='1234' and option_two='5678'
The Flask session already contains the following:
{'option_one': '0', 'option_two': '0'}
Two separate requests are sent to the Flask server, one for each option. The first request sets the Flask session to:
{'option_one': '1234', 'option_two': '0'}
The second request sets the Flask session to:
{'option_one': '0', 'option_two': '5678'}
The session from the second request ends up replacing the session from the first request thus eliminating the desired value which was stored for option_one.
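The lost update described above can be reproduced without Flask. The sketch below is only a simplified model: it assumes, as with the default cookie-based session, that each request starts from the session data carried in its own cookie and sends the whole modified session back. The function handle_request is a hypothetical stand-in for the view, not Flask code.

```python
import copy


def handle_request(cookie_session, option, value):
    """Hypothetical stand-in for the view: start from the session data
    carried in the request's cookie and return the whole updated session."""
    session = copy.deepcopy(cookie_session)
    session[option] = value
    return session


# Session as stored in the browser's cookie before either request is sent.
browser_cookie = {'option_one': '0', 'option_two': '0'}

# Both requests leave the browser before either response arrives,
# so both carry the same (old) cookie.
cookie_sent_with_first = copy.deepcopy(browser_cookie)
cookie_sent_with_second = copy.deepcopy(browser_cookie)

first_response = handle_request(cookie_sent_with_first, 'option_one', '1234')
second_response = handle_request(cookie_sent_with_second, 'option_two', '5678')

# The browser keeps whichever session cookie arrives last.
browser_cookie = first_response
browser_cookie = second_response

print(browser_cookie)  # {'option_one': '0', 'option_two': '5678'}
```

The second response never saw the change made by the first request, so the value stored for option_one is lost.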
With this in mind, the session cookies do appear to update appropriately if the event handlers are called with a greater length of time between each call.
Is this behaviour of Flask and its session/cookies management to be expected when requests are made within a couple hundred milliseconds of each other?
Would server-side session management be a solution?
As Martijn and davidism pointed out, the session behaviour is a race condition and is to be expected.
The solution would be to store such data on the server and ensure modification conflicts are handled appropriately.
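As a rough illustration of that idea, here is a minimal sketch of a server-side store that updates one key at a time under a lock. OptionStore and the session id 'abc' are made up for the example. It assumes a single server process with ordinary threads; a real deployment would more likely keep the data in Redis or a database, keyed by a session id.

```python
import threading
from collections import defaultdict


class OptionStore:
    """Hypothetical in-memory, server-side store keyed by a session id."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = defaultdict(dict)

    def set_option(self, session_id, option, value):
        # Only the one key is changed, under a lock, so a concurrent request
        # for a different option cannot overwrite it with stale data.
        with self._lock:
            self._data[session_id][option] = value

    def get_options(self, session_id):
        with self._lock:
            return dict(self._data[session_id])


store = OptionStore()
store.set_option('abc', 'option_one', '0')
store.set_option('abc', 'option_two', '0')

# Two "requests" arriving at nearly the same time.
threads = [
    threading.Thread(target=store.set_option, args=('abc', 'option_one', '1234')),
    threading.Thread(target=store.set_option, args=('abc', 'option_two', '5678')),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(store.get_options('abc'))  # {'option_one': '1234', 'option_two': '5678'}
```

Because each request only touches its own key on the server, neither update depends on what the other request's cookie contained.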
I am playing around with Flask, trying to understand the details of how sessions work. I am using:
Python 3.6.1
Flask 0.12.2
Flask documentation clearly states (bold is mine):
The session object works pretty much like an ordinary dict, with the
difference that it keeps track on modifications.
This is a proxy.
...
The section on proxies mentions that (again, bold is mine):
If you need to get access
to the underlying object that is proxied, you can use the
_get_current_object() method
Thus, the underlying object (session._get_current_object()) must remain the same for a request or, as was suggested by this answer and comment, for a thread. However, it persists neither within a request nor within a thread.
Here is a demonstration code:
import threading
from flask import (
    Flask,
    session,
)
app = Flask(__name__)
app.secret_key = 'some random secret key'
@app.route('/')
def index():
    print("session ID is: {}".format(id(session)))
    print("session._get_current_object() ID is: {}".format(id(session._get_current_object())))
    print("threading.current_thread().ident is: {}".format(threading.current_thread().ident))
    print('________________________________')
    return 'Check the console! ;-)'
If you run the Flask application above and repeatedly request /, the id returned by session._get_current_object() will occasionally change, while threading.current_thread().ident never changes.
This leads me to ask the following questions:
What exactly is returned by session._get_current_object()?
I get that it is the object underlying the session proxy, but what is this underlying object bound to (if it is not a request and not a thread, I would expect it never to change for the simple application above)?
What exactly is returned by session._get_current_object()?
Technically speaking, it is the object referenced in the session attribute of the top-most element in the LocalStack instance named _request_ctx_stack.
This top-most element of that stack is a RequestContext that is instantiated in Flask.wsgi_app, which is called for every HTTP request. The RequestContext implements methods to push and pop itself to and from the local stack _request_ctx_stack. The push method also takes care of requesting a new session for the context.
This session is what is made available in the session proxy; the request, that the RequestContext has been initialized with, is made available via the request proxy. These two proxies are only usable inside a request context, i.e. with an active HTTP request being processed.
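To make the mechanism more concrete, here is a heavily simplified sketch of a proxy that resolves to the session of the top-most context on a stack. It is not Flask's or Werkzeug's actual code: the RequestContext and Proxy classes and the plain list used as the stack are stand-ins written for this illustration (the real stack is context-local, and the real session is not a plain dict).

```python
class RequestContext:
    """Simplified stand-in: holds the per-request session."""

    def __init__(self):
        self.session = None

    def push(self, stack):
        stack.append(self)
        # A new session object is created each time the context is pushed.
        self.session = {}

    def pop(self, stack):
        stack.pop()


class Proxy:
    """Forwards item access to whatever `lookup` returns right now."""

    def __init__(self, lookup):
        self._lookup = lookup

    def _get_current_object(self):
        return self._lookup()

    def __getitem__(self, key):
        return self._get_current_object()[key]

    def __setitem__(self, key, value):
        self._get_current_object()[key] = value


ctx_stack = []
# One module-level proxy; it always resolves to the top-most context's session.
session = Proxy(lambda: ctx_stack[-1].session)


def handle_request(value):
    ctx = RequestContext()
    ctx.push(ctx_stack)
    try:
        session['value'] = value
        return session._get_current_object()
    finally:
        ctx.pop(ctx_stack)


first = handle_request(1)
second = handle_request(2)
print(first, second, first is second)  # {'value': 1} {'value': 2} False
```

The proxy object itself never changes, but the object it resolves to is a different one for every simulated request.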
I get that it is the object underlying the session proxy, but what is
this underlying object bound to (if it is not a request and not a
thread, I would expect it never to change for the simple
application above)?
As outlined above, the request context's session, proxied by the session local proxy, belongs to the RequestContext. And it is changing with every request. As documented in Lifetime of the Context, a new context is created for each request, and it creates a new session every time push is executed.
The id of session._get_current_object() staying the same between consecutive requests is, probably, due to the new session object being created in the same memory address that the old one from the previous request occupied.
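The address-reuse explanation can be explored outside Flask. In CPython, id() is the object's memory address, and two objects whose lifetimes do not overlap may receive the same id. The sketch below uses a made-up FakeSession class as a stand-in for the per-request session; whether any ids actually repeat depends on the interpreter and the allocator, so no particular output is promised.

```python
class FakeSession(dict):
    """Hypothetical stand-in for a per-request session object."""


def ids_of_consecutive_sessions(count):
    seen = []
    for _ in range(count):
        s = FakeSession()    # created for the "request"
        seen.append(id(s))
        del s                # discarded when the "request" ends
    return seen


ids = ids_of_consecutive_sessions(5)
print(ids)
print("distinct addresses:", len(set(ids)))
```

If fewer distinct addresses than objects are reported, some new objects were placed where an old, already discarded one used to be, which would make them look "the same" when compared by id alone.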
See also: How the Context Works section of the Flask documentation.
Here is a modified code snippet to illustrate the answer by shmee:
import threading
from flask import (
    Flask,
    session,
    request
)
app = Flask(__name__)
app.secret_key = 'some random secret key'
@app.route('/')
def index():
    print(">>> session <<<")
    session_id = id(session)
    session_object_id = id(session._get_current_object())
    print("ID: {}".format(session_id),
          "Same as previous: {}".format(session.get('prev_sess_id', '') == session_id))
    print("_get_current_object() ID: {}".format(session_object_id),
          "Same as previous: {}".format(session.get('prev_sess_obj_id', '') == session_object_id))
    session['prev_sess_id'] = session_id
    session['prev_sess_obj_id'] = session_object_id
    print("\n>>> request <<<")
    request_id = id(request)
    request_object_id = id(request._get_current_object())
    print("request ID is: {}".format(request_id),
          "Same as previous: {}".format(session.get('prev_request_id', '') == request_id))
    print("request._get_current_object() ID is: {}".format(request_object_id),
          "Same as previous: {}".format(session.get('prev_request_obj_id', '') == request_object_id))
    session['prev_request_id'] = request_id
    session['prev_request_obj_id'] = request_object_id
    print("\n>>> thread <<<")
    thread_id = threading.current_thread().ident
    print("threading.current_thread().ident is: {}".format(thread_id),
          "Same as previous: {}".format(session.get('prev_thread', '') == thread_id))
    session['prev_thread'] = thread_id
    print('-' * 100)
    return 'Check the console! ;-)'
The only obscurity left is, indeed, why session._get_current_object() sometimes remains unchanged between consecutive requests. As suggested by shmee (bold is mine), it is:
probably, due to the new session object being created in the same memory address that the old one from the previous request occupied.
I'm looking for some advice, or a relevant tutorial regarding the following:
My task is to set up a flask route that POSTs to API endpoint X, receives a new endpoint Y in X's response, then GETs from endpoint Y repeatedly until it receives a certain status message in the body of Y's response, and then returns Y's response.
The code below (irrelevant data redacted) accomplishes that goal in, I think, a very stupid way. It returns the appropriate data occasionally, but not reliably. (It times out 60% of the time.) When I console log very thoroughly, it seems as though I have bogged down my server with multiple while loops running constantly, interfering with each other.
I'll also receive this error occasionally:
SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /book
import requests, time, json
from flask import Flask, request
# create the Flask app
app = Flask(__name__)
# main booking route
@app.route('/book', methods=['POST'])  # GET requests will be blocked
def book():
    # defining the api-endpoints
    PRICING_ENDPOINT = ...
    # data to be sent to api
    data = {...}
    # sending post request and saving response as response object
    try:
        r_pricing = requests.post(url=PRICING_ENDPOINT, data=data)
    except requests.exceptions.RequestException as e:
        # a view must return a valid response, so report the error as text
        # (the 502 status is a choice made here, not part of the original)
        return str(e), 502
    # the poll endpoint is given in the 'location' header of the response
    POLL_ENDPOINT = r_pricing.headers['location']
    # setting data for poll
    data_for_poll = {...}
    r_poll = requests.get(POLL_ENDPOINT, data=data_for_poll)
    # poll loop, looking for 'UpdatesComplete'
    j = 1  # poll counter (not used further)
    poll_json = r_poll.json()
    update_status = poll_json['Status']
    while update_status == 'UpdatesPending':
        time.sleep(2)
        j += 1
        r_poll = requests.get(POLL_ENDPOINT, data=data_for_poll)
        poll_json = r_poll.json()
        update_status = poll_json['Status']
    return r_poll.text
This is more of an architectural issue than a Flask issue. Long-running tasks in Flask views are always a poor design choice. In this case, the route's response is dependent on two endpoints of another server. In effect, apart from carrying the responsibility of your app, you are also carrying the responsibility of another server.
Since the application's design seems to be a proxy for another service, I would recommend creating the proxy in the right way. Just like book() offers the proxy for PRICING_ENDPOINT POST request, create another route for POLL_ENDPOINT GET request and move the polling logic to the client code (JS).
Update:
If you cannot for some reason trust the client (browser -> JS) with the POLL_ENDPOINT information in a hidden proxy like situation, then maybe move the polling to a task runner like Celery or Python RQ. Although, it will introduce additional components to your application, it would be the right way to go.
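To give an idea of the shape of the task-runner approach, here is a minimal sketch that uses a thread pool from the standard library as a stand-in for Celery or RQ. The names start_job, job_result, and jobs, as well as the fake upstream responses, are made up for the example. The point is only that one call enqueues the polling and returns immediately, while a second call reports the state without blocking.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)
jobs = {}  # job id -> Future (a real task runner would persist this)


def poll_until_complete(fetch_status, interval=0.01):
    """Call fetch_status() until it stops reporting 'UpdatesPending'."""
    status = fetch_status()
    while status == 'UpdatesPending':
        time.sleep(interval)
        status = fetch_status()
    return status


def start_job(fetch_status):
    """What a 'start booking' route would do: enqueue and return at once."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = executor.submit(poll_until_complete, fetch_status)
    return job_id


def job_result(job_id):
    """What a 'booking status' route would do: report without blocking."""
    future = jobs[job_id]
    return future.result() if future.done() else 'UpdatesPending'


# Fake upstream service that needs three polls to finish.
responses = iter(['UpdatesPending', 'UpdatesPending', 'UpdatesComplete'])
job_id = start_job(lambda: next(responses))

jobs[job_id].result(timeout=5)  # only for this demo: wait for the worker
print(job_result(job_id))  # UpdatesComplete
```

The client would then call the status route every few seconds instead of holding one HTTP request open for the whole polling period.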
You probably get that error because the HTTP connection times out while your server is still looping. HTTP connections have time limits, and the loop took longer than the limit allows. The first (straightforward) solution is to adjust the Apache configuration and increase the connection timeout for your WSGI application. You could also open a socket connection, check the update status over it, and close it once the goal is achieved. Or you can move your logic to the client side.
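If the polling has to stay on the server for now, one way to keep it from outliving the connection is to give the loop a deadline. This is only a sketch: poll_with_deadline and its parameters are made up for the example, and the right timeout depends on whatever limit your web server actually enforces.

```python
import time


def poll_with_deadline(fetch_status, timeout, interval):
    """Poll until the status is no longer 'UpdatesPending' or until
    `timeout` seconds have passed; return None if time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status != 'UpdatesPending':
            return status
        if time.monotonic() + interval > deadline:
            return None
        time.sleep(interval)


# An upstream that never finishes: the loop gives up instead of hanging.
print(poll_with_deadline(lambda: 'UpdatesPending', timeout=0.05, interval=0.01))  # None
```

A view could then answer with an error or a "still pending" message when None comes back, rather than letting the client disconnect first.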
I am using Flask, with the flask-session plugin for server-side sessions stored in a Redis backend. I have flask set up to use persistent sessions, with a session timeout. How can I make an AJAX request to get the time remaining on the session without resetting the timeout?
The idea is for the client to check with the server before displaying a timeout warning (or logging out the user) in case the user is active in a different tab/window of the same browser.
EDIT: after some digging, I found the config directive SESSION_REFRESH_EACH_REQUEST, which it appears I should be able to use to accomplish what I want: set that to False, and then the session should only be refreshed if something actually changes in the session, so I should be able to make a request to get the timeout without the session timeout changing. It was added in 0.11, and I'm running 0.11.1, so it should be available.
Unfortunately, in practice this doesn't appear to work, at least when checking the TTL of the Redis key to get the time remaining. I checked, and session.modified is False, so it's not just that I am doing something in the request that modifies the session (unless it just doesn't set that flag).
The following works, though it is rather hacky:
In the application __init__.py, or wherever you call Session(app) or init_app(app):
# set up the session
Session(app)
# Save a reference to the original save_session function so we can call it
original_save_session = app.session_interface.save_session
#----------------------------------------------------------------------
def discretionary_save_session(self, *args, **kwargs):
    """A wrapper for the save_session function of the app session interface to
    allow the option of not saving the session when calling specific functions,
    for example if the client needs to get information about the session
    (say, expiration time) without changing the session."""
    # bypass: list of functions on which we do NOT want to update the session when called
    # This could be put in the config or the like
    #
    # Improvement idea: "mark" functions on which we want to bypass saving in
    # some way, then check for that mark here, rather than having a hard-coded list.
    bypass = ['check_timeout']
    # convert function names to URLs
    bypass = [flask.url_for(x) for x in bypass]
    if flask.request.path not in bypass:
        # if the current request path isn't in our bypass list, go ahead and
        # save the session normally
        return original_save_session(self, *args, **kwargs)
# Override the save_session function to ours
app.session_interface.save_session = discretionary_save_session
Then, in the check_timeout function (which is in the bypass list, above), we can do something like the following to get the remaining time on the session:
@app.route('/auth/check_timeout')
def check_timeout():
    """Return the time remaining on the session without refreshing it."""
    session_id = flask.session.sid
    # Or however you want to get a redis instance
    redis = app.config.get('REDIS_MASTER')
    # If used
    session_prefix = app.config.get('SESSION_KEY_PREFIX')
    # combine prefix and session id to get the session key stored in redis
    redis_key = "{}{}".format(session_prefix, session_id)
    # The redis ttl is the time remaining before the session expires
    time_remain = redis.ttl(redis_key)
    return str(time_remain)
I'm sure the above can be improved upon, however the result is as desired: when calling /auth/check_timeout, the time remaining on the session is returned without modifying the session in any way.
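The "mark the functions" improvement mentioned in the code comments could look roughly like the sketch below. It is written without Flask so that it runs on its own: the skip_session_save decorator, the route helper, and the view_functions dict are stand-ins made up for the example. In a real app the lookup would presumably go through app.view_functions and flask.request.endpoint instead.

```python
def skip_session_save(view):
    """Mark a view so the save_session wrapper can recognise it."""
    view.skip_session_save = True
    return view


# Stand-in for a mapping of endpoint name -> view function.
view_functions = {}


def route(endpoint):
    """Stand-in for route registration: remember the view by endpoint name."""
    def decorator(view):
        view_functions[endpoint] = view
        return view
    return decorator


@route('check_timeout')
@skip_session_save
def check_timeout():
    return 'time remaining'


@route('index')
def index():
    return 'hello'


def should_save_session(endpoint):
    """Return False for views that carry the skip mark."""
    view = view_functions.get(endpoint)
    return not getattr(view, 'skip_session_save', False)


print(should_save_session('check_timeout'), should_save_session('index'))  # False True
```

The wrapper around save_session would then call something like should_save_session for the current endpoint instead of comparing the request path against a hard-coded list.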
I am using Flask-KVSession to avoid replay attacks, as the client-side, cookie-based sessions used by Flask-Login are prone to them.
E.g., if on the /index page the session cookie for your app is set in the header like
myapp_session : 'value1'
and if you navigate to the /important page you get a new header like
myapp_session : 'value2', then a hacker who obtains 'value1' can perform replay attacks and misuse it, as it is never invalidated.
To solve this I am using Flask-KVSession, which stores the session cookie header value in a cache or some other backend. So basically only one myapp_session is generated, and it is invalidated when you log out. But the problem is:
__init__.py
import redis
from flask_kvsession import KVSessionExtension
from simplekv.memory.redisstore import RedisStore
store = RedisStore(redis.StrictRedis())
#store = memcache.Client(['127.0.0.1:11211'], debug =0)
store.ttl_support = True
app = create_app(__name__)
current_kvsession = KVSessionExtension(store, app)
If you look at the cleanup_sessions part of the code for Flask-KVSession
http://pythonhosted.org/Flask-KVSession/#flask_kvsession.KVSessionExtension.cleanup_sessions
it only deletes the expired sessions. But if I want to explicitly delete the value for the current myapp_session for a particular user on logout, how do I do that?
@app.before_request
def redirect_if_logout():
    if request.path == url_for('logout'):
        for key in app.kvsession_store.keys():
            logger.debug(key)
            m = current_kvsession.key_regex.match(key)
            logger.debug('found %s', m)
            app.kvsession_store.delete(key)
But this deletes all the keys, as I don't know what the unique key for the current session is.
Q2. Also, how can I use memcache instead of Redis? It doesn't have the app.kvsession_store.keys() function and gives an I/O error.
I think I just figured the 1st part of your question on how you can delete the specific key on logout.
As mentioned in the docs:
Internally, Flask-KVSession stores session ids that are serialized as
KEY_CREATED, where KEY is a random number (the sessions “true” id) and
CREATED a UNIX-timestamp of when the session was created.
Sample cookie value that gets created on the client side (you can check it with a cookie manager extension for Firefox):
c823af88aedaf496_571b3fd5.4kv9X8UvyQqtCtNV87jTxy3Zcqc
and the session id stored in Redis as the key:
c823af88aedaf496_571b3fd5
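The relationship between the two values above can be sketched in a few lines. The helper split_session_cookie is made up for the example. Treating everything after the first dot as a signature, and reading the CREATED part as a hexadecimal timestamp, are assumptions based on the sample values shown here, not something taken from the Flask-KVSession source.

```python
def split_session_cookie(cookie_value):
    """Split a cookie value into the stored key and its KEY / CREATED parts."""
    # Everything before the first '.' is the key stored in the backend;
    # the remainder is assumed to be the signature.
    session_key, _, _signature = cookie_value.partition('.')
    key_part, _, created_part = session_key.partition('_')
    return session_key, key_part, created_part


session_key, key_part, created_part = split_session_cookie(
    'c823af88aedaf496_571b3fd5.4kv9X8UvyQqtCtNV87jTxy3Zcqc')

print(session_key)  # c823af88aedaf496_571b3fd5

# Assumption: CREATED is a UNIX timestamp written in hexadecimal.
created = int(created_part, 16)
print(created)
```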
So in the logout handler, you just need to read the cookie value, split it, and use the first part of the string.
Sample code that worked for me:
import redis
from flask import Flask, request
from flask_kvsession import KVSessionExtension
from simplekv.memory.redisstore import RedisStore
store = RedisStore(redis.StrictRedis())
app = Flask(__name__)
KVSessionExtension(store, app)
# Logout handler
@app.route('/logout', methods=['GET'])
def logout():
    # here you are reading the cookie
    cookie_val = request.cookies.get('session').split(".")[0]
    store.delete(cookie_val)
    # a view must return a response
    return 'Logged out'
and since you have added ttl_support:
store.ttl_support = True
It will match the TTL (in seconds) from permanent_session_lifetime, if you have set that in the config file or at the beginning of your app.py file.
For example, in my application I have set the following at the beginning of the app.py file:
session.permanent = True
app.permanent_session_lifetime = timedelta(minutes=5)
Now, when I log out, it deletes the key in Redis, but the key will not be removed until its TTL counts down to 0 from 300 (5 minutes, as set in permanent_session_lifetime).
If you want to remove it from Redis immediately, you can manually change app.permanent_session_lifetime to 1 second, which will in turn change the TTL in Redis.
import redis
import os
from datetime import timedelta
from flask import Flask, request
from flask_kvsession import KVSessionExtension
from simplekv.memory.redisstore import RedisStore
store = RedisStore(redis.StrictRedis())
app = Flask(__name__)
KVSessionExtension(store, app)
# Logout handler
@app.route('/logout', methods=['GET'])
def logout():
    cookie_val = request.cookies.get('session').split(".")[0]
    app.permanent_session_lifetime = timedelta(seconds=1)
    store.delete(cookie_val)
    # a view must return a response
    return 'Logged out'
Using the above code, I was able to thwart session replay attacks.
And the solution to your 2nd question:
3 possible mistakes that I can see are:
1: In the beginning of your code you have created:
store = RedisStore(redis.StrictRedis())
but in the loop you are using it as kvsession_store instead of just store:
app.kvsession_store.keys()
To use it without any errors/exceptions, you can call store.keys() instead of app.kvsession_store.keys():
import redis
from flask_kvsession import KVSessionExtension
from simplekv.memory.redisstore import RedisStore
store = RedisStore(redis.StrictRedis())
for key in store.keys():
    print(key)
store.delete(key) does not delete all keys by itself; every key is deleted only because you call it inside a loop that goes through all the keys one by one.
I am implementing an endpoint in my Flask application that receives a collection of HTTP requests, and returns a collection of the corresponding HTTP responses. In order to accomplish this, I need my endpoint to call other endpoints in order to construct the result. However, because Flask is blocking while processing the original request, it cannot process the nested requests and the application gets deadlocked.
Is there any way to issue a request within a request in flask in a way that doesn't result in a deadlock?
I included a segment of my code which I believe should be enough to illustrate the problem without overwhelming you. If you would like to see more of it please let me know and I'll share.
from flask import Flask, request
from requests import Session, Request
app = Flask(__name__)
def split(request):
    multipart = request.stream.read()
    boundary = request.content_type.split(';')[1]
    prefix = ' boundary"'
    suffix = '"'
    delimiter = '--%s' % boundary[len(prefix)+1:-len(suffix)]
    subrequests = [s.lstrip() for s in multipart.split(delimiter)]
    for sub in subrequests:
        status_line, _, more_lines = sub.partition('\n')
        method, path, version = status_line.split()
        headers, _, body = more_lines.partition('\n\n')
        url = 'http://localhost:3000' + path
        # yield (rather than return) so that every subrequest is produced
        yield Request(method, url, headers=headers, data=body)
@app.route('/batch', methods=["GET", "POST"])
def batch():
    subrequests = split(request)
    session = Session()
    responses = []
    for sub in subrequests:
        responses.append(session.send(sub.prepare()))  # Deadlock!
There are two candidate solutions that I considered which I found to be unsatisfactory:
Don't issue a full request. Instead, just call the function that is mapped to the endpoint of interest (url_for). I am unsatisfied by this approach because the nested requests have their own headers and cookies, which are neglected by this approach. Furthermore, code in the 'before_request' and 'after_request' handlers won't get called automatically.
Run multiple instances of the application. This will solve the problem, but expose my service to a pretty simple DoS attack. If I have X instances running, all an attacker would need to do is to hit my service with X different requests to cause a deadlock.
Thank you.
Bearing in mind that the built-in Flask server is not production-ready: if you use it only for development, pass the threaded=True parameter to app.run.
app.run(debug=True, threaded=True)
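The effect of threading on nested requests can be seen with the standard library alone. The sketch below does not use Flask: it uses http.server's ThreadingHTTPServer as a stand-in for a threaded development server, and the /outer and /inner paths are made up for the example. With a server that handles one request at a time, the nested request would wait for a worker that is busy waiting for it; with one thread per request, it goes through.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Opener that ignores any proxy settings, so localhost is reached directly.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/outer':
            # Nested request back into the same server.
            port = self.server.server_address[1]
            url = 'http://127.0.0.1:{}/inner'.format(port)
            with opener.open(url, timeout=5) as resp:
                body = b'outer saw: ' + resp.read()
        else:
            body = b'inner'
        self.send_response(200)
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_outer():
    # Port 0 lets the OS pick a free port.
    server = ThreadingHTTPServer(('127.0.0.1', 0), Handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        url = 'http://127.0.0.1:{}/outer'.format(port)
        with opener.open(url, timeout=5) as resp:
            return resp.read()
    finally:
        server.shutdown()
        server.server_close()


print(fetch_outer())  # b'outer saw: inner'
```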
This happens because you're using the Flask dev server, which is not meant for production use.
In a production environment you would use an application server (uWSGI, Gunicorn, Tornado, ...), with or without a web server layer (NGINX, Apache, ...), to proxy/balance connections to the workers. This protects against DoS attacks (not completely, but in many environments it is acceptable).