flask application timeout with amazon load balancer - python

I'm trying to use a Flask application behind an Amazon Load Balancer and the Flask threads keep timing out. It appears that the load balancer is sending a Connection: keep-alive header and this is causing the Flask process to never return (or to take a very long time). With gunicorn in front, the processes are killed and new ones started. We also tried using uWSGI and simply exposing the Flask app directly (no wrapper). All of these result in the Flask process just not responding.
I see nothing in the Flask docs which would make it ignore this header. I'm at a loss as to what else I can do with Flask to fix the problem.
Curl and direct connections to the machine work fine, only those via the load balancer are causing the problem. The load balancer itself doesn't appear to be doing anything wrong and we use it successfully with several other stacks.

The solution I have now is using gunicorn as a wrapper around the Flask application. For the worker_class I am using eventlet with several workers. This combination seems stable and responsive. Gunicorn is also configured for HTTPS.
I assume it is a defect in Flask that causes the problem and this is an effective workaround.
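For reference, a minimal sketch of that kind of gunicorn setup — the file names, certificate paths, bind port, and worker count below are assumptions, not the exact configuration described above:

# gunicorn.conf.py
worker_class = 'eventlet'
workers = 4
bind = '0.0.0.0:8443'              # the balancer forwards 443 to this port
certfile = '/etc/ssl/myapp.crt'    # gunicorn terminates HTTPS itself
keyfile = '/etc/ssl/myapp.key'
keepalive = 5                      # tolerate the balancer's keep-alive connections

Run it with something like gunicorn -c gunicorn.conf.py myapp:app, where myapp:app points at your Flask application object.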

Did you remember to set session.permanent = True and app.permanent_session_lifetime?

The easiest way to force all connections to close is to make sure you are using HTTP/1.0 and not adding the Connection: Keep-Alive header to the response.
Please check out werkzeug.http.remove_hop_by_hop_headers().
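A hedged sketch of how that might look in a Flask app, assuming a Werkzeug version that still ships this helper and that app is your Flask instance:

from werkzeug.http import remove_hop_by_hop_headers

@app.after_request
def strip_hop_by_hop(response):
    # removes Connection and the other hop-by-hop headers in place
    remove_hop_by_hop_headers(response.headers)
    return response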

Do you need an HTTP load balancer? Using a layer 4 balancer might just as well solve your problem, because it does not interfere with higher protocol levels.

Related

Socket.io POST Requests from Socket.IO-Client-Swift

I am running socket.io on an Apache server through Python Flask. We're integrating it into an iOS app (using the Socket.IO-Client-Swift library) and we're having a weird issue.
From the client-side code in the app (written in Swift), I can view the actual connection log (client-side, in Xcode) and see the connection established from the client's IP and the requests being made. The client never receives any information back from the socket server, even when using a global event response handler.
I wrote a very simple test script in JavaScript on an HTML page, sent requests that way, and received the proper responses back. With that said, it seems likely to be an issue with iOS. I've found these articles (but none of them helped fix the problem):
https://github.com/nuclearace/Socket.IO-Client-Swift/issues/95
https://github.com/socketio/socket.io-client-swift/issues/359
My next thought is to extend the logging of socket.io to find out exactly what data is being POSTed to the socket namespace. Is there a way to log exactly what data is coming into the server? (Bear in mind that the 'on' hook I've set up on the server side is not getting any data; I've tried to log from there, but it doesn't appear to even get that far.)
I found mod_dumpio for Linux to log all POST requests but I'm not sure how well it will play with multi-threading and a socket server.
Any ideas on how to get the exact data being posted so we can at least troubleshoot the syntax and make sure the data isn't being malformed when it's sent to the server?
Thanks!
Update
When testing locally, we got it working (it was a setting in the Swift code where the namespace wasn't being declared properly). This works fine now on localhost but we are having the exact same issues when emitting to the Apache server.
We are not using mod_wsgi (as far as I know; I'm relatively new to mod_wsgi, apologies for any ignorance). We used to have a .wsgi file that called the main app script to run but we had to change that because mod_wsgi is not compatible with Flask SocketIO (as stated in the uWSGI Web Server section here). The way I am running the script now is by using supervisord to run the .py file as a daemon (using that specifically so it will autostart in the event of a server crash).
Locally, it worked great once we installed the eventlet module through pip. When I ran pip freeze on my virtual environment on the server, eventlet was installed. I uninstalled and reinstalled it just to see if that cleared anything up and that did nothing. No other Python modules that are on my local copy seem to be something that would affect this.
One other thing to keep in mind is that in the function that initializes the app, we change the port to port 80:
socketio.run(app, host='0.0.0.0', port=80)
because we have other API functions that run through a domain that is pointing to the server in this app. I'm not sure if that would affect anything but it doesn't seem to matter on the local version.
I'm at a dead end again and am trying to find anything that could help. Thanks for your assistance!
Another Update
I'm not exactly sure what was happening yet, but we went ahead and rewrote some of the code, paying extra special attention to the namespace declarations within each socket event 'on' function. It's working fine now. As I get more details, I will post them here, as I figure this will be useful for others who have the same problem. This thread also has some really valuable information on how to go about debugging/logging these types of issues, although we never actually fully figured out the answer to the original question. A minimal sketch of what matching namespace declarations look like on the Flask-SocketIO side follows.
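The '/chat' namespace below is just an illustration — the key point is that the Swift client must connect to, and emit on, the exact namespace the server handlers declare:

from flask import Flask
from flask_socketio import SocketIO, emit

app = Flask(__name__)
socketio = SocketIO(app)

@socketio.on('message', namespace='/chat')
def handle_message(data):
    # fires only for events emitted on the '/chat' namespace
    emit('response', {'ok': True}, namespace='/chat')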
I assume you have verified that Apache does get the POST requests. That should be your first test, if Apache does not log the POST requests coming from iOS, then you have a different kind of problem.
If you do get the POST requests, then you can add some custom code in the middleware used by Flask-SocketIO and print the request data forwarded by Apache's mod_wsgi. This is in the file flask_socketio/__init__.py. The relevant portion is this:
class _SocketIOMiddleware(socketio.Middleware):
    # ...
    def __call__(self, environ, start_response):
        # log what you need from environ here
        environ['flask.app'] = self.flask_app
        return super(_SocketIOMiddleware, self).__call__(environ, start_response)
You can find out what's in environ in the WSGI specification. In particular, the body of the request is available in environ['wsgi.input'], which is a file-like object you read from.
Keep in mind that once you read the payload, this file will be consumed, so the WSGI server will not be able to read from it again. Seeking the file back to the position it was before the read may work on some WSGI implementations. A safer hack I've seen people do to avoid this problem is to read the whole payload into a buffer, then replace environ['wsgi.input'] with a brand new StringIO or BytesIO object.
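A sketch of that buffering trick applied to the middleware above — the print is just illustrative, and it assumes the WSGI server lets you read() to the end of the body:

import io

def __call__(self, environ, start_response):
    # read the whole payload, log it, then hand the server a fresh stream
    body = environ['wsgi.input'].read()
    print('request payload: %r' % body)
    environ['wsgi.input'] = io.BytesIO(body)
    environ['flask.app'] = self.flask_app
    return super(_SocketIOMiddleware, self).__call__(environ, start_response)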
Are you using flask-socketio on the server side? If you are, there is a lot of debugging available in the constructor.
socketio = SocketIO(app, async_mode=async_mode, logger=True, engineio_logger=True)

Bottle server not responding while calculating

I have a bottle server running on port 8080, using the "gevent" server. I use this server to support some simple "server sent events".
My question is probably related to not knowing exactly how my set up is working. I hope someone can take the time to elaborate on this.
All routes and file serving work great, but I have an issue when accessing a specific route, "/get_data". This gathers data from the web as well as from some internal data sources. The gathering takes about 30 minutes. While this process is running, I am not able to access any routes on the server, i.e. "/" or "/login". Once the process is finished, everything works again and the database is updated with the gathered information.
I tried replacing the gathering algorithms by a simple time.sleep(60), and while the timer was active, I was still able to access other routes just fine.
This leads to my two questions:
Why am I not able to access the server while this process is running? Is it the port that is blocked (from reading web information), or does it have something to do with threading?
What would be the best way to run a demanding / long process on my server? Preferably I would like to access this from my web app, but I have thought about just putting this in a separate Python file and running it locally on the server, in a separate instance of Python. This process is run at most once per day, maybe as seldom as once per week.
This happens because WSGI handles requests/responses synchronously, so the long-running handler blocks the worker.
You can use gunicorn to run your application; it will handle multiple requests and responses concurrently. Or you can use other methods described on the Bottle website:
Primer to Asynchronous Applications
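For example, a minimal sketch of handing the Bottle app to gunicorn with several worker processes (module, route, and worker count are assumptions): each gunicorn worker is a separate process, so one 30-minute request no longer blocks the others.

import bottle

app = bottle.Bottle()

@app.route('/')
def index():
    return 'ok'

if __name__ == '__main__':
    # Bottle passes extra keyword arguments through to the gunicorn adapter
    bottle.run(app, server='gunicorn', host='0.0.0.0', port=8080, workers=4)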

nginx + uwsgi 502 Bad Gateway python

I'm running a script in Python that takes a long time to process. The thing is, if the function takes too long to run, I guess nginx has a timeout in its configuration that kicks in, and that prevents the function from running to completion.
I just want to know where I can increase the value of the timeout, because I've tried some directives in the nginx config file, such as:
uwsgi_connect_timeout 75;
uwsgi_send_timeout 75;
uwsgi_read_timeout 75;
keepalive_timeout 650;
but none of these worked.
Thanks in advance
The problem with just extending the timeout is that, no matter how much longer you set it to, you will run into limitations somewhere along the line: either with the web server, the browser, or your geocode calls. If it is something that routinely fails n times in a request, then you can't really make any guarantees.
So rather than having the client request hang on a long-running process (and, by extension, risking a server timeout), why don't you use something like Celery to run those geocode tasks, and on the client side submit the request via JavaScript and poll the server for the answer via AJAX until it gets a response?
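A rough sketch of that pattern with Celery and Flask — the broker URL is an assumption, and geocode_task is a hypothetical stand-in for the real long-running call:

from celery import Celery
from flask import Flask, jsonify

app = Flask(__name__)
celery = Celery(__name__, broker='redis://localhost:6379/0',
                backend='redis://localhost:6379/0')

@celery.task
def geocode_task(address):
    # stand-in for the real slow geocode call, run inside a Celery worker
    return {'address': address, 'lat': 0.0, 'lng': 0.0}

@app.route('/geocode/<address>', methods=['POST'])
def start(address):
    task = geocode_task.delay(address)       # returns immediately
    return jsonify(task_id=task.id), 202

@app.route('/result/<task_id>')
def result(task_id):
    task = geocode_task.AsyncResult(task_id)
    if not task.ready():
        return jsonify(state=task.state), 202  # client polls again via AJAX
    return jsonify(result=task.get())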
I also had the Bad Gateway error in an nginx + uWSGI configuration, and for the sake of people who google this question: it might be a missing uwsgi Python plugin. Please see: uWSGI configuration issue: uwsgi fails without any error message.
I tried everything written in the above responses as well as in other places, but they did not work.
My solution was changing my socket in both the uwsgi.conf and nginx.conf files.
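The two sides have to agree on the same socket; a sketch with an assumed Unix socket path of /tmp/app.sock (the path itself is arbitrary, matching is what matters):

# uwsgi.conf (or uwsgi.ini)
[uwsgi]
socket = /tmp/app.sock
chmod-socket = 664
plugin = python    ; the missing python plugin is another common 502 cause

# nginx.conf, inside the server block
location / {
    include uwsgi_params;
    uwsgi_pass unix:/tmp/app.sock;
}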

How do I cleanly bridge client connections between a frontend webserver and a backend running CherryPy?

The title may be a bit vague, but here's my goal: I have a frontend webserver which takes incoming HTTP requests, does some preprocessing on them, and then passes the requests off to my real webserver to get the HTTP response, which is then passed back to the client.
Currently, my frontend is built off of BaseHTTPServer.HTTPServer and the backend is CherryPy.
So the question is: Is there a way to take these HTTP requests / client connections and insert them into a CherryPy server to get the HTTP response? One obvious solution is to run an instance of the CherryPy backend on a local port or using UNIX domain sockets, and then the frontend webserver establishes a connection with the backend and relays any requests/responses. Obviously, this isn't ideal due to the overhead.
What I'd really like is for the CherryPy backend to not bind to any port, but just sit there waiting for the frontend to pass the client's socket (as well as the modified HTTP Request info), at which point it does its normal CherryPy magic and returns the request directly to the client.
I've been perusing the CherryPy source to find some way to accomplish this, and currently am attempting to modify wsgiserver.CherryPyWSGIServer, but it's getting pretty hairy and is probably not the best approach.
Is your main app a WSGI application? If so, you could write some middleware that wraps around it and does all the request wrangling before passing on to the main application.
If this is possible, it would avoid you having to run two webservers and all the problems you are encountering.
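A minimal sketch of such middleware, assuming the CherryPy app is exposed as a WSGI application (the class and environ key are illustrative, not an existing API):

class PreprocessingMiddleware(object):
    """Wraps a WSGI app and preprocesses each request before it runs."""
    def __init__(self, wsgi_app):
        self.wsgi_app = wsgi_app

    def __call__(self, environ, start_response):
        # the frontend's preprocessing goes here, e.g. tagging the request
        environ['HTTP_X_PREPROCESSED'] = 'yes'
        return self.wsgi_app(environ, start_response)

# e.g. wrapped = PreprocessingMiddleware(cherrypy.tree.mount(Root(), '/'))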
Answered the Upgrade question at Handling HTTP/1.1 Upgrade requests in CherryPy. Not sure if that addresses this one or not.

How should I implement reverse AJAX in a Django application?

How should I implement reverse AJAX when building a chat application in Django? I've looked at Django-Orbited, and from my understanding, this puts a comet server in front of the HTTP server. This seems fine if I'm just running the Django development server, but how does this work when I start running the application from mod_wsgi? How does having the orbited server handling every request scale? Is this the correct approach?
I've looked at another approach (long polling) that seems like it would work, although I'm not sure what all would be involved. Would the client request a page that would live in its own thread, so as not to block the rest of the application? Would it even block? Wouldn't the script requested by the client have to continuously poll for information?
Which of the approaches is more proper? Which is more portable, scalable, sane, etc? Are there other good approaches to this (aside from the client polling for messages) that I have overlooked?
How about using the awesome nginx push module?
Have you taken a look at Tornado?
Using WSGI for comet/long-polling apps is not a good choice, because WSGI doesn't support non-blocking requests.
The Nginx Push Stream Module provides a simple HTTP interface for both the server and the client.
The Nginx HTTP Push Module is similar, but seems to no longer be maintained.
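To illustrate the non-blocking alternative mentioned above, a bare-bones long-polling sketch in Tornado — handler names, routes, and the port are made up, and waiter cleanup on disconnect is omitted:

import tornado.ioloop
import tornado.web
from tornado.concurrent import Future

waiters = []

class PollHandler(tornado.web.RequestHandler):
    async def get(self):
        # park the request until a message arrives, without tying up a thread
        future = Future()
        waiters.append(future)
        self.write(await future)

class SendHandler(tornado.web.RequestHandler):
    def post(self):
        message = self.get_argument('message')
        for future in waiters:
            future.set_result(message)  # wake every parked poller
        del waiters[:]

if __name__ == '__main__':
    app = tornado.web.Application([(r'/poll', PollHandler),
                                   (r'/send', SendHandler)])
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()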
