I would like to run APScheduler as part of a WSGI webapp (served via Apache's mod_wsgi with 3 workers). I am new to the WSGI world, so I would appreciate it if you could resolve my doubts:
If APScheduler is part of the webapp, does it only come alive after the first request (the first one after starting/restarting Apache) is handled by at least one worker? That is, starting/restarting Apache alone won't start it; at least one request is needed.
What about concurrent requests: would every worker run its own set of APScheduler tasks, or would there be only one set shared between all workers?
Once a process (the webapp run by a worker) is running, will it stay alive (so APScheduler's tasks keep executing), or could it terminate after some idle time (in which case APScheduler's tasks would stop executing)?
Thank you!
You're right -- the scheduler won't start until the first request comes in.
Therefore running a scheduler in a WSGI worker is not a good idea. A better idea would be to run the scheduler in a separate process and connect to the scheduler when necessary via some RPC mechanism like RPyC or Execnet.
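For example, here is a minimal sketch of that approach using RPyC (the module name scheduler_service.py, the port and the example job are illustrative): the scheduler lives in its own long-running process, and the webapp only talks to it over RPC.

# scheduler_service.py -- standalone process that owns the scheduler
import rpyc
from rpyc.utils.server import ThreadedServer
from apscheduler.schedulers.background import BackgroundScheduler

def print_text(text):
    # example job; real jobs should live in a module the scheduler process can import
    print(text)

class SchedulerService(rpyc.Service):
    def exposed_add_job(self, func, *args, **kwargs):
        return scheduler.add_job(func, *args, **kwargs)

    def exposed_remove_job(self, job_id):
        scheduler.remove_job(job_id)

if __name__ == '__main__':
    scheduler = BackgroundScheduler()
    scheduler.start()
    # allow_public_attrs lets keyword arguments pass through the RPC boundary
    server = ThreadedServer(SchedulerService, port=12345,
                            protocol_config={'allow_public_attrs': True})
    try:
        server.start()
    finally:
        scheduler.shutdown()

A WSGI worker can then call conn = rpyc.connect('localhost', 12345) followed by conn.root.add_job('scheduler_service:print_text', 'interval', args=['hello'], seconds=10), passing the job as a textual reference so it is resolved inside the scheduler process rather than inside the worker.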
So what's the trick? Nginx faces the client on port 80 and normally forwards requests to Gunicorn A.
You can't update the code in place, since something might go wrong. So you do a fresh code checkout and launch a separate Gunicorn B on some other port, say 5678.
Once you test the new code on a development/testing database, you:
1. Adjust Gunicorn B to point to the database, but do not send it any requests yet.
2. Stop Gunicorn A. Nginx now, ever so briefly, responds with an error.
3. Set nginx to point to Gunicorn B, still at port 5678.
4. Restart nginx.
Is this about right? Do you just write a script to run the four actions faster and minimize the window (between steps 2 and 4) during which the server responds with an error?
Nginx supports configuration reloading. Using this feature, updating your application can work like this:
Start a new instance, Gunicorn B.
Adjust the nginx configuration to forward traffic to Gunicorn B.
Reload the nginx configuration with nginx -s reload. After this, Gunicorn B will serve new requests, while Gunicorn A will still finish serving old requests.
Wait for the old nginx worker process to exit (which means all requests initiated before the reload are now done) and then stop Gunicorn A.
Assuming your application works correctly with two concurrent instances, this gives you a zero-downtime update.
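As a rough sketch, the switch-over could look like this on the command line (the ports, PID file locations and the --daemon flag are illustrative):

# start Gunicorn B with the freshly checked-out code
gunicorn -b 127.0.0.1:5678 main:app --daemon --pid /tmp/gunicorn_b.pid
# edit the nginx config so it proxies to 127.0.0.1:5678 instead of Gunicorn A
nginx -t            # check that the new configuration is valid
nginx -s reload     # graceful reload: old nginx workers finish in-flight requests
# once the old nginx workers have exited, stop Gunicorn A gracefully
kill -TERM "$(cat /tmp/gunicorn_a.pid)"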
The relevant excerpt from the nginx documentation:
Once the master process receives the signal to reload configuration, it checks the syntax validity of the new configuration file and tries to apply the configuration provided in it. If this is a success, the master process starts new worker processes and sends messages to old worker processes, requesting them to shut down. Otherwise, the master process rolls back the changes and continues to work with the old configuration. Old worker processes, receiving a command to shut down, stop accepting new connections and continue to service current requests until all such requests are serviced. After that, the old worker processes exit.
I have Celery running on a few computers and use Flower for monitoring.
The computers are used by different people.
Celery beat generates jobs for all the workers from one of the computers.
Every time a newly coded task is ready, all the workers except the beat computer raise a "task not registered" exception.
What is the recommended way to sync the code to all the other computers on the network? Is there a pre-hook kind of mechanism in Celery to check for new code?
Unfortunately, you need to update the code on all the workers (nodes) and after that you need to restart all of them. This is by (good) design.
A clever systemd service could in theory be able to:
send the graceful shutdown signal
run pip install -U your-project
start the Celery service
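Assuming the workers run under a systemd unit named celery and the tasks are packaged as your-project (both names are placeholders), the update sequence on each node could look roughly like this:

systemctl stop celery          # systemd sends SIGTERM: warm shutdown, running tasks finish
pip install -U your-project    # install the new/updated task code
systemctl start celery         # start the workers against the updated code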
I have a Flask app that I run with uWSGI. I have configured logging to file in the Python/Flask application, so on service start it logs that the application has been started.
I want to be able to do this when the service stops as well, but I don't know how to implement it.
For example, if I run the uWSGI app in the console and then interrupt it with Ctrl-C, I get only uWSGI logs ("Goodbye to uwsgi" etc.) in the console, but no logs from the stopped Python application. I'm not sure how to do this.
I would be glad if someone advised on possible solutions.
Edit:
I've tried Python's atexit module, but the function that I registered to run on exit is executed not once but 4 times (which is the number of uWSGI workers).
There is no "stop" event in WSGI, so there is no way to detect when the application stops, only when the server / worker stops.
I'm creating a REST API for an application using Falcon. When I launch two or more requests to the API on different endpoints, there's no multi-threaded execution (one request has to finish before the next one is executed).
The problem comes from a POST endpoint that executes a complex machine learning process (it takes dozens of seconds to finish), and the whole API is blocked while that process runs, because it waits for the process to complete before returning results.
I'm using wsgiref simple_server to serve the requests:
from wsgiref import simple_server  # wsgiref's reference server is single-threaded

if __name__ == '__main__':
    httpd = simple_server.make_server('127.0.0.1', 8000, app)  # "app" is the Falcon API instance
    httpd.serve_forever()
Is there any way to make the execution parallel so that multiple requests can be served at the same time?
Probably the server is not running in multiprocess or multithreaded mode.
But even if it were, it is not a good idea to occupy the web server with long-running tasks. Long-running tasks should be handled by separate worker processes.
Take a look at Celery
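For example, a rough sketch of that pattern (the broker URL, module and resource names are illustrative): the POST handler only enqueues the job and immediately returns a task id the client can poll.

# tasks.py -- push the slow step to a Celery worker
from celery import Celery

celery_app = Celery('tasks',
                    broker='redis://localhost:6379/0',
                    backend='redis://localhost:6379/0')

@celery_app.task
def run_ml_process(payload):
    # the expensive machine-learning work runs here, in a Celery worker process
    return {'status': 'done'}

# in the Falcon resource, enqueue the job and respond right away
class MLResource:
    def on_post(self, req, resp):
        job = run_ml_process.delay(req.media)
        resp.media = {'task_id': job.id}   # the client polls for the result later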
zaher, ideally you should use Celery as giorgosp mentioned, but if it is mandatory to return the result in the API response then you can use Gunicorn:
gunicorn --workers 3 -b localhost:8000 main:app --reload
In the above command I have specified 3 workers, so you can serve/process 3 requests at a time.
Ideally the number of workers should be
cpu_count * 2 + 1
You can use any port number you like, but make sure that it is above 1024 and it's not used by any other program.
The main:app option tells Gunicorn to invoke the application object app available in the file main.py.
Gunicorn provides an optional --reload switch that tells Gunicorn to detect any code changes on the fly. This way you can change your code without having to restart Gunicorn.
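For reference, a minimal main.py that the command above could load might look like this (the resource and route are just placeholders):

# main.py -- Gunicorn imports this module and looks up the "app" object
import falcon

class HealthResource:
    def on_get(self, req, resp):
        resp.media = {'status': 'ok'}

app = falcon.App()   # falcon.API() on Falcon versions before 3.0
app.add_route('/health', HealthResource())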
And if this approach is not suitable for your needs, then I think you should use Tornado instead of Falcon.
Let me know if any further clarification needed.
This can be easily achieved by coupling Falcon with Gunicorn. With Gunicorn, achieving multi-threading/multi-processing is relatively easy without needing to implement Celery (although nothing is stopping one from implementing it. Celery is awesome!).
gunicorn -b localhost:8000 main:app --threads 3 --workers 3 --reload
The above command will spin up 3 workers, each worker having 3 threads. As a developer you can tweak the number of workers and threads required. I would strongly advise understanding the difference between multithreading and multiprocessing before tweaking these settings.
I'm coming from the PHP/Apache world, where running an application is super easy. Whenever a PHP application crashes, the Apache process running that request will stop, but the server will still be running happily and responding to other clients. Is there a way to have a Python application work in a similar way? How would I set up a WSGI server like Tornado or CherryPy so it works similarly? Also, how would I run several applications with different domains from one server?
What you are after would possibly happen anyway with WSGI servers. This is because any Python exception only affects the current request; the framework or WSGI server catches the exception, logs it and translates it into an HTTP 500 status page. The application would still be in memory and would continue to handle future requests.
What this comes down to is what exactly you mean by 'crashes the Apache process'.
It would be rare for your code to crash in the sense of making the whole process exit, for example due to a core dump. So you may be confusing terminology by equating an application-level language error with a full process crash.
Even if you did find a way to crash a process, Apache/mod_wsgi handles that okay and the process will be replaced. The Gunicorn WSGI server will also do that. CherryPy will not, unless you have a process manager running which monitors it and restarts it. Tornado in its single-process mode has the same problem. Using Tornado as the worker in Gunicorn is one way around that, and I believe Tornado itself may now have a process manager for running multiple processes, which allows it to restart processes if they die.
Do note that if the application bug which caused the Python exception is bad enough to corrupt state within the process, subsequent requests may have issues. This is the one difference from PHP. With PHP, after any request, whether successful or not, the application is effectively thrown away and doesn't persist, so buggy code cannot affect subsequent requests. In Python, because the process keeps its loaded code and state between requests, you could technically get into a state where you have to restart the process to fix it. I don't know of any WSGI server, though, that has a mechanism to automatically restart a process if a request returned an error response.
If you're in a UNIX-like environment, you can run mod_wsgi under Apache in daemon mode. This means there will be a separate process for the Python code, and even if it crashes the server will continue running normally (and hopefully the WSGI process will restart itself). A WSGI application can run in multiple processes, with multiple threads per process.
As for running multiple domains in the same server, check Name-Based Virtual Hosts.
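A rough sketch of how the two combine in the Apache configuration (hostnames, paths and process/thread counts are illustrative):

<VirtualHost *:80>
    ServerName app1.example.com
    # daemon mode: the app runs in its own processes, separate from Apache's workers
    WSGIDaemonProcess app1 processes=2 threads=15
    WSGIProcessGroup app1
    WSGIScriptAlias / /var/www/app1/app1.wsgi
</VirtualHost>

<VirtualHost *:80>
    ServerName app2.example.com
    WSGIDaemonProcess app2 processes=2 threads=15
    WSGIProcessGroup app2
    WSGIScriptAlias / /var/www/app2/app2.wsgi
</VirtualHost>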