How to trigger clean shutdown of FastAPI/Uvicorn

How to trigger clean shutdown of FastAPI/Uvicorn - python

I am running a number of FastAPI instances with uvicorn with python's subprocess.Popen. I have a small GUI made with PySimpleGUI with which I want to be able to close servers and restart them at will.
The first problem I encountered is that, at least in Windows, starting the uvicorn server appear to create not one, but two, new processes, and calling Popen.terminate() only closes one of these processes, which does not free up the port associated with the server. I fixed this problem using the psutil package to check what new processes have been created after I instantiate a Popen object, and track and terminate the second process with psutil.
What is still a major problem, is that calling psutil.terminate() on the process does not call the FastAPI function under #app.on_event("shutdown"). In the past, we have run all of our servers in individual terminal windows, and find that ctrl-c on those terminal windows will call the shutdown event, but I have found no other way of doing so. ctrl-c on my interface will obviously take down the interface and all the servers, and is somewhat unreliable in hitting the shutdown events for all the servers. My other idea was use psutil.send_signal(signal.CTRL_C_EVENT), but this has the same effect as calling ctrl-c in terminal.
So I am at a loss. I have seen multiple posts around saying that this is a general shortcoming of uvicorn, but have not seen anything that directly confirms my own experience or offers a solution. I also know that the "shutdown" and "startup" events in FastAPI are ported in from Starlette, and are not very well documented in either package. I have seen suggestions to use guvicorn, but my brief look into that confirmed that it is not compatible with windows. Any suggestions?

TL;DR:
APIs are meant to be long-running processes
there is a whole industry around virtualizing to manage the orchestration automatically of when to start or stop a service
there is also "serverless" infrastructure you can hang any of your processes with you not having to spend any effort in this field as it is not meant to be a thing
If you still want to go against everyone else and do manage it your self you can do as this answered question
##### SOLUTION #####
pid = proc.pid
parent = psutil.Process(pid)
for child in parent.children(recursive=True):
child.kill()
##### SOLUTION END ####
A bit of explanation:
From the conception of Rest API as an architecture pattern it was meant to be awaiting always for user's requests coming over the web. it has never been the general intent to manage gracefully and develop a product to handle gracefully the shut down of something that "was meant to run forever" and we build processes to do work to keep it running 24/7/365 as an industry.
If you ever want to leverage the ability to start or stop one to many APIS simultaneously withing the same device is highly recommended d you at least go with something like containers and Kubernetes and just scripting commands against the CLI of Kubernetes for such purpose. In exchange for the extra effort you will gain process isolation from others and the base OS layer ( which will still be less effort than building all that tooling yourself on your own.
My personal favorite is not doing even that and going straight with lambdas as is way easier and better in so many ways. Don't take it from me but from one of the industry-leading companies, Cloudflare and their statements on the subject
Serverless computing offers a number of advantages over traditional cloud-based or server-centric infrastructure. For many developers, serverless architectures offer greater scalability, more flexibility, and quicker time to release, all at a reduced cost.

Related

Named Event in Python

In python, is there a cross platform way of creating something similar to Windows named Event in one process, and set it from another process to signal something to the first one?
My specific problem is that I need to create a process that on startup will check if any other instances of itself are running, and if so, signal them to quit. With Windows API I would use CreateEvent with the lpName parameter, and SetEvent.

I've spent about a day now searching for a good answer to this and here is what I am coming up with at this moment:
It is possible to use signals to indicate to the process that some change needs to take place, however in a more complex legacy codebase I am dealing with it causes the process to crash. Signaling interrupts various I/O processes and alike based on python signal docs. You can implement signal handler with signal.SIGUSR1
import signal
def signal_handler(signum, stack):
print('Signal %d received'%signum)
signal.signal(signal.SIGUSR1, signal_handler)
This code can be triggered in Linux et al. through:
$ kill -s SIGUSR1 $pid
I am presently leaning towards kazoo Python Zookeeper library. It requires to stand up Zookeeper as infrastructure.
I do have an additional need for toggling configuration values in my case. However Zookeeper supports a number of interprocessor communication tools that will serve your needs.
UPDATE:
I finally settled on a named pipe (FIFO), calling it inside a thread with readline.
if not os.path.exists(fifo_name):
os.mkfifo(fifo_name)
while True:
with open(fifo_name, 'r') as config_fifo:
line = config_fifo.readline()[:-1]
print(line)
I used tempfile.gettempdir() to find a good location to place the FIFO in the file system. It requires quite a bit of refinement however, since I did not care to parse passed content while you might. Also if you are planing on having more then one consumer of the event you are going to have it propagated to only one consumer as it is a queue.

It seems to me that this is not so much a question as to whether this is possible in Python, but whether such a cross-platform approach exists: if one does, then even if no directly written Python exists, one can always make system calls using subprocess.call() and the like.
As for whether it's a possibility, I can't profess to be much of an expert, but a bit of a search has thrown up these discussions which might prove helpful to you.

How to fork a process in python/django?

This is more of a Python general question however in a context of django.
For now I have this view in django which has to process a lot of data. Usually it takes the server (nginx with django running in proxy using) a couple of minutes to do it. Sometimes the server times out. I don't want to increase the time-out time in nginx. I realize that if I can fork a process in python in the django view so that the forked (child) process will do all the data crunching independently of the django view, then the view would be able to return the request to the user immediately (therefore never timing-out) and the child process would continue running in the background finishing up all the calculation.
So here is the question:
How can I fork an independent process in python (and if possible for the python code to be in the same file)? And if possible how can I assign a unix process priority level to it?
I looked at some of the ways of forking a process in python and it seems there are a few options. Which one is the best appropriate for this scenario?
Thank you.

the 'best practice' answer is to use a queue manager, typically RabbitMQ or any backend handled by Django-celery.
Still, there are a few lighter options that do spawn a new thread. what these options usually lack is some way to track progress, or keep the number of threads under control.
check Django-utils to see if it's enough. if not, go for Celery.

If you really want to fork and set priority, you can use os.fork and os.nice, but I think the multiprocessing module or Celery would be more applicable here.

How to run Python code using less than 1% of CPU?

I am developing some Python code for Windows. A criteria is that it will use less than 1% of CPU. I understand that it is impossible to guarantee this all the time due to things like garbage collection, but what would be the best practice to get as close as possible. My current solution is to spread a lot of time.sleep(0.1) around the code, especially in loops. There are, however, obvious problems with this approach.
What other approaches could be taken?
I should also mention that the application has lots of threads in it using the threading library.
EDIT: Setting the process priority is not what I am after.

It is the job of the operating system to schedule CPU time. Use your operating system's built-in process-limits mechanisms (hopefully they exist on Windows) to restrict your process to <1% CPU.
This style of sprinkling unnecessary sleeps every few lines in the code will make the code terrible to create and extend and maintain, not to mention incredibly inelegant. (Rate-limiting yourself may be useful in very small, limited, critical sections -- for example your program is queuing lots of IO requests and you don't wish to inundate the operating system, you might wish to put a single sleep-until-[condition] in each critical loop which has the potential to inundate the system, but otherwise use extremely sparingly.)
Ideally you would call an API to the appropriate OS mechanisms from within your program when you start up, telling the OS to throttle you appropriately.

If the goal is to not bother the user then "below 1% CPU" is the wrong approach. What you really want is "don't take time away from other processes but still complete as fast as possible" - that's what "below normal" process priority is for. See http://code.activestate.com/recipes/496767-set-process-priority-in-windows/ for an example of how process priority can be changed for the current process (calling that function with default parameters will do).
For the sales pitch you can show the task manager while the computer is idle ("See? 99%, my application gets lots of work done") and then start some CPU-intensive application ("Almost all CPU time is spent in the application the user is working with, my application simply went into background").

If the box used for the demonstration is a Windows Server, it can use Windows System Resource Manager for restricting CPU usage below the desired threshold. Trying to force this behavior by code is impossible, unless a Windows API exposes this capability explicitly.

Async spawing of processes: design question - Celery or Twisted

All: I'm seeking input/guidance/and design ideas. My goal is to find a lean but reliable way to take XML payload from an HTTP POST (no problems with this part), parse it, and spawn a relatively long-lived process asynchronously.
The spawned process is CPU intensive and will last for roughly three minutes. I don't expect much load at first, but there's a definite possibility that I will need to scale this out horizontally across servers as traffic hopefully increases.
I really like the Celery/Django stack for this use: it's very intuitive and has all of the built-in framework to accomplish exactly what I need. I started down that path with zeal, but I soon found my little 512MB RAM cloud server had only 100MB of free memory and I started sensing that I was headed for trouble once I went live with all of my processes running full-tilt. Also, it's got several moving parts: RabbitMQ, MySQL, cerleryd, ligthttpd and the django container.
I can absolutely increase the size of my server, but I'm hoping to keep my costs down to a minimum at this early phase of this project.
As an alternative, I'm considering using twisted for the process management, as well as perspective broker for the remote systems, should they be needed. But for me at least, while twisted is brilliant, I feel like I'm signing up for a lot going down that path: writing protocols, callback management, keeping track of job states, etc. The benefits here are pretty obvious - excellent performance, far fewer moving parts, and a smaller memory footprint (note: I need to verify the memory part). I'm heavily skewed toward Python for this - it's much more enjoyable for me than the alternatives :)
I'd greatly appreciate any perspective on this. I'm concerned about starting things off on the wrong track, and redoing this later with production traffic will be painful.
-Matt

On my system, RabbitMQ running with pretty reasonable defaults is using about 2MB of RAM. Celeryd uses a bit more, but not an excessive amount.
In my opinion, the overhead of RabbitMQ and celery are pretty much negligible compared to the rest of the stack. If you're processing jobs that are going to take several minutes to complete, those jobs are what will overwhelm your 512MB server as soon as your traffic increases, not RabbitMQ. Starting off with RabbitMQ and Celery will at least set you up nicely to scale those jobs out horizontally though, so you're definitely on the right track there.
Sure, you could write your own job control in Twisted, but I don't see it gaining you much. Twisted has pretty good performance, but I wouldn't expect it to outperform RabbitMQ by enough to justify the time and potential for introducing bugs and architectural limitations. Mostly, it just seems like the wrong spot to worry about optimizing. Take the time that you would've spent re-writing RabbitMQ and work on reducing those three minute jobs by 20% or something. Or just spend an extra $20/month and double your capacity.

I'll answer this question as though I was the one doing the project and hopefully that might give you some insight.
I'm working on a project that will require the use of a queue, a web server for the public facing web application and several job clients.
The idea is to have the web server continuously running (no need for a very powerful machine here). However, the work is handled by these job clients which are more powerful machines that can be started and stopped at will. The job queue will also reside on the same machine as the web application. When a job gets inserted into the queue, a process that starts the job clients will kick into action and spin the first client. Using a load balancer that can start new servers as the load increases, I don't have to bother about managing the number of servers running to process jobs in the queue. If there are no jobs in the queue after a while, all job clients can be terminated.
I will suggest using a setup similar to this. You don't want job execution to affect the performance of your web application.

I Add, quite late another possibility: using Redis.
Currently I using redis with twisted : I distribute work to worker. They perform work and return result asynchronously.
The "List" type is very useful :
http://www.redis.io/commands/rpoplpush
So you can use the Reliable queue Pattern to send work and having a process that block/wait until he have a new work to do(a new message coming in queue.
you can use several worker on the same queue.
Redis have a low memory foot print but be careful of number of pending message , that will increase the memory that Redis use.

Python Job Service Daemon?

What packages should I look at for writing a python daemon and processing jobs? Also, what do I need to do for a python daemon?

I'm pretty happy with beanstalkd, which has client libraries available in various languages:
Daemon:
http://kr.github.com/beanstalkd/
Python client library:
http://code.google.com/p/pybeanstalk/

Your question is a bit ambiguous, but I'm assuming you mean you would like to write a python daemon that will process jobs that get thrown in a queue. If not, please say as much. :-)
I've heard a lot of great things about redis. The folks at github built resque as a job processing daemon for Ruby. If you're language flexible, you could just use that, but if you're not, you could emulate it in as much or as little depth as you like making use of redis as your queue system. Depending on how pluggable and extensible you need it to be, this could be a really simple thing to implement.
Another option I ran across after some more googling is redqueue. It looks like it might already implement most of a job queue.
If you're using django, you may wish to consider the Celery project. It's a job queue system based on RabbitMQ which is yet another queuing server with excellent reviews.
As far as creating a daemon in python, there are a number of options. You can look at this page on activestate, which is a good start. Better yet, you can use python-daemon to do it all for you. But if you use one of the above options or beanstalkd as recommended by mczepiel, you probably won't have to make your process run as a daemon.

I have recently (this week) implemented a queue in RabbitMQ with a python daemon extracting the information and storing it on a database (using Django ORM). The daemon has a intermediate buffer so it will wait a little and write in the database in batches, instead of writing each time a little message arrives.
I've made the integration with the queue using this little flopsy module, which is easy to set up. The only problem I've got it to be able to set up a timeout for waiting a message, as the module has not a clear way of doing that. After a while playing with the interactive shell and making a few dir(), I manage to get to the socket object and set up the timeout.
I considered also Celery, but seems to be more focused on using internally a RabbitMQ to allow you to launch tasks (periodically or asynchronously), more that using a queue to communicating with other systems. In our case, the queue can be feed both by Python systems and Ruby ones.
Once I've completed the process, I've made some adjustments to allow running it as a daemon (mostly storing the standard output to a file to allow easy logging) and then create a bash script that launch a start-stop-daemon command. I've followed more or less this schema
I discovered python-daemon just about one day late, so after the work is done it makes no sense revisiting it, but maybe it makes more sense for a Python project.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.