Something that I just thought about:
Say I'm writing view code for my Django site, and I make a mistake and create an infinite loop.
Whenever someone would try to access the view, the worker assigned to the request (be it a Gevent worker or a Python thread) would stay in a loop indefinitely.
If I understand correctly, the server would send a timeout error to the client after 30 seconds. But what will happen with the Python worker? Will it keep on working indefinitely? That sounds dangerous!
Imagine I've got a server in which I've allocated 10 workers. I let it run and at some point, a client tries to access the view with the infinite loop. A worker will be assigned to it, and will be effectively dead until the next server restart. The dangerous thing is that at first I wouldn't notice it, because the site would just be imperceptibly slower, having 9 workers instead of 10. But then it might happen again and again throughout a long span of time, maybe months. The site would just get progressively slower, until eventually it would be really slow with just one worker.
A server restart would solve the problem, but I'd hate to have my site's functionality depend on server restarts.
Is this a real problem that happens? Is there a way to avoid it?
Update: I'd also really appreciate a way to take a stacktrace of the thread/worker that's stuck in an infinite loop, so I could have that emailed to me so I'll be aware of the problem. (I don't know how to do this because there is no exception being raised.)
Update to people saying things to the effect of "Avoid writing code that has infinite loops": In case it wasn't obvious, I do not spend my free time intentionally putting infinite loops into my code. When these things happen, they are mistakes, and mistakes can be minimized but never completely avoided. I want to know that even when I make a mistake, there'll be a safety net that will notify me and allow me to fix the problem.
It is a real problem. In case of gevent, due to context switching, it can even immediately stop your website from responding.
Everything depends on your environment. For example, when running django in production through uwsgi you can set harakiri - that is time in seconds, after which thread handling the request will be killed if it didn't finish handling the response. It is strongly recommended to set such a value in order to deal with some faulty requests or bad code. Such event is reported in uwsgi log. I believe other solutions for running Django in production have similar options.
Otherwise, due to network architecture, client disconnection will not stop the infinite loop, and by default there will be no response at all - just infinite loading. Various timeout options (one of which harakiri is) may end up showing connection timeout - for example, php has (as far as i remember) default timeout of 30 seconds and it will return 504 gateway timeout. Socket disconnection timeout depends on http server settings and it will not stop application thread, it will only close client socket.
If not using gevent (or any other green threads), infinite loop will tend to take up 100% of available CPU power (limited to one core), possibly eating up more and more memory, so your website will work pretty slow and/or timeout really quick. Django itself is not aware of request time, so - as mentioned before - your production environment stack is the way to prevent this from happening. In case of uwsgi, http://uwsgi-docs.readthedocs.org/en/latest/Options.html#harakiri-verbose is the way to go.
Harakiri does print stack trace of the killed proces: (https://uwsgi-docs.readthedocs.org/en/latest/Tracebacker.html?highlight=harakiri) straight to uwsgi log, and due to alarm system you can get notified through e-mail (http://uwsgi-docs.readthedocs.org/en/latest/AlarmSubsystem.html)
I just tested this on Django's development server.
Results:
Does not give a timeout after 30 seconds. (this might because its not a production server though)
Stays in loading until i close the page.
I guess one way to avoid it, without actually just avoiding a code like that, would be to use threading to have control of timeouts and be able to stop the thread.
Maybe something like:
import threading
from django.http import HttpResponse
class MyThread(threading.Thread):
def __init__(self):
threading.Thread.__init__(self)
def run(self):
print "your possible infinite loop code here"
def possible_loop_view(request):
thread = MyThread()
thread.start()
return HttpResponse("html response")
Yes, your analysis is correct. The worker thread/process will keep running. Moreover, if there is no wait/sleep in the loop, it will hog the CPU. Other threads/process will get very little cpu, resulting your entire site on slow response.
Also, I don't think server will send any timeout error to client explicitly. If the TCP timeout is set, TCP connection will be closed.
Client may also have some timeout setting to get response, which may come into picture.
Avoiding such code is best way to avoid such code. You can also have some monitoring tool on server to look for CPU/memory usage and notify for abnormal activity so that you can take action.
Related
I'm using grpc.aio.server and I am stuck with a problem that if I try to make a load test on my service it will have some requests lagging for 10 seconds, but the requests are similar. The load is stable (200rps) and the latency of almost all requests are almost the same. I'm ok with higher latency, as long as it's stable. I've tried to google something like async task priority, but in my mind it means that something is wrong with the priority of tasks which wait a very long time, but they're finished or the full request task is waiting to start for a long time.
e.g 1000 requests were sent to the gRPC service, they all have the same logic to execute, the same db instance, the same query to the db, the same time to get results from db, etc, everything is the same. I see that e.g. 10th request latency is 10 seconds, but 13th request latency is 5 seconds. I can also see in logs, that the db queries have almost the same execution time.
Any suggestions? maybe I understand something wrong
There are multiple reasons why this behaviour may happen. Here are a few things that you can take a look at:
What type of workload do you have? Is it I/O bound or CPU bound?
Does your code block the event loop at some point? The path for each request is fully asynchronous? The doc states pretty clear that blocking the event loop is costly:
Blocking (CPU-bound) code should not be called directly. For example, if a function performs a CPU-intensive calculation for 1 second, all concurrent asyncio Tasks and IO operations would be delayed by 1 second.
What happens with the memory when you see those big latencies? You can run a memory profiling using this tool and check the memory. There are high chances to see a correlation between the latency and an intense activity of the Python Memory Manager which tries to reclaim the memory. Here is a nice article around memory management that you can check out.
Every server will have latency differences between requests. However, the scale should be much lower then what you experience
Your question does not have the server initialization code so can't know what config is used. I would start by looking at the thread pool size for the server.
According to the docs the thread pool instance is a required argument so maybe try to set a different pool size. My guess is that the threads are exhausted and then the latency goes up since the request is waiting for a thread to free up
I'm using Python http.client.HTTPResponse.read() to read data from a stream. That is, the server keeps the connection open forever and sends data periodically as it becomes available. There is no expected length of response. In particular, I'm getting Tweets through the Twitter Streaming API.
To accomplish this, I repeatedly call http.client.HTTPResponse.read(1) to get the response, one byte at a time. The problem is that the program will hang on that line if there is no data to read, which there isn't for large periods of time (when no Tweets are coming in).
I'm looking for a method that will get a single byte of the HTTP response, if available, but that will fail instantly if there is no data to read.
I've read that you can set a timeout when the connection is created, but setting a timeout on the connection defeats the whole purpose of leaving it open for a long time waiting for data to come in. I don't want to set a timeout, I want to read data if there is data to be read, or fail if there is not, without waiting at all.
I'd like to do this with what I have now (using http.client), but if it's absolutely necessary that I use a different library to do this, then so be it. I'm trying to write this entirely myself, so suggesting that I use someone else's already-written Twitter API for Python is not what I'm looking for.
This code gets the response, it runs in a separate thread from the main one:
while True:
try:
readByte = dc.request.read(1)
except:
readByte = []
if len(byte) != 0:
dc.responseLock.acquire()
dc.response = dc.response + chr(byte[0])
dc.responseLock.release()
Note that the request is stored in dc.request and the response in dc.response, these are created elsewhere. dc.responseLock is a Lock that prevents dc.response from being accessed by multiple threads at once.
With this running on a separate thread, the main thread can then get dc.response, which contains the entire response received so far. New data is added to dc.response as it comes in without blocking the main thread.
This works perfectly when it's running, but I run into a problem when I want it to stop. I changed my while statement to while not dc.twitterAbort, so that when I want to abort this thread I just set dc.twitterAbort to True, and the thread will stop.
But it doesn't. This thread remains for a very long time afterward, stuck on the dc.request.read(1) part. There must be some sort of timeout, because it does eventually get back to the while statement and stop the thread, but it takes around 10 seconds for that to happen.
How can I get my thread to stop immediately when I want it to, if it's stuck on the call to read()?
Again, this method is working to get Tweets, the problem is only in getting it to stop. If I'm going about this entirely the wrong way, feel free to point me in the right direction. I'm new to Python, so I may be overlooking some easier way of going about this.
Your idea is not new, there are OS mechanisms(*) for making sure that an application is only calling I/O-related system calls when they are guaranteed to be not blocking . These mechanisms are usually used by async I/O frameworks, such as tornado or gevent. Use one of those, and you will find it very easy to run code "while" your application is waiting for an I/O event, such as waiting for incoming data on a socket.
If you use gevent's monkey-patching method, you can proceed using http.client, as requested. You just need to get used to the cooperative scheduling paradigm introduced by gevent/greenlets, in which your execution flow "jumps" between sub-routines.
Of course you can also perform blocking I/O in another thread (like you did), so that it does not affect the responsiveness of your main thread. Regarding your "How can I get my thread to stop immediately" problem:
Forcing a thread that's blocking in a system call to stop is usually not a clean or even valid process (also see Is there any way to kill a Thread in Python?). Either -- if your application has finished its jobs -- you take down the entire process, which also affects all contained threads, or you just leave the thread be and give it as much time to terminate as required (these 10 seconds you were referring to are not a problem -- are they?)
If you do not want to have such long-blocking system calls anywhere in your application (be it in the main thread or not), then use above-mentioned techniques to prevent blocking system calls.
(*) see e.g. O_NONBLOCK option in http://man7.org/linux/man-pages/man2/open.2.html
I'm running a Django application on Apache with mod_wsgi. Will there be any downtime during an upgrade?
Mod_wsgi is running in daemon mode, so I can reload my code by touching the .wsgi script file, as described in the "ReloadingSourceCode" document: http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode. Presumably, that reload requires some non-zero amount of time. What happens if a request comes in during the reload? Will Apache queue the request and then complete it once the wsgi daemon is ready?
The documentation includes the following statement:
So, if you are using Django in daemon mode and needed to change your 'settings.py' file, once you have made the required change, also touch the script file containing the WSGI application entry point. Having done that, on the next request the process will be restarted and your Django application reloaded.
To me, that suggests that Apache will gracefully handle every request, but I thought I would ask to be sure. My app isn't critical (a little downtime wouldn't be disastrous) so the question is mostly academic.
Thank you.
In daemon mode there is no concept of a graceful restart when WSGI script file is touched to force a download. That is, unlike Apache itself, which will start new Apache server child processes while waiting for old processes to finish up with current requests, for mod_wsgi daemon processes, the existing process must exit before a new one starts up.
The consequences of this are that mod_wsgi can't wait indefinitely for current requests to complete. If it did, then there is a risk that if all daemon processes are tied up waiting for current requests to finish, that clients would see a noticeable delay in being handled.
At the other end of the scale however, the daemon process can't be immediately killed as that would cause current requests to be interrupted.
A middle ground therefore exists. The daemon process will wait for requests to finish before exiting, but if they haven't completed within the shutdown period, then the daemon process will be forcibly quit and the active requests will be interrupted.
The period of this shutdown timeout defaults to 5 seconds. It can be overridden using the shutdown-timeout option to WSGIDaemonProcess directive, but due consideration should be given to the effects of changing it.
Thus, in respect of this specific issue, if you have long running requests still active when the first request comes in after you touched the WSGI script file, there is the risk that the active long requests will be interrupted.
The next notable thing you may see is that even if there are no long running requests and processes shutdown promptly, then it is still necessary to load up the WSGI application again within the new process. The time this takes will be seen as a delay in handling the request. How big that delay is will depend on the framework and your application. The worst offender as far as time taken to start up that I know of is TurboGears. Django somewhat better and the best as far as quick start up times being lightweight micro frameworks such as Flask.
Do note that any new requests which come in while these shutdown and startup delays occur should not be lost. This is because the HTTP listener socket has a certain depth and connections queue up in that waiting to be accepted. If the number of requests arriving is huge though and that queue fills up, then you will start to see connection refused errors in the browser.
No, there will be no downtime. Requests using the old code will complete, and new requests will use the new code.
There will be a small bit more load on the server as the new code loads but unless your application is colossal and your servers are already nearly overloaded this will be unnoticeable.
This is like the apachectl graceful command for Apache as a whole, which tells it to start a new configuration without downtime.
I'm using threads and xmlrpclib in python at the same time. Periodically, I create a bunch of thread to complete a service on a remote server via xmlrpclib. The problem is that, there are times that the remote server doesn't answer. This causes the thread to wait forever for a response which it never gets. Over time, number of threads in this state increases and will reach the maximum number of allowed threads on the system (I'm using fedora).
I tried to use socket.setdefaulttimeout(10); but the exception that is created by that will cause the server to defunct. I used it at server side but it seems that it doesn't work :/
Any idea how can I handle this issue?
You are doing what I usually call (originally in Spanish xD) "happy road programming". You should implement your programs to handle undesired cases, not only the ones you want to happen.
The threads here are only showing an underlying mistake: your server can't handle a timeout, and the implementation is rigid in a way that adding a timeout causes the server to crash due to an unhandled exception.
Implement it more robustly: it must be able to withstand an exception, servers can't die because of a misbehaving client. If you don't fix this kind of problem now, you may have similar issues later on.
It seems like your real problem is that the server hangs on certain requests, and dies if the client closes the socket - the threads are just a side effect of the implementation. If I'm understanding what you're saying correctly, then the only way to fix this would be to fix the server to respond to all requests, or to be more robust with network failure, or (preferably) both.
I'm using CherryPy in order to serve a python application through WSGI.
I tried benchmarking it, but it seems as if CherryPy can only handle exactly 10 req/sec. No matter what I do.
Built a simple app with a 3 second pause, in order to accurately determine what is going on... and I can confirm that the 10 req/sec has nothing to do with the resources used by the python script.
__
Any ideas?
By default, CherryPy's builtin HTTP server will use a thread pool with 10 threads. If you are still using the defaults, you could try increasing this in your config file.
[global]
server.thread_pool = 30
See the cpserver documentation
Or the archive.org copy of the old documentation
This was extremely confounding for me too. The documentation says that CherryPy will automatically scale its thread pool based on observed load. But my experience is that it will not. If you have tasks which might take a while and may also use hardly any CPU in the mean time, then you will need to estimate a thread_pool size based on your expected load and target response time.
For instance, if the average request will take 1.5 seconds to process and you want to handle 50 requests per second, then you will need 75 threads in your thread_pool to handle your expectations.
In my case, I delegated the heavy lifting out to other processes via the multiprocessing module. This leaves the main CherryPy process and threads at idle. However, the CherryPy threads will still be blocked awaiting output from the delegated multiprocessing processes. For this reason, the server needs enough threads in the thread_pool to have available threads for new requests.
My initial thinking is that the thread_pool would not need to be larger than the multiprocessing pool worker size. But this turns out also to be a misled assumption. Somehow, the CherryPy threads will remain blocked even where there is available capacity in the multiprocessing pool.
Another misled assumption is that the blocking and poor performance have something to do with the Python GIL. It does not. In my case I was already farming out the work via multiprocessing and still needed a thread_pool sized on the average time and desired requests per second target. Raising the thread_pool size addressed the issue. Although it looks like and incorrect fix.
Simple fix for me:
cherrypy.config.update({
'server.thread_pool': 100
})
Your client needs to actually READ the server's response. Otherwise the socket/thread will stay open/running until timeout and garbage collected.
use a client that behaves correctly and you'll see that your server will behave too.