Currently I have a basic HTTP server set up using BaseHTTPRequestHandler, and I handle requests in its do_GET method. I'd like a function check to be invoked if no request comes in for 5 seconds.
I'm considering using multiprocessing together with the time module for this, but I'm concerned about its reliability. Are there any suggestions or best practices for doing this?
Thanks.
[EDIT]
Marjin's solution is really cool but I end up with the following traceback :-
Traceback (most recent call last):
  File "test.py", line 89, in <module>
    main()
  File "test.py", line 83, in main
    server.serve_forever()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/SocketServer.py", line 224, in serve_forever
    r, w, e = select.select([self], [], [], poll_interval)
select.error: (4, 'Interrupted system call')
[EDIT 2]
I tried it on Python 2.7 but the error still occurs.
[EDIT 3]
Traceback (most recent call last):
  File "test.py", line 90, in <module>
    main()
  File "test.py", line 84, in main
    server.serve_forever()
  File "/usr/local/lib/python2.7/SocketServer.py", line 225, in serve_forever
    r, w, e = select.select([self], [], [], poll_interval)
select.error: (4, 'Interrupted system call')
For a simple server such as one based on BaseHTTPRequestHandler you could use a signal handler:
import time
import signal
import sys

last_request = sys.maxint  # arbitrarily high value so the check does *not* trigger until there has been at least 1 request

def itimer_handler(signum, frame):
    print 'itimer heartbeat'
    if time.time() - last_request > 300:  # at least 5 minutes have passed with no request
        # do stuff now to log, kill, restart, etc.
        print 'Timeout, no requests for 5 minutes!'

signal.signal(signal.SIGALRM, itimer_handler)
signal.setitimer(signal.ITIMER_REAL, 30, 30)  # check for a timeout every 30 seconds

# ...

    def do_GET(self):
        global last_request
        last_request = time.time()  # reset the timer again
The signal.setitimer() call causes the OS to send a periodic SIGALRM signal to our process. This isn't too precise; the setitimer() call is set for 30-second intervals. Any incoming request resets a global timestamp, and the itimer_handler, called every 30 seconds, checks whether 5 minutes have passed since the timestamp was last set.
The SIGALRM signal will interrupt a running request as well, so whatever you do in that handler needs to finish quickly. When the function returns the normal python code flow resumes, just like a thread.
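The idea above can also be packaged as a small object, which makes the idle check testable without waiting for real signals. This is only a sketch in Python 3 (the answer's code is Python 2); the IdleWatchdog class, its method names, and the injectable clock parameter are hypothetical names chosen for illustration, and install() assumes a Unix system and the main thread.

```python
import signal
import time


class IdleWatchdog:
    """Calls on_idle when no request has been seen for idle_limit seconds."""

    def __init__(self, idle_limit, on_idle, clock=time.monotonic):
        self.idle_limit = idle_limit
        self.on_idle = on_idle
        self.clock = clock
        self.last_request = None  # stays None until the first request arrives

    def touch(self):
        """Record a request; call this from do_GET."""
        self.last_request = self.clock()

    def check(self, signum=None, frame=None):
        """Usable as a signal handler: fire on_idle if the limit was exceeded."""
        if self.last_request is None:
            return False  # no request yet, so nothing to time out
        if self.clock() - self.last_request > self.idle_limit:
            self.on_idle()
            return True
        return False

    def install(self, interval):
        """Ask the OS for a SIGALRM every `interval` seconds (Unix, main thread)."""
        signal.signal(signal.SIGALRM, self.check)
        signal.setitimer(signal.ITIMER_REAL, interval, interval)
```

A handler would call watchdog.touch() at the top of do_GET, and the program would call watchdog.install(1) once before serve_forever(). As with the original code, on_idle fires on every tick for as long as the server stays idle.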
Note that this requires at least Python 2.7.4 to work; see issue 7978. 2.7.4 is not yet released, so you can either download the SocketServer.py file that will be included in Python 2.7.4, or apply the following backport to add the errno.EINTR handling introduced in that version:
'''Backport of 2.7.4 EINTR handling'''

import errno
import select
import SocketServer

def _eintr_retry(func, *args):
    """restart a system call interrupted by EINTR"""
    while True:
        try:
            return func(*args)
        except (OSError, select.error) as e:
            if e.args[0] != errno.EINTR:
                raise

def serve_forever(self, poll_interval=0.5):
    """Handle one request at a time until shutdown.

    Polls for shutdown every poll_interval seconds. Ignores
    self.timeout. If you need to do periodic tasks, do them in
    another thread.
    """
    self._BaseServer__is_shut_down.clear()
    try:
        while not self._BaseServer__shutdown_request:
            # XXX: Consider using another file descriptor or
            # connecting to the socket to wake this up instead of
            # polling. Polling reduces our responsiveness to a
            # shutdown request and wastes cpu at all other times.
            r, w, e = _eintr_retry(select.select, [self], [], [],
                                   poll_interval)
            if self in r:
                self._handle_request_noblock()
    finally:
        self._BaseServer__shutdown_request = False
        self._BaseServer__is_shut_down.set()

def handle_request(self):
    """Handle one request, possibly blocking.

    Respects self.timeout.
    """
    # Support people who used socket.settimeout() to escape
    # handle_request before self.timeout was available.
    timeout = self.socket.gettimeout()
    if timeout is None:
        timeout = self.timeout
    elif self.timeout is not None:
        timeout = min(timeout, self.timeout)
    fd_sets = _eintr_retry(select.select, [self], [], [], timeout)
    if not fd_sets[0]:
        self.handle_timeout()
        return
    self._handle_request_noblock()

# patch in updated methods
SocketServer.BaseServer.serve_forever = serve_forever
SocketServer.BaseServer.handle_request = handle_request
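For readers on Python 3, the retry pattern used by _eintr_retry can be sketched as below. Since Python 3.5 (PEP 475) the standard library already retries most system calls interrupted by a signal, so a helper like this is mainly useful for illustration or for code that must also run on older interpreters. The name retry_on_eintr is chosen here for the example and is not part of any library.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func, restarting it whenever it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except OSError as exc:
            # In Python 3, select.error and IOError are aliases of OSError,
            # and an EINTR failure is raised as InterruptedError.
            if exc.errno != errno.EINTR:
                raise  # any other error is a real failure
```

It would be used as r, w, e = retry_on_eintr(select.select, [sock], [], [], 0.5).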
I have a place in my code where I made a mistake in the name of the key of a dict. It took some time to understand why the code was not running past that place because a traceback was not thrown.
The code is below, I put it for completeness, highlighting with →→→ the place where the issue is:
class Alert:
    lock = threading.Lock()
    sent_alerts = {}

    @staticmethod
    def start_alert_listener():
        # load existing alerts to keep persistence
        try:
            with open("sent_alerts.json") as f:
                json.load(f)
        except FileNotFoundError:
            # there is no file, never mind - it will be created at some point
            pass
        # start the listener
        log.info("starting alert listener")
        client = paho.mqtt.client.Client()
        client.on_connect = Alert.mqtt_connection_alert
        client.on_message = Alert.alert
        client.connect("mqtt.XXXX", 1883, 60)
        client.loop_forever()

    @staticmethod
    def mqtt_connection_alert(client, userdata, flags, rc):
        if rc:
            log.critical(f"error connecting to MQTT: {rc}")
            sys.exit()
        topic = "monitor/+/state"
        client.subscribe(topic)
        log.info(f"subscribed alert to {topic}")

    @staticmethod
    def alert(client, userdata, msg):
        event = json.loads(msg.payload)
        log.debug(f"received alert: {event}")
→→→     if event['ok']:
            # remove existing sent flag, not thread safe!
            with Alert.lock:
                Alert.sent_alerts.pop(msg['id'], None)
            return
        (...)
The log coming from the line just above is
2021-01-14 22:03:02,617 [monitor] DEBUG received alert: {'full_target_name': 'ThisAlwaysFails → a failure', 'isOK': False, 'why': 'explicit fail', 'extra': None, 'id': '6507a61c9688199a34cb006b354c8433', 'last': '2021-01-14T22:03:02.612912+01:00', 'last_ko': '2021-01-14T22:03:02.612912+01:00'}
This is the dict in which I am erroneously trying to access ok, which should raise an exception and produce a traceback. But nothing happens. The code does not go any further than that, as if the error were silently discarded (and the method silently fails).
I tried to put a raise Exception("wazaa") between the log.debug() and the if - same thing, the method fails at that point but an exception is not raised.
I am at a loss as to why an exception would not be visible through a traceback.
The alert() method is called in a separate thread, if this matters. For completeness, I tried the following code just to make sure threading does not interfere, and it does not (I do not see a reason why it should):
import threading

class Hello:
    @staticmethod
    def a():
        raise Exception("I am an exception")

threading.Thread(target=Hello.a).start()
outputs
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Python38\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Python38\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:/Users/yop/AppData/Roaming/JetBrains/PyCharm2020.3/scratches/scratch_1.py", line 7, in a
    raise Exception("I am an exception")
Exception: I am an exception
It appears to call your callback within a try, then logs that error:
try:
    self.on_message(self, self._userdata, message)
except Exception as err:
    self._easy_log(
        MQTT_LOG_ERR, 'Caught exception in on_message: %s', err)
    if not self.suppress_exceptions:
        raise
What I can't explain however is why the exception isn't being raised. I can't see why self.suppress_exceptions would be true for you since you never set it, but try:
Manually setting suppress_exceptions using client.suppress_exceptions = False. This shouldn't be necessary since that appears to be the default, but it's worth a try.
Checking the log that it apparently maintains. You'll need to refer to the docs though on how to do that, since I've never touched this library before.
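Independently of how the library treats exceptions, a callback can log its own failures before they reach the caller. The following is a minimal sketch using only the standard library; the log_exceptions decorator, the "callbacks" logger name, and the simplified alert function (which takes the already-decoded event dict) are hypothetical and are not part of paho-mqtt.

```python
import functools
import logging

log = logging.getLogger("callbacks")


def log_exceptions(func):
    """Log the full traceback of any exception raised by func, then re-raise."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # log.exception records the message at ERROR level with the traceback
            log.exception("unhandled exception in %s", func.__name__)
            raise
    return wrapper


@log_exceptions
def alert(client, userdata, event):
    # A wrong key raises KeyError here; the decorator makes it visible in the log.
    return event["ok"]
```

With this in place the traceback ends up in the application log even if whatever calls the callback catches the exception and reports only its message.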
I have a client-server application using the envisage framework. I'm using threads to handle the connections; here is a snippet from the code:
....
SocketServer.TCPServer.allow_reuse_address = True
self.server = TCPFactory( ( HOST, PORT ), TCPRequestHandler, self.application)
self.server_thread = threading.Thread( target = self.server.serve_forever )
self.server_thread.setDaemon( True )
self.server_thread.start()
class TCPFactory( SocketServer.ThreadingTCPServer ):
    def __init__( self, server_address, RequestHandlerClass, application ):
        SocketServer.ThreadingTCPServer.__init__( self, server_address, RequestHandlerClass )
        self.application = application

class TCPRequestHandler( SocketServer.BaseRequestHandler ):
    """"""
    def setup( self ):
        .....
In the envisage framework I call the open_file( ) function, which opens a popup window, but when this window appears I receive the following error:
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/SocketServer.py", line 225, in serve_forever
    r, w, e = select.select([self], [], [], poll_interval)
error: (4, 'Interrupted system call')
How can I handle this error?
After Armin Rigo's comment, I modified SocketServer.py:
def serve_forever(self, poll_interval=0.5):
    """Handle one request at a time until shutdown.

    Polls for shutdown every poll_interval seconds. Ignores
    self.timeout. If you need to do periodic tasks, do them in
    another thread.
    """
    self.__is_shut_down.clear()
    try:
        while not self.__shutdown_request:
            # XXX: Consider using another file descriptor or
            # connecting to the socket to wake this up instead of
            # polling. Polling reduces our responsiveness to a
            # shutdown request and wastes cpu at all other times.
            try:
                r, w, e = select.select([self], [], [], poll_interval)
            except select.error as ex:
                #print ex
                if ex[0] == 4:
                    continue
                else:
                    raise
            if self in r:
                self._handle_request_noblock()
    finally:
        self.__shutdown_request = False
        self.__is_shut_down.set()
I just ran into a similar problem when I added a little httpd server to a program that receives various signals from other processes. After playing around I came up with a simple solution that avoids actually modifying stdlib code, but I'm thinking it's a little risky. I simply wrapped the serve_forever call in a loop that catches and ignores select.error:
def non_int_serve_forever(self, poll_interval=0.5):
    while 1:
        try:
            self.serve_forever(poll_interval=poll_interval)
            break
        except select.error:
            pass
This removes the risk of needing different solutions for different versions of SocketServer.py, but it's not obvious that serve_forever() should be restartable multiple times, even though it appears to work now.
Any thoughts?
I have code for reading a URL like this:
from urllib2 import Request, urlopen
req = Request(url)
for key, val in headers.items():
    req.add_header(key, val)
res = urlopen(req, timeout = timeout)
# This line blocks
content = res.read()
The timeout works for the urlopen() call. But then the code gets to the res.read() call where I want to read the response data and the timeout isn't applied there. So the read call may hang almost forever waiting for data from the server. The only solution I've found is to use a signal to interrupt the read() which is not suitable for me since I'm using threads.
What other options are there? Is there a HTTP library for Python that handles read timeouts? I've looked at httplib2 and requests and they seem to suffer the same issue as above. I don't want to write my own nonblocking network code using the socket module because I think there should already be a library for this.
Update: None of the solutions below are doing it for me. You can see for yourself that setting the socket or urlopen timeout has no effect when downloading a large file:
from urllib2 import urlopen
url = 'http://iso.linuxquestions.org/download/388/7163/http/se.releases.ubuntu.com/ubuntu-12.04.3-desktop-i386.iso'
c = urlopen(url)
c.read()
At least on Windows with Python 2.7.3, the timeouts are being completely ignored.
It's not possible for any library to do this without using some kind of asynchronous timer through threads or otherwise. The reason is that the timeout parameter used in httplib, urllib2 and other libraries sets the timeout on the underlying socket. And what this actually does is explained in the documentation.
SO_RCVTIMEO
Sets the timeout value that specifies the maximum amount of time an input function waits until it completes. It accepts a timeval structure with the number of seconds and microseconds specifying the limit on how long to wait for an input operation to complete. If a receive operation has blocked for this much time without receiving additional data, it shall return with a partial count or errno set to [EAGAIN] or [EWOULDBLOCK] if no data is received.
The bolded part is key. A socket.timeout is only raised if not a single byte has been received for the duration of the timeout window. In other words, this is a timeout between received bytes.
A simple function using threading.Timer could be as follows.
import httplib
import socket
import threading
def download(host, path, timeout = 10):
    content = None
    http = httplib.HTTPConnection(host)
    http.request('GET', path)
    response = http.getresponse()
    timer = threading.Timer(timeout, http.sock.shutdown, [socket.SHUT_RD])
    timer.start()
    try:
        content = response.read()
    except httplib.IncompleteRead:
        pass
    timer.cancel() # cancel on triggered Timer is safe
    http.close()
    return content
>>> host = 'releases.ubuntu.com'
>>> content = download(host, '/15.04/ubuntu-15.04-desktop-amd64.iso', 1)
>>> print content is None
True
>>> content = download(host, '/15.04/MD5SUMS', 1)
>>> print content is None
False
Other than checking for None, it's also possible to catch the httplib.IncompleteRead exception not inside the function, but outside of it. The latter case will not work though if the HTTP request doesn't have a Content-Length header.
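Because the socket timeout only bounds the gap between received bytes, another way to approximate a total limit is to read in chunks and check the elapsed time between chunks. The sketch below is in Python 3 and works with any object that has a read(n) method, such as an HTTP response; read_with_deadline, TotalTimeout, and the injectable clock parameter are hypothetical names for this example. Note the limitation: a single read(n) call can still block for up to the socket timeout, so the worst case is roughly the total limit plus one socket timeout.

```python
import time


class TotalTimeout(Exception):
    """Raised when the whole read took longer than the allowed total."""


def read_with_deadline(stream, total_timeout, chunk_size=8192, clock=time.monotonic):
    """Read stream to EOF, giving up once total_timeout seconds have elapsed."""
    deadline = clock() + total_timeout
    chunks = []
    while True:
        # The check happens between chunks, not during a blocking read.
        if clock() > deadline:
            raise TotalTimeout("read exceeded %.1f seconds" % total_timeout)
        chunk = stream.read(chunk_size)
        if not chunk:  # empty result means EOF
            return b"".join(chunks)
        chunks.append(chunk)
```

Unlike the Timer approach, this needs no extra thread, but it relies on a per-socket timeout being set so that each individual read returns eventually.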
I found in my tests (using the technique described here) that a timeout set in the urlopen() call also affects the read() call:
import urllib2 as u
c = u.urlopen('http://localhost/', timeout=5.0)
s = c.read(1<<20)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/usr/lib/python2.7/httplib.py", line 561, in read
    s = self.fp.read(amt)
  File "/usr/lib/python2.7/httplib.py", line 1298, in read
    return s + self._file.read(amt - len(s))
  File "/usr/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
socket.timeout: timed out
Maybe it's a feature of newer versions? I'm using Python 2.7 on a 12.04 Ubuntu straight out of the box.
One possible (imperfect) solution is to set the global socket timeout, explained in more detail here:
import socket
import urllib2
# timeout in seconds
socket.setdefaulttimeout(10)
# this call to urllib2.urlopen now uses the default timeout
# we have set in the socket module
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
However, this only works if you're willing to globally modify the timeout for all users of the socket module. I'm running the request from within a Celery task, so doing this would mess up timeouts for the Celery worker code itself.
I'd be happy to hear any other solutions...
I'd expect this to be a common problem, and yet - no answers to be found anywhere... Just built a solution for this using timeout signal:
import urllib2
import socket
timeout = 10
socket.setdefaulttimeout(timeout)
import time
import signal
def timeout_catcher(signum, _):
    raise urllib2.URLError("Read timeout")

signal.signal(signal.SIGALRM, timeout_catcher)

def safe_read(url, timeout_time):
    signal.setitimer(signal.ITIMER_REAL, timeout_time)
    url = 'http://uberdns.eu'
    content = urllib2.urlopen(url, timeout=timeout_time).read()
    signal.setitimer(signal.ITIMER_REAL, 0)
    # you should also catch any exceptions going out of urlopen here,
    # set the timer to 0, and pass the exceptions on.
The credit for the signal part of the solution goes here btw: python timer mystery
Any asynchronous network library should let you enforce a total timeout on any I/O operation. For example, here's a gevent code example:
#!/usr/bin/env python2
import gevent
import gevent.monkey # $ pip install gevent
gevent.monkey.patch_all()
import urllib2
with gevent.Timeout(2): # enforce total timeout
    response = urllib2.urlopen('http://localhost:8000')
    encoding = response.headers.getparam('charset')
    print response.read().decode(encoding)
And here's asyncio equivalent:
#!/usr/bin/env python3.5
import asyncio
import aiohttp # $ pip install aiohttp
async def fetch_text(url):
    response = await aiohttp.get(url)
    return await response.text()

text = asyncio.get_event_loop().run_until_complete(
    asyncio.wait_for(fetch_text('http://localhost:8000'), timeout=2))
print(text)
The test http server is defined here.
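The same total-timeout idea can be sketched with the standard library alone, which avoids depending on a particular aiohttp version. This example reads raw bytes from a TCP connection rather than speaking HTTP, so it only illustrates how asyncio.wait_for bounds the whole operation; read_all and fetch_with_total_timeout are hypothetical helper names.

```python
import asyncio


async def read_all(host, port):
    """Connect and read until the peer closes the connection."""
    reader, writer = await asyncio.open_connection(host, port)
    try:
        return await reader.read()  # no size argument: read until EOF
    finally:
        writer.close()


async def fetch_with_total_timeout(host, port, total_timeout):
    """Bound connect plus read by one overall deadline."""
    # wait_for cancels read_all and raises asyncio.TimeoutError on expiry,
    # no matter how many small chunks arrived in the meantime.
    return await asyncio.wait_for(read_all(host, port), timeout=total_timeout)
```

A caller would run it with asyncio.run(fetch_with_total_timeout('localhost', 8000, 2)) and catch asyncio.TimeoutError.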
pycurl.TIMEOUT option works for the whole request:
#!/usr/bin/env python3
"""Test that pycurl.TIMEOUT does limit the total request timeout."""
import sys
import pycurl
timeout = 2 #NOTE: it does limit both the total *connection* and *read* timeouts
c = pycurl.Curl()
c.setopt(pycurl.CONNECTTIMEOUT, timeout)
c.setopt(pycurl.TIMEOUT, timeout)
c.setopt(pycurl.WRITEFUNCTION, sys.stdout.buffer.write)
c.setopt(pycurl.HEADERFUNCTION, sys.stderr.buffer.write)
c.setopt(pycurl.NOSIGNAL, 1)
c.setopt(pycurl.URL, 'http://localhost:8000')
c.setopt(pycurl.HTTPGET, 1)
c.perform()
The code raises the timeout error in ~2 seconds. I've tested the total read timeout with the server that sends the response in multiple chunks with the time less than the timeout between chunks:
$ python -mslow_http_server 1
where slow_http_server.py:
#!/usr/bin/env python
"""Usage: python -mslow_http_server [<read_timeout>]
Return an http response with *read_timeout* seconds between parts.
"""
import time
try:
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer, test
except ImportError: # Python 3
    from http.server import BaseHTTPRequestHandler, HTTPServer, test

def SlowRequestHandlerFactory(read_timeout):
    class HTTPRequestHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            n = 5
            data = b'1\n'
            self.send_response(200)
            self.send_header("Content-type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", n*len(data))
            self.end_headers()
            for i in range(n):
                self.wfile.write(data)
                self.wfile.flush()
                time.sleep(read_timeout)
    return HTTPRequestHandler

if __name__ == "__main__":
    import sys
    read_timeout = int(sys.argv[1]) if len(sys.argv) > 1 else 5
    test(HandlerClass=SlowRequestHandlerFactory(read_timeout),
         ServerClass=HTTPServer)
I've tested the total connection timeout with http://google.com:22222.
This isn't the behavior I see. I get a URLError when the call times out:
from urllib2 import Request, urlopen
req = Request('http://www.google.com')
res = urlopen(req,timeout=0.000001)
# Traceback (most recent call last):
# File "<stdin>", line 1, in <module>
# ...
# raise URLError(err)
# urllib2.URLError: <urlopen error timed out>
Can't you catch this error and then avoid trying to read res?
When I try to use res.read() after this I get NameError: name 'res' is not defined. Is something like this what you need:
try:
    res = urlopen(req,timeout=3.0)
except:
    print 'Doh!'
finally:
    print 'yay!'
    print res.read()
I suppose the way to implement a timeout manually is via multiprocessing, no? If the job hasn't finished you can terminate it.
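A rough sketch of that multiprocessing idea in Python 3 is shown below: the job runs in a child process, the parent waits on a pipe for at most the timeout, and the child is terminated if it has not answered. The names run_with_timeout and _call_and_send are hypothetical, and the example assumes a Unix system because it explicitly uses the "fork" start method (so the target function does not need to be picklable).

```python
import multiprocessing


def _call_and_send(conn, func, args):
    """Child side: run func and send back ("ok", result) or ("error", text)."""
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_with_timeout(func, args=(), timeout=5.0):
    """Run func(*args) in a child process; terminate it after timeout seconds."""
    ctx = multiprocessing.get_context("fork")  # assumption: Unix only
    parent_conn, child_conn = ctx.Pipe(duplex=False)  # (receive end, send end)
    proc = ctx.Process(target=_call_and_send, args=(child_conn, func, args))
    proc.start()
    child_conn.close()  # the parent keeps only the receiving end
    try:
        if parent_conn.poll(timeout):  # wait for a result, at most `timeout`
            try:
                return parent_conn.recv()
            except EOFError:
                return ("error", "worker exited without a result")
        return ("timeout", None)
    finally:
        if proc.is_alive():
            proc.terminate()  # the job did not finish in time, or is wrapping up
        proc.join()
        parent_conn.close()
```

The result has to travel back through the pipe, so it must be picklable; for a large download it may be more practical to have the child write to a file and return only the path.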
Had the same issue with socket timeout on the read statement. What worked for me was putting both the urlopen and the read inside a try statement. Hope this helps!
We suddenly started seeing "Interrupted system call" on Queue operations like this:
Exception in thread Thread-2:
Traceback (most recent call last):
  [ . . . ]
    result = self.pager.results.get(True, self.WAIT_SECONDS)
  File "/usr/lib/python2.5/site-packages/processing-0.52-py2.5-linux-x86_64.egg/processing/queue.py", line 128, in get
    if not self._poll(block and (deadline-time.time()) or 0.0):
IOError: [Errno 4] Interrupted system call
This is a Fedora 10 / Python 2.5 machine that recently had a security update. Prior to that our software had run for about a year without incident, now it is crashing daily.
Is it correct/necessary to catch this exception and retry the Queue operation?
We don't set any signal handlers ourselves, but this is a Tkinter app, so maybe it sets some. Is it safe to clear the SIGINT handler, and would that solve the problem? Thanks.
Based on this thread on comp.lang.python and this reply from Dan Stromberg I wrote a RetryQueue which is a drop-in replacement for Queue and which does the job for us:
from multiprocessing.queues import Queue
import errno
def retry_on_eintr(function, *args, **kw):
    while True:
        try:
            return function(*args, **kw)
        except IOError, e:
            if e.errno == errno.EINTR:
                continue
            else:
                raise

class RetryQueue(Queue):
    """Queue which will retry if interrupted with EINTR."""
    def get(self, block=True, timeout=None):
        return retry_on_eintr(Queue.get, self, block, timeout)
Finally, this got fixed in Python itself, so the other solution is to update to a newer Python:
http://bugs.python.org/issue17097
I am currently writing an nginx proxy server module with a request queue in front, so that requests are not dropped when the servers behind nginx can't handle them (nginx is configured as a load balancer).
I am using
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
The idea is to put the requests in a queue before handling them. I know multiprocessing.Queue supports only simple objects and cannot carry raw sockets, so I tried using a multiprocessing.Manager to make a shared dictionary. The Manager also uses sockets for its connections, so this method failed too. Is there a way to share network sockets between processes?
Here is the problematic part of the code:
class ProxyServer(Threader, HTTPServer):

    def __init__(self, server_address, bind_and_activate=True):
        HTTPServer.__init__(self, server_address, ProxyHandler,
                            bind_and_activate)
        self.manager = multiprocessing.Manager()
        self.conn_dict = self.manager.dict()
        self.ticket_queue = multiprocessing.Queue(maxsize= 10)
        self._processes = []
        self.add_worker(5)

    def process_request(self, request, client):
        stamp = time.time()
        print "We are processing"
        self.conn_dict[stamp] = (request, client) # the program crashes here
        #Exception happened during processing of request from ('172.28.192.34', 49294)
        #Traceback (most recent call last):
        #  File "/usr/lib64/python2.6/SocketServer.py", line 281, in _handle_request_noblock
        #    self.process_request(request, client_address)
        #  File "./nxproxy.py", line 157, in process_request
        #    self.conn_dict[stamp] = (request, client)
        #  File "<string>", line 2, in __setitem__
        #  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 725, in _callmethod
        #    conn.send((self._id, methodname, args, kwds))
        #TypeError: expected string or Unicode object, NoneType found
        self.ticket_queue.put(stamp)

    def add_worker(self, number_of_workers):
        for worker in range(number_of_workers):
            print "Starting worker %d" % worker
            proc = multiprocessing.Process(target=self._worker, args = (self.conn_dict,))
            self._processes.append(proc)
            proc.start()

    def _worker(self, conn_dict):
        while 1:
            ticket = self.ticket_queue.get()
            print conn_dict
            a=0
            while a==0:
                try:
                    request, client = conn_dict[ticket]
                    a=1
                except Exception:
                    pass
            print "We are threading!"
            self.threader(request, client)
You can use multiprocessing.reduction to transfer connection and socket objects between processes.
Example Code
# Main process
from multiprocessing.reduction import reduce_handle
h = reduce_handle(client_socket.fileno())
pipe_to_worker.send(h)
# Worker process
from multiprocessing.reduction import rebuild_handle
h = pipe.recv()
fd = rebuild_handle(h)
client_socket = socket.fromfd(fd, socket.AF_INET, socket.SOCK_STREAM)
client_socket.send("hello from the worker process\r\n")
Looks like you need to pass file descriptors between processes (assuming Unix here, no clue about Windows). I've never done this in Python, but here is a link to the python-passfd project that you might want to check.
You can look at this code - https://gist.github.com/sunilmallya/4662837 - which is a multiprocessing.reduction socket server, with the parent process passing connections to the client after accepting them.
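On newer interpreters, descriptor passing is also available directly in the socket module. The sketch below assumes Python 3.9 or later on Unix, where socket.send_fds and socket.recv_fds exist; send_socket and recv_socket are hypothetical helper names, and channel must be a connected AF_UNIX socket shared by the parent and the worker (for example one end of a socketpair created before forking).

```python
import socket


def send_socket(channel, sock):
    """Send sock's file descriptor over a connected AF_UNIX socket."""
    # At least one byte of ordinary data has to accompany the descriptor.
    socket.send_fds(channel, [b"fd"], [sock.fileno()])


def recv_socket(channel):
    """Receive one descriptor and wrap it in a new socket object."""
    _msg, fds, _flags, _addr = socket.recv_fds(channel, 16, 1)
    if not fds:
        raise RuntimeError("no file descriptor received")
    # With fileno given, family and type are detected from the descriptor.
    return socket.socket(fileno=fds[0])
```

The receiver gets its own duplicate of the descriptor, so the sender can close its copy of the client socket once the hand-over has happened.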