I am trying to handle a hard-to-reproduce error case using TCPServer. We have seen a few occurrences where, once a socket timeout happens in the handler, the code never recovers and keeps raising the socket.timeout exception.
This appears to come from the following snippet in the standard library's socket.py:
def readinto(self, b):
    """Read up to len(b) bytes into the writable buffer *b* and return
    the number of bytes read. If the socket is non-blocking and no bytes
    are available, None is returned.
    If *b* is non-empty, a 0 return value indicates that the connection
    was shutdown at the other end.
    """
    self._checkClosed()
    self._checkReadable()
    if self._timeout_occurred:
        raise OSError("cannot read from timed out object")
    while True:
        try:
            return self._sock.recv_into(b)
        except timeout:
            self._timeout_occurred = True
            raise
        except InterruptedError:
            continue
        except error as e:
            if e.args[0] in _blocking_errnos:
                return None
            raise
Once a timeout has occurred, _timeout_occurred is set to True. On the next pass through this function the flag is still set, so the call immediately fails with the "cannot read from timed out object" error.
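The sticky flag can be reproduced in isolation. The following is a minimal sketch, assuming Python 3.5 or later and using socket.socketpair() in place of a real TCP connection; the helper name read_twice_after_timeout is made up for illustration. It reads twice from the same makefile() object while the peer stays silent, and records what each attempt raises.

```python
import socket


def read_twice_after_timeout(timeout=0.05):
    """Read twice from a file object whose socket times out (hypothetical helper)."""
    ours, peer = socket.socketpair()  # the peer never sends anything
    ours.settimeout(timeout)
    rfile = ours.makefile('rb')       # same kind of object as StreamRequestHandler.rfile
    outcomes = []
    try:
        for _ in range(2):
            try:
                rfile.read(8)
                outcomes.append('read ok')
            except socket.timeout:
                # A genuine timeout; this is what sets the sticky flag.
                outcomes.append('timeout')
            except OSError as exc:
                # Any other OSError, including the refusal raised by readinto().
                outcomes.append(str(exc))
    finally:
        rfile.close()
        ours.close()
        peer.close()
    return outcomes


if __name__ == '__main__':
    print(read_twice_after_timeout())
```

In this sketch the second attempt is expected to be refused without waiting for the timeout again, which suggests that a file object that has timed out should be discarded rather than retried.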
The code uses TCPServer (only the relevant code is included). Basically, it reads data from the socket and queues it to be handled separately.
def get_event(file_):
    pre_package_len = 8
    msg = file_.read(pre_package_len)
    if len(msg) == pre_package_len:
        pkg = PRE_PACKAGE_FRAME.unpack(msg)
        msg = file_.read(pkg['len'])
        logger.debug('recv: type: %s len: %s bytes read: %s',
                     pkg['type'], pkg['len'], len(msg))
        if len(msg) >= pkg['len']:
            if pkg['type'] == cdefs.kNotification:
                e = EVENT_FRAME.unpack(msg)
                return decode_event(e)
            logger.warn('received unsupported package type: %s', pkg['type'])
    else:
        logger.error('failed to recv')
class _EventHandler(StreamRequestHandler):
    def handle(self):
        logger.debug("Got event from %s", self.client_address)
        try:
            e = get_event(self.rfile)
            if e:
                self.q.put(e)
        except socket.timeout:
            logger.error('timed out reading event')

def process_event(q, handler, shutdown_sentinel):
    for e in iter(q.get, shutdown_sentinel):
        try:
            handler(e)
        except Exception:
            logger.exception('Unhandled exception handling event: %s', e)
    logger.info('exiting')

def eventhandler_maker(q, timeout):
    return type('NewEventHandler',
                (_EventHandler, object),
                dict(q=q, timeout=timeout))

def process_events(handler, address, timeout=20):
    sentinel = object()
    q = Queue()
    eventhandler = eventhandler_maker(q, timeout)
    server = TCPServer(address, eventhandler)
    start_thread(server.serve_forever)
    start_thread(process_event, (q, handler, sentinel))

    def shutdown():
        logger.info('shutting down')
        q.put(sentinel)
        server.shutdown()

    def add_event(e):
        q.put(e)

    return shutdown, add_event
The symptom is that once the timeout happens, the log keeps showing 'timed out reading event' and the code never does anything else. I added code to dump server.socket.gettimeout() and socket.getdefaulttimeout(), and both return None. This application runs on an embedded Linux 3.10 kernel with Python 3.4.0.
I have two questions:
1. What is a good recovery strategy here? Call shutdown() / close() on the server socket and then restart it? Or are there better strategies?
2. Is there a good third-party tool to provoke a timeout so that a recovery strategy can be proven correct?
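One way to provoke the timeout without extra tooling is a client that connects and then stays silent. The sketch below tries this, assuming Python 3.6 or later and a usable loopback interface; provoke_timeout and the nested Handler class are made-up names and not the code from the question. Since TCPServer builds a new handler, and therefore a new rfile, for every accepted connection, the second connection here is expected to be read normally after the first one timed out.

```python
import socket
import socketserver
import threading
import time


def provoke_timeout(read_timeout=0.2):
    """Serve one silent client, then one that sends data (illustrative only)."""
    results = []

    class Handler(socketserver.StreamRequestHandler):
        timeout = read_timeout  # applied to each accepted connection in setup()

        def handle(self):
            try:
                results.append(('data', self.rfile.read(8)))
            except socket.timeout:
                results.append(('timeout', b''))

    def wait_for(count, limit=5.0):
        # Poll until the handler has recorded `count` outcomes (or give up).
        deadline = time.monotonic() + limit
        while len(results) < count and time.monotonic() < deadline:
            time.sleep(0.01)

    with socketserver.TCPServer(('127.0.0.1', 0), Handler) as server:
        worker = threading.Thread(target=server.serve_forever, daemon=True)
        worker.start()
        try:
            # Client 1 connects but never sends: the handler's read times out.
            with socket.create_connection(server.server_address):
                wait_for(1)
            # Client 2 sends a full 8-byte header on a fresh connection.
            with socket.create_connection(server.server_address) as talker:
                talker.sendall(b'12345678')
                wait_for(2)
        finally:
            server.shutdown()
            worker.join()
    return results


if __name__ == '__main__':
    print(provoke_timeout())
```

If a recovery strategy is added to the handler, the same two-client pattern can be reused to check that the server still accepts work after a timeout.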
sock1.settimeout(2)
conn.settimeout(1) #conn comes from sock1
except socket.timeout, e:
    print <responsible socket>
Is there a way to distinguish the socket responsible for the timeout?
Perhaps I'm doing something wrong if I have two sockets that are timing out.
As far as I can tell, there's nothing in the socket.timeout exception object that identifies the socket. So you need to keep track of which socket you're reading from; that will be the one that timed out:
try:
    cursock = sock1
    data = sock1.recv(bufsize)
    cursock = conn
    data1 = conn.recv(bufsize)
except socket.timeout, e:
    print cursock
Or you could wrap try/except around each recv call. You could put this into a helper function:
def try_recv(sock, bufsize, flags=0):
    try:
        return sock.recv(bufsize, flags)
    except socket.timeout, e:
        # Report which socket timed out; the function then returns None.
        print sock
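Another option is to attach the socket to the exception itself. The following Python 3 sketch does that with a small subclass of socket.timeout; the names IdentifiedTimeout and recv_identified are invented for this example and are not part of the socket module.

```python
import socket


class IdentifiedTimeout(socket.timeout):
    """socket.timeout that remembers which socket timed out (hypothetical)."""

    def __init__(self, sock):
        super().__init__('timed out')
        self.sock = sock


def recv_identified(sock, bufsize, flags=0):
    """Like sock.recv(), but a timeout carries the offending socket."""
    try:
        return sock.recv(bufsize, flags)
    except socket.timeout:
        raise IdentifiedTimeout(sock) from None


if __name__ == '__main__':
    a, b = socket.socketpair()
    a.settimeout(0.1)
    b.settimeout(0.1)
    try:
        recv_identified(a, 1024)
        recv_identified(b, 1024)
    except socket.timeout as exc:
        print('timed out on', 'a' if exc.sock is a else 'b')
    finally:
        a.close()
        b.close()
```

Because IdentifiedTimeout is still a socket.timeout, existing except clauses keep working; only code that wants the extra attribute has to look for it.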
I have a socket connection going and I want to improve the exception handling, but I'm stuck. Whenever I call socket.connect(server_address) with an invalid argument, the program stops but doesn't seem to raise an exception. Here is my code:
import socket
import sys
import struct

class ARToolkit():
    def __init__(self):
        self.x = 0
        self.y = 0
        self.z = 0
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.logging = False

    def connect(self,server_address):
        try:
            self.sock.connect(server_address)
        except socket.error, msg:
            print "Couldnt connect with the socket-server: %s\n terminating program" % msg
            sys.exit(1)

    def initiate(self):
        self.sock.send("start_logging")

    def log(self):
        self.logging = True
        buf = self.sock.recv(6000)
        if len(buf)>0:
            nbuf = buf[len(buf)-12:len(buf)]
            self.x, self.y, self.z = struct.unpack("<iii", nbuf)

    def stop_logging(self):
        print "Stopping logging"
        self.logging = False
        self.sock.close()
The class may look a bit weird, but it's used for receiving coordinates from another computer running ARToolKit. Anyway, the issue is in the connect() function:
def connect(self,server_address):
    try:
        self.sock.connect(server_address)
    except socket.error, msg:
        print "Couldnt connect with the socket-server: %s\n terminating program" % msg
        sys.exit(1)
If I call that function with a random IP address and port number, the whole program just hangs at the line:
self.sock.connect(server_address)
The documentation I've read states that in case of an error it will raise a socket.error exception. I've also tried with just:
except Exception, msg:
This, if I'm not mistaken, will catch any exceptions, and still it yields no result. I would be very grateful for a helping hand. Also, is it okay to exit programs using sys.exit when an unwanted exception occurs?
Thank you
If you have chosen a random, but valid, IP address and port, socket.connect() will attempt to make a connection to that endpoint. By default, if no explicit timeout is set for the socket, it will block while doing so and eventually time out, raising socket.error: [Errno 110] Connection timed out.
The default timeout on my machine is 120 seconds. Perhaps you are not waiting long enough for socket.connect() to return (or timeout)?
You can try reducing the timeout like this:
import socket
s = socket.socket()
s.settimeout(5) # 5 seconds
try:
    s.connect(('123.123.123.123', 12345)) # "random" IP address and port
except socket.error, exc:
    print "Caught exception socket.error : %s" % exc
Note that if a timeout is explicitly set for the socket, the exception will be socket.timeout which is derived from socket.error and will therefore be caught by the above except clause.
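In Python 3 the same idea can be written with socket.create_connection(), which accepts the timeout directly. This is only a sketch: try_connect is a hypothetical helper name, and the address in the usage part is a placeholder for your own server.

```python
import socket


def try_connect(address, timeout=5.0):
    """Return (sock, None) on success or (None, error) on failure."""
    try:
        sock = socket.create_connection(address, timeout=timeout)
    except socket.timeout as exc:
        # No answer within `timeout` seconds (for example, packets dropped).
        return None, exc
    except OSError as exc:
        # Refused, unreachable, name resolution failure, and so on.
        return None, exc
    return sock, None


if __name__ == '__main__':
    sock, err = try_connect(('127.0.0.1', 12345), timeout=2.0)  # placeholder address
    if sock is None:
        print('Could not connect: %s' % err)
    else:
        print('Connected')
        sock.close()
```

Returning the error instead of calling sys.exit() inside the helper leaves the decision to terminate with the caller, which tends to make the class easier to reuse.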
The problem with your last general exception is the colon placement. It needs to be after the entire exception, not after the except statement. Thus to capture all exceptions you would need to do:
except Exception,msg:
However, from Python 2.6 onward you should use the as keyword instead of a comma, like so:
except Exception as msg:
I was able to run the code fine (note that you need to pass a tuple to the connect method). If you want to catch only socket errors, you would need to catch the socket.error class, as you already have:
except socket.error as msg:
If you want to make sure that a tuple is passed in, simply add another except clause:
except socket.error as msg:
    print "Socket Error: %s" % msg
except TypeError as msg:
    print "Type Error: %s" % msg
I am writing a connector using UDP in Python 3.3
When I am sending data to the UDP port, everything works fine. The problem is that when I am not sending any data, I get an error generated by the receiving port once per minute that says "timed out". While debugging, I used the socket.gettimeout() function and it returned 'None'.
Why am I getting this timeout error? Any help would be greatly appreciated!
import socket
from EventArgs import EventArgs
import logging

class UDPServer(object):
    """description of class"""
    def __init__(self, onMessageReceivedEvent = '\x00'):
        self.__onMessageReceivedEvent = onMessageReceivedEvent
        self.__s = '\x00'
        self.__r = '\x00'

    def openReceivePort(self,port):
        try:
            self.__r = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            self.__r.bind(("",port))
            print ("opening port: ", port)
        except socket.error as e:
            logging.getLogger("ConnectorLogger").critical(e)

    def openBroadcastPort(self):
        try:
            self.__s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            self.__s.bind(("",2101))
            print ("opening port: ", 2101)
        except socket.error as e:
            logging.getLogger("ConnectorLogger").critical(e)

    def closePorts(self):  # 'self' was missing in the original signature
        if self.__r != '\x00':
            self.__r.close()
        if self.__s != '\x00':
            self.__s.close()

    def getUDPData(self):
        try:
            data, addr = self.__r.recvfrom(1024)
            if self.__onMessageReceivedEvent != '\x00':
                eventArgs = EventArgs()
                eventArgs.Addr = addr
                eventArgs.Data = data
                self.__onMessageReceivedEvent.fire(self, eventArgs)
        except socket.error as e:
            logging.getLogger("ConnectorLogger").critical(e)

    def send(self,ipAddress,port,message):
        try:
            self.__s.sendto(message.encode(),(ipAddress,23456))
        except socket.error as e:
            logging.getLogger("ConnectorLogger").critical(e)
I figured out the answer to my own problem. My socket was not in plain blocking mode: I had socket.setblocking set to 0 (non-blocking). The documentation says that this configuration is equivalent to a settimeout value of 0. A blocking socket is equivalent to a settimeout value of None. Once I changed to a blocking socket I no longer saw this error.
socket.setblocking(flag): Set blocking or non-blocking mode of the socket: if flag is 0, the socket is set to non-blocking, else to blocking mode. Initially all sockets are in blocking mode. In non-blocking mode, if a recv() call doesn't find any data, or if a send() call can't immediately dispose of the data, an error exception is raised; in blocking mode, the calls block until they can proceed. s.setblocking(0) is equivalent to s.settimeout(0.0); s.setblocking(1) is equivalent to s.settimeout(None).
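The equivalence described in the documentation can be checked with gettimeout(). A small Python 3 sketch follows; timeout_after is an illustrative helper, not a library function.

```python
import socket


def timeout_after(action):
    """Apply `action` to a fresh UDP socket and report gettimeout()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        action(sock)
        return sock.gettimeout()
    finally:
        sock.close()


if __name__ == '__main__':
    print(timeout_after(lambda s: s.setblocking(False)))  # non-blocking mode
    print(timeout_after(lambda s: s.setblocking(True)))   # blocking mode
    print(timeout_after(lambda s: s.settimeout(60)))      # timeout mode
```

Printing gettimeout() right before the failing recvfrom() call is a cheap way to see which of the three modes the socket is actually in at that point.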
Basically, I've read in several places that socket.recv() will return whatever it can read, or an empty string signalling that the other side has shut down (the official docs don't even mention what it returns when the connection is shut down... great!). This is all fine and dandy for blocking sockets, since we know that recv() only returns when there actually is something to receive, so when it returns an empty string, it MUST mean the other side has closed the connection, right?
Okay, fine, but what happens when my socket is non-blocking?? I have searched a bit (maybe not enough, who knows?) and can't figure out how to tell when the other side has closed the connection using a non-blocking socket. There seems to be no method or attribute that tells us this, and comparing the return value of recv() to the empty string seems absolutely useless... is it just me having this problem?
As a simple example, let's say my socket's timeout is set to 1.2342342 (whatever non-negative number you like here) seconds and I call socket.recv(1024), but the other side doesn't send anything during that 1.2342342 second period. The recv() call will return an empty string and I have no clue as to whether the connection is still standing or not...
In the case of a non-blocking socket that has no data available, recv will raise the socket.error exception, and the value of the exception will carry an errno of either EAGAIN or EWOULDBLOCK. Example:
import sys
import socket
import fcntl, os
import errno
from time import sleep

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('127.0.0.1',9999))
fcntl.fcntl(s, fcntl.F_SETFL, os.O_NONBLOCK)

while True:
    try:
        msg = s.recv(4096)
    except socket.error, e:
        err = e.args[0]
        if err == errno.EAGAIN or err == errno.EWOULDBLOCK:
            sleep(1)
            print 'No data available'
            continue
        else:
            # a "real" error occurred
            print e
            sys.exit(1)
    else:
        # got a message, do something :)
        pass
The situation is a little different when you've enabled a timeout with socket.settimeout(n). In this case a socket.error is still raised, but for a timeout the exception is a socket.timeout and its accompanying value is always the string 'timed out'. So, to handle this case you can do:
import sys
import socket
from time import sleep

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('127.0.0.1',9999))
s.settimeout(2)

while True:
    try:
        msg = s.recv(4096)
    except socket.timeout, e:
        err = e.args[0]
        # this next if/else is a bit redundant, but illustrates how the
        # timeout exception is setup
        if err == 'timed out':
            sleep(1)
            print 'recv timed out, retry later'
            continue
        else:
            print e
            sys.exit(1)
    except socket.error, e:
        # Something else happened, handle error, exit, etc.
        print e
        sys.exit(1)
    else:
        if len(msg) == 0:
            print 'orderly shutdown on server end'
            sys.exit(0)
        else:
            # got a message do something :)
            pass
As indicated in the comments, this is also a more portable solution since it doesn't depend on OS-specific functionality to put the socket into non-blocking mode.
See recv(2) and python socket for more details.
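The three outcomes (data, nothing yet, peer closed) can be folded into one helper. The sketch below is written for Python 3.3 or later, where a non-blocking recv() with nothing to read raises BlockingIOError; poll_recv and its return labels are made up for this example.

```python
import socket


def poll_recv(sock, bufsize=4096):
    """Classify one recv() on a non-blocking or timeout socket (illustrative)."""
    try:
        data = sock.recv(bufsize)
    except BlockingIOError:
        return 'no-data', b''      # non-blocking socket, nothing queued yet
    except socket.timeout:
        return 'no-data', b''      # timeout socket, nothing arrived in time
    if not data:
        return 'closed', b''       # peer performed an orderly shutdown
    return 'data', data


if __name__ == '__main__':
    a, b = socket.socketpair()
    a.setblocking(False)
    print(poll_recv(a))
    a.close()
    b.close()
```

The point of the helper is that an empty result from recv() only ever means the peer closed the connection; "no data yet" always arrives as an exception instead.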
It is simple: if recv() returns 0 bytes, you will not receive any more data on this connection. Ever. You still might be able to send.
It means that your non-blocking socket has to raise an exception (which one might be system-dependent) if no data is available but the connection is still alive (the other end may still send).
When you use recv together with select, if the socket is reported as ready to read but recv returns no data, that means the client has closed the connection.
Here is some code that handles this. Also note the exception that is raised when recv is called a second time in the while loop: if there is nothing left to read, this exception is raised, and it doesn't mean the client has closed the connection:
def listenToSockets(self):
    while True:
        changed_sockets = self.currentSockets
        ready_to_read, ready_to_write, in_error = select.select(changed_sockets, [], [], 0.1)
        for s in ready_to_read:
            if s == self.serverSocket:
                self.acceptNewConnection(s)
            else:
                self.readDataFromSocket(s)
And the function that receives the data:
def readDataFromSocket(self, socket):
    data = ''
    buffer = ''
    try:
        while True:
            data = socket.recv(4096)
            if not data:
                break
            buffer += data
    except error, (errorCode,message):
        # error 10035 is no data available, it is non-fatal
        if errorCode != 10035:
            print 'socket.error - ('+str(errorCode)+') ' + message
    if data:
        print 'received '+ buffer
    else:
        print 'disconnected'
Just to complete the existing answers, I'd suggest using select instead of nonblocking sockets. The point is that nonblocking sockets complicate stuff (except perhaps sending), so I'd say there is no reason to use them at all. If you regularly have the problem that your app is blocked waiting for IO, I would also consider doing the IO in a separate thread in the background.
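As a rough illustration of that suggestion, the sketch below keeps the socket in blocking mode and lets select() do the waiting. It assumes Python 3; recv_when_ready is a hypothetical helper name.

```python
import select
import socket


def recv_when_ready(sock, bufsize=4096, timeout=1.0):
    """Wait up to `timeout` seconds for data on a blocking socket.

    Returns None if nothing arrived, b'' if the peer closed the connection,
    otherwise the bytes received. (Hypothetical helper for illustration.)
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(bufsize)


if __name__ == '__main__':
    a, b = socket.socketpair()
    print(recv_when_ready(a, timeout=0.1))   # nothing sent yet
    b.sendall(b'hello')
    print(recv_when_ready(a))
    a.close()
    b.close()
```

With this shape, None and b'' stay distinct, so "no data yet" and "connection closed" cannot be confused.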
This looks like a duplicate of How do I abort a socket.recv() from another thread in Python, but it's not, since I want to abort recvfrom() in a thread, which is UDP, not TCP.
Can this be solved by poll() or select.select() ?
If you want to unblock a UDP read from another thread, send it a datagram!
Rgds,
Martin
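A minimal sketch of the "send it a datagram" idea, assuming Python 3 and a receiver bound to the loopback interface; the function name run_and_stop_receiver and the wake-up payload are made up for this example. A flag tells the receiver that the datagram it just woke up on is only a stop request.

```python
import socket
import threading


def run_and_stop_receiver():
    """Start a UDP receiver thread, then wake it with a datagram (sketch)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(('127.0.0.1', 0))          # port 0: let the OS pick a free port
    stop = threading.Event()
    received = []

    def receive_loop():
        while True:
            data, _addr = sock.recvfrom(1024)   # blocks until a datagram arrives
            if stop.is_set():
                break                            # the datagram was only a wake-up
            received.append(data)

    worker = threading.Thread(target=receive_loop, daemon=True)
    worker.start()

    # Request the stop first, then send the wake-up datagram.
    stop.set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as waker:
        waker.sendto(b'wake', sock.getsockname())

    worker.join(timeout=5.0)
    still_running = worker.is_alive()
    sock.close()
    return still_running, received


if __name__ == '__main__':
    print(run_and_stop_receiver())
```

Setting the flag before sending matters: if the order were reversed, the receiver could consume the wake-up datagram as ordinary data and block again.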
A good way to handle this kind of asynchronous interruption is the old C pipe trick. You can create a pipe and use select/poll on both the socket and the pipe. Now, when you want to interrupt the receiver, you can just write a byte to the pipe.
Pros:
Works for both UDP and TCP
Is protocol agnostic
Cons:
select/poll on pipes is not available on Windows; in that case you should replace the pipe with another UDP socket used as the notification channel
Starting point
interruptable_socket.py
import os
import socket
import select

class InterruptableUdpSocketReceiver(object):
    def __init__(self, host, port):
        self._host = host
        self._port = port
        self._socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self._socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self._r_pipe, self._w_pipe = os.pipe()
        self._interrupted = False

    def bind(self):
        self._socket.bind((self._host, self._port))

    def recv(self, buffersize, flags=0):
        if self._interrupted:
            raise RuntimeError("Cannot be reused")
        read, _w, errors = select.select([self._r_pipe, self._socket], [], [self._socket])
        if self._socket in read:
            return self._socket.recv(buffersize, flags)
        return ""

    def interrupt(self):
        self._interrupted = True
        os.write(self._w_pipe, "I".encode())
A test suite:
test_interruptable_socket.py
import socket
from threading import Timer
import time
from interruptable_socket import InterruptableUdpSocketReceiver
import unittest

class Sender(object):
    def __init__(self, destination_host, destination_port):
        self._socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
        self._dest = (destination_host, destination_port)

    def send(self, message):
        self._socket.sendto(message, self._dest)

class Test(unittest.TestCase):
    def create_receiver(self, host="127.0.0.1", port=3010):
        receiver = InterruptableUdpSocketReceiver(host, port)
        receiver.bind()
        return receiver

    def create_sender(self, host="127.0.0.1", port=3010):
        return Sender(host, port)

    def create_sender_receiver(self, host="127.0.0.1", port=3010):
        return self.create_sender(host, port), self.create_receiver(host, port)

    def test_create(self):
        self.create_receiver()

    def test_recv_async(self):
        sender, receiver = self.create_sender_receiver()
        start = time.time()
        send_message = "TEST".encode('UTF-8')
        Timer(0.1, sender.send, (send_message, )).start()
        message = receiver.recv(128)
        elapsed = time.time()-start
        self.assertGreaterEqual(elapsed, 0.095)
        self.assertLess(elapsed, 0.11)
        self.assertEqual(message, send_message)

    def test_interrupt_async(self):
        receiver = self.create_receiver()
        start = time.time()
        Timer(0.1, receiver.interrupt).start()
        message = receiver.recv(128)
        elapsed = time.time()-start
        self.assertGreaterEqual(elapsed, 0.095)
        self.assertLess(elapsed, 0.11)
        self.assertEqual(0, len(message))

    def test_exception_after_interrupt(self):
        sender, receiver = self.create_sender_receiver()
        receiver.interrupt()
        with self.assertRaises(RuntimeError):
            receiver.recv(128)

if __name__ == '__main__':
    unittest.main()
Evolution
Now this code is just a starting point. To make it more generic, I see the following issues that we should fix:
Interface: returning an empty message in the interrupt case is not a good design; it is better to use an exception to signal it
Generalization: we should have just one function to call before socket.recv(), so that extending interrupt support to other recv methods becomes very simple
Portability: to make porting to Windows simple, we should isolate the async notification in an object, so we can choose the right implementation for our operating system
First of all, we change test_interrupt_async() to check for an exception instead of an empty message:
from interruptable_socket import InterruptException

    def test_interrupt_async(self):
        receiver = self.create_receiver()
        start = time.time()
        with self.assertRaises(InterruptException):
            Timer(0.1, receiver.interrupt).start()
            receiver.recv(128)
        elapsed = time.time()-start
        self.assertGreaterEqual(elapsed, 0.095)
        self.assertLess(elapsed, 0.11)
After this we can replace return '' with raise InterruptException, and the tests pass again.
The ready-to-extend version can be:
interruptable_socket.py
import os
import socket
import select

class InterruptException(Exception):
    pass

class InterruptableUdpSocketReceiver(object):
    def __init__(self, host, port):
        self._host = host
        self._port = port
        self._socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self._socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self._async_interrupt = AsycInterrupt(self._socket)

    def bind(self):
        self._socket.bind((self._host, self._port))

    def recv(self, buffersize, flags=0):
        self._async_interrupt.wait_for_receive()
        return self._socket.recv(buffersize, flags)

    def interrupt(self):
        self._async_interrupt.interrupt()

class AsycInterrupt(object):
    def __init__(self, descriptor):
        self._read, self._write = os.pipe()
        self._interrupted = False
        self._descriptor = descriptor

    def interrupt(self):
        self._interrupted = True
        self._notify()

    def wait_for_receive(self):
        if self._interrupted:
            raise RuntimeError("Cannot be reused")
        read, _w, errors = select.select([self._read, self._descriptor], [], [self._descriptor])
        if self._descriptor not in read:
            raise InterruptException

    def _notify(self):
        os.write(self._write, "I".encode())
Now wrapping more recv functions, implementing a Windows version, or taking care of socket timeouts becomes really simple.
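As a sketch of what a Windows-friendly notifier might look like, the class below swaps the pipe for socket.socketpair(), which select() can wait on across platforms (on Windows this assumes Python 3.5 or later). The names SocketPairInterrupt and WaitInterrupted are invented here, and the optional timeout parameter is an addition that the pipe-based version does not have.

```python
import select
import socket


class WaitInterrupted(Exception):
    """Raised when a wait is cancelled from another thread (hypothetical name)."""


class SocketPairInterrupt(object):
    """Same role as the pipe-based notifier, built on a socket pair instead."""

    def __init__(self, sock):
        self._sock = sock
        self._wake_r, self._wake_w = socket.socketpair()

    def interrupt(self):
        # One byte is enough to make the read end selectable.
        self._wake_w.send(b'I')

    def wait_for_receive(self, timeout=None):
        readable, _w, _x = select.select([self._wake_r, self._sock], [], [], timeout)
        if self._wake_r in readable:
            raise WaitInterrupted()
        return self._sock in readable   # False means the timeout expired

    def close(self):
        self._wake_r.close()
        self._wake_w.close()
```

A receiver would call wait_for_receive() before recv(), in the same way as with the pipe-based class, and call close() when it is done so the socket pair is released.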
The solution here is to forcibly close the socket. The problem is that the method for doing this is OS-specific and Python does not do a good job of abstracting the way to do it or the consequences. Basically, you need to do a shutdown() followed by a close() on the socket. On POSIX systems such as Linux, the shutdown is the key element in forcing recvfrom to stop (a call to close() alone won't do it). On Windows, shutdown() does not affect the recvfrom and the close() is the key element. This is exactly the behavior that you would see if you were implementing this code in C and using either native POSIX sockets or Winsock sockets, so Python is providing a very thin layer on top of those calls.
On both POSIX and Windows systems, this sequence of calls results in an OSError being raised. However, the location of the exception and its details are OS-specific. On POSIX systems, the exception is raised on the call to shutdown() and the errno value of the exception is set to 107 (Transport endpoint is not connected). On Windows systems, the exception is raised on the call to recvfrom() and the winerror value of the exception is set to 10038 (An operation was attempted on something that is not a socket). This means that there's no way to do this in an OS-agnostic way; the code has to account for both Windows and POSIX behavior and errors. Here's a simple example I wrote up:
import socket
import threading
import time

class MyServer(object):
    def __init__(self, port:int=0):
        if port == 0:
            raise AttributeError('Invalid port supplied.')
        self.port = port
        self.socket = socket.socket(family=socket.AF_INET,
                                    type=socket.SOCK_DGRAM)
        self.socket.bind(('0.0.0.0', port))
        self.exit_now = False
        print('Starting server.')
        self.thread = threading.Thread(target=self.run_server,
                                       args=[self.socket])
        self.thread.start()

    def run_server(self, socket:socket.socket=None):
        if socket is None:
            raise AttributeError('No socket provided.')
        buffer_size = 4096
        while not self.exit_now:
            data = b''
            try:
                data, address = socket.recvfrom(buffer_size)
            except OSError as e:
                # winerror only exists on Windows, so look it up defensively.
                if getattr(e, 'winerror', None) == 10038:
                    # Error is, "An operation was attempted on something that
                    # is not a socket". We don't care.
                    pass
                else:
                    raise
            if len(data) > 0:
                print(f'Received {len(data)} bytes from {address}.')

    def stop(self):
        self.exit_now = True
        try:
            self.socket.shutdown(socket.SHUT_RDWR)
        except OSError as e:
            if e.errno == 107:
                # Error is, "Transport endpoint is not connected".
                # We don't care.
                pass
            else:
                raise
        self.socket.close()
        self.thread.join()
        print('Server stopped.')

if __name__ == '__main__':
    server = MyServer(5555)
    time.sleep(2)
    server.stop()
    exit(0)
Implement a quit command on the server and client sockets. Should work something like this:
Thread1:
    status: listening
    handler: quit
Thread2: client
    exec: socket.send "quit" ---> Thread1.socket # host:port
Thread1:
    status: socket closed()
To properly close a TCP socket in Python, you have to call socket.shutdown(arg) before calling socket.close(). See the Python socket documentation, the part about shutdown.
If the socket is UDP, you can't call socket.shutdown(...); it would raise an exception. And calling socket.close() alone would, as with TCP, leave the blocked operations blocking. close() alone won't interrupt them.
Many (not all) of the suggested solutions don't work, or are seen as cumbersome because they involve third-party libraries. I haven't tested poll() or select(). What definitely does work is the following:
Firstly, create an official Thread object for whatever thread is running socket.recv(), and save the handle to it. Secondly, import signal. signal is a standard library module that enables sending and receiving Linux/POSIX signals (read its documentation). Thirdly, to interrupt, assuming that the handle to your thread is called udpThreadHandle:
signal.pthread_kill(udpThreadHandle.ident, signal.SIGINT)
and of course, in the actual thread/loop doing the receiving:
try:
    while True:
        myUdpSocket.recv(...)
except KeyboardInterrupt:
    pass
Notice that the exception handler for KeyboardInterrupt (generated by SIGINT) is OUTSIDE the receive loop. This silently terminates the receive loop and its thread.