How to use Python socket.settimeout() properly

As far as I know, when you call socket.settimeout(value) with a float value greater than 0.0, that socket will raise socket.timeout when a call to, for example, socket.recv has to wait longer than the value specified.
But imagine I have to receive a big amount of data, and I have to call recv() several times, then how does settimeout affect that?
Given the following code:
to_receive = ...  # an integer representing the bytes we want to receive
socket = ...      # a connected socket
socket.settimeout(20)
received = 0
received_data = b""
while received < to_receive:
    tmp = socket.recv(4096)
    if len(tmp) == 0:
        raise Exception()
    received += len(tmp)
    received_data += tmp
socket.settimeout(None)
The third line of the code sets the timeout of the socket to 20 seconds. Does that timeout reset every iteration? Will a timeout be raised only if one of those iterations takes more than 20 seconds?
A) How can I recode it so that it raises an exception if it is taking more than 20 seconds to receive all the expected data?
B) If I don't set the timeout to None after we read all data, could anything bad happen? (the connection is keep-alive and more data could be requested in the future).

The timeout applies independently to each socket read/write call, so on the next call it will be 20 seconds again.
A) To have a timeout shared by several consecutive calls, you will have to track it manually. Something along these lines:
import time

deadline = time.time() + 20.0
while not data_received:
    if time.time() >= deadline:
        raise Exception()  # overall deadline exceeded
    socket.settimeout(deadline - time.time())
    socket.recv(4096)  # ...
B) Any code that uses a socket with a timeout and isn't ready to handle the socket.timeout exception will likely fail. It is more reliable to remember the socket's timeout value before you start your operation, and restore it when you are done:
def my_socket_function(socket, ...):
    # some initialization and stuff
    old_timeout = socket.gettimeout()  # Save
    # do your stuff with the socket
    socket.settimeout(old_timeout)     # Restore
    # etc
This way, your function will not affect the functioning of the code that's calling it, no matter what either of them does with the socket's timeout.
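To make the restore robust against exceptions as well, you can wrap the operation in try/finally. A minimal sketch (the function name, byte count, and the 20-second value are illustrative placeholders, not part of the question):
import socket

def my_socket_function(sock, n_bytes):
    # Sketch: restore the caller's timeout even if recv() raises
    old_timeout = sock.gettimeout()   # Save
    sock.settimeout(20)
    try:
        return sock.recv(n_bytes)     # do your stuff with the socket
    finally:
        sock.settimeout(old_timeout)  # Restore, no matter what happened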

The timeout applies to each call to recv().
A) Simply use your existing timeout and call recv(to_receive), i.e. try to receive all the data in one recv call. (Note that recv() may still legally return fewer bytes than requested, so a loop is needed whenever the data can arrive in pieces.) In fact, I don't see why you shouldn't use this as the default way it works.
B) No, nothing bad will happen, but any other code which uses that socket needs to be aware of handling the timeout.
In your existing code, shouldn't the recv() call be recv(min(4096, to_receive - received))? That way you won't unintentionally consume any data which follows the to_receive bytes.
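Putting the deadline idea from the first answer together with the min() correction, a helper along these lines receives exactly to_receive bytes or raises (a sketch; the name recv_exactly and the default timeout are illustrative):
import socket
import time

def recv_exactly(sock, to_receive, overall_timeout=20.0):
    # Enforce one overall deadline across however many recv() calls it takes
    deadline = time.monotonic() + overall_timeout
    received_data = b""
    while len(received_data) < to_receive:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise socket.timeout("overall deadline exceeded")
        sock.settimeout(remaining)
        # min() so we never read past the expected byte count
        tmp = sock.recv(min(4096, to_receive - len(received_data)))
        if not tmp:
            raise ConnectionError("peer closed before all data arrived")
        received_data += tmp
    return received_data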

See my server script below; it should give you an idea of how to use it properly.
import socket
import sys

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("192.168.1.4", 9001))
s.listen(5)
while True:
    fragments = []  # reset per connection so old data is not reused
    c, a = s.accept()
    c.settimeout(10.0)
    print("Someone came in Server from %s and port %s" % (a[0], a[1]))
    c.send(b"Welcome to system")
    while True:
        chunk = c.recv(2048)
        if not chunk.strip():
            break
        fragments.append(chunk)
    combiner = b"".join(fragments)
    print(combiner)
    shutdown = str(input("Wanna Quit(Y/y) or (N/n): "))
    if shutdown == 'Y' or shutdown == 'y':
        c.close()
        sys.exit()
This script is just to give you an idea about the socket.settimeout().

https://docs.python.org/3/library/socket.html#socket.socket.settimeout
"Changed in version 3.5: The socket timeout is no more reset each time data is sent successfully. The socket timeout is now the maximum total duration to send all data."

Related

Python 3.x thread hanging on join() call

I'm having trouble with a thread hanging when I call join() on it. What I am trying to do is use the Go-back N protocol for sending/receiving packets over a network, and I created a separate thread for handling the ACK's that come back from the server.
I have a single thread running this method that checks for incoming packets and retrieves the ACK number, then stores that number in a variable set up in __init__ called self.lastAck. Simplified version of the method:
# Anything not explicitly defined here is a global variable
def ack_check(self):
    ack_num = 0
    pktHdrData = '!BBBBHHLLQQLL'
    # Listening for ack number from server and store it in self.lastAck.
    while True:
        # variable also inside the __init__ method
        if (self.finish == 1):
            break
        data, address = sock.recvfrom(4096)
        clientAck = struct.unpack(pktHdrData, data)
        ackNumRecv = clientAck[9]
        self.lastAck = ackNumRecv
A simplified version of the function that creates the thread and handles the sending of the client packets:
def send(self, buffer):
    # Assume packet header and all relevant data is set up correctly here
    # ...
    t1 = threading.Thread(target=self.ack_check, args=())
    t1.daemon = True
    t1.start()
    # All of this works perfectly and breaks as expected
    while True:
        # Packets/data get sent here; we break when self.lastAck reaches
        # a specific number. Assume this works properly and breaks.
        break
    self.finish = 1
    print("About to hang here")
    t1.join()
    return bytessent
I end up hanging right after printing About to hang here and I can't figure out why. I can get it to work if I break out of the while True loop in the else section, but then I end up closing the thread before I receive all the ACK numbers from the receiver. So instead of the full 32 ACKs I'll end up with anywhere from 1 ACK to the full 32.
I think the problem lies in the ack_check method: it doesn't break out of the loop even though it should after I set self.finish = 1; it just ends up hanging every time.
Additionally there is nothing else outside of these two methods that are calling self.finish and self.lastAck. I know about deadlocking but I couldn't see how that would be possible in this situation.
Sidenote: I realize the Go-Back N protocol is not properly implemented at all here, but this was the first step I took in creating it.
As per the comments, the recvfrom call in ack_check left the thread hanging. Fixed code:
def ack_check(self):
    ack_num = 0
    pktHdrData = '!BBBBHHLLQQLL'
    # Listening for ack number from server and store it in self.lastAck.
    while True:
        # variable also inside the __init__ method
        if (self.finish == 1):
            break
        sock.settimeout(0.2)
        try:
            data, address = sock.recvfrom(4096)
        except socket.timeout:
            break
        clientAck = struct.unpack(pktHdrData, data)
        ackNumRecv = clientAck[9]
        self.lastAck = ackNumRecv

Python: Multithreaded socket server runs endlessly when client stops unexpectedly

I have created a multithreaded socket server to connect many clients to the server using Python. If a client stops unexpectedly due to an exception, the server runs nonstop. Is there a way to kill that particular thread alone in the server and keep the rest running?
Server:
class ClientThread(Thread):
    def __init__(self, ip, port):
        Thread.__init__(self)
        self.ip = ip
        self.port = port
        print("New server socket thread started for " + ip + ":" + str(port))

    def run(self):
        while True:
            try:
                message = conn.recv(2048)
                dataInfo = message.decode('ascii')
                print("recv:::::" + str(dataInfo) + "::")
            except:
                print("Unexpected error:", sys.exc_info()[0])
                Thread._stop(self)
tcpServer = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcpServer.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
tcpServer.bind((TCP_IP, 0))
tcpServer.listen(10)
print("Port:" + str(tcpServer.getsockname()[1]))
threads = []
while True:
    print("Waiting for connections from clients...")
    (conn, (ip, port)) = tcpServer.accept()
    newthread = ClientThread(ip, port)
    newthread.start()
    threads.append(newthread)
for t in threads:
    t.join()
Client:
def Main():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((host, int(port)))
    while True:
        try:
            message = input("Enter Command")
            s.send(message.encode('ascii'))
        except Exception as ex:
            logging.exception("Unexpected error:")
            break
    s.close()
Sorry about a very, very long answer but here goes.
There are quite a few issues with your code. First of all, your client does not actually close the socket, as s.close() will never get executed. Your loop is interrupted at break, and anything that follows it will be ignored. So change the order of these statements for the sake of good programming, but it has nothing to do with your problem.
Your server code is wrong in quite a few ways. As it is currently written, it never exits. Your threads also do not work right. I have fixed your code so that it is a working, multithreaded server, but it still does not exit, as I have no idea what the trigger would be to make it exit. But let us start from the main loop:
while True:
    print("Waiting for connections from clients...")
    (conn, (ip, port)) = tcpServer.accept()
    newthread = ClientThread(conn, ip, port)
    newthread.daemon = True
    newthread.start()
    threads.append(newthread)  # Do we need this?
for t in threads:
    t.join()
I have added passing of conn to your client thread, the reason of which becomes apparent in a moment. However, your while True loop never breaks, so you will never enter the for loop where you join your threads. If your server is meant to be run indefinitely, this is not a problem at all. Just remove the for loop and this part is fine. You do not need to join threads just for the sake of joining them. Joining threads only allows your program to block until a thread has finished executing.
Another addition is newthread.daemon = True. This sets your threads to daemonic, which means they will exit as soon as your main thread exits. Now your server responds to Ctrl+C even when there are active connections.
If your server is meant to be never-ending, there is also no need to store threads in your main loop's threads list. The list just keeps growing, as a new entry is added every time a client connects and disconnects, and this leaks memory since you are not using the threads list for anything. I have kept it as it was there, but there still is no mechanism to exit the infinite loop.
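If you do want to keep the list around, one simple option (just a sketch, not part of the original code) is to prune finished threads on every pass through the accept loop:
# inside the accept loop, before appending the new thread
threads = [t for t in threads if t.is_alive()]
threads.append(newthread)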
Then let us move on to your thread. If you want to simplify the code, you can replace the run part with a function. There is no need to subclass Thread in this case, but this works so I have kept your structure:
class ClientThread(Thread):
    def __init__(self, conn, ip, port):
        Thread.__init__(self)
        self.ip = ip
        self.port = port
        self.conn = conn
        print("New server socket thread started for " + ip + ":" + str(port))

    def run(self):
        while True:
            try:
                message = self.conn.recv(2048)
                if not message:
                    print("closed")
                    try:
                        self.conn.close()
                    except:
                        pass
                    return
                try:
                    dataInfo = message.decode('ascii')
                    print("recv:::::" + str(dataInfo) + "::")
                except UnicodeDecodeError:
                    print("non-ascii data")
                    continue
            except socket.error:
                print("Unexpected error:", sys.exc_info()[0])
                try:
                    self.conn.close()
                except:
                    pass
                return
First of all, we store conn to self.conn. Your version used a global conn variable. This caused unexpected results when you had more than one connection to the server. conn is actually a new socket created for the client connection at accept, and it is unique to each thread. This is how servers differentiate between client connections. They listen on a known port, but when the server accepts a connection, accept creates another socket for that particular connection and returns it. This is why we need to pass it to the thread and then read from self.conn instead of the global conn.
Your server "hung" upon client connection errors because there was no mechanism to detect this in your loop. If the client closes the connection, socket.recv() does not raise an exception but returns nothing. This is the condition you need to detect. I am fairly sure you do not even need try/except here, but it does not hurt - you do, however, need to name the exception you are expecting. In this case, catching everything with a bare except is just wrong. You also have another statement there that can potentially raise exceptions. If your client sends something that cannot be decoded with the ascii codec, you would get UnicodeDecodeError (try this without error handling: telnet to your server port and copy-paste some Hebrew or Japanese into the connection and see what happens). If you just caught everything and treated it as a socket error, you would enter the thread-ending part of the code merely because you could not parse a message. Typically we just ignore "illegal" messages and carry on; I have added this. If you want to shut down the connection upon receiving a "bad" message, just add self.conn.close() and return to this exception handler as well.
Then when you really are encountering a socket error - or the client has closed the connection, you will need to close the socket and exit the thread. You will call close() on the socket - encapsulating it in try/except as you do not really care if it fails for not being there anymore.
And when you want to exit your thread, you just return from your run() loop. When you do this, your thread exits orderly. As simple as that.
Then there is yet another potential problem, if you are not only printing the messages but are parsing them and doing something with the data you receive. This I do not fix but leave this to you.
TCP sockets transmit data, not messages. When you build a communication protocol, you must not assume that when your recv returns, it will return a single message. When your recv() returns something, it can mean one of five things:
1. The client has closed the connection and nothing is returned
2. There is exactly one full message and you receive that
3. There is only a partial message, either because you read the socket before the client had transmitted all data, or because the client sent more than 2048 bytes (even if your client never sends over 2048 bytes, a malicious client would definitely try this)
4. There is more than one message waiting and you received them all
5. As in 4, but the last message is partial.
Most socket programming mistakes are related to this. The programmer expects 2 to happen (as you do now) but does not cater for 3-5. You should instead analyse what was received and act accordingly. If there seems to be less data than a full message, store it somewhere and wait for more data to appear. When more data appears, concatenate these and see if you now have a full message. And when you have parsed a full message from this buffer, inspect the buffer to see if there is more data there - the first part of the next message, or even more full messages if your client is fast and the server is slow. If you process a message and then wipe the buffer, you might also have wiped bytes from your next message.
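A common way to handle cases 3-5 is to frame messages explicitly, for example with a length prefix, and to keep a receive buffer between recv() calls. The 4-byte header and the generator below are an illustrative protocol of my own choosing, not something from the question:
import struct

buffer = b""

def feed(data):
    # Append newly received bytes and yield every complete message
    global buffer
    buffer += data
    while len(buffer) >= 4:
        (length,) = struct.unpack("!I", buffer[:4])  # 4-byte size header
        if len(buffer) < 4 + length:
            break                                    # message still partial
        message = buffer[4:4 + length]
        buffer = buffer[4 + length:]                 # keep any trailing bytes
        yield message

# usage: for msg in feed(conn.recv(2048)): handle(msg)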

python recv losing first bytes of data

I have a problem when receiving messages sent over an SSL socket. On rare occasions I lose the first few bytes of a message. I am pretty certain this is somehow a speed problem, since it only seems to happen when two messages are sent in rapid succession (1-2 milliseconds apart). I am running the receiving code in a separate thread with minimal code, dumping the messages into a queue as they arrive.
queue = Queue()
...
def read_feed(session_key, hostname, port, ssl_socket):
    ''' READ whatever is coming on the stream '''
    while True:
        try:
            output = ssl_socket.recv(2048)  # Message size always < 2048
        except (ConnectionResetError, OSError):
            logger.info("Connecting feed")
            try:
                ssl_socket.connect((hostname, port))
            except ValueError:  # Something's wrong, disconnect and do a new round
                ssl_socket.close()
            else:
                cmd = {"cmd": "login", "args": {"session_key": session_key}}
                data = str.encode(json.dumps(cmd) + "\n")
                num_bytes = ssl_socket.send(data)
        else:
            queue.put(output)
...
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ssl_socket = ssl.wrap_socket(s)
...
t3 = threading.Thread(target=read_feed, name='Read Feed',
                      args=(session_key, hostname, port, ssl_socket))
t3.start()
I first suspected that somehow the other threads running were stealing too much CPU time, so that the network buffer was filled before this thread got a chance to run, but I have tried using a multi-core machine and the problem persists.
In essence, this should be the only code running when I am connected:
while True:
    output = ssl_socket.recv(2048)  # Message size always < 2048
    queue.put(output)
Or am I making the wrong assumptions here? Maybe the try/except construct is costly, or the queue.put method is slow and I should use something else? Or maybe Python is not the right tool for the job?
Any suggestions on how to improve the code so that I don't lose those few precious first bytes?

Timeout for python requests.get entire response

I'm gathering statistics on a list of websites and I'm using requests for it for simplicity. Here is my code:
data = []
websites = ['http://google.com', 'http://bbc.co.uk']
for w in websites:
    r = requests.get(w, verify=False)
    data.append((r.url, len(r.content), r.elapsed.total_seconds(),
                 str([(l.status_code, l.url) for l in r.history]),
                 str(r.headers.items()), str(r.cookies.items())))
Now, I want requests.get to timeout after 10 seconds so the loop doesn't get stuck.
This question has been of interest before too but none of the answers are clean.
I hear that maybe not using requests is a good idea but then how should I get the nice things requests offer (the ones in the tuple).
Set the timeout parameter:
r = requests.get(w, verify=False, timeout=10) # 10 seconds
As of requests 2.25.1, the code above will cause the call to requests.get() to time out if the connection takes too long to establish or if a delay between reads exceeds ten seconds. See: https://requests.readthedocs.io/en/stable/user/advanced/#timeouts
What about using eventlet? If you want to timeout the request after 10 seconds, even if data is being received, this snippet will work for you:
import requests
import eventlet
eventlet.monkey_patch()

with eventlet.Timeout(10):
    requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip", verify=False)
UPDATE: https://requests.readthedocs.io/en/master/user/advanced/#timeouts
In newer versions of requests:
If you specify a single value for the timeout, like this:
r = requests.get('https://github.com', timeout=5)
The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:
r = requests.get('https://github.com', timeout=(3.05, 27))
If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.
r = requests.get('https://github.com', timeout=None)
My old (probably outdated) answer (which was posted long time ago):
There are other ways to overcome this problem:
1. Use the TimeoutSauce internal class
From: https://github.com/kennethreitz/requests/issues/1928#issuecomment-35811896
import requests
from requests.adapters import TimeoutSauce
class MyTimeout(TimeoutSauce):
    def __init__(self, *args, **kwargs):
        connect = kwargs.get('connect', 5)
        read = kwargs.get('read', connect)
        super(MyTimeout, self).__init__(connect=connect, read=read)

requests.adapters.TimeoutSauce = MyTimeout
This code should cause us to set the read timeout as equal to the
connect timeout, which is the timeout value you pass on your
Session.get() call. (Note that I haven't actually tested this code, so
it may need some quick debugging, I just wrote it straight into the
GitHub window.)
2. Use a fork of requests from kevinburke: https://github.com/kevinburke/requests/tree/connect-timeout
From its documentation: https://github.com/kevinburke/requests/blob/connect-timeout/docs/user/advanced.rst
If you specify a single value for the timeout, like this:
r = requests.get('https://github.com', timeout=5)
The timeout value will be applied to both the connect and the read
timeouts. Specify a tuple if you would like to set the values
separately:
r = requests.get('https://github.com', timeout=(3.05, 27))
kevinburke has requested it to be merged into the main requests project, but it hasn't been accepted yet.
timeout = int(seconds)
Since requests >= 2.4.0, you can use the timeout argument, i.e:
requests.get('https://duckduckgo.com/', timeout=10)
Note:
timeout is not a time limit on the entire response download; rather,
an exception is raised if the server has not issued a response for
timeout seconds ( more precisely, if no bytes have been received on the
underlying socket for timeout seconds). If no timeout is specified
explicitly, requests do not time out.
To create a timeout you can use signals.
The best way to solve this case is probably to
Set an exception as the handler for the alarm signal
Call the alarm signal with a ten second delay
Call the function inside a try-except-finally block.
The except block is reached if the function timed out.
In the finally block you abort the alarm, so it isn't signaled later.
Here is some example code:
import signal
from time import sleep

class TimeoutException(Exception):
    """ Simple Exception to be called on timeouts. """
    pass

def _timeout(signum, frame):
    """ Raise a TimeoutException.

    This is intended for use as a signal handler.
    The signum and frame arguments passed to this are ignored.
    """
    # Raise TimeoutException with system default timeout message
    raise TimeoutException()

# Set the handler for the SIGALRM signal:
signal.signal(signal.SIGALRM, _timeout)
# Send the SIGALRM signal in 10 seconds:
signal.alarm(10)

try:
    # Do our code:
    print('This will take 11 seconds...')
    sleep(11)
    print('done!')
except TimeoutException:
    print('It timed out!')
finally:
    # Abort the sending of the SIGALRM signal:
    signal.alarm(0)
There are some caveats to this:
It is not threadsafe, signals are always delivered to the main thread, so you can't put this in any other thread.
There is a slight delay after the scheduling of the signal and the execution of the actual code. This means that the example would time out even if it only slept for ten seconds.
But it's all in the standard Python library! Except for the sleep-function import, it's only one import. If you are going to use timeouts in many places, you can easily put the TimeoutException, _timeout, and the signaling into a function and just call that. Or you can make a decorator and put it on functions, as sketched below (see also the answer linked further down).
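A decorator version might look like this (a sketch of the same SIGALRM mechanism; it reuses the TimeoutException and _timeout handler from the example above, and the names with_timeout and fetch_slow_page are illustrative):
import signal
from functools import wraps

def with_timeout(seconds=10):
    # Decorator form of the SIGALRM approach shown above
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            signal.signal(signal.SIGALRM, _timeout)  # handler defined above
            signal.alarm(seconds)
            try:
                return func(*args, **kwargs)
            finally:
                signal.alarm(0)  # always cancel the pending alarm
        return wrapper
    return decorator

@with_timeout(10)
def fetch_slow_page():
    sleep(11)  # will raise TimeoutException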
You can also set this up as a "context manager" so you can use it with the with statement:
import signal

class Timeout():
    """ Timeout for use with the `with` statement. """

    class TimeoutException(Exception):
        """ Simple Exception to be called on timeouts. """
        pass

    def _timeout(signum, frame):
        """ Raise a TimeoutException.

        This is intended for use as a signal handler.
        The signum and frame arguments passed to this are ignored.
        """
        raise Timeout.TimeoutException()

    def __init__(self, timeout=10):
        self.timeout = timeout
        signal.signal(signal.SIGALRM, Timeout._timeout)

    def __enter__(self):
        signal.alarm(self.timeout)

    def __exit__(self, exc_type, exc_value, traceback):
        signal.alarm(0)
        return exc_type is Timeout.TimeoutException

# Demonstration:
from time import sleep

print('This is going to take maximum 10 seconds...')
with Timeout(10):
    sleep(15)
    print('No timeout?')
print('Done')
One possible down side with this context manager approach is that you can't know if the code actually timed out or not.
Sources and recommended reading:
The documentation on signals
This answer on timeouts by @David Narayan. He has organized the above code as a decorator.
Try this request with timeout & error handling:
import requests

try:
    url = "http://google.com"
    r = requests.get(url, timeout=10)
except requests.exceptions.Timeout as e:
    print(e)
The connect timeout is the number of seconds Requests will wait for your client to establish a connection to a remote machine (corresponding to the connect() call on the socket). It's a good practice to set connect timeouts to slightly larger than a multiple of 3, which is the default TCP packet retransmission window.
Once your client has connected to the server and sent the HTTP request, the read timeout starts. It is the number of seconds the client will wait for the server to send a response. (Specifically, it's the number of seconds that the client will wait between bytes sent from the server. In 99.9% of cases, this is the time before the server sends the first byte.)
If you specify a single value for the timeout, it will be applied to both the connect and the read timeouts, like below:
r = requests.get('https://github.com', timeout=5)
Specify a tuple if you would like to set the values separately for connect and read:
r = requests.get('https://github.com', timeout=(3.05, 27))
If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.
r = requests.get('https://github.com', timeout=None)
https://docs.python-requests.org/en/latest/user/advanced/#timeouts
Most other answers are incorrect
Despite all the answers, I believe that this thread still lacks a proper solution, and no existing answer presents a reasonable way to do something which should be simple and obvious.
Let's start by saying that as of 2022, there is still absolutely no way to do it properly with requests alone. It is a conscious design decision by the library's developers.
Solutions utilizing the timeout parameter simply do not accomplish what they intend to do. The fact that it "seems" to work at the first glance is purely incidental:
The timeout parameter has absolutely nothing to do with the total execution time of the request. It merely controls the maximum amount of time that can pass before the underlying socket receives any data. With an example timeout of 5 seconds, the server can just as well send 1 byte of data every 4 seconds and it will be perfectly okay, but it won't help you very much.
Answers with stream and iter_content are somewhat better, but they still do not cover everything in a request. You do not actually receive anything from iter_content until after response headers are sent, which falls under the same issue - even if you use 1 byte as a chunk size for iter_content, reading full response headers could take a totally arbitrary amount of time and you can never actually get to the point in which you read any response body from iter_content.
Here are some examples that completely break both timeout and stream-based approach. Try them all. They all hang indefinitely, no matter which method you use.
server.py
import socket
import time

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, True)
server.bind(('127.0.0.1', 8080))
server.listen()

while True:
    try:
        sock, addr = server.accept()
        print('Connection from', addr)
        sock.send(b'HTTP/1.1 200 OK\r\n')
        # Send some garbage headers very slowly but steadily.
        # Never actually complete the response.
        while True:
            sock.send(b'a')
            time.sleep(1)
    except:
        pass
demo1.py
import requests
requests.get('http://localhost:8080')
demo2.py
import requests
requests.get('http://localhost:8080', timeout=5)
demo3.py
import requests
requests.get('http://localhost:8080', timeout=(5, 5))
demo4.py
import requests
with requests.get('http://localhost:8080', timeout=(5, 5), stream=True) as res:
    for chunk in res.iter_content(1):
        break
The proper solution
My approach utilizes Python's sys.settrace function. It is dead simple. You do not need to use any external libraries or turn your code upside down. Unlike most other answers, this actually guarantees that the code executes in specified time. Be aware that you still need to specify the timeout parameter, as settrace only concerns Python code. Actual socket reads are external syscalls which are not covered by settrace, but are covered by the timeout parameter. Due to this fact, the exact time limit is not TOTAL_TIMEOUT, but a value which is explained in comments below.
import requests
import sys
import time

# This function serves as a "hook" that executes for each Python statement
# down the road. There may be some performance penalty, but as downloading
# a webpage is mostly I/O bound, it's not going to be significant.
def trace_function(frame, event, arg):
    if time.time() - start > TOTAL_TIMEOUT:
        raise Exception('Timed out!')  # Use whatever exception you consider appropriate.
    return trace_function

# The following code will terminate at most after TOTAL_TIMEOUT + the highest
# value specified in the `timeout` parameter of `requests.get`.
# In this case 10 + 6 = 16 seconds.
# For most cases though, it's gonna terminate no later than TOTAL_TIMEOUT.
TOTAL_TIMEOUT = 10

start = time.time()
sys.settrace(trace_function)

try:
    res = requests.get('http://localhost:8080', timeout=(3, 6))  # Use whatever timeout values you consider appropriate.
except:
    raise
finally:
    sys.settrace(None)  # Remove the time constraint and continue normally.

# Do something with the response
Condensed
import requests, sys, time

TOTAL_TIMEOUT = 10

def trace_function(frame, event, arg):
    if time.time() - start > TOTAL_TIMEOUT:
        raise Exception('Timed out!')
    return trace_function

start = time.time()
sys.settrace(trace_function)

try:
    res = requests.get('http://localhost:8080', timeout=(3, 6))
except:
    raise
finally:
    sys.settrace(None)
That's it!
Set stream=True and use r.iter_content(1024). Yes, eventlet.Timeout just somehow doesn't work for me.
try:
    start = time()
    timeout = 5
    with get(config['source']['online'], stream=True, timeout=timeout) as r:
        r.raise_for_status()
        content = bytes()
        content_gen = r.iter_content(1024)
        while True:
            if time() - start > timeout:
                raise TimeoutError('Time out! ({} seconds)'.format(timeout))
            try:
                content += next(content_gen)
            except StopIteration:
                break
        data = content.decode().split('\n')
        if len(data) in [0, 1]:
            raise ValueError('Bad requests data')
except (exceptions.RequestException, ValueError, IndexError, KeyboardInterrupt,
        TimeoutError) as e:
    print(e)
    with open(config['source']['local']) as f:
        data = [line.strip() for line in f.readlines()]
The discussion is here https://redd.it/80kp1h
This may be overkill, but the Celery distributed task queue has good support for timeouts.
In particular, you can define a soft time limit that just raises an exception in your process (so you can clean up) and/or a hard time limit that terminates the task when the time limit has been exceeded.
Under the covers, this uses the same signals approach as referenced above, but in a more usable and manageable way. And if the list of web sites you are monitoring is long, you might benefit from its primary feature: all kinds of ways to manage the execution of a large number of tasks.
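A rough sketch of what that looks like (assuming a configured Celery app; the broker URL and the limit values are placeholders):
from celery import Celery
from celery.exceptions import SoftTimeLimitExceeded
import requests

app = Celery('tasks', broker='redis://localhost:6379/0')  # assumed broker

@app.task(soft_time_limit=10, time_limit=15)
def fetch(url):
    try:
        return requests.get(url).status_code
    except SoftTimeLimitExceeded:
        # raised inside the task after 10 s, leaving time to clean up
        # before the hard limit terminates the task at 15 s
        return None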
I believe you can use multiprocessing and not depend on a 3rd party package:
import multiprocessing
import requests

def call_with_timeout(func, args, kwargs, timeout):
    manager = multiprocessing.Manager()
    return_dict = manager.dict()

    # define a wrapper of `return_dict` to store the result.
    def function(return_dict):
        return_dict['value'] = func(*args, **kwargs)

    p = multiprocessing.Process(target=function, args=(return_dict,))
    p.start()

    # Force a max. `timeout` or wait for the process to finish
    p.join(timeout)

    # If the process is still active, it didn't finish: raise TimeoutError
    if p.is_alive():
        p.terminate()
        p.join()
        raise TimeoutError
    else:
        return return_dict['value']

call_with_timeout(requests.get, args=(url,), kwargs={'timeout': 10}, timeout=60)
The timeout passed in kwargs is the timeout to get any response from the server; the timeout argument is the timeout to get the complete response.
Despite the question being about requests, I find this very easy to do with pycurl CURLOPT_TIMEOUT or CURLOPT_TIMEOUT_MS.
No threading or signaling required:
import traceback
import pycurl
from io import BytesIO

url = 'http://www.example.com/example.zip'
timeout_ms = 1000
raw = BytesIO()
c = pycurl.Curl()
c.setopt(pycurl.TIMEOUT_MS, timeout_ms)  # total timeout in milliseconds
c.setopt(pycurl.WRITEFUNCTION, raw.write)
c.setopt(pycurl.NOSIGNAL, 1)
c.setopt(pycurl.URL, url)
c.setopt(pycurl.HTTPGET, 1)
try:
    c.perform()
except pycurl.error:
    traceback.print_exc()  # error generated on timeout
    pass  # or just pass if you don't want to print the error
In case you're using the option stream=True you can do this:
import time
import requests

r = requests.get(
    'http://url_to_large_file',
    timeout=1,  # relevant only for the underlying socket
    stream=True)

with open('/tmp/out_file.txt', 'wb') as f:
    start_time = time.time()
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:  # filter out keep-alive new chunks
            f.write(chunk)
        if time.time() - start_time > 8:
            raise Exception('Request took longer than 8s')
The solution does not need signals or multiprocessing.
Just one more solution (from http://docs.python-requests.org/en/master/user/advanced/#streaming-uploads).
Before downloading the body, you can check the content size:
TOO_LONG = 10 * 1024 * 1024  # 10 MB
big_url = "http://ipv4.download.thinkbroadband.com/1GB.zip"
r = requests.get(big_url, stream=True)
print(r.headers['content-length'])
# 1073741824

if int(r.headers['content-length']) < TOO_LONG:
    # download the content:
    content = r.content
But be careful: a sender can put an incorrect value in the Content-Length response header.
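If you cannot trust Content-Length, you can enforce the limit on the bytes you actually read instead (a sketch, reusing the streaming response r and the TOO_LONG cap from above):
content = b""
for chunk in r.iter_content(8192):
    content += chunk
    if len(content) > TOO_LONG:
        r.close()
        raise ValueError('response body exceeded 10 MB')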
timeout = (connect timeout, read timeout); or pass a single value (timeout=1):
import requests

try:
    req = requests.request('GET', 'https://www.google.com', timeout=(1, 1))
    print(req)
except requests.ReadTimeout:
    print("READ TIME OUT")
This code works for socket errors 11004 and 10060:
# -*- encoding:UTF-8 -*-
__author__ = 'ACE'

import requests
from PyQt4.QtCore import *
from PyQt4.QtGui import *

class TimeOutModel(QThread):
    Existed = pyqtSignal(bool)
    TimeOut = pyqtSignal()

    def __init__(self, fun, timeout=500, parent=None):
        """
        :param fun: function or lambda
        :param timeout: ms
        """
        super(TimeOutModel, self).__init__(parent)
        self.fun = fun
        self.timeer = QTimer(self)
        self.timeer.setInterval(timeout)
        self.timeer.timeout.connect(self.time_timeout)
        self.Existed.connect(self.timeer.stop)
        self.timeer.start()
        self.setTerminationEnabled(True)

    def time_timeout(self):
        self.timeer.stop()
        self.TimeOut.emit()
        self.quit()
        self.terminate()

    def run(self):
        self.fun()

bb = lambda: requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip")
a = QApplication([])
z = TimeOutModel(bb, 500)
print('timeout')
a.exec_()
Well, I tried many of the solutions on this page and still faced instabilities, random hangs, and poor connection performance.
I'm now using curl and I'm really happy about its "max time" functionality and the overall performance, even with such a crude implementation:
import subprocess
content = subprocess.getoutput('curl -m6 -Ss "http://mywebsite.xyz"')
Here, I set a 6-second max-time parameter, covering both connection and transfer time.
I'm sure curl has a nice Python binding, if you prefer to stick to the pythonic syntax :)
There is a package called timeout-decorator that you can use to time out any python function.
import time
import timeout_decorator

@timeout_decorator.timeout(5)
def mytest():
    print("Start")
    for i in range(1, 10):
        time.sleep(1)
        print("{} seconds have passed".format(i))
It uses the signals approach that some answers here suggest. Alternatively, you can tell it to use multiprocessing instead of signals (e.g. if you are in a multi-thread environment).
If it comes to that, create a watchdog thread that messes up requests' internal state after 10 seconds, e.g.:
closes the underlying socket, and ideally
triggers an exception if requests retries the operation
Note that depending on the system libraries you may be unable to set a deadline on DNS resolution.
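A sketch of that idea follows. One assumption to verify on your own stack: whether Response.close() actually aborts an in-flight read depends on the urllib3 version in use; the function name is illustrative.
import threading
import requests

def get_with_watchdog(url, deadline=10):
    resp = requests.get(url, stream=True, timeout=deadline)
    # The watchdog closes the response after `deadline` seconds, which
    # should tear down the connection underneath the blocked read
    watchdog = threading.Timer(deadline, resp.close)
    watchdog.start()
    try:
        return resp.content  # fails if the watchdog fired mid-read
    finally:
        watchdog.cancel()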
I'm using requests 2.2.1, and eventlet didn't work for me. Instead I was able to use the gevent timeout, since gevent is already used in my service for gunicorn.
import gevent
import gevent.monkey
gevent.monkey.patch_all(subprocess=True)

try:
    with gevent.Timeout(5):
        ret = requests.get(url)
        print(ret.status_code, ret.content)
except gevent.timeout.Timeout as e:
    print("timeout: {}".format(e))
Please note that gevent.timeout.Timeout is not caught by general Exception handling.
So either explicitly catch gevent.timeout.Timeout
or pass in a different exception to be used like so: with gevent.Timeout(5, requests.exceptions.Timeout): although no message is passed when this exception is raised.
The biggest problem is that if the connection can't be established, the requests package waits too long and blocks the rest of the program.
There are several ways to tackle the problem, but when I looked for a one-liner similar to requests, I couldn't find anything. That's why I built a wrapper around requests called reqto ("requests timeout"), which supports proper timeouts for all standard methods from requests.
pip install reqto
The syntax is identical to requests
import reqto
response = reqto.get(f'https://pypi.org/pypi/reqto/json',timeout=1)
# Will raise an exception on Timeout
print(response)
Moreover, you can set up a custom timeout function
def custom_function(parameter):
    print(parameter)

response = reqto.get(f'https://pypi.org/pypi/reqto/json', timeout=5,
                     timeout_function=custom_function,
                     timeout_args="Timeout custom function called")
# Will call timeout_function instead of raising an exception on Timeout
print(response)
An important note: the line
import reqto
needs to come before all other imports that work with requests, threading, etc., because of the monkey patching that runs in the background.
I came up with a more direct solution that is admittedly ugly but fixes the real problem. It goes a bit like this:
resp = requests.get(some_url, stream=True)
resp.raw._fp.fp._sock.settimeout(read_timeout)
# This will load the entire response even though stream is set
content = resp.content
You can read the full explanation here

What does Python's socket.recv() return for non-blocking sockets if no data is received until a timeout occurs?

Basically, I've read in several places that socket.recv() will return whatever it can read, or an empty string signalling that the other side has shut down (the official docs don't even mention what it returns when the connection is shut down... great!). This is all fine and dandy for blocking sockets, since we know that recv() only returns when there actually is something to receive, so when it returns an empty string, it MUST mean the other side has closed the connection, right?
Okay, fine, but what happens when my socket is non-blocking?? I have searched a bit (maybe not enough, who knows?) and can't figure out how to tell when the other side has closed the connection using a non-blocking socket. There seems to be no method or attribute that tells us this, and comparing the return value of recv() to the empty string seems absolutely useless... is it just me having this problem?
As a simple example, let's say my socket's timeout is set to 1.2342342 (whatever non-negative number you like here) seconds and I call socket.recv(1024), but the other side doesn't send anything during that 1.2342342 second period. The recv() call will return an empty string and I have no clue as to whether the connection is still standing or not...
In the case of a non blocking socket that has no data available, recv will throw the socket.error exception and the value of the exception will have the errno of either EAGAIN or EWOULDBLOCK. Example:
import sys
import socket
import fcntl, os
import errno
from time import sleep

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('127.0.0.1', 9999))
fcntl.fcntl(s, fcntl.F_SETFL, os.O_NONBLOCK)

while True:
    try:
        msg = s.recv(4096)
    except socket.error as e:
        err = e.args[0]
        if err == errno.EAGAIN or err == errno.EWOULDBLOCK:
            sleep(1)
            print('No data available')
            continue
        else:
            # a "real" error occurred
            print(e)
            sys.exit(1)
    else:
        # got a message, do something :)
        pass
The situation is a little different when you've enabled non-blocking behavior via a timeout with socket.settimeout(n) or socket.setblocking(False). In this case a socket.error is still raised, but in the case of a timeout, the accompanying value of the exception is always a string set to 'timed out'. So, to handle this case you can do:
import sys
import socket
from time import sleep

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('127.0.0.1', 9999))
s.settimeout(2)

while True:
    try:
        msg = s.recv(4096)
    except socket.timeout as e:
        err = e.args[0]
        # this next if/else is a bit redundant, but illustrates how the
        # timeout exception is set up
        if err == 'timed out':
            sleep(1)
            print('recv timed out, retry later')
            continue
        else:
            print(e)
            sys.exit(1)
    except socket.error as e:
        # Something else happened, handle error, exit, etc.
        print(e)
        sys.exit(1)
    else:
        if len(msg) == 0:
            print('orderly shutdown on server end')
            sys.exit(0)
        else:
            # got a message, do something :)
            pass
As indicated in the comments, this is also a more portable solution since it doesn't depend on OS specific functionality to put the socket into non-blockng mode.
See recv(2) and python socket for more details.
It is simple: if recv() returns 0 bytes, you will not receive any more data on this connection. Ever. You might still be able to send.
It also means that your non-blocking socket has to raise an exception (which may be system-dependent) if no data is available but the connection is still alive (the other end may still send).
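In Python 3 the distinction looks like this (a minimal sketch; sock is assumed to be an already-connected socket):
import socket

sock.setblocking(False)
try:
    data = sock.recv(4096)
except BlockingIOError:
    print('connection alive, nothing to read right now')
else:
    if data == b"":
        print('peer performed an orderly shutdown')
    else:
        print('got data:', data)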
When you use recv in combination with select: if the socket is ready to be read from but there is no data to read, that means the client has closed the connection.
Here is some code that handles this. Also note the exception that is thrown when recv is called a second time in the while loop: if there is nothing left to read, this exception will be thrown, and it doesn't mean the client has closed the connection:
def listenToSockets(self):
    while True:
        changed_sockets = self.currentSockets
        ready_to_read, ready_to_write, in_error = select.select(changed_sockets, [], [], 0.1)
        for s in ready_to_read:
            if s == self.serverSocket:
                self.acceptNewConnection(s)
            else:
                self.readDataFromSocket(s)
And the function that receives the data :
def readDataFromSocket(self, sock):
    data = b''
    buffer = b''
    try:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            buffer += data
    except socket.error as e:
        # error 10035 is "no data available"; it is non-fatal
        if e.errno != 10035:
            print('socket.error - (' + str(e.errno) + ') ' + str(e))
    if data:
        print('received', buffer)
    else:
        print('disconnected')
Just to complete the existing answers, I'd suggest using select instead of non-blocking sockets. The point is that non-blocking sockets complicate things (except perhaps sending), so I'd say there is no reason to use them at all. If you regularly have the problem that your app is blocked waiting for IO, I would also consider doing the IO in a separate background thread.
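A minimal select loop looks like this (a sketch; sock is assumed to be a connected socket in its default blocking mode):
import select

readable, _, _ = select.select([sock], [], [], 1.0)  # 1-second poll
if sock in readable:
    data = sock.recv(4096)  # guaranteed not to block now
    if not data:
        print('peer closed the connection')
else:
    print('no data within the poll interval')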
