How to handle timeouts with httplib (python 2.6)? - python

I'm using httplib to access an api over https and need to build in exception handling in the event that the api is down.
Here's an example connection:
connection = httplib.HTTPSConnection('non-existent-api.com', timeout=1)
connection.request('POST', '/request.api', xml, headers={'Content-Type': 'text/xml'})
response = connection.getresponse()
This should time out, so I was expecting an exception to be raised, but response.read() just returns an empty string.
How can I know if there was a timeout? Even better, what's the best way to gracefully handle the problem of a 3rd-party API being down?

Even better, what's the best way to gracefully handle the problem of a 3rd-party api being down?
What do you mean by the API being down: the API returning HTTP 404, 500, and so on,
or the API not being reachable at all?
First of all, I don't think you can know whether a web service is down before trying to access it, so for the first case I would recommend something like this:
import httplib

conn = httplib.HTTPConnection('www.google.com')  # using HTTP instead of HTTPS to keep the example simple
conn.request('HEAD', '/')  # just send an HTTP HEAD request
res = conn.getresponse()
if res.status == 200:
    print "ok"
else:
    print "problem : the query returned %s because %s" % (res.status, res.reason)
And for checking whether the API is reachable at all, I think you are better off with a try/except:
import httplib
import socket

try:
    # I don't think you need the timeout unless you want to also calculate the response time ...
    conn = httplib.HTTPSConnection('www.google.com')
    conn.connect()
except (httplib.HTTPException, socket.error) as ex:
    print "Error: %s" % ex
You can mix the two approaches if you want something more general. Hope this helps.
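To answer the "how can I know if there was a timeout" part directly: with a per-connection timeout (as in the question), the failure surfaces as socket.timeout, so a sketch combining both checks might look like this (hostname and path are placeholders):

import httplib
import socket

def call_api(host, path, timeout=5):
    """Return (status, reason) if the API answered, or None if it is unreachable."""
    conn = httplib.HTTPSConnection(host, timeout=timeout)
    try:
        conn.request('HEAD', path)
        res = conn.getresponse()
        return res.status, res.reason
    except socket.timeout:
        print "the API did not answer within %d seconds" % timeout
        return None
    except (httplib.HTTPException, socket.error) as err:
        print "the API is unreachable: %s" % err
        return None
    finally:
        conn.close()

result = call_api('non-existent-api.com', '/request.api', timeout=1)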

On older Python versions, urllib and httplib don't expose a timeout parameter; you have to import socket and set a default timeout there (on 2.6+ the connection classes also accept a per-connection timeout, as in the question):
import socket
socket.setdefaulttimeout(10)  # or whatever timeout you want

This is what I found to work correctly with httplib2. Posting it as it might still help someone:
import httplib2, socket

def check_url(url):
    h = httplib2.Http(timeout=0.1)  # 100 ms timeout
    try:
        resp = h.request(url, 'HEAD')
    except (httplib2.HttpLib2Error, socket.error) as ex:
        print "Request timed out for", url
        return False
    return int(resp[0]['status']) < 400
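Example usage (any reachable URL works; 0.1 s is aggressive, so raise the timeout for slower hosts):

if check_url("http://www.example.com"):
    print "reachable"
else:
    print "unreachable, or slower than the 100 ms timeout"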

Related

Handling unsuccessful requests in python

I would like to make various HTTP requests and display the actual response status code and reason regardless of any HTTP exceptions; e.g. if it returns 503 or 404, I just want to display that status code and handle it rather than showing an exception stack trace.
However, what currently happens in the code below is that the reason is never populated if the request is unsuccessful, so the request summary is never displayed.
Any suggestions?
import http.server
import socketserver
import socket
import requests

PORT = 5000
URL1 = "https://foo/"
# URL2 =

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()
        self.wfile.write(("<br>Running on: %s" % socket.gethostname()).encode('utf-8'))
        if self.path == '/status':
            self.wfile.write(("<h1>status</h1>").encode('utf-8'))
            try:
                response = requests.get(URL1, verify=False)
                self.wfile.write(("<br>Request client connection : {}, Response Status: {}, Response Reason: {}".format(response.url, response.status_code, response.reason)).encode('utf-8'))
            except:
                self.wfile.write(("exception").encode('utf-8'))
                #self.wfile.write(("<br>Request client connection : {}, Response Status: {}, Response Reason: {}".format(response.url, response.status_code, response.reason)).encode('utf-8'))
            return
        return

httpd = socketserver.TCPServer(("", PORT), Handler)
print("serving at port: %s" % PORT)
httpd.serve_forever()
In this case the code is not checking for unsuccessful requests; the try will catch some exceptions but not all. What you want is raise_for_status(), which raises an exception for a failed status code. See also Response.raise_for_status in the requests documentation.
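For illustration, a sketch built on the question's URL1 (the exception classes are the ones requests actually raises): a 4xx/5xx response still populates status_code and reason, and only connection-level failures raise before a response object exists.

import requests

URL1 = "https://foo/"  # placeholder from the question
try:
    response = requests.get(URL1, verify=False, timeout=5)
    response.raise_for_status()                   # raises HTTPError on 4xx/5xx
    summary = "OK: {} {}".format(response.status_code, response.reason)
except requests.exceptions.HTTPError as err:
    r = err.response                              # the response is still available here
    summary = "HTTP error: {} {}".format(r.status_code, r.reason)
except requests.exceptions.RequestException as err:
    summary = "No response received: {}".format(err)
print(summary)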

Adding timeout while fetching server certs via python

I am trying to fetch a list of server certificates, using the Python standard ssl library to accomplish this. This is how I am doing it:
import ssl
from socket import *
urls = [i.strip().lower() for i in open("urls.txt")]
for url in urls:
    try:
        print ssl.get_server_certificate((url, 443))
    except error:
        print "No connection"
However, for some URLs there are connectivity issues and the connection just times out. It waits for the default SSL timeout value (which is quite long) before timing out. How do I specify a timeout in the ssl.get_server_certificate method? I have specified timeouts for sockets before, but I am clueless as to how to do it for this method.
From the docs:
SSL sockets provide the following methods of Socket Objects:
gettimeout(), settimeout(), setblocking()
So it should be as simple as:
import ssl
from socket import *

setdefaulttimeout(10)
urls = [i.strip().lower() for i in open("urls.txt")]
for url in urls:
    try:
        print ssl.get_server_certificate((url, 443))
    except (error, timeout) as err:
        print "No connection: {0}".format(err)
This version runs for me with Python 3.9.12 (hat tip to bchurchill):
import ssl
import socket

socket.setdefaulttimeout(2)
urls = [i.strip().lower() for i in open("urls.txt")]
for url in urls:
    try:
        certificate = ssl.get_server_certificate((url, 443))
        print(certificate)
    except Exception as err:
        print(f"No connection to {url} due to: {err}")
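As a side note, on recent Python 3 releases (3.10+, if I remember the changelog correctly) ssl.get_server_certificate also accepts a timeout argument directly, so the global default is no longer needed:

import ssl

# Assumes Python 3.10+, where get_server_certificate() gained a timeout parameter;
# on older versions, fall back to socket.setdefaulttimeout() as shown above.
certificate = ssl.get_server_certificate(("www.python.org", 443), timeout=2)
print(certificate)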

Python: checking internet connection (more than once)

I have implemented a quick solution to check for an internet connection in one of my Python programs, using what I found on SO:
def check_internet(self):
    try:
        response = urllib2.urlopen('http://www.google.com', timeout=2)
        print "you are connected"
        return True
    except urllib2.URLError as err:
        print err
        print "you are disconnected"
It works well ONCE, and correctly shows that I am not connected when I try it while offline. But if I re-establish the connection and try again, it still says I am not connected.
Is the urllib2 connection not closed somehow? Should I do something to reset it?
This could be because of server-side caching.
Try this:
def check_internet(self):
    try:
        header = {"pragma": "no-cache"}  # tells the server to send a fresh copy
        req = urllib2.Request("http://www.google.com", headers=header)
        response = urllib2.urlopen(req, timeout=2)
        print "you are connected"
        return True
    except urllib2.URLError as err:
        print err
I haven't tested it. But according to the 'pragma' definition, it should work.
There is a good discussion here if you want to know about pragma: Difference between Pragma and Cache-control headers?
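For completeness, a variant (my sketch, equally untested) that sends both the HTTP/1.0 Pragma directive and its HTTP/1.1 counterpart Cache-Control:

def check_internet(self):
    headers = {
        "Pragma": "no-cache",         # HTTP/1.0 cache-busting directive
        "Cache-Control": "no-cache",  # HTTP/1.1 equivalent
    }
    req = urllib2.Request("http://www.google.com", headers=headers)
    try:
        urllib2.urlopen(req, timeout=2)
        print "you are connected"
        return True
    except urllib2.URLError as err:
        print err
        return False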
This is how I used to check my connectivity for one of my applications.
import httplib
import socket

test_con_url = "www.google.com"  # host used for connection testing
test_con_resource = "/intl/en/policies/privacy/"  # may change in future
test_con = httplib.HTTPConnection(test_con_url)  # create a connection
try:
    test_con.request("GET", test_con_resource)  # do a GET request
    response = test_con.getresponse()
except httplib.ResponseNotReady as e:
    print "Improper connection state"
except socket.gaierror as e:
    print "Not connected"
else:
    print "Connected"
test_con.close()
I tested the code enabling/disabling my LAN connection repeatedly and it works.
It will be faster to just make a HEAD request so no HTML will be fetched.
Also I am sure google would like it better this way :)
# uncomment for python2
# import httplib
import http.client as httplib

def have_internet():
    conn = httplib.HTTPConnection("www.google.com")
    try:
        conn.request("HEAD", "/")
        return True
    except:
        return False
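A slightly tightened variant (my sketch): add a timeout, close the connection, and catch only the exceptions we expect instead of a bare except.

import socket
# uncomment for python2
# import httplib
import http.client as httplib

def have_internet(host="www.google.com", timeout=2):
    conn = httplib.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        conn.getresponse()
        return True
    except (httplib.HTTPException, socket.error):  # socket.error is OSError on Python 3
        return False
    finally:
        conn.close()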

Read timeout using either urllib2 or any other http library

I have code for reading an url like this:
from urllib2 import Request, urlopen
req = Request(url)
for key, val in headers.items():
    req.add_header(key, val)
res = urlopen(req, timeout = timeout)
# This line blocks
content = res.read()
The timeout works for the urlopen() call. But then the code gets to the res.read() call where I want to read the response data and the timeout isn't applied there. So the read call may hang almost forever waiting for data from the server. The only solution I've found is to use a signal to interrupt the read() which is not suitable for me since I'm using threads.
What other options are there? Is there a HTTP library for Python that handles read timeouts? I've looked at httplib2 and requests and they seem to suffer the same issue as above. I don't want to write my own nonblocking network code using the socket module because I think there should already be a library for this.
Update: None of the solutions below are doing it for me. You can see for yourself that setting the socket or urlopen timeout has no effect when downloading a large file:
from urllib2 import urlopen
url = 'http://iso.linuxquestions.org/download/388/7163/http/se.releases.ubuntu.com/ubuntu-12.04.3-desktop-i386.iso'
c = urlopen(url)
c.read()
At least on Windows with Python 2.7.3, the timeouts are being completely ignored.
It's not possible for any library to do this without using some kind of asynchronous timer through threads or otherwise. The reason is that the timeout parameter used in httplib, urllib2 and other libraries sets the timeout on the underlying socket. And what this actually does is explained in the documentation.
SO_RCVTIMEO
Sets the timeout value that specifies the maximum amount of time an input function waits until it completes. It accepts a timeval structure with the number of seconds and microseconds specifying the limit on how long to wait for an input operation to complete. If a receive operation has blocked for this much time without receiving additional data, it shall return with a partial count or errno set to [EAGAIN] or [EWOULDBLOCK] if no data is received.
The last sentence is the key part: a socket.timeout is only raised if not a single byte has been received for the duration of the timeout window. In other words, this is a timeout between received bytes.
A simple function using threading.Timer could be as follows.
import httplib
import socket
import threading

def download(host, path, timeout=10):
    content = None
    http = httplib.HTTPConnection(host)
    http.request('GET', path)
    response = http.getresponse()
    timer = threading.Timer(timeout, http.sock.shutdown, [socket.SHUT_RD])
    timer.start()
    try:
        content = response.read()
    except httplib.IncompleteRead:
        pass
    timer.cancel()  # cancel on a triggered Timer is safe
    http.close()
    return content
>>> host = 'releases.ubuntu.com'
>>> content = download(host, '/15.04/ubuntu-15.04-desktop-amd64.iso', 1)
>>> print content is None
True
>>> content = download(host, '/15.04/MD5SUMS', 1)
>>> print content is None
False
Other than checking for None, it's also possible to catch the httplib.IncompleteRead exception not inside the function, but outside of it. The latter case will not work though if the HTTP request doesn't have a Content-Length header.
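A sketch of that alternative (it assumes a download() that does not swallow httplib.IncompleteRead itself):

try:
    content = download(host, '/15.04/ubuntu-15.04-desktop-amd64.iso', 1)
except httplib.IncompleteRead:
    content = None  # the read was cut short when the timer shut the socket down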
I found in my tests (using the technique described here) that a timeout set in the urlopen() call also affects the read() call:
import urllib2 as u
c = u.urlopen('http://localhost/', timeout=5.0)
s = c.read(1<<20)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/socket.py", line 380, in read
data = self._sock.recv(left)
File "/usr/lib/python2.7/httplib.py", line 561, in read
s = self.fp.read(amt)
File "/usr/lib/python2.7/httplib.py", line 1298, in read
return s + self._file.read(amt - len(s))
File "/usr/lib/python2.7/socket.py", line 380, in read
data = self._sock.recv(left)
socket.timeout: timed out
Maybe it's a feature of newer versions? I'm using Python 2.7 on a 12.04 Ubuntu straight out of the box.
One possible (imperfect) solution is to set the global socket timeout, explained in more detail here:
import socket
import urllib2
# timeout in seconds
socket.setdefaulttimeout(10)
# this call to urllib2.urlopen now uses the default timeout
# we have set in the socket module
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
However, this only works if you're willing to globally modify the timeout for all users of the socket module. I'm running the request from within a Celery task, so doing this would mess up timeouts for the Celery worker code itself.
I'd be happy to hear any other solutions...
I'd expect this to be a common problem, and yet no answers are to be found anywhere... I just built a solution for this using a timeout signal:
import urllib2
import socket
import signal

timeout = 10
socket.setdefaulttimeout(timeout)

def timeout_catcher(signum, _):
    raise urllib2.URLError("Read timeout")

signal.signal(signal.SIGALRM, timeout_catcher)

def safe_read(url, timeout_time):
    signal.setitimer(signal.ITIMER_REAL, timeout_time)
    content = urllib2.urlopen(url, timeout=timeout_time).read()
    signal.setitimer(signal.ITIMER_REAL, 0)
    # you should also catch any exceptions going out of urlopen here,
    # set the timer to 0, and pass the exceptions on.
    return content
The credit for the signal part of the solution goes here btw: python timer mystery
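A minimal sketch of what that comment suggests, restructured with try/finally so the timer is always disarmed and exceptions still propagate to the caller:

def safe_read(url, timeout_time):
    signal.setitimer(signal.ITIMER_REAL, timeout_time)
    try:
        return urllib2.urlopen(url, timeout=timeout_time).read()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm even if urlopen/read raised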
Any asynchronous network library should allow you to enforce a total timeout on any I/O operation; e.g., here's a gevent code example:
#!/usr/bin/env python2
import gevent
import gevent.monkey # $ pip install gevent
gevent.monkey.patch_all()
import urllib2
with gevent.Timeout(2):  # enforce total timeout
    response = urllib2.urlopen('http://localhost:8000')
    encoding = response.headers.getparam('charset')
    print response.read().decode(encoding)
And here's asyncio equivalent:
#!/usr/bin/env python3.5
import asyncio
import aiohttp # $ pip install aiohttp
async def fetch_text(url):
    response = await aiohttp.get(url)
    return await response.text()

text = asyncio.get_event_loop().run_until_complete(
    asyncio.wait_for(fetch_text('http://localhost:8000'), timeout=2))
print(text)
The test http server is defined here.
pycurl.TIMEOUT option works for the whole request:
#!/usr/bin/env python3
"""Test that pycurl.TIMEOUT does limit the total request timeout."""
import sys
import pycurl
timeout = 2 #NOTE: it does limit both the total *connection* and *read* timeouts
c = pycurl.Curl()
c.setopt(pycurl.CONNECTTIMEOUT, timeout)
c.setopt(pycurl.TIMEOUT, timeout)
c.setopt(pycurl.WRITEFUNCTION, sys.stdout.buffer.write)
c.setopt(pycurl.HEADERFUNCTION, sys.stderr.buffer.write)
c.setopt(pycurl.NOSIGNAL, 1)
c.setopt(pycurl.URL, 'http://localhost:8000')
c.setopt(pycurl.HTTPGET, 1)
c.perform()
The code raises the timeout error in ~2 seconds. I've tested the total read timeout with a server that sends the response in multiple chunks, pausing for less than the timeout between chunks:
$ python -mslow_http_server 1
where slow_http_server.py:
#!/usr/bin/env python
"""Usage: python -mslow_http_server [<read_timeout>]

Return an http response with *read_timeout* seconds between parts.
"""
import time

try:
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer, test
except ImportError:  # Python 3
    from http.server import BaseHTTPRequestHandler, HTTPServer, test

def SlowRequestHandlerFactory(read_timeout):
    class HTTPRequestHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            n = 5
            data = b'1\n'
            self.send_response(200)
            self.send_header("Content-type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", n*len(data))
            self.end_headers()
            for i in range(n):
                self.wfile.write(data)
                self.wfile.flush()
                time.sleep(read_timeout)
    return HTTPRequestHandler

if __name__ == "__main__":
    import sys
    read_timeout = int(sys.argv[1]) if len(sys.argv) > 1 else 5
    test(HandlerClass=SlowRequestHandlerFactory(read_timeout),
         ServerClass=HTTPServer)
I've tested the total connection timeout with http://google.com:22222.
This isn't the behavior I see. I get a URLError when the call times out:
from urllib2 import Request, urlopen
req = Request('http://www.google.com')
res = urlopen(req,timeout=0.000001)
# Traceback (most recent call last):
# File "<stdin>", line 1, in <module>
# ...
# raise URLError(err)
# urllib2.URLError: <urlopen error timed out>
Can't you catch this error and then avoid trying to read res?
When I try to use res.read() after this I get NameError: name 'res' is not defined. Is something like this what you need:
try:
    res = urlopen(req, timeout=3.0)
except:
    print 'Doh!'
finally:
    print 'yay!'
    print res.read()
I suppose the way to implement a timeout manually is via multiprocessing, no? If the job hasn't finished you can terminate it.
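A rough sketch of that idea (my own illustration, not from the answers above): run the download in a separate process and terminate it if it exceeds the deadline.

import multiprocessing
import urllib2

def _fetch(url, queue):
    # runs in the child process; any hang here is bounded by the parent's join()
    queue.put(urllib2.urlopen(url).read())

def fetch_with_deadline(url, timeout):
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=_fetch, args=(url, queue))
    proc.start()
    proc.join(timeout)               # wait at most `timeout` seconds overall
    if proc.is_alive():
        proc.terminate()             # hard-stop the child if it overran the deadline
        proc.join()
    return queue.get() if not queue.empty() else None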
Had the same issue with socket timeout on the read statement. What worked for me was putting both the urlopen and the read inside a try statement. Hope this helps!
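In other words, something along these lines (my sketch of what that comment describes):

import socket
import urllib2

url = 'http://example.com/'            # placeholder URL for illustration
try:
    res = urllib2.urlopen(url, timeout=10)
    content = res.read()               # a stall between bytes raises socket.timeout here
except (urllib2.URLError, socket.timeout) as err:
    content = None                     # a timeout during connect or read is handled the same way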

Python send cmd on socket

I have a simple question about Python:
I have another Python script listening on a port on a Linux machine.
I have made it so I can send a request to it, and it will inform another system that it is alive and listening.
My problem is that I don't know how to send this request from another Python script running on the same machine (blush).
I have a script running every minute, and I would like to expand it to also send this request. I don't expect to get a response back; my listening script posts to a database.
In Internet Explorer, I write like this: http://192.168.1.46:8193/?Ping
I would like to know how to do this from Python, and preferably just send and not hang if the other script is not running.
thanks
Michael
It looks like you are doing an HTTP request, rather than an ICMP ping.
urllib2, built-in to Python, can help you do that.
You'll need to override the timeout so you aren't hanging too long. Straight from that article, above, here is some example code for you to tweak with your desired time-out and URL.
import socket
import urllib2
# timeout in seconds
timeout = 10
socket.setdefaulttimeout(timeout)
# this call to urllib2.urlopen now uses the default timeout
# we have set in the socket module
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)
import urllib2

try:
    response = urllib2.urlopen('http://192.168.1.46:8193/?Ping', timeout=2)
    print 'response headers: "%s"' % response.info()
except IOError, e:
    if hasattr(e, 'code'):  # HTTPError
        print 'http error code: ', e.code
    elif hasattr(e, 'reason'):  # URLError
        print "can't connect, reason: ", e.reason
    else:
        raise  # don't know what it is
This is a bit outside my knowledge, but maybe this question might help?
Ping a site in Python?
Considered Twisted? What you're trying to achieve could be taken straight out of their examples. It might be overkill, but if you'll eventually want to start adding authentication, authorization, SSL, etc. you might as well start in that direction.
