In this line of code:
request = requests.get(url, verify=False, headers=headers, proxies=proxy, timeout=15)
How do I know that timeout=15 has been triggered, so I can send a message that the url did not send any data within 15 seconds?
If a response is not received from the server at all within the given time, a requests.exceptions.Timeout exception is raised, as per Exa's link in the other answer.
We can use a try/except block to detect this and act accordingly, rather than just letting our program crash.
Expanding on the demonstration used in the docs:
import requests

try:
    r = requests.get('https://github.com/', timeout=0.001)
except requests.exceptions.Timeout as e:
    # code to run if we didn't get a reply in time
    print("Request timed out!\nDetails:", e)
else:
    # code to run only if we did get a response
    print(r.headers)
Just substitute your url and timeout where appropriate.
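Applied to the call from the question, the same pattern looks like this (headers and proxy are assumed to be defined elsewhere in your code):
import requests

try:
    request = requests.get(url, verify=False, headers=headers,
                           proxies=proxy, timeout=15)
except requests.exceptions.Timeout:
    # the server sent nothing back within 15 seconds
    print(f"{url} did not send any data in 15 seconds")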
A requests.exceptions.Timeout exception will be raised; see the requests documentation on timeouts for more info.
Related
I'm working with Python requests and testing URLs from the https://badssl.com/ certificate section, and all the invalid URLs return errors except for https://revoked.badssl.com/ and https://pinning-test.badssl.com/, which respond with 200 status codes. I would like someone to explain why this happens, despite the pages exhibiting errors such as NET::ERR_CERT_REVOKED and NET::ERR_SSL_PINNED_KEY_NOT_IN_CERT_CHAIN for the former and the latter, respectively.
import requests

def check_connection(url):
    try:
        r = requests.get(url)
        r.raise_for_status()
        print(r)
    except requests.exceptions.HTTPError as errh:
        print("Http Error:", errh)
    except requests.exceptions.ConnectionError as errc:
        print("Error Connecting:", errc)
    except requests.exceptions.Timeout as errt:
        print("Timeout Error:", errt)
    except requests.exceptions.RequestException as err:
        # base class of the above, so it must come last
        print("Oops: Something Else", err)

check_connection('https://revoked.badssl.com/')
check_connection('https://pinning-test.badssl.com/')
You're not getting an analog to the "NET::ERR_CERT_REVOKED" message because requests is just an HTTP request tool; it's not a browser. If you want to query an OCSP responder to see if a server certificate has been revoked, you can use the ocsp module from the cryptography package to do that; a rough sketch follows.
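As an illustration only (assumptions: you have already loaded the server's leaf certificate and its issuer certificate as cryptography x509 objects, and extracted the responder URL from the certificate's Authority Information Access extension):
import requests
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.x509 import ocsp

def check_revocation(cert, issuer, responder_url):
    # build a DER-encoded OCSP request for the certificate
    # (SHA1 is the conventional hash for the OCSP CertID)
    builder = ocsp.OCSPRequestBuilder().add_certificate(cert, issuer, hashes.SHA1())
    der_request = builder.build().public_bytes(serialization.Encoding.DER)

    # POST it to the CA's OCSP responder and parse the answer
    reply = requests.post(
        responder_url,
        data=der_request,
        headers={'Content-Type': 'application/ocsp-request'},
    )
    response = ocsp.load_der_ocsp_response(reply.content)
    return response.certificate_status  # OCSPCertStatus.GOOD / REVOKED / UNKNOWN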
The answer is going to be similar for "NET::ERR_SSL_PINNED_KEY_NOT_IN_CERT_CHAIN"; the requests module isn't the sort of high-level tool that implements certificate pinning. In fact, even the development builds of major browsers don't implement this; there's some interesting discussion about this issue in https://github.com/chromium/badssl.com/issues/15.
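For completeness, a hand-rolled sketch of what a pin check involves: fetch the server's certificate, hash its SubjectPublicKeyInfo, and compare it against a pinned value (EXPECTED_PIN is a hypothetical placeholder):
import base64
import hashlib
import ssl
from cryptography import x509
from cryptography.hazmat.primitives import serialization

EXPECTED_PIN = '...'  # base64 SHA-256 of the SPKI you want to pin

pem = ssl.get_server_certificate(('pinning-test.badssl.com', 443))
cert = x509.load_pem_x509_certificate(pem.encode())
spki = cert.public_key().public_bytes(
    serialization.Encoding.DER,
    serialization.PublicFormat.SubjectPublicKeyInfo,
)
fingerprint = base64.b64encode(hashlib.sha256(spki).digest()).decode()
if fingerprint != EXPECTED_PIN:
    raise ssl.SSLError(f'pin mismatch: got {fingerprint}')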
Is there a way to receive the status code of a request if the request throws an exception? I am trying to send a request, but there is a ConnectionTimeout. I've tried printing the error, but there's no useful status info in the error message. The actual object that is returned by requests.request() doesn't seem to get created when the exception is thrown, at least in my case. Could this be due to internal server configuration or something?
What I have attempted:
import requests

try:
    response = requests.request(
        method=some_http_method,
        url=some_url,
        auth=some_auth,
        data=some_data,
        timeout=30,
        verify=some_certificate)
except requests.exceptions.ConnectTimeout as e:
    response = e.response
    print(e)
    print(response.status_code)  # should print a status code; instead response is still None
The exception ConnectTimeout means that the request timed out while trying to connect to the remote server. Thus, there's no response to get the status code from; it is normal that the response variable is None.
It may also be a network issue; make sure the URL you're trying to query is reachable (use curl to verify).
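A small sketch of the distinction: only exceptions raised after a response has arrived (e.g. an HTTPError from raise_for_status) carry a response object; a connect timeout never does, so guard before reading status_code:
import requests

try:
    r = requests.get('https://example.com/', timeout=30)
    r.raise_for_status()
except requests.exceptions.RequestException as e:
    if e.response is not None:
        print('Server replied with status', e.response.status_code)
    else:
        print('No response received:', e)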
For now, I use Requests as follows:
for url in urls:
    req1 = requests.request("GET", url, timeout=5)
    req2 = requests.request("POST", url, data=foo(req1.text), timeout=5)
The problem is that, as a result of a slow connection, the requests hang frequently, causing the process to get stuck and throw the ConnectionClosed exception after max retries are reached.
For my purposes, it is not necessary that every url gets processed correctly: as long as most of the urls are processed, some can be skipped if the connection fails.
So I wrap the code above with a try...except... block:
for url in urls:
    try:
        req1 = requests.request("GET", url, timeout=5)
        req2 = requests.request("POST", url, data=foo(req1.text), timeout=5)
    except:
        time.sleep(120)
        continue
Now the problem is that while the urls can theoretically all be iterated over, only some are actually processed. After the first exception is thrown, all subsequent requests tend to suffer from the same connection problem; that is, the process effectively falls into an endless loop of time.sleep() calls.
However, if I restart the script when the exception gets thrown, the connection problem doesn't persist; the script runs smoothly again for quite some time.
I guess there must be some mechanism I'm overlooking, but I'm not sure what. Is there some connection reuse that gets in the way? In that case, how should I reset the connection upon an exception? Should I use a session which closes after each iteration? If so, how can I achieve decent efficiency?
Thanks in advance.
Related post: https://github.com/requests/requests/issues/520
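For illustration, a minimal sketch of the per-iteration session idea the question mentions (urls and foo are the question's own names); a fresh Session per URL means no pooled connection can be reused after a failure, at the cost of an extra handshake per url:
import requests

for url in urls:
    try:
        # each iteration gets its own connection pool
        with requests.Session() as s:
            req1 = s.get(url, timeout=5)
            req2 = s.post(url, data=foo(req1.text), timeout=5)
    except requests.exceptions.RequestException:
        continue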
I am new to this so please help me. I am using urllib.request to open and read webpages. Can someone tell me how my code can handle redirects, timeouts, and badly formed URLs?
I have sort of found a way to handle timeouts, though I am not sure if it is correct. Is it? All opinions are welcome! Here it is:
import logging
import urllib.request
from socket import timeout
from urllib.error import HTTPError, URLError

try:
    # url and name are defined earlier in my script
    text = urllib.request.urlopen(url, timeout=10).read().decode('utf-8')
except (HTTPError, URLError) as error:
    logging.error('Data of %s not retrieved because %s\nURL: %s', name, error, url)
except timeout:
    logging.error('socket timed out - URL %s', url)
Please help me as I am new to this. Thanks!
Take a look at the urllib error page.
So, for the following behaviours:
Redirect: an HTTP code such as 302. By default, urlopen follows redirects automatically via HTTPRedirectHandler; if you override that handler to decline the redirect, it surfaces as an HTTPError with the redirect code.
Timeouts: You have that correct.
Badly formed URLs: That's a URLError.
Here's the code I would use:
from socket import timeout
import urllib.error
import urllib.request

try:
    text = urllib.request.urlopen("http://www.google.com", timeout=0.1).read()
except urllib.error.HTTPError as error:
    print(error)
except urllib.error.URLError as error:
    print(error)
except timeout as error:
    print(error)
I couldn't find a redirecting URL to test with, so I'm not exactly sure how to check whether the HTTPError is a redirect.
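One way to test it (httpbin.org/redirect/1 is a common test endpoint, used here as an assumption): install a redirect handler that declines to follow, so the 3xx surfaces as an HTTPError you can inspect:
import urllib.error
import urllib.request

class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # returning None means "not handled", so urlopen raises HTTPError
        return None

opener = urllib.request.build_opener(NoRedirect)
try:
    opener.open('https://httpbin.org/redirect/1', timeout=10)
except urllib.error.HTTPError as error:
    if error.code in (301, 302, 303, 307, 308):
        print('Redirect:', error.code, '->', error.headers.get('Location'))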
You might find the requests package is a bit easier to use (it's suggested on the urllib page).
Using the requests package, I was able to find a better solution. The only exceptions you need to handle are:
import requests

try:
    r = requests.get(url, timeout=5)
except requests.exceptions.Timeout:
    # Maybe set up for a retry, or continue in a retry loop
    pass
except requests.exceptions.TooManyRedirects as error:
    # Tell the user their URL was bad and try a different one
    print(error)
except requests.exceptions.ConnectionError:
    # Connection could not be completed
    pass
except requests.exceptions.RequestException as e:
    # Catastrophic error: bail
    raise SystemExit(e)
And to get the text of that page, all you need to do is:
r.text
Using the HTTPretty library for Python, I can create mock HTTP responses of my choice and then pick them up, e.g. with the requests library, like so:
import httpretty
import requests

# set up a mock
httpretty.enable()
httpretty.register_uri(
    method=httpretty.GET,
    uri='http://www.fakeurl.com',
    status=200,
    body='My Response Body'
)
response = requests.get('http://www.fakeurl.com')

# clean up
httpretty.disable()
httpretty.reset()

print(response)
Out: <Response [200]>
Is it also possible to register a URI which cannot be reached (e.g. connection timed out, connection refused, ...) such that no response is received at all? (This is not the same as an established connection that returns an HTTP error code like 404.)
I want to use this behaviour in unit testing to ensure that my error handling works as expected (it does different things for 'no connection established' and 'connection established, bad HTTP status code'). As a workaround, I could try to connect to an unreachable address like http://192.0.2.0, which would time out in any case. However, I would prefer to do all my unit testing without any real network connections.
Meanwhile I got it: using an HTTPretty callback body seems to produce the desired behaviour. See the inline comments below.
This is actually not exactly what I was looking for (it is not a server that cannot be reached, causing the request to time out, but a server that raises a timeout exception once it is reached); however, the effect is the same for my use case.
Still, if anybody knows a different solution, I'm looking forward to it.
import httpretty
import requests

# enable HTTPretty
httpretty.enable()

# create a callback body that raises an exception when opened
def exceptionCallback(request, uri, headers):
    # raise your favourite exception here, e.g. requests.ConnectionError or requests.Timeout
    raise requests.Timeout('Connection timed out.')

# set up a mock and use the callback function as the response's body
httpretty.register_uri(
    method=httpretty.GET,
    uri='http://www.fakeurl.com',
    status=200,
    body=exceptionCallback
)

# try to get a response from the mock server and catch the exception
try:
    response = requests.get('http://www.fakeurl.com')
except requests.Timeout as e:
    print('requests.Timeout exception got caught...')
    print(e)
    # do whatever...

# clean up
httpretty.disable()
httpretty.reset()
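As an aside, and not part of the original question: the responses mocking library supports the same trick directly, raising any exception instance you pass as the body when the registered URI is hit:
import requests
import responses

@responses.activate
def test_timeout_handling():
    # registering an exception as the body makes responses raise it
    responses.add(
        responses.GET,
        'http://www.fakeurl.com',
        body=requests.exceptions.ConnectTimeout('Connection timed out.'),
    )
    try:
        requests.get('http://www.fakeurl.com')
    except requests.exceptions.ConnectTimeout as e:
        print('requests.ConnectTimeout exception got caught...', e)

test_timeout_handling()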