HTTP status code to status message - python

I'm looking for a way, in Python, to get the HTTP message from the status code. Is there a way that is better than building the dictionary yourself?
Something like:
>>> print http.codemap[404]
'Not found'

In Python 2.7, the httplib module has what you need:
>>> import httplib
>>> httplib.responses[400]
'Bad Request'
Constants are also available:
httplib.responses[httplib.NOT_FOUND]
httplib.responses[httplib.OK]
Update
shao.lo added a useful comment below. The http.client module can be used:
# For Python 3 use
import http.client
http.client.responses[http.client.NOT_FOUND]

This problem is solved from Python 3.5 onward: the standard http library now has a built-in HTTPStatus enum, so you can easily look up status details such as the numeric code, phrase, and description.
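For instance, a minimal sketch of what the HTTPStatus enum exposes:

```python
from http import HTTPStatus

# Look up a status by its numeric code
status = HTTPStatus(404)
print(status.value)        # 404
print(status.phrase)       # Not Found
print(status.description)  # a longer human-readable explanation

# Members also work as constants
print(HTTPStatus.OK.value, HTTPStatus.OK.phrase)  # 200 OK
```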

Python 3:
You could use HTTPStatus(<error-code>).phrase to obtain the description for a given code value. E.g. like this:
from http import HTTPStatus
from urllib.error import HTTPError

try:
    response = opener.open(url, data)
except HTTPError as e:
    print(f'In func my_open got HTTP {e.code} {HTTPStatus(e.code).phrase}')
Would give for example:
In func my_open got HTTP 415 Unsupported Media Type
Which answers the question. However a much more direct solution can be obtained from HTTPError.reason:
try:
    response = opener.open(url, data)
except HTTPError as e:
    print(f'In func my_open got HTTP {e.code} {e.reason}')

A 404 error indicates that the requested resource could not be found; the page may have been moved or taken offline. It differs from a 400 error, which means the request itself was malformed and the server could not understand it.

Related

Python request.head(): How to get response code using python request module instead of Exception

It returns 200, 301, and other responses as expected. But when I try to get the response for a non-existent website, instead of returning a code it throws an exception. Below is the code I tried for "www.googl.com"; I'm expecting a response code in this scenario. I can handle it with try/except, but I actually need a response code.
Code:
import requests
print (requests.head("https://www.googl.com"))
Since nothing is being returned, there is no response code, because there was no response.
Your best bet is just doing this:
import requests

try:
    response = requests.head("https://www.googl.com")
except requests.exceptions.ConnectionError:
    response = 404  # or whatever you want.
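If you want a real status code whenever one exists, and a sentinel only when there was no response at all, a small wrapper can make that explicit. This is a sketch; head_status is my own helper name, not part of requests:

```python
import requests

def head_status(url, timeout=5):
    """Return the HTTP status code, or None if no response was received."""
    try:
        return requests.head(url, timeout=timeout).status_code
    except requests.exceptions.RequestException:
        # DNS failure, refused connection, timeout, invalid URL, ...
        # there is no response, so there is no code to return.
        return None

print(head_status("https://www.googl.com"))  # a status code, or None if unresolvable
```

Catching RequestException (the base class of ConnectionError, Timeout, and friends) keeps the "no response" case separate from genuine 4xx/5xx answers, which requests.head returns normally without raising.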

How to Manage Google API Errors in Python

I'm currently doing a lot of stuff with BigQuery, and am using a lot of try... except.... It looks like just about every error I get back from BigQuery is an apiclient.errors.HttpError, but with different strings attached to them, i.e.:
<HttpError 409 when requesting https://www.googleapis.com/bigquery/v2/projects/some_id/datasets/some_dataset/tables?alt=json returned "Already Exists: Table some_id:some_dataset.some_table">
<HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/some_id/jobs/sdfgsdfg?alt=json returned "Not Found: Job some_id:sdfgsdfg">
among many others. Right now the only way I see to handle these is to run regexs on the error messages, but this is messy and definitely not ideal. Is there a better way?
BigQuery is a REST API, the errors it uses follow standard HTTP error conventions.
In python, an HttpError has a resp.status field that returns the HTTP status code.
As you show above, 409 is 'conflict', 404 is 'not found'.
For example:
import time

from googleapiclient.errors import HttpError

try:
    ...
except HttpError as err:
    # If the error is a rate limit or connection error,
    # wait and try again.
    if err.resp.status in [403, 500, 503]:
        time.sleep(5)
    else:
        raise
The response body is also JSON, so an even better approach is to parse the JSON and read the error's reason field:
if err.resp.get('content-type', '').startswith('application/json'):
    reason = json.loads(err.content).get('error').get('errors')[0].get('reason')
This can be:
notFound, duplicate, accessDenied, invalidQuery, backendError, resourcesExceeded, invalid, quotaExceeded, rateLimitExceeded, timeout, etc.
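Putting the two ideas together, a small helper can extract the reason without running regexes on the message. This is a sketch: it assumes only that the error object exposes the resp mapping and content body shown above, and the set of retryable reasons is my own guess, not documented BigQuery policy.

```python
import json

# Reasons worth retrying (an illustration, not official policy)
RETRYABLE = {"backendError", "rateLimitExceeded", "quotaExceeded"}

def error_reason(err):
    """Extract the first error 'reason' from an HttpError-like object.

    Returns None when the response body is not JSON.
    """
    if err.resp.get("content-type", "").startswith("application/json"):
        payload = json.loads(err.content)
        return payload.get("error", {}).get("errors", [{}])[0].get("reason")
    return None
```

An except-block can then sleep and retry when error_reason(err) is in RETRYABLE, and re-raise otherwise.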
Google Cloud now provides exception handlers:
from google.api_core.exceptions import AlreadyExists, NotFound
try:
...
except AlreadyExists:
...
except NotFound:
...
This should prove more exact in catching the details of the error.
Please reference this source code to find other exceptions to utilize: http://google-cloud-python.readthedocs.io/en/latest/_modules/google/api_core/exceptions.html

How to access the XML response when the site also issues an HTTP error code

The following URL returns the expected response in the browser:
http://ws.audioscrobbler.com/2.0/?method=user.getinfo&user=notonfile99&api_key=8e9de6bd545880f19d2d2032c28992b4
<lfm status="failed">
<error code="6">No user with that name was found</error>
</lfm>
But I am unable to access the XML in Python via the following code, as an HTTPError exception is raised ("HTTP Error 400: Bad Request"):
import urllib2
urllib2.urlopen('http://ws.audioscrobbler.com/2.0/?method=user.getinfo&user=notonfile99&api_key=8e9de6bd545880f19d2d2032c28992b4')
I see that I can work around this by using urlretrieve rather than urlopen, but then the response gets written to disk.
Is there a way, using just the Python 2.7 standard library, to get hold of the XML response without having to read it from disk and do housekeeping afterwards?
I see that this question has been asked before in a PHP context, but I don't know how to apply the answer to Python:
DOMDocument load on a page returning 400 Bad Request status
Copying from here: http://www.voidspace.org.uk/python/articles/urllib2.shtml#httperror
The exception that is thrown contains the full body of the error page:
#!/usr/bin/python2
import urllib2
try:
    resp = urllib2.urlopen('http://ws.audioscrobbler.com/2.0/?method=user.getinfo&user=notonfile99&api_key=8e9de6bd545880f19d2d2032c28992b4')
except urllib2.HTTPError, e:
    print e.code
    print e.read()
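For completeness, the same trick works in Python 3, where urllib2 was split into urllib.request and urllib.error. A sketch (the helper name fetch_or_error_body is my own):

```python
from urllib.error import HTTPError
from urllib.request import urlopen

def fetch_or_error_body(url):
    """Return (status, body) even when the server answers with an error code."""
    try:
        resp = urlopen(url)
        return resp.getcode(), resp.read()
    except HTTPError as e:
        # The exception object doubles as the response: the XML error
        # body the server sent alongside the 400 is still readable here.
        return e.code, e.read()
```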

How to get HTTP return code from python urllib's urlopen?

I have the following code:
f = urllib.urlopen(url)
html = f.read()
I would like to know the HTTP status code (HTTP 200, 404 etc) that comes from opening the url above.
Anybody knows how it can be done?
P.S.
I use python 2.5.
Thanks!!!
You can use the .getcode() method of the object returned by urlopen()
url = urllib.urlopen('http://www.stackoverflow.com/')
code = url.getcode()
getcode() was only added in Python 2.6. As far as I know, there is no way to get the status code from the request itself in 2.5, but FancyURLopener provides a set of methods that get called on certain error codes; you could use one of those to save the status code somewhere. I subclassed it to tell me when a 404 occurred:
import urllib

class TellMeAbout404s(urllib.FancyURLopener):
    def http_error_404(self, url, fp, errcode, errmsg, headers, data=None):
        print("==== Got a 404")

opener = TellMeAbout404s()
f = opener.open("http://www.google.com/sofbewfwl")
print(f.info())
info() provides the HTTP headers but not the status code.

how can i determine if anything at the given url does exist [duplicate]

This question already has answers here:
python: check if url to jpg exists
(11 answers)
Closed 9 years ago.
How can I determine, using Python, whether anything exists at a given URL? It could be an HTML page or a PDF file; that shouldn't matter.
I've tried the solution written on this page: http://code.activestate.com/recipes/101276/
but it just returns 1 whether it's a PDF file or anything else.
You need to check the HTTP response code. Python example:
from urllib2 import urlopen
code = urlopen("http://example.com/").code
4xx and 5xx codes probably mean that you cannot get anything from this URL. 4xx status codes describe client errors (like "404 Not found") and 5xx status codes describe server errors (like "500 Internal server error"):
if code / 100 >= 4:
    print "Nothing there."
Links:
HTTP status codes
urllib2 reference
Send a HEAD request
import httplib

# Note: HTTPConnection takes a bare host name (e.g. 'example.com'), not a full URL
connection = httplib.HTTPConnection(url)
connection.request('HEAD', '/')
response = connection.getresponse()
if response.status == 200:
    print "Resource exists"
The httplib in that example is using HTTP/1.0 instead of 1.1, and as such Slashdot is returning a status code 301 instead of 200. I would recommend using urllib2, and also probably checking for codes 20* and 30*.
The documentation for httplib states:
It is normally not used directly — the module urllib uses it to handle URLs that use HTTP and HTTPS.
[...]
The HTTP class is retained only for backward compatibility with 1.5.2. It should not be used in new code. Refer to the online docstrings for usage.
So yes. urllib is the way to open URLs in Python — an HTTP/1.0 client won't get very far on modern web servers.
(Also, a PDF link works for me.)
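On today's Python 3 the same recommendation translates to urllib.request. Here is a sketch that sends a HEAD request and treats 2xx/3xx as "exists" (the exists() helper is my own name, and urlopen follows redirects by default, so a 301 resolves to the final status):

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

def exists(url):
    """True if the URL answers with a 2xx or 3xx status, False otherwise."""
    try:
        resp = urlopen(Request(url, method="HEAD"))
        return 200 <= resp.getcode() < 400
    except HTTPError:
        return False  # 4xx or 5xx response
    except URLError:
        return False  # DNS failure, refused connection, ...
```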
This solution returns 1 because server is sending 200 OK response.
There's something wrong with your server. It should return 404 if the file doesn't exist.
