I was opening a URL using urllib.request.urlopen.
The following exception was raised:

http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
HTTPError
I went through the documentation of the urllib library:
*exception urllib.error.HTTPError
Though being an exception (a subclass of URLError), an HTTPError can also function as a non-exceptional file-like return value (the same thing that urlopen() returns). This is useful when handling exotic HTTP errors, such as requests for authentication.
code
An HTTP status code as defined in RFC 2616. This numeric value corresponds to a value found in the dictionary of codes as found in http.server.BaseHTTPRequestHandler.responses.
reason
This is usually a string explaining the reason for this error.
headers
The HTTP response headers for the HTTP request that caused the HTTPError.*
But in my case there was no error code or reason string shown for the exception.
If you catch the exception, you can see the reason, like this:

import urllib.request
import urllib.error

try:
    urllib.request.urlopen(req)
except urllib.error.HTTPError as e:
    print(e.reason)
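As the documentation quoted above notes, HTTPError is also a file-like object, so the code, the reason, and even the error body are all available on the exception. A minimal self-contained sketch (the URL and body here are made up for illustration; in practice urlopen() raises the HTTPError for you):

```python
import io
from urllib.error import HTTPError

# Construct an HTTPError directly to show its attributes; urlopen()
# raises one of these for any 4xx/5xx response.
err = HTTPError(
    url="http://example.com/missing",  # hypothetical URL
    code=404,
    msg="Not Found",
    hdrs={},
    fp=io.BytesIO(b"<html>gone</html>"),
)

print(err.code)    # 404
print(err.reason)  # Not Found
body = err.read()  # the error page body, readable because fp was supplied
print(body)        # b'<html>gone</html>'
```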
I am trying to request weather API data, and I can pull some of the data, but eventually the code throws an HTTPError despite the try/except that I already have. What am I writing wrong in my try/except?
I have tried putting the HTTPError in parentheses and catching it with HTTPError as he to get the error back as a variable so I could read it. I've tried from urllib.error import HTTPError. Nothing works.
from urllib.error import HTTPError
for city in cities:
    current_city = owm.get_current(city, **settings)
    try:
        print(f'Current city is {current_city["name"]} and the city number is: {current_city["id"]}')
    except HTTPError:
        print("Ooops")
    print("------------")
Here is the error message:
HTTPError: HTTP Error 404: Not Found
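The HTTPError in that traceback is raised by the owm.get_current() call itself, which sits outside the try block, so the except clause never sees it. A self-contained sketch of the fix, using a hypothetical get_current() stand-in that raises a 404 for unknown cities (the city data is invented for illustration):

```python
from urllib.error import HTTPError

def get_current(city):
    # Hypothetical stand-in for the question's owm.get_current:
    # raises HTTPError for unknown cities, mirroring the 404 above.
    known = {"London": {"name": "London", "id": 2643743}}
    if city not in known:
        raise HTTPError(f"http://api.example/{city}", 404,
                        "Not Found", {}, None)
    return known[city]

results = []
for city in ["London", "Atlantis"]:
    try:
        # The call that can raise must be INSIDE the try block
        current_city = get_current(city)
        results.append(current_city["name"])
    except HTTPError:
        results.append("Ooops")

print(results)  # ['London', 'Ooops']
```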
I'm looking for a way, in Python, to get the HTTP message from the status code. Is there a way that is better than building the dictionary yourself?
Something like:
>>> print http.codemap[404]
'Not found'
In Python 2.7, the httplib module has what you need:
>>> import httplib
>>> httplib.responses[400]
'Bad Request'
Constants are also available:
httplib.responses[httplib.NOT_FOUND]
httplib.responses[httplib.OK]
Update
#shao.lo added a useful comment below. The http.client module can be used:
# For Python 3 use
import http.client
http.client.responses[http.client.NOT_FOUND]
This problem is solved from Python 3.5 onward: the http library now has a built-in class called HTTPStatus. You can easily get HTTP status details such as the status code, phrase, and description. Here is example code for HTTPStatus.
Python 3:
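For instance, a minimal sketch of what the HTTPStatus enum exposes:

```python
from http import HTTPStatus

status = HTTPStatus.NOT_FOUND
print(status.value)            # 404
print(status.phrase)           # Not Found
print(status.description)      # Nothing matches the given URI
print(HTTPStatus(404).phrase)  # lookup by numeric code: Not Found
```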
You could use HTTPStatus(<error-code>).phrase to obtain the description for a given code value, e.g. like this:

from http import HTTPStatus

try:
    response = opener.open(url, data)
except HTTPError as e:
    print(f'In func my_open got HTTP {e.code} {HTTPStatus(e.code).phrase}')
Would give for example:
In func my_open got HTTP 415 Unsupported Media Type
Which answers the question. However a much more direct solution can be obtained from HTTPError.reason:
try:
    response = opener.open(url, data)
except HTTPError as e:
    print(f'In func my_open got HTTP {e.code} {e.reason}')
A 404 error indicates that the requested resource was not found; the page may have been moved or taken offline. It differs from a 400 error, which indicates that the request itself was malformed.
I'd like to raise a Python-standard exception when an HTTP response code from querying an API is not 200, but what specific exception should I use? For now I raise an OSError:
if response.status_code != 200:
    raise OSError("Response " + str(response.status_code)
                  + ": " + response.content)
I'm aware of the documentation for built-in exceptions.
You can simply call Response.raise_for_status() on your response:
>>> import requests
>>> url = 'http://stackoverflow.com/doesnt-exist'
>>> r = requests.get(url)
>>>
>>> print r.status_code
404
>>> r.raise_for_status()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "requests/models.py", line 831, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found
This will raise a requests.HTTPError for any 4xx or 5xx response.
See the docs on Response Status Code for a more complete example.
Note that this does not exactly do what you asked (status != 200): It will not raise an exception for 201 Created or 204 No Content, or any of the 3xx redirects - but this is most likely the behavior you want: requests will just follow the redirects, and the other 2xx are usually just fine if you're dealing with an API.
The built-in Python exceptions are probably not a good fit for what you are doing. You will want to subclass the base class Exception and raise your own custom exceptions for each scenario you want to communicate.
A good example is how the Python Requests HTTP library defines its own exceptions:
In the event of a network problem (e.g. DNS failure, refused
connection, etc), Requests will raise a ConnectionError exception.
In the rare event of an invalid HTTP response, Requests will raise an
HTTPError exception.
If a request times out, a Timeout exception is raised.
If a request exceeds the configured number of maximum redirections, a
TooManyRedirects exception is raised.
All exceptions that Requests explicitly raises inherit from
requests.exceptions.RequestException.
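For instance, a sketch of a custom exception along those lines (APIError and check_response are made-up names for illustration):

```python
class APIError(Exception):
    """Raised when an API responds with an unexpected status code."""
    def __init__(self, status_code, body):
        self.status_code = status_code
        self.body = body
        super().__init__(f"Response {status_code}: {body}")

def check_response(status_code, body):
    # Treat anything outside the 2xx range as an error
    if not 200 <= status_code < 300:
        raise APIError(status_code, body)

try:
    check_response(503, "Service Unavailable")
except APIError as e:
    print(e.status_code)  # 503
    print(e)              # Response 503: Service Unavailable
```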
The following URL returns the expected response in the browser:
http://ws.audioscrobbler.com/2.0/?method=user.getinfo&user=notonfile99&api_key=8e9de6bd545880f19d2d2032c28992b4
<lfm status="failed">
<error code="6">No user with that name was found</error>
</lfm>
But I am unable to access the XML in Python via the following code, as an HTTPError exception is raised: "HTTP Error 400: Bad Request".
import urllib2
urllib2.urlopen('http://ws.audioscrobbler.com/2.0/?method=user.getinfo&user=notonfile99&api_key=8e9de6bd545880f19d2d2032c28992b4')
I see that I can work around this by using urlretrieve rather than urlopen, but then the response gets written to disc.
Is there a way, just using the Python 2.7 standard library, that I can get hold of the XML response without having to read it from disc and do housekeeping?
I see that this question has been asked before in a PHP context, but I don't know how to apply the answer to Python:
DOMDocument load on a page returning 400 Bad Request status
Copying from here: http://www.voidspace.org.uk/python/articles/urllib2.shtml#httperror
The exception that is thrown contains the full body of the error page:
#!/usr/bin/python2
import urllib2

try:
    resp = urllib2.urlopen('http://ws.audioscrobbler.com/2.0/?method=user.getinfo&user=notonfile99&api_key=8e9de6bd545880f19d2d2032c28992b4')
except urllib2.HTTPError, e:
    print e.code
    print e.read()
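For completeness, a Python 3 equivalent (a sketch: urllib2 became urllib.request and urllib.error, and here a local stub server stands in for the Last.fm endpoint so the example is self-contained):

```python
import http.server
import threading
import urllib.error
import urllib.request

class Handler(http.server.BaseHTTPRequestHandler):
    # Stub server mimicking the API: HTTP 400 with an XML error body
    def do_GET(self):
        body = b'<error code="6">No user with that name was found</error>'
        self.send_response(400)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

try:
    urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/2.0/")
except urllib.error.HTTPError as e:
    code = e.code   # 400
    xml = e.read()  # the error body, read from the exception itself
finally:
    server.shutdown()

print(code)  # 400
print(xml)   # b'<error code="6">No user with that name was found</error>'
```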
I'm having a little trouble creating a script working with URLs. I'm using urllib.urlopen() to get the content of the desired URL. But some of these URLs require authentication, and urlopen prompts me to type in my username and then my password.
What I need is to ignore every URL that requires authentication, just skip it and continue. Is there a way to do this?
I was wondering about catching the HTTPError exception, but in fact the exception is handled by the urlopen() method, so it's not working.
Thanks for every reply.
You are right about the urllib2.HTTPError exception:
exception urllib2.HTTPError
Though being an exception (a subclass of URLError), an HTTPError can also function as a non-exceptional file-like return value (the same thing that urlopen() returns). This is useful when handling exotic HTTP errors, such as requests for authentication.
code
An HTTP status code as defined in RFC 2616. This numeric value corresponds to a value found in the dictionary of codes as found in BaseHTTPServer.BaseHTTPRequestHandler.responses.
The code attribute of the exception can be used to verify that authentication is required - code 401.
>>> try:
...     conn = urllib2.urlopen('http://www.example.com/admin')
...     # read conn and process data
... except urllib2.HTTPError, x:
...     print 'Ignoring', x.code
...
Ignoring 401
>>>
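In Python 3 the same check reads as follows (a sketch; fetch_or_skip is a made-up helper name):

```python
import urllib.error
import urllib.request

def fetch_or_skip(url):
    # Return the response body, or None if the URL demands authentication
    try:
        with urllib.request.urlopen(url) as conn:
            return conn.read()
    except urllib.error.HTTPError as e:
        if e.code == 401:
            print("Ignoring", url)
            return None
        raise  # any other HTTP error is still propagated
```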