How to Manage Google API Errors in Python

I'm currently doing a lot of work with BigQuery, and am using a lot of try... except.... It looks like just about every error I get back from BigQuery is an apiclient.errors.HttpError, but with different strings attached, e.g.:
<HttpError 409 when requesting https://www.googleapis.com/bigquery/v2/projects/some_id/datasets/some_dataset/tables?alt=json returned "Already Exists: Table some_id:some_dataset.some_table">
<HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/some_id/jobs/sdfgsdfg?alt=json returned "Not Found: Job some_id:sdfgsdfg">
among many others. Right now the only way I see to handle these is to run regexes on the error messages, but this is messy and definitely not ideal. Is there a better way?

BigQuery is a REST API, and the errors it returns follow standard HTTP conventions.
In Python, an HttpError has a resp.status field that holds the HTTP status code.
As you show above, 409 is 'conflict' and 404 is 'not found'.
For example:
import time

from googleapiclient.errors import HttpError

try:
    ...
except HttpError as err:
    # If the error is a rate limit or connection error,
    # wait and try again.
    if err.resp.status in [403, 500, 503]:
        time.sleep(5)
    else:
        raise
The response body is also JSON, so an even better way is to parse the JSON and read the error reason field:
import json

if err.resp.get('content-type', '').startswith('application/json'):
    reason = json.loads(err.content).get('error').get('errors')[0].get('reason')
This can be:
notFound, duplicate, accessDenied, invalidQuery, backendError, resourcesExceeded, invalid, quotaExceeded, rateLimitExceeded, timeout, etc.
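Putting the status check and the reason field together, here is a minimal sketch of a retry wrapper (execute_with_retry is a hypothetical helper, and the set of retryable reasons is an assumption you should tune to your workload):

import json
import time

from googleapiclient.errors import HttpError

RETRYABLE_REASONS = {'rateLimitExceeded', 'backendError', 'quotaExceeded'}  # assumed set

def execute_with_retry(request, retries=3):
    # Execute a googleapiclient request, retrying transient errors
    # with simple exponential backoff.
    for attempt in range(retries):
        try:
            return request.execute()
        except HttpError as err:
            reason = ''
            if err.resp.get('content-type', '').startswith('application/json'):
                reason = json.loads(err.content).get('error').get('errors')[0].get('reason')
            if reason in RETRYABLE_REASONS and attempt < retries - 1:
                time.sleep(2 ** attempt)
                continue
            raise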

Google Cloud now provides dedicated exception classes:
from google.api_core.exceptions import AlreadyExists, NotFound

try:
    ...
except AlreadyExists:
    ...
except NotFound:
    ...
These are more precise for catching specific error conditions.
See this source code for other exceptions you can use: http://google-cloud-python.readthedocs.io/en/latest/_modules/google/api_core/exceptions.html
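For instance, a minimal sketch using the google-cloud-bigquery client (this assumes default credentials, and the project, dataset, and table names are placeholders; note that over REST a 409 may surface as the parent Conflict class rather than AlreadyExists):

from google.api_core.exceptions import Conflict, NotFound
from google.cloud import bigquery

client = bigquery.Client()

try:
    table = client.get_table('some_id.some_dataset.some_table')
except NotFound:
    print('Table does not exist')

try:
    client.create_dataset('some_dataset')
except Conflict:  # HTTP 409: the dataset already exists
    print('Dataset already exists')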

Related

Django - Returning full exception trace to front-end on a 400

So just for closed-beta and internal testing purposes I'm trying to return the exception with the 400 error in my Django server. Below is an example of how I am doing this in Django code. I have a partner working on the front-end, and they said they can't get the error out of the 400 response. I assumed it'd be in something like response.body['error']. How can I send back a 400 with the full error trace in a way that the front-end can pull out?
except Exception as e:
    return Response(dict(error=str(e),
                         user_message=error_message_generic),
                    status=status.HTTP_400_BAD_REQUEST)
The most common approach is to return a JsonResponse with the status set directly (note that wrapping a JsonResponse in HttpResponseBadRequest, as is sometimes suggested, mangles the content type; JsonResponse accepts the status code itself):

from django.http import JsonResponse

return JsonResponse({
    "error": "Malformed request",
    # Any other fields you'd want
}, status=400)

See the official docs for more.
If you are specifically interested in returning a traceback of your back-end exception, you can go this way:

from django.http import JsonResponse
import traceback
...
except Exception as e:
    return JsonResponse({
        "error": "Malformed request",
        "traceback": traceback.format_exc()
    }, status=400)
But it's not the best approach in all cases. If you are sure the exception is related to the user's input, a traceback won't help the user; you are better off using a Serializer, whose validation messages are far more useful. If you are just reporting an unknown error, I'd advise returning HttpResponseServerError and logging the exception somewhere, e.g. to Sentry.
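As a minimal sketch of that last suggestion (my_view and the logger configuration are hypothetical):

import logging

from django.http import HttpResponseServerError

logger = logging.getLogger(__name__)

def my_view(request):
    try:
        ...
    except Exception:
        # logger.exception records the full traceback in the server logs,
        # while the client only sees a generic 500.
        logger.exception('Unhandled error in my_view')
        return HttpResponseServerError('Internal error')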

Bad URLs in Python 3.4.3

I am new to this, so please help me. I am using urllib.request to open and read webpages. Can someone tell me how my code can handle redirects, timeouts, and badly formed URLs?
I have sort of found a way to handle timeouts, but I am not sure if it is correct. Is it? All opinions are welcome! Here it is:
import logging
from socket import timeout
from urllib.error import HTTPError, URLError
import urllib.request

try:
    text = urllib.request.urlopen(url, timeout=10).read().decode('utf-8')
except (HTTPError, URLError) as error:
    logging.error('Data of %s not retrieved because %s\nURL: %s', name, error, url)
except timeout:
    logging.error('socket timed out - URL %s', url)
Please help me as I am new to this. Thanks!
Take a look at the urllib error page.
So for the following behaviours:
Redirect: HTTP code 302. Note that urlopen follows redirects automatically through the default HTTPRedirectHandler, so you would only see an HTTPError if the redirection itself fails.
Timeouts: You have that correct.
Badly formed URLs: That's a URLError (or a ValueError if the URL has no scheme at all).
Here's the code I would use:

from socket import timeout
import urllib.error
import urllib.request

try:
    text = urllib.request.urlopen("http://www.google.com", timeout=0.1).read()
except urllib.error.HTTPError as error:
    print(error)
except urllib.error.URLError as error:
    print(error)
except timeout as error:
    print(error)
I couldn't find a redirecting URL to test with, so I'm not exactly sure how to check whether an HTTPError is a redirect.
You might find the requests package is a bit easier to use (it's suggested on the urllib page).
Using the requests package I was able to find a better solution. The only exceptions you need to handle are:
import requests

try:
    r = requests.get(url, timeout=5)
except requests.exceptions.Timeout:
    # Maybe set up for a retry, or continue in a retry loop
    pass
except requests.exceptions.TooManyRedirects:
    # Tell the user their URL was bad and try a different one
    pass
except requests.exceptions.ConnectionError:
    # Connection could not be completed
    pass
except requests.exceptions.RequestException:
    # Catastrophic error. Bail.
    raise
And to get the text of that page, all you need to do is:
r.text
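Tying those handlers together, a small sketch of the retry idea mentioned in the comments (fetch_with_retries is a hypothetical helper, not part of requests):

import requests

def fetch_with_retries(url, attempts=3):
    for attempt in range(attempts):
        try:
            r = requests.get(url, timeout=5)
            r.raise_for_status()  # turn 4xx/5xx responses into exceptions too
            return r.text
        except requests.exceptions.Timeout:
            continue  # transient: try again
        except requests.exceptions.RequestException:
            break  # anything else: give up
    return None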

HTTP status code to status message

I'm looking for a way, in Python, to get the HTTP reason phrase from the status code. Is there anything better than building the dictionary yourself?
Something like:
>>> print http.codemap[404]
'Not found'
In Python 2.7, the httplib module has what you need:
>>> import httplib
>>> httplib.responses[400]
'Bad Request'
Constants are also available:
httplib.responses[httplib.NOT_FOUND]
httplib.responses[httplib.OK]
Update
#shao.lo added a useful comment below. The http.client module can be used:
# For Python 3 use
import http.client
http.client.responses[http.client.NOT_FOUND]
This problem is solved from Python 3.5 onward: the http library now has a built-in HTTPStatus enum, so you can easily get status details such as the numeric code and the description.
Python 3:
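For example, a quick demonstration of the enum:

from http import HTTPStatus

print(HTTPStatus(404).phrase)            # 'Not Found'
print(HTTPStatus.NOT_FOUND.value)        # 404
print(HTTPStatus.NOT_FOUND.description)  # 'Nothing matches the given URI'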
You could use HTTPStatus(<error-code>).phrase to obtain the description for a given code value. E.g. like this:
from http import HTTPStatus
from urllib.error import HTTPError

try:
    response = opener.open(url, data)
except HTTPError as e:
    print(f'In func my_open got HTTP {e.code} {HTTPStatus(e.code).phrase}')
Would give for example:
In func my_open got HTTP 415 Unsupported Media Type
That answers the question. However, a much more direct solution is to use HTTPError.reason:
try:
response = opener.open(url, data)
except HTTPError as e:
print(f'In func my_open got HTTP {e.code} {e.reason}')
A 404 error indicates that the resource was not found; the page may have been moved or taken offline. It differs from a 400 error, which indicates a malformed request rather than a missing page.

How to access an XML response when the site also issues an HTTP error code

The following URL returns the expected response in the browser:
http://ws.audioscrobbler.com/2.0/?method=user.getinfo&user=notonfile99&api_key=8e9de6bd545880f19d2d2032c28992b4
<lfm status="failed">
<error code="6">No user with that name was found</error>
</lfm>
But I am unable to access the XML in Python via the following code, as an HTTPError exception is raised:
"HTTP Error 400: Bad Request"
import urllib
urllib.urlopen('http://ws.audioscrobbler.com/2.0/?method=user.getinfo&user=notonfile99&api_key=8e9de6bd545880f19d2d2032c28992b4')
I see that I can work around this by using urlretrieve rather than urlopen, but then the HTML response gets written to disk.
Is there a way, using just the Python 2.7 standard library, to get hold of the XML response without having to read it from disk and do housekeeping afterwards?
I see that this question has been asked before in a PHP context, but I don't know how to apply the answer to Python:
DOMDocument load on a page returning 400 Bad Request status
Copying from here: http://www.voidspace.org.uk/python/articles/urllib2.shtml#httperror
The exception that is raised contains the full body of the error page:
#!/usr/bin/python2
import urllib2

try:
    resp = urllib2.urlopen('http://ws.audioscrobbler.com/2.0/?method=user.getinfo&user=notonfile99&api_key=8e9de6bd545880f19d2d2032c28992b4')
except urllib2.HTTPError, e:
    print e.code
    print e.read()
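For reference, a rough Python 3 equivalent of the same idea (urllib2 was split into urllib.request and urllib.error):

import urllib.request
import urllib.error

try:
    resp = urllib.request.urlopen('http://ws.audioscrobbler.com/2.0/?method=user.getinfo&user=notonfile99&api_key=8e9de6bd545880f19d2d2032c28992b4')
except urllib.error.HTTPError as e:
    print(e.code)
    print(e.read())  # the error body still contains the XML payload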

urllib ignore authentication requests

I'm having a little trouble creating a script that works with URLs. I'm using urllib.urlopen() to get the content of the desired URL. But some of these URLs require authentication, and urlopen prompts me to type in my username and then my password.
What I need is to ignore every URL that requires authentication, just skip it and continue. Is there a way to do this?
I was wondering about catching the HTTPError exception, but in fact the exception is handled internally by the urlopen() method, so that doesn't work.
Thanks for every reply.
You are right about the urllib2.HTTPError exception:
exception urllib2.HTTPError
Though being an exception (a subclass of URLError), an HTTPError can also function as a non-exceptional file-like return value (the same thing that urlopen() returns). This is useful when handling exotic HTTP errors, such as requests for authentication.
code
An HTTP status code as defined in RFC 2616. This numeric value corresponds to a value found in the dictionary of codes as found in BaseHTTPServer.BaseHTTPRequestHandler.responses.
The code attribute of the exception can be used to verify that authentication is required - code 401.
>>> import urllib2
>>> try:
...     conn = urllib2.urlopen('http://www.example.com/admin')
...     # read conn and process data
... except urllib2.HTTPError, x:
...     print 'Ignoring', x.code
...
Ignoring 401
>>>
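In Python 3, where urllib2 became urllib.request and urllib.error, a minimal sketch of the skip-and-continue loop might look like this (the URL list is a placeholder):

import urllib.request
import urllib.error

urls = ['http://www.example.com/admin']  # placeholder list

for url in urls:
    try:
        conn = urllib.request.urlopen(url)
    except urllib.error.HTTPError as x:
        if x.code == 401:
            print('Ignoring', url)
            continue  # skip URLs that require authentication
        raise
    # read conn and process data here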
