I'm designing a workflow engine for a very specific task and I'm thinking about exception handling.
I've got a main process that calls a few functions. Most of those functions call other, more specific functions, and so on. There are a few libraries involved, so there are a lot of specific errors that can occur: IOError, OSError, AuthenticationException ...
I have to stop the workflow when an error occurs and log it so I can continue from that point when the error is resolved.
Example of what I mean:
def workflow_runner():
    download_file()
    ...
    # (more calls with their own exceptions)
    ...

def download_file():
    ftps = open_ftp_connection()
    ftps.get(filename)
    ...
    # (more calls with their own exceptions)
    ...

def open_ftp_connection():
    ftps = ftplib.FTP_TLS()
    try:
        ftps.connect(domain, port)
        ftps.login(username, password)
    except ftplib.all_errors as e:
        print(e)
        raise
    return ftps
Your basic, run of the mill, modular functions.
My question is this:
What's considered the best way of doing top to bottom error handling in Python 3?
To raise every exception to the top and thus put "try except" over each function call up the stack?
To handle every exception when it happens, log and raise and have no "try except" at the "top"?
Some better alternative?
Would it be better to just finish and raise the error on the spot or catch it in the "download_file" and/or "workflow_runner" functions?
I ask because if I end up catching everything at the top I feel like I might end up with:
except AError
except BError
...
except A4Error
It depends… You catch an exception at the point where you can do something about it. That differs between different functions and different exception types. A piece of code calls a subsystem (generically speaking any function), and it knows that subsystem may raise exception A, B or C. It now needs to decide what exceptions it expects and/or what it can do about each one of them. In the end it may decide to catch A and B exceptions, but it wouldn't make sense for it to catch C exceptions because it can't do anything about them. This now means this piece of code may raise C exceptions, and its callers need to be aware of that and make the same kinds of decisions.
So different exceptions are caught at different layers, as appropriate.
In more concrete terms, say you have a system which consists of HTTP objects that download stuff from remote servers, a job manager that wrangles a bunch of these HTTP objects and stores their results in a database, and a top-level coordinator that starts and stops the job managers. The HTTP objects may obviously raise all sorts of HTTP exceptions when network requests fail, and the job managers may raise exceptions when something's wrong with the database. You will probably let the job managers worry about HTTP errors like 404, but not about something fundamental like a ComputerDoesntHaveANetworkInterface error; equally, a DatabaseIsUnreachable exception is nothing a job manager can do anything about, and it should probably lead to the termination of the application.
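A minimal sketch of that layering, with made-up names (HttpError, run_job, etc. are purely illustrative, not a real API):

import sys

class HttpError(Exception):
    """Illustrative stand-in for a library's HTTP error."""
    def __init__(self, status):
        super().__init__("HTTP %s" % status)
        self.status = status

class DatabaseIsUnreachable(Exception):
    pass

def fetch(url):
    # stand-in for the HTTP object; may raise HttpError
    ...

def run_job(urls, db):
    # job manager layer: it knows what a 404 means for its work
    for url in urls:
        try:
            payload = fetch(url)
        except HttpError as e:
            if e.status == 404:
                continue  # skip missing resources, the job can go on
            raise  # anything else is not its problem; let it propagate
        db.store(payload)  # may raise DatabaseIsUnreachable

def coordinator(jobs, db):
    # top layer: the only place that can reasonably decide to shut down
    for urls in jobs:
        try:
            run_job(urls, db)
        except DatabaseIsUnreachable:
            sys.exit("database unreachable, terminating")

Each layer catches only what it can act on and lets everything else bubble up, which is exactly why you don't end up with a pile of except clauses at the top.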
Related
I'm using a Python script to get stock prices from an API.
Everything works great, but sometimes I receive HTML errors instead of prices.
These errors prevent the script from continuing, and the terminal stops working.
How do I test the response from the API before passing the information to the next script?
I don't want the terminal to stop when it receives server errors.
Only one line is used to get the price:
get_price = api_client.quote('TWTR')
The question, "How do I test the response from the API before passing the information" is well suited for a try/except block. For example the following will attempt to call quote() and if an exception is raised it will print a message to the console:
try:
    get_price = api_client.quote('TWTR')
except Exception:  # replace Exception with the specific type(s) your client raises
    # handle the exception case
    print('Getting quote did not work as expected')
The example is useful for illustrating the point, but you should improve the exception handling; a sketch of one way to tighten it follows the list below. I recommend taking a look at resources that help you understand how to handle exceptions appropriately in your code. A few that I have found useful are:
The Python Docs: Learn what exceptions are and how they can be used
Python Exception Handling: A blog post that details how to combine some of the concepts in the Python documentation practically.
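A hedged sketch of tighter handling, assuming the api_client from the question raises the built-in ConnectionError on network failures (check your client's documentation for the exception types it actually raises):

import logging

logger = logging.getLogger(__name__)

try:
    get_price = api_client.quote('TWTR')
except ConnectionError:
    # transient network problem: log it and fall back to a sentinel value
    logger.warning("quote('TWTR') failed; will try again on the next run")
    get_price = None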
I am running a job which makes many requests to retrieve data from an API. In order to make the requests, I am using the requests module, iterating over this code:
logger.debug("Some log message")
response = requests.get(
url=self._url,
headers=self.headers,
auth=self.auth,
)
logger.debug("Some other log message")
This usually produces the following logs:
[...] Some log message
[2019-08-27 03:00:57,201 - DEBUG - connectionpool.py:393] https://my.url.com:port "GET /some/important/endpoint?$skiptoken='12345' HTTP/1.1" 401 0
[2019-08-27 03:00:57,601 - DEBUG - connectionpool.py:393] https://my.url.com:port "GET /some/important/endpoint?$skiptoken='12345' HTTP/1.1" 200 951999
[...] Some other log message
On very rare occasions, however, the job never terminates, and the logs say:
[...] Some log message
[2019-08-27 03:00:57,201 - DEBUG - connectionpool.py:393] https://my.url.com:port "GET /some/important/endpoint?$skiptoken='12345' HTTP/1.1" 401 0
It never printed the remaining log messages and never returned. I am not able to reproduce the issue: when I made the request that never returned manually, it gave me the desired response.
Questions:
Why does urllib3 always print a log with a status code 401 before printing a log with status code 200? Is this always the case or caused by an issue with the authentication or with the API server?
In the rare case of the second log snippet, is my assumption correct that the application is stuck making a request which never returns? Or:
a) Could requests.get throw an exception which results in the other log statements never being printed, and which then "magically" gets caught somewhere in my code?
b) Is there a different possibility which I have not realised?
Additional Information:
Python 2.7.13 (We are already in the middle of upgrading to Python3, but this needs to be solved before that is completed)
requests 2.21.0
urllib3 1.24.3
auth is passed a requests.auth.HTTPDigestAuth(username, password)
My code has no try/except block, which is why I wrote "magically" in Question 2.a. This is because we would prefer the job to fail "loudly".
I am iterating over a generator yielding urls in order to make multiple requests
The job is run by
Jenkins 2.95 on a schedule
When everything runs successfully it makes around 300 requests in about 5min
I am running two python scripts both running the same code but against different endpoints in one job but in parallel
Update
Answer to Q1:
This seems to be expected behaviour for HTTP Digest Auth: the client first sends the request without credentials, receives a 401 carrying the digest challenge, and then repeats the request with a suitable Authorization header, which succeeds with a 200.
See this github issue and Wikipedia.
To answer your questions:
1. It seems like a problem with your API. To make sure, can you run a curl command and check?
curl -i https://my.url.com:port/some/important/endpoint?$skiptoken='12345'
2. It never terminates, maybe because the API is not responding. Add a timeout to avoid this kind of blocking:
response = requests.get(
    url=self._url,
    headers=self.headers,
    auth=self.auth,
    timeout=60,
)
Hope this helps your problem.
As Vithulan already answered, you should always set a timeout value when doing network calls - unless you don't care about your process staying stuck forever that is...
Now, with regard to error handling etc.:
a) Could requests.get throw an exception which results in the other log statements never being printed, and which then "magically" gets caught somewhere in my code?
It's indeed possible that some other try/except block upper in the call stack swallows the exception, but only you can tell. Note that if that's the case, you have some very ill-behaved code - a try/except should 1/ only target the exact exception(s) it's supposed to handle, 2/ have the least possible code within the try block to avoid catching similar errors from another part of the code and 3/ never silence an exception (IOW it should at least log the exception and traceback).
Note that you might as well just have a deactivated logger FWIW ;-)
This being said and until you made sure you didn't have such an issue, you can still get more debugging info by logging requests exceptions in your function:
logger.debug("Some log message")
try:
response = requests.get(
url=self._url,
headers=self.headers,
auth=self.auth,
timeout=SOME_TIMEOUT_VALUE
)
except Exception as e:
# this will log the full traceback too
logger.exception("oops, call to %s failed : %s", self._url, e)
# make sure we don't swallow the exception
raise
logger.debug("Some other log message")
Now a fact of life is that HTTP requests can fail for such an awful lot of reasons that you should actually expect failures, so you may want to have some retry mechanism. Also, the fact that the call to requests.get didn't raise doesn't mean the call succeeded - you still have to check the response status code (or use response.raise_for_status()).
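A rough sketch of both points, assuming the same requests-based call as above (MAX_RETRIES and SOME_TIMEOUT_VALUE are placeholders):

import time
import requests

for attempt in range(MAX_RETRIES):
    try:
        response = requests.get(
            url=self._url,
            headers=self.headers,
            auth=self.auth,
            timeout=SOME_TIMEOUT_VALUE,
        )
        # a 4xx/5xx response is a failure too, even though get() returned
        response.raise_for_status()
        break
    except requests.exceptions.RequestException as e:
        logger.exception("attempt %s/%s on %s failed", attempt + 1, MAX_RETRIES, self._url)
        if attempt + 1 == MAX_RETRIES:
            raise  # out of retries: fail loudly
        time.sleep(2 ** attempt)  # simple exponential backoff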
EDIT:
As mentioned in my question, my code has no try/except block because we would like the entire job to terminate if any problem occurs.
A try/except block doesn't prevent you from terminating the job - just re-raise the exception (possibly after X retries), or raise a new one instead, or call sys.exit() (which actually works by raising an exception) - and it still lets you get useful debugging info etc., cf. my example code.
If there is an issue with the logger, this would then only occur in rare occasions. I can not imagine a scenario where the same code is run but sometimes the loggers are activated and sometimes not.
I was talking about another logger higher up the call stack. But this was only for completeness; I really think you just have a request that never returns for lack of a timeout.
Do you know why I am noticing the Issue I talk about in Question 1?
Nope, and that's actually something I'd immediately investigate since, AFAICT, for the same request you should either have only the 401 or only the 200.
According to the RFC:
10.4.2 401 Unauthorized

The request requires user authentication. The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource. The client MAY repeat the request with a suitable Authorization header field (section 14.8).

If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user SHOULD be presented the entity that was given in the response, since that entity might include relevant diagnostic information.
So unless requests does something weird with auth headers (which is not the case as far as I remember, but...), you should only have one single response logged.
EDIT 2:
I wanted to say that if an exception is thrown but not explicitly caught by my code, it should terminate the job (which was the case in some tests I ran)
If an exception reaches the top of the call stack without being handled, the runtime will indeed terminate the process - but you have to be sure that no handler up the call stack kicks in and swallows the exception. Testing the function in isolation will not exhibit this issue, so you'd have to check the full call stack.
This being said:
The fact that it does not terminate suggests to me that no exception is thrown.
That's indeed the most likely, but only you can make sure it's really the case (we don't know the full code, the logger configuration etc).
If I request data from an API on a Raspberry Pi, in a while/for loop in Python, appending the data to a CSV, and one iteration fails due to something like a faulty WiFi connection that comes and goes, what is a foolproof method of getting an indication that an error occurred and having the script keep trying, either immediately or after some rest period?
Use try/except to catch the exception, e.g.:
while True:
    try:
        my_function_that_sometimes_fails()
    except Exception as e:
        print(e)
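If you want a rest period between attempts instead of retrying immediately, a small variation that retries until it succeeds (the 30-second delay is arbitrary):

import time

while True:
    try:
        my_function_that_sometimes_fails()
        break  # success, stop retrying
    except Exception as e:
        print(e)
        time.sleep(30)  # rest before the next attempt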
I guess the retry package (and decorator) will suit your needs. You can specify what kind of exception it should catch and how many times it should retry before giving up completely. You can also specify the amount of time between each try.
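For example, a minimal sketch using the retry decorator (assumes pip install retry; the function body and the choice of ConnectionError are illustrative):

from retry import retry

@retry(ConnectionError, tries=5, delay=10, backoff=2)
def fetch_and_append():
    # request the data and append it to the CSV here
    ...

fetch_and_append()  # retries up to 5 times, waiting 10s, 20s, 40s, ...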
I am trying to implement a job which reads from an Azure Queue and writes into a db. Occasionally some errors are raised from the Azure server, such as timeout, server busy etc. How do I handle such errors in the code? I tried to run the code in a try/except block, but I am not able to identify the Azure errors.
I tried to import WindowsAzureError from azure, but it doesn't work (no such module to import).
What is a good way to handle errors in this case?
If you're using 0.30+, all errors that occur after the request to the service has been made will extend from AzureException. AzureException can be found in the azure.common package, which Azure Storage takes a dependency on. Errors which are thrown if invalid args are passed to a method (e.g. None for the queue name) might not extend from this and will be standard Python exceptions like ValueError.
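A hedged sketch of what that looks like in practice (queue_service and the queue name are placeholders standing in for a QueueService instance; the status_code attribute comes from AzureHttpError's constructor, per the docs below):

from azure.common import AzureException, AzureHttpError

try:
    messages = queue_service.get_messages('myqueue')
except AzureHttpError as e:
    # server-side errors such as timeouts or "server busy" carry a status code
    print('Azure HTTP error %s: %s' % (e.status_code, e))
    raise
except AzureException as e:
    # any other service-level Azure error
    print('Azure error: %s' % e)
    raise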
Thanks @Terran,
exception azure.common.AzureException
    Bases: exceptions.Exception
exception azure.common.AzureHttpError(message, status_code)
    Bases: azure.common.AzureException
exception azure.common.AzureConflictHttpError(message, status_code)
    Bases: azure.common.AzureHttpError
exception azure.common.AzureMissingResourceHttpError(message, status_code)
    Bases: azure.common.AzureHttpError
This helped me: http://azure-sdk-for-python.readthedocs.org/en/latest/ref/azure.common.html
I'm using twisted manhole (https://github.com/HoverHell/pyaux/blob/master/pyaux/runlib.py#L126), and I also send errors caught by Twisted into python logging (https://github.com/HoverHell/pyaux/blob/master/pyaux/twisted_aux.py#L9).
However, as a result, the log gets ConnectionDone() errors, which isn't a very interesting thing as an error.
What would be appropriate to change so that I stop logging these (and possibly some other) not-exactly-errors? Filtering for twisted.python.failure.Failure cases, perhaps? And where is the ConnectionDone() even raised from, and why?
A ConnectionDone() instance is given to the connectionLost() callback after the connection has been closed. You should be seeing this when the client side decides to close the connection.
You definitely don't want to filter the Failure out. You can think of a Failure as the "asynchronous analogue" of an Exception. The usual way to avoid seeing certain kinds of exceptions is something like:
from twisted.internet import error

...

def connectionLost(self, reason):
    if reason.check(error.ConnectionDone):
        # this is normal, ignore it
        pass
    else:
        ...  # do whatever you have been doing for logging