AWS Lambda provides built-in logging configuration as described here so that log records include timestamp and request ID. Request ID is a UUID for a particular invocation of a Lambda function, so it's important to include it in each log record because it allows the user to query for all log records with a given request ID (using CloudWatch Logs Insights), which allows for retrieving log records from a particular Lambda invocation. (It is not sufficient to simply filter by log stream, as a single log stream can contain logs from multiple invocations if the invocations were run within the same environment, as described here.)
Unfortunately, when a Python AWS Lambda function crashes due to an unhandled exception, the log record for the exception does not include the request ID. The obvious solution is to log unhandled exceptions via the logging module, so that request ID is included in the log record. This appears to be the cleanest solution but it does not work in this case because overriding sys.excepthook does not appear to have any effect within a Lambda environment (here is the only other source I can find that references this fact).
Therefore, I'm wondering if there's a recommended way to include request ID in the log record for an unhandled exception in a Python AWS Lambda function? This seems like a problem that should have a standard solution. Otherwise, I will just surround the lambda handler code with a try/except to log and re-raise all unhandled exceptions.
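The try/except fallback mentioned above can be kept out of the handler body with a small decorator. This is only a sketch, assuming the built-in Lambda logging configuration is in effect so that records emitted through the logging module carry the request ID; the names log_unhandled_exceptions and lambda_handler are illustrative, not part of any AWS API.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def log_unhandled_exceptions(handler):
    """Wrap a Lambda handler so unhandled exceptions are logged via logging."""

    @functools.wraps(handler)
    def wrapper(event, context):
        try:
            return handler(event, context)
        except Exception:
            # logger.exception logs at ERROR level and appends the traceback,
            # so the record goes through whatever formatter is configured.
            logger.exception("Unhandled exception in %s", handler.__name__)
            raise  # re-raise so the invocation is still reported as failed

    return wrapper


@log_unhandled_exceptions
def lambda_handler(event, context):
    # Hypothetical handler body, used only to demonstrate the decorator.
    return {"value": 10 / event["divisor"]}
```

Re-raising keeps the failure visible to Lambda; the decorator only adds one extra log record that passes through the logging module.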
I'm using a Python script to get stock prices from an API.
Everything works great, but sometimes I receive HTML errors instead of prices.
These errors prevent the script from continuing, and the terminal stops working.
How do I test the response from the API before passing the information to the next script?
I don't want the terminal to stop when it receives server errors.
Only one line gets the price:
get_price = api_client.quote('TWTR')
The question, "How do I test the response from the API before passing the information" is well suited for a try/except block. For example the following will attempt to call quote() and if an exception is raised it will print a message to the console:
try:
    get_price = api_client.quote('TWTR')
except Exception:  # replace with the specific exception type your client raises
    # handle the exception case.
    print('Getting quote did not work as expected')
The example is useful for illustrating the point but you should improve the exception handling. I recommend you take a look at resources to help you understand how to appropriately handle exceptions in your code. A few that I have found useful would be:
The Python Docs: Learn what an exception is and how they can be used
Python Exception Handling: A blog post that details how to combine some of the concepts in the Python documentation practically.
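As a slightly fuller sketch of the same idea, the call can be wrapped in a helper that also checks what came back before handing it on. Everything here is an assumption for illustration: safe_quote is a hypothetical helper, and the check for an HTML error page assumes the client returns the error body as a string, which may not match the real API client.

```python
def safe_quote(client, symbol, expected_errors=(Exception,)):
    """Return the quote for `symbol`, or None if the call fails or looks wrong."""
    try:
        price = client.quote(symbol)
    except expected_errors as exc:
        # Narrow `expected_errors` to the exceptions your client really raises.
        print("Getting quote for %s failed: %s" % (symbol, exc))
        return None
    # Assumption: an HTML error page shows up as text starting with '<'.
    if isinstance(price, str) and price.lstrip().startswith("<"):
        print("Got an HTML error page instead of a price for %s" % symbol)
        return None
    return price
```

The caller can then skip the next script whenever safe_quote returns None, instead of letting the error stop the whole run.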
I am running a job which makes many requests to retrieve data from an API. In order to make the requests, I am using the requests module and an iteration over this code:
logger.debug("Some log message")
response = requests.get(
    url=self._url,
    headers=self.headers,
    auth=self.auth,
)
logger.debug("Some other log message")
This usually produces the following logs:
[...] Some log message
[2019-08-27 03:00:57,201 - DEBUG - connectionpool.py:393] https://my.url.com:port "GET /some/important/endpoint?$skiptoken='12345' HTTP/1.1" 401 0
[2019-08-27 03:00:57,601 - DEBUG - connectionpool.py:393] https://my.url.com:port "GET /some/important/endpoint?$skiptoken='12345' HTTP/1.1" 200 951999
[...] Some other log message
In very rare occasions however, the job never terminates and in the logs it says:
[...] Some log message
[2019-08-27 03:00:57,201 - DEBUG - connectionpool.py:393] https://my.url.com:port "GET /some/important/endpoint?$skiptoken='12345' HTTP/1.1" 401 0
It never printed the remaining log messages and never returned. I am not able to reproduce the issue. I made the request which never returned manually but it gave me the desired response.
Questions:
Why does urllib3 always print a log with a status code 401 before printing a log with status code 200? Is this always the case or caused by an issue with the authentication or with the API server?
In the rare case of the second log snippet, is my assumption correct that the application is stuck making a request which never returns? Or:
a) Could requests.get throw an exception that prevents the other log statements from being printed and then "magically" gets caught somewhere in my code?
b) Is there a different possibility which I have not realised?
Additional Information:
Python 2.7.13 (We are already in the middle of upgrading to Python3, but this needs to be solved before that is completed)
requests 2.21.0
urllib3 1.24.3
auth is passed a requests.auth.HTTPDigestAuth(username, password)
My code has no try/except block, which is why I wrote "magically" in Question 2.a. This is because we would prefer the job to fail "loudly".
I am iterating over a generator yielding urls in order to make multiple requests
The job is run by Jenkins 2.95 on a schedule
When everything runs successfully it makes around 300 requests in about 5min
I am running two python scripts both running the same code but against different endpoints in one job but in parallel
Update
Answer to Q1:
This seems to be expected behaviour for HTTP Digest Auth.
See this github issue and Wikipedia.
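To illustrate why the first attempt comes back as 401: with Digest Auth the client cannot build its Authorization header until the server's challenge has supplied a realm and a nonce. The sketch below follows the RFC 2617 calculation for qop=auth with MD5; it is not how requests implements it internally, and the function names are made up for this example.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 digest of a text string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, password, realm, nonce, method, uri,
                    nc, cnonce, qop="auth"):
    """Compute the 'response' field of a Digest Authorization header.

    realm and nonce come from the WWW-Authenticate header of the 401,
    which is why one unauthenticated round trip happens first.
    """
    ha1 = md5_hex("%s:%s:%s" % (username, realm, password))
    ha2 = md5_hex("%s:%s" % (method, uri))
    return md5_hex(":".join([ha1, nonce, nc, cnonce, qop, ha2]))
```

Because the nonce is part of the hash, a response computed for one challenge is not valid for another, so the 401 followed by 200 pattern is expected for the first request with a fresh nonce.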
To answer your question,
1. It seems like it's a problem with your API. To make sure, can you run a curl command and see?
curl -i "https://my.url.com:port/some/important/endpoint?\$skiptoken='12345'"
The job never terminates, possibly because the API is not responding. Add a timeout to avoid this kind of blocking.
response = requests.get(
    url=self._url,
    headers=self.headers,
    auth=self.auth,
    timeout=60
)
Hope this helps your problem.
As Vithulan already answered, you should always set a timeout value when doing network calls - unless you don't care about your process staying stuck forever that is...
Now wrt/ error handling etc:
a) Could requests.get throw an exception that prevents the
other log statements from being printed and then "magically" gets
caught somewhere in my code?
It's indeed possible that some other try/except block upper in the call stack swallows the exception, but only you can tell. Note that if that's the case, you have some very ill-behaved code - a try/except should 1/ only target the exact exception(s) it's supposed to handle, 2/ have the least possible code within the try block to avoid catching similar errors from another part of the code and 3/ never silence an exception (IOW it should at least log the exception and traceback).
Note that you might as well just have a deactivated logger FWIW ;-)
This being said and until you made sure you didn't have such an issue, you can still get more debugging info by logging requests exceptions in your function:
logger.debug("Some log message")
try:
    response = requests.get(
        url=self._url,
        headers=self.headers,
        auth=self.auth,
        timeout=SOME_TIMEOUT_VALUE
    )
except Exception as e:
    # this will log the full traceback too
    logger.exception("oops, call to %s failed : %s", self._url, e)
    # make sure we don't swallow the exception
    raise
logger.debug("Some other log message")
Now a fact of life is that HTTP requests can fail for so many reasons that you should actually expect them to fail, so you may want to have some retry mechanism. Also, the fact that the call to requests.get didn't raise doesn't mean the call succeeded - you still have to check the response code (or use response.raise_for_status()).
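A retry mechanism can be as small as the following sketch. It is deliberately independent of requests: call_with_retries is a hypothetical helper that takes any zero-argument callable, and the attempt count and linear backoff are arbitrary illustrative choices.

```python
import time


def call_with_retries(func, attempts=3, delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func() up to `attempts` times, re-raising the last failure."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: let the job fail loudly
            sleep(delay * attempt)  # wait a bit longer after each failure
```

With requests, func would typically be a small function that calls requests.get with a timeout and then response.raise_for_status(), and retry_on could be narrowed to (requests.exceptions.RequestException,) so that unrelated bugs are not retried.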
EDIT:
As mentioned in my question, my code has no try/except block because we would like the entire job to terminate if any problem occurs.
A try/except block doesn't prevent you from terminating the job - just re-raise the exception (possibly after X retries), raise a new one instead, or call sys.exit() (which actually works by raising an exception) - and it lets you get useful debugging info, cf. my example code.
If there is an issue with the logger, this would then only occur in rare occasions. I can not imagine a scenario where the same code is run but sometimes the loggers are activated and sometimes not.
I was talking about another logger upper in the call stack. But this was only for completeness, I really think you just have a request that never returns for the lack of a timeout.
Do you know why I am noticing the Issue I talk about in Question 1?
Nope, and that's actually something I'd immediately investigate since, AFAICT, for the same request, you should either have only the 401 or only the 200.
According to the RFC:
10.4.2 401 Unauthorized
The request requires user authentication. The response MUST include a WWW-Authenticate header
field (section 14.47) containing a challenge applicable to the
requested resource. The client MAY repeat the request with a suitable
Authorization header field (section 14.8).
If the request already included Authorization credentials, then the 401 response indicates
that authorization has been refused for those credentials. If the 401
response contains the same challenge as the prior response, and the
user agent has already attempted authentication at least once, then
the user SHOULD be presented the entity that was given in the
response, since that entity might include relevant diagnostic
information.
So unless requests does something weird with auth headers (which is not the case as far as I remember but...), you should only have one single response logged.
EDIT 2:
I wanted to say that if an exception is thrown but not explicitly caught by my code, it should terminate the job (which was the case in some tests I ran)
If an exception reaches the top of the call stack without being handled, the runtime will indeed terminate the process - but you have to be sure that no handler up the call stack kicks in and swallows the exception. Testing the function in isolation will not exhibit this issue, so you'd have to check the full call stack.
This being said:
The fact, that it does not terminate, suggests to me, that no exception is thrown.
That's indeed the most likely, but only you can make sure it's really the case (we don't know the full code, the logger configuration etc).
I am trying to implement a job which reads from an Azure Queue and writes into a db. Occasionally some errors are raised from the Azure server, such as timeout, server busy, etc. How do I handle such errors in the code? I tried to run the code in a try/except block, but I am not able to identify Azure errors.
I tried to import WindowsAzureError from azure, but it doesn't work (no such module to import).
What is a good way to handle errors in this case?
If you're using 0.30+, all errors that occur after the request to the service has been made will extend from AzureException. AzureException can be found in the azure.common package, which Azure storage takes a dependency on. Errors which are thrown if invalid args are passed to a method (e.g. None for the queue name) might not extend from this and will be standard Python exceptions like ValueError.
Thanks #Terran,
exception azure.common.AzureConflictHttpError(message, status_code)
Bases: azure.common.AzureHttpError
exception azure.common.AzureException
Bases: exceptions.Exception
exception azure.common.AzureHttpError(message, status_code)
Bases: azure.common.AzureException
exception azure.common.AzureMissingResourceHttpError(message, status_code)
Bases: azure.common.AzureHttpError
This helped me.. http://azure-sdk-for-python.readthedocs.org/en/latest/ref/azure.common.html
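Based on the hierarchy listed above, a handler might separate HTTP-level failures from other service errors roughly as follows. This is a sketch under assumptions: classify_failure is a hypothetical helper, treating status codes of 500 and above as retryable is an illustrative policy rather than something the SDK prescribes, and the stand-in classes exist only so the snippet runs where azure.common is not installed.

```python
try:
    from azure.common import AzureException, AzureHttpError
except ImportError:
    # Stand-ins (assumption) mirroring the documented hierarchy, used only
    # when the azure.common package is unavailable.
    class AzureException(Exception):
        pass

    class AzureHttpError(AzureException):
        def __init__(self, message, status_code):
            super(AzureHttpError, self).__init__(message)
            self.status_code = status_code


def classify_failure(operation):
    """Run operation() and report 'ok', 'retry' or 'fatal'."""
    try:
        operation()
        return "ok"
    except AzureHttpError as exc:
        # Illustrative policy: server-side errors (5xx) are worth retrying.
        return "retry" if exc.status_code >= 500 else "fatal"
    except AzureException:
        return "fatal"
```

Standard Python exceptions such as ValueError are intentionally not caught here, so invalid arguments still surface as programming errors.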
I have been banging my head against this issue for a bit and have not come up with a solution. I am attempting to trap the exception UploadEntityTooLargeEntity. This exception is raised by GAE when 2 things happen.
Set the max_bytes_total param in the create_upload_url:
self.template_values['AVATAR_SAVE_URL'] = blobstore.create_upload_url(
    '/saveavatar', max_bytes_total=524288)
Attempt to post an item that exceeds the max_bytes_total.
I expect that, since my class is derived from RequestHandler that my error() method would be called. Instead I am getting a 413 screen telling me the upload is too large.
My request handler is derived from webapp2.RequestHandler. Is it expected that GAE will work with the error method derived from webapp2.RequestHandler? I'm not seeing this in GAE's code but I can't imagine there would be such an omission.
The 413 is generated by the App Engine infrastructure; the request never reaches your app, so it's impossible to handle this condition yourself.