I have an API endpoint on AWS API Gateway backed by AWS Lambda (Python & Flask) to store some data from a JSON file.
e.g. curl -X POST http://www.xxx.yyy/store -d @zzz.json
However, when I tried executing the API with a bigger JSON file, I encountered a timeout error. From my investigation, the maximum timeout setting for Lambda is 300 seconds, and for API Gateway it is 29 seconds. The 300-second maximum for Lambda sounds fine, but 29 seconds sounds too short. What could be a solution? The JSON data can be split by id, but it needs to be sent as one file.
EDIT:
Sure, I can't change that number. Any suggestion to solve this problem using another technology or system design pattern? I can't change the input, though.
EDIT2:
Currently, the Lambda function validates the payload against a JSON schema, parses it into models, and saves it into a database. Any suggestions?
Is there any way you can update your Lambda function to hand off to another process?
By decoupling you could for example do the following:
API Gateway -> Lambda (Perform any mandatory action, then store in S3 as a blob) -> S3 -> Another Lambda to process.
Uploading files through a Lambda can be tricky, and a direct upload is not recommended unless the file size is under the limits.
Be aware of the current limits:
API Gateway has a payload limit of 10 MB
API Gateway has a maximum timeout of 29 seconds
Lambda has an invocation payload (request and response) limit of 6 MB
The best approach is basically a two-step process:
The client app makes an HTTP request to a Lambda to get an upload URL. The Lambda returns a pre-signed POST URL for S3 (see the sketch below).
The client posts the file using the pre-signed URL.
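For the first step, something like this (a minimal sketch; the bucket name, key, and expiry are placeholders to adapt):

import json
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Return a pre-signed POST so the client uploads directly to S3,
    # bypassing the API Gateway payload and timeout limits.
    presigned = s3.generate_presigned_post(
        Bucket='my-upload-bucket',   # placeholder bucket name
        Key='uploads/zzz.json',      # placeholder object key
        ExpiresIn=300,               # URL valid for 5 minutes
    )
    return {
        'statusCode': 200,
        'body': json.dumps(presigned),  # contains 'url' and 'fields' for the POST
    }

The second Lambda can then be triggered by the S3 ObjectCreated event to do the validation and database writes without any API Gateway timeout in the path.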
API Gateway limits : https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html
Lambda limits: https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html
The timeout value cannot be increased:
Resource or operation: Integration timeout
Default quota: 50 milliseconds - 29 seconds for all integration types, including
Lambda, Lambda proxy, HTTP, HTTP proxy, and AWS integrations.
Can be increased: Not for the lower or upper bounds.
Source: https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html
Related
I hope you can help me with how to paginate a response from a Lambda function to API Gateway.
I have an API Gateway endpoint as a Lambda function trigger.
A POST request triggers a Lambda (written in Python) that performs a query against a database, transforms the data, and returns this data back to the API.
My API's POST method integration request has the following parameters:
Integration type: Lambda Function
Use Lambda Proxy integration: YES
Rest of fields as default.
I want to paginate the API response but I can't find how to do it.
My Lambda returns the query results to the API as one big JSON, but I want to limit it to 20 items per page.
Could you please guide me on how to implement it? Thanks
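For reference, a minimal sketch of one common approach with a Lambda proxy integration; the page query parameter and the run_query helper are hypothetical stand-ins for your own query code:

import json

PAGE_SIZE = 20

def lambda_handler(event, context):
    # With proxy integration, query-string parameters arrive in the event.
    params = event.get('queryStringParameters') or {}
    page = max(int(params.get('page', 1)), 1)

    items = run_query()  # placeholder for the existing database query

    start = (page - 1) * PAGE_SIZE
    chunk = items[start:start + PAGE_SIZE]
    return {
        'statusCode': 200,
        'body': json.dumps({
            'page': page,
            'items': chunk,
            'has_more': start + PAGE_SIZE < len(items),
        }),
    }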
I want to use API Gateway to send a message to SQS, which then needs to trigger a Lambda. After the calculations are finished within the Lambda, I need to pass the results back to API Gateway. In other words, something like this:
GET request --> API Gateway --> SQS --> Lambda --> (back to the same SQS?) --> API Gateway
I have set up all the necessary permissions, meaning that I can call API Gateway and send a message to SQS, which then passes it to the Lambda (I can see in CloudWatch that the Lambda received the message). However, I cannot get the Lambda response back to API Gateway...
Does anybody have some advice/tutorial/blog post about that? I have watched various YouTube videos and searched posts on SO but didn't find a solution to my problem.
AWS Lambda can handle a large number of concurrent invocations. The default is 1000 (one thousand) and can be increased via a support ticket to "Hundreds of thousands".
If you want to use SQS to smooth out intermittent request spikes, then the Lambda invocations will be asynchronous with respect to the caller's API Gateway call, and you need another mechanism to feed the Lambda invocation result back to the API Gateway caller.
One possibility is a callback URL on the caller's side that your Lambda invokes once it has processed the message. Or you can store the Lambda invocation result somewhere (such as S3 or DynamoDB) and have the caller poll periodically to check whether the result is ready and, if so, retrieve it.
Either way, once you use SQS to decouple API Gateway invocations from their processing by your Lambda function, that processing becomes asynchronous to the caller's request: the caller's HTTP request returns right away without waiting for the Lambda invocation result.
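A minimal sketch of the polling variant (the job-results table, the job_id field, and the do_calculation helper are hypothetical):

import json
import boto3

table = boto3.resource('dynamodb').Table('job-results')  # hypothetical table

def worker_handler(event, context):
    # SQS-triggered worker: process each message and store the result
    # so the client can poll for it later.
    for record in event['Records']:
        body = json.loads(record['body'])
        result = do_calculation(body)        # placeholder for the real work
        table.put_item(Item={
            'job_id': body['job_id'],
            'status': 'DONE',
            'result': json.dumps(result),
        })

def status_handler(event, context):
    # Separate API Gateway endpoint the client polls with ?job_id=...
    job_id = (event.get('queryStringParameters') or {})['job_id']
    item = table.get_item(Key={'job_id': job_id}).get('Item')
    if item is None:
        return {'statusCode': 202, 'body': json.dumps({'status': 'PENDING'})}
    return {'statusCode': 200, 'body': item['result']}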
We have an Azure HTTP-triggered function app (f1) which talks to another HTTP-triggered function app (f2) that has a prediction algorithm.
Depending upon the input request size from function (f1), the response time of function (f2) increases a lot.
When the response time of function (f2) is high, the functions get timed out at 320 seconds.
Our requirements are:
Provide the prediction algorithm as a service (f2).
An orchestration API (f1) which will be called by the client and, based on the client's input request, will collect the data from the database, do data validation, and pass the data to (f2) for prediction.
After prediction, (f2) responds with the predicted result to (f1).
Once (f1) receives the response from (f2), (f1) responds back to the client.
We are searching for an alternative Azure approach or solution that will reduce the latency of the API, with the condition that f2 stays a service.
If it takes more than 5 minutes in total to validate user input, retrieve additional data, feed it to the model, and run the model itself, you might want to look at something other than APIs that return a response synchronously.
With these kinds of running times, I would recommend an asynchronous pattern, such as: F1 stores all data on an Azure Queue; F2 (queue triggered) runs the model and stores the result in a database; the requestor monitors the database for updates. If F1 takes the most time, then create an F0 that stores the request on a queue and make F1 queue-triggered as well.
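A minimal sketch of the queue-triggered F2 using the Azure Functions Python v2 programming model (the queue name, connection setting, and the run_model / save_to_database helpers are assumptions):

import json
import azure.functions as func

app = func.FunctionApp()

@app.queue_trigger(arg_name="msg", queue_name="model-input",
                   connection="AzureWebJobsStorage")  # assumed queue and setting
def f2(msg: func.QueueMessage):
    # Runs asynchronously whenever F1 drops a request on the queue,
    # so no HTTP timeout applies to the model run.
    payload = json.loads(msg.get_body().decode("utf-8"))
    result = run_model(payload)                # placeholder for the prediction
    save_to_database(payload["id"], result)    # placeholder for persistence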
As described in Limits for Http Trigger:
If a function that uses the HTTP trigger doesn't complete within 230 seconds, the Azure Load Balancer will time out and return an HTTP 502 error. The function will continue running but will be unable to return an HTTP response.
So it's not possible to keep f1 and/or f2 HTTP-triggered for a synchronous response.
Alternatives are many, but none can be synchronous (due to the limitation above) if:
the interface to the end user is a REST API, and
the API is served by an HTTP-triggered Azure Function, and
the time needed to serve the request is greater than 230 seconds.
Assuming:
the interface to the end user is a REST API, and
the API is served by an HTTP-triggered Azure Function,
one async possibility would be like this:
PS: I retained f1 and f2, which do the same as in your design, though their triggers/outputs change.
An HTTP-triggered REST API served by f3 is the entry point for the end user to trigger the job; it posts to queue q1 and returns a job-id / status-url as the response (see the sketch below).
The user can query/poll the current status/result of a job by job-id using another HTTP-triggered API served by f4.
f1 and f2 are now queue-triggered.
f1, f2 and f3 update the status for each job-id whenever needed in ADLS (which could be anything else, like Redis cache or Table Storage etc.).
f4 need not be a separate function; it can be served as a different path/method on f3.
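A minimal sketch of f3 in the same v2 Python model (queue q1 from the design above; the route, connection setting, and status-url shape are assumptions):

import json
import uuid
import azure.functions as func

app = func.FunctionApp()

@app.route(route="jobs", methods=["POST"])
@app.queue_output(arg_name="outmsg", queue_name="q1",
                  connection="AzureWebJobsStorage")  # assumed connection setting
def f3(req: func.HttpRequest, outmsg: func.Out[str]) -> func.HttpResponse:
    # Accept the job, enqueue it for f1/f2, and return immediately.
    job_id = str(uuid.uuid4())
    outmsg.set(json.dumps({"job_id": job_id, "request": req.get_json()}))
    return func.HttpResponse(
        json.dumps({"job_id": job_id,
                    "status_url": f"/api/jobs/{job_id}"}),  # hypothetical path
        mimetype="application/json",
        status_code=202,
    )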
I'm writing serverless apps in Python and deploying using Chalice, Lambda, and AWS.
Just a quick question:
I would like to limit the number of items returned from the API:
Maximum 1000 items per day
Maximum 200 items per hour
Here is a sample API:
from chalice import Chalice

app = Chalice(app_name='items')

@app.route('/items', authorizer=authorizer)  # authorizer defined elsewhere
def get_items():
    params = app.current_request.query_params
    items = AvaiableItem(params).get()
    return {'items': items.serialize()}
How can I configure these limits?
Chalice does not have any built-in support for API throttling yet, but you can monitor the GitHub issue Add support for throttling per route.
In the interim, you can manually configure throttling via the AWS Console, as described in Throttle API Requests for Better Throughput. You could of course also do this via boto3 if desired.
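A minimal sketch of the boto3 route, assuming a deployed REST API (the plan name, API id, stage name, and limit values below are placeholders):

import boto3

apigw = boto3.client('apigateway')

# Attach a usage plan with a daily quota and a throttle rate to a stage.
plan = apigw.create_usage_plan(
    name='items-api-limits',                          # placeholder plan name
    apiStages=[{'apiId': 'abc123', 'stage': 'api'}],  # placeholder API/stage
    quota={'limit': 1000, 'period': 'DAY'},           # max requests per day
    throttle={'rateLimit': 200 / 3600.0,  # ~200 requests per hour, per second
              'burstLimit': 10},
)

Note that usage-plan quotas only support DAY, WEEK, or MONTH periods, so the hourly cap is approximated here by the throttle rate, and the limits apply to requests (per API key) rather than to items returned.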
This is likely a question with an easy answer, but I can't seem to figure it out.
Background: I have a Python Lambda function that picks up changes in a DB and then HTTP-POSTs the changes as JSON to a URL. I'm using urllib2 sort of like this:
import urllib2

# this runs inside a loop; in reality my error handling is much better
request = urllib2.Request(url)
request.add_header('Content-type', 'application/json')
try:
    response = urllib2.urlopen(request, json_message)
except:
    response = "Failed!"
From the logs, it seems the call to send the messages is either skipped entirely or times out while waiting for a response.
Is there a permission setting I'm missing? The outbound rules in AWS appear to be right. [Edit] The VPC applied to this Lambda does have internet access, and the security groups applied appear to allow internet access. [/Edit]
I've tested the code locally (connected to the same data source) and it works flawlessly.
The other questions about posting from a Lambda appear to involve Node.js, and usually the URL is wrong. In this case, I'm using a requestb.in URL that I know is working, as it works when running locally.
Edit:
I've set up my NAT gateway, and it should work. I've even gone as far as going to a different AWS account and re-creating the conditions, and it works fine there. I can't see any security groups that would be blocking access anywhere. It's continuing to time out.
Edit:
Turns out I was just an idiot when I set up my default route to the NAT gateway: out of habit I wrote 0.0.0.0/24 instead of 0.0.0.0/0.
If you've deployed your Lambda function inside your VPC, it does not obtain a public IP address, even if it's deployed into a subnet with a route to an Internet Gateway. It only obtains a private IP address and thus cannot communicate with the public Internet by itself.
To communicate with the public Internet, Lambda functions deployed inside your VPC need to be placed in a private subnet that has a route to either a NAT Gateway or a self-managed NAT instance.
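In boto3 terms, the fix from the edit above amounts to a default route like this (the route table and NAT gateway ids are placeholders):

import boto3

ec2 = boto3.client('ec2')

# Default route from the Lambda's private subnet out through the NAT gateway.
ec2.create_route(
    RouteTableId='rtb-0123456789abcdef0',   # placeholder private route table
    DestinationCidrBlock='0.0.0.0/0',       # not 0.0.0.0/24!
    NatGatewayId='nat-0123456789abcdef0',   # placeholder NAT gateway
)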
I have also faced the same issue. I overcame it by using boto3 to invoke a lambda from another lambda.
import boto3

client = boto3.client('lambda')

# 'Event' invokes the target asynchronously; use 'RequestResponse'
# instead if you need to wait for its result.
response = client.invoke(
    FunctionName='my-other-function',  # name or ARN of the target Lambda
    InvocationType='Event',
    Payload=b'{"key": "value"}',
)
But make sure that you set the IAM policy on the Lambda's role (in the source AWS account) to allow invoking that other Lambda.
Adding to the above: boto3 uses HTTP under the hood, so the calling Lambda still needs a network path to the AWS Lambda endpoint.
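A minimal sketch of that permission via boto3 (the role name, region, account id, and function name are placeholders):

import json
import boto3

iam = boto3.client('iam')

# Allow the source function's role to invoke the target function.
iam.put_role_policy(
    RoleName='source-lambda-role',          # placeholder role name
    PolicyName='invoke-other-lambda',
    PolicyDocument=json.dumps({
        'Version': '2012-10-17',
        'Statement': [{
            'Effect': 'Allow',
            'Action': 'lambda:InvokeFunction',
            # placeholder ARN of the target Lambda
            'Resource': 'arn:aws:lambda:us-east-1:123456789012:function:my-other-function',
        }],
    }),
)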