AWS Lambda Function cannot access other services - python

I have a problem with an AWS Lambda Function which depends upon DynamoDB and SQS to function properly. When I try to run the lambda stack, they time out when trying to connect to the SQS service. The AWS Lambda Function lies inside a VPC with the following setup:
A VPC with four subnets
Two subsets are public, routing their 0.0.0.0/16 traffic to an internet gateway
A MySQL server sits in a public subnet
The other two contain the lambdas and route their 0.0.0.0/16 traffic to a NAT which lives in one of the public subnets.
All route tables have a 10.0.0.0/16 to local rule (is this the problem because Lambdas use private Ip's inside a VPC?)
The main rout table is the one with the NAT, but I explicitly associated the public nets with the internet gateway routing table
The lambdas and the mysql server share a security group which allows for inbound internal access (10.x/16) as well as unrestricted outbound traffic (0.0.0.0/16).
Traffic between lambdas and the mysql instance is no problem (except if I put the lambdas outside the VPC, then they can't access the server even if I open up all ports). Assume the code for the lambdas is also correct, as it worked before I tried to mask it in a private net. Also the lambda execution roles have been set accordingly (or do they need adjustments after moving them to a private net?).
Adding a dynamodb endpoint solved the problems with the database, but there are no VPC endpoints available for some of the other services. Following some answers I found here, here, here and in the announcements / tutorials here and here, I am pretty sure I followed all the recommended steps.
I would be very thankful and glad for any hints where to check next, as I have currently no idea what could be the problem here.
EDIT: The function don't seem to have any internet access at all, since a toy example I checked also timed out:
import urllib.request
def lambda_handler(event, context):
test = urllib.request.urlopen(url="http://www.google.de")
return test.status

Of course the problem was sitting in front of the monitor again. Instead of routing 0.0.0.0/0 (any traffic) to the internet gateway, I had just specified 0.0.0.0/16 (traffic from machines with an 0.0.x.x ip) to the gate. Since no machines with such ip exists any traffic was blocked from entering leaving the VPC.
#John Rotenstein: Thx, though for the hint about lambdash. It seems like a very helpful tool.

Your configuration sounds correct.
You should test the configuration to see whether you can access any public Internet sites, then test connecting to AWS.
You could either write a Lambda function that attempts such connections or you could use lambdash that effectively gives you a remote shell running on Lambda. This way, you can easily test connectivity from the command line, such as curl.

Related

AWS Lambda to RDS PostgreSQL

Hello fellow AWS contributors, I’m currently working on a project to set up an example of connecting a Lambda function to our PostgreSQL database hosted on RDS. I tested my Python + SQL code locally (in VS code and DBeaver) and it works perfectly fine with including only basic credentials(host, dbname, username password). However, when I paste the code in Lambda function, it gave me all sorts of errors. I followed this template and modified my code to retrieve the credentials from secret manager instead.
I’m currently using boto3, psycopg2, and secret manager to get credentials and connect to the database.
List of errors I’m getting-
server closed the connection unexpectedly. This probably means the server terminated abnormally before or while processing the request
could not connect to server: Connection timed out. Is the server running on host “db endpoint” and accepting TCP/IP connections on port 5432?
FATAL: no pg_hba.conf entry for host “ip:xxx”, user "userXXX", database "dbXXX", SSL off
Things I tried -
RDS and Lambda are in the same VPC, same subnet, same security group.
IP address is included in the inbound rule
Lambda function is set to run up to 15 min, and it always stops before it even hits 15 min
I tried both database endpoint and database proxy endpoint, none of it works.
It doesn’t really make sense to me that when I run the code locally, I only need to provide the host, dbname, username, and password, that’s it, and I’m able to write all the queries and function I want. But when I throw the code in lambda function, it’s requiring all these secret manager, VPC security group, SSL, proxy, TCP/IP rules etc. Can someone explain why there is a requirement difference between running it locally and on lambda?
Finally, does anyone know what could be wrong in my setup? I'm happy to provide any information in related to this, any general direction to look into would be really helpful. Thanks!
Following the directions at the link below to build a specific psycopg2 package and also verifying the VPC subnets and security groups were configured correctly solved this issue for me.
I built a package for PostgreSQL 10.20 using psycopg2 v2.9.3 for Python 3.7.10 running on an Amazon Linux 2 AMI instance. The only change to the directions I had to make was to put the psycopg2 directory inside a python directory (i.e. "python/psycopg2/") before zipping it -- the import psycopg2 statement in the Lambda function failed until I did that.
https://kalyanv.com/2019/06/10/using-postgresql-with-python-on-aws-lambda.html
This the VPC scenario I'm using. The Lambda function is executing inside the Public Subnet and associated Security Group. Inbound rules for the Private Subnet Security Group only allow TCP connections to 5432 for the Public Subnet Security Group.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_VPC.Scenarios.html#USER_VPC.Scenario1

Cant write to Database with AWS Lambda

I am trying to write files to a postgres database with AWS Lambda but I am facing an error:
Calling the invoke API action failed with this message: Network Error
My code looks like this:
from sqlalchemy import create_engine
import pandas as pd
def test(event=None, context=None):
conn = create_engine('postgresql://user:password#url:5439/database')
df = pd.DataFrame([{'A': 'foo', 'B': 'green', 'C': 11},{'A':'bar', 'B':'blue', 'C': 20}])
df.to_sql('your_table', conn, index=False, if_exists='replace', schema='schema')
test()
Resources:
Memory - 1280MB
Timeout - 2 minutes
What is the problem here and how else could I write pandas Dataframe to a Database with AWS Lambda?
I'm assuming the Postgres instance is in RDS.
Is your lambda in your VPC? You can check this on the function's page in admin console, in the VPC box. By default it's not and the VPC box says "None".
Case 1: Lambda is not in VPC
Then the issue might be that the security group associated with your RDS instance does not allow connections from outside the VPC. That's the default if you didn't touch the security group. Find the security group for your RDS instance from the RDS admin, then check out the "Inbound rules" for that security group. Lambdas don't have an IP so you'll need to add an inbound rule allowing at least postgres traffic for source "0.0.0.0/0", i.e. the entire internet.
This should be sufficient but note that this is not considered very good for security, since anyone can now in theory reach your DB (and worse if they can guess the password). But depending on your project that might not be a problem for you. If that is an issue for you, you could instead associate your lambda with the same VPC the RDS instance is in, in order to provide better networking security, and move to Case 2.
Case 2: Lambda is in a VPC
I'm assuming you put the lambda in the same VPC as the RDS instance for simplicity - if not you probably know what you're doing.
All you need to do now (providing you didn't touch other network configs) is ensure your RDS instance's security group allows access from your lambda's security group. So you could put both in the default security group, or put them in separate groups but make sure the RDS one has an inbound rule allowing the lambda one.
Note that if your lambda also needs to call external services (since you mention querying an API), in order to enable that, after linking it to your VPC you'll also need to create a NAT Gateway like I described here: https://stackoverflow.com/a/61273118/299754

Obtain EC2 instance IP from AWS lambda function and use requests library

I am writing a lambda function on Amazon AWS Lambda. It accesses the URL of an EC2 instance, on which I am running a web REST API. The lambda function is triggered by Alexa and is coded in the Python language (python3.x).
Currently, I have hard coded the URL of the EC2 instance in the lambda function and successfully ran the Alexa skill.
I want the lambda function to automatically obtain the IP from the EC2 instance, which keeps changing whenever I start the instance. This would ensure that I don't have to go the code and hard code the URL each time I start the EC2 instance.
I stumbled upon a similar question on SO, but it was unanswered. However, there was a reply which indicated updating IAM roles. I have already created IAM roles for other purposes before, but I am still not used to it.
Is this possible? Will it require managing of security groups of the EC2 instance?
Do I need to set some permissions/configurations/settings? How can the lambda code achieve this?
Additionally, I pip installed the requests library on my system, and I tried uploading a '.zip' file with the structure :
REST.zip/
requests library folder
index.py
I am currently using the urllib library
When I use zip files for my code upload (I currently edit code inline), it can't even accesse index.py file to run the code
You could do it using boto3, but I would advise against that architecture. A better approach would be to use a load balancer (even if you only have one instance), and then use the CNAME record of the load balancer in your application (this will not change for as long as the LB exists).
An even better way, if you have access to your own domain name, would be to create a CNAME record and point it to the address of the load balancer. Then you can happily use the DNS name in your Lambda function without fear that it would ever change.

Python AWS lambda function connecting to RDS mysql: Timeout error

When the python lambda function is executed I get "Task timed out after 3.00 seconds" error. I am trying the same example function.
When I try to run the same code from eclipse it works fine and I can see the query result. Same way I can connect to the db instance from local-machine Mysql workbench without any issues.
I tried creating a role with with full administrator access policy for this lambda function and even then its not working fine. The db instance has a vpc and I just added my local ip address there using the edit CIDR option so I can access the instance through my local machine workbench. For VPC, subnet and security group parameter in lambda function I gave the same values as I have in the RDS db instance.
I have also increased the timeout for lambda function and still I see the timeout error.
Any input would be appreciated.
For VPC, subnet and security group parameter in lambda function I gave the same values as I have in the RDS db instance.
Security groups don't automatically trust their own members to access other members.
Add a rule to this security group for "MySQL" (TCP port 3306) but instead of specifying an IP address, start typing s g into the box and select the id of the security group that you are adding the rule to -- so that the group is self-referential.
Note that this is probably not the correct long-term fix, because if your Lambda function needs to access the Internet or most AWS services, the Lambda function needs to be on a private subnet behind a NAT device. That does not describe the configuration of the subnet where your RDS instance is currently configured, because you mentioned adding your local IP to allow access to RDS. That suggests your RDS is on a public subnet.
See also Why Do We Need Private Subnets in VPC for a better understanding of public vs. private subnets.

AWS Lambda sending HTTP request

This is likely a question with an easy answer, but i can't seem to figure it out.
Background: I have a python Lambda function to pick up changes in a DB, then using HTTP post the changes in json to a URL. I'm using urllib2 sort of like this:
# this runs inside a loop, in reality my error handling is much better
request = urllib2.Request(url)
request.add_header('Content-type', 'application/json')
try:
response = urllib2.urlopen(request, json_message)
except:
response = "Failed!"
It seems from the logs either the call to send the messages is skipped entirely, or times-out while waiting for a response.
Is there a permission setting I'm missing, the outbound rules in AWS appear to be right. [Edit] - The VPC applied to this lambda does have internet access, and the security groups applied appear to allow internet access. [/Edit]
I've tested the code locally (connected to the same data source) and it works flawlessly.
It appears the other questions related to posting from a lambda is related to node.js, and usually because the url is wrong. In this case, I'm using a requestb.in url, that i know is working as it works when running locally.
Edit:
I've setup my NAT gateway, and it should work, I've even gone as far as going to a different AWS account, re-creating the conditions, and it works fine. I can't see any Security Groups that would be blocking access anywhere. It's continuing to time-out.
Edit:
Turns out i was just an idiot when i setup my default route to the NAT Gateway, out of habit i wrote 0.0.0.0/24 instead of 0.0.0.0/0
If you've deployed your Lambda function inside your VPC, it does not obtain a public IP address, even if it's deployed into a subnet with a route to an Internet Gateway. It only obtains a private IP address, and thus can not communicate to the public Internet by itself.
To communicate to the public Internet, Lambda functions deployed inside your VPC need to be done so in a private subnet which has a route to either a NAT Gateway or a self-managed NAT instance.
I have also faced the same issue. I overcame it by using boto3 to invoke a lambda from another lambda.
import boto3
client = boto3.client('lambda')
response = client.invoke(
FunctionName='string',
InvocationType='Event'|'RequestResponse'|'DryRun',
LogType='None'|'Tail',
ClientContext='string',
Payload=b'bytes'|file,
Qualifier='string'
)
But make sure that you set the IAM policy for lambda role (in the Source AWS account) to invoke that another lambda.
Adding to the above, boto3 uses HTTP at the backend.

Categories

Resources