Connecting to DocumentDB from AWS Lambda using Python

I am trying to connect to DocumentDB from a Lambda function.
I have configured my DocumentDB cluster as per this tutorial and can access it through the Cloud9 command prompt.
The DocumentDB cluster is part of two security groups. The first is called demoDocDB; the second is called default and is the VPC's default security group.
The inbound rules for demoDocDB allow requests from the Cloud9 instance to port 27017, where my DocumentDB database is running.
The inbound rules for the default security group specify all traffic, all port ranges, and a source of itself. The VPC is the default VPC.
In Lambda, when editing the VPC details, I entered:
VPC - The default VPC
Subnets - All 3 available subnets
Security Groups - The default security group for the VPC
The function has succeeded twice in writing to the database; the rest of the time it has timed out. The timeout on the Lambda function is 2 minutes, but it throws a timeout error before reaching that.
[ERROR] ServerSelectionTimeoutError: MY_DATABASE_URL:27017: [Errno -2] Name or service not known
The snippet below is the code being executed. The function never reaches print("INSERTED DATA"); it times out during the insert statement.
import pymongo

def getDBConnection():
    client = pymongo.MongoClient(***MY_URL***)
    ## Specify the database to be used
    db = client.test
    print("GOT CONNECTION", db)
    ## Specify the collection to be used
    col = db.myTestCollection
    print("GOT COL", col)
    ## Insert a single document
    col.insert_one({'hello': 'Amazon DocumentDB'})
    print("INSERTED DATA")
    ## Find the document that was previously written
    x = col.find_one({'hello': 'Amazon DocumentDB'})
    ## Print the result to the screen
    print("RETRIEVED DATA", x)
    ## Close the connection
    client.close()
I have tried changing the version of pymongo, as this thread suggested, but it did not help.

Make sure your Lambda function is not in a public subnet; otherwise it will not work. That means going back to the Lambda console and removing the public subnet from the function's VPC settings.
1. Make sure you have a security group specifically for your Lambda function, with an outbound rule as follows:
Lambda Security Group Outbound Rule:
Type         Protocol  Port Range  Destination
All Traffic  All       All         0.0.0.0/0
You can also restrict this if you'd like, for example to HTTP/HTTPS on ports 80/443, as long as TCP 27017 to the DocumentDB cluster stays allowed.
2. Check the security group of your DocumentDB cluster to see if it is set up with an inbound rule as follows:
Type        Protocol  Port Range  Source
Custom TCP  TCP       27017       Lambda Security Group
3. Your Lambda function needs to have the correct permissions, which are:
Managed policy AWSLambdaBasicExecutionRole
Managed policy AWSLambdaVPCAccessExecutionRole
After doing this, the VPC section of your Lambda function should look something like this:
1. VPC - The default VPC
2. Subnets - 2 subnets (both private)
3. Security Groups - The security group for your Lambda function, not the default security group
And that should do it for you. Let me know if it does not work though and I'll try and help you troubleshoot.
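The inbound rule from step 2 can also be expressed programmatically. The following is a minimal sketch, assuming boto3 is used to manage the security groups; the helper name docdb_ingress_rule and both security group IDs are hypothetical placeholders, and the boto3 call itself is left commented out so the snippet runs without AWS credentials.

```python
def docdb_ingress_rule(lambda_sg_id, port=27017):
    """Build one IpPermissions entry allowing TCP `port` from another security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # Referencing a security group instead of a CIDR range means
        # "anything that is a member of this group".
        "UserIdGroupPairs": [{"GroupId": lambda_sg_id}],
    }


# Hypothetical IDs; replace them with your own.
LAMBDA_SG_ID = "sg-0123456789abcdef0"
DOCDB_SG_ID = "sg-0fedcba9876543210"

rule = docdb_ingress_rule(LAMBDA_SG_ID)
print(rule)

# With boto3 installed and credentials configured, the rule could be applied like this:
# import boto3
# boto3.client("ec2").authorize_security_group_ingress(
#     GroupId=DOCDB_SG_ID, IpPermissions=[rule]
# )
```

Using the Lambda security group as the source, rather than an IP range, keeps the rule valid even though the function's network interfaces get different private addresses over time.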

Related

Can't write to Database with AWS Lambda

I am trying to write files to a postgres database with AWS Lambda but I am facing an error:
Calling the invoke API action failed with this message: Network Error
My code looks like this:
from sqlalchemy import create_engine
import pandas as pd

def test(event=None, context=None):
    # Connection URL format: postgresql://user:password@host:port/database
    conn = create_engine('postgresql://user:password@url:5439/database')
    df = pd.DataFrame([{'A': 'foo', 'B': 'green', 'C': 11}, {'A': 'bar', 'B': 'blue', 'C': 20}])
    df.to_sql('your_table', conn, index=False, if_exists='replace', schema='schema')

test()
Resources:
Memory - 1280MB
Timeout - 2 minutes
What is the problem here, and how else could I write a pandas DataFrame to a database with AWS Lambda?
I'm assuming the Postgres instance is in RDS.
Is your lambda in your VPC? You can check this on the function's page in admin console, in the VPC box. By default it's not and the VPC box says "None".
Case 1: Lambda is not in VPC
Then the issue might be that the security group associated with your RDS instance does not allow connections from outside the VPC. That's the default if you didn't touch the security group. Find the security group for your RDS instance from the RDS admin, then check out the "Inbound rules" for that security group. Lambdas outside a VPC don't have a fixed IP, so you'll need to add an inbound rule allowing at least postgres traffic for source "0.0.0.0/0", i.e. the entire internet.
This should be sufficient but note that this is not considered very good for security, since anyone can now in theory reach your DB (and worse if they can guess the password). But depending on your project that might not be a problem for you. If that is an issue for you, you could instead associate your lambda with the same VPC the RDS instance is in, in order to provide better networking security, and move to Case 2.
Case 2: Lambda is in a VPC
I'm assuming you put the lambda in the same VPC as the RDS instance for simplicity - if not you probably know what you're doing.
All you need to do now (providing you didn't touch other network configs) is ensure your RDS instance's security group allows access from your lambda's security group. So you could put both in the default security group, or put them in separate groups but make sure the RDS one has an inbound rule allowing the lambda one.
Note that if your lambda also needs to call external services (since you mention querying an API), in order to enable that, after linking it to your VPC you'll also need to create a NAT Gateway like I described here: https://stackoverflow.com/a/61273118/299754

Connect AWS RDS (psql) in AWS Lambda

I wrote a simple Lambda function in Python to fetch some data from AWS RDS. PostgreSQL is the database engine.
conn = psycopg2.connect(host=hostname, user=username, password=password, dbname=db_name, connect_timeout=50)
I did it like this, but it didn't work. It always returns an error like this:
Response:
{
  "errorMessage": "2018-06-06T11:28:53.775Z Task timed out after 3.00 seconds"
}
How can I resolve this?
It is most probably timing out because the network connection cannot be established.
If you wish to connect to the database via a public IP address, then your Lambda function should not be connected to the VPC. Instead, the connection will go from Lambda, via the internet, into the VPC and to the Amazon RDS instance.
If you wish to connect to the database via a private IP address, then your Lambda function should be configured to use the same VPC as the Amazon RDS instance.
In both cases, the connection should be established using the DNS Name of the RDS instance, but it will resolve differently inside and outside of the VPC.
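To see which way a name resolves from a given environment, a small check along these lines can be run both inside the Lambda function and from a machine outside the VPC. It is only a sketch using the standard library; the function name resolve_and_classify is made up for illustration, and the literal IP addresses in the demo stand in for a real RDS endpoint name so the snippet runs without network access.

```python
import ipaddress
import socket


def resolve_and_classify(hostname):
    """Return sorted (address, 'private' | 'public') pairs for the IPv4 addresses of a host."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, proto=socket.IPPROTO_TCP
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    addresses = {info[4][0] for info in infos}
    return sorted(
        (addr, "private" if ipaddress.ip_address(addr).is_private else "public")
        for addr in addresses
    )


# In practice, pass the RDS endpoint's DNS name here instead of a literal address.
print(resolve_and_classify("10.0.1.5"))  # [('10.0.1.5', 'private')]
print(resolve_and_classify("8.8.8.8"))   # [('8.8.8.8', 'public')]
```

If the name cannot be resolved at all, socket.getaddrinfo raises socket.gaierror, which points to a DNS problem rather than a blocked port.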
Finally, the Security Group associated with the Amazon RDS instance needs to allow the incoming connection. This, too, will vary depending upon whether the request is coming from public or private space. You can test by opening the security group to 0.0.0.0/0 and, if it works, then try to restrict it to the minimum possible range.
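One more detail from the question: 3 seconds is Lambda's default function timeout, which is shorter than the connect_timeout=50 passed to psycopg2, so the invocation is killed before the driver can report a connection error. A rough sketch of deriving the connect timeout from the time the invocation has left is shown below; connect_timeout_seconds and FakeContext are hypothetical names, while get_remaining_time_in_millis() is the method the Lambda context object provides.

```python
def connect_timeout_seconds(context, reserve_ms=500, cap_seconds=10):
    """Pick a DB connect timeout that expires before the Lambda invocation does."""
    remaining_ms = context.get_remaining_time_in_millis()
    budget = (remaining_ms - reserve_ms) / 1000.0
    # Never return 0: libpq treats a connect_timeout of zero as "wait indefinitely".
    return max(1, min(cap_seconds, int(budget)))


class FakeContext:
    """Stand-in for the Lambda context object, for local experimentation only."""

    def __init__(self, remaining_ms):
        self._remaining_ms = remaining_ms

    def get_remaining_time_in_millis(self):
        return self._remaining_ms


print(connect_timeout_seconds(FakeContext(3000)))    # 2, with the default 3 s timeout
print(connect_timeout_seconds(FakeContext(120000)))  # 10, capped for a 2 minute timeout

# Inside a handler this could be used as:
# conn = psycopg2.connect(host=hostname, user=username, password=password,
#                         dbname=db_name,
#                         connect_timeout=connect_timeout_seconds(context))
```

With a timeout chosen this way, a blocked connection surfaces as a psycopg2 exception in the logs instead of a bare "Task timed out" message.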

AWS Lambda Function cannot access other services

I have a problem with an AWS Lambda Function which depends upon DynamoDB and SQS to function properly. When I try to run the lambda stack, they time out when trying to connect to the SQS service. The AWS Lambda Function lies inside a VPC with the following setup:
A VPC with four subnets
Two subnets are public, routing their 0.0.0.0/16 traffic to an internet gateway
A MySQL server sits in a public subnet
The other two contain the lambdas and route their 0.0.0.0/16 traffic to a NAT which lives in one of the public subnets.
All route tables have a 10.0.0.0/16 to local rule (is this the problem because Lambdas use private IPs inside a VPC?)
The main route table is the one with the NAT, but I explicitly associated the public subnets with the internet gateway route table
The lambdas and the mysql server share a security group which allows for inbound internal access (10.x/16) as well as unrestricted outbound traffic (0.0.0.0/16).
Traffic between lambdas and the mysql instance is no problem (except if I put the lambdas outside the VPC, then they can't access the server even if I open up all ports). Assume the code for the lambdas is also correct, as it worked before I tried to mask it in a private net. Also the lambda execution roles have been set accordingly (or do they need adjustments after moving them to a private net?).
Adding a dynamodb endpoint solved the problems with the database, but there are no VPC endpoints available for some of the other services. Following some answers I found here, here, here and in the announcements / tutorials here and here, I am pretty sure I followed all the recommended steps.
I would be very thankful and glad for any hints where to check next, as I have currently no idea what could be the problem here.
EDIT: The functions don't seem to have any internet access at all, since a toy example I checked also timed out:
import urllib.request

def lambda_handler(event, context):
    test = urllib.request.urlopen(url="http://www.google.de")
    return test.status
Of course the problem was sitting in front of the monitor again. Instead of routing 0.0.0.0/0 (any traffic) to the internet gateway, I had specified 0.0.0.0/16 (only traffic destined for 0.0.x.x addresses). Since no machines with such IPs exist, all traffic was blocked from entering or leaving the VPC.
@John Rotenstein: Thanks for the hint about lambdash, though. It seems like a very helpful tool.
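The difference between the two prefixes is easy to check with Python's standard ipaddress module. This is just an illustration; the target address is an arbitrary public IP chosen for the example.

```python
import ipaddress

narrow = ipaddress.ip_network("0.0.0.0/16")   # the prefix that was configured by mistake
default = ipaddress.ip_network("0.0.0.0/0")   # the intended default route

target = ipaddress.ip_address("142.250.0.1")  # arbitrary public destination for illustration

print(narrow.num_addresses)   # 65536
print(default.num_addresses)  # 4294967296
print(target in narrow)       # False
print(target in default)      # True
```

A route only applies to packets whose destination falls inside its prefix, so the /16 entry never matched any real internet destination.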
Your configuration sounds correct.
You should test the configuration to see whether you can access any public Internet sites, then test connecting to AWS.
You could either write a Lambda function that attempts such connections, or you could use lambdash, which effectively gives you a remote shell running on Lambda. This way, you can easily test connectivity from the command line, for example with curl.
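A Lambda function that attempts such connections might look like the sketch below. It uses only the standard library; the helper name can_connect, the "targets" event key, and the default target list (including the regional SQS hostname) are assumptions made for this example, not something from the question.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Hypothetical default targets: one public site and one AWS service endpoint.
TARGETS = [
    ("www.google.de", 80),
    ("sqs.eu-central-1.amazonaws.com", 443),
]


def lambda_handler(event, context):
    # Allow the test event to override the targets, e.g. {"targets": [["host", 443]]}.
    targets = (event or {}).get("targets", TARGETS)
    return {f"{host}:{port}": can_connect(host, port) for host, port in targets}
```

The short per-connection timeout keeps the whole check well under the function timeout, so the result comes back as a readable map of reachable and unreachable endpoints rather than a task timeout.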

Python AWS lambda function connecting to RDS mysql: Timeout error

When the Python Lambda function is executed I get a "Task timed out after 3.00 seconds" error. I am trying the same example function.
When I run the same code from Eclipse it works fine and I can see the query result. In the same way, I can connect to the DB instance from MySQL Workbench on my local machine without any issues.
I tried creating a role with a full administrator access policy for this Lambda function, and even then it's not working. The DB instance has a VPC, and I just added my local IP address there using the edit CIDR option so I can access the instance through Workbench on my local machine. For the VPC, subnet and security group parameters in the Lambda function I gave the same values as I have in the RDS DB instance.
I have also increased the timeout for lambda function and still I see the timeout error.
Any input would be appreciated.
"For VPC, subnet and security group parameter in lambda function I gave the same values as I have in the RDS db instance."
Security groups don't automatically trust their own members to access other members.
Add a rule to this security group for "MySQL" (TCP port 3306), but instead of specifying an IP address, start typing sg into the box and select the ID of the security group that you are adding the rule to, so that the group is self-referential.
Note that this is probably not the correct long-term fix, because if your Lambda function needs to access the Internet or most AWS services, the Lambda function needs to be on a private subnet behind a NAT device. That does not describe the configuration of the subnet where your RDS instance is currently configured, because you mentioned adding your local IP to allow access to RDS. That suggests your RDS is on a public subnet.
See also Why Do We Need Private Subnets in VPC for a better understanding of public vs. private subnets.

Boto create launch configuration in different VPC with fabric and boto

I keep getting this error returned from my boto create_launch_configuration() cmd wrapped in a fabric task.
This is the cmd:
if user_data != '':
    security_groups=list('sg-d73fc5b2')
    print "Trying to use this AMI [%s]" % image_ami
    lc = LaunchConfiguration(
        name=launch_config_name,
        image_id=image_ami,
        key_name=env.aws_key_name,
        security_groups=security_groups,
        instance_type=instance_type
    )
    launch_config = autoscale_conn.create_launch_configuration(lc)
and this is the response
<ErrorResponse xmlns="http://autoscaling.amazonaws.com/doc/2011-01-01/">
  <Error>
    <Type>Sender</Type>
    <Code>ValidationError</Code>
    <Message>No default VPC for this user</Message>
  </Error>
  <RequestId>4371fa63-e008-11e3-8554-ff532bce5053</RequestId>
</ErrorResponse>
We disabled the default VPC in order to minimise mistakes being applied to a VPC via API calls. We have several VPCs running in the same account, and it would be useful to be able to specify the VPC via boto.
Has anyone any idea how I can set this default VPC on a per task basis?
As stated here, you should specify a subnet when creating an auto-scaling group. And although it is not stated that you need a default VPC to create a launch configuration, that is what I would conclude from reading this, particularly these lines:
If your AWS account comes with a default VPC and if you want to create your Auto Scaling group in default VPC, follow the instructions in ...
So you just need to create the auto-scaling group in the desired subnet and use your launch configuration for this group.
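As a rough illustration of tying the group to specific subnets, the sketch below builds the parameters for an auto-scaling group. It uses boto3 parameter names rather than the boto 2 classes from the question, which is a substitution on my part; the helper name auto_scaling_group_params, the group name, the launch configuration name, and the subnet IDs are all hypothetical, and the API call is commented out so the snippet runs without credentials.

```python
def auto_scaling_group_params(name, launch_config_name, subnet_ids, min_size=1, max_size=1):
    """Build keyword arguments for creating an auto-scaling group in specific subnets."""
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_config_name,
        "MinSize": min_size,
        "MaxSize": max_size,
        # The subnets determine the VPC; the API expects a comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


# Hypothetical names and subnet IDs; replace them with your own.
params = auto_scaling_group_params(
    "my-asg", "my-launch-config", ["subnet-0aaa1111", "subnet-0bbb2222"]
)
print(params["VPCZoneIdentifier"])  # subnet-0aaa1111,subnet-0bbb2222

# With boto3 installed and credentials configured:
# import boto3
# boto3.client("autoscaling").create_auto_scaling_group(**params)
```

Because the VPC is implied by the subnets, nothing here relies on the account having a default VPC.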
