Connect to Redshift from AWS Lambda - Python

I'm trying to connect to my Redshift database from my AWS Lambda function:
import psycopg2

con = psycopg2.connect(
    dbname="my_dbname",
    host="my_url",
    port=5439,
    user="username",
    password="my_password")
cur = con.cursor()
But I can't access my database; my function raises the following error:
OperationalError: could not connect to server: Connection timed out
Is the server running on host "my_url" (54.217.83.88) and accepting
TCP/IP connections on port 5439?
Can I get some help with this, please? (If possible, I'd like a very detailed answer, because I'm new to AWS.)
PS: I know that I have to configure the VPC, but I don't know exactly what to do.
Thank you in advance

Your goal is to have the AWS Lambda function communicate with the Amazon Redshift cluster via a private IP address, keeping all traffic within the VPC.
The AWS Lambda function will need to be configured to connect to a private subnet in the same VPC as the Amazon Redshift cluster.
See: Configuring a Lambda Function to Access Resources in an Amazon VPC
Either connect to the Private IP address of the cluster or (preferably) follow the directions on Managing Clusters in an Amazon Virtual Private Cloud (VPC) to enable DNS Hostnames and DNS Resolution on the VPC so that the host name will automatically resolve to the Private IP address.
The Security Group associated with the Amazon Redshift cluster will need to permit inbound traffic on port 5439 from the CIDR range of the VPC (or as appropriate).
See: Amazon Redshift Cluster Security Groups
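As a rough sketch, that inbound rule can also be added with boto3; the security group ID and CIDR below are placeholders for your own values:

import boto3

ec2 = boto3.client("ec2")

# Allow inbound Redshift traffic (port 5439) from the VPC's CIDR range.
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",  # security group attached to the Redshift cluster (placeholder)
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 5439,
        "ToPort": 5439,
        "IpRanges": [{"CidrIp": "10.0.0.0/16", "Description": "VPC CIDR"}],
    }],
)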

Related

Connecting to DocumentDB from AWS Lambda using Python

I am trying to connect to DocumentDB from a Lambda function.
I have configured my DocumentDB cluster as per this tutorial and can access it through the Cloud9 command prompt.
The DocumentDB cluster is part of two security groups: the first is called demoDocDB, and the second, called default, is the VPC's default security group.
The inbound rules for demoDocDB forward requests from the Cloud9 instance to port 27017, where my DocumentDB database is running.
The inbound rules for the default security group allow all traffic on all port ranges, with the group itself as the source. The VPC is the default VPC.
In lambda when editing the VPC details, I have inputted:
VPC - The defualt VPC
Subnets - Chosen all 3 subnets available
Security Groups - The default security group for VPC
The function has successfully written to the database twice; the rest of the time it times out. The timeout on the Lambda function is set to 2 minutes, but it throws a timeout error before reaching that.
[ERROR] ServerSelectionTimeoutError: MY_DATABASE_URL:27017: [Errno -2] Name or service not known
The snippet of code below is what is being executed; the function never reaches print("INSERTED DATA") because it times out during the insert statement.
import pymongo

def getDBConnection():
    client = pymongo.MongoClient(***MY_URL***)

    # Specify the database to be used
    db = client.test
    print("GOT CONNECTION", db)

    # Specify the collection to be used
    col = db.myTestCollection
    print("GOT COL", col)

    # Insert a single document
    col.insert_one({'hello': 'Amazon DocumentDB'})
    print("INSERTED DATA")

    # Find the document that was previously written
    x = col.find_one({'hello': 'Amazon DocumentDB'})

    # Print the result to the screen
    print("RETRIEVED DATA", x)

    # Close the connection
    client.close()
I have tried changing the version of pymongo, as this thread suggested, but it did not help.
Make sure your Lambda function is not in a public subnet, otherwise it will not work. That means you need to go back to the Lambda console and remove the public subnet from the VPC section.
1. Make sure you have a Security Group specifically for your Lambda function, with an outbound rule as follows:
   Type          Protocol   Port Range   Destination
   All Traffic   All        All          0.0.0.0/0
   You can also restrict this to HTTP/HTTPS on ports 80/443 if you'd like.
2. Check the Security Group of your DocumentDB cluster to see that it is set up with an inbound rule as follows:
   Type         Protocol   Port Range   Source
   Custom TCP   TCP        27017        Lambda Security Group
3. Your Lambda function needs to have the correct permissions; those are:
   Managed policy AWSLambdaBasicExecutionRole
   Managed policy AWSLambdaVPCAccessExecutionRole
After doing this your VPC section should look something like this:
1. VPC - The default VPC
2. Subnets - Chosen 2 subnets (Both Private)
3. Security Group for your Lambda function. Not the default security group
And that should do it for you. Let me know if it does not work though and I'll try and help you troubleshoot.
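While testing, it also helps to make connectivity failures surface quickly rather than hanging until the Lambda timeout. A minimal sketch using pymongo's server-selection timeout; the connection string is a placeholder:

import pymongo

# Fail after 5 seconds instead of hanging for the full Lambda timeout.
client = pymongo.MongoClient(
    "mongodb://user:password@docdb-cluster-endpoint:27017/",  # placeholder URL
    serverSelectionTimeoutMS=5000,
)
client.admin.command("ping")  # raises ServerSelectionTimeoutError if the cluster is unreachable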

sshtunnel forwarding and mysql connection on google cloud function

I am trying to connect to a MySQL server through an SSH tunnel in one of my Google Cloud Functions. This works fine in my home environment, so I assume it is some port issue on Cloud Functions.
Edit: For clarification the MySQL server sits on a Namecheap shared hosting web server. Not Google Cloud SQL
Every time I run this, it times out with an "unknown error". The tunnel appears to be successful; however, I am unable to get the MySQL connection to work.
import base64
import sshtunnel
import mysql.connector

def testing(event, context):
    """
    Testing function
    """
    with sshtunnel.SSHTunnelForwarder(
        ("server address", port),
        ssh_username="user",
        ssh_password="password",
        remote_bind_address=("127.0.0.1", 3306),
    ) as server:
        print(server.local_bind_port)
        with mysql.connector.connect(
            user="user",
            password="password",
            host="localhost",
            database="database",
            port=server.local_bind_port,
        ) as connection:
            print(connection)
There are too many steps to list, but I'm wondering if the "connector" setup makes a difference even for SSH. Maybe you have to create a connector as shown here (notice how the instructions in the "Private IP" tab differ from those for your local computer), then configure Cloud Functions to use that connector. Make sure you also use the right port.
A Serverless VPC Access connector handles communication to your VPC network. To connect directly with private IP, you need to:
1. Make sure that the Cloud SQL instance created above has a private IP address. If you need to add one, see the Configuring private IP page for instructions.
2. Create a Serverless VPC Access connector in the same VPC network as your Cloud SQL instance. Unless you're using Shared VPC, a connector must be in the same project and region as the resource that uses it, but the connector can send traffic to resources in different regions. Serverless VPC Access supports communication to VPC networks connected via Cloud VPN and VPC Network Peering. Serverless VPC Access does not support legacy networks.
3. Configure Cloud Functions to use the connector.
4. Connect using your instance's private IP and port 3306.
Keep in mind, this "unknown" error could also very well be due to the Cloud SQL Admin API not being enabled here. As a matter of fact, make sure you follow that entire page as it's a broad question.
Let us know what worked for this type of error.
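For illustration, if the database were a Cloud SQL instance reachable through a Serverless VPC Access connector, the function could skip the SSH tunnel and connect straight to the private IP. A minimal sketch, with placeholder values:

import mysql.connector

# With a Serverless VPC Access connector attached to the function,
# the instance's private IP is directly routable from the function.
connection = mysql.connector.connect(
    host="10.0.0.5",       # instance private IP (placeholder)
    port=3306,
    user="user",
    password="password",
    database="database",
)
print(connection)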

Config for using psycopg2 to access to Cloud SQL instance

I am using PostgreSQL on Cloud SQL, along with the psycopg2 library, to connect to the database from my Python code. The instance I currently have is associated with a VPC network, and my Google Compute Engine VMs are also in that VPC. So, in this case, my config to connect to this Cloud SQL instance can use the private IP, and it looks like this:
config['db-cloudsql'] = {
    "host": "10.x.x.x",  # Cloud SQL private IP address
    "user": "postgres",
    "password": "xxxxx",
    "database": "postgres",
}
But now I have another VM instance in a different VPC network that needs to access this Cloud SQL instance. I know that I can access the Cloud SQL instance using its public IP (by adding my VM to the authorized networks), but this VM needs to access the instance very often, and I am not sure whether the cost of access via public IP will be higher than via private IP (I cannot find any related documentation about this). I tried peering the two VPC networks to get private IP access, but found that I cannot use this method to connect to Cloud SQL with a private IP, per this document.
I have found from the documentation that I can use the instance connection name as the host in my config, so it should be something like:
config['db-cloudsql'] = {
    "host": "project-name:asia-southeast1:mydbname",  # instance connection name
    "user": "postgres",
    "password": "xxxxx",
    "database": "postgres",
}
I haven't tried this method yet and it might not work, but if it somehow does, how will it differ from using the public IP address in terms of cost?
Thank you
It seems like your question boils down into two parts:
Are there any alternatives to public IP or private IP?
No. You have to use one or the other to connect to your Cloud SQL instance. Private IP allows access from a VPC, Public IP is used pretty much everywhere else.
Cost of public IP vs private IP
You can find a breakdown of the costs here. In short, there is not really any extra charges with a Public IP. You do have to pay $.01 per hour while the instance is idle (to reserve the public IP address), and just like private IPs, you are responsible for costs of the Network Egress between regions.
I can use instance connection name as a host for my config
This is incorrect. If you are using the Cloud SQL Proxy to connect, it can create a Unix domain socket (at /cloudsql/INSTANCE_CONNECTION_NAME) that can be used to connect to your instance. However the proxy only authenticates your connection - it still needs a valid connection path (Public vs Private).
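For illustration, connecting through the proxy's Unix socket with psycopg2 looks roughly like this; the instance connection name is a placeholder, and the proxy must already be running:

import psycopg2

# psycopg2 (libpq) treats a host beginning with "/" as a Unix socket directory.
conn = psycopg2.connect(
    host="/cloudsql/project-name:asia-southeast1:mydbname",  # placeholder connection name
    user="postgres",
    password="xxxxx",
    dbname="postgres",
)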
As you can see in this PostgreSQL pricing documentation, there is no difference in pricing whether you use the public IP or the connection name on the Cloud SQL side. If you want to keep costs as low as possible, try to keep your Compute Engine instances in the same region, or at least on the same continent, as your Cloud SQL instance.

Connect AWS RDS (psql) in AWS Lambda

I wrote a simple Lambda function in Python to fetch some data from AWS RDS. PostgreSQL is the database engine.
import psycopg2

conn = psycopg2.connect(host=hostname, user=username, password=password,
                        dbname=db_name, connect_timeout=50)
I did it like this, but it didn't work. It always returns an error like this:
Response:
{
    "errorMessage": "2018-06-06T11:28:53.775Z Task timed out after 3.00 seconds"
}
How can I resolve this?
It is most probably timing out because the network connection cannot be established.
If you wish to connect to the database via a public IP address, then your Lambda function should not be connected to the VPC. Instead, the connection will go from Lambda, via the internet, into the VPC and to the Amazon RDS instance.
If you wish to connect to the database via a private IP address, then your Lambda function should be configured to use the same VPC as the Amazon RDS instance.
In both cases, the connection should be established using the DNS Name of the RDS instance, but it will resolve differently inside and outside of the VPC.
Finally, the Security Group associated with the Amazon RDS instance needs to allow the incoming connection. This, too, will vary depending upon whether the request is coming from public or private space. You can test by opening the security group to 0.0.0.0/0 and, if it works, then try to restrict it to the minimum possible range.
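As a sketch of connecting via the DNS name, with a placeholder RDS endpoint; a short connect_timeout makes a blocked connection fail visibly instead of hanging:

import psycopg2

conn = psycopg2.connect(
    host="mydb.abcdefghij.us-east-1.rds.amazonaws.com",  # RDS endpoint DNS name (placeholder)
    user="username",
    password="password",
    dbname="db_name",
    connect_timeout=10,  # fail fast rather than hanging until the Lambda timeout
)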

Python AWS lambda function connecting to RDS mysql: Timeout error

When the Python Lambda function is executed, I get a "Task timed out after 3.00 seconds" error. I am trying the same example function.
When I run the same code from Eclipse, it works fine and I can see the query result. Likewise, I can connect to the DB instance from MySQL Workbench on my local machine without any issues.
I tried creating a role with the full administrator access policy for this Lambda function, and even then it's not working. The DB instance has a VPC, and I added my local IP address there using the edit CIDR option so I can access the instance through Workbench on my local machine. For the VPC, subnet, and security group parameters in the Lambda function, I gave the same values as in the RDS DB instance.
I have also increased the timeout for the Lambda function, and I still see the timeout error.
Any input would be appreciated.
For VPC, subnet and security group parameter in lambda function I gave the same values as I have in the RDS db instance.
Security groups don't automatically trust their own members to access other members.
Add a rule to this security group for "MySQL" (TCP port 3306), but instead of specifying an IP address, start typing sg into the box and select the ID of the security group that you are adding the rule to -- so that the group is self-referential.
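A sketch of that self-referential rule with boto3; the group ID is a placeholder for the group shared by your Lambda function and RDS instance:

import boto3

ec2 = boto3.client("ec2")

GROUP_ID = "sg-0123456789abcdef0"  # placeholder security group ID

# Allow MySQL traffic from members of the group to other members of the same group.
ec2.authorize_security_group_ingress(
    GroupId=GROUP_ID,
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 3306,
        "ToPort": 3306,
        "UserIdGroupPairs": [{"GroupId": GROUP_ID}],  # source = the group itself
    }],
)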
Note that this is probably not the correct long-term fix, because if your Lambda function needs to access the Internet or most AWS services, the Lambda function needs to be on a private subnet behind a NAT device. That does not describe the configuration of the subnet where your RDS instance is currently configured, because you mentioned adding your local IP to allow access to RDS. That suggests your RDS is on a public subnet.
See also Why Do We Need Private Subnets in VPC for a better understanding of public vs. private subnets.
