Unable to connect to AWS Redshift from Python within Lambda

I am trying to connect to Redshift with Python through Lambda, in order to run queries against the Redshift database.
I've tried getting temporary AWS credentials and connecting with psycopg2, but the connection never succeeds and produces no error messages (i.e., the Lambda just times out).
import boto3
import psycopg2

rs_host = "mytest-cluster.fooooooobaaarrrr.region111111.redshift.amazonaws.com"
rs_port = 5439
rs_dbname = "dev"
db_user = "barrr_user"

client = boto3.client('redshift')

def lambda_handler(events, contx):
    # The cluster_creds are obtained successfully. No issues here.
    cluster_creds = client.get_cluster_credentials(DbUser=db_user,
                                                   DbName=rs_dbname,
                                                   ClusterIdentifier="mytest-cluster",
                                                   AutoCreate=False)
    try:
        # It is this psycopg2 connection that can't work...
        conn = psycopg2.connect(host=rs_host,
                                port=rs_port,
                                user=cluster_creds['DbUser'],
                                password=cluster_creds['DbPassword'],
                                database=rs_dbname)
        return conn
    except Exception as e:
        print(e)
Also, the Lambda execution role itself has the relevant policies attached.
I am not sure why I am still unable to connect to Redshift via Python to perform queries.
I have also tried the sqlalchemy library, but had no luck there either.

As Johnathan Jacobson mentioned above, it was the security groups and network permissions that caused my problem.

You can review the documentation at Create AWS Lambda Function to Connect Amazon Redshift with C-Sharp in Visual Studio.
Since your code is already in Python, you can concentrate on the networking part of the tutorial.
When launching an AWS Lambda function, it is possible to select the VPC and subnets where the serverless function will spin up.
Choose exactly the same VPC and subnet(s) where you created your Amazon Redshift cluster.
Also, revise the IAM role you have attached to the AWS Lambda function: it additionally requires the AWSLambdaVPCAccessExecutionRole policy.
This will resolve connection issues between different VPCs.
Even if you have launched the Lambda function in the same VPC and subnet as the Redshift cluster, it is still worth checking the cluster's security group to make sure it accepts connections.
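As a rough illustration, an existing function can be pointed at the cluster's VPC with boto3; the function name, subnet ID, and security group ID below are placeholders:

import boto3

lambda_client = boto3.client('lambda')

# Attach the function to the same VPC/subnets as the Redshift cluster.
lambda_client.update_function_configuration(
    FunctionName='my-redshift-function',    # placeholder
    VpcConfig={
        'SubnetIds': ['subnet-aaaa1111'],       # the cluster's subnet(s)
        'SecurityGroupIds': ['sg-cccc3333'],    # an SG the cluster accepts
    },
)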
Hope it works,

Related

AWS Lambda to RDS PostgreSQL

Hello fellow AWS contributors, I'm currently working on a project to set up an example of connecting a Lambda function to our PostgreSQL database hosted on RDS. I tested my Python + SQL code locally (in VS Code and DBeaver) and it works perfectly fine with only basic credentials (host, dbname, username, password). However, when I paste the code into a Lambda function, it gives me all sorts of errors. I followed this template and modified my code to retrieve the credentials from Secrets Manager instead.
I'm currently using boto3, psycopg2, and Secrets Manager to get credentials and connect to the database.
List of errors I'm getting:
server closed the connection unexpectedly. This probably means the server terminated abnormally before or while processing the request
could not connect to server: Connection timed out. Is the server running on host “db endpoint” and accepting TCP/IP connections on port 5432?
FATAL: no pg_hba.conf entry for host “ip:xxx”, user "userXXX", database "dbXXX", SSL off
Things I tried:
RDS and Lambda are in the same VPC, same subnet, same security group.
IP address is included in the inbound rule
Lambda function is set to run up to 15 min, and it always stops before it even hits 15 min
I tried both the database endpoint and the database proxy endpoint; neither of them works.
It doesn't really make sense to me: when I run the code locally, I only need to provide the host, dbname, username, and password, and I'm able to write all the queries and functions I want. But when I put the code in a Lambda function, it requires all of these extras: Secrets Manager, VPC security groups, SSL, a proxy, TCP/IP rules, etc. Can someone explain why the requirements differ between running locally and on Lambda?
Finally, does anyone know what could be wrong in my setup? I'm happy to provide any related information; any general direction to look into would be really helpful. Thanks!
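For context, here is a minimal sketch of the flow I'm describing (the secret name and its key names are placeholders):

import json
import boto3
import psycopg2

def lambda_handler(event, context):
    # Fetch the DB credentials from Secrets Manager.
    sm = boto3.client('secretsmanager')
    secret = json.loads(
        sm.get_secret_value(SecretId='my-rds-secret')['SecretString'])

    # connect_timeout makes failures surface quickly instead of
    # hanging until the Lambda itself times out.
    conn = psycopg2.connect(host=secret['host'],
                            port=secret.get('port', 5432),
                            dbname=secret['dbname'],
                            user=secret['username'],
                            password=secret['password'],
                            connect_timeout=5)
    with conn.cursor() as cur:
        cur.execute('SELECT 1')
        return cur.fetchone()[0]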
Following the directions at the link below to build a specific psycopg2 package, and verifying that the VPC subnets and security groups were configured correctly, solved this issue for me.
I built a package for PostgreSQL 10.20 using psycopg2 v2.9.3 for Python 3.7.10 running on an Amazon Linux 2 AMI instance. The only change I had to make to the directions was to put the psycopg2 directory inside a python directory (i.e. "python/psycopg2/") before zipping it; the import psycopg2 statement in the Lambda function failed until I did that.
https://kalyanv.com/2019/06/10/using-postgresql-with-python-on-aws-lambda.html
This is the VPC scenario I'm using. The Lambda function executes inside the public subnet and its associated security group. Inbound rules on the private subnet's security group only allow TCP connections to port 5432 from the public subnet's security group.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_VPC.Scenarios.html#USER_VPC.Scenario1
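For what it's worth, that inbound rule looks roughly like this when created with boto3 (both group IDs are placeholders):

import boto3

ec2 = boto3.client('ec2')

# Allow PostgreSQL (TCP 5432) into the private subnet's security group,
# but only from the public subnet's security group.
ec2.authorize_security_group_ingress(
    GroupId='sg-private1111',
    IpPermissions=[{
        'IpProtocol': 'tcp',
        'FromPort': 5432,
        'ToPort': 5432,
        'UserIdGroupPairs': [{'GroupId': 'sg-public2222'}],
    }],
)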

Not able to connect to Mongo from an AWS Lambda function in Python

I have an AWS Lambda function that is not able to connect to Mongo through a VPC.
import pymongo

def handler(event, context):
    try:
        client = pymongo.MongoClient(host="xxxxxxx", port=27017,
                                     username=x1, password=x2,
                                     authSource="x3",
                                     authMechanism='SCRAM-SHA-1')
        # MongoClient connects lazily; run a command so failures surface here.
        client.admin.command('ping')
    except pymongo.errors.ServerSelectionTimeoutError as err:
        print(err)
It is not able to find the server.
I have created a security group, and the new roles have been given full VPC and Lambda access too, but it is still not able to connect.
I took help from https://blog.shikisoft.com/access-mongodb-instance-from-aws-lambda-python/ as well as https://blog.shikisoft.com/running-aws-lambda-in-vpc-accessing-rds/.
Please help. I have been trying since yesterday but no luck.
Let me try to help you figure out where the problem is.
1. Are your MongoDB EC2 instance and your Lambda hosted in the same VPC?
If this is the cause of your problem, you should move your services into the same VPC.
2. Does the security group attached to your MongoDB EC2 instance and your Lambda whitelist/include the default SG?
You have to include the default SG in your security group so that services/instances within the VPC can communicate.
3. Is your hostname accessed publicly or privately?
If Lambda needs to connect over the Internet to reach your MongoDB instance, you don't need to attach your Lambda to a VPC.
Inside a VPC, Lambda requires a NAT gateway to communicate with the outside world. Try to communicate privately if your MongoDB instance and Lambda are in the same VPC.
https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/
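As a quick sanity check, a raw TCP probe from inside the Lambda separates networking problems from authentication problems; a small sketch with a placeholder hostname:

import socket

def probe(host="xxxxxxx", port=27017, timeout=3):
    # If this raises, the Lambda cannot even reach the instance:
    # look at VPC placement, security groups, and routing first.
    try:
        socket.create_connection((host, port), timeout=timeout).close()
        print("TCP reachable")
    except OSError as err:
        print("TCP unreachable:", err)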
Hope these answers are helpful to you.

Internet access within AWS Glue job

Do Glue jobs have internet access?
Using this test job:
import httplib  # Python 2's HTTP client; use http.client on Python 3
import logging

logger = logging.getLogger(__name__)

def have_internet():
    conn = httplib.HTTPConnection("www.google.com", timeout=5)
    try:
        conn.request("HEAD", "/")
        conn.close()
        logger.warn('ok')
    except Exception:
        conn.close()
        logger.warn('no ok')

have_internet()
It appears they do not...
Also, within a properly configured Glue dev endpoint, there is no internet access.
By properly configured, I mean within a public subnet (with an internet gateway), with an S3 endpoint and internet gateway, a working 'connection', and security groups.
But still no internet access...
I want internet access to be able to interrogate an on-prem database, save to S3, and then run another job to transform and load into RDS...
Can I use Glue for the extract?
This issue has resolved itself now, I suspect due to an update to Glue or the associated infrastructure.
The connectivity issue was occurring from within the PySpark REPL, and not on the actual Dev Endpoint instance itself...
Anyway, for anyone else troubleshooting similar network connectivity issues with Glue, here is a list of possible causes:
Dev Endpoint needs to be in a 'public' subnet*
DHCP options need to have the default setting
Security groups, security groups, security groups
Subnet should be associated with an S3 Endpoint
...

Python AWS lambda function connecting to RDS mysql: Timeout error

When the Python Lambda function is executed, I get a "Task timed out after 3.00 seconds" error. I am trying the same example function.
When I run the same code from Eclipse it works fine and I can see the query result. Likewise, I can connect to the DB instance from MySQL Workbench on my local machine without any issues.
I tried creating a role with the full administrator access policy for this Lambda function, and even then it does not work. The DB instance is in a VPC, and I added my local IP address there using the edit CIDR option so I can access the instance through Workbench on my local machine. For the VPC, subnet, and security group parameters in the Lambda function I gave the same values as in the RDS DB instance.
I have also increased the timeout for the Lambda function, and I still see the timeout error.
Any input would be appreciated.
"For VPC, subnet and security group parameter in lambda function I gave the same values as I have in the RDS db instance."
Security groups don't automatically trust their own members to access other members.
Add a rule to this security group for "MySQL" (TCP port 3306), but instead of specifying an IP address, start typing "sg" into the box and select the ID of the security group that you are adding the rule to, so that the group is self-referential.
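A minimal boto3 sketch of that self-referential rule, with a placeholder group ID:

import boto3

ec2 = boto3.client('ec2')

# Allow MySQL (TCP 3306) from members of the same security group.
ec2.authorize_security_group_ingress(
    GroupId='sg-0123abcd',
    IpPermissions=[{
        'IpProtocol': 'tcp',
        'FromPort': 3306,
        'ToPort': 3306,
        'UserIdGroupPairs': [{'GroupId': 'sg-0123abcd'}],  # self-reference
    }],
)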
Note that this is probably not the correct long-term fix, because if your Lambda function needs to access the Internet or most AWS services, it needs to be on a private subnet behind a NAT device. That does not describe the subnet where your RDS instance currently lives: you mentioned adding your local IP to allow access to RDS, which suggests your RDS is on a public subnet.
See also Why Do We Need Private Subnets in VPC for a better understanding of public vs. private subnets.

Cannot connect to EC2 using python boto

I'm a complete noob with Python and boto, and am trying to establish a basic connection to EC2 services.
I'm running the following code:
import boto

ec2Conn = boto.connect_ec2('username', 'password')
group_name = 'python_central'
description = 'Python Central: Test Security Group.'
group = ec2Conn.create_security_group(group_name, description)
group.authorize('tcp', 8888, 8888, '0.0.0.0/0')
and getting the following error:
AWS was not able to validate the provided access credentials
I've read in some posts that this might be due to a time difference between my machine and the EC2 server, but according to the logs they are the same:
2016-12-13 19:20:05,132 boto [DEBUG]: StringToSign:
host:ec2.us-east-1.amazonaws.com
x-amz-date:20161213T192005Z
host;x-amz-date
515db222f793e7f96aa93818abf3891c7fd858f6b1b9596f20551dcddd5ca1be
Any idea how to get this connection running?
Thanks!
Calls made to the AWS API require authentication via an Access Key and a Secret Key. These can be obtained from the Identity and Access Management (IAM) console, under the Security Credentials tab for a user.
See: Getting Your Access Key ID and Secret Access Key
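With the legacy boto library used in the question, the keys are passed in place of the username/password strings; a minimal sketch with placeholder key values:

import boto

# connect_ec2 expects an AWS access key ID and secret access key,
# not a console username and password.
ec2Conn = boto.connect_ec2(
    aws_access_key_id='AKIAXXXXXXXXXXXXXXXX',
    aws_secret_access_key='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')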
If you are unfamiliar with Python, you might find it easier to call AWS services by using the AWS Command-Line Interface (CLI). For example, this single-line command can launch an Amazon EC2 instance:
aws ec2 run-instances --image-id ami-c2d687ad --key-name joe --security-group-id sg-23cb34f6 --instance-type t1.micro
See: AWS CLI run-instances documentation
