"--vanlla: Rscript: command not found" when running in AWS Lambda - Python

I have created a Lambda function which gets triggered when any R script file gets uploaded to an S3 bucket.
test.R code:
a <- 1
b <- 3
c <- a + b
data = 1:20
print(data)
lambda function code:
import subprocess
import json
import urllib.parse
import boto3
print('Loading function')
def lambda_handler(event, context):
    s3 = boto3.client('s3')
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    message = "Hey file is got uploaded " + key + " to this bucket " + bucket_name
    print(message)

    #3 - Fetch the file from S3
    response = s3.get_object(Bucket=bucket_name, Key=key)
    text = response["Body"].read().decode()
    print(text)

    command = "Rscript"
    arg = "--vanlla"
    path2script = key
    retcode = subprocess.call([command, arg, path2script], shell=True)
The trigger is working fine, but it gives a "--vanlla: Rscript: command not found" error. When I run a simple Python script locally to run the R script, it works fine; it just does not work in AWS Lambda.
Need help with this.
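A minimal sketch of how this is usually addressed, not a confirmed fix for this exact setup. Two things are going on: the "--vanlla:" prefix in the error is the shell's $0 (a side effect of passing a list together with shell=True), and "command not found" means Rscript is not present in the standard Python Lambda runtime, so it has to be provided through a custom layer or container image. The sketch below also downloads the script to /tmp (the S3 key is not a local path) and spells the flag --vanilla:

import subprocess
import urllib.parse
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')

    # Lambda can only write under /tmp, so copy the uploaded script there first.
    local_path = '/tmp/' + key.split('/')[-1]
    s3.download_file(bucket_name, key, local_path)

    # Without shell=True the first list element is the executable, so the
    # flag is no longer mistaken for the command. Assumes Rscript is on the
    # PATH via a layer or container image.
    result = subprocess.run(['Rscript', '--vanilla', local_path],
                            capture_output=True, text=True)
    print(result.stdout, result.stderr)
    return result.returncode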

Related

How do I trigger a SQL file using the psql command from the ephemeral storage of a Lambda function in Python

Access SQL files from S3 in Lambda and execute them.
[screenshot: connection and downloaded file output]
S3 -> Lambda -> RDS instance
a. Integration between function, database and S3 -> DONE
a1. Download the .sql file from the S3 bucket and write it to the /tmp storage of the Lambda function. -> DONE
b1. Import a Python library or create a Lambda layer to include the relevant libraries/dependencies in the Lambda function, so the psql command can execute the SQL file.
or
b2. Download the file, convert it to a string, and pass the string as a parameter to the 'ExecuteSql' API call, which allows you to run one or more SQL statements.
c. Once we are able to successfully execute the SQL files, check how to export the generated .csv, .txt, .html, .TAB files to the S3 OUTPUT path.
So far I have integrated the function, S3 and RDS, and I am able to view the output of a table (to test the connection) and download the SQL file from the S3 path to the ephemeral storage /tmp of the Lambda function.
Now I am looking for how to execute the downloaded SQL file from /tmp of the Lambda function using the psql command, or how to convert the file to a string and pass the string as a parameter to the 'ExecuteSql' API, which allows you to run one or more SQL statements. Please share any way to achieve this.
Please refer to the Python code below which I am using with the Lambda function.
from dataclasses import dataclass
import psycopg2
from psycopg2.extras import RealDictCursor
import json
from datetime import datetime
import csv
import boto3
from botocore.exceptions import ClientError
import os
def get_secret():
    secret_name = "baardsnonprod-qa2db"
    region_name = "us-east-1"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name
    )
    secret = client.get_secret_value(
        SecretId=secret_name
    )
    secret_dict = json.loads(secret['SecretString'])
    return secret_dict

def download_sql_from_s3(sql_filename):
    s3 = boto3.resource('s3')
    bucket = 'baa-non-prod-baa-assets'
    key = "RDS-batch/Reports/" + sql_filename
    local_path = '/tmp/' + sql_filename
    response = s3.Bucket(bucket).download_file(key, local_path)
    return "file successfully downloaded"

def lambda_handler(event, context):
    secret_dict = get_secret()
    print(secret_dict)
    hostname = secret_dict['host']
    portnumber = secret_dict['port']
    databasename = secret_dict['database']
    username = secret_dict['username']
    passwd = secret_dict['password']
    print(hostname, portnumber, databasename, username, passwd)
    conn = psycopg2.connect(host=hostname, port=portnumber, database=databasename, user=username, password=passwd)
    cur = conn.cursor(cursor_factory=RealDictCursor)
    cur.execute("SELECT * FROM PROFILE")
    results = cur.fetchall()
    json_result = json.dumps(results, default=str)
    print(json_result)
    status = download_sql_from_s3("ConsumerPopularFIErrors.sql")
    # file_status = os.path.isfile('/tmp/ConsumerPopularFIErrors.sql')
    print(status)
    # print(file_status)
    # with open('/tmp/ConsumerPopularFIErrors.sql') as file:
    #     content = file.readlines()
    #     for line in content:
    #         print(line)

#lambda_handler()
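Since psycopg2 is already imported in the function above, one way to run the downloaded file without shipping a psql binary is to execute its contents over the existing connection. This is only a sketch under the question's setup; the naive split on ';' is an assumption about the file's contents (no semicolons inside string literals):

def execute_sql_file(conn, local_path):
    # Read the downloaded .sql file from /tmp and run it on an open
    # psycopg2 connection (no psql binary needed in the Lambda environment).
    with open(local_path) as f:
        sql_text = f.read()
    with conn.cursor() as cur:
        for statement in (s.strip() for s in sql_text.split(';')):
            if statement:
                cur.execute(statement)
        rows = cur.fetchall() if cur.description else None
    conn.commit()
    return rows

# Usage after download_sql_from_s3("ConsumerPopularFIErrors.sql"):
# results = execute_sql_file(conn, '/tmp/ConsumerPopularFIErrors.sql')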

I am trying to read an SSM parameter, and that part works fine, but writing it as text and uploading it into my bucket is not happening. Please find the code below.

import boto3
import os
client = boto3.client('ssm')
s3 = boto3.client("s3")
def lambda_handler(event, context):
    parameter = client.get_parameter(Name='otherparam', WithDecryption=True)
    #print(parameter)
    return parameter['Parameter']['Value']

    #file = open("/sample.txt", "w")
    #file.write(parameter)
    #file.close
    with open("/tmp/log.txt", "w") as f:
        file.write(parameter)
    s3.upload_file("/tmp/log.txt", "copys3toecsbucket-117", "logs.txt")

    #bucket = "copys3toecsbucket-117"
    #file = "/sample.txt"
    #response = s3_client.put_object(Body=file,Bucket='bucket',key='file')
    print(response)
I am trying this in AWS Lambda only.
How do I convert the SSM parameter into a text file (which will be the trigger file for the next step) and upload it to the S3 bucket?
Uploading to the bucket is not happening because you are returning a value before the upload happens. When you return a value from the handler, the Lambda function completes.
Removing the return will fix it.
import boto3
import os
client = boto3.client('ssm')
s3 = boto3.client("s3")
def lambda_handler(event, context):
    parameter = client.get_parameter(Name='otherparam', WithDecryption=True)
    print(parameter)
    with open("/tmp/log.txt", "w") as f:
        f.write(parameter['Parameter']['Value'])
    s3.upload_file("/tmp/log.txt", "copys3toecsbucket-117", "logs.txt")
    return True
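As a side note, the temporary file is not strictly needed: the parameter value can be passed straight to put_object. This is an alternative sketch, not part of the answer above; it reuses the bucket and key names from the code:

import boto3

client = boto3.client('ssm')
s3 = boto3.client('s3')

def lambda_handler(event, context):
    parameter = client.get_parameter(Name='otherparam', WithDecryption=True)
    value = parameter['Parameter']['Value']
    # Upload the string directly as the object body; no /tmp file involved.
    s3.put_object(Bucket='copys3toecsbucket-117', Key='logs.txt', Body=value)
    return True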

Capture Athena Query Execution Details using Lambda

Under CloudWatch, I created a rule to capture the Athena Query State Change event that will (1) write a log to a log group and (2) trigger a Lambda function that captures the Athena query execution details and pipes them to an S3 bucket. Point 2 fails, as no Athena query execution details are piped to the S3 bucket. Below is the Lambda function I used:
import json
import boto3
from botocore.config import Config
my_config = Config(
    region_name='<my_region>')

print('Loading function')

def lambda_handler(event, context):
    print("Received event: " + json.dumps(event))
    print("QuertID: " + event['id'])

    # get query statistics
    client = boto3.client('athena', config=my_config)
    queries = client.get_query_execution(QueryExecutionId=event['detail']['QueryExecutionId'])
    del queries['QueryExecution']['Status']

    # saving the query statistics to s3
    s3 = boto3.resource('s3')
    object = s3.Object('<s3_bucket_path>', 'query_statistics_json/' + event['detail']['QueryExecutionId'])
    object.put(Body=str(queries['QueryExecution']))
    return 0
I used this AWS Documentation as reference:
https://docs.aws.amazon.com/athena/latest/ug/control-limits.html
The Body should be of type binary data:
object.put(Body=<some binary data>)
Maybe you can write str(queries['QueryExecution']) to a txt file in Lambda's /tmp directory and upload it:
content = "String content to write to a new S3 file"
s3.Object('my-bucket-name', '/tmp/newfile.txt').put(Body=content)
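A small sketch in that direction, assuming the goal is simply to land the execution details in the bucket as readable JSON; json.dumps(..., default=str) takes care of the datetime fields in the Athena response, and the bucket/key names are the placeholders from the question:

import json
import boto3

s3 = boto3.resource('s3')

def save_query_execution(queries, query_execution_id):
    # Serialize to JSON; default=str turns the datetime values in the
    # Athena response into strings so json.dumps does not raise.
    body = json.dumps(queries['QueryExecution'], default=str)
    s3.Object('<s3_bucket_path>',
              'query_statistics_json/' + query_execution_id + '.json').put(
        Body=body.encode('utf-8'))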
It's just an indentation problem: everything after the def lambda_handler(event, context): line should be indented inside the handler...

How to write parquet file to ECS in Flask python using boto or boto3

I have a Flask Python REST API which is called by another Flask REST API.
The input for my API is one parquet file (a FileStorage object) plus the ECS connection and bucket details.
I want to save the parquet file to ECS in a specific folder using boto or boto3.
The code I have tried:
def uploadFileToGivenBucket(self, inputData, file):
    BucketName = inputData.ecsbucketname
    calling_format = OrdinaryCallingFormat()
    client = S3Connection(inputData.access_key_id, inputData.secret_key, port=inputData.ecsport,
                          host=inputData.ecsEndpoint, debug=2,
                          calling_format=calling_format)
    #client.upload_file(BucketName, inputData.filename, inputData.folderpath)
    bucket = client.get_bucket(BucketName, validate=False)
    key = boto.s3.key.Key(bucket, inputData.filename)
    fileName = NamedTemporaryFile(delete=False, suffix=".parquet")
    file.save(fileName)
    with open(fileName.name) as f:
        key.send_file(f)
but it is not working and is giving me an error like:
signature_host = '%s:%d' % (self.host, port)
TypeError: %d format: a number is required, not str
I tried Google but had no luck. Can anyone help me with this or share any sample code for the same?
After a lot of hit and trial and time, I finally got the solution. I am posting it for everyone else who is facing the same issue.
You need to use boto3, and here is the code...
def uploadFileToGivenBucket(self, inputData, file):
    BucketName = inputData.ecsbucketname
    #bucket = client.get_bucket(BucketName,validate=False)
    f = NamedTemporaryFile(delete=False, suffix=".parquet")
    file.save(f)

    endpointurl = "<your endpoints>"
    s3_client = boto3.client('s3', endpoint_url=endpointurl,
                             aws_access_key_id=inputData.access_key_id,
                             aws_secret_access_key=inputData.secret_key)
    try:
        newkey = 'yourfolderpath/anotherfolder' + inputData.filename
        response = s3_client.upload_file(f.name, BucketName, newkey)
    except ClientError as e:
        logging.error(e)
        return False
    return True
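As a variation on the accepted approach (a sketch under the same assumptions about the endpoint and credential fields, not the author's code), the temporary file can be skipped by streaming the incoming FileStorage object directly with upload_fileobj:

import logging
import boto3
from botocore.exceptions import ClientError

def upload_filestorage_to_bucket(inputData, file):
    # file is the Flask/Werkzeug FileStorage object; .stream is file-like,
    # so boto3 can read it without writing a temp file first.
    s3_client = boto3.client(
        's3',
        endpoint_url="<your endpoints>",            # ECS S3-compatible endpoint
        aws_access_key_id=inputData.access_key_id,
        aws_secret_access_key=inputData.secret_key)
    key = 'yourfolderpath/anotherfolder/' + inputData.filename
    try:
        s3_client.upload_fileobj(file.stream, inputData.ecsbucketname, key)
    except ClientError as e:
        logging.error(e)
        return False
    return True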

Upload to Amazon S3 using tinys3

I'm using Python and tinys3 to write files to S3, but it's not working. Here's my code:
import tinys3
conn = tinys3.Connection('xxxxxxx','xxxxxxxx',tls=True)
f = open('testing_s3.txt','rb')
print conn.upload('testing_data/testing_s3.txt',f,'testing-bucket')
print conn.get('testing_data/testing_s3.txt','testing-bucket')
That gives the output:
<Response [301]>
<Response [301]>
When I try specifying the endpoint, I get:
requests.exceptions.HTTPError: 403 Client Error: Forbidden
Any idea what I'm doing wrong?
Edit: When I try using boto, it works, so the problem isn't in the access key or secret key.
I finally figured this out. Here is the correct code:
import tinys3
conn = tinys3.Connection('xxxxxxx','xxxxxxxx',tls=True,endpoint='s3-us-west-1.amazonaws.com')
f = open('testing_s3.txt','rb')
print conn.upload('testing_data/testing_s3.txt',f,'testing-bucket')
print conn.get('testing_data/testing_s3.txt','testing-bucket')
You have to use the region endpoint, not s3.amazonaws.com. You can look up the region endpoint from here: http://docs.aws.amazon.com/general/latest/gr/rande.html. Look under the heading "Amazon Simple Storage Service (S3)."
I got the idea from this thread: https://github.com/smore-inc/tinys3/issues/5
If using an IAM user it is necessary to allow the "s3:PutObjectAcl" action.
I don't know why, but this code never worked for me.
I switched to boto, and it uploaded the file on the first try.
import boto
from boto.s3.key import Key

AWS_ACCESS_KEY_ID = 'XXXXXXXXXXXXXXXXXXXXX'
AWS_SECRET_ACCESS_KEY = 'XXXXXXXXXXXXXXXXXXXXX/XXXXXXXXXXXXXXXXXXXXXXXXXXX'
bucket_name = 'my-bucket'

conn = boto.connect_s3(AWS_ACCESS_KEY_ID,
                       AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket('my-bucket')

print 'Uploading %s to Amazon S3 bucket %s' % \
      (filename, bucket_name)

k = Key(bucket)
k.key = filename
k.set_contents_from_filename(filename,
                             cb=percent_cb, num_cb=10)  # percent_cb: progress callback defined elsewhere
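For comparison only (a sketch, not part of any answer above), the same upload with present-day boto3 needs just the region and credentials; the names are placeholders:

import boto3

s3 = boto3.client(
    's3',
    region_name='us-west-1',                     # match the bucket's region
    aws_access_key_id='XXXXXXXXXXXXXXXXXXXXX',
    aws_secret_access_key='XXXXXXXXXXXXXXXXXXXXX')

# upload_file handles multipart uploads and retries automatically.
s3.upload_file('testing_s3.txt', 'testing-bucket', 'testing_data/testing_s3.txt')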
