I am trying to get a Google Cloud Function in Python 3.7 to take a file from Google Cloud Storage and upload it into AWS S3. In the command line, I would authenticate with awscli and then use the gsutil cp command to copy the file across. I have translated this process into python as:
import subprocess
def GCS_to_s3(arg1, arg2):
    subprocess.call(["aws configure set aws_access_key_id AKIA********"], shell=True)
    subprocess.call(["aws configure set aws_secret_access_key EgkjntEFFDVej"], shell=True)
    subprocess.call(["gsutil cp gs://test_bucket/gcs_test.csv s3://mybucket"], shell=True)
The requirements.txt is:
awscli
google-cloud-storage
This function deploys successfully and runs successfully but the output does not show up in the S3 bucket.
What would be the best way of writing such a function?
You'll probably want to use the boto3 Python package instead, since the command-line AWS tools aren't available or installable in Cloud Functions. There are also a number of ways to configure credentials.
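As a rough illustration of the boto3 route, the sketch below downloads the object with google-cloud-storage and uploads it with boto3. The bucket and object names come from the question. The helper name copy_gcs_object_to_s3, the entry point gcs_to_s3, and the idea of supplying AWS credentials through the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables of the function are assumptions for illustration. It also assumes boto3 and google-cloud-storage are both listed in requirements.txt, and it holds the whole object in memory, so it only suits small files.

```python
def copy_gcs_object_to_s3(gcs_client, s3_client, gcs_bucket, gcs_name, s3_bucket, s3_key):
    """Copy one object from GCS to S3 through memory and return the byte count."""
    blob = gcs_client.bucket(gcs_bucket).blob(gcs_name)
    # Reads the whole object into memory; older google-cloud-storage
    # releases call this method download_as_string().
    data = blob.download_as_bytes()
    s3_client.put_object(Bucket=s3_bucket, Key=s3_key, Body=data)
    return len(data)


def gcs_to_s3(event, context):
    """Hypothetical Cloud Function entry point (name is an assumption)."""
    # Imported lazily so the helper above can be exercised without the SDKs.
    import boto3
    from google.cloud import storage

    gcs_client = storage.Client()
    # Assumption: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set as
    # environment variables at deploy time; boto3 reads them automatically.
    s3_client = boto3.client("s3")
    return copy_gcs_object_to_s3(
        gcs_client, s3_client, "test_bucket", "gcs_test.csv", "mybucket", "gcs_test.csv"
    )
```

Passing the two clients in as arguments keeps the copy logic separate from credential handling, which makes it easier to test locally with stand-in objects.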
I am trying to copy files from AWS S3 to Azure Blob Storage in Databricks, using an AWS access key ID and secret access key.
Code:
====
set AWS_ACCESS_KEY_ID='ABCDED'
set AWS_SECRET_ACCESS_KEY='12345'
azcopy cp "https://s3://emp-data/tsc/" "https://empdata.blob.core.windows.net/awstosf/" --recursive
The above code does not work; it fails with an invalid syntax error.
In Azure Databricks, to work with CLI tools, you need to install databricks-cli using pip install databricks-cli. Also, configure the CLI token using the databricks configure --token command.
Also, instead of setting the variables programmatically, you should set them through the cluster UI. Follow: Databricks -> Cluster -> Environment Variables. That way, AWS_ACCESS_KEY_ID=<access-key> and AWS_SECRET_ACCESS_KEY=<secret-access-key> are set at the cluster level.
Source: File manipulation Commands in Azure Databricks
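One possible reading of the error is that the set lines are shell syntax, which a Python notebook cell cannot parse. A minimal Python sketch of the same copy is given below, under several assumptions: azcopy is installed on the driver node, the destination container accepts a SAS token (the sas_token parameter is hypothetical), and the S3 source is written as an https://s3.amazonaws.com/<bucket>/<path> URL rather than the https://s3:// form from the question. The function names are illustrative only.

```python
import os
import subprocess


def build_azcopy_command(s3_bucket, s3_prefix, account, container, sas_token):
    """Build the azcopy argument list for an S3 -> Blob Storage copy."""
    source = "https://s3.amazonaws.com/{}/{}".format(s3_bucket, s3_prefix)
    destination = "https://{}.blob.core.windows.net/{}?{}".format(
        account, container, sas_token
    )
    return ["azcopy", "copy", source, destination, "--recursive"]


def run_azcopy(command, access_key_id, secret_access_key):
    """Run azcopy with the AWS credentials passed only to the child process."""
    env = dict(
        os.environ,
        AWS_ACCESS_KEY_ID=access_key_id,
        AWS_SECRET_ACCESS_KEY=secret_access_key,
    )
    # check=True raises CalledProcessError if azcopy exits with an error.
    return subprocess.run(command, env=env, check=True)
```

If the variables are already defined at the cluster level as described above, the env argument can be dropped and the command run with the inherited environment.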
I have a requirement to copy a file as soon as it lands in an S3 bucket, so I enabled an S3 trigger for a Lambda function. But I am not sure how to copy the content to a directory on a Lightsail instance using AWS Lambda. I looked into the documentation but I don't see any solution using Python and Boto3.
I can only see the FTP solution. Is there any other way to dump the file into Lightsail from S3?
Thanks.
Generally, for EC2 instances you would use SSM Run Command for that.
In this solution, your Lambda function sends a command (e.g. download some files from S3) to the instance by means of SSM Run Command.
For this to work, you need to install SSM Agent on your Lightsail instance. The following link describes the installation process:
Configuring SSM Agent on an Amazon Lightsail Instance.
The link also gives an example of sending commands to the instance:
How to Run Shell Commands on a Managed Instance
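A minimal sketch of such a Lambda handler is shown below. It assumes the Lightsail instance is already registered with Systems Manager, has the AWS CLI installed with read access to the bucket, and that the Lambda role allows ssm:SendCommand. The instance ID and the target directory are hypothetical placeholders, as are the function names.

```python
import shlex
from urllib.parse import unquote_plus


def build_copy_command(bucket, key, target_dir):
    """Shell command the instance runs to pull one object from S3."""
    source = shlex.quote("s3://{}/{}".format(bucket, key))
    target = shlex.quote(target_dir.rstrip("/") + "/")
    return "aws s3 cp {} {}".format(source, target)


def send_copy_command(ssm_client, instance_id, event, target_dir):
    """Send the copy command for the first record of an S3 event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys in S3 event notifications are URL-encoded.
    key = unquote_plus(record["object"]["key"])
    response = ssm_client.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": [build_copy_command(bucket, key, target_dir)]},
    )
    return response["Command"]["CommandId"]


def lambda_handler(event, context):
    # Imported here so the helpers above can be tested without boto3.
    import boto3

    # Hypothetical managed-instance ID and directory; replace with your own.
    return send_copy_command(
        boto3.client("ssm"), "mi-0123456789abcdef0", event, "/home/ubuntu/incoming"
    )
```

The command only queues the copy; whether it succeeded on the instance has to be checked separately, for example in the Run Command history.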
I would like to run a gsutil command every x minutes as a cloud function. I tried the following:
# main.py
import os
def sync():
    line = "gsutil -m rsync -r gs://some_bucket/folder gs://other_bucket/other_folder"
    os.system(line)
While the Cloud Function gets triggered, the execution of the line does not work (i.e. the files are not copied from one bucket to the other). However, it works fine when I run it locally in PyCharm or from cmd. What is different in Cloud Functions?
You can use Cloud Run for this. You only need to make a few changes to your code.
Create a container with both gsutil and Python installed, for example with gcr.io/google.com/cloudsdktool/cloud-sdk as the base image
Pay attention to the service account used when you deploy to Cloud Run, and grant it the correct permissions to access your buckets
Let me know if you need more guidance
Cloud Functions server instances don't have gsutil installed. It works on your local machine because you do have it installed and configured there.
I suggest finding a way to do what you want with the Cloud Storage SDK for Python. Alternatively, figure out how to deploy gsutil with your function and how to configure and invoke it from your code, but that might be very difficult.
There's no straightforward option for that.
I think the best option for Cloud Functions is to use the google-cloud-storage Python library.
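A minimal sketch of that approach, using the bucket and folder names from the question, could look like the following. The helper name copy_prefix and the HTTP-style signature of sync are assumptions. Unlike gsutil rsync, this version copies every object under the prefix on each run and never deletes anything in the destination.

```python
def copy_prefix(client, src_bucket_name, src_prefix, dst_bucket_name, dst_prefix):
    """Copy every object under src_prefix into dst_prefix of another bucket."""
    src_bucket = client.bucket(src_bucket_name)
    dst_bucket = client.bucket(dst_bucket_name)
    copied = []
    for blob in client.list_blobs(src_bucket_name, prefix=src_prefix):
        # Swap the source prefix for the destination prefix.
        new_name = dst_prefix + blob.name[len(src_prefix):]
        src_bucket.copy_blob(blob, dst_bucket, new_name)
        copied.append(new_name)
    return copied


def sync(request):
    """Hypothetical HTTP-triggered entry point, e.g. called by Cloud Scheduler."""
    # Imported here so copy_prefix can be tested without the SDK installed.
    from google.cloud import storage

    copied = copy_prefix(
        storage.Client(), "some_bucket", "folder/", "other_bucket", "other_folder/"
    )
    return "copied {} objects".format(len(copied))
```

The copy happens server-side inside Cloud Storage, so the function does not download the objects itself.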
I am new to AWS as well as Python.
With the AWS CLI, the command below works perfectly fine:
aws cloudformation package --template-file sam.yaml --output-template-file output-sam.yaml --s3-bucket <<bucket_Name>>
The goal is to create an automated Python script that runs the above command. I tried to google it, but none of the solutions worked for me.
Below is the solution I have tried, but it does not upload the artifacts to the S3 bucket.
test.py file:
import subprocess
command= ["aws","cloudformation","package","--template-file","sam.yaml","--output-template-file","output-sam.yaml","--s3-bucket","<<bucket_name>>"]
print(subprocess.check_output(command, stderr=subprocess.STDOUT))
It can easily be done using the os library. The simplest way of doing it is given in the code.
import os
os.system("aws cloudformation package --template-file sam.yaml --output-template-file output-sam.yaml --s3-bucket <<bucket_name>>")
However, subprocess is better suited to slightly more complicated tasks.
You can also check out the boto3 library for such tasks. Boto3 is the AWS SDK for Python.
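One practical difference between the two is that os.system only returns an exit status, so a failing command can go unnoticed, while subprocess can capture the error text. The sketch below wraps the command from the question that way. The names run_cli and package_template are illustrative, and the bucket name is left as a parameter.

```python
import subprocess


def run_cli(args):
    """Run a command, return its stdout, and raise with stderr on failure."""
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
    )
    if result.returncode != 0:
        raise RuntimeError(
            "command failed ({}): {}".format(result.returncode, result.stderr.strip())
        )
    return result.stdout


def package_template(bucket_name):
    """Run `aws cloudformation package` with the file names from the question."""
    return run_cli([
        "aws", "cloudformation", "package",
        "--template-file", "sam.yaml",
        "--output-template-file", "output-sam.yaml",
        "--s3-bucket", bucket_name,
    ])
```

If the upload still fails, the message raised by run_cli should contain whatever the AWS CLI printed, which is usually the quickest way to tell a credentials problem from a path problem.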
You can check how this aws-cli command is implemented as it's all in Python already. Basically aws cloudformation package uploads the template to S3, so you can do the same with boto3 as mentioned in the comments.
I'd like to upload XML files directly to S3 without the use of modules like boto, boto3, or tinys3.
So far I have written:
import requests
url = "https://my-test-s3.s3.amazonaws.com"
with open(xml_file, 'rb') as data:
    requests.put(url, data=data)
and I've gone ahead and set the AllowedOrigin on my S3 bucket to accept my server's address.
This does not error when running, however, it also does not seem to be uploading anything.
Any help would be appreciated --- I'd like to (a) get the thing to upload and (b) figure out how to apply AWSAccessKey and AWSSecretAccessKey to the request
If you want to upload XML files directly to S3 without the use of modules like boto, boto3, or tinys3, I would recommend using awscli:
pip install awscli
aws configure # enter your AWSAccessKey and AWSSecretAccessKey credentials
AWSAccessKey and AWSSecretAccessKey will be stored permanently inside the ~/.aws folder after running aws configure.
You can then upload files from Python:
import os
os.system("aws s3 cp {0} s3://your_bucket_name/{1}".format(file_path, file_name))
Docs are here.
You need to install awscli following this documentation.
Then, in a command-line shell, execute aws configure and follow the instructions.
To upload a file, it's much easier to use boto3:
import boto3
s3 = boto3.resource('s3')
s3.meta.client.upload_file(xml_file, 'yourbucket', 'yours3filepath')
Alternatively, you can use the aws s3 cp command combined with Python's subprocess module.
import subprocess
subprocess.call(["aws", "s3", "cp", xml_file, "yours3destination"])