Uploading file to AWS S3 through Chalice API call - python

I'm trying to upload a file to my S3 bucket through Chalice (I'm playing around with it currently, still new to this). However, I can't seem to get it right.
I have AWS set up correctly; doing the tutorial successfully returns me some messages. Then I try to do some upload/download, and the problem shows up.
s3 = boto3.resource('s3', region_name=<some region name, in this case oregon>)
BUCKET = 'mybucket'
UPLOAD_FOLDER = os.path.abspath('')  # the file I wanna upload is in the same folder as my app.py, so I simply get the current folder name

@app.route('/upload/{file_name}', methods=['PUT'])
def upload_to_s3(file_name):
    s3.meta.client.upload_file(UPLOAD_FOLDER + file_name, BUCKET, file_name)
    return Response(message='upload successful',
                    status_code=200,
                    headers={'Content-Type': 'text/plain'})
Please don't worry about how I set my file path, unless that's the issue, of course.
I got the error log:
No such file or directory: ''
In this case file_name is just mypic.jpg.
I'm wondering why the UPLOAD_FOLDER part is not being picked up. Also, for reference, it seems like using an absolute path will be troublesome with Chalice (while testing, I've seen the code being moved to /var/task/).
Does anyone know how to set it up correctly?
EDIT:
the complete script
from chalice import Chalice, Response
import boto3

app = Chalice(app_name='helloworld')  # I'm just modifying the script I used for the tutorial
s3 = boto3.client('s3', region_name='us-west-2')
BUCKET = 'chalicetest1'


@app.route('/')
def index():
    return {'status_code': 200,
            'message': 'welcome to test API'}


@app.route('/upload/{file_name}', methods=['PUT'], content_types=['application/octet-stream'])
def upload_to_s3(file_name):
    try:
        body = app.current_request.raw_body
        temp_file = '/tmp/' + file_name
        with open(temp_file, 'wb') as f:
            f.write(body)
        s3.upload_file(temp_file, BUCKET, file_name)
        return Response(message='upload successful',
                        headers={'Content-Type': 'text/plain'},
                        status_code=200)
    except Exception as e:
        app.log.error('error occurred during upload %s' % e)
        return Response(message='upload failed',
                        headers={'Content-Type': 'text/plain'},
                        status_code=400)

I got it running and this works for me as app.py in an AWS Chalice project:
from chalice import Chalice, Response
import boto3

app = Chalice(app_name='helloworld')

BUCKET = 'mybucket'  # bucket name
s3_client = boto3.client('s3')


@app.route('/upload/{file_name}', methods=['PUT'],
           content_types=['application/octet-stream'])
def upload_to_s3(file_name):
    # get raw body of PUT request
    body = app.current_request.raw_body

    # write body to tmp file
    tmp_file_name = '/tmp/' + file_name
    with open(tmp_file_name, 'wb') as tmp_file:
        tmp_file.write(body)

    # upload tmp file to s3 bucket
    s3_client.upload_file(tmp_file_name, BUCKET, file_name)

    return Response(body='upload successful: {}'.format(file_name),
                    status_code=200,
                    headers={'Content-Type': 'text/plain'})
You can test this with curl and its --upload-file option directly from the command line with:
curl -X PUT https://YOUR_API_URL_HERE/upload/mypic.jpg --upload-file mypic.jpg --header "Content-Type:application/octet-stream"
To get this running, you have to manually attach a policy that allows writing to S3 to the role of your Lambda function. This role is auto-generated by Chalice. In the AWS IAM web console, attach the policy (e.g. AmazonS3FullAccess) next to the existing policy on the role created by your Chalice project.
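If you prefer scripting that step instead of using the console, the same attachment can be done with boto3's IAM client. This is only a minimal sketch, not part of the original answer; 'helloworld-dev' is an assumed role name, so look up the actual role Chalice created for your project in IAM:
import boto3

iam = boto3.client('iam')

# Attach the managed AmazonS3FullAccess policy to the Chalice-generated role.
# 'helloworld-dev' is a placeholder -- use the real role name from IAM.
iam.attach_role_policy(
    RoleName='helloworld-dev',
    PolicyArn='arn:aws:iam::aws:policy/AmazonS3FullAccess'
)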
Things to mention:
You cannot write to the working directory /var/task/ of the Lambda functions, but you have some space at /tmp/, see this answer.
You have to specify the accepted content type 'application/octet-stream' for the @app.route (and upload the file accordingly via curl, or with requests as sketched after this list).
HTTP PUT puts a file or resource at a specific URI, so to use PUT this file has to be uploaded to the API via HTTP.
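As an alternative to the curl call above (purely illustrative, using the same placeholder URL), the same PUT can be issued from Python with the requests library:
import requests

# YOUR_API_URL_HERE is the same placeholder as in the curl example.
with open('mypic.jpg', 'rb') as f:
    resp = requests.put(
        'https://YOUR_API_URL_HERE/upload/mypic.jpg',
        data=f.read(),
        headers={'Content-Type': 'application/octet-stream'}
    )
print(resp.status_code, resp.text)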

Related

Python Flask server on Elastic Beanstalk can't save files

I am trying to upload and save a file to my EC2 instance so that I can do some file manipulation.
My upload looks like this:
import os
from flask import Flask, request
from flask_cors import CORS
from werkzeug.utils import secure_filename

UPLOAD_FOLDER = os.getcwd() + '/uploads'

application = Flask(__name__)
CORS(application)
application.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER


@application.route("/test-upload", methods=["POST"])
def test_upload():
    file = request.files['file']
    filename = secure_filename(file.filename)
    file.save(os.path.join(application.config['UPLOAD_FOLDER'], filename))
    # return redirect(url_for('download_file', name=filename))
    return str(os.path.join(application.config['UPLOAD_FOLDER'], filename))
If I remove the file.save() line, the function doesn't return errors. And as you can see, I'm returning the path, so that returns the correct path. But I get a 500 error from the server when I try to save the file.
I'm not sure what I'm missing here.
I can try this locally using Postman, and it works fine: the file saves to the right location. I do have the file size limit for EC2 set to 100 MB, but I am testing with a 50-byte file, so size can't be the issue. I know the file is being uploaded to the EC2 web server for sure.
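One possible cause, offered only as an assumption since the traceback isn't shown: the uploads directory built from os.getcwd() may not exist on the Elastic Beanstalk host, and file.save() raises a FileNotFoundError (which surfaces as a 500) when the target directory is missing. A minimal guard:
import os

UPLOAD_FOLDER = os.getcwd() + '/uploads'

# Create the uploads directory if it does not exist yet; file.save() fails
# when the parent directory is missing.
os.makedirs(UPLOAD_FOLDER, exist_ok=True)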

Calling my python function in VS Code Azure Function App tab (http trigger)

I'm new to Azure Function App.
I have my Python code that I want to run when the HTTP trigger is called.
I have a new project and am calling my code in "__init__.py".
What is the correct way to call my code?
Here is "__ init __.py":
import logging
import azure.functions as func
import UploadToGCS


def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    name = req.params.get('name')
    if not name:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            name = req_body.get('name')
    if name:
        UploadToGCS(UploadToGCS.upload_files)  # <--- I called it here
        return func.HttpResponse(f"Hello, {name}. This HTTP triggered function executed successfully.")
    else:
        return func.HttpResponse(
            "This HTTP triggered function executed successfully. Pass a name in the query string or in the request body for a personalized response.",
            status_code=200
        )
Currently I receive a "401 error" page.
Can you please suggest how it should be done?
Here is my Python code (I'm uploading files to a Google Cloud Storage bucket using the details in config_file = find("gcs_config.json", "C:/")):
from google.cloud import storage
import os
import glob
import json


# Finding path to config file that is called "gcs_config.json" in directory C:/
def find(name, path):
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)


def upload_files(config_file):
    # Reading 3 Parameters for upload from JSON file
    with open(config_file, "r") as file:
        contents = json.loads(file.read())
    print(contents)

    # Setting up login credentials
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = contents['login_credentials']
    # The ID of GCS bucket
    bucket_name = contents['bucket_name']
    # Setting path to files
    LOCAL_PATH = contents['folder_from']

    for source_file_name in glob.glob(LOCAL_PATH + '/**'):
        # For multiple files upload
        # Setting destination folder according to file name
        if os.path.isfile(source_file_name):
            partitioned_file_name = os.path.split(source_file_name)[-1].partition("-")
            file_type_name = partitioned_file_name[0]
            # Setting folder where files will be uploaded
            destination_blob_name = file_type_name + "/" + os.path.split(source_file_name)[-1]
            # Setting up required variables for GCS
            storage_client = storage.Client()
            bucket = storage_client.bucket(bucket_name)
            blob = bucket.blob(destination_blob_name)
            # Running upload and printing confirmation message
            blob.upload_from_filename(source_file_name)
            print("File from {} uploaded to {} in bucket {}.".format(
                source_file_name, destination_blob_name, bucket_name
            ))


config_file = find("gcs_config.json", "C:/")
upload_files(config_file)
Kind regards,
Anna
I'm replying to this as no one else did, and someone might stumble upon this thread looking for an answer.
To run your function locally from VS Code:
Initiate the function in your local environment by running this command in the terminal inside VS Code:
func init
This will create all the necessary files in your folder and a virtual environment (if you're using Anaconda, you need to configure the settings.json for VS Code so that it points to the Conda environment).
Finish your __init__.py file. Then start the function with:
func start
The function will deploy at localhost and give you a link.
If you want to deploy it in the cloud, install the Azure Functions extension; you'll have the option to log in and select your subscription. After this is done, you can deploy the functions under any Function App that was created in Azure.
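One more note, offered as an assumption based only on the code shown in the question rather than as part of this answer: UploadToGCS is a module, so the upload function inside it would be called directly instead of being passed to the module name, and a 401 response from an HTTP-triggered function usually means the function key was missing (or the trigger's authLevel is not set to anonymous). A sketch of the call, reusing the question's own helpers:
import UploadToGCS

# Call the module-level function directly; find() and upload_files() are the
# helpers defined in the question's UploadToGCS module.
config_file = UploadToGCS.find("gcs_config.json", "C:/")
UploadToGCS.upload_files(config_file)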

Get content_type from Google Cloud file

I have two API endpoints: one that takes a file from an HTTP request and uploads it to a Google Cloud bucket using the Python API, and another that downloads it again. In the first view, I get the file content type from the HTTP request and upload it to the bucket, setting that metadata:
from os import path
from google.cloud import storage

file_obj = request.FILES['file']
client = storage.Client.from_service_account_json(path.join(
    path.realpath(path.dirname(__file__)),
    '..',
    'settings',
    'api-key.json'
))
bucket = client.get_bucket('storage-bucket')
blob = bucket.blob(filename)
blob.upload_from_string(
    file_text,
    content_type=file_obj.content_type
)
Then in another view, I download the file:
...
bucket = client.get_bucket('storage-bucket')
blob = bucket.blob(filename)
blob.download_to_filename(path)
How can I access the file metadata I set earlier (content_type)? It's not available on the blob object anymore since a new one was instantiated, but the bucket still holds the file.
You should try
blob = bucket.get_blob(blob_name)
blob.content_type
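For a bit more context (not stated in the original answer, but consistent with the google-cloud-storage client): bucket.blob() only builds a local reference without contacting the API, while bucket.get_blob() issues a GET and returns a blob with its server-side metadata populated. A minimal sketch, reusing the bucket name from the question:
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('storage-bucket')

# get_blob() fetches the object's metadata from the API, so content_type
# reflects what was set at upload time.
blob = bucket.get_blob(filename)
print(blob.content_type)

# Alternatively, refresh a locally created reference before reading metadata:
# blob = bucket.blob(filename)
# blob.reload()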

Can't download from AWS boto3 generated signed url

I'm trying to upload a file to an existing AWS s3 bucket, generate a public URL and use that URL (somewhere else) to download the file.
I'm closely following the example here:
import os
import boto3
import requests
import tempfile

s3 = boto3.client('s3')

with tempfile.NamedTemporaryFile(mode="w", delete=False) as outfile:
    outfile.write("dummycontent")
    file_name = outfile.name

with open(file_name, mode="r") as outfile:
    s3.upload_file(outfile.name, "twistimages", "filekey")

os.unlink(file_name)

url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': 'twistimages',
        'Key': 'filekey'
    }
)

response = requests.get(url)
print(response)
I would expect to see a success return code (200) from the requests library.
Instead, stdout is: <Response [400]>
Also, if I navigate to the corresponding URL with a web browser, I get an XML file with an error code of InvalidRequest and an error message:
The authorization mechanism you have provided is not supported. Please
use AWS4-HMAC-SHA256.
How can I use boto3 to generate a public URL from which any user can download the file just by navigating to it, without generating complex headers?
Why does the example code from the official documentation not work in my case?
AWS S3 still supports the legacy v2 signature in the older US regions (created prior to 2014). But in newer AWS regions, only AWS4-HMAC-SHA256 (s3v4) is allowed.
To use s3v4, you must specify it explicitly in the ~/.aws/config file or during boto3 S3 resource/client instantiation, e.g.
# add this entry under ~/.aws/config
[default]
s3.signature_version = s3v4
[other profile]
s3.signature_version = s3v4
Or declare it explicitly:
s3client = boto3.client('s3', config= boto3.session.Config(signature_version='s3v4'))
s3resource = boto3.resource('s3', config= boto3.session.Config(signature_version='s3v4'))
I solved the issue. I'm using an S3 bucket in the eu-central-1 region and after specifying the region in the config file, everything worked as expected and the script stdout was <Response [200]>.
The configuration file (~/.aws/config) now looks like:
[default]
region=eu-central-1
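If you would rather not rely on ~/.aws/config, both fixes can be combined in code; a minimal sketch using the bucket, key, and region from this thread:
import boto3
from botocore.client import Config

# eu-central-1 only accepts AWS4-HMAC-SHA256 (s3v4) signatures, so pin both
# the region and the signature version on the client.
s3 = boto3.client(
    's3',
    region_name='eu-central-1',
    config=Config(signature_version='s3v4')
)

url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={'Bucket': 'twistimages', 'Key': 'filekey'}
)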

Python Boto3 AWS Multipart Upload Syntax

I am successfully authenticating with AWS and using the 'put_object' method on the Bucket object to upload a file. Now I want to use the multipart API to accomplish this for large files. I found the accepted answer in this question:
How to save S3 object to a file using boto3
But when trying to implement I am getting "unknown method" errors. What am I doing wrong? My code is below. Thanks!
## Get an AWS Session
self.awsSession = Session(aws_access_key_id=accessKey,
                          aws_secret_access_key=secretKey,
                          aws_session_token=session_token,
                          region_name=region_type)

...

# Upload the file to S3
s3 = self.awsSession.resource('s3')
s3.Bucket('prodbucket').put_object(Key=fileToUpload, Body=data)  # WORKS
# s3.Bucket('prodbucket').upload_file(dataFileName, 'prodbucket', fileToUpload)  # DOESNT WORK
# s3.upload_file(dataFileName, 'prodbucket', fileToUpload)  # DOESNT WORK
The upload_file method has not been ported over to the bucket resource yet. For now you'll need to use the client object directly to do this:
client = self.awsSession.client('s3')
client.upload_file(...)
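For reference, here is a slightly fuller sketch of the client-based call (the credential and file-name variables simply mirror the ones in the question). boto3's upload_file drives the S3 transfer manager, which switches to a multipart upload automatically once the file exceeds the configured threshold:
from boto3.session import Session
from boto3.s3.transfer import TransferConfig

# Same session-based setup as the question, shown without the class wrapper.
session = Session(aws_access_key_id=accessKey,
                  aws_secret_access_key=secretKey,
                  aws_session_token=session_token,
                  region_name=region_type)
client = session.client('s3')

# Files larger than multipart_threshold (8 MB here) are uploaded in parts.
config = TransferConfig(multipart_threshold=8 * 1024 * 1024)
client.upload_file(dataFileName, 'prodbucket', fileToUpload, Config=config)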
Libcloud's S3 wrapper transparently handles all the splitting and uploading of the parts for you.
Use the upload_object_via_stream method to do so:
from libcloud.storage.types import Provider
from libcloud.storage.providers import get_driver

# Path to a very large file you want to upload
FILE_PATH = '/home/user/myfile.tar.gz'

cls = get_driver(Provider.S3)
driver = cls('api key', 'api secret key')

container = driver.get_container(container_name='my-backups-12345')

# This method blocks until all the parts have been uploaded.
extra = {'content_type': 'application/octet-stream'}

with open(FILE_PATH, 'rb') as iterator:
    obj = driver.upload_object_via_stream(iterator=iterator,
                                          container=container,
                                          object_name='backup.tar.gz',
                                          extra=extra)
For official documentation on the S3 multipart upload feature, refer to the AWS Official Blog.
