Resizing images uploaded to S3 using Pillow - Python

I have a Chalice application with a defined lambda_handler that is triggered by S3 event notifications. Every time an image is created in my S3 bucket, the lambda_handler function is invoked to create thumbnails. But when you upload images to S3 using presigned URLs, the uploaded file does not have a file extension.
The files in S3 look like this:
Now when using Pillow, an "unknown file extension" error is thrown.
How should I go about this?

Do you have access to the function that is in charge of performing the image upload?
If you're using a pre-signed post to perform the image upload, you should also specify the file extension within the object_name parameter.
response = s3_client.generate_presigned_post(
    BUCKET_NAME,
    "5eafba9fa31dd3bcc190a52.jpg",
    Fields=fields,
    Conditions=conditions,
    ExpiresIn=expiration)
This will cause images to be uploaded with their proper extensions, therefore allowing any subsequent invocation to have the proper file extension.
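For completeness, here is a rough sketch of how the client side could then use the presigned POST response; the requests library and the local file path are assumptions, not part of the original question:

import requests

# response comes from generate_presigned_post() above; it contains the target
# URL plus the form fields that must accompany the upload.
with open("local_image.jpg", "rb") as f:
    files = {"file": ("5eafba9fa31dd3bcc190a52.jpg", f)}
    r = requests.post(response["url"], data=response["fields"], files=files)

print(r.status_code)  # S3 returns 204 on a successful POST upload by default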
If you only have access to your Chalice application, and there is a guarantee that the file will always be of a certain type, you can append the extension yourself prior to using Pillow.
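As a sketch of that workaround inside the handler (the thumbnail size, key suffix, and the assumption that every upload is a JPEG are mine, not from the question): Pillow detects the format from the file contents when opening, so the missing extension only bites when saving, and passing format= explicitly sidesteps it.

import io

import boto3
from PIL import Image

s3 = boto3.client('s3')

def make_thumbnail(bucket, key):
    # Pillow sniffs the image format from the bytes, so the missing
    # extension does not matter when opening.
    obj = s3.get_object(Bucket=bucket, Key=key)
    image = Image.open(io.BytesIO(obj['Body'].read()))
    image.thumbnail((128, 128))

    # Passing format= explicitly avoids the "unknown file extension" error on save.
    buffer = io.BytesIO()
    image.save(buffer, format='JPEG')
    buffer.seek(0)
    s3.put_object(Bucket=bucket, Key=key + '_thumb.jpg', Body=buffer,
                  ContentType='image/jpeg')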

Related

AWS: Make a file downloadable by https link

I have a local .exe file and I want to make it available over HTTPS so everyone can download it.
Example: "download my app here: https://look_how_downloadable_i_am.exe"
If I could update the file both with Python and manually through an interface, it would be perfect (the possibility to automate the process, while keeping it simple when done manually).
It's maybe possible with AWS S3 and/or Lambda.
The most straightforward way would be to use an S3 bucket and enable downloads of the file.
Steps are:
Upload file to the bucket
Select the file after it gets uploaded, press actions and select make public
This will make the file publicly downloadable through its unique link. In order to use your own custom domain and link you will have to use CloudFront, as @jordanm suggested.
You can also use a Python script to update or download your file; you can find demo code and documentation in Reference 3.
Reference 1: How to create download link for an Amazon S3 bucket's object?
Reference 2: https://aws.amazon.com/premiumsupport/knowledge-center/read-access-objects-s3-bucket/
Reference 3: https://docs.aws.amazon.com/code-samples/latest/catalog/code-catalog-python-example_code-s3.html
You can use boto3 to programmatically upload a local file to a bucket, then just edit the bucket's permissions to allow public read. Or, instead of editing the bucket's permissions, set the ACL when uploading the file: s3.upload_file(upload_path, "bucket-name", file_key, ExtraArgs={'ACL': "public-read"})
Here upload_path is just the local file path, and file_key is the object name.
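Putting the two steps together, a minimal sketch (bucket name and paths are placeholders, and it assumes the bucket allows ACLs / public access):

import boto3

s3 = boto3.client('s3')

upload_path = "/path/to/my_app.exe"   # local file
file_key = "downloads/my_app.exe"     # object name in the bucket

# Upload and mark the object as publicly readable in one call.
s3.upload_file(upload_path, "bucket-name", file_key, ExtraArgs={"ACL": "public-read"})

# The object is then reachable at a URL of this form (exact host depends on the region):
print(f"https://bucket-name.s3.amazonaws.com/{file_key}")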

How to view S3 bucket video files in the browser

In my current project, my objective is to access video files (in .mp4) from an AWS S3 bucket.
I have created an S3 bucket named videostreambucketpankesh. This is a public bucket with the following permissions:
The Access Control List (ACL) of the videostreambucketpankesh bucket is as follows:
The bucket policy of the videostreambucketpankesh bucket is as follows:
Now the bucket “videostreambucketpankesh” contains many subfolders, including one named “video”. This subfolder contains some .mp4 files (as shown in the image below).
My problem is that some files (such as firetruck.mp4 and ambulance.mp4) can be accessed directly in the browser when I click their Object URL, and I can play them in the browser.
However, I am not able to play other .mp4 files (39cf9079-7b65-4aa8-8913-8a6b924021d3.mp4, 45fd1749-95aa-488c-ac2f-be8673b8416e.mp4, 8ba187f2-5148-49f6-9acc-2459e41f547b.mp4) in the browser when I click their Object URL.
Please note that I uploaded those three video files programmatically with Python (see the following code).
def upload_to_s3(local_file, bucket, s3_file):
    # Upload the local .mp4 under the "video/" prefix
    with open(local_file, 'rb') as data:
        s3_client.put_object(Key="video/" + s3_file + ".mp4", Body=data,
                             ContentType='video/mp4', Bucket=bucket)
    print("Upload successful")
However, I am not able to play those .mp4 files in my Google Chrome browser (they do play in VLC). Can you please suggest how I can resolve this issue?
Select the files and look at Properties / Metadata.
It should show Content-Type : video/mp4 like this:
When uploading via the browser, the metadata is automatically set based upon the filetype.
If you are uploading via your own code, you can set the metadata like this:
s3_client.upload_file('video.mp4', bucketname, key, ExtraArgs={'ContentType': "video/mp4"})
or
bucket.put_object(Key=key, Body=data, ContentType='video/mp4')
See: AWS Content Type Settings in S3 Using Boto3
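If the files are already in the bucket with the wrong Content-Type, one way to fix them in place is a self-copy that replaces the metadata; a hedged sketch (the bucket and key come from the question, the rest is illustrative):

import boto3

s3_client = boto3.client('s3')

bucket = "videostreambucketpankesh"
key = "video/39cf9079-7b65-4aa8-8913-8a6b924021d3.mp4"

# Copy the object onto itself, replacing its metadata with the correct Content-Type.
s3_client.copy_object(
    Bucket=bucket,
    Key=key,
    CopySource={"Bucket": bucket, "Key": key},
    ContentType="video/mp4",
    MetadataDirective="REPLACE",  # required, otherwise the old metadata is kept
)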

Is there a way to upload large files to AWS Lambda functions using Amazon S3

I have written a python file which produces specific sentences that I would like to use as part of an Alexa skill. This then produces an array of these sentences. I am now trying to create a lambda function which implements these sentences in an Alexa based game format. However, the python file contains many imports which are not native to lambda so cannot be used.
After much reading I attempted to install the import packages/dependencies using the following code (one library as an example):
pip3 install spacy -t .
I then zip the contents of my folder using the following code, before uploading the zip file into a Amazon s3 bucket:
zip -r ../zipped_dir.zip *
This worked initially, but once I began installing many imports the zip file quickly exceeded 100 MB, so I am required to upload the zip file using the AWS CLI, AWS SDK, or Amazon S3 REST API. I have tried several methods of doing this, and the zip file uploads successfully, but when I attempt to integrate it into my Lambda function using the 'Code entry type': 'Upload a file from Amazon S3' and provide the correct URL, the function does not allow me to save it. I click save and it attempts to do so, but remains orange. I believe this is because the file is too large. I am sure the upload method is correct because I re-tested the process using a zip file smaller than 100 MB, uploaded it to the Lambda function with the associated URL, and that saved successfully.
In order to upload the large file I have tried the following methods. I have used a combination of flask and boto3 in this example.
import boto3
from flask import Flask, request

app = Flask(__name__)

@app.route('/')
def index():
    # Simple upload form posting to the /upload route
    return '''<form method=POST enctype=multipart/form-data action="/upload">
      <input type=file name=myfile>
      <input type=submit>
    </form>'''

@app.route('/upload', methods=['POST'])
def upload():
    s3 = boto3.resource('s3')
    # The FileStorage object is file-like, so it can be streamed to S3 directly
    s3.Bucket('bucket').put_object(Key='file.zip', Body=request.files['myfile'])
    return 'File saved to S3'

if __name__ == '__main__':
    app.run(debug=True)
This method also uploads the file successfully into the bucket, but I am unable to save it with the lambda function and URL.
I have also tried to do it from the terminal with this command, which also uploads into the bucket successfully but cannot be saved on the lambda function:
aws s3 cp /foldername s3://bucketname/ --recursive --include "myzip.zip"
I am sure manually installing all the files for the associated imports is not the most optimal method, so any suggested alternatives would be helpful. If there is also a way to run the Python file elsewhere and pass the array of strings into the Lambda function that runs on an Alexa device, that would also be helpful. I have been stuck on this for almost a week and I'm sure the solution is rather simple, so any help is much appreciated. Thank you.
You're hitting the Lambda limits with your archive size > 50 MB; that's why your current attempts were unsuccessful.
From the docs:
Deployment package size
50 MB (zipped, for direct upload)
250 MB (unzipped, including layers)
3 MB (console editor)
If you have larger dependencies I suggest you look into using Lambda Layers they basically provide a way to separate your dependencies from your main code.
To make your life easier I recommend you look into using the open source Serverless Framework which makes deploying Lambda functions quite easy. I use the Serverless Framework in combination with the serverless-python-requirements plugin to separate my code from my requirements and deploy the requirements as a Lambda Layer.
Note: Make sure that your unzipped requirements and code stay below 250MB, otherwise you'll hit another limit.
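If you'd rather stay with plain boto3 instead of the Serverless Framework, here is a rough sketch of publishing the zipped dependencies (already sitting in S3) as a layer and attaching it to the function. The bucket, key, function name, and runtime are placeholders, and note that a layer zip is expected to contain its packages under a python/ directory:

import boto3

lambda_client = boto3.client('lambda')

# Publish the dependency zip from S3 as a new layer version.
layer = lambda_client.publish_layer_version(
    LayerName='alexa-skill-deps',
    Content={'S3Bucket': 'bucketname', 'S3Key': 'myzip.zip'},
    CompatibleRuntimes=['python3.8'],
)

# Attach the layer to the function so its own code package stays small.
lambda_client.update_function_configuration(
    FunctionName='my-alexa-skill',
    Layers=[layer['LayerVersionArn']],
)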

Python Flask Uploading Files

I'm trying to upload a user-selected image to my Firebase storage.
When I browse for the file
file = request.files['inputFile']
and I try this
storage.child("images/examples.jpg").put(file)
I get an error
io.UnsupportedOperation: fileno
How do I go about fixing this? I just want the user to select a file so that I can take the .jpg file and upload it.
The put method takes a path to a local file (and an optional user token).
request.files[key] returns a custom object that represents the uploaded file. Flask documentation links: file uploads quickstart, incoming request data api, FileStorage class.
You need to store the uploaded file data to a local file, and then pass that file name to the put method:
request.files['inputFile'].save("some_filename.ext")
storage.child("images/examples.jpg").put("some_filename.ext")
Look into the tempfile module to generate random temporary file names (instead of using the hard coded some_filename.ext, which obviously is not a very good idea with concurrent requests).
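A minimal sketch of that, assuming a Pyrebase-style storage object as in the question (the helper name and the .jpg suffix are mine):

import os
import tempfile

def upload_to_firebase(flask_file, storage):
    # Create a uniquely named temp file, close it, then let Flask write into it.
    tmp = tempfile.NamedTemporaryFile(suffix=".jpg", delete=False)
    tmp.close()
    try:
        flask_file.save(tmp.name)
        # put() expects a local file path, which we now have.
        storage.child("images/example.jpg").put(tmp.name)
    finally:
        os.remove(tmp.name)

# Usage inside the view:
# upload_to_firebase(request.files['inputFile'], storage)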

Asynchronous File Upload to Amazon S3 with Django

I am using this file storage engine to store files to Amazon S3 when they are uploaded:
http://code.welldev.org/django-storages/wiki/Home
It takes quite a long time to upload because the file must first be uploaded from client to web server, and then web server to Amazon S3 before a response is returned to the client.
I would like to make the process of sending the file to S3 asynchronous, so the response can be returned to the user much faster. What is the best way to do this with the file storage engine?
Thanks for your advice!
I've taken another approach to this problem.
My models have two file fields: one uses the standard file storage backend and the other one uses the S3 file storage backend. When the user uploads a file it gets stored locally.
I have a management command in my application that uploads all the locally stored files to S3 and updates the models.
So when a request comes in for the file, I check whether the model object uses the S3 storage field; if so I send a redirect to the correct URL on S3, if not I send a redirect so that nginx can serve the file from disk.
This management command can of course be triggered by any event, a cronjob or whatever.
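A very rough sketch of such a management command, with an invented Upload model that has local_file and s3_file fields (none of these names come from the answer):

from django.core.management.base import BaseCommand

from myapp.models import Upload  # hypothetical model with local_file / s3_file fields

class Command(BaseCommand):
    help = "Push locally stored uploads to S3"

    def handle(self, *args, **options):
        for upload in Upload.objects.filter(s3_file=""):
            upload.local_file.open("rb")
            try:
                # Saving into the S3-backed field sends the bytes to S3
                # and records the new name on the model.
                upload.s3_file.save(upload.local_file.name, upload.local_file, save=True)
            finally:
                upload.local_file.close()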
It's possible to have your users upload files directly to S3 from their browser using a special form (with an encrypted policy document in a hidden field). They will be redirected back to your application once the upload completes.
More information here: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1434
There is an app for that :-)
https://github.com/jezdez/django-queued-storage
It does exactly what you need - and much more, because you can set any "local" storage and any "remote" storage. This app will store your file in fast "local" storage (for example MogileFS) and then, using Celery (django-celery), it will attempt asynchronous uploading to the "remote" storage.
A few remarks:
The tricky thing is that you can set it up with either a copy-and-upload strategy, or an upload-and-delete strategy that deletes the local file once it has been uploaded.
Second tricky thing: it will serve the file from "local" storage until it has been uploaded.
It can also be configured to retry a number of times on upload failures.
Installation & usage is also very simple and straightforward:
pip install django-queued-storage
append to INSTALLED_APPS:
INSTALLED_APPS += ('queued_storage',)
in models.py:
from django.db import models
from queued_storage.backends import QueuedStorage

queued_s3storage = QueuedStorage(
    'django.core.files.storage.FileSystemStorage',
    'storages.backends.s3boto.S3BotoStorage',
    task='queued_storage.tasks.TransferAndDelete')

class MyModel(models.Model):
    my_file = models.FileField(upload_to='files', storage=queued_s3storage)
You could decouple the process:
The user selects a file to upload and sends it to your server. After this they see a page: "Thank you for uploading foofile.txt, it is now stored in our storage backend."
When the user has uploaded the file it is stored in a temporary directory on your server and, if needed, some metadata is stored in your database.
A background process on your server then uploads the file to S3. This is only possible if you have full access to your server so you can create some kind of "daemon" to do this (or simply use a cronjob).*
The page that is displayed polls asynchronously and shows some kind of progress bar to the user (or a simple "please wait" message). This is only needed if the user should be able to "use" the file (put it in a message, or something like that) directly after uploading.
[*: In case you only have shared hosting, you could possibly build a solution which uses a hidden iframe in the user's browser to start a script which then uploads the file to S3.]
You can directly upload media to the s3 server without using your web application server.
See the following references:
Amazon API Reference : http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?UsingHTTPPOST.html
A django implementation : https://github.com/sbc/django-uploadify-s3
As some of the answers here suggest uploading directly to S3, here's a Django S3 Mixin using plupload:
https://github.com/burgalon/plupload-s3mixin
I encountered the same issue with uploaded images. You cannot pass along files to a Celery worker because Celery needs to be able to pickle the arguments to a task. My solution was to deconstruct the image data into a string and get all other info from the file, passing this data and info to the task, where I reconstructed the image. After that you can save it, which will send it to your storage backend (such as S3). If you want to associate the image with a model, just pass along the id of the instance to the task and retrieve it there, bind the image to the instance and save the instance.
When a file has been uploaded via a form, it is available in your view as an UploadedFile file-like object. You can get it directly out of request.FILES, or, better, first bind it to your form, run is_valid(), and retrieve the file-like object from form.cleaned_data. At that point at least you know it is the kind of file you want it to be. After that you can get the data using read(), and get the other info using other methods/attributes. See https://docs.djangoproject.com/en/1.4/topics/http/file-uploads/
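A condensed sketch of that flow (the task name, model, and field names are invented for illustration):

from celery import shared_task
from django.core.files.base import ContentFile

@shared_task
def save_uploaded_image(instance_id, image_bytes, filename):
    from myapp.models import Photo  # hypothetical model with an ImageField
    photo = Photo.objects.get(pk=instance_id)
    # Rebuild a Django file from the raw bytes; saving the field hands the
    # data to the configured storage backend (e.g. S3).
    photo.image.save(filename, ContentFile(image_bytes), save=True)

# In the view, after form.is_valid():
# f = form.cleaned_data['image']
# save_uploaded_image.delay(photo.pk, f.read(), f.name)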
I actually ended up writing and distributing a little package to save an image asynchronously. Have a look at https://github.com/gterzian/django_async. Right now it's just for images, but you could fork it and add functionality for your situation. I'm using it with https://github.com/duointeractive/django-athumb and S3.
