Serving files from BlobStore in GAE - python

Can I download files (zip files especially) from the Blobstore in Google App Engine without using handler classes? I mean serving files directly, without a download handler class.
Any ideas?

No (if I understand the question properly). There is no direct URL for Blobstore items, so you can't get at them directly. However, you can serve blobs from URLs that you define with fewer than 10 lines of code.
EDIT: send_blob also takes a save_as argument. Try save_as=True to use the blob's uploaded filename as the attachment filename.

Related

Images are not loading from an Amazon S3 bucket

I am working on a Django project. I have hosted my static files on Amazon S3, and they were uploaded successfully. But the images do not load when I run the server.
When I inspect the image element, it shows:
https://django-ecommerce-files.s3.amazonaws.com/images/logo.png%22%20id=%22image%22%20style=%22width:%2040px;%20height:40px%22%3E
When I double-click it, it shows this error:
<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>07PX6KHYASHT3008</RequestId>
<HostId>pJCxChq1JHlw/GL0Zy/W+PvX1TevOf/C60Huyidi8+0GMAs8geYlXSrEgo6m9vllL0PouTn6NAA=
</HostId>
</Error>
When working with an S3 bucket, you need to make your resources (files) publicly accessible. You can do that either programmatically at the point of uploading to S3 or from the AWS console. Please check how you can enable public access to your files here
Make sure that you have changed the public access settings for the S3 bucket, such that it allows files to be accessed by your app (with the right credentials).
Your requirement may vary, so take a look at their user manual.
Check the Permissions tab under the bucket.
Or you can take a look at the actions allowed on your S3 bucket; it must be configured to allow read/write. Refer to the docs for a few examples.
I'm not sure if this works, but can you try enabling static hosting on your S3 bucket?
Go to your S3 bucket.
Go to Properties and scroll down to the bottom.
Enable static hosting.
A PNG file on S3 would look like this (the link works, by the way):
https://aws-cicd-react.s3.ap-southeast-1.amazonaws.com/logo512.png
Addendum:
If you want to see the URL of your file:
In your S3 bucket, click the file, then go to Properties.
Look at the Object URL.

Create a zip with files from an AWS S3 path

Is there a way to provide a single URL for a user to download all the content from an S3 path?
Otherwise, is there a way to create a zip with all files found on an S3 path recursively?
For example: my-bucket/media/123/*
Each path usually has 1K+ images and 10+ videos.
There's no built-in way. You have to download all the files, compress them "locally", and re-upload the archive; then you'll have a single URL for download.
As mentioned before, there's no built-in way to do it. On the other hand, you don't need to download your files and upload them back. You could create a serverless solution in the same AWS region/location.
You could implement it in different ways:
API Gateway + Lambda Function
In this case, you trigger your Lambda function via API Gateway. The Lambda function creates an archive from your bucket's files and uploads the result back to S3. The Lambda function then returns the URL of this archive***.
Drawbacks of this approach: Lambda has a maximum execution time (5 minutes when this answer was written; the limit has since been raised to 15 minutes), so if you have too many files it may not have enough time to process them. Be aware that the maximum S3 object size is 5 terabytes, and the largest object that can be uploaded in a single PUT is 5 gigabytes. For objects larger than 100 megabytes, you should consider using the Multipart Upload capability.
Example: Full guide to developing REST API’s with AWS API Gateway and AWS Lambda
Step Function (API Gateway + Lambda Function that calls Step Function)
5 min should be enough to create an archive, but if you are going to do some preprocessing I recommend using Step Functions. Step Functions limits the maximum number of registered activities/states and the request size (you can't pass your archive in a request), but these limits are easy to work around if you take them into consideration during design. Check out more there.
Personally, I am using both ways for different cases.
*** It is bad practice to give the user the path to your real file on S3. It is better to use the CloudFront CDN. CloudFront lets you control the lifetime of the URL and provides different security options and restrictions.
There is no single call you can make to S3 to download a path as a .zip. You would have to create a service that recursively downloads all of the objects and compresses them. It is important to keep the size limit of S3 objects in mind, though: the limit is 5 TB per object. You will want to add a check that verifies the size of the .zip before re-uploading it.
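A minimal sketch of the archive step that the answers above describe, using only Python's standard library. It assumes the objects have already been fetched as (key, bytes) pairs; the `build_zip` name and the sample keys are hypothetical, and the S3 listing, download, and re-upload calls are deliberately left out.

```python
import io
import zipfile


def build_zip(objects, prefix=""):
    """Pack (key, data) pairs into an in-memory zip archive.

    `objects` is any iterable of (key, bytes) pairs, for example the
    objects found under an S3 path (fetching them is not shown here).
    `prefix` is stripped from each key so that paths inside the archive
    are relative to the requested S3 path.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for key, data in objects:
            name = key[len(prefix):] if key.startswith(prefix) else key
            archive.writestr(name, data)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample data standing in for downloaded S3 objects.
    sample = [
        ("media/123/a.jpg", b"fake image bytes"),
        ("media/123/clips/b.mp4", b"fake video bytes"),
    ]
    payload = build_zip(sample, prefix="media/123/")
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        print(archive.namelist())
```

For a path holding 1K+ images and several videos, writing the archive to a temporary file on disk is probably a safer choice than an in-memory buffer.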

How does a user of an AppEngine app upload a large file if the app is using GCS?

With the old Blobstore, blobstore_handlers.BlobstoreUploadHandler took care of long requests; for example, the user could upload a 100 MB file, taking 2 minutes.
However, I have not seen any example or documentation that explains a similar process using Google Cloud Storage. All the examples provided just upload sample text files from App Engine to GCS; there are no "user -> App Engine -> GCS" examples. I hope I didn't miss anything.
You can upload a file to GCS using the Blobstore API by specifying the gs_bucket_name argument of the create_upload_url function.

What's a Django/Python solution for providing a one-time url for people to download files?

I'm looking for a way to sell someone a card at an event that will have a unique code that they will be able to use later in order to download a file (mp3, pdf, etc.) only one time and mask the true file location so a savvy person downloading the file won't be able to download the file more than once. It would be nice to host the file on Amazon S3 to save on bandwidth where our server is co-located.
My thought for the codes would be to pre-generate the unique codes that will get printed on the cards and store those in a database that could also have a field that stores the number of times the file was downloaded. This way we could set how many attempts we would allow the user for downloading the file.
The part that I need direction on is how do I hide/mask the original file location so people can't steal that url and then download the file as many times as they want. I've done Google searches and I'm either not searching using the right keywords or there aren't very many libraries or snippets out there already for this type of thing.
I'm guessing that I might be able to rig something up using django.views.static.serve that acts as a sort of proxy between the actual file and the user downloading the file. The only drawback to this method I would think is that I would need to use the actual web server and wouldn't be able to store the file on Amazon S3.
Any suggestions or thoughts are greatly appreciated.
Neat idea. However, I would warn against the single-download method, because there is no guarantee that their first download attempt will be successful. Perhaps use a time-expiration method instead?
But it is certainly possible to do this with Django. Here is an outline of the basic approach:
Set up a Django URL for serving these files.
Use a GET parameter which is a unique string to identify which file to get.
Keep a database table which has a FileField for the file to download. This table maps the unique strings to the location of the file on the file system.
To serve the file as a download, set the response headers in the view like this:
(path is the location of the file to serve)
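from django.http import HttpResponse

# Inside the view function:
with open(path, 'rb') as f:
    # Read the whole file into the response body.
    response = HttpResponse(f.read())
response['Content-Type'] = 'application/octet-stream'
# Ask the browser to save the body as a download with this filename.
response['Content-Disposition'] = 'attachment; filename="%s"' % 'insert_filename_here'
return response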
Since we are using this Django page to serve the file, the user cannot find out the original file location.
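The two ideas in this thread, a per-code download counter and a time-limited link, could be combined roughly as follows. This is a stdlib-only sketch rather than Django code: `SECRET_KEY`, `make_token`, `check_token`, and `redeem` are hypothetical names, and the plain dict stands in for the database table of pre-generated card codes.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: known only to the server


def make_token(code, expires_at):
    """Sign a card code together with its expiry timestamp (Unix seconds)."""
    message = f"{code}:{expires_at}".encode()
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{code}:{expires_at}:{signature}"


def check_token(token, now=None):
    """Return the card code if the token is authentic and unexpired, else None."""
    try:
        code, expires_at, signature = token.rsplit(":", 2)
        expires_at = int(expires_at)
    except ValueError:
        return None
    expected = make_token(code, expires_at).rsplit(":", 1)[1]
    # Constant-time comparison of the supplied and expected signatures.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    current = time.time() if now is None else now
    return code if current <= expires_at else None


def redeem(code, downloads, max_downloads=3):
    """Count one download attempt; return True while the code is under its limit."""
    if code not in downloads or downloads[code] >= max_downloads:
        return False
    downloads[code] += 1
    return True


if __name__ == "__main__":
    downloads = {"CARD-0001": 0}  # pre-generated code with no downloads yet
    token = make_token("CARD-0001", int(time.time()) + 3600)
    code = check_token(token)
    print(code, redeem(code, downloads))
```

The download view would then accept the token as its GET parameter, call `check_token` and `redeem`, and only serve the file when both succeed.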
You can just use something simple such as mod_xsendfile. This functionality is also available in other popular web servers such as lighttpd or nginx.
It works like this: when enabled, your application (e.g. a trivial PHP script) can send a special response header, causing the web server to serve a static file.
If you want it to work with S3 you will need to handle each and every request this way, meaning the traffic will go through your site, from there to AWS, back to your site and back to the client. Does S3 support symbolic links / aliases? If so you might just redirect a valid user to one of the symbolic URLs and delete that symlink after a couple of hours.

Asynchronous File Upload to Amazon S3 with Django

I am using this file storage engine to store files to Amazon S3 when they are uploaded:
http://code.welldev.org/django-storages/wiki/Home
It takes quite a long time to upload because the file must first be uploaded from client to web server, and then web server to Amazon S3 before a response is returned to the client.
I would like to make the process of sending the file to S3 asynchronous, so the response can be returned to the user much faster. What is the best way to do this with the file storage engine?
Thanks for your advice!
I've taken another approach to this problem.
My models have two file fields: one uses the standard file storage backend and the other uses the S3 file storage backend. When the user uploads a file, it gets stored locally.
I have a management command in my application that uploads all the locally stored files to S3 and updates the models.
So when a request comes in for the file, I check whether the model object uses the S3 storage field. If so, I send a redirect to the correct URL on S3; if not, I send a redirect so that nginx can serve the file from disk.
This management command can of course be triggered by any event: a cron job or whatever.
It's possible to have your users upload files directly to S3 from their browser using a special form (with a signed, base64-encoded policy document in a hidden field). They will be redirected back to your application once the upload completes.
More information here: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1434
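A sketch of how the hidden policy field and its signature could be produced on the server, assuming the legacy scheme in which the base64-encoded policy is signed with HMAC-SHA1 and the AWS secret key. The function name, bucket, and key prefix are hypothetical. Current AWS documentation describes Signature Version 4 for POST uploads, so treat this as an illustration of the idea rather than a recipe.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def sign_post_policy(secret_key, bucket, key_prefix, max_bytes,
                     lifetime_minutes=10, now=None):
    """Return (policy, signature) strings for the hidden form fields."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(minutes=lifetime_minutes)
    policy = {
        "expiration": expires.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
        "conditions": [
            {"bucket": bucket},
            ["starts-with", "$key", key_prefix],       # restrict object keys
            ["content-length-range", 0, max_bytes],    # restrict upload size
        ],
    }
    # The policy is base64-encoded, then that encoded form is signed.
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8"))
    digest = hmac.new(secret_key.encode("utf-8"), policy_b64, hashlib.sha1).digest()
    return policy_b64.decode("ascii"), base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    policy, signature = sign_post_policy(
        "example-secret", "my-bucket", "uploads/", 10 * 1024 * 1024)
    print(policy)
    print(signature)
```

The secret key never leaves the server; the browser only receives the encoded policy and its signature.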
There is an app for that :-)
https://github.com/jezdez/django-queued-storage
It does exactly what you need, and much more, because you can set any "local" storage and any "remote" storage. This app stores your file in fast "local" storage (for example MogileFS storage) and then, using Celery (django-celery), attempts an asynchronous upload to the "remote" storage.
A few remarks:
The tricky thing is that you can set it up to use either a copy-and-upload strategy or an upload-and-delete strategy, which deletes the local file once it has been uploaded.
The second tricky thing is that it serves the file from "local" storage until the upload has completed.
It can also be configured to retry a number of times on upload failures.
Installation & usage is also very simple and straightforward:
pip install django-queued-storage
append to INSTALLED_APPS:
INSTALLED_APPS += ('queued_storage',)
in models.py:
from django.db import models
from queued_storage.backends import QueuedStorage

# Store the file locally first, then transfer it to S3 in a background task.
queued_s3storage = QueuedStorage(
    'django.core.files.storage.FileSystemStorage',
    'storages.backends.s3boto.S3BotoStorage',
    task='queued_storage.tasks.TransferAndDelete')

class MyModel(models.Model):
    my_file = models.FileField(upload_to='files', storage=queued_s3storage)
You could decouple the process:
The user selects a file to upload and sends it to your server. After this, they see a page saying "Thank you for uploading foofile.txt, it is now stored in our storage backend".
When the user has uploaded the file, it is stored in a temporary directory on your server and, if needed, some metadata is stored in your database.
A background process on your server then uploads the file to S3. This is only possible if you have full access to your server, so that you can create some kind of daemon to do this (or simply use a cron job).*
The page that is displayed polls asynchronously and shows some kind of progress bar to the user (or a simple "please wait" message). This is only needed if the user should be able to use the file (put it in a message, or something like that) directly after uploading.
[*: If you only have shared hosting, you could possibly build a solution that uses a hidden iframe in the user's browser to start a script which then uploads the file to S3.]
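The decoupled flow above can be sketched with a queue and a worker thread. This is an in-process stand-in for the daemon or cron job the answer mentions, not a production setup: `start_uploader` and the `upload` callable are hypothetical, and the `statuses` dict plays the role of the metadata that the polling page would read.

```python
import queue
import threading


def start_uploader(upload, pending, statuses):
    """Start a daemon thread that uploads queued file paths one by one.

    `upload` is any callable taking a path (for example a function that
    sends the file to S3); `statuses` records the outcome per path.
    """
    def worker():
        while True:
            path = pending.get()
            try:
                upload(path)
                statuses[path] = "done"
            except Exception:
                statuses[path] = "failed"
            finally:
                pending.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    pending, statuses = queue.Queue(), {}
    start_uploader(lambda path: print("uploading", path), pending, statuses)
    statuses["/tmp/foofile.txt"] = "pending"   # what the polling page sees first
    pending.put("/tmp/foofile.txt")
    pending.join()                             # wait until the worker finishes
    print(statuses)
```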
You can directly upload media to the s3 server without using your web application server.
See the following references:
Amazon API Reference : http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?UsingHTTPPOST.html
A django implementation : https://github.com/sbc/django-uploadify-s3
As some of the answers here suggest uploading directly to S3, here's a Django S3 Mixin using plupload:
https://github.com/burgalon/plupload-s3mixin
I encountered the same issue with uploaded images. You cannot pass along files to a Celery worker because Celery needs to be able to pickle the arguments to a task. My solution was to deconstruct the image data into a string and get all other info from the file, passing this data and info to the task, where I reconstructed the image. After that you can save it, which will send it to your storage backend (such as S3). If you want to associate the image with a model, just pass along the id of the instance to the task and retrieve it there, bind the image to the instance and save the instance.
When a file has been uploaded via a form, it is available in your view as an UploadedFile file-like object. You can get it directly out of request.FILES or, better, first bind it to your form, run is_valid, and retrieve the file-like object from form.cleaned_data. At that point you at least know it is the kind of file you want it to be. After that you can get the data using read(), and get the other info using other methods/attributes. See https://docs.djangoproject.com/en/1.4/topics/http/file-uploads/
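A rough sketch of the "deconstruct, pass, reconstruct" step, using only the standard library. The helper names `pack_upload` and `unpack_upload` are hypothetical; in a view they would presumably be fed from the UploadedFile's name, content_type, and read(), and the resulting dict of plain strings would be the task argument.

```python
import base64
import io


def pack_upload(name, content_type, data):
    """Turn an uploaded file into a dict of plain strings that is easy to serialize."""
    return {
        "name": name,
        "content_type": content_type,
        "data": base64.b64encode(data).decode("ascii"),
    }


def unpack_upload(payload):
    """Rebuild a file-like object from the dict produced by pack_upload."""
    stream = io.BytesIO(base64.b64decode(payload["data"]))
    stream.name = payload["name"]  # many file consumers look for a .name attribute
    return stream


if __name__ == "__main__":
    payload = pack_upload("logo.png", "image/png", b"\x89PNG fake bytes")
    restored = unpack_upload(payload)
    print(restored.name, len(restored.read()))
```

Inside the task, the restored object can be wrapped in whatever file type the storage backend expects before saving.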
I actually ended up writing and distributing a little package to save an image asynchronously. Have a look at https://github.com/gterzian/django_async. Right now it's just for images, but you could fork it and add functionality for your situation. I'm using it with https://github.com/duointeractive/django-athumb and S3.
