Django Google App Engine: upload files greater than 32 MB - python

I have a Django Rest Framework project that I've integrated with django-storages to upload files to GCS. Everything works locally. However, Google App Engine imposes a hard limit of 32 MB on the size of each request, so I cannot upload any files larger than that limit.
I have looked into many posts here on Stack Overflow and elsewhere on the internet. Some of the solutions suggested using the Blobstore API, but I cannot find a way to integrate it into Django. Another solution describes the use of django-filetransfers, but that plugin is obsolete.
I would appreciate it if someone could point me toward an approach to fixing this problem.
PS: I would like to point out that the current setup works like this: a POST request sends the file up to the server, which then handles storing the file in Google Cloud Storage. Since Google App Engine restricts request size to 32 MB, I never get to the point of receiving the file. So my issue is: how can I go about uploading these large files?

According to the official documentation [1], Cloud Storage can manage files up to 5 TB in size. Nevertheless, it is recommended to take a look at the best-practices document [2]. There is also an example of how to upload objects using Python [3].
[1] https://cloud.google.com/storage/docs/json_api/v1/objects/insert
[2] https://cloud.google.com/storage/docs/best-practices#uploading
[3] https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python
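
For reference, a minimal sketch of the kind of upload shown in [3], assuming the google-cloud-storage client library; the bucket and file names are placeholders:

```python
from google.cloud import storage

def upload_blob(bucket_name, source_file_path, destination_blob_name):
    # Names here are placeholders; use your own bucket and object names.
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    # upload_from_filename transparently performs a resumable upload
    # for larger files, up to the 5 TB object size limit.
    blob.upload_from_filename(source_file_path)
```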

Related

Images disappearing (not storing correctly?) after x amount of time [duplicate]

The app I am currently hosting on Heroku allows users to submit photos. Initially, I was thinking about storing those photos on the filesystem, as storing them in the database is apparently bad practice.
However, it seems there is no permanent filesystem on Heroku, only an ephemeral one. Is this true and, if so, what are my options with regard to storing photos and other files?
It is true. Heroku allows you to create cloud apps, but those cloud apps are not "permanent" - they are instances (or "slugs") that can be replicated multiple times on Amazon's EC2 (that's why scaling is so easy with Heroku). If you push a new version of your app, the slug is recompiled, and any files you had saved to the filesystem in the previous instance are lost.
Your best bet (whether on Heroku or otherwise) is to save user-submitted photos to a CDN. Since you are on Heroku, and Heroku uses AWS, I'd recommend Amazon S3, optionally with CloudFront enabled.
This is beneficial not only because it gets around Heroku's ephemeral "limitation", but also because a CDN is much faster, and will provide a better service for your webapp and experience for your users.
Depending on the technology you're using, your best bet is likely to stream the uploads to S3 (Amazon's storage service). You can interact with S3 through a client library that makes it simple to post and retrieve files. Boto is an example client library for Python; they exist for all popular languages.
Another thing to keep in mind is that Heroku filesystems are not shared either. This means you'll have to put the file on S3 from the same application that handles the upload (instead of, say, a worker process). If you can, load the upload into memory, never write it to disk, and post directly to S3. This will increase the speed of your uploads.
Because Heroku is hosted on AWS, streams to S3 happen at very high speed. Keep that in mind when you're developing locally.
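
As an illustration of that in-memory approach, a minimal sketch using boto3 (the successor to the boto library mentioned above); the bucket name is a placeholder:

```python
import boto3

s3 = boto3.client("s3")

def store_upload(file_obj, key):
    # Stream the in-memory file object straight to S3; nothing touches disk.
    s3.upload_fileobj(file_obj, "my-bucket", key)
```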

Setting Up S3 with Heroku and Django with Images

So, I currently have my static files (JS and CSS) just being stored on Heroku, which is no biggie. However, I have objects that I need to store multiple images for, and I need to be able to fetch those images on request. How would I store a reference to those images?
I was planning to use an S3 Direct File Upload using these steps on Heroku here. Is this also going to be the best way for me to do so?
Thank you in advance.
I don't think setting up static (CSS, JS, etc.) or media (images, videos) files to be stored on S3 has anything to do with Heroku or where you deploy. Rather, it's just a matter of making sure Django knows where to save the files and where to fetch them. I would definitely not follow that link, because it seems confusing and unhelpful when working with Django.
This tutorial has really helped me, as it will show you how to set all of that up. I have gone through these steps and can confirm it does the trick. https://simpleisbetterthancomplex.com/tutorial/2017/08/01/how-to-setup-amazon-s3-in-a-django-project.html
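
In essence, that setup boils down to a few Django settings; a minimal sketch, assuming django-storages with the boto3 backend (bucket name and credentials are placeholders and should come from the environment in practice):

```python
# settings.py
INSTALLED_APPS = [
    # ... your other apps ...
    "storages",
]

AWS_ACCESS_KEY_ID = "..."        # placeholder; read from environment in practice
AWS_SECRET_ACCESS_KEY = "..."    # placeholder
AWS_STORAGE_BUCKET_NAME = "my-bucket"

# Route Django's default file storage (e.g. ImageField uploads) to S3.
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
```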
While I've gone this route in the past, I've recently opted to use Digital Ocean's one-click app - Dokku. It's based on Herokuish. I then use Dokku's persistent storage to take advantage of the 25 gigs of storage on DO's smallest, $5/month, plan. I wrote a guide to this here.

Resumable Upload and Show Percentage of Upload using Drive API v3 on Python 3

I tried to find a way to do a resumable upload, and to resume one, using the Drive API v3 on Python 3.5. I came across Google's official API guide on media upload; however, it used the files.insert method, which seems to not be available in v3.
Additionally, I plan to upload large files, so a progress bar/percentage could really help. Also, do you think I should be using chunked upload? Google's official docs seem to say there's a loss in performance.
Thank you!
the files.insert function which seems to not be available in v3.
The files.insert method was changed to files.create in v3. You can check that out in the Migration Guide.
I also planned to upload large files so a progress bar/percentage
If you want to show progress bars, check out some HTML5 and JS tutorials, like this one; there are plenty more on the web for additional samples.
do you think I should be using Chunk Upload?
Resumable upload is good for big files, as opposed to simple upload. So if you're working with large files, that's the recommended way.
Resumable upload: uploadType=resumable. For reliable transfer, especially important with larger files. With this method, you use a session initiating request, which optionally can include metadata. This is a good strategy to use for most applications, since it also works for smaller files at the cost of one additional HTTP request per upload.
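
Putting the two together, a sketch of a resumable, chunked upload with progress reporting via the Drive v3 Python client; it assumes an already-authorized service object, and the file name is a placeholder:

```python
from googleapiclient.http import MediaFileUpload

# `service` is assumed to be an authorized Drive v3 client,
# e.g. built with googleapiclient.discovery.build("drive", "v3", ...).
# Chunked, resumable upload; 1 MB chunks let us report progress between chunks.
media = MediaFileUpload("big_file.bin", resumable=True, chunksize=1024 * 1024)
request = service.files().create(body={"name": "big_file.bin"}, media_body=media)

response = None
while response is None:
    status, response = request.next_chunk()
    if status:
        print("Uploaded %d%%" % int(status.progress() * 100))
print("Upload complete, file id: %s" % response.get("id"))
```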

Storing text files > 1MB in GAE/P

I have a Google App Engine app where I need to store text files that are larger than 1 MB (the maximum entity size).
I'm currently storing them in the Blobstore, and I make use of the Files API for reading and writing them. Current operations include uploading them from a user, reading and updating them during processing, and presenting them to a user. Eventually, I would like to allow a user to edit them (likely as a Google Doc).
Are there advantages to storing such text files in Google Cloud Storage, as a Google Doc, or in some other location instead of using the Blobstore?
It really depends on what exactly you need. There are of course advantages to using one service over another, but in the end it doesn't matter much, since all of the solutions will be almost equally fast and not that expensive. If you end up with a huge amount of data after some time, you might consider switching to another solution, if only because you might save some money.
Having said that, I suggest you continue with the Blobstore API, since it does not require extra communication with external services, more secret keys, etc. Security- and speed-wise it is exactly the same. By the time you reach 10K or 100K users, you will know whether it's actually worth storing the files somewhere else. Continue with what you know best, but make sure that you're following the right practices when building on Google App Engine.
If you're already using the Files API to read and write the files, I'd recommend you use Google Cloud Storage rather than the Blobstore. GCS offers a richer RESTful API (makes it easier to do things like access control), does a number of things to accelerate serving static data, etc.
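For illustration, a minimal sketch of reading and writing a text file on GCS from App Engine via the (now legacy) cloudstorage client library; bucket and file names are placeholders:

```python
import cloudstorage

def write_text(path, text):
    # path looks like "/my-bucket/notes.txt"
    with cloudstorage.open(path, "w", content_type="text/plain") as f:
        f.write(text)

def read_text(path):
    with cloudstorage.open(path) as f:
        return f.read()
```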
Sharing data is easier in Google Docs (now Google Drive) and Google Cloud Storage. Using Google Drive, you can also use the power of Google Apps Script.

Store images temporary in Google App Engine?

I'm writing an app with Python which will check for updates on a website (let's call it A) every 2 hours; if there are new posts, it will download the images in each post, post them to another website (call it B), and then delete those images.
Site B provides an API for uploading images with a description, which looks like:
upload(image_path, description), where image_path is the path of the image on your computer.
Now I've finished the app, and I'm trying to make it run on Google App Engine (because my computer won't run 24/7), but it seems that GAE won't let you write files to its filesystem.
How can I solve this problem? Or are there other free Python hosting choices that provide a "cron job" feature?
GAE has a Blobstore API, which can work pretty much as file storage, but it's probably not what you want. Actually, the right answer depends on what kind of API you're using: it may support file-like objects, so you could pass a urllib response object, or it may accept URLs, or have tons of other interesting features.
You shouldn't need to use temporary storage at all - just download the image with urlfetch into memory, then use another urlfetch to upload it to the destination site.
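
As a sketch of that in-memory approach, using App Engine's urlfetch API; the upload endpoint and content type are placeholders for site B's real API:

```python
from google.appengine.api import urlfetch

def mirror_image(source_url, upload_url):
    # Download the image into memory; nothing is written to disk.
    result = urlfetch.fetch(source_url)
    if result.status_code == 200:
        # Re-post the raw bytes to the destination site.
        urlfetch.fetch(
            upload_url,
            payload=result.content,
            method=urlfetch.POST,
            headers={"Content-Type": "image/jpeg"},  # assumed content type
        )
```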

Categories

Resources