GCS, only user who logged in can see images in my mosaic - python

I am building a mosaic using Google Cloud Storage, and I store the user-uploaded images in a bucket. However, to see any images, you must first log in (with a Google account). I have set the bucket to public, and any user who has already logged in can view the images. But I want the images to show up even if they haven't logged in. What should I do?
I store the URL of the image in my database:
# This part is in the return of my image upload function:
imageurl = 'https://storage.cloud.google.com/fortest098.appspot.com/{}'.format(mosaicLocation)
# And this is how I put my image into the database:
ImageInfo(..., image_url=imageurl).put()
Am I supposed to get a public URL or something?

By default objects uploaded to GCS are access controlled.
You can make your objects publicly readable by setting a predefinedAcl at upload time or after they are uploaded (e.g., see https://cloud.google.com/storage/docs/json_api/v1/objects/update).
You can also do this using the gsutil command:
gsutil acl set public-read gs://your-bucket/your-object
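For example, with the google-cloud-storage Python client you can flip an existing object to public-read and store the URL it is served from anonymously (a minimal sketch; the bucket and object names are the ones from the question). Note also that storage.cloud.google.com is the authenticated browser endpoint, which is why it prompts for a Google login; blob.public_url returns the storage.googleapis.com form, which works without logging in once the object is public:
# A minimal sketch using the google-cloud-storage client.
from google.cloud import storage

client = storage.Client()
bucket = client.bucket('fortest098.appspot.com')
blob = bucket.blob(mosaicLocation)  # mosaicLocation comes from the upload handler

blob.make_public()          # sets the object's ACL to public-read
imageurl = blob.public_url  # https://storage.googleapis.com/<bucket>/<object>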

Related

Verify image in Google Cloud Storage Bucket

I have a system that creates an uploadable link to a Google Cloud Storage bucket. After that, the user uploads the file there directly from the frontend.
Is there a way to verify the image file there without downloading it to a backend app and verifying it there (e.g. using PIL for Python)?
Verification for:
is it an image at all;
is it fully uploaded;
is it not broken;
etc.
P.S. Is there anything similar for PDF?
Cloud Storage doesn't offer direct support for any particular format, be it JPEG or PDF or anything else. To fully validate what's in a file, you need to download it and check.
You can, however, get part of the way there.
First, you can have your client validate the file, then capture its size and/or a checksum (either MD5 or CRC32c) and specify them as part of the upload to ensure that the file is uploaded exactly as intended. If your server knows the intended file size or checksum, you can ask Cloud Storage for just the metadata of an object, without downloading it, to verify that it is as intended.
Second, many files, including JPEG, have particular headers or footers that describe their contents. Instead of downloading what is potentially a very large image, you could download only the first few bytes from Cloud Storage. If the first two bytes aren't 0xFF and 0xD8, then it's not a JPEG file. Similar magic numbers exist for many other formats.
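As a sketch with the Python client, both checks can be done without pulling the whole object down (expected_md5_bytes here stands for whatever digest your client computed before upload; bucket and object names are placeholders):
import base64
from google.cloud import storage

client = storage.Client()
blob = client.bucket('my-bucket').get_blob('upload.jpg')  # metadata only, no download

# 1. Compare the stored MD5 (GCS reports it base64-encoded) with the client's value.
assert blob.md5_hash == base64.b64encode(expected_md5_bytes).decode()

# 2. Range-read just the first two bytes and check the JPEG magic number.
header = blob.download_as_bytes(start=0, end=1)  # inclusive byte range
is_jpeg = header == b'\xff\xd8'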

Images are not loading from an Amazon S3 bucket

I am doing a Django project. I have hosted my static files on Amazon S3, and they were uploaded to it successfully. But the images are not loading when I run the server.
When I inspect the image field, it shows:
https://django-ecommerce-files.s3.amazonaws.com/images/logo.png%22%20id=%22image%22%20style=%22width:%2040px;%20height:40px%22%3E
When I double-click it, it shows this error:
<Error>
  <Code>AccessDenied</Code>
  <Message>Access Denied</Message>
  <RequestId>07PX6KHYASHT3008</RequestId>
  <HostId>pJCxChq1JHlw/GL0Zy/W+PvX1TevOf/C60Huyidi8+0GMAs8geYlXSrEgo6m9vllL0PouTn6NAA=</HostId>
</Error>
When working with an S3 bucket, you need to make your resources (files) publicly accessible. You can do that either programmatically at the point of uploading to S3 or from the AWS console; please check how you can enable public access to your files here.
Make sure that you have changed the public access settings for the S3 bucket so that it allows files to be accessed by your app (with the right credentials). Your requirements may vary, so take a look at the user manual.
Check the Permissions tab under the bucket. You can also take a look at the actions allowed on your S3 bucket; it must be configured to allow read/write. Refer to the docs for a few examples, and see the boto3 sketch below.
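For instance, with boto3 you can set the ACL either way (a minimal sketch; it assumes the bucket's "Block Public Access" settings permit public ACLs, and the bucket/key names are placeholders):
import boto3

s3 = boto3.client('s3')

# Make an object public at upload time...
s3.upload_file('logo.png', 'django-ecommerce-files', 'images/logo.png',
               ExtraArgs={'ACL': 'public-read'})

# ...or flip an already-uploaded object to public-read.
s3.put_object_acl(Bucket='django-ecommerce-files', Key='images/logo.png',
                  ACL='public-read')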
I'm not sure if this works, but can you try enabling static hosting on your S3 bucket?
Go to your S3 bucket.
Go to Properties and scroll down to the bottom.
Enable static website hosting.
A PNG file on S3 would look like this (the link works, by the way):
https://aws-cicd-react.s3.ap-southeast-1.amazonaws.com/logo512.png
Addendum:
If you want to see the URL of your file:
In your S3 bucket, click the file, then go to Properties.
Look at the Object URL.

Google cloud python API returns None for blob.owner attribute

I have created a Google bucket with "Fine-grained access control" and a few users have uploaded files to it. Using the Python API, I can't seem to get any information on who uploaded each file. The blob.owner property just returns None:
from google.cloud import storage

sclient = storage.Client(project=GCLOUD_PROJECT)
bucket = storage.bucket.Bucket(client=sclient, name=GCLOUD_BUCKET)
blob = bucket.get_blob('foo.bar')
blob.reload()
print(blob.owner)
I'm calling reload() there because the documentation states it's required to pull some attributes from the server. All other properties I try print fine (size, updated, etag, md5_hash, etc.).
How can I recover the uploader identification?
For anyone running into this, I created a ticket and it was fixed.
https://github.com/googleapis/python-storage/issues/136
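For reference, the change lets you request the full object projection, since the default noAcl projection omits ACL-derived fields such as owner. A sketch, assuming a library version that includes the fix:
blob = bucket.get_blob('foo.bar')
blob.reload(projection='full')  # 'noAcl' (the default) strips owner/acl fields
print(blob.owner)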

Access google cloud storage account objects from app engine

Overview
I have a GCP storage bucket, which has a .json file and 5 JPEG files. In the .json file, the image names match the JPEG file names. I want to know a way in which I can access each of the objects within the storage bucket based upon the image name.
Method 1 (Current Method):
Currently, a Python script is being used to get the images from the storage bucket. This is done by looping through the .json file of image names, getting each individual image name, building a URL from the bucket and image name, then retrieving the image and displaying it on a Flask App Engine site.
This current method requires the bucket objects to be public, which poses a security issue, with the internet granted access to the bucket. Secondly, it is computationally expensive, with each image having to be pulled down from the bucket separately. The bucket will eventually contain 10,000 images, which will result in the images being slow to load and display on the web page.
Requirement (New Method):
Is there a method by which I can pull down images from the bucket, not all the images at once, and display them on a web page? I want to be able to access individual images from the bucket and display their corresponding image data, retrieved from the .json file.
Lastly, I want to ensure that neither the bucket nor the objects are public, and that they can only be accessed via App Engine.
Thanks
It would be helpful to see the Python code that's doing the work right now. You shouldn't need the storage objects to be public. They should be able to be retrieved using the Google Cloud Storage (GCS) API and a service account token that has view-only permissions on storage (although, depending on whether you know the object names or need to get the bucket name, it might require more permissions on the service account).
As for performance, you could either be smart on the front end about how many images you show, fetching only what you want to display as the user scrolls, or you could paginate your results from the GCS bucket; see the sketch after the links below.
Links to the service account and API pieces here:
https://cloud.google.com/iam/docs/service-accounts
https://cloud.google.com/storage/docs/reference/libraries#client-libraries-install-python
Information about pagination for retrieving GCS objects here:
How does paging work in the list_blobs function in Google Cloud Storage Python Client Library
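Putting those together, a sketch (the service-account file, bucket name, and incoming page_token are placeholders): list one page of objects with a view-only service account and hand the browser short-lived signed URLs instead of making anything public:
import datetime
from google.cloud import storage

client = storage.Client.from_service_account_json('viewer-sa.json')
bucket = client.bucket('my-image-bucket')

# Fetch one page of, say, 20 objects; pass the token from the previous page.
blobs = bucket.list_blobs(max_results=20, page_token=page_token)

# Signed URLs let the browser fetch each image for 15 minutes
# without the bucket or the objects ever being public.
urls = [blob.generate_signed_url(expiration=datetime.timedelta(minutes=15))
        for blob in blobs]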

Asynchronous File Upload to Amazon S3 with Django

I am using this file storage engine to store files to Amazon S3 when they are uploaded:
http://code.welldev.org/django-storages/wiki/Home
It takes quite a long time to upload because the file must first be uploaded from the client to the web server, and then from the web server to Amazon S3, before a response is returned to the client.
I would like to make the process of sending the file to S3 asynchronous, so the response can be returned to the user much faster. What is the best way to do this with the file storage engine?
Thanks for your advice!
I've taken another approach to this problem.
My models have two file fields: one uses the standard file storage backend, and the other uses the S3 file storage backend. When the user uploads a file, it gets stored locally.
I have a management command in my application that uploads all the locally stored files to S3 and updates the models; a sketch of such a command follows below.
So when a request comes in for the file, I check whether the model object uses the S3 storage field. If so, I send a redirect to the correct URL on S3; if not, I send a redirect so that nginx can serve the file from disk.
This management command can of course be triggered by any event, a cronjob or whatever.
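Such a command might look roughly like this (a sketch; the Upload model with local_file/s3_file fields mirrors the two-field setup described above and is hypothetical):
# myapp/management/commands/push_to_s3.py
from django.core.management.base import BaseCommand
from myapp.models import Upload

class Command(BaseCommand):
    help = 'Copy locally stored files to S3 and update the models'

    def handle(self, *args, **options):
        for upload in Upload.objects.filter(s3_file=''):
            upload.local_file.open('rb')
            # Saving through the S3-backed field pushes the bytes to S3
            # and persists the new field value on the model.
            upload.s3_file.save(upload.local_file.name, upload.local_file)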
It's possible to have your users upload files directly to S3 from their browser using a special form (with an encrypted policy document in a hidden field). They will be redirected back to your application once the upload completes.
More information here: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1434
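The modern boto3 equivalent of that policy-document form is generate_presigned_post (a sketch; bucket and key are placeholders). Render the returned fields as hidden inputs in a form whose action is the returned URL, and the browser POSTs the file straight to S3:
import boto3

s3 = boto3.client('s3')
post = s3.generate_presigned_post(
    Bucket='my-upload-bucket',
    Key='uploads/${filename}',  # S3 substitutes the uploaded file's name
    ExpiresIn=3600,             # policy valid for one hour
)
# post['url'] is the form action; post['fields'] are the hidden inputs.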
There is an app for that :-)
https://github.com/jezdez/django-queued-storage
It does exactly what you need, and much more, because you can set any "local" storage and any "remote" storage. This app will store your file in a fast "local" storage (for example, MogileFS storage) and then, using Celery (django-celery), will attempt to upload it asynchronously to the "remote" storage.
A few remarks:
The tricky thing is that you can set it up to use a copy-and-upload strategy, or an upload-and-delete strategy that deletes the local file once it has been uploaded.
The second tricky thing is that it will serve the file from the "local" storage until it has been uploaded.
It can also be configured to retry a number of times on upload failures.
Installation & usage is also very simple and straightforward:
pip install django-queued-storage
append to INSTALLED_APPS:
INSTALLED_APPS += ('queued_storage',)
in models.py:
from django.db import models
from queued_storage.backends import QueuedStorage

queued_s3storage = QueuedStorage(
    'django.core.files.storage.FileSystemStorage',
    'storages.backends.s3boto.S3BotoStorage',
    task='queued_storage.tasks.TransferAndDelete')

class MyModel(models.Model):
    my_file = models.FileField(upload_to='files', storage=queued_s3storage)
You could decouple the process:
The user selects a file to upload and sends it to your server. After this, they see a page: "Thank you for uploading foofile.txt, it is now stored in our storage backend."
When the user has uploaded the file, it is stored in a temporary directory on your server and, if needed, some metadata is stored in your database.
A background process on your server then uploads the file to S3. This is only possible if you have full access to your server, so you can create some kind of "daemon" to do this (or simply use a cronjob).*
The page that is displayed polls asynchronously and shows some kind of progress bar to the user (or a simple "please wait" message). This would only be needed if the user should be able to "use" the file (put it in a message, or something like that) directly after uploading.
[*: In case you only have shared hosting, you could possibly build a solution which uses a hidden iframe in the user's browser to start a script which then uploads the file to S3.]
You can upload media directly to S3 without going through your web application server.
See the following references:
Amazon API Reference : http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?UsingHTTPPOST.html
A django implementation : https://github.com/sbc/django-uploadify-s3
As some of the answers here suggest uploading directly to S3, here's a Django S3 Mixin using plupload:
https://github.com/burgalon/plupload-s3mixin
I encountered the same issue with uploaded images. You cannot pass files along to a Celery worker, because Celery needs to be able to pickle the arguments to a task. My solution was to deconstruct the image data into a string, get all the other info from the file, and pass this data and info to the task, where I reconstructed the image. After that you can save it, which will send it to your storage backend (such as S3). If you want to associate the image with a model, just pass the id of the instance along to the task and retrieve it there, bind the image to the instance, and save the instance.
When a file has been uploaded via a form, it is available in your view as an UploadedFile file-like object. You can get it directly out of request.FILES, or better, first bind it to your form, run is_valid, and retrieve the file-like object from form.cleaned_data. At that point, at least you know it is the kind of file you want it to be. After that you can get the data using read() and get the other info using other methods/attributes; see https://docs.djangoproject.com/en/1.4/topics/http/file-uploads/. A sketch of the task side follows below.
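A sketch of that pattern (the Photo model and its image field are hypothetical):
from celery import shared_task
from django.core.files.base import ContentFile
from myapp.models import Photo

@shared_task
def store_image(image_bytes, filename, photo_id):
    # Reconstruct the file from raw bytes, bind it to the instance, and
    # save; saving sends it to the configured storage backend (e.g. S3).
    photo = Photo.objects.get(pk=photo_id)
    photo.image.save(filename, ContentFile(image_bytes))

# In the view, after form.is_valid() - pass only picklable data:
# f = form.cleaned_data['image']
# store_image.delay(f.read(), f.name, photo.pk)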
I actually ended up writing and distributing a little package to save an image asynchronously. Have a look at https://github.com/gterzian/django_async. Right now it's just for images, but you could fork it and add functionality for your situation. I'm using it with https://github.com/duointeractive/django-athumb and S3.
