I have a store of images in Google Cloud Storage and I want to read them into OpenCV in Datalab. I can find information on how to read text files but can't find anything on how to read in an image. How would I go about doing this?
I am not really familiar with OpenCV, so let me cover the Datalab ⟷ GCS part and I hope that is enough for you to go on with the OpenCV part.
In Datalab, you can use two different approaches to access Google Cloud Storage resources. They are both documented (with working examples) in these Jupyter notebooks: access GCS using Storage commands ( %%gcs ) or access GCS using Storage APIs ( google.datalab.storage ).
I'll provide an example using Storage commands, but feel free to adapt it to the Datalab GCS Python library if you prefer.
# Imports
from google.datalab import Context
from IPython.display import Image
# Define the bucket and an example image to read
bucket_path = "gs://BUCKET_NAME"
bucket_object = bucket_path + "/google.png"
# List all the objects in your bucket, and read the example image file
%gcs list --objects $bucket_path
%gcs read --object $bucket_object -v img
# Print the image content (see it is in PNG format) and show it
print(type(img))
img
Image(img)
Using the piece of code I shared, you are able to perform a simple object-listing for all the objects in your bucket and also read an example PNG image. Having its content stored in a Python variable, I hope you are able to consume it in OpenCV.
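For the OpenCV side, one possible continuation is sketched here. It assumes that `img` holds the raw bytes of the object as read by `%gcs read`, and that `numpy` and `opencv-python` are installed in the Datalab environment. The helper names `sniff_format` and `bytes_to_cv2` are made up for illustration and are not part of Datalab or OpenCV.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def sniff_format(data):
    """Guess the image format from the leading magic bytes (PNG and JPEG only)."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None


def bytes_to_cv2(data):
    """Decode encoded image bytes (PNG, JPEG, ...) into a BGR NumPy array."""
    # Imported lazily so that sniff_format works even without OpenCV installed.
    import cv2
    import numpy as np

    buffer = np.frombuffer(data, dtype=np.uint8)    # 1-D uint8 view over the bytes
    image = cv2.imdecode(buffer, cv2.IMREAD_COLOR)  # None when decoding fails
    if image is None:
        raise ValueError("OpenCV could not decode the data as an image")
    return image
```

In the notebook, `bytes_to_cv2(img)` should then return an array of shape `(height, width, 3)` in BGR channel order, which can be passed to other OpenCV functions without writing the file to disk first.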
Related
I have seen so many different threads about this topic but none of their solutions seems to work for me. I've tried several ways of reading an image from my Drive into Colab using its URL, with no success. I want to read it using its URL rather than mounting my Drive and using directories because multiple people share this Colab, and their directory to the image might not be the same as mine.
The first attempt comes from a popular thread on this issue: How do I read image data from a URL in Python?
from PIL import Image
import requests
url = 'https://drive.google.com/file/d/1z33YPsoMe0lSNNa2XWa0tiK2571j2tFu/view?usp=sharing'
im = Image.open(requests.get(url).raw) # apparently no need for bytes wrapping in new Python versions
im = Image.open(requests.get(url, stream=True).raw) # also does not work
The error I got was UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f0189569770>
Then I tried:
from skimage import io
io.imshow(io.imread(url))
Which returned ValueError: Could not find a format to read the specified file in mode 'i'. Feeling very lost because all these approaches seem to work for everyone else. Would appreciate any feedback.
Using gdown to read an image from Google Drive into Colab.
If you have your image in your Google Drive and you are using colab.research.google.com, then you can follow these steps:
pip install gdown
Obtain the share link for your image (remember to set the sharing option to "Anyone with the link"). For example:
# https://drive.google.com/file/d/1WG3DGKAo8JEG4htBSBhIIVqhn6D2YPZ/view?usp=sharing
Extract the file ID from the share link: 1WG3DGKAo8JEG4htBSBhIIVqhn6D2YPZ.
Download the image into the content folder of whichever user is running this Colab:
!gdown --id '1WG3DGKAo8JEG4htBSBhIIVqhn6D2YPZ' --output bird1.jpg
Read the image file from the content folder:
from PIL import Image
im = Image.open("../content/bird1.jpg")
im
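The attempts in the question most likely fail because a `/file/d/<id>/view` share link returns an HTML viewer page rather than the image bytes, so PIL and skimage have nothing they can decode. The ID-extraction step can be sketched as follows. The function names `drive_file_id` and `direct_download_url` are hypothetical, and the `uc?export=download` URL pattern is an assumption that tends to work for small, publicly shared files but is not guaranteed.

```python
import re


def drive_file_id(share_url):
    """Return the file ID from a Drive link of the form .../file/d/<id>/..."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError("no /file/d/<id> segment in URL: " + share_url)
    return match.group(1)


def direct_download_url(file_id):
    """Build a direct-download URL (assumed pattern, see the note above)."""
    return "https://drive.google.com/uc?export=download&id=" + file_id


share_link = "https://drive.google.com/file/d/1WG3DGKAo8JEG4htBSBhIIVqhn6D2YPZ/view?usp=sharing"
file_id = drive_file_id(share_link)
print(file_id)
print(direct_download_url(file_id))
```

Because the ID is taken from the link itself, every user of the shared Colab can run the same cell without depending on their own Drive folder layout.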
I want to read an image from Google Drive and store it in a binary field. How can I do that?
I tried this code, but the image I got was too large. Is there another option for reading an image from Drive?
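import base64
import urllib.request
link = urllib.request.urlopen(image_path).read()  # raw bytes of the response
image_base64 = base64.encodebytes(link)  # encodestring() was removed in Python 3.9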
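Part of the size increase may come from the encoding itself: base64 turns every 3 bytes into 4 characters, and `base64.encodestring` (a legacy alias of `base64.encodebytes`) additionally inserts a newline after every 76 output characters. The sketch below measures that overhead on dummy data. The helper name `encoded_sizes` is made up for illustration, and whether the target field expects raw bytes or base64 text depends on the framework, which the question does not state.

```python
import base64


def encoded_sizes(raw):
    """Compare the size of raw bytes with two base64 encodings of them."""
    return {
        "raw": len(raw),
        "b64encode": len(base64.b64encode(raw)),      # no line breaks
        "encodebytes": len(base64.encodebytes(raw)),  # newline every 76 chars
    }


# 3000 zero bytes stand in for real image data.
sizes = encoded_sizes(bytes(3000))
print(sizes)
```

If the field accepts raw bytes, storing `link` directly avoids the roughly one-third growth; if it requires base64, `b64encode` at least avoids the extra newlines.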
I have to process very large images (size > 2 GB) stored in aws s3.
Before processing I actually want to display some of them.
Download time is infeasible. Is it possible to display them without downloading, using only Python?
You could give a URL to the user to open in a web browser. This does involve downloading the image, but it would be done outside of Python.
If you want to present them with a "thumbnail", then you would need a method of converting the image. This could be done with an AWS Lambda function that:
Loads the image into memory (it's too big for the default disk space)
Resizes the image to a smaller size
Stores it in Amazon S3
Provides a URL to the smaller image
This is similar to Tutorial: Using AWS Lambda with Amazon S3, but it would need a tweak to manipulate the image in memory instead of downloading it to the Lambda function's disk storage (which is limited to 512 MB by default).
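The four steps could look roughly like the sketch below. It is not taken from the AWS tutorial. It assumes the function is triggered by an S3 event, that `boto3` and Pillow are available in the Lambda package, and that the function has enough memory for the decoded image. The `thumbnails/` prefix, the 512-pixel limit, and the helper name `thumbnail_size` are arbitrary choices for illustration.

```python
import io
import urllib.parse


def thumbnail_size(width, height, max_side=512):
    """Shrink (width, height) so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Integer arithmetic keeps the aspect ratio without float rounding surprises.
    return max(1, width * max_side // longest), max(1, height * max_side // longest)


def lambda_handler(event, context):
    # Third-party imports stay inside the handler so thumbnail_size is usable alone.
    import boto3
    from PIL import Image

    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["object"]["key"])  # keys arrive URL-encoded

    s3 = boto3.client("s3")
    # Step 1: load the object into memory instead of onto disk.
    data = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    # Step 2: resize. Very large images may exceed Pillow's pixel-count safety limit.
    image = Image.open(io.BytesIO(data))
    small = image.resize(thumbnail_size(image.width, image.height)).convert("RGB")

    # Step 3: store the thumbnail. The trigger must not fire on this prefix,
    # otherwise the function would invoke itself; a separate bucket also works.
    out = io.BytesIO()
    small.save(out, format="PNG")
    thumb_key = "thumbnails/" + key + ".png"
    s3.put_object(Bucket=bucket, Key=thumb_key, Body=out.getvalue(), ContentType="image/png")

    # Step 4: return a time-limited URL to the smaller image.
    return s3.generate_presigned_url(
        "get_object", Params={"Bucket": bucket, "Key": thumb_key}, ExpiresIn=3600
    )
```

The presigned URL can be handed to the user's browser or to a notebook display call, so only the small image ever needs to be downloaded.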
In Google App Engine, I need to be able to take an uploaded PDF and convert it to an image (or maybe one day a number of tiled images) for storing and serving back out. Is there a library that will read PDF files that is also 100% python (so it can be uploaded with my app)?
From what I've gathered so far...
PIL does not read PDF files, only writes them.
GhostScript is the standard FOSS PDF reader, but I don't believe I'll be able to upload it with my app to GAE since I don't believe it's 100% python.
Is there anything else I might be able to use? Or maybe even a web service that I can call?
You may want to look into using the GAE Conversion API (not yet fully released). There's a tester signup form here, with a link to further details.
From the doc:
Conversions can be performed in any direction between PDF, HTML, TXT, and image formats, and OCR will be employed if necessary. Note that while PNG, GIF, JPEG, and BMP image formats are supported as input formats, only PNG is available for output.
I would like to go over an image and do some pixel-by-pixel operations. The image API provided by Google App Engine seems to be incapable of doing this, and it doesn't include the Python Imaging Library. So, how should I proceed?
Thanks..
You could maybe use the image API to convert to PNG, then use the png module (which is pure python, so should hopefully run on app engine) to load the PNG and modify the pixels. Then convert back to PNG using the png module, and back to whatever format you need using the image API.
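The middle part of that round trip (load the PNG, modify the pixels, write a PNG again) might look like this sketch. It assumes the pure-Python `png` module (pypng) is bundled with the app and uses color inversion as a stand-in for the real per-pixel operation. The function names `invert_rgba_row` and `invert_png` are made up for illustration, and the conversion to and from other formats with the image API is left out.

```python
import io


def invert_rgba_row(row):
    """Invert R, G, B in a flat [R, G, B, A, R, G, B, A, ...] row; keep alpha."""
    return [value if i % 4 == 3 else 255 - value for i, value in enumerate(row)]


def invert_png(png_bytes):
    """Decode PNG bytes, invert every pixel, and return new PNG bytes."""
    import png  # pypng, imported lazily so invert_rgba_row works without it

    # asRGBA8() normalizes any PNG to 8-bit RGBA rows of flat channel values.
    width, height, rows, info = png.Reader(bytes=png_bytes).asRGBA8()
    new_rows = [invert_rgba_row(row) for row in rows]

    out = io.BytesIO()
    writer = png.Writer(width=width, height=height, greyscale=False, alpha=True, bitdepth=8)
    writer.write(out, new_rows)
    return out.getvalue()
```

Replacing `invert_rgba_row` with another function over the same flat row layout changes the operation without touching the decoding and encoding code.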