Writing Images to s3fs.S3FileSystem after preprocessing image - python

I am currently accessing an S3 bucket from my school's system.
To connect, I used the following:
import numpy as np  # needed for np.asarray / np.uint8 below
import s3fs
from skimage import exposure
from PIL import Image, ImageStat
s3 = s3fs.S3FileSystem(client_kwargs={'endpoint_url': 'XXX'},
                       key='XXX',
                       secret='XXX')
I can retrieve an image from the S3 bucket defined above and preprocess it using
infile = s3.open('test.jpg',"rb")
image = Image.open(infile)
img = np.asarray(image) #numpy.ndarray
img_eq = exposure.equalize_adapthist(img,clip_limit=0.03) #CLAHE
image_eq = Image.fromarray((img_eq * 255).astype(np.uint8)) #Convert back to image
To save the resulting image (image_eq) locally, I would just do
image_eq.save("hello.jpg")
However, how do I save/write the resulting image into the s3fs filesystem instead?

save in Pillow accepts a file object too. With the s3 filesystem object from your question, you could do:
with s3.open('s3://bucket/file.png', 'wb') as f:
    image_eq.save(f, 'PNG')
You have to open the file in binary mode. I think it works best to state the file type explicitly, e.g. PNG in this case.
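The same idea can be wrapped in a small helper. This is only a sketch: the name save_image is hypothetical, and fs is assumed to be any object whose open(path, mode) returns a writable binary file object, which is how s3fs.S3FileSystem is used above. The image is encoded into an in-memory buffer first and the bytes are then written inside a with block so the remote file is closed properly.

```python
import io

from PIL import Image


def save_image(fs, path, image, fmt="PNG"):
    """Encode a PIL image in memory and write the bytes via fs.open().

    `fs` is assumed to expose open(path, mode) like s3fs.S3FileSystem.
    Returns the number of bytes written.
    """
    buf = io.BytesIO()
    image.save(buf, format=fmt)  # the format must be explicit: a buffer has no extension
    data = buf.getvalue()
    with fs.open(path, "wb") as f:  # binary mode; the file is closed on exit
        f.write(data)
    return len(data)
```

With the objects from the question, the call would look like save_image(s3, 's3://bucket/file.png', image_eq), where the bucket path is a placeholder.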

Related

Python: how to convert an image in memory?

I have an image in memory (downloaded from an online source) and I want to convert it to a different format before sending it on to a different online location.
The conversion is .webp to .jpg but that's not really relevant.
With Pillow I can easily convert local images and save them back to disc, but I can't get it to work with an image in memory.
I don't necessarily need to use Pillow. Any way to convert the image without having to save anything to disc is fine.
I am new to BytesIO with PIL, so please check my attempt below. It works with my test image; let me know if it works for you.
from PIL import Image
from io import BytesIO
img = Image.open('test.webp')
print('image : ', img.format)
img.show()
# Write PIL Image to in-memory PNG
membuf = BytesIO()
img.save(membuf, format="png")
img = Image.open(membuf)
print('image : ', img.format)
img.show()
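Since the question asks for bytes in and JPEG bytes out, here is a minimal sketch of that round trip done entirely in memory. The function name convert_image_bytes is hypothetical, and the RGB conversion is an assumption added because JPEG cannot store an alpha channel.

```python
import io

from PIL import Image


def convert_image_bytes(data, fmt="JPEG"):
    """Decode encoded image bytes and re-encode them as `fmt`, all in memory."""
    with Image.open(io.BytesIO(data)) as src:
        # JPEG cannot store an alpha channel, so flatten to RGB first
        img = src.convert("RGB") if fmt.upper() == "JPEG" else src.copy()
    out = io.BytesIO()
    img.save(out, format=fmt)
    return out.getvalue()
```

The returned bytes can be sent on to the other online location without touching the disk.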

Converting bytes to image using tinytag and PIL

I am using the tinytag module in Python to get the cover art of an mp3 file and want to display or store it. The returned value is of type bytes. I have tried fumbling around with PIL using frombytes, but to no avail. Is there a way to convert the bytes to an image?
from tinytag import TinyTag
tag = TinyTag.get("03. Me, Myself & I.mp3", image=True)
img = tag.get_image()
I actually got a PNG image when I called tag.get_image() but I guess you might get a JPEG. Either way, you can wrap it in a BytesIO and open it with PIL/Pillow or display it. Carrying on from your code:
from PIL import Image
import io
...
im = tag.get_image()
# Make a PIL Image
pi = Image.open(io.BytesIO(im))
# Save as PNG, or JPEG
pi.save('cover.png')
# Display
pi.show()
Note that you don't have to use PIL/Pillow. You could look at the first few bytes: if they are a PNG signature (\x89PNG), save the data as binary with a .png extension. If the signature is JPEG (\xff\xd8), save the data as binary with a .jpg extension.
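A minimal sketch of that signature check, using only the standard library. The helper names guess_image_extension and save_cover are hypothetical; the check uses the full 8-byte PNG signature and the 2-byte JPEG start-of-image marker.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # full 8-byte PNG file signature
JPEG_SIGNATURE = b"\xff\xd8"          # JPEG start-of-image marker


def guess_image_extension(data):
    """Return '.png' or '.jpg' based on the leading bytes, or None if unknown."""
    if data.startswith(PNG_SIGNATURE):
        return ".png"
    if data.startswith(JPEG_SIGNATURE):
        return ".jpg"
    return None


def save_cover(data, stem):
    """Write the raw bytes to stem + detected extension and return the path."""
    ext = guess_image_extension(data)
    if ext is None:
        raise ValueError("unrecognized image signature")
    path = stem + ext
    with open(path, "wb") as f:
        f.write(data)
    return path
```

Carrying on from the code above, save_cover(tag.get_image(), "cover") would write either cover.png or cover.jpg.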

How to read an image file from ftp and convert it into an opencv image without saving in python

The question is self-explanatory: I want to read an image file from FTP using ftplib and convert it into an OpenCV image in Python, without saving it to disk.
Thanks
I was able to achieve this myself using the following code.
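import ftplib
from io import BytesIO
import cv2 as cv
import numpy as np
connection = ftplib.FTP('server.address.com', 'USERNAME', 'PASSWORD')
r = BytesIO()
# image_path is the remote path of the image on the FTP server
connection.retrbinary('RETR ' + image_path, r.write)
image = np.asarray(bytearray(r.getvalue()), dtype="uint8")
image = cv.imdecode(image, cv.IMREAD_COLOR)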

PDF to IMG to PDF all done in memory

In order to remove sensitive content from a PDF, I am converting it to image and back to PDF again.
I am able to do this while saving the jpeg image, however I would eventually like to adapt my code so that the file is in memory the whole time. PDF in memory -> JPEG in memory -> PDF in memory. I'm having trouble with the intermediary step.
from pdf2image import convert_from_path, convert_from_bytes
import img2pdf
images = convert_from_path('testing.pdf', fmt='jpeg')
image = images[0]
# opening from filename
with open("output/output.pdf","wb") as f:
    f.write(img2pdf.convert(image.tobytes()))
On the last line, I am getting the error:
ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO object at 0x1040cc8f0>
I'm not sure how to convert this image to the string that img2pdf is looking for.
The pdf2image module extracts the pages as Pillow images. According to the Pillow tobytes() documentation, "This method returns the raw image data from the internal storage", which is a raw bitmap representation rather than an encoded JPEG file.
To get your code working, use io.BytesIO like so:
# opening from filename
import io
with open("output/output.pdf","wb") as f, io.BytesIO() as output:
    # Pillow's format name is 'jpeg'; 'jpg' is not a registered format
    image.save(output, format='jpeg')
    f.write(img2pdf.convert(output.getvalue()))
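To make the difference between tobytes() and an encoded file concrete, here is a small sketch using only Pillow. The 8x8 test image is a stand-in for a page returned by pdf2image, and the helper name encode_jpeg is hypothetical.

```python
import io

from PIL import Image


def encode_jpeg(image):
    """Return the image encoded as a JPEG file, held in memory as bytes."""
    buf = io.BytesIO()
    image.convert("RGB").save(buf, format="JPEG")
    return buf.getvalue()


# Stand-in for a page image returned by pdf2image (assumption for the demo)
image = Image.new("RGB", (8, 8), (200, 30, 30))

raw = image.tobytes()         # raw pixel data: width * height * 3 bytes
encoded = encode_jpeg(image)  # a complete JPEG file

print(len(raw))      # 192
print(encoded[:2])   # b'\xff\xd8'
```

Only the encoded bytes start with the JPEG marker, which is why a converter that expects an image file rejects the output of tobytes().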

Saving an Image file using binary Files - pyspark

How can I save an image file (JPG format) to my local system? I used binaryFiles to load the pictures into Spark, converted them into arrays and processed them. Below is the code:
from io import BytesIO
from PIL import Image
import numpy as np
import math
images = sc.binaryFiles("path/car*")
# binaryFiles yields (path, bytes) pairs; decode each value into an array
imagerdd = images.mapValues(lambda y: np.asarray(Image.open(BytesIO(y))))
I did some image processing, and now the key holds the path and the value holds the image array:
imageOutuint = imagelapRDD.mapValues(lambda y: y.astype(np.uint8))
imageOutIMG = imageOutuint.mapValues(lambda y: Image.fromarray(y))
How can I save the images to the local file system or HDFS? I see no option for it.
If you want to save data to the local file system, just collect it as a local iterator and use standard tools to save the files record by record:
for x, img in imageOutIMG.toLocalIterator():
    path = ... # Some path .jpg (based on x?)
    img.save(path)
Just be sure to cache imageOutIMG to avoid recomputation.
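One possible way to derive an output path from the key x is sketched below. The helper local_path_for and the directory name are hypothetical; the sketch assumes the keys produced by binaryFiles are file URIs or plain paths and simply reuses the original file name with a .jpg extension.

```python
import os
from urllib.parse import urlparse


def local_path_for(key, out_dir):
    """Map a source URI such as 'hdfs://host/data/car01.png' to out_dir/car01.jpg."""
    name = os.path.basename(urlparse(key).path)  # drop scheme, host and directories
    stem, _ = os.path.splitext(name)             # drop the original extension
    return os.path.join(out_dir, stem + ".jpg")
```

Inside the loop above, this could be called with the key x and a local output directory of your choice, which must already exist before img.save is called.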
