JPG image size reduced on imsave - python

I am building a library called hips, in which one module fetches tile images and stores them on disk. I fetch a tile from a remote URL and save it to a temporary directory with the scipy.misc.imsave function. The saved file is 41.0 kB; however, if I download the file manually from the remote URL, it is 119.7 kB.
I have copied the failing test case below:
def test_fetch_read_write_jpg(self, tmpdir):
    meta = HipsTileMeta( ... )
    url = 'http://alasky.unistra.fr/2MASS/H/Norder6/Dir30000/Npix30889.jpg'
    tile = HipsTile.fetch(meta, url)
    filename = str(tmpdir / 'Npix30889.jpg')
    tile.write(filename)
    tile2 = HipsTile.read(meta, filename=filename)
    print(tile.data.shape)
    print(tile2.data.shape)
    assert tile == tile2
Here is the failed assertion:
----------------------------------Captured stdout call--------------------------------------
(512, 512, 3)
(512, 512, 3)
False
The code involved with tile storing is shown below:
from pathlib import Path
from scipy.misc import imsave
def write(self, filename: str = None) -> None:
    # Fall back to the tile's default path when no filename is given
    path = Path(filename) if filename else self.meta.full_path
    imsave(str(path), self.data)
I also tried saving the file using PIL.Image library, using this code:
from PIL import Image
image = Image.fromarray(self.data)
image.save(str(path))
But it produces the same result. I printed the tile data at index [0][0], which was [10, 10, 10] in both cases. I also displayed both images using matplotlib, and they looked identical. Still, I can't figure out the reason for the reduction in size / quality.

JPEG is a lossy format. If you write an image to a JPEG file and then read it back, you won't, in general, get back the same data.
For lossless image storage, you could use PNG.
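The point above can be checked with a small in-memory round trip. This is only a sketch, assuming Pillow is installed; the test image is synthetic noise rather than the HiPS tile, and the helper name round_trip is made up for illustration. It also hints at why the file size changes: the tile is decoded and then re-encoded with the library's own quality setting (75 by default in Pillow), not with whatever setting the server used.

```python
import io
import random

from PIL import Image

# Build a deterministic 64x64 RGB test image from pseudo-random bytes.
rng = random.Random(0)
pixels = bytes(rng.randrange(256) for _ in range(64 * 64 * 3))
original = Image.frombytes("RGB", (64, 64), pixels)


def round_trip(image, fmt, **save_options):
    """Encode `image` as `fmt` in memory, then decode it again."""
    buffer = io.BytesIO()
    image.save(buffer, format=fmt, **save_options)
    encoded_size = len(buffer.getvalue())
    buffer.seek(0)
    decoded = Image.open(buffer)
    decoded.load()  # force decoding while the buffer is still open
    return decoded, encoded_size


png_copy, png_size = round_trip(original, "PNG")
jpeg_copy, jpeg_size = round_trip(original, "JPEG")
jpeg_low, jpeg_low_size = round_trip(original, "JPEG", quality=30)

# PNG is lossless, so the pixels survive the round trip unchanged.
print("PNG identical: ", png_copy.tobytes() == original.tobytes())
# JPEG is lossy, so the decoded pixels differ from the input.
print("JPEG identical:", jpeg_copy.tobytes() == original.tobytes())
# The encoded size depends on the quality setting used when saving.
print("JPEG sizes (default, quality=30):", jpeg_size, jpeg_low_size)
```

If the goal is a byte-for-byte copy of the remote file, one option is to write the downloaded bytes to disk directly instead of decoding and re-encoding them.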

Related

HEIC to JPEG conversion with metadata

I'm trying to convert a HEIC file to JPEG while also importing all the metadata (such as GPS info). Unfortunately, with the code below the conversion works, but no metadata is stored in the JPEG file that is created.
Can anyone tell me what I need to add to the conversion method?
import pyheif
from PIL import Image
heif_file = pyheif.read("/transito/126APPLE_IMG_6272.HEIC")
image = Image.frombytes(
    heif_file.mode,
    heif_file.size,
    heif_file.data,
    "raw",
    heif_file.mode,
    heif_file.stride,
)
image.save("/transito/126APPLE_IMG_6272.JPEG", "JPEG")
Thanks, I found a solution. I hope it helps others:
import piexif
import pyheif
from PIL import Image
# Open the file
heif_file = pyheif.read(file_path_heic)
# Create the image
image = Image.frombytes(
    heif_file.mode,
    heif_file.size,
    heif_file.data,
    "raw",
    heif_file.mode,
    heif_file.stride,
)
# Retrieve the metadata
exif_dict = None
for metadata in heif_file.metadata or []:
    if metadata['type'] == 'Exif':
        exif_dict = piexif.load(metadata['data'])
if exif_dict is not None:
    # PIL rotates the image according to the EXIF info, so the orientation tag (274) must be reset; otherwise the image is rotated twice (once by PIL, a second time by the viewer).
    exif_dict['0th'][274] = 0
    exif_bytes = piexif.dump(exif_dict)
    image.save(file_path_jpeg, "JPEG", exif=exif_bytes)
else:
    # No EXIF block was found, so save without metadata
    image.save(file_path_jpeg, "JPEG")
HEIF to JPEG:
from PIL import Image
import pillow_heif
if __name__ == "__main__":
    pillow_heif.register_heif_opener()
    img = Image.open("any_image.heic")
    img.save("output.jpeg")
JPEG to HEIF:
from PIL import Image
import pillow_heif
if __name__ == "__main__":
    pillow_heif.register_heif_opener()
    img = Image.open("any_image.jpg")
    img.save("output.heic")
Rotation (EXIF or XMP) will be removed automatically when needed.
The call to register_heif_opener can be replaced by importing pillow_heif.HeifImagePlugin instead of pillow_heif.
Metadata can be edited in Pillow's "info" dictionary and will be saved when saving to HEIF.
Here is another approach for converting iPhone HEIC images to JPG while preserving the EXIF data.
Python 3.9 (I'm on a Raspberry Pi 4, 64-bit).
Install pillow_heif (0.8.0).
Then run the following code, and you'll find the EXIF data in the new JPEG image.
The trick is to get the EXIF data from the info dictionary. No additional conversion is required.
This is sample code; build your own wrapper around it.
from PIL import Image
import pillow_heif
# Open the image file
heif_file = pillow_heif.read_heif("/mnt/pictures/test/IMG_0001.HEIC")
# Create the new image
image = Image.frombytes(
    heif_file.mode,
    heif_file.size,
    heif_file.data,
    "raw",
    heif_file.mode,
    heif_file.stride,
)
print(heif_file.info.keys())
dictionary = heif_file.info
exif_dict = dictionary['exif']
# debug
print(exif_dict)
image.save('/tmp/test000.JPG', "JPEG", exif=exif_dict)

Cover art size from a MP3 with Python

I'm trying to write a script that gives me the dimensions of the cover art in an MP3 file. The furthest I've got is with Mutagen:
audiofile = mutagen.File(wavefile, easy=False)
print(audiofile.tags)
But from that raw output, how can I extract the dimensions, e.g. 400x400?
You can use stagger and PIL, i.e.:
import io
import traceback
import stagger
from PIL import Image
try:
    mp3 = stagger.read_tag('Menuetto.mp3')
    # Decode the first embedded picture (APIC frame) from memory
    im = Image.open(io.BytesIO(mp3[stagger.id3.APIC][0].data))
    print(im.size)
    # (300, 300)
    # im.save("cover.jpg") # save cover to file
except Exception:
    # Report the error instead of silently ignoring it
    print(traceback.format_exc())

Image processing after upload with Python Bottle

Context
I have made a simple web app for uploading content to a blog. The front sends AJAX requests (using FormData) to the backend which is Bottle running on Python 3.7. Text content is saved to a MySQL database and images are saved to a folder on the server. Everything works fine.
Image processing and PIL/Pillow
Now, I want to enable processing of uploaded images to standardise them (I need them all resized and/or cropped to 700x400px).
I was hoping to use Pillow for this. My problem is creating a PIL Image object from the file object in Bottle. I cannot initialise a valid Image object.
Code
# AJAX sends request to this route
@post('/update')
def update():
    # Form data
    title = request.forms.get("title")
    body = request.forms.get("body")
    image = request.forms.get("image")
    author = request.forms.get("author")
    # Image upload
    file = request.files.get("file")
    if file:
        extension = file.filename.split(".")[-1]
        if extension not in ('png', 'jpg', 'jpeg'):
            return {"result" : 0, "message": "File Format Error"}
        save_path = "my/save/path"
        file.save(save_path)
The problem
This all works as expected, but I cannot create a valid Image object with pillow for processing. I even tried reloading the saved image using the save path but this did not work either.
Other attempts
The code below did not work. It caused an internal server error, though I am having trouble setting up more detailed Python debugging.
path = save_path + "/" + file.filename
image_data = open(path, "rb")
image = Image.open(image_data)
When logged manually, the path is a valid relative URL ("../domain-folder/images") and I have checked that I am definitely importing PIL (Pillow) correctly using PIL.PILLOW_VERSION.
I tried adapting this answer:
image = Image.frombytes('RGBA', (128,128), image_data, 'raw')
However, I won’t know the size until I have created the Image object. I also tried using io:
image = Image.open(io.BytesIO(image_data))
This did not work either. In each case, it is only the line trying to initialise the Image object that causes problems.
Summary
The Bottle documentation says the uploaded file is a file-like object, but I am not having much success in creating an Image object that I can process.
How should I go about this? I do not have a preference about processing before or after saving. I am comfortable with the processing, it is initialising the Image object that is causing the problem.
Edit - Solution
I got this to work by adapting the answer from eatmeimadanish. I had to use an io.BytesIO object to save the file from Bottle, then load it with Pillow from there. After processing, it could be saved in the usual way.
obj = io.BytesIO()
file.save(obj) # This saves the file retrieved by Bottle to the BytesIO object
path = save_path + "/" + file.filename
# Image processing
im = Image.open(obj) # Reopen the object with PIL
im = im.resize((700,400))
im.save(path, optimize=True)
I found this from the Pillow documentation about a different function that may also be of use.
PIL.Image.frombuffer(mode, size, data, decoder_name='raw', *args)
Note that this function decodes pixel data only, not entire images.
If you have an entire image file in a string, wrap it in a BytesIO object, and use open() to load it.
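The BytesIO approach described above can be sketched as a small helper. This is only an illustration, assuming Pillow is installed: the function name standardise and the constant TARGET_SIZE are made up here, and the upload is simulated with an in-memory PNG. In a Bottle route, the bytes would presumably come from something like request.files.get("file").file.read(). The sketch swaps Image.resize for ImageOps.fit, which scales and centre-crops so that the picture is not stretched to 700x400.

```python
import io

from PIL import Image, ImageOps

TARGET_SIZE = (700, 400)  # size requested in the question


def standardise(upload_bytes, size=TARGET_SIZE):
    """Return PNG bytes of the uploaded image, cropped and resized to `size`."""
    # Wrap the raw bytes in a file-like object so Pillow can parse the
    # whole image file (header included), not just raw pixel data.
    im = Image.open(io.BytesIO(upload_bytes))
    # ImageOps.fit scales and centre-crops, so the aspect ratio is kept;
    # Image.resize on its own would distort the picture.
    fitted = ImageOps.fit(im, size)
    out = io.BytesIO()
    fitted.save(out, format="PNG", optimize=True)
    return out.getvalue()


# Stand-in for the uploaded file: a 1000x1000 image encoded in memory.
source = io.BytesIO()
Image.new("RGB", (1000, 1000), "navy").save(source, format="PNG")

result = Image.open(io.BytesIO(standardise(source.getvalue())))
print(result.size)
```

The returned bytes can then be written to the destination path, or the fitted image can be saved to the path directly as in the solution above.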
Use an in-memory buffer instead (StringIO on Python 2, io.BytesIO on Python 3).
import base64
import io
from PIL import Image
# Save your in-memory file to this buffer instead of a regular file
s = io.BytesIO()
file = request.files.get("file")
if file:
    extension = file.filename.split(".")[-1]
    if extension not in ('png', 'jpg', 'jpeg'):
        return {"result" : 0, "message": "File Format Error"}
    file.save(s)
    s.seek(0)  # rewind to the start before reading
    im = Image.open(s)
    im = im.resize((700,400))  # resize() returns a new image
    out = io.BytesIO()  # write the result to a fresh buffer
    im.save(out, 'png', optimize=True)
    s64 = base64.b64encode(out.getvalue())
From what I understand, you're trying to resize the image after it has been saved locally (note that you could also resize it before it is saved). If that is what you want to achieve, you can open the image directly with Pillow by passing it the path; you do not have to call open(path, "rb") yourself:
image = Image.open(path)
image.resize((700,400)).save(path)

Send multiple StringIO from PIL Image in POST requests with Python

I have a picture stored on my computer. I open it using the Python Image module, then crop it into several pieces using the same module. Finally, I would like to upload the pieces to a website via POST requests.
Because the small images are PIL objects, I converted each of them into a StringIO so that I can send the form without having to save them on my PC.
Unfortunately, I get an error, whereas if the images are physically stored on my PC there is no problem. I do not understand why.
You can visit the website here: http://www.noelshack.com/api.php
This is a very basic API that returns the link to the uploaded picture.
In my case, the problem is that it returns nothing for the second image (there is no problem with the first).
Here is the code that crops the image into 100 pieces:
import requests
import Image
import StringIO
import os
image = Image.open("test.jpg")
width, height = image.size
images = []
for i in range(10):
    for j in range(10):
        crop = image.crop((i * 10, j * 10, (i + 1) * 10, (j + 1) * 10))
        images.append(crop)
The function to upload an image:
def upload(my_file):
    api_url = 'http://www.noelshack.com/api.php'
    r = requests.post(api_url, files={'fichier': my_file})
    if not 'www.noelshack.com' in r.text:
        raise Exception(r.text)
    return r.text
Now we have two possibilities. The first is to save each of the 100 images on disk and upload them.
if not os.path.exists("directory"):
    os.makedirs("directory")
i = 0
for img in images:
    img.save("directory/" + str(i) + ".jpg")
    i += 1
for file in os.listdir("directory"):
    with open("directory/" + file, "rb") as f:
        print upload(f)
It works like a charm, but it is not very convenient. So, I thought to use StringIO.
for img in images:
    my_file = StringIO.StringIO()
    img.save(my_file, "JPEG")
    print upload(my_file.getvalue())
    # my_file.close() -> Does not change anything
The first link is printed, but then the function raises the exception.
I think the problem lies in img.save(), because the same kind of for loop did not work when saving to disk and then uploading. In addition, if you add a time.sleep(1) between the uploads, it seems to work.
Any help would be welcome please, because I'm really stuck.
my_file.getvalue() returns a string. What you need is a file-like object, which my_file already is. File-like objects have a cursor, so to speak, which says where to read from or write to. After img.save() the cursor sits at the end of the data, so if you call my_file.seek(0) before the upload, it should be fixed.
Modify the code to:
for img in images:
    my_file = StringIO.StringIO()
    img.save(my_file, "JPEG")
    my_file.seek(0)
    print upload(my_file)
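The cursor behaviour described above can be seen with the standard library alone. This is a minimal sketch in Python 3, where io.BytesIO takes the place of StringIO.StringIO; the written bytes are a placeholder for what img.save(my_file, "JPEG") would produce.

```python
import io

buffer = io.BytesIO()
buffer.write(b"fake image bytes")  # stands in for img.save(buffer, "JPEG")

# After writing, the cursor sits at the end, so a read returns nothing.
at_end = buffer.read()

# Rewinding moves the cursor back to the start of the data.
buffer.seek(0)
from_start = buffer.read()

print(repr(at_end))
print(repr(from_start))
```

Any consumer that reads from the object's current position, as an upload typically does, therefore sees an empty payload unless the buffer is rewound first.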

Encode processed image to BASE64 in python for use in json [duplicate]

I am looking to create base64 inline-encoded image data for display in a table using canvases. Python generates the web page dynamically. As it stands, Python uses the Image module to create thumbnails. After all of the thumbnails are created, Python generates base64 data for each thumbnail and puts it into hidden spans on the user's webpage. The user then ticks the check marks next to the thumbnails they are interested in and clicks a "generate PDF" button. The JavaScript, using jsPDF, reads the base64 data from the hidden spans to create the images in the PDF file, and ultimately the PDF file itself.
I would like to shave down the Python script's execution time and minimize disk I/O by generating the base64 thumbnail data in memory while the script executes.
Here is an example of what I would like to accomplish.
import base64
import os, sys
import Image
size = 128, 128
im = Image.open("/original/image/1.jpeg")
im.thumbnail(size)
thumb = base64.b64encode(im)
Sadly, this doesn't work; I get a TypeError:
TypeError: must be string or buffer, not instance
Any thoughts on how to accomplish this?
You first need to save the image again in JPEG format; using the im.tostring() method (renamed tobytes() in current Pillow) would otherwise return raw image data that no browser would recognize:
from io import BytesIO
output = BytesIO()
im.save(output, format='JPEG')
im_data = output.getvalue()
You can then encode this to base64:
import base64
image_data = base64.b64encode(im_data)
if not isinstance(image_data, str):
    # Python 3, decode from bytes to string
    image_data = image_data.decode()
# The registered MIME type for JPEG is image/jpeg
data_url = 'data:image/jpeg;base64,' + image_data
In Python 3, you may need to use BytesIO:
from io import BytesIO
...
outputBuffer = BytesIO()
bg.save(outputBuffer, format='JPEG')
bgBase64Data = outputBuffer.getvalue()
# http://stackoverflow.com/q/16748083/2603230
return 'data:image/jpeg;base64,' + base64.b64encode(bgBase64Data).decode()
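Putting the steps from the answers above together, a thumbnail can be turned into a data URL without touching the disk. This is a sketch rather than a definitive implementation, assuming Pillow is installed: the helper name thumbnail_data_url is made up for illustration, and a generated image stands in for the file opened in the question.

```python
import base64
import io

from PIL import Image


def thumbnail_data_url(image, size=(128, 128)):
    """Return a JPEG data URL for a thumbnail of `image`, built in memory."""
    thumb = image.copy()   # leave the caller's image untouched
    thumb.thumbnail(size)  # resizes in place and keeps the aspect ratio
    if thumb.mode not in ("RGB", "L"):
        # Convert modes such as RGBA or P that JPEG cannot store directly.
        thumb = thumb.convert("RGB")
    buffer = io.BytesIO()
    thumb.save(buffer, format="JPEG")
    encoded = base64.b64encode(buffer.getvalue()).decode("ascii")
    return "data:image/jpeg;base64," + encoded


# Stand-in for Image.open("/original/image/1.jpeg")
picture = Image.new("RGB", (640, 480), "teal")
url = thumbnail_data_url(picture)
print(url[:40])
```

The returned string can be placed in a hidden span or used directly as the src attribute of an img element.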
thumb = base64.b64encode(im.tostring())
I think this would work.
I use PNG when I save to the buffer. With JPEG the numpy arrays are a bit different.
import base64
import io
import numpy as np
from PIL import Image
image_path = 'dog.jpg'
img2 = np.array(Image.open(image_path))
# Numpy -> b64
buffered = io.BytesIO()
Image.fromarray(img2).save(buffered, format="PNG")
b64image = base64.b64encode(buffered.getvalue())
# b64 -> Numpy
img = np.array(Image.open(io.BytesIO(base64.b64decode(b64image))))
print(img.shape)
np.testing.assert_almost_equal(img, img2)
Note that it will be slower.
