Weird interaction with Python PIL image.save quality parameter - python

This is just a part of a project I'm currently working on. I am trying to convert a picture into text, then from the text back to the picture, without any loss or extra size.
First, I open the picture, read the pixels, and write them down. Pictures are size NxN.
from PIL import Image
import sys
import zlib

def rgb_to_hex(rgb):
    return '%02x%02x%02x' % rgb

im = Image.open(r"path\pic.png")
px = im.load()
N = im.width

read_pixels = ""
for i in range(N):
    for j in range(N):
        read_pixels += rgb_to_hex(px[j, i])
Then, transform the string into bytes.
data = bytes.fromhex(read_pixels)
img = Image.frombytes("RGB", (N,N), data)
img.save("path\\new.png", quality=92)
According to the official Pillow documentation, quality ranges from 0 to 100 and values above 95 should be avoided; if nothing is set, the default value is 75.
For example I used this picture.
The original photo takes up 917 KB when downloaded. After the program converts it, the new picture takes up 911 KB. When I take that new 911 KB picture and run it through the same program, I get back the same 911 KB; it does not shrink by a few KB again, and I do not know why. Why does this happen only with the original 917 KB picture? Is there a way I could keep 100% of the original quality?
I also tried this on a random 512x512 .jpg picture. Its original size is 67.4 KB; the next "generation" of that picture is 67.1 KB, and the one after that is 66.8 KB. If I set quality to 93 or above (when using .jpg), the size goes up by a lot (at quality = 100, size > 135 KB). Playing around with the quality value, I found 92 comes closest to keeping the same size (93 and above adds extra KB for .jpg).
So with quality 92 .PNG the size stays the same after the first "generation" but with .jpg the size (and potentially quality) goes down.
Is there something I am missing in my code? My best guess is that .PNG stores some extra information about the picture which is lost in the conversion, but I am not sure why the .jpg pictures decrease in size every generation. I tried a quality of 92.5, but the function does not accept decimal numbers.

Quick takeaways from the following explanations...
The quality parameter for PIL.Image.save isn't used when saving PNGs.
JPEG is generationally lossy, so as you keep re-saving images they will likely degrade in quality, because the algorithm introduces more artifacts (among other things).
PNG is lossless and the file size differences you're seeing are due to PIL stripping metadata when you re-save your image.
Let's look at your PNG file first. PNG is a lossless format - the image data you give it will not suffer generational loss if you were to open it and re-save it as PNG over and over again.
The quality parameter isn't even recognized by the PNG plugin for PIL - if you look at the _save function in PngImagePlugin.py, it is never referenced in there.
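A quick way to convince yourself of this (a sketch of mine, not part of the original answer): save the same image twice with wildly different quality values and compare the raw bytes. The PNG writer ignores the parameter, so the outputs come out identical.

```python
from io import BytesIO
from PIL import Image

# Save the same image as PNG with two very different "quality" values.
img = Image.new("RGB", (64, 64), (200, 30, 30))

low, high = BytesIO(), BytesIO()
img.save(low, format="PNG", quality=5)
img.save(high, format="PNG", quality=95)

# The PNG plugin never reads 'quality', so the files are byte-identical.
print(low.getvalue() == high.getvalue())  # prints: True
```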
What's happening with your specific sample image is that Pillow is dropping some metadata when you re-save it in your code.
On my test system, I have your PNG saved as sample.png, and I did a simple load-and-save with the following code, saving the result as output.png (inside IPython):
In [1]: from PIL import Image
In [2]: img = Image.open("sample.png")
In [3]: img.save("output.png")
Now let's look at the differences between their metadata with ImageMagick:
#> diff <(magick identify -verbose output.png) <(magick identify -verbose sample.png)
7c7,9
< Units: Undefined
---
> Resolution: 94.48x94.48
> Print size: 10.8383x10.8383
> Units: PixelsPerCentimeter
74c76,78
< Orientation: Undefined
---
> Orientation: TopLeft
> Profiles:
> Profile-exif: 5218 bytes
76,77c80,81
< date:create: 2022-08-12T21:27:13+00:00
< date:modify: 2022-08-12T21:27:13+00:00
---
> date:create: 2022-08-12T21:23:42+00:00
> date:modify: 2022-08-12T21:23:31+00:00
78a83,85
> exif:ImageDescription: IMGP5493_seamless_2.jpg
> exif:ImageLength: 1024
> exif:ImageWidth: 1024
84a92
> png:pHYs: x_res=9448, y_res=9448, units=1
85a94,95
> png:text: 1 tEXt/zTXt/iTXt chunks were found
> png:text-encoded profiles: 1 were found
86a97
> unknown: nomacs - Image Lounge 3.14
90c101
< Filesize: 933730B
---
> Filesize: 939469B
93c104
< Pixels per second: 42.9936MP
---
> Pixels per second: 43.7861MP
You can see there are metadata differences - PIL didn't retain some of the information when re-saving the image, especially some EXIF properties (you can see this PNG was actually converted from a JPEG, and the EXIF metadata was preserved in that conversion).
However, if you re-save the image with the original image's info data...
In [1]: from PIL import Image
In [2]: img = Image.open("sample.png")
In [3]: img.save("output-with-info.png", info=img.info)
You'll see that the two files are exactly the same again:
❯ sha256sum output.png output-with-info.png
37ad78a7b7000c9430f40d63aa2f0afd2b59ffeeb93285b12bbba9c7c3dec4a2 output.png
37ad78a7b7000c9430f40d63aa2f0afd2b59ffeeb93285b12bbba9c7c3dec4a2 output-with-info.png
Maybe Reducing PNG File Size
While lossless, the PNG format does allow for reducing the size of the image by specifying how aggressive the compression is (there are also more advanced things you could do like specifying a compression dictionary).
PIL exposes these options as optimize and compress_level under PNG options.
optimize
If present and true, instructs the PNG writer to make the
output file as small as possible. This includes extra
processing in order to find optimal encoder settings.
compress_level
ZLIB compression level, a number between 0 and 9: 1 gives
best speed, 9 gives best compression, 0 gives no
compression at all. Default is 6. When optimize option is
True compress_level has no effect (it is set to 9 regardless
of a value passed).
And seeing it in action...
from PIL import Image
img = Image.open("sample.png")
img.save("optimized.png", optimize=True)
The resulting image I get is about 60K smaller than the original.
❯ ls -lh optimized.png sample.png
-rw-r--r-- 1 wkl staff 843K Aug 12 18:10 optimized.png
-rw-r--r-- 1 wkl staff 918K Aug 12 17:23 sample.png
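To see that compress_level only trades encoding speed for file size, and never touches the pixels, here is a small sketch of mine with an invented synthetic test pattern:

```python
from io import BytesIO
from PIL import Image

# A compressible synthetic test pattern.
img = Image.new("RGB", (128, 128))
img.putdata([(x % 256, y % 256, (x * y) % 256)
             for y in range(128) for x in range(128)])

fast, small = BytesIO(), BytesIO()
img.save(fast, format="PNG", compress_level=1)   # fastest
img.save(small, format="PNG", compress_level=9)  # smallest

print(len(fast.getvalue()), len(small.getvalue()))

# Lossless at every level: the decoded pixels are identical.
small.seek(0)
print(Image.open(small).tobytes() == img.tobytes())  # prints: True
```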
JPEG File
Now, JPEG is a generationally-lossy image format: as you save it over and over, you keep losing quality. It doesn't matter if subsequent generations are saved at even higher qualities than the previous ones; the data was already lost in the earlier saves.
Note that the likely reason you saw file sizes balloon when you used quality=100 is that libjpeg/libjpeg-turbo (the underlying libraries PIL uses for JPEG) skip certain things when the quality is set that high; I believe quantization, an important step in determining how many bits are needed to compress, effectively stops discarding information at that setting.
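The generational loss is easy to demonstrate with a short sketch of mine (random noise is the worst case for JPEG, so the drift from the original is guaranteed):

```python
from io import BytesIO
import random

from PIL import Image

# Build a noise image - impossible for JPEG to encode losslessly.
random.seed(0)
img = Image.frombytes("RGB", (64, 64),
                      bytes(random.randrange(256) for _ in range(64 * 64 * 3)))
original = img.tobytes()

# Re-save three generations at the same quality.
for generation in range(3):
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=92)
    buf.seek(0)
    img = Image.open(buf)

print(img.tobytes() == original)  # prints: False - data was lost
```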

Related

PIL Image.fromarray reducing the size of my image when saved

my code:
from PIL import Image
import numpy as np

img = Image.open('this.jpg')  # 2.72 MB image
arrayA = np.array(img)
new_img = Image.fromarray(arrayA)
new_img.save("this_changed.jpg")  # comes out as a 660 kB image
PIL will compress and subsample your image in order to save space. Try deactivating compression and subsampling:
new_img.save("this_changed.jpg",
             quality=100,    # preserve color and image quality
             subsampling=0)  # keep full chroma resolution
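A self-contained sketch of mine (not from the answer above) showing what subsampling=0 does to the file size - full-resolution chroma usually costs extra bytes, which is the trade-off for better color fidelity:

```python
from io import BytesIO
import random

from PIL import Image

# A colorful noise image makes the chroma channels expensive to encode.
random.seed(1)
img = Image.frombytes("RGB", (64, 64),
                      bytes(random.randrange(256) for _ in range(64 * 64 * 3)))

default_buf, full_buf = BytesIO(), BytesIO()
img.save(default_buf, format="JPEG", quality=90)              # default 4:2:0 chroma
img.save(full_buf, format="JPEG", quality=90, subsampling=0)  # full 4:4:4 chroma

print(len(default_buf.getvalue()), len(full_buf.getvalue()))
```

With subsampling disabled, the file should come out larger because the chroma planes are stored at full resolution.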
PIL saves JPEG images with a quality of 75 by default, i.e. if you don't say otherwise. If the original image was saved with quality 100 and you re-save it as per your code, PIL will write it at quality 75, which results in a smaller file size.
Let's make an example. Here is an image with quality 100:
Let's check its quality and file size with exiftool:
exiftool -filesize -JPEGQualityEstimate -n image.jpg
File Size : 896738
JPEG Quality Estimate : 100
Run your code, and you'll get:
exiftool -filesize -JPEGQualityEstimate -n result.jpg
File Size : 184922
JPEG Quality Estimate : 75
Notice it is lower quality and smaller.
Now tell PIL to keep the quality of the original:
im.save('result.jpg', quality='keep')
and check again:
exiftool -filesize -JPEGQualityEstimate -n result.jpg
File Size : 1261893
JPEG Quality Estimate : 100
Notice the quality setting has been retained and the file is now much larger - we'll come to that in a minute.
Just for completeness, let's tell PIL to use a quality of 85%:
im.save('result.jpg', quality=85)
and check results:
exiftool -filesize -JPEGQualityEstimate -n result.jpg
File Size : 232407
JPEG Quality Estimate : 85
The next question is why the file size is different from the original when specifying 'keep'. That is because JPEG is lossy and different encoders are allowed to make different decisions about how they do chroma-subsampling, what quantisation tables they use, whether they use integer or floating point accuracy and so on to optimise either filesize, or encoding speed or some other factor that was important to the author of the encoder. As such, you might find different Python/C++ packages produce different absolute values for pixels and different file sizes and in fact, these can change from one release/version of a library to the next. If you want your image data to remain identical, you should consider using PNG or another lossless format, e.g. NetPBM, TIFF.
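As a final note on quality='keep' (a sketch of mine): the option is only valid when the source image is itself a JPEG, because PIL reuses the quantization tables and subsampling it parsed from the original file.

```python
from io import BytesIO
import random

from PIL import Image

# Create a JPEG in memory to act as the 'original'.
random.seed(2)
src = Image.frombytes("RGB", (64, 64),
                      bytes(random.randrange(256) for _ in range(64 * 64 * 3)))
buf = BytesIO()
src.save(buf, format="JPEG", quality=85)
buf.seek(0)

jpg = Image.open(buf)   # jpg.format == 'JPEG', so 'keep' is allowed
out = BytesIO()
jpg.save(out, format="JPEG", quality="keep")

print(out.getvalue()[:2] == b"\xff\xd8")  # prints: True - a valid JPEG SOI marker
```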

Convert multi-channel numpy array to photoshop PSD file

I have a numpy array with 3 RGB channels and two alpha channels. (I am using python)
I want to convert it to a Photoshop .psd file, so I can later apply transformations in photoshop on the annotated alpha layers.
I guess it is a very simple task, but I haven't found any way to do it in the packages my googling turned up.
I guess it should be something along the lines of the following:
>>> im.shape
(.., .., 5)

psd = PSD()
psd.add_layers_from_numpy(im, names=["R", "G", "B", "alpha-1", "alpha-2"])
with open(of, 'wb') as f:
    psd.write(f)
If you know how to do this, please let me know.
Thanks in advance!
I opened your PSD file in Photoshop and saved it as a TIFF. I then checked with tiffinfo and determined that your file is saved as RGB with 3 layers of "Extra samples":
tiffinfo MULTIPLE_ALPHA.tif
TIFF Directory at offset 0x8 (8)
Subfile Type: (0 = 0x0)
Image Width: 1000 Image Length: 1430
Resolution: 72, 72 pixels/inch
Bits/Sample: 8
Compression Scheme: None
Photometric Interpretation: RGB color <--- HERE
Extra Samples: 3<unspecified, unspecified, unspecified> <--- HERE
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 6
Rows/Strip: 1430
Planar Configuration: single image plane
Software: Adobe Photoshop 21.2 (Macintosh)
DateTime: 2020:09:08 19:34:38
You can load that into Python with:
import tifffile
import numpy as np
# Load TIFF saved by Photoshop
im = tifffile.imread('MULTIPLE_ALPHA.tif')
# Check shape
print(im.shape) # prints (1430, 1000, 6)
# Save with 'tifffile'
tifffile.imwrite('saved.tif', im, photometric='RGB')
And now check that Photoshop looks at and treats the tifffile image the same as your original:
You may want to experiment with the compress parameter (renamed compression in newer versions of tifffile). I noticed your file comes out as 8.5 MB uncompressed, but as 2.4 MB of lossless Zip-compressed data if I use:
tifffile.imwrite('saved.tif', im, photometric='RGB', compress=1)
Note that reading/writing with compression requires you to install imagecodecs:
pip install imagecodecs
Note that I am not suggesting it is impossible with a PSD-writer package, I am just saying I believe you can get what you want with a TIFF.
Keywords: Image processing, Photoshop, PSD, TIFF, save multi-channel, multi-layer, multi-image, multiple alpha, tifffile, Python

How to adjust Pillow EPS to JPG quality

I'm trying to convert EPS images to JPEG using Pillow. But the results are of low quality. I'm trying to use the resize method, but it gets completely ignored. I set the size of the JPEG image to (3600, 4700), but the resulting image has size (360, 470). My code is:
eps_image = Image.open('img.eps')
height = eps_image.height * 10
width = eps_image.width * 10
new_size = (height, width)
print(new_size)  # prints (3600, 4700)
eps_image.resize(new_size, Image.ANTIALIAS)
eps_image.save(
    'img.jpeg',
    format='JPEG',
    dpi=(9000, 9000),
    quality=95)
UPD. Vasu Deo.S noticed one of my errors, and thanks to him the JPG image has become bigger, but the quality is still low. I've tried different DPI values, sizes, and resample values for the resize function, but the result does not change much. How can I make it better?
The problem is that PIL is a raster image processor, as opposed to a vector image processor. It "rasterises" vector images (such as your EPS file and SVG files) onto a grid when it opens them because it can only deal with rasters.
If that grid doesn't have enough resolution, you can never regain it. Normally, it rasterises at 100dpi, so if you want to make bigger images, you need to rasterise onto a larger grid before you even get started.
Compare:
from PIL import Image
eps_image = Image.open('image.eps')
eps_image.save('a.jpg')
The result is 540x720:
And this:
from PIL import Image
eps_image = Image.open('image.eps')
# Rasterise onto 4x higher resolution grid
eps_image.load(scale=4)
eps_image.save('a.jpg')
The result is 2160x2880:
You now have enough quality to resize however you like.
Note that you don't need to write any Python to do this at all - ImageMagick will do it all for you. It is included in most Linux distros and is available for macOS and Windows and you just use it in Terminal. The equivalent command is like this:
magick -density 400 input.eps -resize 800x600 -quality 95 output.jpg
It's because eps_image.resize(new_size, Image.ANTIALIAS) returns a resized copy of the image. Therefore you have to store it in a separate variable. Just change:
eps_image.resize(new_size, Image.ANTIALIAS)
to
eps_image = eps_image.resize(new_size, Image.ANTIALIAS)
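To make the point concrete, here is a minimal sketch of mine: resize() hands back a new image and leaves the original untouched, which is why the assignment above is required.

```python
from PIL import Image

img = Image.new("RGB", (360, 470))
bigger = img.resize((3600, 4700), Image.LANCZOS)  # LANCZOS replaced ANTIALIAS

print(img.size)     # (360, 470) - the original is unchanged
print(bigger.size)  # (3600, 4700)
```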
UPDATE:-
These may not solve the problem completely, but they should still help.
You are trying to save your output image as .jpeg, which is a lossy compression format, so information is lost during the compression/transformation (for the most part). Change the output file extension to a lossless compression format like .png so that data is not compromised during compression. Also change quality=95 to quality=100 in Image.save().
You are using Image.ANTIALIAS for resampling the image, which is not that good when upscaling (it has been replaced by Image.LANCZOS in newer versions; the old name still exists for backward compatibility). Try using Image.BICUBIC, which produces quite favorable results (for the most part) when upscaling.

How can one specify dpi when saving image as tif via scipy.misc.imsave?

I am taking a screen-shot (PNG format) resizing it, and writing it back out in TIF format, via scipy.misc module (imread, imresize, imsave functions). The TIF format image is to be fed into Tesseract-OCR. However, Tesseract is complaining that the dpi specified in the TIF file's metadata is 0. How can one specify this when saving the image via scipy.misc.imsave or any other method?
Without analyzing where exactly your problems come from, Mark's approach (maybe that's enough for you; maybe not - I can imagine something else in your code might be the reason) can be emulated using Pillow (I don't see an option for this within scipy's wrapper).
Actually, instead of rewriting the tags as he does, we set them while doing the original save. In practice both approaches should be okay.
With very high probability, scipy is already using Pillow under the hood ("Note that Pillow (https://python-pillow.org/) is not a dependency of SciPy, but the image manipulation functions indicated in the list below are not available without it."; this list contains imsave).
from scipy.misc import ascent  # test image
import PIL.Image

scipy_img = ascent().astype('uint8')
arr2im = PIL.Image.fromarray(scipy_img)
arr2im.save('test.tif', format='TIFF',
            dpi=(100., 100.),  # there still seems to be a bug when using ints here
            compression='tiff_lzw')
Checking with exiftool:
ExifTool Version Number : 10.63
File Name : test.tif
...
Image Width : 512
Image Height : 512
Bits Per Sample : 8
Compression : LZW
...
X Resolution : 100
Y Resolution : 100
...
Resolution Unit : inches
Image Size : 512x512
Megapixels : 0.262
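For completeness, the same round trip can be checked in pure Pillow without leaving Python (my own sketch, using an in-memory TIFF rather than a file on disk):

```python
from io import BytesIO
from PIL import Image

# Write the resolution tags at save time...
img = Image.new("L", (32, 32))
buf = BytesIO()
img.save(buf, format="TIFF", dpi=(300.0, 300.0))
buf.seek(0)

# ...and read them back from the reopened file.
reread = Image.open(buf)
print(tuple(float(v) for v in reread.info["dpi"]))  # prints: (300.0, 300.0)
```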
Please file this one under "any other method" :-)
You can set the resolution with exiftool like this:
exiftool SomeImage.tif -xresolution=300 -yresolution=300 -resolutionunit=inches
Check it with ImageMagick:
identify -verbose SomeImage.tif
Image: SomeImage.tif
Format: TIFF (Tagged Image File Format)
Mime type: image/tiff
Class: DirectClass
Geometry: 100x100+0+0
Resolution: 300x300
Print size: 0.333333x0.333333
...
...
I am suggesting you shell out to run this command with os.system().
A Python wrapper exists, but I have never used it and cannot vouch for it.

Python: Remove Exif info from images

In order to reduce the size of images to be used in a website, I reduced the quality to 80-85%. This decreases the image size quite a bit, up to an extent.
To reduce the size further without compromising the quality, my friend pointed out that raw images from cameras have a lot of metadata called Exif info. Since there is no need to retain this Exif info for images in a website, we can remove it. This will further reduce the size by 3-10 kB.
But I'm not able to find an appropriate library to do this in my Python code. I have browsed through related questions and tried out some of the methods:
Original image: http://mdb.ibcdn.com/8snmhp4sjd75vdr27gbadolc003i.jpg
Mogrify
/usr/local/bin/mogrify -strip filename
Result: http://s23.postimg.org/aeaw5x7ez/8snmhp4sjd75vdr27gbadolc003i_mogrify.jpg
This method reduces the size from 105 kB to 99.6 kB, but also changed the color quality.
Exif-tool
exiftool -all= filename
Result: http://s22.postimg.org/aiq99o775/8snmhp4sjd75vdr27gbadolc003i_exiftool.jpg
This method reduces the size from 105 kB to 72.7 kB, but also changed the color quality.
This answer explains in detail how to manipulate the Exif info, but how do I use it to remove the info?
Can anyone please help me remove all the extra metadata without changing the colours, dimensions, and other properties of an image?
from PIL import Image
image = Image.open('image_file.jpeg')
# next 3 lines strip exif
data = list(image.getdata())
image_without_exif = Image.new(image.mode, image.size)
image_without_exif.putdata(data)
image_without_exif.save('image_file_without_exif.jpeg')
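A quick in-memory check of this approach (my sketch; the tag 0x010F "Make" and its value are invented purely for illustration):

```python
from io import BytesIO
from PIL import Image

# Build a JPEG carrying one EXIF tag.
exif = Image.Exif()
exif[0x010F] = "ExampleCamera"  # 0x010F is the standard 'Make' tag
buf = BytesIO()
Image.new("RGB", (32, 32)).save(buf, format="JPEG", exif=exif)
buf.seek(0)

image = Image.open(buf)
print(dict(image.getexif()))    # prints: {271: 'ExampleCamera'}

# Rebuild the image from raw pixel data only - no EXIF survives.
stripped = Image.new(image.mode, image.size)
stripped.putdata(list(image.getdata()))
out = BytesIO()
stripped.save(out, format="JPEG")
print(dict(Image.open(BytesIO(out.getvalue())).getexif()))  # prints: {}
```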
For me, gexiv2 works fine:
#!/usr/bin/python3
from gi.repository import GExiv2
exif = GExiv2.Metadata('8snmhp4sjd75vdr27gbadolc003i.jpg')
exif.clear_exif()
exif.clear_xmp()
exif.save_file()
See also Exif manipulation library for python, which you linked, but didn't read all answers ;)
You can try loading the image with the Python Imaging Library (PIL) and then saving it again to a different file. That should remove the metadata.
You don't even need to do the extra steps #user2141737 suggested. Just opening it up with PIL and saving it again seems to do the trick just fine:
from PIL import Image
image = Image.open('path/to/image')
image.save('new/path/' + file_name)
As of pillow==9.2.0:
This seems to print the image's metadata, a mutable mapping:
print(im.info)
And this seems to clear the metadata when re-saving a PNG:
def clear_exif():
    with Image.open('./my_image.png', mode='r', formats=['PNG']) as im:
        fields_to_keep = ('transparency', )
        exif_fields = list(im.info.keys())
        for k in exif_fields:
            if k not in fields_to_keep:
                del im.info[k]
        im.save('./my_image.png', format='PNG')
