I downloaded a test image from Wikipedia (the tree seen below) to compare Pillow and OpenCV (using cv2) in Python. Perceptually the two decoded images appear the same, but their MD5 hashes don't match, and if I subtract the two images the result is not even close to solid black (the image shown below the original). The original image is a JPEG. If I convert it to a PNG first, the hashes match.
The last image shows the frequency distribution of the pixel value differences.
As Catree pointed out, my subtraction was causing integer overflow. I updated the code to convert to dtype=int before the subtraction (to show the negative values) and then take the absolute value before plotting the difference. Now the difference image is perceptually solid black.
This is the code I used:
from PIL import Image
import cv2
import sys
import hashlib
import numpy as np

def hashIm(path):
    # Decode with Pillow
    imP = np.array(Image.open(path))
    # Convert to BGR and drop alpha channel if it exists
    imP = imP[..., 2::-1]
    # Make the array contiguous again
    imP = np.ascontiguousarray(imP)
    # Decode with OpenCV
    im = cv2.imread(path)
    # Cast to int before subtracting so the result cannot wrap around
    diff = im.astype(int) - imP.astype(int)
    cv2.imshow('cv2', im)
    cv2.imshow('PIL', imP)
    cv2.imshow('diff', np.abs(diff).astype(np.uint8))
    cv2.imshow('diff_overflow', diff.astype(np.uint8))
    # Write the frequency of each signed difference value
    with open('dist.csv', 'w') as outfile:
        for i in range(-256, 256):
            outfile.write('{},{}\n'.format(i, np.count_nonzero(diff == i)))
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    return hashlib.md5(im.tobytes()).hexdigest() + ' ' + hashlib.md5(imP.tobytes()).hexdigest()

if __name__ == '__main__':
    print(sys.argv[1] + '\t' + hashIm(sys.argv[1]))
Frequency distribution updated to show negative values.
This is what I was seeing before I implemented the changes recommended by Catree.
The original image is a JPEG.
JPEG decoding can produce different results depending on the libjpeg version, compiler optimization, platform, etc.
Check which version of libjpeg Pillow and OpenCV are using.
See this answer for more information: JPEG images have different pixel values across multiple devices or here.
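Because small decoder differences are expected, comparing hashes of the decoded arrays is stricter than JPEG can guarantee. Below is a minimal sketch of a tolerance-based comparison using NumPy only. The helper names `max_abs_diff` and `decoders_agree`, the tolerance of 2, and the synthetic arrays are illustrative assumptions, not measured decoder output.

```python
import numpy as np

def max_abs_diff(a, b):
    """Largest per-pixel difference between two uint8 images of equal shape."""
    if a.shape != b.shape:
        raise ValueError("shape mismatch: {} vs {}".format(a.shape, b.shape))
    # Widen to a signed type first so the subtraction cannot wrap around.
    return int(np.abs(a.astype(np.int16) - b.astype(np.int16)).max())

def decoders_agree(a, b, tol=2):
    """True if no pixel differs by more than tol (tol=2 is an arbitrary choice)."""
    return max_abs_diff(a, b) <= tol

# Synthetic stand-ins for two decodes of the same JPEG (not real decoder output).
rng = np.random.default_rng(0)
decoded_a = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
noise = rng.integers(-1, 2, size=decoded_a.shape)  # off-by-one rounding noise
decoded_b = np.clip(decoded_a.astype(np.int16) + noise, 0, 255).astype(np.uint8)

print(max_abs_diff(decoded_a, decoded_b), decoders_agree(decoded_a, decoded_b))
```

With real files, `decoded_a` and `decoded_b` would be the arrays returned by the two libraries after bringing them to the same channel order.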
BTW, (im-imP) produces uint8 overflow (there is no way to have such a high amount of large pixel differences without seeing it in your frequency chart). Try to cast to int type before doing your frequency computation.
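The overflow can be reproduced on a few hand-picked values. This is a small sketch with made-up numbers (the arrays `a` and `b` are not taken from the images in the question), showing how a difference of -2 wraps to 254 in uint8 and how casting first keeps the sign:

```python
import numpy as np

a = np.array([10, 200, 100], dtype=np.uint8)
b = np.array([12, 199, 100], dtype=np.uint8)

wrapped = a - b                                    # uint8 arithmetic wraps modulo 256
signed = a.astype(np.int16) - b.astype(np.int16)   # widened first, keeps the sign

print(wrapped)
print(signed)

# Frequency table of the signed differences as {value: count}.
values, counts = np.unique(signed, return_counts=True)
histogram = dict(zip(values.tolist(), counts.tolist()))
print(histogram)
```

Building the frequency table from `signed` rather than `wrapped` is what makes the negative side of the distribution visible.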
I'm trying to save a 16-bit numpy array as a 16-bit PNG but what I obtain is only a black picture. Here is a minimal example of what I'm talking about.
import numpy as np
import matplotlib.pyplot as plt
im = np.random.randint(low=1, high=6536, size=65536).reshape(256,256) #sample numpy array to save as image
plt.imshow(im, cmap=plt.cm.gray)
Given the above numpy array, this is the image I see with matplotlib, but when I then save the image as a 16-bit PNG I obtain the picture below:
import imageio
imageio.imwrite('result.png', im)
Image saved:
where some light grey spots are visible but the image is substantially black. However, when I read the image back and visualize it again with matplotlib, I see the same starting image. I also tried other libraries instead of imageio (like PIL or PyPNG), with the same result.
I know that 16-bit image values range from 0 to 65535 and that the numpy array here only contains values from 1 to 6535, but I need to save numpy array images similar to this, i.e. where the maximum value represented in the image isn't the maximum representable value. I think that some sort of normalization is involved in the saving process. I need to save the arrays exactly as I see them in matplotlib, at their maximum resolution and without compression or shrinkage in their values (so division by 255 or conversion to an 8-bit array are not suitable).
It looks like imageio.imwrite will do the right thing if you convert the data type of the array to numpy.uint16 before writing the PNG file:
imageio.imwrite('result.png', im.astype(np.uint16))
When I do that, result.png is a 16 bit gray-scale PNG file.
If you want the image to have the full grayscale range from black to white, you'll have to scale the values to the range [0, 65535]. E.g. something like:
# np.ptp(im) works on NumPy 1.x and 2.x (the ndarray.ptp method was removed in 2.0)
im2 = (65535*(im - im.min())/np.ptp(im)).astype(np.uint16)
Then you can save that array with
imageio.imwrite('result2.png', im2)
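The unscaled file looks almost black because its largest value, 6535, is only about 10% of the 16-bit full scale of 65535. The scaling step can be wrapped in a small helper. This is a sketch only: the name `stretch_to_uint16` and the handling of constant images are my own assumptions, not part of imageio.

```python
import numpy as np

def stretch_to_uint16(im):
    """Linearly map im so its minimum becomes 0 and its maximum 65535."""
    im = np.asarray(im, dtype=np.float64)
    lo = im.min()
    span = im.max() - lo
    if span == 0:
        # A constant image has no range to stretch; return all zeros.
        return np.zeros(im.shape, dtype=np.uint16)
    return np.round((im - lo) / span * 65535.0).astype(np.uint16)

im = np.random.randint(low=1, high=6536, size=65536).reshape(256, 256)
im2 = stretch_to_uint16(im)
print(im.max() / 65535.0)      # fraction of full scale before stretching
print(im2.min(), im2.max())
```

Note that stretching changes the stored values, so it only suits cases where the file is meant for viewing. To keep the original numbers, save the unscaled uint16 array and let the viewer adjust the display range.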
For writing a NumPy array to a PNG file, an alternative is numpngw (a package that I created). For example,
from numpngw import write_png
# np.ptp(im) works on NumPy 1.x and 2.x (the ndarray.ptp method was removed in 2.0)
im2 = (65535*(im - im.min())/np.ptp(im)).astype(np.uint16)
write_png('result2.png', im2)
If you are already using imageio, there is probably no significant advantage to using numpngw. It is, however, a much lighter dependency than imageio: it depends only on NumPy (no dependence on PIL/Pillow and no dependence on libpng).
I want to create a script which takes a .HDR file and tonemaps it into a .JPG. I have looked at a few OpenCV tutorials and it seems it should be able to do this.
I have written this script:
import cv2
import numpy as np
filename = "image/gg.hdr"
im = cv2.imread(filename)
cv2.imshow('', im.astype(np.uint8))
cv2.waitKey(0)
tonemapDurand = cv2.createTonemapDurand(2.2)
ldrDurand = tonemapDurand.process(im.copy())
new_filename = filename + ".jpg"
im2_8bit = np.clip(ldrDurand * 255, 0, 255).astype('uint8')
cv2.imwrite(new_filename, ldrDurand)
cv2.imshow('', ldrDurand.astype(np.uint8))
According to the tutorials, this should work, but I am getting a black image in the end. I have verified that the result it saves is a .JPG, and that the input image (a 1.6 megapixel HDR environment map) is a valid .HDR.
OpenCV should be able to load .HDRs according to the documentation.
I tried reproducing the linked tutorial and that worked correctly, so the issue seems to be with the .HDR image. Does anybody know what to do?
Thanks
EDIT: I used this HDR image. Providing a link rather than a direct download due to copyright etc.
You were almost there, except for two small mistakes.
The first mistake is using cv2.imread to load the HDR image without specifying any flags. Unless you call it with IMREAD_ANYDEPTH, the data will be downscaled to 8-bit and you lose all that high dynamic range.
When you do specify IMREAD_ANYDEPTH, the image will be loaded in 32-bit floating-point format. This would normally have intensities in the range [0.0, 1.0], but because the image is HDR, the values exceed 1.0 (in this particular case they go up to about 22). This means that you won't be able to visualize it (in a useful way) by simply casting the data to np.uint8. You could perhaps normalize it first into the nominal range, or use the scale and clip method... whatever you find appropriate. Since the early visualization is not relevant to the outcome, I'll skip it.
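For readers who do want a quick preview, the two options mentioned above can be sketched with NumPy alone. The function names `preview_normalized` and `preview_clipped` and the tiny synthetic array are assumptions for illustration. Neither is a substitute for proper tone mapping.

```python
import numpy as np

def preview_normalized(hdr):
    """Divide by the maximum so the brightest pixel maps to 255."""
    peak = float(hdr.max())
    if peak <= 0:
        return np.zeros(hdr.shape, dtype=np.uint8)
    return np.round(hdr / peak * 255.0).astype(np.uint8)

def preview_clipped(hdr, scale=1.0):
    """Scale, then clip everything above the nominal range to white."""
    return np.round(np.clip(hdr * scale, 0.0, 1.0) * 255.0).astype(np.uint8)

# Synthetic float32 data standing in for an image loaded with IMREAD_ANYDEPTH.
hdr = np.array([[0.0, 0.25], [1.0, 22.0]], dtype=np.float32)
print(preview_normalized(hdr))
print(preview_clipped(hdr))
```

Normalizing keeps the highlights but makes everything else very dark. Clipping keeps the nominal range readable but blows out every value above 1.0.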
The second issue is trivial. You correctly scale and clip the tone-mapped image back to np.uint8, but then you never use it.
Script
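import cv2
import numpy as np
filename = "GoldenGate_2k.hdr"
# IMREAD_ANYDEPTH keeps the 32-bit float data instead of converting to 8-bit
im = cv2.imread(filename, cv2.IMREAD_ANYDEPTH)
# Written for OpenCV 3.x; on 4.x builds this tonemapper may only be available
# from the contrib xphoto module (cv2.xphoto.createTonemapDurand)
tonemapDurand = cv2.createTonemapDurand(2.2)
ldrDurand = tonemapDurand.process(im)
# Scale the tone-mapped result to 8-bit and save that array
im2_8bit = np.clip(ldrDurand * 255, 0, 255).astype('uint8')
new_filename = filename + ".jpg"
cv2.imwrite(new_filename, im2_8bit)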
Output
I have the following code:
import cv2 as cv
import numpy as np
from base64 import b64encode
im = cv.imread('outline.png', cv.IMREAD_UNCHANGED)
cv.imwrite('output.png', im)
# Read both files back as raw bytes and base64-encode them
with open('outline.png', 'rb') as f1:
    img1_b = b64encode(f1.read())
with open('output.png', 'rb') as f2:
    img2_b = b64encode(f2.read())
print(img1_b)
print(img2_b)
What is the reason that img1_b and img2_b are different? img2_b is much longer. Why?
I do not want to copy the file. I would like to process it before saving, but that part of the code is not included.
Both outline.png and output.png look the same after the operation.
What can I change in my code to make the value of img2_b the same as img1_b?
I have tried PIL Image with same result.
The phenomenon you have run into is the result of data compression not being 100% rigidly defined. PNG files use DEFLATE compression, which requires that a given compressed file always decompress to the same output, but does not require that a given input always produce the same compressed file. This leaves room for improvement in the compression algorithm, where a better compression may be found for a particular type of file. It sounds like your original image was compressed using a better (or just different) algorithm than the one cv2 is using. In order to duplicate the exact compressed version you'll likely need the exact same implementation of the compression algorithm that was used to create the original image.
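The same effect can be seen with Python's standard zlib module, which implements DEFLATE. This is a small sketch with a made-up payload rather than real PNG data: two compression levels give different compressed bytes that decode to identical data.

```python
import zlib

# Highly repetitive sample payload; any bytes would do.
payload = b"outline " * 1000

fast = zlib.compress(payload, 1)   # fastest setting
best = zlib.compress(payload, 9)   # most thorough setting

print(len(fast), len(best))
print(fast == best)                                     # compressed bytes differ
print(zlib.decompress(fast) == zlib.decompress(best))   # decoded data is the same
```

A PNG encoder has further choices beyond the DEFLATE level (row filters, chunk layout, metadata), so two valid files for the same pixels can differ in even more ways.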
If you want to ensure that the images are indeed identical, you should compare the decoded pixel values. In the name of not re-inventing the wheel, I'll refer you to this excellent blog post on the subject.
Edit: linked article wasn't loading consistently for me so I copied the code here for referencing.
import cv2
import numpy as np
original = cv2.imread("imaoriginal_golden_bridge.jpg")
duplicate = cv2.imread("images/duplicate.jpg")
# 1) Check if 2 images are equal
if original.shape == duplicate.shape:
    print("The images have same size and channels")
    # absdiff instead of subtract: subtract saturates at 0, so it would
    # miss pixels where duplicate is brighter than original
    difference = cv2.absdiff(original, duplicate)
    b, g, r = cv2.split(difference)
    if cv2.countNonZero(b) == 0 and cv2.countNonZero(g) == 0 and cv2.countNonZero(r) == 0:
        print("The images are completely Equal")
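If the decoded images are already NumPy arrays (which is what cv2.imread returns), an exact comparison can also be written without any subtraction. A saturating subtraction clamps negative results to zero, so a check that subtracts in one direction only can miss pixels that are brighter in the second image. The sketch below avoids that. The helper name `same_pixels` and the synthetic arrays are assumptions for illustration.

```python
import numpy as np

def same_pixels(a, b):
    """True only if both arrays have the same shape and identical values."""
    return a.shape == b.shape and bool(np.array_equal(a, b))

# Synthetic arrays standing in for two decoded images.
original = np.zeros((2, 2, 3), dtype=np.uint8)
duplicate = original.copy()
brighter = original.copy()
brighter[0, 0, 0] = 5   # one channel of one pixel differs

print(same_pixels(original, duplicate))
print(same_pixels(original, brighter))
```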
I'm attempting to make a reasonably simple code that will be able to read the size of an image and return all the RGB values. I'm using PIL on Python 2.7, and my code goes like this:
import os, sys
from PIL import Image
img = Image.open('C:/image.png')
pixels = img.load()
print(pixels[0, 1])
This code was actually taken from this site as a way to read a GIF file. I'm trying to get the code to print out an RGB tuple (in this case (55, 55, 55)), but all it gives me is a short sequence of unrelated numbers, usually containing 34.
I have tried many other examples of code, whether from here or not, but it doesn't seem to work. Is it something wrong with the .png format? Do I need to further code in the rgb part? I'm happy for any help.
My guess is that your image file is using pre-multiplied alpha values. The 8 values you see are pretty close to 55*34/255 (where 34 is the alpha channel value).
PIL uses the mode "RGBa" (with a lowercase a) to indicate when it's using premultiplied alpha. You may be able to tell PIL to convert the image to normal "RGBA", where the pixels will have roughly the values you expect:
img = Image.open('C:/image.png').convert("RGBA")
Note that if your image isn't supposed to be partly transparent at all, you may have larger issues going on. We can't help you with that without knowing more about your image.
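The arithmetic behind that guess can be sketched in plain Python. The functions `premultiply` and `unpremultiply` are hypothetical helpers written for this illustration. Pillow's internal rounding may differ slightly, which is why the recovered values are only approximate.

```python
def premultiply(channel, alpha):
    """Scale an 8-bit colour channel by an 8-bit alpha value."""
    return round(channel * alpha / 255)

def unpremultiply(channel, alpha):
    """Approximate inverse of premultiply; precision is lost when alpha is small."""
    if alpha == 0:
        return 0
    return min(255, round(channel * 255 / alpha))

print(premultiply(55, 34))
print(unpremultiply(8, 34))
```

With an alpha as low as 34, each premultiplied step covers several original values, so converting back cannot restore 55 exactly.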
I am working with 2D floating-point numpy arrays and saving them as .png files with high precision (see this question for how I came to this point). To do this I use the freeimage plugin, as in that linked question.
This creates a weird behaviour where the images are flipped (both left-right and up-down) if saved to 16-bit. This behaviour happens only for RGB or RGBA images, not for greyscale images. Here is some example code:
import numpy as np
from skimage import io, img_as_uint, img_as_ubyte
im = np.random.uniform(size=(256, 256))
im[:128, :128] = 1
im = img_as_ubyte(im)
io.use_plugin('freeimage')
io.imsave('test_1.png', im)
creates the following picture:
When I try to save this in 16-bit, I get the same result (albeit taking 99 kB instead of 50, so I know the bit depth is working).
Now do the same as an RGB image:
im = np.random.uniform(size=(256, 256, 3))
im[:128, :128] = 1
im = img_as_ubyte(im)
io.use_plugin('freeimage')
io.imsave('test_1.png', im)
The 8-bit result is:
but doing the following
im = img_as_uint(im)
io.use_plugin('freeimage')
io.imsave('test_1.png', im)
gives me
This happens if the array contains an alpha level too.
It can be fixed by including
im = np.fliplr(np.flipud(im))
before saving. However, it seems to me this is pretty weird behaviour and not very desirable. Any idea why this is happening or whether it is intended? As far as I could see it's not documented.
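For reference, the double flip used in the workaround is the same as a 180-degree rotation of the first two axes, which suggests the rows and columns are both being written in reverse order. A small NumPy-only check on a made-up array (not the freeimage output itself):

```python
import numpy as np

im = np.arange(2 * 3 * 3).reshape(2, 3, 3)   # tiny stand-in for an RGB image

flipped = np.fliplr(np.flipud(im))

# Equivalent spellings of the same 180-degree rotation of the first two axes.
print(np.array_equal(flipped, im[::-1, ::-1]))
print(np.array_equal(flipped, np.rot90(im, 2)))
# Applying the flip twice restores the original array.
print(np.array_equal(np.fliplr(np.flipud(flipped)), im))
```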