OpenCV phase correlation differs from SciKit

I'm trying to port some code that was originally written in scikit to OpenCV, as I already use OpenCV for some other tasks. I have these two images:
which are essentially the polar forms of two images that share a common center, where one image is a rotation of the other. I need to use phase correlation to determine what this angle is. In OpenCV, I'm doing:
import cv2
import numpy as np
im1 = np.float32(cv2.cvtColor(cv2.imread('polar-part.png'), cv2.COLOR_BGR2GRAY))
im2 = np.float32(cv2.cvtColor(cv2.imread('polar-template.png'), cv2.COLOR_BGR2GRAY))
print(cv2.phaseCorrelate(im1, im2))
Which produces the incorrect answer
((-0.07302320870314816, -0.19596856380076133), 0.03418491033860195)
In Scikit, I do
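from skimage.color import rgb2gray
from skimage.io import imread
from skimage.registration import phase_cross_correlation
template_polar = rgb2gray(imread('polar-template.png'))
up_cam_polar = rgb2gray(imread('polar-part.png'))
print(phase_cross_correlation(up_cam_polar, template_polar, upsample_factor=20))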
which produces the correct answer of
(array([ 1.3625e+02, -5.0000e-02]), 0.2080647049014251, 2.6434620698588315e-07)
The important number here is the y shift, which is about 136. This is the correct number of pixels to translate one image onto the other.
Why does OpenCV give back a drastically different answer?

As best I can tell from the documentation, the difference is as follows.
skimage returns the correct offsets for your images. According to its documentation, it:
Returns the (y,x) shifts and the normalized rms error.
OpenCV returns incorrect offsets for your images (phase correlation can be very sensitive to noise and to non-cyclic images). The only argument available for tuning is the window. According to its documentation, it:
Returns the (x,y) shifts and the sum of the elements of the correlation (r) within the 5x5 centroid around the peak location. It is normalized to a maximum of 1
Here is a different example:
Input 1:
Input 2:
Results:
opencv: ((20.20846249610655, 22.459076722413144), 0.5959940360504904)
skimage: (array([-22., -20.]), 0.31940809429051836, -2.0134398e-10)
which shows equivalent shifts. Note that OpenCV reports the shift as (x, y) while skimage reports it as (y, x). The signs are opposite because the two methods assign the reference and target images differently. The correlation metric values differ because they are computed and normalized in different ways.
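To illustrate why the sign depends on which image is treated as the reference, here is a minimal NumPy-only sketch of phase correlation. It is not the implementation used by OpenCV or scikit-image (there is no windowing and no subpixel refinement), and the function name phase_correlation_shift and its (dy, dx) return order are assumptions made for this example.

```python
import numpy as np

def phase_correlation_shift(reference, moving):
    """Integer (dy, dx) such that np.roll(moving, (dy, dx), axis=(0, 1)) matches reference.

    Hypothetical helper for illustration; assumes two 2-D arrays of equal shape.
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)
    cross = f_ref * np.conj(f_mov)
    # Keep only the phase of the cross-power spectrum.
    cross /= np.maximum(np.abs(cross), 1e-12)
    corr = np.fft.ifft2(cross).real
    peak = np.array(np.unravel_index(np.argmax(corr), corr.shape))
    # Peaks past the midpoint correspond to negative (wrapped) shifts.
    shape = np.array(corr.shape)
    wrap = peak > shape // 2
    peak[wrap] -= shape[wrap]
    return peak

rng = np.random.default_rng(0)
moving = rng.random((64, 64))
reference = np.roll(moving, (-22, -20), axis=(0, 1))
print(phase_correlation_shift(reference, moving))  # shift in (dy, dx) order
print(phase_correlation_shift(moving, reference))  # arguments swapped: signs flip
```

Swapping the two arguments negates the result, which is the same kind of sign difference seen between the two libraries above.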

Related

How do I normalize the pixel value of an image to 0~1?

The type of my train_data is 'Array of uint16' and its shape is (96108,7,7), so there are 96108 images.
These images differ from ordinary images. My sensor is 7x7, and each of its 49 pixels contains the number of light detections. Each image holds the number of detections over a one-second interval (0 to 1 second). Since the sensor detects light randomly per unit time, the maximum pixel value is different in every image.
If the maximum value of all images were 255, I could do 'train_data/255', but I can't use that division because the maximum value differs from image to image.
I want to scale the pixel values of all images to the range 0 to 1.
What should I do?

Contrast Normalization (or contrast stretch) should not be confused with Data Normalization which maps data between 0.0-1.0.
Data Normalization
We use the following formula to normalize data. The min() and max() values are the minimum and maximum values that the data type can represent:
normalized_i = (x_i - min(x)) / (max(x) - min(x))
When we use it with images, x is the whole image and i is an individual pixel of that image. If you are using an 8-bit image, the min() and max() values become 0 and 255 respectively. These should not be confused with the minimum and maximum values actually present in your image.
To convert an 8-bit image into a floating-point image, min() is 0, so the math simplifies to image/255.
img = img/255
NumPy methods output arrays in 64-bit floating-point by default. To effectively test methods applied to 8-bit images with NumPy, an 8-bit array is required as the input:
image = np.random.randint(0,255, (7,7), dtype=np.uint8)
normalized_image = image/255
When we examine the output of the above two lines, we can see that the maximum value of the image in this run is 252, which is now mapped to 0.9882352941176471 in the 64-bit normalized image.
However, in most cases you don't need a 64-bit image. You can output (in other words, cast) it to 32-bit (or 16-bit) floating point using the following code. If you try to cast it to 8-bit, it will throw an error. Using '/' for division is shorthand for np.true_divide but lacks the ability to define the output data type.
normalized_image_2 = np.true_divide(image, 255, dtype=np.float32)
The properties of the new array are shown below. You can see that the number of digits is now reduced and 252 has been remapped to 0.9882353.
Contrast Normalization
The method shown by #3dSpatialUser effectively does a partial contrast normalization, meaning it stretches the intensities of the image within the available intensity range. Test it with an 8-bit array with the following code.
c_image = np.random.randint(64,128, (7,7), dtype=np.uint8)
cn_image = (c_image - c_image.min()) / (c_image.max()- c_image.min())
Contrast is now stretched mapping minimum contrast of 64 to 0.0 and maximum 127 to 1.0.
The formula for contrast normalization uses the minimum and maximum values actually present in the image, scaled to the output range. For an 8-bit output:
output_i = (x_i - min(x)) / (max(x) - min(x)) * 255
Using the above formula with NumPy, to remap the data back to the 8-bit input format after contrast normalization, the image should be multiplied by 255 and the data type changed back to uint8:
cn_image_correct = (c_image - c_image.min()) / (c_image.max()- c_image.min()) * 255
cn_image_correct = cn_image_correct.astype(np.uint8)
64 is now mapped to 0 and 127 is mapped to 255, stretching the contrast.
Where the confusion arises
In most applications, the intensity values of an image are spread close to the minimum and maximum of the available range. Hence, when we apply the normalization formula using the min and max values present in the image, instead of the min and max of the available range, it outputs a better-looking image (in most cases) within the 0.0-1.0 range, which effectively normalizes both data and contrast at the same time. Also, image editing software performs gamma correction or remapping when switching between 8/16/32-bit image data types.

import numpy as np
data = np.random.normal(loc=0, scale=1, size=(96108, 7, 7))
data_min = np.min(data, axis=(1,2), keepdims=True)
data_max = np.max(data, axis=(1,2), keepdims=True)
scaled_data = (data - data_min) / (data_max - data_min)
EDIT: I have voted for the other answer since that is a cleaner way (in my opinion) to do it, but the principles are the same.
EDIT v2: I saw the comment and I see the difference. I will rewrite my code so it is "cleaner", with fewer extra variables, but still correct using min/max:
data -= data.min(axis=(1,2), keepdims=True)
data /= data.max(axis=(1,2), keepdims=True)
First the minimum value is moved to zero; after that, the maximum value equals the full range (max-min) of the specific image.
After this step, np.array_equal(data, scaled_data) returns True.

You can gather the maximum values with np.ndarray.max across multiple axes: here axis=1 and axis=2 (i.e. on each image individually). Then normalize the initial array with it. To avoid having to broadcast this array of maxima yourself, you can use the keepdims option:
>>> x = np.random.rand(96108,7,7)
>>> x.max(axis=(1,2), keepdims=True).shape
(96108, 1, 1)
While x.max(axis=(1,2)) alone would have returned an array shaped (96108,)...
Such that you can then do:
>>> x /= x.max(axis=(1,2), keepdims=True)
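For the uint16 stack described in the question, in-place division would fail because the result is floating point, so a copy in a float type is needed first. The following is a minimal sketch under that assumption; the function name normalize_per_image is made up for illustration, and constant images (max equal to min) are mapped to 0 to avoid dividing by zero.

```python
import numpy as np

def normalize_per_image(stack):
    """Scale each image of an (N, H, W) stack to [0, 1] using its own min and max."""
    data = stack.astype(np.float32)               # uint16 values are exact in float32
    lo = data.min(axis=(1, 2), keepdims=True)     # shape (N, 1, 1)
    hi = data.max(axis=(1, 2), keepdims=True)     # shape (N, 1, 1)
    span = hi - lo
    out = np.zeros_like(data)
    # Divide only where the image is not constant; constant images stay at 0.
    np.divide(data - lo, span, out=out, where=span > 0)
    return out

counts = np.random.default_rng(0).integers(0, 500, size=(10, 7, 7), dtype=np.uint16)
scaled = normalize_per_image(counts)
print(scaled.min(axis=(1, 2)), scaled.max(axis=(1, 2)))
```

Whether per-image scaling is appropriate depends on the application, since it discards the absolute count level of each image.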

apply filters on images when there is no data pixels

I have an image that contains many no-data pixels. The image is a 2D numpy array and the no-data values are "None". Whenever I try to apply filters to it, it seems that the None values are taken into account by the kernel, which makes my pixels disappear.
For example, I have this image:
I have tried to apply the Lee filter to it with this function (taken from Speckle ( Lee Filter) in Python):
from scipy.ndimage import uniform_filter, variance
def lee_filter(img, size):
    img_mean = uniform_filter(img, (size, size))
    img_sqr_mean = uniform_filter(img**2, (size, size))
    img_variance = img_sqr_mean - img_mean**2
    overall_variance = variance(img)
    img_weights = img_variance / (img_variance + overall_variance)
    img_output = img_mean + img_weights * (img - img_mean)
    return img_output
but the result looks like this:
with the warning:
UserWarning: Warning: converting a masked element to nan. dv =
np.float64(self.norm.vmax) - np.float64(self.norm.vmin)
I have also tried to use the library findpeaks.
from findpeaks import findpeaks
import findpeaks
#lee enhanced filter
image_lee_enhanced = findpeaks.lee_enhanced_filter(img, win_size=3, cu=0.25)
but I get the same blank image.
When I used a median filter from ndimage on the same image, it worked without problems.
My question is: how can I run those filters on the image without letting the None values disturb the results?
Edit: I prefer not to set the no-data pixels to 0 because the pixel values range from -50 to 1 (they are index values). In addition, I'm afraid that changing them to any other value (e.g. 9999) would also influence the filter (am I wrong?).
Edit 2:
I have read Cris Luengo's answer and tried to apply something similar with the scipy.ndimage median filter, as I have realized that its result is distorted as well.
This is the original image:
I have tried masking the Null values:
idx = np.ma.masked_where(img,img!=None)[:,1]
median_filter_img = ndimage.median_filter(img[idx].reshape(491, 473), size=10)
zeros = np.zeros([img.shape[0],img.shape[1]])
zeros[idx] = median_filter_img
The result looks like this (the color is darker to show the problem at the edges):
As can be seen, the edge values seem to be influenced by the None values.
I have also done this with img!=0 but got the same problem.
(Just to add: the pixel values are between 1 and -35.)

If you want to apply a linear smoothing filter, then you can use the Normalized Convolution.
The basic recipe is:
Create a mask image that is 1 for the pixels with data, and 0 for the pixels without data.
Set the pixels without data to any number, for example 0. NaN is not valid because it spreads in the computations.
Apply the linear smoothing filter to the image multiplied by the mask.
Apply the linear smoothing filter to the mask.
Divide the two results.
Basically, we normalize the result of the linear smoothing filter (convolution) by the number of pixels with data within the filter window.
In regions where the smoothed mask is 0 (far away from data), we will divide 0 by 0, so special care needs to be taken there.
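The recipe above can be sketched with NumPy alone. This is only an illustration under stated assumptions: the missing pixels have already been converted to NaN in a float array (rather than None), the smoothing filter is a plain box filter with an odd window size, and the helper names box_sum and normalized_box_filter are made up for this example.

```python
import numpy as np

def box_sum(a, size):
    """Sum over a size x size window centred on each pixel (zero padding, odd size)."""
    r = size // 2
    padded = np.pad(a, r, mode="constant", constant_values=0.0)
    out = np.zeros(a.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def normalized_box_filter(img, size=3):
    """Box-filter mean that ignores NaN pixels, via normalized convolution."""
    img = np.asarray(img, dtype=float)
    mask = np.isfinite(img).astype(float)     # 1 where there is data, 0 elsewhere
    filled = np.where(mask > 0, img, 0.0)     # replace no-data pixels by 0
    num = box_sum(filled * mask, size)        # filtered (image * mask)
    den = box_sum(mask, size)                 # filtered mask
    out = np.full(img.shape, np.nan)
    # Where the window contains no data at all (den == 0) the output stays NaN.
    np.divide(num, den, out=out, where=den > 0)
    return out

image = np.full((5, 5), 2.0)
image[2, 2] = np.nan                          # one missing pixel
print(normalized_box_filter(image, size=3))
```

Because the numerator and denominator use the same kernel, a windowed sum gives the same ratio as a windowed mean. To keep the original no-data pixels empty in the output, the result can be masked again afterwards.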
Note that normalized convolution can be used also for uncertain data, where the mask image gets values in between 0 and 1 indicating the confidence we have in each pixel. Pixels thought to be noisy can be set to a value closer to 0 than the other pixels, for example.
The recipe above is only valid for linear smoothing filters. Normalized convolution can be done with other linear filters, for example derivative filters, but the resulting recipe is different. See for example here the equation for Normalized Convolution to compute the derivative.
For non-linear filters, other approaches are necessary. Non-linear smoothing filters, for example, will often avoid affecting edges, and so will work quite well in images with missing data, if the missing pixels are set to 0, or some value far outside of the data range. The concept of keeping a mask image that indicates which pixels have data and which don't is always a good idea.

It seems like a simple solution is to set the None values to zero. I don't know how you would get around this, because most image processing kernels require some value to be present at every pixel.
a[a == None] = 0  # elementwise comparison on an object array

How to judge if an image is part of another one in Python?

This is how I tried:
(1) use PIL.Image to open the original(say 100*100) and target(say 20*20) image and convert them into np.array;
(2) start from every pixel in the original one as a starting position, crop a 20*20 area and compare every pixel RGB with the target.
(3) If the total difference is under certain given level, then stop and output the specific starting pixel position in the original one.
The problem is that step (3) takes over 10 s, which is far too long, and even step (2) takes over 0.04 s, so I hope to optimize my program. In both steps I used for loops to iterate over the arrays; is there a more efficient way?

To compare two signals (or images) at different displacements one can use cross-correlation.
If you have the scipy package you can use 2D cross-correlation to measure how similar the two images are when you slide one image over the other.
This example is copied from the correlate2d function:
import numpy as np
from scipy import signal
from scipy import misc
# Note: misc.lena() was removed from later SciPy releases; any 2-D grayscale array can stand in for it.
lena = misc.lena() - misc.lena().mean()
template = np.copy(lena[235:295, 310:370]) # right eye
template -= template.mean()
lena = lena + np.random.randn(*lena.shape) * 50 # add noise
corr = signal.correlate2d(lena, template, boundary='symm', mode='same')
y, x = np.unravel_index(np.argmax(corr), corr.shape) # find the match
If you don't want to use a toolbox you could implement the cross-correlation yourself.
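As a minimal sketch of doing the comparison yourself without Python-level loops, every crop can be compared against the target in one vectorized step. Note that this uses the summed absolute difference described in the question rather than cross-correlation. The function name find_patch and the parameter max_total_diff are made up for this illustration, the images are assumed to be RGB arrays of shape (H, W, 3), and sliding_window_view requires NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def find_patch(image, patch, max_total_diff):
    """Return (row, col) of the first window whose summed absolute
    difference to patch is at most max_total_diff, or None if there is none."""
    # All windows of the patch's shape: (H-h+1, W-w+1, h, w, 3)
    windows = sliding_window_view(image, patch.shape)[:, :, 0]
    # Use a signed type so that uint8 differences do not wrap around.
    diff = np.abs(windows.astype(np.int32) - patch.astype(np.int32))
    total = diff.sum(axis=(2, 3, 4))
    rows, cols = np.nonzero(total <= max_total_diff)
    if rows.size == 0:
        return None
    return int(rows[0]), int(cols[0])

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(100, 100, 3), dtype=np.uint8)
target = original[30:50, 60:80].copy()
print(find_patch(original, target, max_total_diff=0))
```

The intermediate difference array holds every window at once, so memory grows with image size times patch size; for large images the FFT-based correlation above is likely the better choice.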

How to create a bidimensional Gaussian filter on a dense list of points

I am doing my best to replicate the algorithm described here in this paper for making an inpainting algorithm. The idea is to get the contour or edge points of the part of the image that needs to be inpainted. In order to find the most linear point in the region, the orthogonal normal vector is found. On page 6, a short description of the implementation is given.
In our implementation the contour
δΩ of the target region is modelled as a dense list of image point
locations. Given a point p ∈ δΩ, the normal direction np
is computed as follows: i) the positions of the
“control” points of δΩ are filtered via a bi-dimensional Gaussian
kernel and, ii) np is estimated as the unit vector orthogonal to
the line through the preceding and the successive points in the
list.
So it appears that I need to put all these points in a gaussian filter. How do I set up a bi-dimensional Gaussian filter when we have a single dimension or a list of points?
Let's say our contour is a box shape, so I create a one-dimensional list of points: [1,1],[1,2],[1,3],[2,1],[2,3],[3,1],[3,2],[3,3]. Do I simply need to make a new 2D matrix, put the points in, leave the middle point at [2,2] empty, and then run a Gaussian filter on it? This doesn't seem very dense though.
I am trying to run this through python libraries.

a dense list of image points
is simply a line.
You are basically applying a gaussian filter to a black and white image where the line is black and background is white, from what I understand. I think by doing that, they approximate the curve model fitting.
Convolve all of the points in the 2D region surrounding the point and then overwrite the point with the result.
This will make any curve on the edge of the target region less sharp, lowering the noise in the calculation of the normal, which would be the vector orthogonal to the two points that surround the current one.
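One possible reading of the quoted passage, offered only as an assumption and not as the paper's confirmed method, is that the x and y coordinates of the contour points are themselves smoothed along the list with a Gaussian kernel, and that the normal at each point is then taken orthogonal to the line through its predecessor and successor. The sketch below assumes the points are ordered along a closed contour (unlike the row-by-row list in the question); the function name contour_normals and the parameter sigma are hypothetical.

```python
import numpy as np

def contour_normals(points, sigma=1.0):
    """Smooth an ordered, closed contour and estimate a unit normal at each point."""
    pts = np.asarray(points, dtype=float)             # shape (N, 2)
    radius = max(1, int(round(3 * sigma)))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    # Circular Gaussian smoothing of both coordinates along the point list.
    smoothed = sum(w * np.roll(pts, -int(k), axis=0) for k, w in zip(offsets, kernel))
    # Line through the preceding and the successive point.
    tangent = np.roll(smoothed, -1, axis=0) - np.roll(smoothed, 1, axis=0)
    # Rotate the tangent by 90 degrees to get the normal, then normalize.
    normal = np.stack([-tangent[:, 1], tangent[:, 0]], axis=1)
    length = np.linalg.norm(normal, axis=1, keepdims=True)
    return smoothed, normal / np.where(length == 0, 1.0, length)

angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
circle = np.stack([np.cos(angles), np.sin(angles)], axis=1)
smoothed, normals = contour_normals(circle, sigma=1.0)
print(normals[:3])
```

Whether the normal points into or out of the region depends on the traversal direction of the contour, so the sign may need to be flipped for a given application.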

Image comparison algorithm

I'm trying to compare images to each other to find out whether they are different. First I tried computing a Pearson correlation of the RGB values, which works quite well unless the pictures are slightly shifted. So if I have two 100% identical images but one is moved a little, I get a bad correlation value.
Any suggestions for a better algorithm?
BTW, I'm talking about comparing thousands of images...
Edit:
Here is an example of my pictures (microscopic):
im1:
im2:
im3:
im1 and im2 are the same but slightly shifted/cropped; im3 should be recognized as completely different...
Edit:
Problem is solved with the suggestions of Peter Hansen! Works very well! Thanks to all answers! Some results can be found here
http://labtools.ipk-gatersleben.de/image%20comparison/image%20comparision.pdf

A similar question was asked a year ago and has numerous responses, including one regarding pixelizing the images, which I was going to suggest as at least a pre-qualification step (as it would exclude very non-similar images quite quickly).
There are also links there to still-earlier questions which have even more references and good answers.
Here's an implementation using some of the ideas with Scipy, using your above three images (saved as im1.jpg, im2.jpg, im3.jpg, respectively). The final output shows im1 compared with itself, as a baseline, and then each image compared with the others.
>>> import scipy as sp
>>> from scipy.misc import imread
>>> from scipy.signal.signaltools import correlate2d as c2d
>>>
>>> def get(i):
...     # get JPG image as Scipy array, RGB (3 layer)
...     data = imread('im%s.jpg' % i)
...     # convert to grey-scale using W3C luminance calc
...     data = sp.inner(data, [299, 587, 114]) / 1000.0
...     # normalize per http://en.wikipedia.org/wiki/Cross-correlation
...     return (data - data.mean()) / data.std()
...
>>> im1 = get(1)
>>> im2 = get(2)
>>> im3 = get(3)
>>> im1.shape
(105, 401)
>>> im2.shape
(109, 373)
>>> im3.shape
(121, 457)
>>> c11 = c2d(im1, im1, mode='same') # baseline
>>> c12 = c2d(im1, im2, mode='same')
>>> c13 = c2d(im1, im3, mode='same')
>>> c23 = c2d(im2, im3, mode='same')
>>> c11.max(), c12.max(), c13.max(), c23.max()
(42105.00000000259, 39898.103896795357, 16482.883608327804, 15873.465425120798)
So note that im1 compared with itself gives a score of 42105, im2 compared with im1 is not far off that, but im3 compared with either of the others gives well under half that value. You'd have to experiment with other images to see how well this might perform and how you might improve it.
Run time is long... several minutes on my machine. I would try some pre-filtering to avoid wasting time comparing very dissimilar images, maybe with the "compare jpg file size" trick mentioned in responses to the other question, or with pixelization. The fact that you have images of different sizes complicates things, but you didn't give enough information about the extent of butchering one might expect, so it's hard to give a specific answer that takes that into account.

I have once done this with an image histogram comparison. My basic algorithm was this:
Split the image into red, green and blue
Create normalized histograms for the red, green and blue channels and concatenate them into a vector (r0...rn, g0...gn, b0...bn), where n is the number of "buckets"; 256 should be enough
Subtract this histogram from the histogram of another image and calculate the distance
Here is some code with numpy and PIL:
r = numpy.asarray(im.convert( "RGB", (1,0,0,0, 1,0,0,0, 1,0,0,0) ))
g = numpy.asarray(im.convert( "RGB", (0,1,0,0, 0,1,0,0, 0,1,0,0) ))
b = numpy.asarray(im.convert( "RGB", (0,0,1,0, 0,0,1,0, 0,0,1,0) ))
hr, h_bins = numpy.histogram(r, bins=256, density=True)
hg, h_bins = numpy.histogram(g, bins=256, density=True)
hb, h_bins = numpy.histogram(b, bins=256, density=True)
hist = numpy.array([hr, hg, hb]).ravel()
if you have two histograms, you can get the distance like this:
diff = hist1 - hist2
distance = numpy.sqrt(numpy.dot(diff, diff))
If the two images are identical, the distance is 0, the more they diverge, the greater the distance.
It worked quite well for photos for me but failed on graphics like texts and logos.

You really need to specify the question better, but, looking at those 5 images, the organisms all seem to be oriented the same way. If this is always the case, you can try doing a normalized cross-correlation between the two images and taking the peak value as your degree of similarity. I don't know of a normalized cross-correlation function in Python, but there is a similar fftconvolve() function and you can do the circular cross-correlation yourself:
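from numpy import asarray, conj
from numpy.fft import rfftn, irfftn
from PIL import Image
a = asarray(Image.open('c603225337.jpg').convert('L'))
b = asarray(Image.open('9b78f22f42.jpg').convert('L'))
f1 = rfftn(a)
f2 = rfftn(b)
g = f1 * conj(f2)  # conjugate one spectrum: correlation rather than convolution
c = irfftn(g, s=a.shape)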
This won't work as written since the images are different sizes, and the output isn't weighted or normalized at all.
The location of the peak value of the output indicates the offset between the two images, and the magnitude of the peak indicates the similarity. There should be a way to weight/normalize it so that you can tell the difference between a good match and a poor match.
This isn't as good of an answer as I want, since I haven't figured out how to normalize it yet, but I'll update it if I figure it out, and it will give you an idea to look into.
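One possible way to normalize the peak, assuming both images have the same shape, is to subtract each image's mean and divide by the product of their norms, so that the peak value lies in [-1, 1] and equals 1 for an exact circular shift. The following NumPy-only sketch illustrates that idea; the function name circular_ncc is made up for this example, and it does not handle images of different sizes or constant images (where the norm is zero).

```python
import numpy as np

def circular_ncc(a, b):
    """Peak of the normalized circular cross-correlation of two equal-shape 2-D arrays.

    Returns (score, (dy, dx)), where rolling b by (dy, dx) best matches a.
    """
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    spectrum = np.fft.rfft2(a) * np.conj(np.fft.rfft2(b))
    corr = np.fft.irfft2(spectrum, s=a.shape) / denom
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return float(corr[dy, dx]), (int(dy), int(dx))

rng = np.random.default_rng(0)
b = rng.random((31, 33))
a = np.roll(b, (5, 9), axis=(0, 1))
print(circular_ncc(a, b))                      # shifted copy: score close to 1
print(circular_ncc(rng.random((31, 33)), b))   # unrelated image: much lower score
```

Because the correlation is circular, content that wraps around the image border contributes to the score; padding or windowing the images first may be needed for real photographs.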

If your problem is about shifted pixels, maybe you should compare against a frequency transform.
The FFT should be OK (numpy has an implementation for 2D matrices), but I'm always hearing that wavelets are better for this kind of task ^_^
About the performance: if all the images are of the same size then, if I remember well, the FFTW package creates a specialised function for each FFT input size, so you can get a nice performance boost by reusing the same code... I don't know if numpy is based on FFTW, but if it's not, maybe you could investigate a little bit there.
Here you have a prototype... you can play a little bit with it to see which threshold fits with your images.
from PIL import Image
import numpy
import sys
def main():
    img1 = Image.open(sys.argv[1])
    img2 = Image.open(sys.argv[2])
    if img1.size != img2.size or img1.getbands() != img2.getbands():
        return -1
    s = 0
    for band_index, band in enumerate(img1.getbands()):
        # img.size is (width, height), while NumPy expects (height, width)
        m1 = numpy.fft.fft2(numpy.array([p[band_index] for p in img1.getdata()]).reshape(img1.size[::-1]))
        m2 = numpy.fft.fft2(numpy.array([p[band_index] for p in img2.getdata()]).reshape(img2.size[::-1]))
        s += numpy.sum(numpy.abs(m1-m2))
    print(s)
if __name__ == "__main__":
    sys.exit(main())
Another way to proceed might be blurring the images, then subtracting the pixel values from the two images. If the difference is non nil, then you can shift one of the images 1 px in each direction and compare again, if the difference is lower than in the previous step, you can repeat shifting in the direction of the gradient and subtracting until the difference is lower than a certain threshold or increases again. That should work if the radius of the blurring kernel is larger than the shift of the images.
Also, you can try some of the tools that are commonly used in the photography workflow for blending multiple exposures or making panoramas, like the Pano Tools.

I took an image processing course long ago, and remember that when matching I normally started by making the image grayscale, and then sharpening the edges of the image so that you only see edges. You (the software) can then shift and subtract the images until the difference is minimal.
If that difference is larger than the threshold you set, the images are not equal and you can move on to the next. Images with a difference below the threshold can then be analyzed further.
I do think that at best you can radically thin out possible matches, but will need to personally compare possible matches to determine they're really equal.
I can't really show code as it was a long time ago, and I used Khoros/Cantata for that course.

First off, correlation is a very CPU-intensive and rather inaccurate measure of similarity. Why not just go for the sum of the squares of the differences between individual pixels?
A simple solution, if the maximum shift is limited: generate all possible shifted images and find the one that is the best match. Make sure you calculate your match variable (i.e. correlation) only over the subset of pixels that can be matched in all shifted images. Also, your maximum shift should be significantly smaller than the size of your images.
If you want to use some more advanced image processing techniques, I suggest you look at SIFT. This is a very powerful method that (theoretically, anyway) can properly match items in images independent of translation, rotation and scale.

I guess you could do something like this:
estimate vertical / horizontal displacement of reference image vs the comparison image. a simple SAD (sum of absolute difference) with motion vectors would do too.
shift the comparison image accordingly
compute the pearson correlation you were trying to do
Shift measurement is not difficult.
Take a region (say about 32x32) in comparison image.
Shift it by x pixels in horizontal and y pixels in vertical direction.
Compute the SAD (sum of absolute difference) w.r.t. original image
Do this for several values of x and y in a small range (-10, +10)
Find the place where the difference is minimum
Pick that value as the shift motion vector
Note:
If the SAD is coming very high for all values of x and y then you can anyway assume that the images are highly dissimilar and shift measurement is not necessary.
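The shift measurement described above can be sketched as an exhaustive SAD search. This is only an illustration under stated assumptions: both images are 2-D grayscale arrays of the same shape, the compared region is the central part of the image rather than a 32x32 block, and the function name estimate_shift_sad and the parameter max_shift are made up for this example.

```python
import numpy as np

def estimate_shift_sad(reference, comparison, max_shift=10):
    """Return (dy, dx, sad) minimizing the sum of absolute differences, such that
    comparison[y + dy, x + dx] best matches reference[y, x]."""
    h, w = reference.shape
    m = max_shift
    # Central region of the reference that stays inside the image for every shift.
    core = reference[m:h - m, m:w - m].astype(np.int64)
    best = None
    for dy in range(-m, m + 1):
        for dx in range(-m, m + 1):
            window = comparison[m + dy:h - m + dy, m + dx:w - m + dx].astype(np.int64)
            sad = int(np.abs(window - core).sum())
            if best is None or sad < best[2]:
                best = (dy, dx, sad)
    return best

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(64, 64))
comparison = np.roll(reference, (3, -4), axis=(0, 1))
print(estimate_shift_sad(reference, comparison))
```

If the smallest SAD is still large for every tested shift, the images can be treated as dissimilar without computing the Pearson correlation at all.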

To get the imports to work correctly on my Ubuntu 16.04 (as of April 2017), I installed python 2.7 and these:
sudo apt-get install python-dev
sudo apt-get install libtiff5-dev libjpeg8-dev zlib1g-dev libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python-tk
sudo apt-get install python-scipy
sudo pip install pillow
Then I changed Snowflake's imports to these:
import scipy as sp
from scipy.ndimage import imread
from scipy.signal.signaltools import correlate2d as c2d
How awesome that Snowflake's script worked for me 8 years later!

I propose a solution based on the Jaccard index of similarity on the image histograms. See: https://en.wikipedia.org/wiki/Jaccard_index#Weighted_Jaccard_similarity_and_distance
You can compute the difference in the distribution of the pixel colors. This is indeed pretty invariant to translations.
from PIL.Image import Image
from typing import List
def jaccard_similarity(im1: Image, im2: Image) -> float:
    """Compute the similarity between two images.
    First, for each image a histogram of the pixel distribution is extracted.
    Then, the similarity between the histograms is compared using the weighted Jaccard index of similarity, defined as:
    Jsimilarity = sum(min(b1_i, b2_i)) / sum(max(b1_i, b2_i))
    where b1_i, and b2_i are the ith histogram bin of images 1 and 2, respectively.
    The two images must have same resolution and number of channels (depth).
    See: https://en.wikipedia.org/wiki/Jaccard_index
    Where it is also called Ruzicka similarity."""
    if im1.size != im2.size:
        raise Exception("Images must have the same size. Found {} and {}".format(im1.size, im2.size))
    n_channels_1 = len(im1.getbands())
    n_channels_2 = len(im2.getbands())
    if n_channels_1 != n_channels_2:
        raise Exception("Images must have the same number of channels. Found {} and {}".format(n_channels_1, n_channels_2))
    assert n_channels_1 == n_channels_2
    sum_mins = 0
    sum_maxs = 0
    hi1 = im1.histogram() # type: List[int]
    hi2 = im2.histogram() # type: List[int]
    # Since the two images have the same amount of channels, they must have the same amount of bins in the histogram.
    assert len(hi1) == len(hi2)
    for b1, b2 in zip(hi1, hi2):
        min_b = min(b1, b2)
        sum_mins += min_b
        max_b = max(b1, b2)
        sum_maxs += max_b
    jaccard_index = sum_mins / sum_maxs
    return jaccard_index
Unlike mean squared error, the Jaccard index always lies in the range [0,1], which allows comparisons among different image sizes.
You can then compare the two images, but only after rescaling them to the same size; otherwise the pixel counts would have to be normalized somehow. I used this:
import sys
from skincare.common.utils import jaccard_similarity
import PIL.Image
from PIL.Image import Image
file1 = sys.argv[1]
file2 = sys.argv[2]
im1 = PIL.Image.open(file1) # type: Image
im2 = PIL.Image.open(file2) # type: Image
print("Image 1: mode={}, size={}".format(im1.mode, im1.size))
print("Image 2: mode={}, size={}".format(im2.mode, im2.size))
if im1.size != im2.size:
    print("Resizing image 2 to {}".format(im1.size))
    im2 = im2.resize(im1.size, resample=PIL.Image.BILINEAR)
j = jaccard_similarity(im1, im2)
print("Jaccard similarity index = {}".format(j))
Testing on your images:
$ python CompareTwoImages.py im1.jpg im2.jpg
Image 1: mode=RGB, size=(401, 105)
Image 2: mode=RGB, size=(373, 109)
Resizing image 2 to (401, 105)
Jaccard similarity index = 0.7238955686269157
$ python CompareTwoImages.py im1.jpg im3.jpg
Image 1: mode=RGB, size=(401, 105)
Image 2: mode=RGB, size=(457, 121)
Resizing image 2 to (401, 105)
Jaccard similarity index = 0.22785529941822316
$ python CompareTwoImages.py im2.jpg im3.jpg
Image 1: mode=RGB, size=(373, 109)
Image 2: mode=RGB, size=(457, 121)
Resizing image 2 to (373, 109)
Jaccard similarity index = 0.29066426814105445
You might also consider experimenting with different resampling filters (like NEAREST or LANCZOS), as they, of course, alter the color distribution when resizing.
Additionally, consider that swapping the images changes the results, as the second image might be downsampled instead of upsampled. (After all, cropping might suit your case better than rescaling.)
