Measuring similarity between two RGB images in Python
I have two RGB images of the same size, and I would like to compute a similarity metric. I thought of starting out with Euclidean distance:
import scipy.spatial.distance as dist
import cv2
im1 = cv2.imread("im1.jpg")
im2 = cv2.imread("im2.jpg")
>> im1.shape
(820, 740, 3)
>> dist.euclidean(im1,im2)
ValueError: Input vector should be 1-D.
I know that dist.euclidean expects a 1-D array and im1 and im2 are 3-D, but is there a function that will work with 3-D arrays, or is it possible to transform im1 and im2 into a 1-D array that preserves the information in the images?
Grayscale Solution (?)
(There is a discussion below as to your comment about a function "that preserves the information in the images")
It seems possible to me that you might be able to solve the problem using a grayscale image rather than an RGB image. I know I'm making assumptions here, but it's a thought.
I'm going to try a simple example relating to your code, then give an example of an image similarity measure using 2D Discrete Fourier Transforms that uses a conversion to grayscale. That DFT analysis will have its own section
(My apologies if you see this while in progress. I'm just trying to make sure my work is saved.)
Because of my assumption, I'm going to try your method with some RGB images, then see if the problem will be solved by converting to grayscale. If the problem is solved with grayscale, we can do an analysis of the amount of information loss brought on by the grayscale solution by finding the image similarity using a combination of all three channels, each compared separately.
Method
Making sure I have all the libraries/packages/whatever you want to call them.
> python -m pip install opencv-python
> python -m pip install scipy
> python -m pip install numpy
Note that, in this trial, I'm using some PNG images that were created in the attempt (described below) to use a 2D DFT.
Making sure I get the same problem
>>> import scipy.spatial.distance as dist
>>> import cv2
>>>
>>> im1 = cv2.imread("rhino1_clean.png")
>>> im2 = cv2.imread("rhino1_streak.png")
>>>
>>> im1.shape
(178, 284, 3)
>>>
>>> dist.euclidean(im1, im2)
## Some traceback stuff ##
ValueError: Input vector should be 1-D.
Now, let's try using grayscale. If this works, we can simply find the distance for each of the RGB Channels. I hope it works, because I want to do the information-loss analysis.
Let's convert to grayscale:
>>> im1_gray = cv2.cvtColor(im1, cv2.COLOR_BGR2GRAY)
>>> im2_gray = cv2.cvtColor(im2, cv2.COLOR_BGR2GRAY)
>>> im1_gray.shape
(178, 284)
A simple dist.euclidean(im1_gray, im2_gray) will lead to the same ValueError: Input vector should be 1-D. exception, but I know the structure of a grayscale image array (an array of pixel rows), so I compute the distance row by row and average the results.
>>> dists = []
>>> for i in range(0, len(im1_gray)):
...     dists.append(dist.euclidean(im1_gray[i], im2_gray[i]))
...
>>> sum_dists = sum(dists)
>>> ave_dist = sum_dists/len(dists)
>>> ave_dist
2185.9891304058297
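The row-by-row loop above can also be written without a Python loop. This is only a minimal sketch, assuming two 8-bit grayscale arrays of the same shape (synthetic stand-ins here, not the rhino images). The cast to float is my own addition: NumPy's unsigned 8-bit subtraction wraps around, which can inflate a distance computed directly on uint8 pixel data.

```python
import numpy as np

# Synthetic stand-ins for two grayscale images of identical shape (assumption).
rng = np.random.default_rng(0)
gray1 = rng.integers(0, 256, size=(178, 284), dtype=np.uint8)
gray2 = gray1.copy()
gray2[50:60, :] = 0  # simulate a dark streak across ten rows

# Cast to float first: uint8 array arithmetic wraps around (3 - 5 gives 254).
diff = gray1.astype(np.float64) - gray2.astype(np.float64)

# Euclidean distance of each pixel row, then the mean over rows.
row_dists = np.sqrt((diff ** 2).sum(axis=1))
mean_row_dist = row_dists.mean()

# Euclidean distance over the whole image at once.
total_dist = np.sqrt((diff ** 2).sum())

print(mean_row_dist, total_dist)
```

Note that the mean of per-row distances and the whole-image distance are different numbers; either can serve as a score as long as the same one is used for every image pair.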
By the way, here are the two original images:
Grayscale worked (with massaging), let's try color
Following some procedure from this SO answer, let's do the following.
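One way to extend this to color is to compute one distance per channel. The sketch below is an assumption on my part and not necessarily the procedure from the linked answer; it uses synthetic arrays in OpenCV's BGR channel order instead of real image files.

```python
import numpy as np

# Synthetic stand-ins for two BGR images as returned by cv2.imread (assumption).
rng = np.random.default_rng(1)
im1 = rng.integers(0, 256, size=(178, 284, 3), dtype=np.uint8)
im2 = im1.copy()
im2[50:60, :, 2] = 0  # alter only the red channel (index 2 in BGR order)

# Float cast avoids uint8 wraparound in the subtraction.
diff = im1.astype(np.float64) - im2.astype(np.float64)

# One Euclidean distance per channel: sum over rows and columns only.
channel_dists = np.sqrt((diff ** 2).sum(axis=(0, 1)))

# The overall distance combines the three channel distances.
overall = np.sqrt((channel_dists ** 2).sum())

for name, d in zip("BGR", channel_dists):
    print(name, d)
print("overall", overall)
```

Because only the red channel was changed in this toy data, the blue and green distances are zero and the overall distance equals the red one.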
Preservation of Information
Following the analysis here (archived), let's look at our information loss. (Note that this will be a very naïve analysis, but I want to take a crack at it.)
Grayscale vs. Color Information
Let's just look at the color vs. the grayscale. Later, we can look at whether we preserve the information about the distances.
Comparisons of different distance measures using grayscale vs. all three channels - comparison using ratio of distance sums for a set of images.
I don't know how to do entropy measurements for the distances, but my intuition tells me that, if I calculate distances using grayscale and using color channels, I should come up with similar ratios of distances IF I haven't lost any information.
My first thought when seeing this question was to use a 2-D Discrete Fourier Transform, which I'm sure is available in Python or NumPy or OpenCV. Basically, your first components of the DFT will relate to large shapes in your image. (Here is where I'll put in a relevant research paper: link. I didn't look too closely - anyone is welcome to suggest another.)
So, let me look up a 2-D DFT easily available from Python, and I'll get back to putting up some working code.
First, you'll need to make sure you have Pillow (the maintained fork of PIL) and NumPy. It seems you have NumPy, but here are some instructions. (Note that I'm on Windows at the moment) ...
> python -m pip install opencv-python
> python -m pip install numpy
> python -m pip install pillow
Now, here are 5 images -
a rhino image, rhino1_clean.jpg (source);
the same image with some black streaks drawn on by me in MS Paint, rhino1_streak.jpg;
another rhino image, rhino2_clean.jpg (source);
a first hippo image hippo1_clean.jpg (source);
a second hippo image, hippo2_clean.jpg (source).
All images used with fair use.
Okay, now, to illustrate further, let's go to the Python interactive terminal.
>python
>>> import PIL.Image
>>> import numpy as np
First of all, life will be easier if we use grayscale PNG images: PNG because it is a lossless format (unlike JPEG, it adds no compression artifacts), grayscale because I don't have to show all the details with the channels.
>>> rh_img_1_cln = PIL.Image.open("rhino1_clean.jpg")
>>> rh_img_1_cln.save("rhino1_clean.png")
>>> rh_img_1_cln_gs = PIL.Image.open("rhino1_clean.png").convert('L')  # 'L' is single-channel grayscale; 'LA' would add an alpha channel
>>> rh_img_1_cln_gs.save("rhino1_clean_gs.png")
Follow similar steps for the other four images. I used PIL variable names, rh_img_1_stk, rh_img_2_cln, hp_img_1_cln, hp_img_2_cln. I ended up with the following image filenames for the grayscale images, which I'll use further: rhino1_streak_gs.png, rhino2_clean_gs.png, hippo1_clean_gs.png, hippo2_clean_gs.png.
Now, let's get the coefficients for the DFTs. The following code (ref. this SO answer) would be used for the first, clean rhino image.
Let's "look" at the image array first. This will show us a grid of values from the top-left corner of the image, with higher values being closer to white and lower values closer to black.
Note that, before I begin outputting this array, I set things to the numpy default, cf. https://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html
>>> np.set_printoptions(edgeitems=3,infstr='inf',
... linewidth=75, nanstr='nan', precision=8,
... suppress=False, threshold=1000, formatter=None)
>>> rh1_cln_gs_array = np.array(rh_img_1_cln_gs)
>>> for i in range(5):
...     print(rh1_cln_gs_array[i][:13])
...
[93 89 78 87 68 74 58 51 73 96 90 75 86]
[85 93 64 64 76 49 19 52 65 76 86 81 76]
[107 87 71 62 54 31 32 49 51 55 81 87 69]
[112 93 94 72 57 45 58 48 39 49 76 86 76]
[ 87 103 90 65 88 61 44 57 34 55 70 80 92]
Now, let's run the DFT and look at the results. I change my numpy print options to make things nicer before I start the actual transform.
>>> np.set_printoptions(formatter={'all':lambda x: '{0:.2f}'.format(x)})
>>>
>>> rh1_cln_gs_fft = np.fft.fft2(rh_img_1_cln_gs)
>>> rh1_cln_gs_scaled_fft = 255.0 * rh1_cln_gs_fft / rh1_cln_gs_fft.max()
>>> rh1_cln_gs_real_fft = np.absolute(rh1_cln_gs_scaled_fft)
>>> for i in range(5):
...     print(rh1_cln_gs_real_fft[i][:13])
...
[255.00 1.46 7.55 4.23 4.53 0.67 2.14 2.30 1.68 0.77 1.14 0.28 0.19]
[38.85 5.33 3.07 1.20 0.71 5.85 2.44 3.04 1.18 1.68 1.69 0.88 1.30]
[29.63 3.95 1.89 1.41 3.65 2.97 1.46 2.92 1.91 3.03 0.88 0.23 0.86]
[21.28 2.17 2.27 3.43 2.49 2.21 1.90 2.33 0.65 2.15 0.72 0.62 1.13]
[18.36 2.91 1.98 1.19 1.20 0.54 0.68 0.71 1.25 1.48 1.04 1.58 1.01]
Now, the result of following the same procedure with rhino1_streak.jpg:
[255.00 3.14 7.69 4.72 4.34 0.68 2.22 2.24 1.84 0.88 1.14 0.55 0.25]
[40.39 4.69 3.17 1.52 0.77 6.15 2.83 3.00 1.40 1.57 1.80 0.99 1.26]
[30.15 3.91 1.75 0.91 3.90 2.99 1.39 2.63 1.80 3.14 0.77 0.33 0.78]
[21.61 2.33 2.64 2.86 2.64 2.34 2.25 1.87 0.91 2.21 0.59 0.75 1.17]
[18.65 3.34 1.72 1.76 1.44 0.91 1.00 0.56 1.52 1.60 1.05 1.74 0.66]
I'll print the differences (Δ) between the two sets of coefficients instead of computing a full distance. If you want a single distance, sum the squares of these differences and take the square root (the Euclidean distance).
>>> for i in range(5):
...     print(rh1_cln_gs_real_fft[i][:13] - rh1_stk_gs_real_fft[i][:13])
...
[0.00 -1.68 -0.15 -0.49 0.19 -0.01 -0.08 0.06 -0.16 -0.11 -0.01 -0.27 -0.06]
[-1.54 0.64 -0.11 -0.32 -0.06 -0.30 -0.39 0.05 -0.22 0.11 -0.11 -0.11 0.04]
[-0.53 0.04 0.14 0.50 -0.24 -0.02 0.07 0.30 0.12 -0.11 0.11 -0.10 0.08]
[-0.33 -0.16 -0.37 0.57 -0.15 -0.14 -0.36 0.46 -0.26 -0.07 0.13 -0.14 -0.04]
[-0.29 -0.43 0.26 -0.58 -0.24 -0.37 -0.32 0.15 -0.27 -0.12 -0.01 -0.17 0.35]
I'll be putting just three coefficient arrays truncated to a length of five to show how this works for showing image similarity. Honestly, this is an experiment for me, so we'll see how it goes.
You can work on comparing those coefficients with distances or other metrics.
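As one possible way to turn those coefficients into a single number, here is a minimal sketch. The helper name dft_distance, the block size k, and the normalization by the DC term are all my own illustrative choices, and the inputs are synthetic arrays in place of the rhino images.

```python
import numpy as np

def dft_distance(gray1, gray2, k=5):
    """Euclidean distance between the k x k lowest-frequency DFT magnitudes.

    Hypothetical helper: k and the DC normalization are illustrative choices.
    """
    mags = []
    for g in (gray1, gray2):
        f = np.abs(np.fft.fft2(np.asarray(g, dtype=np.float64)))
        f = f / f[0, 0]          # scale so the DC component equals 1
        mags.append(f[:k, :k])   # top-left block: lowest non-negative frequencies
    return float(np.linalg.norm(mags[0] - mags[1]))

# Synthetic grayscale stand-ins (assumption): a base image, a streaked copy,
# and an unrelated image.
rng = np.random.default_rng(2)
clean = rng.integers(1, 256, size=(64, 64)).astype(np.float64)
streak = clean.copy()
streak[20:24, :] = 0.0
other = rng.integers(1, 256, size=(64, 64)).astype(np.float64)

print(dft_distance(clean, clean))    # identical inputs
print(dft_distance(clean, streak))
print(dft_distance(clean, other))
```

Keeping only a small low-frequency block makes the score depend on large shapes more than on fine detail; how large k should be for real photographs is something to determine by experiment.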
More About Preservation of Information
Let's do an information-theoretical analysis of information loss with the methods proposed above.
Following the analysis here (archived), let's look at our information loss.
Good luck!
You can try
import scipy.spatial.distance as dist
import cv2
import numpy as np
im1 = cv2.imread("im1.jpg")
im2 = cv2.imread("im2.jpg")
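# Cast to float so that subtracting uint8 pixel values cannot wrap around.
dist.euclidean(im1.flatten().astype(float), im2.flatten().astype(float))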
You can use the reshape function for both images to convert them from 3D to 1D.
import scipy.spatial.distance as dist
import cv2
im1 = cv2.imread("im1.jpg")
im2 = cv2.imread("im2.jpg")
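im1 = im1.reshape(-1)  # reshape returns a new array, so assign the result
im2 = im2.reshape(-1)  # -1 infers the length (820 * 740 * 3 = 1820400)
dist.euclidean(im1, im2)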
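To check that flattening keeps all the pixel information for this purpose, the following sketch compares the flattened distance with one computed directly on the 3-D arrays. It uses small synthetic arrays and np.linalg.norm in place of dist.euclidean, both of which are my own substitutions.

```python
import numpy as np

# Small synthetic stand-ins for two color images (assumption).
rng = np.random.default_rng(3)
im1 = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
im2 = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# reshape returns a new array object; the original keeps its 3-D shape.
flat_view = im1.reshape(-1)
print(im1.shape, flat_view.shape)

# Flatten (with a float cast) and take the norm of the difference.
v1 = im1.reshape(-1).astype(np.float64)
v2 = im2.reshape(-1).astype(np.float64)
d_flat = np.linalg.norm(v1 - v2)

# Same number without flattening: sum squared differences over all axes.
d_direct = np.sqrt(((im1.astype(np.float64) - im2.astype(np.float64)) ** 2).sum())
print(d_flat, d_direct)
```

The Euclidean distance only depends on pairing up corresponding elements, so any consistent flattening order gives the same value.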
Related
Obtaining 2 or more coefficients from defined equation using regression methods
I'm looking to run this code that enables to solve for the x number of unknowns (c_10, c_01, c_11 etc.) just from plotting the graph.
Some background on the equation: Mooney-Rivlin model (1940) with
P1 = c_10[(2*λ+λ**2)-3]+c_01[(λ**-2+2*λ)-3]
P1 (or known as P) and lambda are data pre-defined in numerical terms in the table below (sheet ExperimentData of experimental_data1.xlsx):
λ     P
1.00  0.00
1.01  0.03
1.12  0.14
1.24  0.23
1.39  0.32
1.61  0.41
1.89  0.50
2.17  0.58
2.42  0.67
3.01  0.85
3.58  1.04
4.03  1.21
4.76  1.58
5.36  1.94
5.76  2.29
6.16  2.67
6.40  3.02
6.62  3.39
6.87  3.75
7.05  4.12
7.16  4.47
7.27  4.85
7.43  5.21
7.50  5.57
7.61  6.30
I have tried obtaining coefficients using Linear regression. However, to my knowledge, random forest is not able to obtain multiple coefficients using reg.coef_
Tried SVR with reg.dual_coef_ However keeps obtaining error
ValueError: not enough values to unpack (expected 2, got 1)
Code below:
data = pd.read_excel('experimental_data.xlsx', sheet_name='ExperimentData')
X_s = [[(2*λ+λ**2)-3, (λ**-2+2*λ)-3] for λ in data['λ']]
y_s = data['P']
svr = SVR()
svr.fit(X_s, y_s)
c_01, c_10 = svr.dual_coef_
And for future proofing this method, if lets say there are more than 2 coefficients, are there other methods apart from Linear Regression? For example, referring to Ishihara model (1951) where
P1 = {2*c_10 + 4*c_20*c_01[(2*λ**-1+λ**2) - 3]*[(λ**-2 + 2*λ) - 3] + c_20 * c_01 * (λ**-1) * [(2*λ**-1 + λ**2) - 3]**2}*{λ - λ**-2}
Any comments is greatly appreciated!
Getting meaningful results from pandas.describe()
I called describe on one column of a dataframe and ended up with the following output,
count    1.048575e+06
mean     8.232821e+01
std      2.859016e+02
min      0.000000e+00
25%      3.000000e+00
50%      1.400000e+01
75%      6.000000e+01
max      8.599700e+04
What parameter do I pass to get meaningful integer values. What I mean is when I check the SQL count its about 43 million. All the other values are also different. Can someone help me understand what this conversion means and how do I get float rounded to 2 decimal places. I'm new to Pandas.
You can directly use round() and pass the number of decimals you want as argument
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# setting the seed to create the dataframe
np.random.seed(25)
# Creating a 5 * 4 dataframe
df = pd.DataFrame(np.random.random([5, 4]), columns =["A", "B", "C", "D"])
# rounding describe
df.describe().round(2)
          A     B     C     D
count  5.00  5.00  5.00  5.00
mean   0.52  0.47  0.38  0.42
std    0.21  0.23  0.19  0.29
min    0.33  0.12  0.16  0.11
25%    0.41  0.37  0.28  0.19
50%    0.45  0.58  0.37  0.44
75%    0.56  0.59  0.40  0.52
max    0.87  0.70  0.68  0.84
DOCS
There are two ways to control the output of pandas: either set a global display option, or format the result of describe() with apply.
pd.set_option('display.float_format', lambda x: '%.5f' % x)
df['X'].describe().apply("{0:.5f}".format)
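The values in the question are printed in scientific notation (1.048575e+06 is 1048575), so this is a display issue and the numbers themselves are unchanged. A minimal sketch of both suggestions, using a made-up five-value Series named X in place of the asker's column:

```python
import pandas as pd

# Made-up stand-in for the asker's column (assumption).
s = pd.Series([0, 3, 14, 60, 85997], name="X")
summary = s.describe()

# Option 1: round the floats; the result is still numeric.
rounded = summary.round(2)

# Option 2: format each value as a string with two decimals.
formatted = summary.apply("{:.2f}".format)

print(rounded)
print(formatted)

# The count is a whole number, so it can be read back as an integer.
print(int(summary["count"]))
```

Rounding keeps the summary usable in further arithmetic, while the formatted version is only suitable for display.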
Python numpy shannon entropy array
I have an Numpy array:
A = [ 1.56 1.47 1.31 1.16 1.11 1.14 1.06 1.12 1.19 1.06 0.92 0.78 0.6 0.59 0.4 0.03 0.11 0.54 1.17 1.9 2.6 3.28 3.8 4.28 4.71 4.61 4.6 4.41 3.88 3.46 3.04 2.63 2.3 1.75 1.24 1.14 0.97 0.92 0.94 1. 1.15 1.33 1.37 1.48 1.53 1.45 1.32 1.08 1.06 0.98 0.69]
How can I obtain the shannon entropy? I have seen it like this but not sure:
print -np.sum(A * np.log2(A), axis=1)
There are essentially two cases and it is not clear from your sample which one applies here.
(1) Your probability distribution is discrete. Then you have to translate what appear to be relative frequencies to probabilities
pA = A / A.sum()
Shannon2 = -np.sum(pA*np.log2(pA))
(2) Your probability distribution is continuous. In that case the values in your input needn't sum to one. Assuming that the input is sampled regularly from the entire space, you'd get
pA = A / A.sum()
Shannon2 = -np.sum(pA*np.log2(A))
but in this case the formula really depends on the details of sampling and the underlying space.
Side note: the axis=1 in your example will cause an error since your input is flat. Omit it.
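For the discrete case (1), a minimal sketch could look like the following. The function name shannon_entropy is hypothetical, and dropping zero-probability entries is my own addition so that the logarithm is never taken of zero.

```python
import numpy as np

def shannon_entropy(values):
    """Entropy in bits of non-negative weights, normalized to sum to 1."""
    p = np.asarray(values, dtype=np.float64)
    p = p / p.sum()
    p = p[p > 0]  # terms with p == 0 contribute nothing to the sum
    return float(np.sum(p * np.log2(1.0 / p)))

# Four equally likely outcomes carry log2(4) bits.
print(shannon_entropy([1, 1, 1, 1]))

# A single certain outcome carries no information.
print(shannon_entropy([1, 0, 0, 0]))
```

Writing the sum as p * log2(1 / p) is equivalent to -p * log2(p) and avoids a negative zero in the second example.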
numpy and sklearn PCA return different covariance vector
Trying to learn PCA through and through but interestingly enough when I use numpy and sklearn I get different covariance matrix results. The numpy results match this explanatory text here but the sklearn results different from both. Is there any reason why this is so?
d = pd.read_csv("example.txt", header=None, sep = " ")
print(d)
      0     1
0  0.69  0.49
1 -1.31 -1.21
2  0.39  0.99
3  0.09  0.29
4  1.29  1.09
5  0.49  0.79
6  0.19 -0.31
7 -0.81 -0.81
8 -0.31 -0.31
9 -0.71 -1.01
Numpy Results
print(np.cov(d, rowvar = 0))
[[ 0.61655556 0.61544444]
 [ 0.61544444 0.71655556]]
sklearn Results
from sklearn.decomposition import PCA
clf = PCA()
clf.fit(d.values)
print(clf.get_covariance())
[[ 0.5549 0.5539]
 [ 0.5539 0.6449]]
Because for np.cov, Default normalization is by (N - 1), where N is the number of observations given (unbiased estimate). If bias is 1, then normalization is by N.
Set bias=1, the result is the same as PCA:
In [9]: np.cov(df, rowvar=0, bias=1)
Out[9]:
array([[ 0.5549, 0.5539],
       [ 0.5539, 0.6449]])
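The normalization difference can be checked with NumPy alone. The sketch below retypes the ten rows from the question into an array (in place of reading example.txt) and compares the two settings of bias:

```python
import numpy as np

# The ten observations from the question, typed in directly.
d = np.array([
    [0.69, 0.49], [-1.31, -1.21], [0.39, 0.99], [0.09, 0.29], [1.29, 1.09],
    [0.49, 0.79], [0.19, -0.31], [-0.81, -0.81], [-0.31, -0.31], [-0.71, -1.01],
])
n = d.shape[0]

cov_unbiased = np.cov(d, rowvar=False)           # divides by N - 1
cov_biased = np.cov(d, rowvar=False, bias=True)  # divides by N

print(cov_unbiased)
print(cov_biased)

# The two estimates differ only by the factor (N - 1) / N.
print(np.allclose(cov_biased, cov_unbiased * (n - 1) / n))
```

With N = 10 the factor is 0.9, which is the ratio between the two matrices shown in the question.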
So I've encountered the same issue, and I think that it returns different values because the covariance is calculated in a different way. According to the sklearn documentation, the get_covariance() method uses the noise variances to obtain the covariance matrix.
Hierarchical clustering from confusion matrix with python
Based on the following answer, I tried to code hierarchical class clustering based on a confusion matrix. A confusion matrix is used to evaluate the results of a classification problem and isn't symmetric. Each row represents the instances in an actual class. Here is an example of a confusion matrix where you can read that 25% of the samples of the 'zero' class are predicted as class 'six'.
I tried to modify the code with:
conf_mat = 1 - conf_mat # 1.0 means dissimilarity
sch.linkage(conf_mat, method='warp')
But I got wrong results. How should I organize my data in order to apply the clustering? The following should give me the rearranged order of cluster, right?
ind = sch.fcluster(Y, 0, 'distance')
I'm not sure I understand WHY you are doing this, but, based on the comment which you posted above, it seems that you'd like to cluster 10 objects ('zero', 'one', ..., 'nine') by comparing their values in your confusion matrix, generated by some other algorithm.
"I would like the clusters to maximize the classification results: if one class is mainly recognize as another one then both classes should be fused. ..."
So, looking at your data, object 'eight' and object 'nine' might be in the same cluster because they both have mostly low values and one relatively high value for the 'eight' column.
To do this, you can treat each of the 10 objects as having 10 arbitrary properties; then this is a standard setup. Perhaps Euclidean distance is appropriate to determine the distance between objects; you would know best.
It sounds like you'd like to do some hierarchical clustering, which you can do with scipy.cluster.hierarchy; example below.
Example
I didn't want to type up your data by hand, so I just randomly generated a matrix. To avoid confusion I'm calling the objects 'zero' ... 'nine' (spelled out) and I'm using numerals '0' through '9' as the object's properties.
        0     1     2     3     4     5     6     7     8     9
zero  0.37  0.27  0.23  0.92  0.86  0.62  0.08  0.95  0.35  0.69
one   0.24  0.23  0.70  0.39  0.52  0.03  0.14  0.00  0.53  0.10
two   0.78  0.12  0.85  0.79  0.32  0.90  0.78  0.07  0.07  0.62
...
nine  0.15  0.39  0.27  0.93  0.12  0.14  0.34  0.11  0.72  0.52
So this is my "confusion matrix".
Hierarchical clustering with SciPy. I'm using Euclidean distance, and the single-link agglomerative method.
from scipy.cluster import hierarchy
Y = hierarchy.distance.pdist(data.to_numpy(), metric='euclidean')
Z = hierarchy.linkage(Y, method='single')
ax = hierarchy.dendrogram(Z, show_contracted=True, labels=data.index.tolist())
[I put my matrix in a dataframe so I could add labels to columns and indices. That's why I'm using pandas commands data.to_numpy() (data.as_matrix() in pandas versions before 1.0, where it was removed) to get the raw data, and data.index.tolist() to set the labels.]
This gives:
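A self-contained variant of the steps above, sketched with a random 10 x 10 matrix and no plotting. The seed, the cut into at most three clusters, and the use of leaves_list to read off the leaf order are my own illustrative choices and are not taken from the answer.

```python
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import pdist

# Random stand-in for the 10 x 10 "confusion matrix" (assumption).
rng = np.random.default_rng(4)
data = rng.random((10, 10))
labels = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]

Y = pdist(data, metric="euclidean")        # condensed pairwise distances
Z = hierarchy.linkage(Y, method="single")  # single-link agglomeration

# Cut the tree into at most 3 flat clusters (3 is an arbitrary choice).
cluster_ids = hierarchy.fcluster(Z, t=3, criterion="maxclust")

# Leaf order of the dendrogram, computed without plotting.
order = hierarchy.leaves_list(Z)
print([labels[i] for i in order])
print(dict(zip(labels, cluster_ids.tolist())))
```

Note that fcluster takes the linkage matrix Z, not the condensed distances Y, and that the leaf order gives one way to rearrange the rows and columns of the original matrix so that similar classes sit next to each other.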