I want to get the SSIM when comparing two images in python

I'm trying to use the "compare_ssim" function. I currently have two 2xN matrices of x,y coordinates, one for each of the two images, where the first row holds all the x coordinates and the second row holds all the y coordinates. How can I calculate the SSIM for these two images (if there is a way to do so)?
For example I have:
X = np.array([[1,2,3], [4,5,6]])
Y = np.array([[3,4,5],[5,6,7]])
compare_ssim(X,Y)
But I am getting the error
ValueError: win_size exceeds image extent. If the input is a multichannel (color) image, set multichannel=True.
I'm not sure whether I am missing a parameter or whether I should convert the matrices so that this function works. Is there a way I am supposed to convert my coordinates to a grayscale matrix? I'm a bit confused about what the matrices passed to the function should look like. I know that they are supposed to be ndarrays, but type(X) and type(Y) are both numpy.ndarray.

Since you haven't mentioned which library you are using, I am assuming that you are using skimage's compare_ssim.
The error in question is due to the shape of your inputs.
TL;DR: compare_ssim expects images of shape (H, W) or (H, W, C), and every spatial side must be at least as large as the sliding window (win_size, 7 by default). Your inputs have shape (2, 3), which is smaller than that window, and the function cannot tell whether one of the dimensions is meant to be the channel dimension. When multichannel=True, the last dimension is treated as the channel dimension.
There are 2 key problems with your code:
compare_ssim expects images as input, so your X and Y matrices should have dimensions (H, W, C) and not (2, 3).
They should be of a float data type.
Below I have shown a bit of demo code (note: since skimage v0.16, compare_ssim has been moved to skimage.metrics.structural_similarity)
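import numpy as np
from skimage.metrics import structural_similarity
# Two unrelated random float images with values in [0, 255)
img1 = np.random.randint(0, 255, size=(200, 200, 3)).astype(np.float32)
img2 = np.random.randint(0, 255, size=(200, 200, 3)).astype(np.float32)
# skimage >= 0.19 uses channel_axis=-1 in place of multichannel=True; float images should be given an explicit data_range
ssim_score = structural_similarity(img1, img2, channel_axis=-1, data_range=255) # score: close to 0 for unrelated noise
ssim_score = structural_similarity(img1, img1, channel_axis=-1, data_range=255) # score: 1.0 for identical images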
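To address the coordinate part of the question, one possible approach is to rasterize each 2xN coordinate matrix onto a blank canvas first. This is only a sketch: the helper name rasterize_points and the 16x16 canvas size are assumptions made for illustration (they are not part of skimage), and whether SSIM on such point images is meaningful depends on your data.

```python
import numpy as np

def rasterize_points(coords, height, width):
    """Turn a 2xN array of (x, y) coordinates into a binary grayscale image.

    Hypothetical helper: row 0 of coords holds x values, row 1 holds y values.
    """
    xs = np.round(coords[0]).astype(int)
    ys = np.round(coords[1]).astype(int)
    # Keep only the points that fall inside the canvas.
    inside = (xs >= 0) & (xs < width) & (ys >= 0) & (ys < height)
    canvas = np.zeros((height, width), dtype=np.float32)
    canvas[ys[inside], xs[inside]] = 1.0  # rows are y, columns are x
    return canvas

X = np.array([[1, 2, 3], [4, 5, 6]])
Y = np.array([[3, 4, 5], [5, 6, 7]])
img_x = rasterize_points(X, height=16, width=16)  # canvas size is an assumption
img_y = rasterize_points(Y, height=16, width=16)
print(img_x.shape, img_y.shape)
```

Both canvases are larger than the default 7x7 window, so they could then be passed to structural_similarity(img_x, img_y, data_range=1.0).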

Related

OpenCV can't resize() a numpy array created from a pygame.PixelArray, error: src data type = 8 is not supported [duplicate]

I would like to take an image and change the scale of the image, while it is a numpy array.
For example I have this image of a coca-cola bottle:
bottle-1
Which translates to a numpy array of shape (528, 203, 3) and I want to resize that to say the size of this second image:
bottle-2
Which has a shape of (140, 54, 3).
How do I change the size of the image to a certain shape while still maintaining the original image? Other answers suggest stripping every other or third row out, but what I want to do is basically shrink the image how you would via an image editor but in python code. Are there any libraries to do this in numpy/SciPy?

Yeah, you can install opencv (this is a library used for image processing, and computer vision), and use the cv2.resize function. And for instance use:
import cv2
import numpy as np
img = cv2.imread('your_image.jpg')
res = cv2.resize(img, dsize=(54, 140), interpolation=cv2.INTER_CUBIC)
Here img is a numpy array containing the original image, whereas res is a numpy array containing the resized image. Note that dsize is given as (width, height), so (54, 140) yields an array of shape (140, 54, 3). An important aspect is the interpolation parameter: there are several ways to resize an image, and the choice matters here because you are scaling the image down and the size of the original image is not a multiple of the size of the resized image. Possible interpolation schemes are:
INTER_NEAREST - a nearest-neighbor interpolation
INTER_LINEAR - a bilinear interpolation (used by default)
INTER_AREA - resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC - a bicubic interpolation over 4x4 pixel neighborhood
INTER_LANCZOS4 - a Lanczos interpolation over 8x8 pixel neighborhood
Like with most options, there is no "best" option: for every resize scheme, there are scenarios where one strategy can be preferred over another.

While it might be possible to use numpy alone to do this, the operation is not built-in. That said, you can use scikit-image (which is built on numpy) to do this kind of image manipulation.
Scikit-Image rescaling documentation is here.
For example, you could do the following with your image:
from skimage.transform import resize
bottle_resized = resize(bottle, (140, 54))
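This will take care of things like interpolation, anti-aliasing, etc. for you. Note that resize returns a floating-point array (scaled to the range [0, 1] for integer inputs) unless you pass preserve_range=True.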

One-line numpy solution for downsampling (by 2):
smaller_img = bigger_img[::2, ::2]
And upsampling (by 2):
bigger_img = smaller_img.repeat(2, axis=0).repeat(2, axis=1)
(This assumes an HxWxC-shaped image. Note that this method only allows whole-integer resizing, e.g., 2x but not 1.5x.)
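As a quick shape check, the two one-liners can be wrapped for an arbitrary integer factor k. The function names downsample and upsample are made up for this sketch.

```python
import numpy as np

def downsample(img, k):
    """Keep every k-th row and column of an (H, W) or (H, W, C) array."""
    return img[::k, ::k]

def upsample(img, k):
    """Repeat every pixel k times along both spatial axes."""
    return img.repeat(k, axis=0).repeat(k, axis=1)

img = np.arange(4 * 6 * 3).reshape(4, 6, 3)  # toy H x W x C image
small = downsample(img, 2)
big = upsample(small, 2)
print(small.shape)  # (2, 3, 3)
print(big.shape)    # (4, 6, 3)
```

Downsampling this way simply drops pixels (no averaging), so fine detail can alias; upsampling produces blocky k x k copies of each pixel.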

For people coming here from Google looking for a fast way to downsample images in numpy arrays for use in machine learning applications, here's a very fast method. It only works when the input dimensions are a multiple of the output dimensions.
The following examples downsample from 128x128 to 64x64 (this can be easily changed).
Channels last ordering
# large image is shape (128, 128, 3)
# small image is shape (64, 64, 3)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((output_size, bin_size,
output_size, bin_size, 3)).max(3).max(1)
Channels first ordering
# large image is shape (3, 128, 128)
# small image is shape (3, 64, 64)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((3, output_size, bin_size,
output_size, bin_size)).max(4).max(2)
For grayscale images just change the 3 to a 1 like this:
Channels first ordering
# large image is shape (1, 128, 128)
# small image is shape (1, 64, 64)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((1, output_size, bin_size,
output_size, bin_size)).max(4).max(2)
This method uses the equivalent of max pooling. It's the fastest way to do this that I've found.
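The reshape trick above can be generalized a little. The sketch below assumes a channels-last (H, W, C) image; the function name pool_blocks and its reduce parameter are hypothetical, chosen only for this example. Passing np.mean instead of np.max gives average pooling, which is often less noisy on natural images.

```python
import numpy as np

def pool_blocks(image, bin_size, reduce=np.max):
    """Downsample an (H, W, C) image by an integer factor using block pooling."""
    height, width, channels = image.shape
    if height % bin_size or width % bin_size:
        raise ValueError("height and width must be multiples of bin_size")
    # Split each spatial axis into (number of bins, pixels per bin).
    blocks = image.reshape(height // bin_size, bin_size,
                           width // bin_size, bin_size, channels)
    # Axes 1 and 3 index the pixels inside each bin, so reduce over them.
    return reduce(blocks, axis=(1, 3))

image = np.arange(16).reshape(4, 4, 1)
max_pooled = pool_blocks(image, 2)            # max pooling
mean_pooled = pool_blocks(image, 2, np.mean)  # average pooling
print(max_pooled[..., 0])
print(mean_pooled[..., 0])
```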

If anyone came here looking for a simple method to scale/resize an image in Python, without using additional libraries, here's a very simple image resize function:
# simple image scaling to (nR x nC) size
def scale(im, nR, nC):
    nR0 = len(im)     # source number of rows
    nC0 = len(im[0])  # source number of columns
    return [[im[int(nR0 * r / nR)][int(nC0 * c / nC)]
             for c in range(nC)] for r in range(nR)]
Example usage: resizing a (30 x 30) image to (100 x 200):
import matplotlib.pyplot as plt
def sqr(x):
    return x*x
def f(r, c, nR, nC):
    return 1.0 if sqr(c - nC/2) + sqr(r - nR/2) < sqr(nC/4) else 0.0
# a red circle on a canvas of size (nR x nC)
def circ(nR, nC):
    return [[[f(r, c, nR, nC), 0, 0]
             for c in range(nC)] for r in range(nR)]
plt.imshow(scale(circ(30, 30), 100, 200))
plt.show()
This works to shrink/scale images, and works fine with numpy arrays.
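If numpy is available anyway, the same nearest-neighbor mapping can be done without Python loops by building the row and column index arrays once. This is a sketch under that assumption; scale_nearest is a made-up name.

```python
import numpy as np

def scale_nearest(im, n_rows, n_cols):
    """Nearest-neighbor resize of an (H, W) or (H, W, C) array to (n_rows, n_cols)."""
    im = np.asarray(im)
    # Source row/column for each output row/column (integer floor division).
    rows = (np.arange(n_rows) * im.shape[0]) // n_rows
    cols = (np.arange(n_cols) * im.shape[1]) // n_cols
    # Broadcasting rows (as a column vector) against cols selects the full grid.
    return im[rows[:, None], cols]

im = np.arange(12).reshape(3, 4)
resized = scale_nearest(im, 6, 8)
print(resized.shape)  # (6, 8)
```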

For people who want to resize (interpolate) a batch of numpy arrays, PyTorch provides a faster function named torch.nn.functional.interpolate. Just remember to use np.transpose first to change the layout from batch x H x W x 3 to batch x 3 x H x W.

SciPy's imresize() method was another resize method, but it was deprecated and then removed in SciPy v1.3.0. The SciPy documentation refers to the PIL image resize method instead: Image.resize(size, resample=0)
size – The requested size in pixels, as a 2-tuple: (width, height).
resample – An optional resampling filter. This can be one of PIL.Image.NEAREST (use nearest neighbour), PIL.Image.BILINEAR (linear interpolation), PIL.Image.BICUBIC (cubic spline interpolation), or PIL.Image.LANCZOS (a high-quality downsampling filter). If omitted, or if the image has mode “1” or “P”, it is set to PIL.Image.NEAREST.
Link here:
https://pillow.readthedocs.io/en/3.1.x/reference/Image.html#PIL.Image.Image.resize

Stumbled back upon this after a few years. It looks like the answers so far fall into one of a few categories:
Use an external library. (OpenCV, SciPy, etc)
Use Power-of-Two Scaling
Use Nearest Neighbor
These solutions are all respectable, so I offer this only for completeness. It has three advantages over the above: (1) it will accept arbitrary resolutions, even non-power-of-two scaling factors; (2) it uses pure Python+Numpy with no external libraries; and (3) it interpolates all the pixels for an arguably 'nicer-looking' result.
It does not make good use of Numpy and, thus, is not fast, especially for large images. If you're only rescaling smaller images, it should be fine. I offer this under Apache or MIT license at the discretion of the user.
import math
import numpy
def resize_linear(image_matrix, new_height: int, new_width: int):
    """Perform a pure-numpy linear-resampled resize of a 2-D (grayscale) image."""
    output_image = numpy.zeros((new_height, new_width), dtype=image_matrix.dtype)
    original_height, original_width = image_matrix.shape
    inv_scale_factor_y = original_height/new_height
    inv_scale_factor_x = original_width/new_width
    # This is an ugly serial operation.
    for new_y in range(new_height):
        for new_x in range(new_width):
            # If you had a color image, you could repeat this with all channels here.
            # Find sub-pixels data:
            old_x = new_x * inv_scale_factor_x
            old_y = new_y * inv_scale_factor_y
            x_fraction = old_x - math.floor(old_x)
            y_fraction = old_y - math.floor(old_y)
            # Sample four neighboring pixels (clamped at the image border):
            left_upper = image_matrix[math.floor(old_y), math.floor(old_x)]
            right_upper = image_matrix[math.floor(old_y), min(image_matrix.shape[1] - 1, math.ceil(old_x))]
            left_lower = image_matrix[min(image_matrix.shape[0] - 1, math.ceil(old_y)), math.floor(old_x)]
            right_lower = image_matrix[min(image_matrix.shape[0] - 1, math.ceil(old_y)), min(image_matrix.shape[1] - 1, math.ceil(old_x))]
            # Interpolate horizontally:
            blend_top = (right_upper * x_fraction) + (left_upper * (1.0 - x_fraction))
            blend_bottom = (right_lower * x_fraction) + (left_lower * (1.0 - x_fraction))
            # Interpolate vertically: y_fraction is the weight of the lower row.
            final_blend = (blend_bottom * y_fraction) + (blend_top * (1.0 - y_fraction))
            output_image[new_y, new_x] = final_blend
    return output_image
Sample rescaling (images not shown): original, downscaled by half, and upscaled by one and one quarter.
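For larger images, the per-pixel loop can be replaced by array operations. The following is a sketch of a vectorized bilinear resize that uses the same output-to-source coordinate mapping as the loop version and weights the lower row and right column by the fractional offsets. The name resize_bilinear is an assumption for this example; it handles 2-D (grayscale) arrays only and returns float64.

```python
import numpy as np

def resize_bilinear(image, new_height, new_width):
    """Vectorized bilinear resize of a 2-D (grayscale) array; returns float64."""
    old_height, old_width = image.shape
    # Source coordinates for every output row and column.
    ys = np.arange(new_height) * (old_height / new_height)
    xs = np.arange(new_width) * (old_width / new_width)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    # Neighbors below and to the right, clamped at the border.
    y1 = np.minimum(y0 + 1, old_height - 1)
    x1 = np.minimum(x0 + 1, old_width - 1)
    wy = (ys - y0)[:, None]  # weight of the lower row
    wx = (xs - x0)[None, :]  # weight of the right column
    img = image.astype(np.float64)
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

image = np.array([[0.0, 10.0], [20.0, 30.0]])
enlarged = resize_bilinear(image, 4, 4)
print(enlarged)
```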

Are there any libraries to do this in numpy/SciPy
Sure. You can do this without OpenCV, scikit-image or PIL.
Image resizing is basically mapping the coordinates of each pixel from the original image to its resized position.
Since the coordinates of an image must be integers (think of it as a matrix), if the mapped coordinate has decimal values, you should interpolate the pixel value to approximate it to the integer position (e.g. getting the nearest pixel to that position is known as Nearest neighbor interpolation).
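All you need is a function that does this interpolation for you. SciPy has interpolate.interp2d (deprecated in SciPy 1.10 and removed in 1.14; on recent versions, scipy.interpolate.RegularGridInterpolator is the suggested replacement).
You can use it to resize a 2-D image stored in a numpy array, say arr, as follows:
import numpy as np
from scipy.interpolate import interp2d
H, W = arr.shape[:2]  # rows (height) come first in the shape
new_W, new_H = (600, 300)
unit_grid = lambda n: np.linspace(0, 1, n)  # n evenly spaced coordinates in [0, 1]
f = interp2d(unit_grid(W), unit_grid(H), arr, kind="linear")
new_arr = f(unit_grid(new_W), unit_grid(new_H))  # shape (new_H, new_W)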
Of course, if your image is RGB, you have to perform the interpolation for each channel.
If you would like to understand more, I suggest watching Resizing Images - Computerphile.

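import cv2
import numpy as np
image_read = cv2.imread('filename.jpg', 0)  # 0 loads the image as grayscale
original_image = np.asarray(image_read)
width, height = 452, 452
# Keep the source dtype so that cv2.imshow displays the values correctly.
resize_image = np.zeros(shape=(width, height), dtype=original_image.dtype)
# Nearest-neighbor resize: copy the closest source pixel for each output pixel.
for W in range(width):
    for H in range(height):
        new_width = int(W * original_image.shape[0] / width)
        new_height = int(H * original_image.shape[1] / height)
        resize_image[W][H] = original_image[new_width][new_height]
print("Resized image size : ", resize_image.shape)
cv2.imshow('resized', resize_image)  # imshow needs a window name as its first argument
cv2.waitKey(0)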

Convert 3 Dimensional Numpy Array to 4 Dimensional Numpy Array

I want to make a simple Program which outputs a video as an Webcam, but the Cam wants a RGBA Numpy Array but I only have RGB from the video. How can I convert the 3 dimensional array to 4 dimensions?

You're actually not converting a 3-dimensional array to a 4-dimensional array. You're changing the size of one of the dimensions from three to four.
Let's say you have an NxMx3 image. You then need to:
temp = np.zeros((N, M, 4), dtype=image.dtype)
temp[:, :, 0:3] = image
temp[:, :, 3] = alpha  # whatever default alpha you choose to use
Generalize as you see fit.

Assuming your existing array is shaped (xsize, ysize, 3) and you want to create alpha as a 4th entry all filled with 1, you should be able to do something like
alpha = np.ones((*rgb.shape[0:2], 1))
rgba = np.concatenate((rgb, alpha), axis=2)
If you wanted a different uniform alpha value you could use np.full with that value instead of np.ones, but normally when converting RGB to RGBA you want fully opaque.
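One detail worth checking is the dtype: for a uint8 image, "fully opaque" is normally 255 rather than 1, and np.ones defaults to float64, which would upcast the result. The sketch below keeps the input dtype. The helper name add_alpha and its default behaviour are assumptions for illustration.

```python
import numpy as np

def add_alpha(rgb, alpha=None):
    """Append an alpha channel to an (H, W, 3) array, keeping its dtype."""
    if alpha is None:
        # Assumption: integer images use their dtype maximum (255 for uint8),
        # float images use 1.0 as "fully opaque".
        if np.issubdtype(rgb.dtype, np.integer):
            alpha = np.iinfo(rgb.dtype).max
        else:
            alpha = 1.0
    alpha_plane = np.full(rgb.shape[:2] + (1,), alpha, dtype=rgb.dtype)
    return np.concatenate((rgb, alpha_plane), axis=2)

frame = np.zeros((2, 3, 3), dtype=np.uint8)  # toy RGB frame
rgba = add_alpha(frame)
print(rgba.shape, rgba.dtype)  # (2, 3, 4) uint8
```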

You can np.dstack your original im with np.ones(im.shape[:2])
new_im = np.dstack((im, np.ones(im.shape[:2])))
Update: this is equivalent to @hobbs's np.concatenate(..., axis=2) solution.

Maybe try something like these: (import numpy as np)
arr # shape (n_bands, y_pixels, x_pixels)
swapped = np.moveaxis(arr, 0, 2) # shape (y_pixels, x_pixels, n_bands)
arr4d = np.expand_dims(swapped, 0) # shape (1, y_pixels, x_pixels, n_bands)

Voxel normalization of third dimension in python

I'm currently working on normalizing CT scans (x, y, layer). Normalizing the first two dimensions is simple using cv2.resize, but the third dimension... My idea is to flatten the first two dimensions to get a 2-D numpy array. If I reshape each layer to (x * y) and then reshape it back to (x, y), I get a completely different image. I have an image of a lung in the beginning and lines of different gray values afterwards.
test = cv2.resize(img, (img.shape[0] * img.shape[1], 1), interpolation=cv2.INTER_LINEAR)
test = cv2.resize(test, (159, 159), interpolation=cv2.INTER_LINEAR)
self.print_prediction(test, cv2.resize(temp2_masks[:, 0], (159, 159)),
color=False, shape=(159, 159))
I'm sure it's some kind of simple mistake, but I don't see it. So I would be very grateful for help.

The cv2.resize function does not reshape your array.
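It actually resizes (resamples) the image. Because dsize is given as (width, height), your first line squashes the image vertically into a single row while stretching it horizontally to x * y columns, interpolating as it goes. The values are not preserved at all.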
Use numpy.reshape to reshape your arrays instead.
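A small round trip shows the difference: numpy's reshape only changes how the same values are indexed, so flattening the first two dimensions and restoring them gives back the original volume exactly. The toy volume shape below is an assumption for illustration.

```python
import numpy as np

# Toy stand-in for a CT volume with axes (x, y, layer).
volume = np.arange(4 * 5 * 3, dtype=np.float32).reshape(4, 5, 3)
x, y, layers = volume.shape

flat = volume.reshape(x * y, layers)   # 2-D array: one row per (x, y) position
restored = flat.reshape(x, y, layers)  # back to the original layout

print(flat.shape)                         # (20, 3)
print(np.array_equal(restored, volume))   # True
```

Any resizing along the layer axis could then be applied to the 2-D array before reshaping back, as long as the first dimension (x * y) is left unchanged.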

How to use view as blocks in Scikit-Image?

I have a numpy array of shape (12,224,224). This is 12 images of size (224, 224). When I had a single image, this was simple. The image was of size (x,y). For example, if x is an image of size (400,400), I could use view_as_blocks like this:
from skimage.util import view_as_blocks as vablks
xx = vablks(x, block_shape=(8,8))
This would result in a block of shape (50,50,8,8).
Now I would like to know how to apply this when I have a list of images. Either I lose shape, that is my 12 images are combined into one (224,224) block broken down into (28,28,8,8), or I run into a ValueError. Here is the code I tried to use for iterating over the 12 images and viewing the (224,224) shaped images
xx = []
for item_ in x:
    xx.append(blockSplitter(item_))
where x is a list of images.
Here is the error:
ValueError: 'block_shape' is not compatible with 'arr_in'
Overall, I would like to know how to view the images as blocks of 8x8 without losing the images.
Help, Please and Thank You.

You have at least two options:
1) Convert the list to an array, as suggested by the commenter above. Then use view_as_blocks with the correct parameters:
import numpy as np
from skimage.util import view_as_blocks
images = [np.zeros((50, 50)) for i in range(10)]
images = np.array(images)  # shape (10, 50, 50)
all_blocks = view_as_blocks(images, block_shape=(1, 10, 10)).squeeze()  # shape (10, 5, 5, 10, 10)
2) Convert each item in the list to a windowed view, and then convert the end result to an array:
import numpy as np
from skimage.util import view_as_blocks
images = [np.zeros((50, 50)) for i in range(10)]
image_blocks = [view_as_blocks(image, block_shape=(10, 10)) for image in images]
all_blocks = np.array(image_blocks)  # shape (10, 5, 5, 10, 10)
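For the (12, 224, 224) stack from the question, the same block layout can also be obtained with plain numpy, provided the image sides are exact multiples of the block size. This is a sketch under that assumption and is not how view_as_blocks is implemented internally.

```python
import numpy as np

stack = np.arange(12 * 224 * 224).reshape(12, 224, 224)  # 12 toy images
n, height, width = stack.shape
b = 8  # block side length

# Split each spatial axis into (number of blocks, block size), then move the
# two "number of blocks" axes next to each other.
blocks = stack.reshape(n, height // b, b, width // b, b).swapaxes(2, 3)
print(blocks.shape)  # (12, 28, 28, 8, 8)
```

Here blocks[i, r, c] is the 8x8 block in block-row r and block-column c of image i.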

strange behaviour when cropping a numpy/opencv image

I'm really puzzled by the way of indexing a numpy multidimensional array. My goal is to crop a region from an image I loaded using opencv.
Loading the image works great:
import numpy as np
import cv2
img = cv2.imread(start_filename)
print img.shape
shape is displayed as
(2000L, 4096L, 3L)
Now I want to cut a part from the image which ranges from pixels 550 to 1550 in the first dimension and only consists of the last 782 pixels of the second dimension. I tried
img=img[550:1550][-782:][:]
print img.shape
Now the shape is displayed as
(782L, 4096L, 3L)
I'm confused. What's the correct way of indexing for the crop operation?

The correct way to crop an image is to slice both axes in a single indexing operation:
import cv2
img = cv2.imread("lenna.png")
crop_img = img[200:400, 100:300] # Crop with x, y, w, h = 100, 200, 200, 200
# NOTE: it's img[y: y + h, x: x + w] and *not* img[x: x + w, y: y + h]
In your case, the final cropped image may be reproduced as:
crop_img = img[550:1550, -782:]
print(crop_img.shape)
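The reason the chained form misbehaves is that each pair of brackets indexes the result of the previous one, so every slice is applied to the first axis again. A short shape comparison on a blank array of the same size as in the question illustrates this:

```python
import numpy as np

img = np.zeros((2000, 4096, 3), dtype=np.uint8)  # stand-in for the loaded image

# Chained brackets: the second slice is applied to axis 0 of the first result.
chained = img[550:1550][-782:]
# One bracket with comma-separated slices: one slice per axis.
cropped = img[550:1550, -782:]

print(chained.shape)  # (782, 4096, 3)
print(cropped.shape)  # (1000, 782, 3)
```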

As mentioned in other answers, you could use img[550:1550,-782:,:], but this gives you a view that shares memory with the original array, so modifying the crop also modifies the original image. If you want an independent array that you can modify freely, call .copy() on the slice or use NumPy's ix_ function for indexing, which returns a copy.
img=img[np.ix_(range(550, 1550), range(img.shape[1]-782, img.shape[1]))]
# or
img=img[np.ix_(range(550, 1550), range(img.shape[1]-782, img.shape[1]), range(3))]
After this your shape will look like:
(1000, 782, 3)
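How slices and ix_ indexing relate to the original array's memory can be checked directly with np.shares_memory. The small array below is a toy example, not the image from the question.

```python
import numpy as np

img = np.zeros((6, 8, 3), dtype=np.uint8)

view = img[1:4, -3:]   # basic slicing: a view on the same memory
view[...] = 255        # this also writes into img

copied = img[1:4, -3:].copy()                  # explicit, independent copy
fancy = img[np.ix_(range(1, 4), range(5, 8))]  # advanced indexing also copies
copied[...] = 0        # does not touch img
fancy[...] = 0         # does not touch img

print(np.shares_memory(img, view))    # True
print(np.shares_memory(img, copied))  # False
print(np.shares_memory(img, fancy))   # False
print(img[1:4, -3:].min())            # 255
```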
