Python OpenCV cannot display image correctly after transformation

Today I was trying to compress the image below, using sklearn's PCA algorithm in Python.
Because the image is RGB (3 channels), I first reshaped the image, so that it becomes 2D. Then, I applied the PCA algorithm on the data to compress the image. After the image was compressed, I inversed the PCA transformation and reshaped the approximated (decompressed) image back to its original shape.
However, when I tried to display the approximated image I got this weird result here:
While the image is stored correctly with the cv2.imwrite function, OpenCV fails to display it correctly using cv2.imshow. Do you have any idea why this might be happening?
My code is below:
from sklearn.decomposition import PCA
import cv2
import numpy as np
image_filepath = 'baby_yoda_image.jpg'
# Loading image from disk.
input_image = cv2.imread(image_filepath)
height = input_image.shape[0]
width = input_image.shape[1]
channels = input_image.shape[2]
# Reshaping image to perform PCA.
print('Input image shape:', input_image.shape)
#--- OUT: (533, 800, 3)
reshaped_image = np.reshape(input_image, (height, width*channels))
print('Reshaped Image:', reshaped_image.shape)
#--- OUT: (533, 2400)
# Applying PCA transformation to image. No whitening is applied to prevent further data loss.
n_components = 64
whitening = False
pca = PCA(n_components=n_components, whiten=whitening)  # pass whiten by keyword; PCA's second positional argument is copy, not whiten
compressed_image = pca.fit_transform(reshaped_image)
print('PCA Compressed Image Shape:', compressed_image.shape)
#--- OUT: (533, 64)
print('Compression achieved:', np.around(np.sum(pca.explained_variance_ratio_), 2)*100, '%')
#--- OUT: 97.0 %
# Plotting images.
approximated_image = pca.inverse_transform(compressed_image)
approximated_original_shape_image = np.reshape(approximated_image, (height, width, channels))
cv2.imshow('Input Image', input_image)
cv2.imshow('Compressed Image', approximated_original_shape_image)
cv2.waitKey()
Thanks in advance.

Finally, I found a solution to this problem, thanks to @fmw42. After the inverse transformation, some pixel values were negative and others exceeded 255.
Luckily, OpenCV takes care of this problem with this line of code:
approximated_uint8_image = cv2.convertScaleAbs(approximated_original_shape_image)
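For completeness, the same effect can be had without convertScaleAbs by clipping to the 8-bit range and casting; a minimal sketch, assuming approximated_original_shape_image is the float array from the code above (note that convertScaleAbs takes the absolute value of negatives, whereas clipping maps them to 0):
# Clip the out-of-range values produced by the PCA round trip, then cast to uint8
# so cv2.imshow interprets the array as 0-255 intensities.
approximated_uint8_image = np.clip(approximated_original_shape_image, 0, 255).astype(np.uint8)
cv2.imshow('Compressed Image', approximated_uint8_image)
cv2.waitKey()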

Related

How to convert data array to show easy to understand results

I tried to make an algorithm using Teachable Machine that receives a picture and checks whether it falls under one of two categories (e.g. dogs or humans). After I exported the generated code, I couldn't make sense of how to turn the results, which are given as an array, into something anyone can understand. So far it only shows a list of two numbers (e.g. [[0.00058185 0.99941814]], the first number being dogs and the second humans). I want it to show which of the two numbers means dog and which means human, with the percentage of both, or to show only the most probable one.
Here's the code:
import tensorflow.keras
from PIL import Image, ImageOps
import numpy as np
from decimal import Decimal
# Disable scientific notation for clarity
np.set_printoptions(suppress=True)
# Load the model
model = tensorflow.keras.models.load_model('keras_model.h5')
# Create the array of the right shape to feed into the keras model
# The 'length' or number of images you can put into the array is
# determined by the first position in the shape tuple, in this case 1.
data = np.ndarray(shape=(1, 224, 224, 3), dtype=np.float32)
# Replace this with the path to your image
image = Image.open('test_photo.jpg')
#resize the image to a 224x224 with the same strategy as in TM2:
#resizing the image to be at least 224x224 and then cropping from the center
size = (224, 224)
image = ImageOps.fit(image, size, Image.LANCZOS)  # Image.ANTIALIAS in older Pillow; it was an alias for LANCZOS and has since been removed
#turn the image into a numpy array
image_array = np.asarray(image)
# display the resized image
image.show()
# Normalize the image
normalized_image_array = (image_array.astype(np.float32) / 127.0) - 1
# Load the image into the array
data[0] = normalized_image_array
# run the inference
prediction = model.predict(data)
print(prediction)
input('Press ENTER to exit')
Using argmax and max does what you want:
"Prediction is {} with {}% probability".format(["dog", "human"][np.argmax(prediction)], round(np.max(prediction)*100,2))
'Prediction is human with 99.94% probability'
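If you also want the percentage for both classes rather than only the most probable one, a small sketch along the same lines (the label order ["dog", "human"] is the asker's assumption about the model's output order):
labels = ["dog", "human"]
probs = prediction[0]  # model.predict returns a batch; here the batch size is 1
for label, p in zip(labels, probs):
    print("{}: {:.2f}%".format(label, p * 100))
print("Prediction is {} with {:.2f}% probability".format(labels[int(np.argmax(probs))], np.max(probs) * 100))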

inverse reshaping giving different results

I need to resize and then reshape a certain image. Then I want to apply the inverse reshaping, which should give back the original picture, but it is not working. Let us have this code:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import imread  # any imread (e.g. skimage.io, imageio) would work here
images = []
image = imread('1.png')
resized = np.resize(image, (320, 440))
images.append(resized)
arr=np.asarray(images)
newArray=arr.astype('float32')
plt.figure(figsize=[5, 5])
reshaped = np.reshape(newArray[0], (320,440))
plt.imshow(reshaped, cmap='gray')
plt.show()
Original picture 1.png:
Reshaped picture shown by plt.show():
The inverse-reshaped image is not like the original. Can someone tell me where the problem is? Thanks.
Edit:
After clarification, it looks like the confusion comes from np.resize, which is not an image-processing operation and is not meant for rescaling an image while retaining its content.
It looks as though image-processing wrappers such as imresize have been removed from the scipy library, and while you could in principle use the scipy.interpolate package to reproduce the functionality of imresize, I recommend using either pillow or scikit-image.
with pillow:
import numpy as np
from PIL import Image
image = Image.open("my_image.jpg")
image = image.resize((224, 224))
image = np.asarray(image) # convert to numpy
with scikit-image:
from skimage.io import imread
from skimage.transform import resize
image = imread("my_image.jpg")
image = resize(image, (224, 224))
original answer
np.reshape will first ravel the elements and then fill them into the new shape, losing a lot of the spatial relationships, as you're seeing. Likely, what you actually want to do is np.transpose the image, swapping two axes:
images=[]
image = imread('1.png')
resized = np.resize(image, (320, 440))
images.append(resized)
arr=np.asarray(images)
newArray=arr.astype('float32')
plt.figure(figsize=[5, 5])
transposed = np.transpose(newArray[0])
plt.imshow(transposed, cmap='gray')
plt.show()

How to create synthetic blurred image from sharp image using PSF kernel (in image format)

Update: as @Fix suggested, I converted BGR to RGB, but the outputs are still not the same as the paper's output.
(Small note: this was already posted on https://dsp.stackexchange.com/posts/60670, but since I need help quickly I reposted it here; I hope this doesn't violate any policy.)
I tried to create a synthetic blurred image from a ground-truth image using PSF kernels (in PNG format). Some papers only mention that I need to convolve the kernel with the image, but it seems I need more than that.
What I did
import matplotlib.pyplot as plt
import cv2 as cv
import scipy
from scipy import ndimage
import matplotlib.image as mpimg
import numpy as np
img = cv.imread('../dataset/text_01.png')
norm_image = cv.normalize(img, None, alpha=-0.1, beta=1.8, norm_type=cv.NORM_MINMAX, dtype=cv.CV_32F)
f = cv.imread('../matlab/uniform_kernel/kernel_01.png')
norm_f = cv.normalize(f, None, alpha=0, beta=1, norm_type=cv.NORM_MINMAX, dtype=cv.CV_32F)
result = ndimage.convolve(norm_image, norm_f, mode='nearest')
result = np.clip(result, 0, 1)
imgplot = plt.imshow(result)
plt.show()
And this only gives me an entirely white image.
I tried to decrease beta to a lower value, e.g. norm_f = cv.normalize(f, None, alpha=0, beta=0.03, norm_type=cv.NORM_MINMAX, dtype=cv.CV_32F), and then an image appears, but its colors are very different.
The paper I got the idea from, along with the dataset (ground-truth images and PSF kernels in PNG format), is here
This is what they said:
We create the synthetic saturated images in a way similar to [3, 10].
Specifically, we first stretch the intensity range of the latent image
from [0,1] to [−0.1,1.8], and convolve the blur kernels with the
images. We then clip the blurred images into the range of [0,1]. The
same process is adopted for generating non-uniform blurred images.
These are some images I got from my source.
And this is the ground-truth image:
And this is the PSF kernel in PNG format file:
And this is their output (synthetic image):
Please help me out; any solution is fine, even other software, other languages, other tools. All I care about is eventually getting a synthetic blurred image from the original (sharp) image and a PSF kernel, with good performance. (I tried Matlab with imfilter but ran into a similar problem, and another problem with Matlab is that it is slow.)
(Please don't judge me for only caring about the output of the process; I'm not using a deconvolution method to deblur the blurred image back to the original, so I just want enough (original, blurred) pairs to test my hypothesis/method.)
Thanks.
OpenCV reads / writes images in BGR format, and Matplotlib in RGB. So if you want to display the right colours, you should first convert the image to RGB:
result_rgb = cv.cvtColor(result, cv.COLOR_BGR2RGB)
imgplot = plt.imshow(result_rgb)
plt.show()
Edit: You could convolve each channel separately and normalise your convolved image like this:
f = cv.cvtColor(f, cv.COLOR_BGR2GRAY)  # the PSF kernel as a single channel
norm_image = img / 255.0                # image scaled to [0, 1]
norm_f = f / 255.0                      # kernel scaled to [0, 1]
# convolve each colour channel with the kernel, dividing by the kernel sum
# so overall brightness is preserved
result0 = ndimage.convolve(norm_image[:,:,0], norm_f)/(np.sum(norm_f))
result1 = ndimage.convolve(norm_image[:,:,1], norm_f)/(np.sum(norm_f))
result2 = ndimage.convolve(norm_image[:,:,2], norm_f)/(np.sum(norm_f))
result = np.stack((result0, result1, result2), axis=2).astype(np.float32)
Then you should get the right colours. Note, though, that this uses a normalisation to [0.0, 1.0] for both the image and the kernel (rather than the [-0.1, 1.8] range for the image that the paper suggests).
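If you want to follow the paper's recipe more literally (stretch the latent image to [-0.1, 1.8], convolve, then clip back to [0, 1]), a rough sketch along those lines, reusing img and f from the question; normalising the kernel to sum to 1 is my assumption, and I make no claim that this reproduces the paper's exact outputs:
f = cv.cvtColor(f, cv.COLOR_BGR2GRAY).astype(np.float32)
f = f / np.sum(f)                       # PSF normalised so it sums to 1
latent = img[:, :, ::-1] / 255.0        # BGR -> RGB, scaled to [0, 1]
stretched = latent * 1.9 - 0.1          # stretch [0, 1] to [-0.1, 1.8]
# convolve each colour channel with the kernel, then clip back to [0, 1]
channels = [ndimage.convolve(stretched[:, :, c], f, mode='nearest') for c in range(3)]
blurred = np.clip(np.stack(channels, axis=2), 0, 1)
plt.imshow(blurred)
plt.show()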

How to convert an RGB image (3 channels) to grayscale (1 channel) and save it?

I'm working on a deep learning project and have a lot of images that don't need color. I saved them doing:
import matplotlib.pyplot as plt
plt.imsave('image.png', image, format='png', cmap='gray')
However later when I checked the shape of the image the result is:
import cv2
img_rgb = cv2.imread('image.png')
print(img_rgb.shape)
(196,256,3)
So even though the image I view is in grayscale, I still have 3 color channels. I realized I had to do some algebraic operations in order to convert those 3 channels into 1 single channel.
I have tried the methods described in the thread "How can I convert an RGB image into grayscale in Python?" but I'm confused.
For example, when I do the conversion using:
from skimage import color
from skimage import io
img_gray = color.rgb2gray(io.imread('image.png'))
plt.imsave('image_gray.png', img_gray, format='png')
However when I load the new image and check its shape:
img_gr = cv2.imread('image_gray.png')
print(img_gr.shape)
(196,256,3)
I tried the other methods on that thread but the results are the same. My goal is to have images with a (196,256,1) shape, given how much less computationally intensive it will be for a Convolutional Neural Network.
Any help would be appreciated.
Your first code block:
import matplotlib.pyplot as plt
plt.imsave('image.png', image, format='png', cmap='gray')
This is saving the image as RGB, because cmap='gray' is ignored when supplying RGB data to imsave (see pyplot docs).
You can convert your data into grayscale by collapsing the three bands into one, either using color.rgb2gray as you have (which takes a weighted average), or, as I tend to, with numpy (a plain average over the bands):
import numpy as np
from matplotlib import pyplot as plt
import cv2
img_rgb = np.random.rand(196,256,3)
print('RGB image shape:', img_rgb.shape)
img_gray = np.mean(img_rgb, axis=2)
print('Grayscale image shape:', img_gray.shape)
Output:
RGB image shape: (196, 256, 3)
Grayscale image shape: (196, 256)
img_gray is now the correct shape. However, if you save it using plt.imsave and read it back, you will still get three bands, with R == G == B for each pixel. This is because plt.imsave applies a colormap to 2-D data and writes an RGB(A) PNG, and cv2.imread with default flags returns a 3-channel BGR array in any case (PNG itself does support single-channel grayscale).
plt.imsave('image_gray.png', img_gray, format='png')
new_img = cv2.imread('image_gray.png')
print('Loaded image shape:', new_img.shape)
Output:
Loaded image shape: (196, 256, 3)
One way to avoid this is to save the images as numpy files, or indeed to save a batch of images as numpy files:
np.save('np_image.npy', img_gray)
new_np = np.load('np_image.npy')
print('new_np shape:', new_np.shape)
Output:
new_np shape: (196, 256)
The other thing you could do is save the grayscale PNG (using imsave) but then read it back in as grayscale (the 0 flag is cv2.IMREAD_GRAYSCALE):
finalimg = cv2.imread('image_gray.png',0)
print('finalimg image shape:', finalimg.shape)
Output:
finalimg image shape: (196, 256)
As it turns out, Keras, the deep-learning library I'm using, has its own method of converting images to a single color channel (grayscale) in its image pre-processing step.
When using the ImageDataGenerator class the flow_from_directory method takes the color_mode argument. Setting color_mode = "grayscale" will automatically convert the PNG into a single color channel!
https://keras.io/preprocessing/image/#imagedatagenerator-methods
Hope this helps someone in the future.
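As a rough sketch of what that looks like (the directory layout and image size here are illustrative placeholders, not from the original post):
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1. / 255)
# color_mode="grayscale" makes Keras load each PNG as a single channel,
# so batches have shape (batch_size, 196, 256, 1).
train_gen = datagen.flow_from_directory(
    'data/train',               # hypothetical directory of class subfolders
    target_size=(196, 256),
    color_mode='grayscale',
    class_mode='categorical',
    batch_size=32)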
If you just want to add extra channels that have the same value as the grayscale, for example to use a specific model that requires a 3-channel input_shape:
Let's say your pictures are 28 x 28, so you have a shape of (28, 28, 1).
def add_extra_channels_to_pic(pic):
    if pic.shape == (28, 28, 1):
        pic = pic.reshape(28, 28)
        pic = np.array([pic, pic, pic])
        # to put the channel axis at the end
        pic = np.moveaxis(pic, 0, -1)
    return pic
Try this method
import imageio
new_data = imageio.imread("file_path", as_gray=True)
imageio.imsave("file_path", new_data)
The optional argument "as_gray = True" in line 2 of the code does the actual conversion.

greyscale image normalization issue with MINMAX

I am trying to normalize a bunch of images which I have scaled to 32x32 pixels. I initially wanted to use x-median/std for normalization, but I found some code that uses MINMAX instead, so I am trying that. I need to get the image into the 0 to 1 range, so I assume that dtype CV_32F would do that, and I think this is where the problem lies. When I run the code, the normalized image is completely black. Any advice on how to solve that?
Here is the code:
import cv2
import numpy as np
from PIL import Image
image = cv2.imread("image.png", cv2.IMREAD_UNCHANGED) # uint8 image
norm_image = np.zeros((32, 32))
norm_image = cv2.normalize(image, norm_image, alpha=0, beta=1, norm_type = cv2.NORM_MINMAX, dtype=cv2.CV_32F)
im = Image.fromarray(norm_image)
if im.mode != 'RGB':
im = im.convert('RGB')
im.save("image_norm.png")
cv2.waitKey(0)
cv2.destroyAllWindows()
Sample image
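A likely culprit: cv2.normalize with beta=1 leaves the pixel values as floats in [0, 1], and PIL renders those floats as (almost) black when converting and saving. A minimal sketch of one way around it, rescaling to 0-255 before handing the array to PIL (keep the 0-1 float version for the model itself):
norm_image = cv2.normalize(image, None, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)
# Rescale to 0-255 uint8 for saving/viewing, otherwise the saved image looks black.
display = (norm_image * 255).astype(np.uint8)
im = Image.fromarray(display)  # for a colour image, convert BGR to RGB first
if im.mode != 'RGB':
    im = im.convert('RGB')
im.save("image_norm.png")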
