I have a numpy array that I wish to resize using opencv.
Its values range from 0 to 255. If I opt to use cv2.INTER_CUBIC, I may get values outside this range. This is undesirable, since the resized array is supposed to still represent an image.
One solution is to clip the results to [0, 255]. Another is to use a different interpolation method.
It is my understanding that INTER_AREA is a good choice for down-sampling an image, but it behaves similarly to nearest-neighbor interpolation when upsampling, which makes it less than ideal for my purpose.
Should I use INTER_CUBIC (and clip), INTER_AREA, or INTER_LINEAR?
An example of values outside the range when using INTER_CUBIC:
a = np.array([0, 10, 20, 0, 5, 2, 255, 0, 255]).reshape((3, 3))
[[  0  10  20]
 [  0   5   2]
 [255   0 255]]
b = cv2.resize(a.astype('float'), (4, 4), interpolation=cv2.INTER_CUBIC)
[[   0.            5.42489886   15.43670964   21.29199219]
 [ -28.01513672   -2.46422291    1.62949324  -19.30908203]
 [  91.88964844   25.07939219   24.75106835   91.19140625]
 [ 273.30322266   68.20603609   68.13853455  273.15966797]]
Edit: As berak pointed out, converting the type to float (from int64) allows for values outside the original range; the cv2.resize() function does not work with the default 'int64' type. Converting to 'uint8' instead will automatically saturate the values to [0..255].
Also, as pointed out by SaulloCastro, another related answer demonstrates scipy's interpolation, where the default method is cubic interpolation (with saturation).
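For completeness, a minimal sketch of the clipping approach mentioned above (b is the float result of the INTER_CUBIC resize shown earlier):
import numpy as np
# clip the cubic result back into the valid image range, then convert to uint8
b_clipped = np.clip(b, 0, 255).astype(np.uint8)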
If you are enlarging the image, you should prefer to use INTER_LINEAR or INTER_CUBIC interpolation.
If you are shrinking the image, you should prefer to use INTER_AREA interpolation.
Cubic interpolation is computationally more complex, and hence slower than linear interpolation. However, the quality of the resulting image will be higher.
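As a rough sketch of that rule of thumb (the helper name resize_keeping_range is just for illustration, not an OpenCV function):
import cv2
import numpy as np

def resize_keeping_range(img, new_size):
    # use INTER_AREA when shrinking and INTER_CUBIC when enlarging,
    # then clip the result back into [0, 255]
    h, w = img.shape[:2]
    shrinking = new_size[0] < w or new_size[1] < h
    interp = cv2.INTER_AREA if shrinking else cv2.INTER_CUBIC
    out = cv2.resize(img.astype('float'), new_size, interpolation=interp)
    return np.clip(out, 0, 255).astype(np.uint8)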
To overcome this problem, first compute the new size at which the given image can be interpolated while preserving its aspect ratio, then copy the interpolated image onto the target image, like so:
# create target image and copy sample image into it
(wt, ht) = imgSize # target image size
(h, w) = img.shape # given image size
fx = w / wt
fy = h / ht
f = max(fx, fy)
newSize = (max(min(wt, int(w / f)), 1),
           max(min(ht, int(h / f)), 1))  # scale according to f (result at least 1 and at most wt or ht)
img = cv2.resize(img, newSize, interpolation=cv2.INTER_CUBIC) #INTER_CUBIC interpolation
target = np.ones([ht, wt]) * 255 # shape=(64,800)
target[0:newSize[1], 0:newSize[0]] = img
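If this is needed in several places, the same logic could be wrapped in a small helper (a sketch; the function name is just for illustration):
import cv2
import numpy as np

def resize_into_canvas(img, target_size, fill_value=255):
    # scale img to fit inside target_size (wt, ht) preserving aspect ratio,
    # then paste it onto a canvas of exactly that size
    wt, ht = target_size
    h, w = img.shape
    f = max(w / wt, h / ht)
    new_size = (max(min(wt, int(w / f)), 1), max(min(ht, int(h / f)), 1))
    resized = cv2.resize(img, new_size, interpolation=cv2.INTER_CUBIC)
    canvas = np.ones((ht, wt), dtype=img.dtype) * fill_value
    canvas[0:new_size[1], 0:new_size[0]] = resized
    return canvas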
Some of the possible interpolation methods in OpenCV are:
INTER_NEAREST – a nearest-neighbor interpolation
INTER_LINEAR – a bilinear interpolation (used by default)
INTER_AREA – resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC – a bicubic interpolation over 4×4 pixel neighborhood
INTER_LANCZOS4 – a Lanczos interpolation over 8×8 pixel neighborhood
See here for the results of each interpolation method.
I think you should start with INTER_LINEAR, which is the default option for the resize() function. It combines sufficiently good visual results with sufficiently good time performance (although it is not as fast as INTER_NEAREST), and it won't create those out-of-range values.
This answer is based on testing, and in the end it supports the answer of @shivam. I tested these interpolation methods, in combination, for both shrinking and enlarging, and after enlarging I calculated the PSNR against the original image.
[cv2.INTER_AREA,
cv2.INTER_BITS,
cv2.INTER_BITS2,
cv2.INTER_CUBIC,
cv2.INTER_LANCZOS4,
cv2.INTER_LINEAR,
cv2.INTER_LINEAR_EXACT,
cv2.INTER_NEAREST]
shrinking = 0.25
enlarge = 4
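A minimal sketch of what such a shrink-then-enlarge PSNR comparison could look like (using cv2.PSNR, with a shortened method list and a single image; variable names are just for illustration):
import cv2

methods = [cv2.INTER_AREA, cv2.INTER_CUBIC, cv2.INTER_LANCZOS4,
           cv2.INTER_LINEAR, cv2.INTER_NEAREST]

img = cv2.imread('some_image.png')
scores = {}
for shrink_m in methods:
    small = cv2.resize(img, None, fx=0.25, fy=0.25, interpolation=shrink_m)
    for enlarge_m in methods:
        # enlarge back to the original size and compare against the original
        big = cv2.resize(small, (img.shape[1], img.shape[0]), interpolation=enlarge_m)
        scores[(shrink_m, enlarge_m)] = cv2.PSNR(img, big)

best_combination = max(scores, key=scores.get)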
I tested this on 165 images of different shapes. For the results, I picked the maximum and second-maximum PSNR for each image and counted how often each combination achieved them.
The counts of how often each combination achieved the maximum PSNR are shown in the image below.
In this test, the maximum PSNR was given by the combination of AREA and LANCZOS4, which gave the maximum PSNR for 141/204 images.
I also wanted to include the second maximum, so here are the results for the second maximum only.
Here AREA and CUBIC gave the second-best result: 19/204 images had the highest PSNR and 158/347 had the second-highest PSNR using AREA + CUBIC.
These results were vague, so I opened the files for which CUBIC gave the highest PSNR. It turns out that images with a lot of texture/abstraction gave the highest PSNR using CUBIC.
So I did further tests only for AREA + CUBIC and AREA + LANCZOS4. I came to the conclusion that if you are shrinking the image by less than a factor of 10, go for LANCZOS4: it will give you better results for less than 10x zoom, and if the image is large it is better than CUBIC.
As for my program, I was shrinking the image 4 times, so for me AREA + LANCZOS4 works better.
Scripts and images: https://github.com/crackaf/triple-recovery/tree/main/tests
Related
I've been experimenting on MedMNIST data and I noticed that the default loader produces 28x28x3 images.
I wanted to resize all images via cv2 to 32x32x3
I previously tried this method but my CNN models fail to achieve good accuracy.
def regen(generate):
    for i, j in generate:
        a = np.zeros((64, 32, 32, 3))
        a[:, 2:30, 2:30, :] = i
        b = a / 255
        yield b, j
Here is a link to the original dataset
https://medmnist.com/
Not sure if this is what you tried before, but you can try resizing the images using cv2.resize:
import cv2
img = cv2.imread('my_image.jpg')
res = cv2.resize(img, dsize=(32, 32))
You can also try changing the interpolation argument which defaults to INTER_LINEAR, but there are others that may work better for your case. The possible values from the documentation I linked:
INTER_NEAREST - a nearest-neighbor interpolation
INTER_LINEAR - a bilinear interpolation (used by default)
INTER_AREA - resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC - a bicubic interpolation over 4x4 pixel neighborhood
INTER_LANCZOS4 - a Lanczos interpolation over 8x8 pixel neighborhood
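For the original MedMNIST use case, a sketch of how a batch generator could resize each 28x28x3 image to 32x32x3 with cv2.resize (the function name and batch layout mirror the regen generator above and are assumptions, not part of any library):
import cv2
import numpy as np

def regen_resized(generate):
    # resize each image in the batch from 28x28x3 to 32x32x3 and rescale to [0, 1]
    for batch, labels in generate:
        resized = np.stack([cv2.resize(img, (32, 32), interpolation=cv2.INTER_LINEAR)
                            for img in batch])
        yield resized.astype(np.float32) / 255.0, labels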
I have images of size 48x48. I want to increase their size to 150x150 for training using transfer learning (CNN). What is a possible way to do this? I want to increase the image size in such a way that the resolution remains the same, without any loss of data.
You can use the tensorflow.image.resize method without any problem:
import tensorflow as tf
tf.image.resize(X, [150, 150])
If you read the TensorFlow documentation, it states that you can do downsampling and upsampling with different methods.
The method argument expects an item from the image.ResizeMethod enum, or the string equivalent. The options are:
bilinear: Bilinear interpolation. If antialias is true, becomes a hat/tent filter function with radius 1 when downsampling.
lanczos3: Lanczos kernel with radius 3. High-quality practical filter but may have some ringing, especially on synthetic images.
lanczos5: Lanczos kernel with radius 5. Very-high-quality filter but may have stronger ringing.
bicubic: Cubic interpolant of Keys. Equivalent to Catmull-Rom kernel. Reasonably good quality and faster than Lanczos3Kernel, particularly when upsampling.
gaussian: Gaussian kernel with radius 3, sigma = 1.5 / 3.0.
nearest: Nearest neighbor interpolation. antialias has no effect when used with nearest neighbor interpolation.
area: Anti-aliased resampling with area interpolation. antialias has no effect when used with area interpolation; it always anti-aliases.
mitchellcubic: Mitchell-Netravali Cubic non-interpolating filter. For synthetic images (especially those lacking proper prefiltering), less ringing than Keys cubic kernel but less sharp.
See here for details.
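For the 48x48 to 150x150 case above, a short sketch (assuming X is a batch of images with shape (N, 48, 48, C); the method choice is just an example):
import tensorflow as tf

# bicubic tends to be a reasonable default when upsampling;
# antialias mainly matters when downsampling
X_resized = tf.image.resize(X, [150, 150], method='bicubic')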
This might work.
from PIL import Image
import numpy as np
# random image data
image_data = np.random.randint(low=0, high=256, size=(48,48,3)).astype(np.uint8)
image_small = Image.fromarray(image_data)
# there are many settings you can play with here, depending on how you want the image resized
image_large = image_small.resize((120,120))
np.array(image_large)
I am using the python bindings for opencv. I am using keypoint detection and description (ie SURF, SIFT,...) to find a template image contained within a target image, but there is a catch: the template can be "squeezed" in the target image, so that the aspect ratio is different than the target image.
This does not work with findHomography(), since it assumes a simple perspective transform, which cannot have this sort of stretching.
Are there any ways to do this? I have thought about incrementally stretching the target image different amounts to change the aspect ratio, and using findHomography at each iteration, but as far as I can tell there is no way of comparing the quality of a fit (since I'm using RANSAC to find the best fit), so I can't tell at which squeeze level it fits best.
Perhaps counting the number of points that matched correctly from the RANSAC by looking at the length of the returned mask? This seems sorta gross.
This does not work with findHomography(), since it assumes a simple perspective transform, which cannot have this sort of stretching.
This is not true: even an affine warp allows stretching of the aspect ratio and even shear distortion, and homographies extend this further to non-uniform perspective distortions. For example, the affine transformation given by the matrix
2 0 0
0 1 0
will stretch an image horizontally by a factor of two, as seen with this short program:
import cv2
import numpy as np
img = cv2.imread('lena.png')
affine_warp = np.array([[2, 0, 0], [0, 1, 0]], dtype=np.float32)
dsize = (img.shape[1]*2, img.shape[0])
warped_img = cv2.warpAffine(img, affine_warp, dsize)
cv2.imshow("2x Horizontal Stretching", warped_img)
cv2.waitKey(0)
Producing the output:
So that is not your issue. Homographies allow even stronger warping. Are you running RANSAC yourself or letting the findHomography() function decide your points via RANSAC? Please post your expected output and your current code, possibly in a new question that reflects the problems you're facing.
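Regarding judging fit quality: if findHomography() is run with RANSAC, the returned mask can be used to count inliers, which is a common (if crude) score for comparing fits. A minimal sketch, assuming src_pts and dst_pts are matched keypoint coordinates (not code from the question):
import cv2
import numpy as np

# src_pts / dst_pts: Nx1x2 float32 arrays of matched keypoint coordinates (assumed)
H, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
inlier_count = int(mask.ravel().sum())   # number of matches consistent with H
inlier_ratio = inlier_count / len(mask)  # fraction of matches that are inliers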
I've succeeded at this by using the method below, but I'm sure there must be other, more time-efficient alternatives that provide an exact angle of rotation rather than the approximation this method gives. I'll be pleased to hear your feedback.
The procedure is based on the following steps:
Import a template image (i.e.: with orientation at 0º)
Create a discrete array of the same image, each one rotated by 360º/rotate_steps relative to its nearest neighbour (i.e. 30 to 50 rotated images)
# python 3 / opencv 3
# Settings:
rotate_steps = 36
step_angle = round((360 / rotate_steps), 0)  # one image at each 10º

# Rotation function
def rotate_image(image, angle):
    # ../..
    return rotated_image

# Importing a sample image and creating an n-dimensional array in which to store the rotated images:
image = cv2.imread('sample_image.png')
image_array = np.zeros((image.shape[1], image.shape[0], 1), dtype='uint8')

# Rotating the sample image and saving it into the array as a new channel:
while rotation_angle <= (360 - step_angle):
    angles.append(rotation_angle)
    image_array[:, :, channel] = rotate_image(image.copy(), rotation_angle)
    # ../..
So I get:
angles = [0, 10.0, 20.0, 30.0, .../..., 340.0, 350.0]
image_array = [image_1, image_2, image_3, ...] where image_i is a different channel on a numpy array.
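The rotation function left as ../.. above could be implemented, for example, with cv2.getRotationMatrix2D; a minimal sketch (one possible implementation, not the original code):
import cv2

def rotate_image(image, angle):
    # rotate around the image centre, keeping the original size
    h, w = image.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(image, M, (w, h))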
Retrieve the 'test_image' for which I'm looking at the angle compared to the sample image we have previously rotated and stored into an array
Run a series of cv2.matchTemplate() and cv2.minMaxLoc() calls to find which rotated image's angle best matches the 'test_image'
for i in range(len(angles)):
    res = cv2.matchTemplate(test_image, image_array[:, :, i], cv2.TM_CCOEFF_NORMED)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    # ../..
And finally I pick, as the discretized angle of the sample image, the one corresponding to the rotated template with the highest 'max_val'.
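That selection step could look something like this (a sketch; it assumes scores collects max_val for each angle, and reuses test_image, image_array and angles from above):
import cv2
import numpy as np

scores = []
for i in range(len(angles)):
    res = cv2.matchTemplate(test_image, image_array[:, :, i], cv2.TM_CCOEFF_NORMED)
    _, max_val, _, _ = cv2.minMaxLoc(res)
    scores.append(max_val)

best_angle = angles[int(np.argmax(scores))]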
This has proved to work well, bearing in mind that the result is an approximation whose precision depends on the number of rotated template images, and that the run time grows as the number of rotated templates increases...
I'm sure there must be other smarter alternatives based on different methods such as generating a kind of "orientation vector" of an image, and so comparing just the resulting number with a previously known one from a sample template...
Your feedback will be highly appreciated.
I think your problem doesn't have an easy solution. It's in fact a registration problem, warping (in this case, rotating) an image to fit another. And it's a known difficult problem, as segmentation is.
I heard image processing researchers say that "he who masters segmentation and registration masters image processing", which might be a little bit of a hyperbole, but it gives the general idea.
Anyway, your technique is how I would have gone with it. Looking on researchgate, https://www.researchgate.net/post/How_can_one_determine_the_rotation_angle_between_two_images, lots of answers also go your way. The alternative would be using feature matching, but I'm not sure it would be faster than your solution.
Maybe you can have a look at OpenCV registration methods http://docs.opencv.org/trunk/db/d61/group__reg.html (the method in this link uses pixel matching and not feature matching, maybe it's faster)
Is there any good algorithm for detecting particles on a changing background intensity?
For example, if I have the following image:
Is there a way to count the small white particles, even with the clearly different background that appears towards the lower left?
To be a little more clear, I would like to label the image and count the particles with an algorithm that finds these particles to be significant:
I have tried many things with the PIL, cv, scipy, numpy, etc. modules.
I got some hints from this very similar SO question, and it appears at first glance that you could take a simple threshold like so:
import mahotas
import pylab
from scipy import ndimage

im = mahotas.imread('particles.jpg')
T = mahotas.thresholding.otsu(im)
labeled, nr_objects = ndimage.label(im > T)
print(nr_objects)
pylab.imshow(labeled)
but because of the changing background you get this:
I have also tried other ideas, such as a technique I found for measuring paws, which I implemented in this way:
import numpy as np
import scipy
import pylab
import pymorph
import mahotas
from scipy import ndimage
import cv
def detect_peaks(image):
    """
    Takes an image and detects the peaks using the local maximum filter.
    Returns a boolean mask of the peaks (i.e. 1 when
    the pixel's value is the neighborhood maximum, 0 otherwise)
    """
    # define an 8-connected neighborhood
    neighborhood = ndimage.morphology.generate_binary_structure(2, 2)

    # apply the local maximum filter; all pixels of maximal value
    # in their neighborhood are set to 1
    local_max = ndimage.filters.maximum_filter(image, footprint=neighborhood) == image

    # local_max is a mask that contains the peaks we are
    # looking for, but also the background.
    # In order to isolate the peaks we must remove the background from the mask.
    # We create the mask of the background
    background = (image == 0)

    # a little technicality: we must erode the background in order to
    # successfully subtract it from local_max, otherwise a line will
    # appear along the background border (artifact of the local maximum filter)
    eroded_background = ndimage.morphology.binary_erosion(background, structure=neighborhood, border_value=1)

    # we obtain the final mask, containing only peaks, by removing the background
    # from the local_max mask (use logical operations rather than subtracting boolean arrays)
    detected_peaks = local_max & ~eroded_background
    return detected_peaks
im = mahotas.imread('particles.jpg')
imf = ndimage.gaussian_filter(im, 3)
#rmax = pymorph.regmax(imf)
detected_peaks = detect_peaks(imf)
pylab.imshow(pymorph.overlay(im, detected_peaks))
pylab.show()
but this gives no luck either, showing this result:
Using the regional max function, I get images which almost appear to give correct particle identification, but there are either too many or too few particles in the wrong spots, depending on my Gaussian filtering (the images use Gaussian filters of 2, 3, and 4):
Also, it would need to work on images similar to this as well:
This is the same type of image above, just at a much higher density of particles.
EDIT: Solved. I was able to get a decent working solution to this problem using the following code:
import cv2
import pylab
from scipy import ndimage
im = cv2.imread('particles.jpg')
pylab.figure(0)
pylab.imshow(im)
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5,5), 0)
maxValue = 255
adaptiveMethod = cv2.ADAPTIVE_THRESH_GAUSSIAN_C  # or cv2.ADAPTIVE_THRESH_MEAN_C
thresholdType = cv2.THRESH_BINARY  # or cv2.THRESH_BINARY_INV
blockSize = 5  # odd number like 3, 5, 7, 9, 11
C = -3  # constant to be subtracted
im_thresholded = cv2.adaptiveThreshold(gray, maxValue, adaptiveMethod, thresholdType, blockSize, C)
labelarray, particle_count = ndimage.measurements.label(im_thresholded)
print(particle_count)
pylab.figure(1)
pylab.imshow(im_thresholded)
pylab.show()
This will show the images like this:
(which is the given image)
and
(which is the counted particles)
and calculate the particle count as 60.
I solved the "variable brightness in background" problem by using a tuned difference threshold with a technique called Adaptive Contrast. It works by performing a linear combination (in this case, a difference) of a grayscale image with a blurred version of itself, and then applying a threshold to it.
Convolve the image with a suitable statistical operator.
Subtract the original from the convolved image, correcting intensity scale/gamma if necessary.
Threshold the difference image with a constant.
(original paper)
I did this very successfully with scipy.ndimage, in the floating-point domain (way better results than integer image processing), like this:
import numpy
import scipy.ndimage

original_grayscale = numpy.asarray(some_PIL_image.convert('L'), dtype=float)
blurred_grayscale = scipy.ndimage.filters.gaussian_filter(original_grayscale, blur_parameter)
difference_image = original_grayscale - (multiplier * blurred_grayscale)
image_to_be_labeled = ((difference_image > threshold) * 255).astype('uint8')  # not sure if it is necessary
labelarray, particle_count = scipy.ndimage.measurements.label(image_to_be_labeled)
Hope this helps!!
I cannot really give a definite answer, but here are a few pointers:
The function mahotas.morph.regmax might be better than the maximum filter as it removes pseudo-maxima. Perhaps combine this with a global threshold, with a local threshold (such as the mean over a window) or both.
If you have several images and the same uneven background, then maybe you can compute an average background and normalize against that, or use empty images as your estimate of background. This would be the case if you have a microscope, and like every microscope I've seen, the illumination is uneven.
Something like:
average = average_of_many(images)
# smooth it
average = mahotas.gaussian_filter(average,24)
Now you preprocess your images, like:
preproc = image/average
or something like that.
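A sketch of that idea with plain numpy (the averaging step written out explicitly; images, image and the smoothing radius are assumptions from the text above):
import numpy as np
import mahotas

# images: a list of 2-D arrays taken under the same illumination (assumed)
average = np.mean(np.stack([img.astype(float) for img in images]), axis=0)
average = mahotas.gaussian_filter(average, 24)  # smooth the background estimate

preproc = image / (average + 1e-9)  # small epsilon to avoid division by zero in dark regions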