OpenCV hog features explanation

OpenCV hog features explanation - python

I am trying to calculate dense feature trajectories of a video as in https://hal.inria.fr/hal-00725627/document. I am trying to use openCV hog descriptors like this:
winSize = (32,32)
blockSize = (32,32)
blockStride = (2,2)
cellSize = (2,2)
nbins = 9
hog = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,nbins)
hist = hog.compute(img)
However, this returns a very large feature vector of size: (160563456, 1).
What is a window? (winSize)
What is a block?
What is a cell?
The documentation isn't particularly helpful at explaining what each of these parameters is.
From http://www.learnopencv.com/histogram-of-oriented-gradients/
I see that to compute HOGs we create a histogram for each cell of an image patch and then normalise over the patch.
What I want is 4 9bin histograms for each (32, 32) patch of my image which should be calculated from the histograms of the (16,16) cells from this patch. So I would expect a final hog feature of size 40716 for a (480,640) image.
(((32*32) / (16*16)) * 9) * (((480-16*640-16)/(32*32)*4)) = 40716
((PatchSize / cell size) * numBins) * numPatches = hogSize
I have also seen people doing stuff like this:
winStride = (8,8)
padding = (8,8)
locations = ((10,20),)
hist = hog.compute(image,winStride,padding,locations)
However, I don't understand what the locations parameter does as I do not wish to only compute the HOG features at a single location but for all (32,32) patches of my image.

cell_size = (16, 16) # h x w in pixels
block_size = (2, 2) # h x w in cells
nbins = 9 # number of orientation bins
# winSize is the size of the image cropped to an multiple of the cell size
# cell_size is the size of the cells of the img patch over which to calculate the histograms
# block_size is the number of cells which fit in the patch
hog = cv2.HOGDescriptor(_winSize=(img.shape[1] // cell_size[1] * cell_size[1],
img.shape[0] // cell_size[0] * cell_size[0]),
_blockSize=(block_size[1] * cell_size[1],
block_size[0] * cell_size[0]),
_blockStride=(cell_size[1], cell_size[0]),
_cellSize=(cell_size[1], cell_size[0]),
_nbins=nbins)
self.hog = hog.compute(img)

We divide the image into cells of mxn pixels. Let's say 8x8.
So a 64x64 image would result in 8x8 cells of 8x8 pixels.
To reduce overall brightness effects we add a normalization stop into the feature calculation. A block contains several cells. Instead of normalizing each cell we normalize across a block. A 32x32 pixel block would contain 4x4 8x8 pixel cells.
A window is the part of the image we calculate the feature descriptor for.
Let's say you want to find something of 64x64 pixels in a large image. You then would slide a 64x64 pixel window across the image and calculate the feature descriptor for each location which you then use to find the location of best match...
It's all in the documents. Just read it and experiment until you understand it.
If you can't follow the documentation, read the source code and see what is going on line by line.

Related

OpenCV can't resize() a numpy array created from a pygame.PixelArray, error: src data type = 8 is not supported [duplicate]

I would like to take an image and change the scale of the image, while it is a numpy array.
For example I have this image of a coca-cola bottle:
bottle-1
Which translates to a numpy array of shape (528, 203, 3) and I want to resize that to say the size of this second image:
bottle-2
Which has a shape of (140, 54, 3).
How do I change the size of the image to a certain shape while still maintaining the original image? Other answers suggest stripping every other or third row out, but what I want to do is basically shrink the image how you would via an image editor but in python code. Are there any libraries to do this in numpy/SciPy?

Yeah, you can install opencv (this is a library used for image processing, and computer vision), and use the cv2.resize function. And for instance use:
import cv2
import numpy as np
img = cv2.imread('your_image.jpg')
res = cv2.resize(img, dsize=(54, 140), interpolation=cv2.INTER_CUBIC)
Here img is thus a numpy array containing the original image, whereas res is a numpy array containing the resized image. An important aspect is the interpolation parameter: there are several ways how to resize an image. Especially since you scale down the image, and the size of the original image is not a multiple of the size of the resized image. Possible interpolation schemas are:
INTER_NEAREST - a nearest-neighbor interpolation
INTER_LINEAR - a bilinear interpolation (used by default)
INTER_AREA - resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moire’-free
results. But when the image is zoomed, it is similar to the
INTER_NEAREST method.
INTER_CUBIC - a bicubic interpolation over 4x4 pixel neighborhood
INTER_LANCZOS4 - a Lanczos interpolation over 8x8 pixel neighborhood
Like with most options, there is no "best" option in the sense that for every resize schema, there are scenarios where one strategy can be preferred over another.

While it might be possible to use numpy alone to do this, the operation is not built-in. That said, you can use scikit-image (which is built on numpy) to do this kind of image manipulation.
Scikit-Image rescaling documentation is here.
For example, you could do the following with your image:
from skimage.transform import resize
bottle_resized = resize(bottle, (140, 54))
This will take care of things like interpolation, anti-aliasing, etc. for you.

One-line numpy solution for downsampling (by 2):
smaller_img = bigger_img[::2, ::2]
And upsampling (by 2):
bigger_img = smaller_img.repeat(2, axis=0).repeat(2, axis=1)
(this asssumes HxWxC shaped image. note this method only allows whole integer resizing (e.g., 2x but not 1.5x))

For people coming here from Google looking for a fast way to downsample images in numpy arrays for use in Machine Learning applications, here's a super fast method (adapted from here ). This method only works when the input dimensions are a multiple of the output dimensions.
The following examples downsample from 128x128 to 64x64 (this can be easily changed).
Channels last ordering
# large image is shape (128, 128, 3)
# small image is shape (64, 64, 3)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((output_size, bin_size,
output_size, bin_size, 3)).max(3).max(1)
Channels first ordering
# large image is shape (3, 128, 128)
# small image is shape (3, 64, 64)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((3, output_size, bin_size,
output_size, bin_size)).max(4).max(2)
For grayscale images just change the 3 to a 1 like this:
Channels first ordering
# large image is shape (1, 128, 128)
# small image is shape (1, 64, 64)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((1, output_size, bin_size,
output_size, bin_size)).max(4).max(2)
This method uses the equivalent of max pooling. It's the fastest way to do this that I've found.

If anyone came here looking for a simple method to scale/resize an image in Python, without using additional libraries, here's a very simple image resize function:
#simple image scaling to (nR x nC) size
def scale(im, nR, nC):
nR0 = len(im) # source number of rows
nC0 = len(im[0]) # source number of columns
return [[ im[int(nR0 * r / nR)][int(nC0 * c / nC)]
for c in range(nC)] for r in range(nR)]
Example usage: resizing a (30 x 30) image to (100 x 200):
import matplotlib.pyplot as plt
def sqr(x):
return x*x
def f(r, c, nR, nC):
return 1.0 if sqr(c - nC/2) + sqr(r - nR/2) < sqr(nC/4) else 0.0
# a red circle on a canvas of size (nR x nC)
def circ(nR, nC):
return [[ [f(r, c, nR, nC), 0, 0]
for c in range(nC)] for r in range(nR)]
plt.imshow(scale(circ(30, 30), 100, 200))
Output:
This works to shrink/scale images, and works fine with numpy arrays.

For people who wants to resize(interpolate) a batch of numpy array, pytorch provide a faster function names torch.nn.functional.interpolate, just remember to use np.transpose first to change the channel from batchxWxHx3 to batchx3xWxH.

SciPy's imresize() method was another resize method, but it will be removed starting with SciPy v 1.3.0 . SciPy refers to PIL image resize method: Image.resize(size, resample=0)
size – The requested size in pixels, as a 2-tuple: (width, height).
resample – An optional resampling filter. This can be one of PIL.Image.NEAREST (use nearest neighbour), PIL.Image.BILINEAR (linear interpolation), PIL.Image.BICUBIC (cubic spline interpolation), or PIL.Image.LANCZOS (a high-quality downsampling filter). If omitted, or if the image has mode “1” or “P”, it is set PIL.Image.NEAREST.
Link here:
https://pillow.readthedocs.io/en/3.1.x/reference/Image.html#PIL.Image.Image.resize

Stumbled back upon this after a few years. It looks like the answers so far fall into one of a few categories:
Use an external library. (OpenCV, SciPy, etc)
User Power-of-Two Scaling
Use Nearest Neighbor
These solutions are all respectable, so I offer this only for completeness. It has three advantages over the above: (1) it will accept arbitrary resolutions, even non-power-of-two scaling factors; (2) it uses pure Python+Numpy with no external libraries; and (3) it interpolates all the pixels for an arguably 'nicer-looking' result.
It does not make good use of Numpy and, thus, is not fast, especially for large images. If you're only rescaling smaller images, it should be fine. I offer this under Apache or MIT license at the discretion of the user.
import math
import numpy
def resize_linear(image_matrix, new_height:int, new_width:int):
"""Perform a pure-numpy linear-resampled resize of an image."""
output_image = numpy.zeros((new_height, new_width), dtype=image_matrix.dtype)
original_height, original_width = image_matrix.shape
inv_scale_factor_y = original_height/new_height
inv_scale_factor_x = original_width/new_width
# This is an ugly serial operation.
for new_y in range(new_height):
for new_x in range(new_width):
# If you had a color image, you could repeat this with all channels here.
# Find sub-pixels data:
old_x = new_x * inv_scale_factor_x
old_y = new_y * inv_scale_factor_y
x_fraction = old_x - math.floor(old_x)
y_fraction = old_y - math.floor(old_y)
# Sample four neighboring pixels:
left_upper = image_matrix[math.floor(old_y), math.floor(old_x)]
right_upper = image_matrix[math.floor(old_y), min(image_matrix.shape[1] - 1, math.ceil(old_x))]
left_lower = image_matrix[min(image_matrix.shape[0] - 1, math.ceil(old_y)), math.floor(old_x)]
right_lower = image_matrix[min(image_matrix.shape[0] - 1, math.ceil(old_y)), min(image_matrix.shape[1] - 1, math.ceil(old_x))]
# Interpolate horizontally:
blend_top = (right_upper * x_fraction) + (left_upper * (1.0 - x_fraction))
blend_bottom = (right_lower * x_fraction) + (left_lower * (1.0 - x_fraction))
# Interpolate vertically:
final_blend = (blend_top * y_fraction) + (blend_bottom * (1.0 - y_fraction))
output_image[new_y, new_x] = final_blend
return output_image
Sample rescaling:
Original:
Downscaled by Half:
Upscaled by one and one quarter:

Are there any libraries to do this in numpy/SciPy
Sure. You can do this without OpenCV, scikit-image or PIL.
Image resizing is basically mapping the coordinates of each pixel from the original image to its resized position.
Since the coordinates of an image must be integers (think of it as a matrix), if the mapped coordinate has decimal values, you should interpolate the pixel value to approximate it to the integer position (e.g. getting the nearest pixel to that position is known as Nearest neighbor interpolation).
All you need is a function that does this interpolation for you. SciPy has interpolate.interp2d.
You can use it to resize an image in numpy array, say arr, as follows:
W, H = arr.shape[:2]
new_W, new_H = (600,300)
xrange = lambda x: np.linspace(0, 1, x)
f = interp2d(xrange(W), xrange(H), arr, kind="linear")
new_arr = f(xrange(new_W), xrange(new_H))
Of course, if your image is RGB, you have to perform the interpolation for each channel.
If you would like to understand more, I suggest watching Resizing Images - Computerphile.

import cv2
import numpy as np
image_read = cv2.imread('filename.jpg',0)
original_image = np.asarray(image_read)
width , height = 452,452
resize_image = np.zeros(shape=(width,height))
for W in range(width):
for H in range(height):
new_width = int( W * original_image.shape[0] / width )
new_height = int( H * original_image.shape[1] / height )
resize_image[W][H] = original_image[new_width][new_height]
print("Resized image size : " , resize_image.shape)
cv2.imshow(resize_image)
cv2.waitKey(0)

Image pre-procesisng with dataframe

I have about 1000 images in a cvs file. I have already managed to put those images in my Python programm by doing the following:
df = pd.read_csv("./Testes_small.csv")
# Creates the dataframe
training_set = pd.DataFrame({'Images': training_imgs,'Labels': training_labels})
train_dataGen = ImageDataGenerator(rescale=1./255)
train_generator = train_dataGen.flow_from_dataframe(dataframe = training_set, directory="",
x_col="Images", y_col="Labels",
class_mode="categorical",
target_size=(224, 224),batch_size=32)
##Steps to plot the images
imgs,labels = next(train_generator)
for i in range(batch_size): # range de 0 a 31
image = imgs[i]
plt.imshow(image)
plt.show()
So now I have the train_generator variable of type python.keras.preprocessing.image.DataframeIterator Its size is (32,224,224,3).
In the function ImageDataGenerator I want to put my own preprocessing function to resize the images. I want to do this because I have some rectangular images that when resized lose its ratio.
Per examples these images before(upper image) and after(the lower one) resizing:
Clearly the secong image loses shape
I found this function(it's the answer to a previous thread):
def resize_image(self, image: Image, length: int) -> Image:
"""
Resize an image to a square. Can make an image bigger to make it fit or smaller if it doesn't fit. It also crops
part of the image.
:param self:
:param image: Image to resize.
:param length: Width and height of the output image.
:return: Return the resized image.
"""
"""
Resizing strategy :
1) We resize the smallest side to the desired dimension (e.g. 1080)
2) We crop the other side so as to make it fit with the same length as the smallest side (e.g. 1080)
"""
if image.size[0] < image.size[1]:
# The image is in portrait mode. Height is bigger than width.
# This makes the width fit the LENGTH in pixels while conserving the ration.
resized_image = image.resize((length, int(image.size[1] * (length / image.size[0]))))
# Amount of pixel to lose in total on the height of the image.
required_loss = (resized_image.size[1] - length)
# Crop the height of the image so as to keep the center part.
resized_image = resized_image.crop(
box=(0, required_loss / 2, length, resized_image.size[1] - required_loss / 2))
# We now have a length*length pixels image.
return resized_image
else:
# This image is in landscape mode or already squared. The width is bigger than the heihgt.
# This makes the height fit the LENGTH in pixels while conserving the ration.
resized_image = image.resize((int(image.size[0] * (length / image.size[1])), length))
# Amount of pixel to lose in total on the width of the image.
required_loss = resized_image.size[0] - length
# Crop the width of the image so as to keep 1080 pixels of the center part.
resized_image = resized_image.crop(
box=(required_loss / 2, 0, resized_image.size[0] - required_loss / 2, length))
# We now have a length*length pixels image.
return resized_image
I'm trying to insert it like this img_datagen = ImageDataGenerator(rescale=1./255, preprocessing_function = resize_image but it doesn't work because I'm not giving an im. Do you have any ideas on how can I do this?

Check the documentation for providing custom functions to the ImageDataGenerator. It says and quote,
"preprocessing_function: function that will be applied on each input. The function will run after the image is resized and augmented. The function should take one argument: one image (Numpy tensor with rank 3), and should output a Numpy tensor with the same shape."
From the above documentation we can note the following:
When will this function be executed:
after resizing image.
after any data augmentation which has to be done.
Function argument requirements:
only one argument.
this argument is for only one numpy image.
the image should be numpy tensor of rank 3
Function output requirements:
one output image.
should be same shape as input
This last point is really important for your question. Since your function is resizing the image it's output will not be the same shape as input and so you cannot do this directly.
One alternative to get this done is to do your resizing of dataset before passing to ImageDataGenerator.

Resample DICOM Images to Size and Spacing and align to same Origin

I have a set of 4 DICOM CT Volumes which I am reading with SimpleITK ImageSeriesReader. Two of the images represent the CT of patient before and after the surgery. The other two images are binary segmentation masks segmented on the former 2 CT images. The segmentations are a ROI of their source CT.
All the 4 CT images, have different Size, Spacing, Origin and Direction. I have tried applying this GitHub gist https://gist.github.com/zivy/79d7ee0490faee1156c1277a78e4a4c4 to resize my images to 512x512x512 and Spacing 1x1x1. However, it doesn't place the images at the correct location. The segmented structure is always placed in the center of the CT image, instead of the correct location, as you can see from the pictures.
This my "raw" DICOM Image with its tumor segmentation (orange blob).
This is after the "resizing" algorithm and writing to disk (same image as before, just the tumor is colored green blob because inconsistency):
Code used for resampling all 4 DICOM Volumes to the same dimensions:
def resize_resample_images(images):
""" Resize all the images to the same dimensions, spacing and origin.
Usage: newImage = resize_image(source_img_plan, source_img_validation, ROI(ablation/tumor)_mask)
1. translate to same origin
2. largest number of slices and interpolate the others.
3. same resolution 1x1x1 mm3 - resample
4. (physical space)
Slice Thickness (0018,0050)
ImagePositionPatient (0020,0032)
ImageOrientationPatient (0020,0037)
PixelSpacing (0028,0030)
Frame Of Reference UID (0020,0052)
"""
# %% Define tuple to store the images
tuple_resized_imgs = collections.namedtuple('tuple_resized_imgs',
['img_plan',
'img_validation',
'ablation_mask',
'tumor_mask'])
# %% Create Reference image with zero origin, identity direction cosine matrix and isotropic dimension
dimension = images.img_plan.GetDimension() #
reference_direction = np.identity(dimension).flatten()
reference_size = [512] * dimension
reference_origin = np.zeros(dimension)
data = [images.img_plan, images.img_validation, images.ablation_mask, images.tumor_mask]
reference_spacing = np.ones(dimension) # resize to isotropic size
reference_image = sitk.Image(reference_size, images.img_plan.GetPixelIDValue())
reference_image.SetOrigin(reference_origin)
reference_image.SetSpacing(reference_spacing)
reference_image.SetDirection(reference_direction)
reference_center = np.array(
reference_image.TransformContinuousIndexToPhysicalPoint(np.array(reference_image.GetSize()) / 2.0))
#%% Paste the GT segmentation masks before transformation
tumor_mask_paste = (paste_roi_image(images.img_plan, images.tumor_mask))
ablation_mask_paste = (paste_roi_image(images.img_validation, images.ablation_mask))
images.tumor_mask = tumor_mask_paste
images.ablation_mask = ablation_mask_paste
# %% Apply transforms
data_resized = []
for idx,img in enumerate(data):
transform = sitk.AffineTransform(dimension) # use affine transform with 3 dimensions
transform.SetMatrix(img.GetDirection()) # set the cosine direction matrix
# TODO: check translation when computing the segmentations
transform.SetTranslation(np.array(img.GetOrigin()) - reference_origin) # set the translation.
# Modify the transformation to align the centers of the original and reference image instead of their origins.
centering_transform = sitk.TranslationTransform(dimension)
img_center = np.array(img.TransformContinuousIndexToPhysicalPoint(np.array(img.GetSize()) / 2.0))
centering_transform.SetOffset(np.array(transform.GetInverse().TransformPoint(img_center) - reference_center))
centered_transform = sitk.Transform(transform)
centered_transform.AddTransform(centering_transform)
# Using the linear interpolator as these are intensity images, if there is a need to resample a ground truth
# segmentation then the segmentation image should be resampled using the NearestNeighbor interpolator so that
# no new labels are introduced.
if (idx==1 or idx==2): # temporary solution to resample the GT image with NearestNeighbour
resampled_img = sitk.Resample(img, reference_image, centered_transform, sitk.sitkNearestNeighbor, 0.0)
else:
resampled_img = sitk.Resample(img, reference_image, centered_transform, sitk.sitkLinear, 0.0)
# append to list
data_resized.append(resampled_img)
# assuming the order stays the same, reassigng back to tuple
resized_imgs = tuple_resized_imgs(img_plan=data_resized[0],
img_validation=data_resized[1],
ablation_mask=data_resized[2],
tumor_mask=data_resized[3])
Code for "pasting" the ROI segmentations images into a correct size. Might be redundant.:
def paste_roi_image(image_source, image_roi):
""" Resize ROI binary mask to size, dimension, origin of its source/original img.
Usage: newImage = paste_roi_image(source_img_plan, roi_mask)
"""
newSize = image_source.GetSize()
newOrigin = image_source.GetOrigin()
newSpacing = image_roi.GetSpacing()
newDirection = image_roi.GetDirection()
if image_source.GetSpacing() != image_roi.GetSpacing():
print('the spacing of the source and derived mask differ')
# re-cast the pixel type of the roi mask
pixelID = image_source.GetPixelID()
caster = sitk.CastImageFilter()
caster.SetOutputPixelType(pixelID)
image_roi = caster.Execute(image_roi)
# black 3D image
outputImage = sitk.Image(newSize, image_source.GetPixelIDValue())
outputImage.SetOrigin(newOrigin)
outputImage.SetSpacing(newSpacing)
outputImage.SetDirection(newDirection)
# transform from physical point to index the origin of the ROI image
# img_center = np.array(img.TransformContinuousIndexToPhysicalPoint(np.array(img.GetSize()) / 2.0))
destinationIndex = outputImage.TransformPhysicalPointToIndex(image_roi.GetOrigin())
# paste the roi mask into the re-sized image
pasted_img = sitk.Paste(outputImage, image_roi, image_roi.GetSize(), destinationIndex=destinationIndex)
return pasted_img

Using HOG to get image descriptors but hog.compute always returns 0.0

64x64 image
Trying to use OpenCV to do something simple in Python. Using HOG to get the feature vector. But I get all 0.0. I've tried a few images same result.
import cv2
image = cv2.imread("D:\\skhan\\research\\data\\face\\test.jpg",cv2.IMREAD_COLOR)
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
winSize = (64,64)
blockSize = (16,16)
blockStride = (8,8)
cellSize = (8,8)
nbins = 9
derivAperture = 1
winSigma = 4.0
histogramNormType = 0
L2HysThreshold = 2.0000000000000001e-01
gammaCorrection = 0
nlevels = 64
hog = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,nbins,derivAperture,winSigma,histogramNormType,L2HysThreshold,gammaCorrection,nlevels)
winStride = (8,8)
padding = (8,8)
locations = ((10,20),)
hist = hog.compute(image,winStride,padding,locations)

Simply using hist = hog.compute(image) gives a non-zero result.
By default winstride, padding, locations are empty tuples. However, if you are firm on using winstride, etc your image is 64x64 and your window size (winsize) is 64x64, adding a 8x8 winstride will cause no overlapping of the window on the image, hence your output is filled with zeros.
To clarify, winsize is the size of the sliding window that is run across the entire image. Since the size of the image and the sliding window is the same, due to the manner in which HoG features are calculated, there will be no features created, which is why you're seeing zeros.

skimage resize changes total sum of array

I want to resize an image in fits format to a smaller dimension. For example, I would like to resize my 100x100 pixel image to a 58x58 pixel image. The values of the array are intensity or flux values. I want the total intensity of the image to be conserved after transformation. This does not work with skimage resize. My total value reduces depending on what factor I scale up or scale down. I have shown below the code I tried so far.
import numpy as np
from skimage.transform import resize
image=fits.open(directory+file1)
cutout=image[0].data
out = resize(cutout, (58,58), order=1, preserve_range=True)
print(np.sum(out),np.sum(cutout))
My output is:
0.074657436655 0.22187 (I want these two values to be equal)
If I scale it to the same dimension using:
out = resize(cutout, (100,100), order=1, preserve_range=True)
print(np.sum(out),np.sum(cutout))
My output is very close to what I want:
0.221869631852 0.22187
I have the same problem if I try to increase the image size as well.
out = resize(cutout, (200,200), order=1, preserve_range=True)
print(np.sum(out),np.sum(cutout))
Output:
0.887316320731 0.22187
I would like to know if there is any workaround to this problem.
EDIT 1:
I just realized that if I multiply my image by the square of the scale of which I want to increase or decrease the size of my image, then my total sum is conserved.
For example:
x=58
out = resize(cutout, (x,x), order=1, preserve_range=True)
test=out*(100/x)**2
print(np.sum(test),np.sum(cutout))
My output is very close to what I want but slightly higher:
0.221930548915 0.22187
I tried this with different dimensions and it works except for really small values. Can anybody explain why this relation is true or is this just a statistical coincidence.

If you treat an image I = Width x Height where N = Width x Height as a set of pixels with intensities in the range of [0,1], it is completely normal that after resizing the image to M = newWidth x newWeight the sum of intensities completely differs from before.
Assume that an image I with N pixels has intensities uniformly distributed in the range [0,1]. Then the sum of intensities will approximately be 0.5 * N. If you use skimage's resize, the image will be resized to a lower (or larger) size by interpolating. Interpolating does not accumulate values (as you seem to expect), it does instead average values in a neighbourhood to predict the value of each of the pixels in the new image. Thus, the intensity range of the image does not change,the values are modified, and thus, the sum of intensites of the new resized image will approximately be 0.5 * M. If M != N then the sum of intensities will differ a lot.
What you can do to solve this problem is:
Re-scale your new data proportional to its size:
>>> y, x = (57, 58)
>>> out = resize(data, (y,x), order=1, preserve_range=True)
>>> out = out * (data.shape[0] / float(y)) * (data.shape[1] / float(x))
Which is analogous to what you propose but for any size image (not just square images). This however, compensates for every pixel with a constant factor out[i,j] *= X where X is equal for every pixel in the image, and not all the pixels will be interpolated with the same weight, thus, adding small artificial artifacts.
I think it is just best to replace the total sum of the image (which depends on the number of pixels on the image) with the average intensity in the image (which doesn't rely on the number of pixels)
>>> meanI = np.sum(I) / float(I.size) # Exactly the same as np.mean(I) or I.mean()
>>> meanInew = np.sum(out) / float(out.size)
>>> np.isclose(meanI, meanInew) # True

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

OpenCV hog features explanation - python

Related

OpenCV can't resize() a numpy array created from a pygame.PixelArray, error: src data type = 8 is not supported [duplicate]

Image pre-procesisng with dataframe

Resample DICOM Images to Size and Spacing and align to same Origin

Using HOG to get image descriptors but hog.compute always returns 0.0

skimage resize changes total sum of array

Categories

Resources