I would like to take an image and change the scale of the image, while it is a numpy array.
For example I have this image of a coca-cola bottle:
bottle-1
Which translates to a numpy array of shape (528, 203, 3) and I want to resize that to say the size of this second image:
bottle-2
Which has a shape of (140, 54, 3).
How do I change the size of the image to a certain shape while still maintaining the original image? Other answers suggest stripping every other or third row out, but what I want to do is basically shrink the image how you would via an image editor but in python code. Are there any libraries to do this in numpy/SciPy?
Yeah, you can install opencv (this is a library used for image processing, and computer vision), and use the cv2.resize function. And for instance use:
import cv2
import numpy as np
img = cv2.imread('your_image.jpg')
res = cv2.resize(img, dsize=(54, 140), interpolation=cv2.INTER_CUBIC)
Here img is a numpy array containing the original image, whereas res is a numpy array containing the resized image. Note that dsize is given as (width, height), the reverse of the numpy shape order. An important aspect is the interpolation parameter: there are several ways to resize an image, and the choice matters especially here, since you are scaling the image down and the size of the original image is not a multiple of the size of the resized image. Possible interpolation schemes are:
INTER_NEAREST - a nearest-neighbor interpolation
INTER_LINEAR - a bilinear interpolation (used by default)
INTER_AREA - resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC - a bicubic interpolation over 4x4 pixel neighborhood
INTER_LANCZOS4 - a Lanczos interpolation over 8x8 pixel neighborhood
Like with most options, there is no "best" option in the sense that for every resize schema, there are scenarios where one strategy can be preferred over another.
While it might be possible to use numpy alone to do this, the operation is not built-in. That said, you can use scikit-image (which is built on numpy) to do this kind of image manipulation.
Scikit-Image rescaling documentation is here.
For example, you could do the following with your image:
from skimage.transform import resize
bottle_resized = resize(bottle, (140, 54))
This will take care of things like interpolation, anti-aliasing, etc. for you.
One-line numpy solution for downsampling (by 2):
smaller_img = bigger_img[::2, ::2]
And upsampling (by 2):
bigger_img = smaller_img.repeat(2, axis=0).repeat(2, axis=1)
(This assumes an HxWxC shaped image. Note that this method only allows whole-integer resizing, e.g. 2x but not 1.5x.)
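As a quick check of the two one-liners above, here is a minimal sketch on a small made-up array (the 4x6 test image is an assumption for illustration, not from the question). It only verifies shapes and that upsampling repeats the kept pixels.

```python
import numpy as np

# Hypothetical 4x6 RGB image with distinct values so the effect is easy to inspect.
img = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)

# Downsample by 2: keep every second row and column (no averaging, so aliasing is possible).
smaller = img[::2, ::2]

# Upsample by 2: repeat each row, then each column (nearest-neighbour style).
bigger = smaller.repeat(2, axis=0).repeat(2, axis=1)

print(smaller.shape)  # (2, 3, 3)
print(bigger.shape)   # (4, 6, 3)
```

Because the downsampling step simply drops pixels, fine detail can alias; the pooling and interpolation approaches elsewhere in this thread avoid that at some extra cost.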
For people coming here from Google looking for a fast way to downsample images in numpy arrays for use in Machine Learning applications, here's a super fast method (adapted from here ). This method only works when the input dimensions are a multiple of the output dimensions.
The following examples downsample from 128x128 to 64x64 (this can be easily changed).
Channels last ordering
# large image is shape (128, 128, 3)
# small image is shape (64, 64, 3)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((output_size, bin_size,
output_size, bin_size, 3)).max(3).max(1)
Channels first ordering
# large image is shape (3, 128, 128)
# small image is shape (3, 64, 64)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((3, output_size, bin_size,
output_size, bin_size)).max(4).max(2)
For grayscale images just change the 3 to a 1 like this:
Channels first ordering
# large image is shape (1, 128, 128)
# small image is shape (1, 64, 64)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((1, output_size, bin_size,
output_size, bin_size)).max(4).max(2)
This method uses the equivalent of max pooling. It's the fastest way to do this that I've found.
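The reshape trick above can be wrapped in a small helper so the factor and the pooling operation are parameters. This is only a sketch: the name block_reduce_hwc and the reducer argument are made up for illustration, and it assumes a channels-last (H, W, C) array whose height and width are multiples of the factor.

```python
import numpy as np

def block_reduce_hwc(image, factor, reducer=np.max):
    """Downsample an (H, W, C) array by an integer factor using block pooling."""
    h, w, c = image.shape
    if h % factor or w % factor:
        raise ValueError("height and width must be multiples of factor")
    # Split each spatial axis into (blocks, factor) and reduce over the two factor axes.
    blocks = image.reshape(h // factor, factor, w // factor, factor, c)
    return reducer(blocks, axis=(1, 3))

rng = np.random.default_rng(0)
large_image = rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)

small_max = block_reduce_hwc(large_image, 2)            # max pooling, as above
small_mean = block_reduce_hwc(large_image, 2, np.mean)  # average pooling (float result)
```

Average pooling tends to look closer to what an image editor produces when shrinking, while max pooling keeps the brightest value in each block.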
If anyone came here looking for a simple method to scale/resize an image in Python, without using additional libraries, here's a very simple image resize function:
#simple image scaling to (nR x nC) size
def scale(im, nR, nC):
    nR0 = len(im)     # source number of rows
    nC0 = len(im[0])  # source number of columns
    return [[ im[int(nR0 * r / nR)][int(nC0 * c / nC)]
             for c in range(nC)] for r in range(nR)]
Example usage: resizing a (30 x 30) image to (100 x 200):
import matplotlib.pyplot as plt
def sqr(x):
    return x*x
def f(r, c, nR, nC):
    return 1.0 if sqr(c - nC/2) + sqr(r - nR/2) < sqr(nC/4) else 0.0
# a red circle on a canvas of size (nR x nC)
def circ(nR, nC):
    return [[ [f(r, c, nR, nC), 0, 0]
             for c in range(nC)] for r in range(nR)]
plt.imshow(scale(circ(30, 30), 100, 200))
Output:
This works to shrink/scale images, and works fine with numpy arrays.
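If numpy is available anyway, the same nearest-neighbour mapping can be written with index arrays instead of nested list comprehensions. This is a sketch under that assumption; scale_nearest is a made-up name, and it uses the same source-index formula as the scale function above.

```python
import numpy as np

def scale_nearest(im, n_rows, n_cols):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array."""
    im = np.asarray(im)
    # Same mapping as the list version: source index = floor(target * src_size / target_size).
    rows = np.arange(n_rows) * im.shape[0] // n_rows
    cols = np.arange(n_cols) * im.shape[1] // n_cols
    # Broadcasting the row and column indices selects an (n_rows, n_cols) grid of pixels.
    return im[rows[:, None], cols]

src = np.arange(12).reshape(3, 4)
out = scale_nearest(src, 6, 8)
print(out.shape)  # (6, 8)
```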
For people who want to resize (interpolate) a batch of numpy arrays, PyTorch provides a fast function named torch.nn.functional.interpolate. Just remember to use np.transpose first to change the layout from batch x H x W x 3 (channels last) to batch x 3 x H x W (channels first).
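The layout change mentioned above can be sketched with numpy alone. The batch shape here is a made-up example; the resize call itself is left out so the snippet does not depend on PyTorch being installed.

```python
import numpy as np

# Hypothetical batch of 8 RGB images, 32 rows x 48 columns, channels last.
batch_nhwc = np.zeros((8, 32, 48, 3), dtype=np.float32)

# Move the channel axis to position 1: (N, H, W, C) -> (N, C, H, W).
batch_nchw = np.transpose(batch_nhwc, (0, 3, 1, 2))

# np.transpose returns a strided view; a contiguous copy is often convenient
# before handing the data to another library.
batch_nchw = np.ascontiguousarray(batch_nchw)

# After resizing in channels-first layout, transpose back: (N, C, H, W) -> (N, H, W, C).
restored = np.transpose(batch_nchw, (0, 2, 3, 1))

print(batch_nchw.shape)  # (8, 3, 32, 48)
```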
SciPy's imresize() was another resize method, but it was deprecated and then removed in SciPy v1.3.0. SciPy refers to the PIL image resize method instead: Image.resize(size, resample=0)
size – The requested size in pixels, as a 2-tuple: (width, height).
resample – An optional resampling filter. This can be one of PIL.Image.NEAREST (use nearest neighbour), PIL.Image.BILINEAR (linear interpolation), PIL.Image.BICUBIC (cubic spline interpolation), or PIL.Image.LANCZOS (a high-quality downsampling filter). If omitted, or if the image has mode “1” or “P”, it is set to PIL.Image.NEAREST.
Link here:
https://pillow.readthedocs.io/en/3.1.x/reference/Image.html#PIL.Image.Image.resize
Stumbled back upon this after a few years. It looks like the answers so far fall into one of a few categories:
Use an external library. (OpenCV, SciPy, etc)
Use Power-of-Two Scaling
Use Nearest Neighbor
These solutions are all respectable, so I offer this only for completeness. It has three advantages over the above: (1) it will accept arbitrary resolutions, even non-power-of-two scaling factors; (2) it uses pure Python+Numpy with no external libraries; and (3) it interpolates all the pixels for an arguably 'nicer-looking' result.
It does not make good use of Numpy and, thus, is not fast, especially for large images. If you're only rescaling smaller images, it should be fine. I offer this under Apache or MIT license at the discretion of the user.
import math
import numpy
def resize_linear(image_matrix, new_height: int, new_width: int):
    """Perform a pure-numpy linear-resampled resize of a single-channel image."""
    output_image = numpy.zeros((new_height, new_width), dtype=image_matrix.dtype)
    original_height, original_width = image_matrix.shape
    inv_scale_factor_y = original_height/new_height
    inv_scale_factor_x = original_width/new_width
    # This is an ugly serial operation.
    for new_y in range(new_height):
        for new_x in range(new_width):
            # If you had a color image, you could repeat this with all channels here.
            # Find sub-pixels data:
            old_x = new_x * inv_scale_factor_x
            old_y = new_y * inv_scale_factor_y
            x_fraction = old_x - math.floor(old_x)
            y_fraction = old_y - math.floor(old_y)
            # Sample four neighboring pixels:
            left_upper = image_matrix[math.floor(old_y), math.floor(old_x)]
            right_upper = image_matrix[math.floor(old_y), min(image_matrix.shape[1] - 1, math.ceil(old_x))]
            left_lower = image_matrix[min(image_matrix.shape[0] - 1, math.ceil(old_y)), math.floor(old_x)]
            right_lower = image_matrix[min(image_matrix.shape[0] - 1, math.ceil(old_y)), min(image_matrix.shape[1] - 1, math.ceil(old_x))]
            # Interpolate horizontally:
            blend_top = (right_upper * x_fraction) + (left_upper * (1.0 - x_fraction))
            blend_bottom = (right_lower * x_fraction) + (left_lower * (1.0 - x_fraction))
            # Interpolate vertically (y_fraction is the weight of the lower row):
            final_blend = (blend_bottom * y_fraction) + (blend_top * (1.0 - y_fraction))
            output_image[new_y, new_x] = final_blend
    return output_image
Sample rescaling:
Original:
Downscaled by Half:
Upscaled by one and one quarter:
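For larger images, the same bilinear idea can be expressed with array indexing instead of a per-pixel Python loop. The following is a sketch rather than a drop-in replacement: resize_bilinear is a made-up name, it uses the same coordinate mapping (output index times the inverse scale factor), accepts an optional channel axis, and returns floats that the caller can round or cast.

```python
import numpy as np

def resize_bilinear(image, new_height, new_width):
    """Bilinear resize of an (H, W) or (H, W, C) array using array indexing."""
    image = np.asarray(image)
    h, w = image.shape[:2]
    # Map each output coordinate back to a (fractional) source coordinate.
    ys = np.arange(new_height) * (h / new_height)
    xs = np.arange(new_width) * (w / new_width)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)  # clamp the lower/right neighbours at the border
    x1 = np.minimum(x0 + 1, w - 1)
    # Fractional weights, shaped to broadcast over rows, columns and optional channels.
    extra = (1,) * (image.ndim - 2)
    wy = (ys - y0).reshape(-1, 1, *extra)
    wx = (xs - x0).reshape(1, -1, *extra)
    img = image.astype(np.float64)
    top = img[y0[:, None], x0] * (1 - wx) + img[y0[:, None], x1] * wx
    bottom = img[y1[:, None], x0] * (1 - wx) + img[y1[:, None], x1] * wx
    return top * (1 - wy) + bottom * wy

g = np.array([[0.0, 10.0], [20.0, 30.0]])
up = resize_bilinear(g, 4, 4)
print(up[1, 1])  # 15.0
```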
Are there any libraries to do this in numpy/SciPy
Sure. You can do this without OpenCV, scikit-image or PIL.
Image resizing is basically mapping the coordinates of each pixel from the original image to its resized position.
Since the coordinates of an image must be integers (think of it as a matrix), if the mapped coordinate has decimal values, you should interpolate the pixel value to approximate it to the integer position (e.g. getting the nearest pixel to that position is known as Nearest neighbor interpolation).
All you need is a function that does this interpolation for you. SciPy has interpolate.interp2d.
You can use it to resize an image in numpy array, say arr, as follows:
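import numpy as np
from scipy.interpolate import interp2d  # deprecated since SciPy 1.10 and removed in recent releases
H, W = arr.shape[:2]  # numpy shape is (rows, columns) = (height, width)
new_W, new_H = (600, 300)
grid = lambda n: np.linspace(0, 1, n)
f = interp2d(grid(W), grid(H), arr, kind="linear")
new_arr = f(grid(new_W), grid(new_H))  # shape (new_H, new_W)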
Of course, if your image is RGB, you have to perform the interpolation for each channel.
If you would like to understand more, I suggest watching Resizing Images - Computerphile.
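As an alternative that needs only numpy, linear interpolation on the same normalised [0, 1] grid can be done one axis at a time with np.interp, which also takes care of colour channels. This is a sketch under those assumptions; resize_linear_separable is a made-up name, and it is not meant to reproduce any particular SciPy function exactly.

```python
import numpy as np

def resize_linear_separable(arr, new_h, new_w):
    """Linear resize of an (H, W) or (H, W, C) array on a normalised [0, 1] grid."""
    arr = np.asarray(arr, dtype=np.float64)
    h, w = arr.shape[:2]
    src_y, src_x = np.linspace(0, 1, h), np.linspace(0, 1, w)
    dst_y, dst_x = np.linspace(0, 1, new_h), np.linspace(0, 1, new_w)
    # Interpolate along columns (axis 1), then along rows (axis 0); channels ride along.
    tmp = np.apply_along_axis(lambda v: np.interp(dst_x, src_x, v), 1, arr)
    return np.apply_along_axis(lambda v: np.interp(dst_y, src_y, v), 0, tmp)

small = np.array([[0.0, 10.0], [20.0, 30.0]])
bigger = resize_linear_separable(small, 3, 3)
print(bigger[1])  # [10. 15. 20.]
```

Because both grids run from 0 to 1, the corner pixels of the input and output coincide, which differs slightly from the index-scaling mapping used in the loop-based answers.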
import cv2
import numpy as np
image_read = cv2.imread('filename.jpg', 0)  # flag 0 reads the image as grayscale
original_image = np.asarray(image_read)
new_rows, new_cols = 452, 452
# Keep the source dtype so the result displays correctly
resize_image = np.zeros((new_rows, new_cols), dtype=original_image.dtype)
for r in range(new_rows):
    for c in range(new_cols):
        # Nearest-neighbour lookup of the matching source pixel
        src_r = int(r * original_image.shape[0] / new_rows)
        src_c = int(c * original_image.shape[1] / new_cols)
        resize_image[r, c] = original_image[src_r, src_c]
print("Resized image size : ", resize_image.shape)
cv2.imshow("resized", resize_image)  # imshow needs a window name as its first argument
cv2.waitKey(0)
I want to make a simple program which outputs a video as a webcam, but the cam wants an RGBA numpy array and I only have RGB from the video. How can I convert the 3-dimensional array to 4 dimensions?
You're actually not converting a 3-dimensional array to a 4-dimensional array. You're changing the size of one of the dimensions from three to four.
Let's say you have an N x M x 3 image. You then need to:
temp = np.zeros((N, M, 4), dtype=image.dtype)
temp[:, :, 0:3] = image
temp[:, :, 3] = alpha  # whatever default alpha you choose to use
Generalize as you see fit.
Assuming your existing array is shaped (xsize, ysize, 3) and you want to create alpha as a 4th entry all filled with 1, you should be able to do something like
alpha = np.ones((*rgb.shape[0:2], 1))
rgba = np.concatenate((rgb, alpha), axis=2)
If you wanted a different uniform alpha value you could use np.full with that value instead of np.ones, but normally when converting RGB to RGBA you want fully opaque.
You can np.dstack your original im with np.ones(im.shape[:2])
new_im = np.dstack((im, np.ones(im.shape[:2])))
Update: this is equivalent to @hobbs's np.concatenate(..., axis=2) solution.
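One detail worth checking with the approaches above is the dtype: np.ones and np.zeros default to float64, so combining them with a uint8 frame upcasts the result. Below is a minimal sketch that keeps the input dtype. The helper name add_alpha and the "255 means opaque for integer images" convention are assumptions for illustration.

```python
import numpy as np

def add_alpha(rgb, alpha=None):
    """Append an alpha channel to an (H, W, 3) array, keeping its dtype."""
    if alpha is None:
        # Assumed convention: 255 is opaque for integer images, 1.0 for float images.
        alpha = 255 if np.issubdtype(rgb.dtype, np.integer) else 1.0
    alpha_plane = np.full(rgb.shape[:2] + (1,), alpha, dtype=rgb.dtype)
    return np.concatenate((rgb, alpha_plane), axis=2)

frame = np.zeros((4, 5, 3), dtype=np.uint8)  # hypothetical video frame
rgba = add_alpha(frame)
print(rgba.shape, rgba.dtype)  # (4, 5, 4) uint8
```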
Maybe try something like these: (import numpy as np)
arr # shape (n_bands, y_pixels, x_pixels)
swapped = np.moveaxis(arr, 0, 2) # shape (y_pixels, x_pixels, n_bands)
arr4d = np.expand_dims(swapped, 0) # shape (1, y_pixels, x_pixels, n_bands)
I'm currently working on normalizing CT scans (x, y, layer). Normalizing the first two dimensions is simple using cv2.resize, but the third dimension is harder. My idea is to flatten the first two dimensions to get a 2D numpy array. If I reshape each layer to (x * y) and then reshape it back to (x, y), I get a completely different image: I have an image of a lung in the beginning and lines of different gray values afterwards.
test = cv2.resize(img, (img.shape[0] * img.shape[1], 1), interpolation=cv2.INTER_LINEAR)
test = cv2.resize(test, (159, 159), interpolation=cv2.INTER_LINEAR)
self.print_prediction(test, cv2.resize(temp2_masks[:, 0], (159, 159)),
color=False, shape=(159, 159))
I'm sure it's some kind of simple mistake, but I don't see it. So I would be very grateful for help.
The cv2.resize function does not reshape your array.
It actually resizes the image. Since dsize is given as (width, height), your first line stretches the image to x * y pixels wide while squashing it down to a single row, interpolating along the way. The values are not preserved at all.
Use numpy.reshape to reshape your arrays instead.
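A minimal sketch of the reshape round trip, using a small made-up volume in place of the CT data: flattening the two in-plane axes and restoring them leaves every value untouched.

```python
import numpy as np

# Hypothetical CT volume laid out as (x, y, layer).
volume = np.arange(4 * 5 * 3, dtype=np.float32).reshape(4, 5, 3)
x, y, layers = volume.shape

# Flatten the two in-plane axes: (x, y, layer) -> (x * y, layer). No values change.
flat = volume.reshape(x * y, layers)

# ... operate on `flat` here, e.g. resample along the layer axis ...

# Restore the original layout with the inverse reshape.
restored = flat.reshape(x, y, layers)

print(flat.shape)  # (20, 3)
```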
I have a numpy array of shape (12,224,224). This is 12 images of size (224, 224). When I had a single image, this was simple. The image was of size (x,y). For example, x is an image of size (400,400), for which I could use view_as_blocks like this:
from skimage.util import view_as_blocks as vablks
xx = vablks(x, block_shape=(8,8))
This would result in a block of shape (50,50,8,8).
Now I would like to know how to apply this when I have a list of images. Either I lose shape, that is my 12 images are combined into one (224,224) block broken down into (28,28,8,8), or I run into a ValueError. Here is the code I tried to use for iterating over the 12 images and viewing the (224,224) shaped images
xx = []
for item_ in x:
    xx.append(blockSplitter(item_))
where x is a list of images.
Here is the error:
ValueError: 'block_shape' is not compatible with 'arr_in'
Overall, I would like to know how to view the images as blocks of 8x8 without losing the images.
Help, Please and Thank You.
You have at least two options:
1) Convert the list to an array, as suggested by the commenter above. Then use view_as_blocks with the correct parameters:
import numpy as np
from skimage.util import view_as_blocks
images = [np.zeros((50, 50)) for i in range(10)]
images = np.array(images)  # shape (10, 50, 50)
all_blocks = view_as_blocks(images, block_shape=(1, 10, 10)).squeeze()  # shape (10, 5, 5, 10, 10)
2) Convert each item in the list to a windowed view, and then convert the end result to an array:
import numpy as np
from skimage.util import view_as_blocks
images = [np.zeros((50, 50)) for i in range(10)]
image_blocks = [view_as_blocks(image, block_shape=(10, 10)) for image in images]
all_blocks = np.array(image_blocks)  # shape (10, 5, 5, 10, 10)
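For the (12, 224, 224) stack from the question, the same block layout can also be produced with plain numpy by reshaping and reordering axes. This is a sketch, not the scikit-image implementation: stack_as_blocks is a made-up name, and it assumes both image sides are multiples of the block size.

```python
import numpy as np

def stack_as_blocks(images, block):
    """View an (N, H, W) stack as (N, H // block, W // block, block, block)."""
    n, h, w = images.shape
    if h % block or w % block:
        raise ValueError("image sides must be multiples of the block size")
    tiled = images.reshape(n, h // block, block, w // block, block)
    # Bring the two block-index axes together, followed by the two in-block axes.
    return tiled.transpose(0, 1, 3, 2, 4)

images = np.arange(12 * 224 * 224).reshape(12, 224, 224)
blocks = stack_as_blocks(images, 8)
print(blocks.shape)  # (12, 28, 28, 8, 8)
```

Here blocks[i, r, c] is the 8x8 tile of image i whose top-left corner is at row 8 * r, column 8 * c, so the 12 images stay separate along the first axis.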
I'm really puzzled by the way of indexing a numpy multidimensional array. My goal is to crop a region from an image I loaded using opencv.
Loading the image works great:
import numpy as np
import cv2
img = cv2.imread(start_filename)
print img.shape
shape is displayed as
(2000L, 4096L, 3L)
Now I want to cut a part from the image which ranges from pixels 550 to 1550 in the first dimension and only consists of the last 782 pixels of the second dimension. I tried
img=img[550:1550][-782:][:]
print img.shape
Now the shape is displayed as
(782L, 4096L, 3L)
I'm confused, what's the correct way of indexing for the crop operation?
The correct way of cropping image is using slicing technique:
import cv2
img = cv2.imread("lenna.png")
crop_img = img[200:400, 100:300] # rows (y) 200 to 400, columns (x) 100 to 300, i.e. x, y, w, h = 100, 200, 200, 200
# NOTE: its img[y: y + h, x: x + w] and *not* img[x: x + w, y: y + h]
In your case, the final cropped image may be reproduced as:
crop_img=img[550:1550, -782:]
print(crop_img.shape)
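To see why chaining brackets does not crop the second dimension, here is a small sketch with a zero-filled stand-in for the loaded image (the crop helper and its argument names are made up for illustration). Each extra pair of brackets slices the first axis of the previous result again, whereas a single bracket with comma-separated slices addresses one axis per slice.

```python
import numpy as np

def crop(img, x, y, w, h):
    """Return the w-by-h region whose top-left corner is at column x, row y."""
    return img[y:y + h, x:x + w]

img = np.zeros((2000, 4096, 3), dtype=np.uint8)  # stand-in for the loaded image

# Chained indexing: both slices act on axis 0 (rows), so the columns are untouched.
chained = img[550:1550][-782:]
print(chained.shape)  # (782, 4096, 3)

# One bracket, one slice per axis: rows 550..1549, last 782 columns.
region = crop(img, x=img.shape[1] - 782, y=550, w=782, h=1000)
print(region.shape)   # (1000, 782, 3)
```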
As mentioned in other answers you could use img[550:1550, -782:, :], but this gives you only a view of the original array: it shares memory with img, so modifying the crop also modifies the original image. If you want to modify the crop independently, you could use NumPy's ix_ function for indexing, which returns a copy (calling .copy() on the slice works as well).
img = img[np.ix_(range(550, 1550), range(img.shape[1]-782, img.shape[1]))]
# or
img = img[np.ix_(range(550, 1550), range(img.shape[1]-782, img.shape[1]), range(3))]
After this your shape will look like:
(1000, 782, 3)
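The difference between a slice and an ix_ (or .copy()) result can be checked directly with np.shares_memory. This is a small sketch on a zero-filled stand-in image: a basic slice is a writable view onto the original buffer, while the other two own their data.

```python
import numpy as np

img = np.zeros((2000, 4096, 3), dtype=np.uint8)  # stand-in for the loaded image

view = img[550:1550, -782:]          # basic slicing: shares memory with img
copy = img[550:1550, -782:].copy()   # independent buffer
fancy = img[np.ix_(range(550, 1550), range(img.shape[1] - 782, img.shape[1]))]

view[0, 0] = 255   # also changes img[550, 3314]
copy[1, 1] = 255   # leaves img untouched

print(np.shares_memory(view, img))   # True
print(np.shares_memory(fancy, img))  # False
```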