I want to make a simple program that outputs a video as a webcam, but the camera expects an RGBA NumPy array and I only have RGB frames from the video. How can I convert the 3-dimensional array to 4 dimensions?
You're actually not converting a 3-dimensional array to a 4-dimensional array. You're changing the size of one of the dimensions from three to four.
Let's say you have an NxMx3 image. You then need to:
temp = np.zeros((N, M, 4))
temp[:,:,0:3] = image
temp[:,:,3] = alpha  # whatever default alpha you choose to use
Generalize as you see fit.
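As a minimal sketch of the steps above, assuming the frames are uint8 arrays and that 255 means fully opaque (the function name `rgb_to_rgba` and the frame size are made up for illustration). Note that `np.zeros` defaults to float64, so the sketch passes the input dtype explicitly:

```python
import numpy as np

def rgb_to_rgba(image, alpha=255):
    """Return a copy of an (N, M, 3) image with a constant alpha channel appended."""
    n, m = image.shape[:2]
    # Match the input dtype; np.zeros would otherwise default to float64.
    rgba = np.zeros((n, m, 4), dtype=image.dtype)
    rgba[:, :, 0:3] = image  # copy the colour channels
    rgba[:, :, 3] = alpha    # fill the alpha channel with one value
    return rgba

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # hypothetical video frame
rgba_frame = rgb_to_rgba(frame)
print(rgba_frame.shape, rgba_frame.dtype)
```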
Assuming your existing array is shaped (xsize, ysize, 3) and you want to create alpha as a 4th entry all filled with 1, you should be able to do something like
alpha = np.ones((*rgb.shape[0:2], 1))
rgba = np.concatenate((rgb, alpha), axis=2)
If you wanted a different uniform alpha value you could use np.full with that value instead of np.ones, but normally when converting RGB to RGBA you want fully opaque.
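For illustration, the np.full variant might look like the sketch below. The array size and the alpha value of 128 are arbitrary assumptions, and the dtype is passed explicitly so the result stays uint8 rather than being promoted:

```python
import numpy as np

rgb = np.zeros((4, 6, 3), dtype=np.uint8)  # stand-in for a real frame

# Alpha plane with a trailing axis of length 1, same dtype as the image
alpha = np.full((*rgb.shape[:2], 1), 128, dtype=rgb.dtype)
rgba = np.concatenate((rgb, alpha), axis=2)
print(rgba.shape, rgba.dtype)
```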
You can np.dstack your original im with np.ones(im.shape[:2])
new_im = np.dstack((im, np.ones(im.shape[:2])))
Update: this is equivalent to @hobbs's solution using np.concatenate(..., axis=2).
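A quick way to convince yourself of that equivalence, sketched with a small random array (the sizes are arbitrary). np.dstack promotes the 2D alpha plane to shape (M, N, 1) itself, while np.concatenate needs that axis added by hand:

```python
import numpy as np

im = np.random.random((5, 7, 3))
alpha = np.ones(im.shape[:2])

via_dstack = np.dstack((im, alpha))
# concatenate requires matching dimensions, so add the trailing axis explicitly
via_concat = np.concatenate((im, alpha[:, :, np.newaxis]), axis=2)

same = np.array_equal(via_dstack, via_concat)
print(via_dstack.shape, same)
```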
Maybe try something like these: (import numpy as np)
arr # shape (n_bands, y_pixels, x_pixels)
swapped = np.moveaxis(arr, 0, 2) # shape (y_pixels, x_pixels, n_bands)
arr4d = np.expand_dims(swapped, 0) # shape (1, y_pixels, x_pixels, n_bands)
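The same calls on a concrete array, as a sketch with made-up sizes (3 bands, 4 rows, 5 columns), so the shape comments can be checked:

```python
import numpy as np

arr = np.zeros((3, 4, 5))           # (n_bands, y_pixels, x_pixels)
swapped = np.moveaxis(arr, 0, 2)    # bands moved to the last axis
arr4d = np.expand_dims(swapped, 0)  # leading axis of length 1 added
print(swapped.shape, arr4d.shape)
```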
I have an array with 12 bands:
array
(53, 44, 12)
I want to create RGB image from this array.
In order to do that I know that I need to slice the 3,2 and the 1 channels from the array and to create an image from them. In order to take those bands I have done this:
red=array[:,:,3]
green=array[:,:,2]
blue=array[:,:,1]
#stack together:
rgb=np.stack((red,green,blue))
The problem is that whenever I try to display the image as RGB I get an error:
fig, ax = plt.subplots(figsize=(20,10))
ax.matshow(rgb)
TypeError: Invalid shape (3, 44, 12) for image data
My first idea was to change the order of the axes, as I thought it was taking the 3 as the number of rows, so I used np.moveaxis:
rgb=np.moveaxis(rgb, 0, -1).shape
rgb.shape
>>>
(44,12,3)
but when I tried matshow again, I got another error:
TypeError: Invalid shape (3,) for image data
I'm not sure where my mistake is, as using matshow in a similar way has worked in the past.
My end goal is to be able to plot the three bands as rgb image.
You need to stack along the 3rd axis:
rgb = np.dstack((red,green,blue))
or
rgb = np.stack((red,green,blue), axis=2)
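A small sketch of the difference, using a random array with the shape given in the question. By default np.stack puts the new axis first, which is the layout the plotting call rejected. Separately, the question's moveaxis line assigns the result's .shape back to rgb, so rgb becomes a tuple rather than an array, which would seem to explain the second error:

```python
import numpy as np

array = np.random.random((53, 44, 12))
red, green, blue = array[:, :, 3], array[:, :, 2], array[:, :, 1]

channels_first = np.stack((red, green, blue))   # new axis at position 0
channels_last = np.dstack((red, green, blue))   # new axis at the end

# Moving the axis also works, as long as the array itself is kept (no .shape)
fixed = np.moveaxis(channels_first, 0, -1)
print(channels_first.shape, channels_last.shape, fixed.shape)
```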
As Mark Setchell already pointed out, you can do the same much more easily with standard slicing. If the 12 bands of your array in fact represent 4 images (3 channels each), you could also split the array along axis 2 into 4 arrays like this:
a1, a2, a3, a4 = np.split(array, 4, 2)
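To see what the split produces, here is a sketch on a random array of the same shape as in the question; each piece keeps three consecutive bands:

```python
import numpy as np

array = np.random.random((53, 44, 12))
a1, a2, a3, a4 = np.split(array, 4, 2)  # 4 equal sections along axis 2

print(a1.shape)
# a2 holds bands 3, 4 and 5 of the original array
matches = np.array_equal(a2, array[:, :, 3:6])
```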
If you have an image with 12 bands like this:
image[53,44,12]
and you want the first three bands, just use:
b0b1b2 = image[..., :3]
Check what you have with:
print(b0b1b2.shape)
(53, 44, 3)
This is Numpy slicing.
If you want bands 0, 4 and 5, use
b0b4b5 = image[..., [0,4,5]]
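One detail worth knowing when choosing between the two forms, sketched here on a dummy array: the plain slice is a view onto the original data, while indexing with a list of bands makes a copy.

```python
import numpy as np

image = np.arange(53 * 44 * 12).reshape(53, 44, 12)  # dummy 12-band image

b0b1b2 = image[..., :3]          # basic slicing: a view
b0b4b5 = image[..., [0, 4, 5]]   # list indexing: a copy

slice_is_view = np.shares_memory(image, b0b1b2)
list_is_view = np.shares_memory(image, b0b4b5)
print(b0b1b2.shape, b0b4b5.shape, slice_is_view, list_is_view)
```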
I'm trying to use the "compare_ssim" function. I currently have two 2xN matrices of x,y coordinates where the first row is all the x coordinates and the second row is all the y coordinates of each of the two images. How can I calculate the SSIM for these two images (if there is a way to do so)
For example I have:
X = np.array([[1,2,3], [4,5,6]])
Y = np.array([[3,4,5],[5,6,7]])
compare_ssim(X,Y)
But I am getting the error
ValueError: win_size exceeds image extent. If the input is a multichannel (color) image, set multichannel=True.
I'm not sure if I am missing a parameter or if I should convert the matrices in such a way that this function works. Or is there a way that I am supposed to convert my coordinates to a grayscale matrix? I'm a bit confused about what the matrices passed to the function should look like. I know that they are supposed to be ndarrays, but type(X) and type(Y) are both numpy.ndarray.
Since you haven't mentioned which framework/library you are using, I am going with the assumption that you are using skimage's compare_ssim.
The error in question is due to the shape of your inputs. You can find more details here.
TL;DR: compare_ssim expects images in (H, W, C) dimensions, but your input images have a shape of (2, 3), so the function cannot tell which dimension to treat as the channel dimension. When multichannel=True, the last dimension is treated as the channel dimension.
There are 3 key problems with your code,
compare_ssim expects images as input, so your X and Y matrices should have dimensions (H, W, C) and not (2, 3).
They should be of float datatype.
Below I have shown a bit of demo code (note: since skimage v0.16, compare_ssim has been moved to skimage.metrics.structural_similarity)
import numpy as np
from skimage.metrics import structural_similarity
img1 = np.random.randint(0, 255, size=(200, 200, 3)).astype(np.float32)
img2 = np.random.randint(0, 255, size=(200, 200, 3)).astype(np.float32)
ssim_score = structural_similarity(img1, img2, multichannel=True) #score: 0.0018769083894301646
ssim_score = structural_similarity(img1, img1, multichannel=True) #score: 1.0
I can declare a 3D array like this:
# a 3D array, shape-(2, 2, 2)
array_3d = np.array([[[0, 1],[2, 3]], [[4, 5],[6, 7]]])
So if I have a 10x10-pixel image with 3 RGB channels, I would expect image.shape to be (3, 10, 10).
But I always see image.shape equal to (10, 10, 3), and I don't understand why.
Thanks for your attention.
Usually in numpy and matplotlib the rgb channels are in the last axis. This is just a convention, so you can do little about this. If you use a program that uses the other convention (channels first), you can transform the image with:
channels_first_im = np.moveaxis(channels_last_im, -1, 0)
and the other way:
channels_last_im = np.moveaxis(channels_first_im, 0, -1)
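A round trip on a dummy 10x10 RGB image, as a sketch for checking the shapes in both directions. np.moveaxis takes the source axis first and the destination second:

```python
import numpy as np

channels_last_im = np.zeros((10, 10, 3))

# last axis (channels) moved to the front
channels_first_im = np.moveaxis(channels_last_im, -1, 0)
# and back: first axis moved to the end
restored = np.moveaxis(channels_first_im, 0, -1)

print(channels_first_im.shape, restored.shape)
```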
If you're confused about why, by convention, image arrays have shape (N, M, 3) instead of (3, N, M), let's look at how indexing would work in both of those scenarios.
Let's assume we have an array called image that represents a random color image with a width and height of 100 pixels, and let's try to index it to access the value of the pixel at index (50, 50).
Channels First
import numpy as np
image = np.random.random((3, 100, 100)) #image.shape == (3, 100, 100)
pixel = image[:, 50, 50] #pixel.shape == (3,)
Channels Last
import numpy as np
image = np.random.random((100, 100, 3)) #image.shape == (100, 100, 3)
pixel = image[50, 50] #pixel.shape == (3,)
Having the channels as the last dimension of the array means that the information for an individual pixel is easier to index, whereas in the first case we need to specify that we want the entire first dimension every time. The two layouts hold the same data, but leaving the channels last allows us to be less verbose in how we index the array.
I have a list of PIL images: p0, p1, ..., p85999 (a total of 86000 of them). They are all RGB, of size 30x30px.
I need to convert them to normalized numpy arrays, I did the following:
[np.asarray(r).astype('float32') / 255.0 for r in images]
where r is a PIL image.
This gives a list of numpy arrays.
However, these arrays are sometimes of shape (30,30,3) and sometimes of shape (30,30).
I want them always to be of shape (30,30,3).
I'm guessing numpy does this for performance reasons (when RGB is not needed, eg. white images?).
Anyway, how to get the desired result - get all numpy arrays to be of size (30,30,3)?
Also, ideally I would want my final numpy array to be of size (30, 30, 3, 86000). Is there a shortcut to create such an array straight from PIL images?
"I'm guessing numpy does this for performance reasons"
Numpy has nothing to do with it, this is your PIL Image having one channel only.
The simplest solution is to just convert everything to RGB:
ims = [np.asarray(r.convert('RGB')).astype('float32') / 255.0 for r in images]
If you then call np.asarray(ims), you'll obtain an array of shape [N,30,30,3] where N is the number of images, which you can then transpose to your desired ordering.
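The stacking and reordering step could be sketched as follows. Random arrays stand in for the converted PIL images, and only 5 of them are used to keep the example small; np.moveaxis is one way to do the transpose mentioned above:

```python
import numpy as np

# Stand-ins for the normalised (30, 30, 3) arrays produced from the PIL images
ims = [np.random.random((30, 30, 3)).astype('float32') for _ in range(5)]

batch = np.asarray(ims)            # image index comes first
final = np.moveaxis(batch, 0, -1)  # image index moved to the last axis

print(batch.shape, final.shape)
```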
I have a greyscale image, represented by a 2D array of integers, shape (1000, 1000).
I then use sklearn.feature_extraction.image.extract_patches_2d() to generate an array of 3x3 'patches' from this image, resulting in an array of shape (1000000, 3, 3), as there is one 3x3 patch for each of the 1 million pixels in the original image.
I reshape this to (1000, 1000, 3, 3), which is a 1000x1000 array of 3x3 arrays, one 3x3 array for each pixel in the original image.
I now want to effectively subtract the 2D array from the 4D array. I have already found a method to do this, but I would like to make one using vectorisation.
I currently iterate through each pixel and subtract the value there from the 3x3 array at the same index. This is a little bit slow.
This is what currently loads images, formats the arrays before hand, and then performs this subtraction.
from PIL import Image, ImageOps
from skimage import io
from sklearn.feature_extraction import image
import numpy
jitter = 1
patchsize = (jitter*2)+1
#load image as greyscale image using PIL
original = load_image_greyscale(filename)
#create a padded version of the image so that 1000x1000 patches are made
#instead of 998x998
padded = numpy.asarray(ImageOps.expand(original,jitter))
#extract these 3x3 patches using sklearn
patches = image.extract_patches_2d(padded,(patchsize,patchsize))
#convert image to numpy array
pixel_array = numpy.asarray(original)
#then reshape the array of patches so it matches array_image
patch_array = numpy.reshape(patches, (pixel_array.shape[0],pixel_array.shape[1],patchsize,patchsize))
#create a copy for results
patch_array_copy = numpy.copy(patch_array)
#iterate over each 3x3 array in the patch array and subtract the pixel value
#at the same index in the pixel array
for x in range(pixel_array.shape[0]):
    for y in range(pixel_array.shape[1]):
        patch_array_copy[x,y] = patch_array[x,y] - pixel_array[x,y]
I would like a way to perform the final step in the for loop using matrix operations.
I would also like to extend this at some point to work with RGB images, effectively making it a subtraction of an array with shape (1000,1000,3) from an array with shape (1000,1000,3,3,3). But I'm trying to go one step at a time here.
Any help or tips or suggestions or links to helpful resources would be greatly appreciated.
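One possible way to vectorise the loop is NumPy broadcasting: adding two trailing axes of length 1 to the pixel array lets it be subtracted from every 3x3 patch at once. The sketch below uses small random arrays in place of the real image and patches, and casts to a signed type first on the assumption that the data is uint8, since unsigned subtraction would wrap around instead of going negative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Small stand-ins for the real (1000, 1000) image and (1000, 1000, 3, 3) patches
pixel_array = rng.integers(0, 256, size=(6, 8), dtype=np.uint8)
patch_array = rng.integers(0, 256, size=(6, 8, 3, 3), dtype=np.uint8)

# Signed copies so that differences can be negative
pixels = pixel_array.astype(np.int16)
patches = patch_array.astype(np.int16)

# (6, 8, 1, 1) broadcasts against (6, 8, 3, 3)
vectorised = patches - pixels[:, :, np.newaxis, np.newaxis]

# Reference result from the original double loop
looped = np.empty_like(patches)
for x in range(pixels.shape[0]):
    for y in range(pixels.shape[1]):
        looped[x, y] = patches[x, y] - pixels[x, y]

print(np.array_equal(vectorised, looped))
```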