Trouble Reshaping Nd-array - python

I am working in Python and I have an image array which is of shape [100,3,200,1200]. The array is of format Number_of_images x Channels x Height x Width. I want to split the images along the width direction into 6 images of shape 200x200 and add that as different channels. Ultimately, I would like to receive an array of shape [100,18,200,200].
I've attempted to use the reshape function, but it is not working as expected. I tried the following:
a = np.reshape(a, (100, 18, 200, 200))
When I plot each image, I notice that it is not cropping the image the way I wanted it to.

First reshape to make the splits:
a = np.reshape(a, (100, 3, 200, 6, 200))
Then move the split axis next to the channel axis:
a = np.moveaxis(a, 3, 2)
Then merge those two axes:
a = np.reshape(a, (100, 18, 200, 200))
In this case, the 18 channels would be sorted as:
[red-split1, red-split2, red-split3, red-split4, red-split5, red-split6,
green-split1, ..., green-split6,
blue-split1, ..., blue-split6]
If you change the second instruction to:
a = np.moveaxis(a, 3, 1)
The channels would be sorted as:
[red-split1, green-split1, blue-split1,
...,
red-split6, green-split6, blue-split6]
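Putting it all together, here is a minimal sketch (with a random array standing in for your data) that applies the three steps and verifies the result; the assert checks that channel 1 really is the second width slice of the red channel:
import numpy as np

orig = np.random.rand(100, 3, 200, 1200)   # images x channels x height x width
a = orig.reshape(100, 3, 200, 6, 200)      # split the width into 6 blocks of 200
a = np.moveaxis(a, 3, 2)                   # move the split axis next to the channels
a = a.reshape(100, 18, 200, 200)           # merge the channel and split axes
print(a.shape)                             # (100, 18, 200, 200)
assert np.array_equal(a[0, 1], orig[0, 0, :, 200:400])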


Can't reshape color palette tuples using numpy

I have the following code, which gets the colour palette from a series of images, and I try to reshape the output using numpy's reshape. But when I try reshaping the output, I get the error: can't reshape array of size 27 into shape (3,3).
The printout of the colours array looks like this:
[(256,256,265),(256,256,265),(256,256,265),(256,256,265),(256,256,265),(256,256,265),(256,256,265),(256,256,265),(256,256,265)]
These are 9 tuples containing the colour palette, which supposedly can be reshaped into 3 x 3.
But numpy.reshape keeps saying it is 27 items and can't be reshaped into a 3 x 3 array.
My question is: how can I reshape this output into a 3 x 3 array?
The colour array I need after reshaping should look something like this:
colours = [
    [(256, 256, 265), (256, 256, 265), (256, 256, 265)],
    [(256, 256, 265), (256, 256, 265), (256, 256, 265)],
    [(256, 256, 265), (256, 256, 265), (256, 256, 265)]
]
from PIL import Image
import numpy as np

array = []
for row in range(1, 4):
    for column in range(1, 4):
        filename = '/storage/emulated/0/python/banana/banana_0' + str(row) + '_0' + str(column) + '.png'
        img = Image.open(filename)
        img.show()
        colors = img.getpixel((10, 10))
        array.append(colors)

array = np.array(array)
box_array = array.reshape(3, 3)
You need to reshape using the full destination shape; your array contains 27 elements in total.
When you do:
array = np.array(array)
you obtain a (9, 3) shaped array, so you can't reshape it to (3, 3), but you can to (3, 3, 3).
You can proceed like this:
box_array = array.reshape(3, 3, 3)
Depending on what dimension is subject to change in your array later, you can let numpy figure it out.
If, for instance, your 2nd and 3rd dimensions will always be (3, 3), then you can reshape your array as follows and numpy will automatically detect the 1st dimension:
box_array = array.reshape(-1, 3, 3)
And inversely, if your 1st and 2nd dimensions will always be (3, 3), then you can reshape your array as follows and numpy will automatically detect the 3rd dimension:
box_array = array.reshape(3, 3, -1)
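For a quick check without the image files, here is the same idea on a synthetic (9, 3) array (hypothetical data standing in for the palette tuples):
import numpy as np

# nine RGB-like tuples, as the getpixel loop above would produce
array = np.array([(i, i, i) for i in range(9)])
print(array.shape)                    # (9, 3)
print(array.reshape(3, 3, 3).shape)   # (3, 3, 3) -- full destination shape
print(array.reshape(-1, 3, 3).shape)  # (3, 3, 3) -- 1st dimension inferred
print(array.reshape(3, 3, -1).shape)  # (3, 3, 3) -- 3rd dimension inferred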

How to stack matrices with different sizes

I have a list of matrices with size (63, 32, 1, 600, 600). When I want to stack it with torch.stack(matrices).cpu().detach().numpy(), it raises the error:
"stack expects each tensor to be equal size, but got [32, 1, 600, 600] at entry 0 and [16, 1, 600, 600] at entry 62". I tried resizing, but it did not work. I appreciate any recommendations.
If I understand correctly what you're trying to do is stack the outputted mini-batches together into a single batch. My bet is that your last batch is partially filled (only has 16 elements instead of 32).
Instead of using torch.stack (which creates a new axis), I would simply concatenate with torch.cat on the batch axis (axis=0), assuming matrices is a list of torch.Tensors:
torch.cat(matrices).cpu().detach().numpy()
This works because torch.cat concatenates on axis=0 by default.
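A minimal sketch with random tensors matching the sizes from the error message (smaller spatial dimensions than 600x600 are used here just to keep it light):
import torch

# 62 full mini-batches plus one partially filled last batch
matrices = [torch.randn(32, 1, 6, 6) for _ in range(62)]
matrices.append(torch.randn(16, 1, 6, 6))

out = torch.cat(matrices)   # concatenates along axis 0 by default
print(out.shape)            # torch.Size([2000, 1, 6, 6]), i.e. 62*32 + 16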
When we have tensors that differ in size only on the first dimension, as of PyTorch v1.7.0 we can use torch.vstack() to stack them along axis 0. Using torch.stack() fails here because it expects all the tensors to be of the same shape.
Here is a reproducible illustration matching your problem description:
# sample tensors (as per your size)
In [65]: t1 = torch.randn([32, 1, 600, 600])
In [66]: t2 = torch.randn([16, 1, 600, 600])
# vertical stacking (i.e., stacking along axis 0)
In [67]: stacked = torch.vstack([t1, t2])
# check shape of output
In [68]: stacked.shape
Out[68]: torch.Size([48, 1, 600, 600])
We get 48 (32 + 16) as the size of the first dimension in the result because we're stacking the tensors along that dimension.
Note:
You can also initialize the result tensor, say stacked, by explicitly calculating the shape, and pass this tensor to the out= kwarg of torch.vstack() if you want to write the result to a specific tensor, for instance to update the values of an existing tensor (of the same shape). However, this is optional.
# calculate new shape of stacking
In [80]: newshape = (t1.shape[0] + t2.shape[0], *t1.shape[1:])
# allocate an empty tensor, filled with garbage values
In [81]: stacked = torch.empty(newshape)
# stack it along axis 0 and write the result to `stacked`
In [83]: torch.vstack([t1, t2], out=stacked)
# check shape/size
In [84]: stacked.shape
Out[84]: torch.Size([48, 1, 600, 600])

Batch-Matrix multiplication in Pytorch - Confused with the handling of the output's dimension

I have two arrays, A and B.
Array A contains a batch of RGB images, with shape:
[batch, Width, Height, 3]
whereas Array B contains coefficients needed for a "transformation-like" operation on images, with shape:
[batch, 4, 4, 3]
To put it simply, the operation for a single image is a multiplication that outputs an environment map (normalMap * Coefficients).
The output I want should hold shape:
[batch, Width, Height, 3]
I tried using torch.bmm but failed. Is this possible somehow?
I think you need to take into account that PyTorch works with the
BxCxHxW : number of mini-batches, channels, height, width
format, and also use matmul, since bmm only works with tensors of ndim/dim/rank = 3.
You may well find this online, but just in case:
batch1 = torch.randn(10, 3, 20, 10)
batch2 = torch.randn(10, 3, 10, 30)
res = torch.matmul(batch1, batch2)
res.size() # torch.Size([10, 3, 20, 30])
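And a quick check of the bmm restriction mentioned above: with the same 4-D shapes, torch.bmm raises an error while torch.matmul broadcasts over the leading batch dimensions:
import torch

batch1 = torch.randn(10, 3, 20, 10)
batch2 = torch.randn(10, 3, 10, 30)

try:
    torch.bmm(batch1, batch2)   # bmm only accepts 3-D tensors
except RuntimeError as e:
    print(e)

print(torch.matmul(batch1, batch2).shape)   # torch.Size([10, 3, 20, 30])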

How to reshape a multidimensional array to a 2D image?

I'm working on an array shaped as follows
(64, 1, 64, 64)
This is in fact one grayscale image that was split into 64 patches, each patch with 64*64px.
Now I need to rebuild it into a 512*512px image.
I've tried using
np.reshape(arr, (512, 512))
but of course the resulting image is not as expected.
How do I resolve this?
It depends on how your patches are arranged. But the first thing you could try is
image.reshape(8, 8, 64, 64).swapaxes(1, 2).reshape(512, 512)
This is assuming that the original zeroth dimension lists the patches row by row, i.e. 0-7 are the first row of patches from left to right, 8-15 the second row and so on.
The first reshape reestablishes that arrangement; after it, choosing indices i and j for axes 0 and 1 addresses the (j+1)-th patch in the (i+1)-th row.
Now comes the interesting bit. When merging axes by reshape, only adjacent dimensions can be combined, and all but the rightmost axis in each combined block will be dispersed. Since we want to keep each patch together, we have to rearrange in such a way that the current axes 2 and 3 become the rightmost members of their blocks. That is what the swapaxes does.
By now the shape is (8, 64, 8, 64) and axes 1 and 3 are the within-patch coordinates. Combining the two pairs (8, 64 -> 512 and 8, 64 -> 512) is all that's left to do.
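Here is a round-trip sketch (assuming the row-by-row patch order described above) that splits a random 512x512 image into 64x64 patches and rebuilds it, to verify the recipe:
import numpy as np

image = np.random.rand(512, 512)

# split into an (8, 8) grid of 64x64 patches, listed row by row
patches = image.reshape(8, 64, 8, 64).swapaxes(1, 2).reshape(64, 1, 64, 64)

# rebuild: reestablish the grid, swap the axes back, merge the pairs
rebuilt = patches.reshape(8, 8, 64, 64).swapaxes(1, 2).reshape(512, 512)

assert np.array_equal(rebuilt, image)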

Numpy remove a dimension from np array

I have some images I want to work with. The problem is that there are two kinds of images; both are 106 x 106 pixels, but some are in color and some are black and white:
one with only two (2) dimensions:
(106,106)
and one with three (3)
(106,106,3)
Is there a way I can strip this last dimension?
I tried np.delete, but it did not seem to work.
np.shape(np.delete(Xtrain[0], [2] , 2))
Out[67]: (106, 106, 2)
You could use NumPy's slicing (an extension of Python's built-in slice notation):
x = np.zeros( (106, 106, 3) )
result = x[:, :, 0]
print(result.shape)
prints
(106, 106)
A shape of (106, 106, 3) means you have 3 sets of things that have shape (106, 106). So in order to "strip" the last dimension, you just have to pick one of these (that's what the slicing does).
You can keep any slice you want. I arbitrarily choose to keep the 0th, since you didn't specify what you wanted. So, result = x[:, :, 1] and result = x[:, :, 2] would give the desired shape as well: it all just depends on which slice you need to keep.
If you have more dimensions, this might help:
pred_mask[0, ...]   # remove the first dim
pred_mask[..., 0]   # remove the last dim
Just take the mean value over the colors dimension (axis=2):
Xtrain_monochrome = Xtrain.mean(axis=2)
When the shape of your array is (106, 106, 3), you can visualize it as a table with 106 rows and 106 columns, where each data point is an array of 3 numbers that we can represent as [x, y, z]. Therefore, if you want to get the dimensions (106, 106), you must make the data points in your table single numbers rather than arrays. You can achieve this by extracting the x-, y- or z-component of each data point, or by applying a function that aggregates the three components, like the mean, sum, max, etc. You can extract any component just like @Matt Messersmith suggested above.
Well, you should be careful when you are trying to reduce the dimensions of an image.
An image is normally a 3-D matrix that contains the RGB values of each pixel. If you want to reduce it to 2-D, what you are really doing is converting a colored RGB image into a grayscale image.
There are several ways to do this: you can take the maximum of the three values, the min, the average, the sum, etc., depending on the accuracy you want in your image. The best you can do is take a weighted average of the RGB values using the formula
Y = 0.299R + 0.587G + 0.114B
where R stands for RED, G is GREEN and B is BLUE. In numpy, this can be written as
new_image = img[:, :, 0]*0.299 + img[:, :, 1]*0.587 + img[:, :, 2]*0.114
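The same weighted average can also be written as a single dot product over the colour axis; a minimal sketch with a random image standing in for img:
import numpy as np

img = np.random.rand(106, 106, 3)
weights = np.array([0.299, 0.587, 0.114])

new_image = img @ weights   # contracts the last (RGB) axis
print(new_image.shape)      # (106, 106)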
Actually, np.delete would work if you applied it twice.
If you want to preserve the first channel, for example, you could run the following:
Xtrain = np.delete(Xtrain, 2, 2)  # removes the 3rd slice along the 3rd dimension
print(Xtrain.shape)               # will now output (106, 106, 2)
# again we apply np.delete, but on the 2nd slice of the 3rd dimension
Xtrain = np.delete(Xtrain, 1, 2)
print(Xtrain.shape)               # will now output (106, 106, 1)
# you may finally squeeze your output to get a 2d array
Xtrain = Xtrain.squeeze()
print(Xtrain.shape) # will now output (106,106)
