How to change the shape of array from ixMxNx3 to (M*N)xix3?
I have a ixMxNx3 array L. You can think of L as an array containing i images, each image has height=M, width=N, and in each pixel it has a three-dimensional vector (or rgb). Let P = M*N. I can change its shape to ixPx3 by L.reshape(i,P,3). (I hope it is really changing it to the shape I want). How do I change its shape to Pxix3? i.e. an array that contains P points, each point has i images, each image of that point has a three-dimensional vector.
How can this change of shape be accomplished?
numpy.rollaxis can shift the position of an axis in a NumPy array:
L = L.reshape([i, P, 3])
L = numpy.rollaxis(L, 1)
rollaxis takes 3 arguments, one of them optional. The first is the array, the second is the axis to move, and the third is confusingly documented as "The axis is rolled until it lies before this position". Basically, if you want to move axis a to position b and b<a, the third argument should be b. If b>a, the third argument should be b+1. I don't know why it works that way. The third argument defaults to 0, which is why the call above moves axis 1 to the front.
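To check that this really gives the layout asked for, here is a small sketch. The sizes i=4, M=2, N=3 are made up for illustration only. It also shows np.moveaxis and transpose, which should give the same result as rollaxis here and may be easier to read.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
i, M, N = 4, 2, 3
P = M * N
L = np.arange(i * M * N * 3).reshape(i, M, N, 3)

flat = L.reshape(i, P, 3)            # shape (i, P, 3)
rolled = np.rollaxis(flat, 1)        # axis 1 moved to the front: (P, i, 3)
moved = np.moveaxis(flat, 1, 0)      # same thing, source -> destination
swapped = flat.transpose(1, 0, 2)    # same thing, explicit axis order

# Pixel (m, n) of image k ends up at rolled[m * N + n, k]
k, m, n = 3, 1, 2
assert np.array_equal(rolled[m * N + n, k], L[k, m, n])

# The third-argument rule, on a 4-D dummy array
a = np.zeros((2, 3, 4, 5))
shape_back = np.rollaxis(a, 3, 1).shape     # target 1 < 3, pass 1
shape_forward = np.rollaxis(a, 0, 3).shape  # target 2 > 0, pass 2 + 1
shape_move = np.moveaxis(a, 0, 2).shape     # moveaxis takes the target directly
```

With moveaxis the destination is simply the final position of the axis, so the off-by-one rule does not come up.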
Related
I am handling a set of data recorded by a 2D detector. Therefore, the data are represented by three arrays: x and y labelling the coordinate of a pixel and intensity storing the measured signal.
For example, a 6x6 grid will give a set of data:
xraw = np.array([0,1,2,3,4,5,0,1,2,3,4,5,...])
yraw = np.array([0,0,0,0,0,0,1,1,1,1,1,1,...])
intensity = np.array([i_00,i_01,i_02,i_03,i_04,i_05,i_10,i_11,...])
Due to various reasons, such as pixel defects, some of the data points are discarded in the raw data. Therefore, xraw, yraw, intensity have a size smaller than 36 (if that's a 6x6 grid), with, say, the point at (2,3) missing.
The intensity data needs further treatment by an element-wise multiplication with another array. This treatment array is from theoretical calculation and so it has a size of nxn (6x6 in this case). However, as some of the points in the true data are missing, the two arrays have different sizes.
I can use a loop to check for the missing points and eliminate the corresponding element in the treatment array. I wonder if there are some methods in numpy that take care of such operations. Thanks
First, construct the indices of available and all possible pixel positions by
avail_ind = yraw * w + xraw
all_ind = np.arange(0, h * w)
where h and w are the image's height and width in pixels. Since x varies fastest in your data, the flat index of pixel (x, y) is y * w + x.
Then, find the indices of the missing pixels by
missing_ind = all_ind[~np.isin(all_ind, avail_ind)]
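Once you have the missing indices, use np.delete to construct a flattened copy of the treatment_array with the elements at those indices removed, then simply multiply that with your intensity array. This assumes intensity is stored in the same row-major order as the flat indices.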
result = intensity * np.delete(treatment_array, missing_ind)
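Here is a small end-to-end sketch of those steps on made-up data. The 3x4 detector size, the dropped pixel and the contents of treatment_array are all hypothetical. The last line shows a different technique (plain integer-array indexing), which assumes treatment_array is indexed as [row y, column x]; it does not need the raw data to be sorted.

```python
import numpy as np

h, w = 3, 4                                   # hypothetical detector size
treatment_array = np.arange(h * w, dtype=float).reshape(h, w)

# Build the full grid in row-major order, then drop pixel (x=2, y=1)
# to mimic a defective pixel missing from the raw data.
yy, xx = np.divmod(np.arange(h * w), w)
keep = ~((xx == 2) & (yy == 1))
xraw, yraw = xx[keep], yy[keep]
intensity = np.ones(xraw.size)                # placeholder measurements

# Approach from the answer: find missing flat indices and delete them
avail_ind = yraw * w + xraw
all_ind = np.arange(h * w)
missing_ind = all_ind[~np.isin(all_ind, avail_ind)]
result = intensity * np.delete(treatment_array, missing_ind)

# Alternative: pick the matching treatment values directly by coordinate
result_direct = intensity * treatment_array[yraw, xraw]
```

Both give the same array here; the direct indexing version may be preferable if the raw points are not guaranteed to be in row-major order.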
I get confused by this example.
A = np.random.random((6, 4, 5))
A
A.min(axis=0)
A.min(axis=1)
A.min(axis=2)
What mins are we really computing here?
I know I can think of this array as a 6x4x5 parallelepiped in 3D space and I know A.min(axis=0) means we go along the 0-th axis. OK, but as we go along that 0-th axis all we get is 6 "layers" which are basically rectangles of size 4x5 filled with numbers. So what min am I computing when saying A.min(axis=0) for example?!?! I am just trying to visualize it in my head.
From A.min(axis=0) I get back a 4x5 2D matrix. Why? Shouldn't I get just 6 values in a 1D array? I am walking along the 0-th axis, so shouldn't I get 6 values back, one value for each of these 4x5 rectangles?
I always find this notation confusing and just don't get it, sorry.
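You calculate the min along one particular axis when you are interested in maintaining the structure of the remaining axes. The comparison runs through the 6 layers at each fixed (row, column) position, so you get one value per position, not one value per layer.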
The gif below may help to understand.
In this example, your result will have shape (3, 2).
That's because you are getting the smallest value along axis 0, which squeezes that dimension into only 1 value, so we don't need the dimension anymore.
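For the (6, 4, 5) array from the question, a quick sketch of what each call reduces over. The seeded generator is only there to make the example reproducible; it is not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)      # seed chosen arbitrarily
A = rng.random((6, 4, 5))

m0 = A.min(axis=0)   # shape (4, 5): min over the 6 layers at each (row, col)
m1 = A.min(axis=1)   # shape (6, 5): min over the 4 rows of each layer
m2 = A.min(axis=2)   # shape (6, 4): min over the 5 columns of each row

# One entry of m0, computed by hand: walk along axis 0 at fixed (1, 2)
by_hand = min(A[n, 1, 2] for n in range(6))

# If you do want one value per 4x5 layer, reduce over both other axes
per_layer = A.min(axis=(1, 2))      # shape (6,)
```

So the axis you pass is the one that disappears from the shape, and the 6 values per layer come from reducing over axes 1 and 2 together.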
I am learning numpy and have a question: I am not able to clearly visualise where this 1 in the shape comes from.
import numpy as np
a = np.array([ [[1],[56]] , [[8],[98]] ,[[89],[62]] ])
np.shape(a)
The output is printed as: (3, 2, 1)
It would be appreciated if you could represent it in a diagrammatic / image format.
What does the 1 in the output actually mean?
Basically, that last 1 is because every number in a has brackets around it.
Formally, it's the length of your "last" or "innermost" dimension. You can take your first two dimensions and arrange a as you would a normal matrix, but note that each element itself has brackets around it - each element is itself an array:
[[ [1] [56]]
[ [8] [98]]
[[89] [62]]]
If you add an element to each innermost-array, making that third shape number get larger, it's like stacking more arrays behind this top one in 3d, where now the corresponding elements in the "behind" array are in the same innermost array as the "front" array.
Equivalently, instead of considering the first two indices to denote the regular flat matrices, you can think of the back two making the flat matrices. This is how numpy does it: try printing out an array like this: x = np.random.randint(10, size = (3,3,3)). Along the first dimension, x[0], x[1], and x[2] are printed after each other, and each one individually is formatted like a 3x3 matrix. Then the second index corresponds to the rows of each individual matrix, and the third index corresponds to the columns. Note that when you print a, there's only one column displayed - its third dimension has size 1. You can play with the definition of x to see more what's going on (change the numbers in the size argument).
An alright example of visualizing a 3d array this way is this image, found on the Wikipedia page for the Levi-Civita symbol:
Don't worry too much about what the Levi-Civita symbol actually is - just note that here, if it were a numpy array it would have shape (3,3,3) (like the x I defined above). You use three indices to specify each element, i, j, and k. i tells you the depth (blue, red, or green), j tells you the row, and k tells you the column. When numpy prints, it just lists out blue, red, then green in order.
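A short sketch to play with, using the array from the question. The array b is a made-up variant with a second (zero) entry in every innermost bracket, just to show the last number in the shape growing.

```python
import numpy as np

a = np.array([[[1], [56]], [[8], [98]], [[89], [62]]])
shape_a = a.shape            # (3, 2, 1): 3 blocks, 2 rows each, 1 column each

# Dropping the innermost brackets leaves an ordinary 3x2 matrix
flat = a[:, :, 0]

# Hypothetical variant: two entries per innermost bracket
b = np.array([[[1, 0], [56, 0]], [[8, 0], [98, 0]], [[89, 0], [62, 0]]])
shape_b = b.shape            # (3, 2, 2)

# The 3x3x3 example from above: x[0], x[1], x[2] are each a 3x3 matrix
x = np.random.randint(10, size=(3, 3, 3))
first_block = x[0]
```

Printing a, b and x side by side shows how the last index becomes the columns of each printed block.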
We know that in the axis parameter, 0 and 1 give the column-wise and row-wise index of the maximum element, but
what do 2, 3 and so on indicate? An example code is given here. What is the significance of the output of this code?
When you have an array of higher dimensions you will also have new axes. For example, in a 3-dimensional array (e.g. a cube) you will have 3 axes (row, column, depth).
When you pass the axis to np.argmax you are telling numpy along which axis you want the index of the maximum. axis=3 will throw an error because your array only has 3 axes (0, 1, 2).
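Since the code from the question is not shown here, this is a sketch on a made-up array of shape (2, 3, 4). It shows the shape of the result for each valid axis and that axis=3 is rejected.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)   # hypothetical 3-D array

idx0 = np.argmax(x, axis=0)   # shape (3, 4): which of the 2 blocks holds the max
idx1 = np.argmax(x, axis=1)   # shape (2, 4): which of the 3 rows holds the max
idx2 = np.argmax(x, axis=2)   # shape (2, 3): which of the 4 columns holds the max

# axis=3 does not exist for a 3-D array, so numpy raises an axis error
# (it is a subclass of both ValueError and IndexError)
raised = False
try:
    np.argmax(x, axis=3)
except (ValueError, IndexError):
    raised = True
```

In each case the axis you pass is removed from the result, and the values are positions along that axis.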
Here is an article about numpy arrays axes.
I have what is essentially a 4 column lookup table: cols 1, 2 are the respective xi,yj coordinates which map to x'i, y'j coordinates in the respective 3rd and 4th cols.
My goal is to provide a method to enter some (xnew,ynew) position within the range of my look-up values in the 1st and 2nd columns(xi,yj) then map that position to an interpolated (x'i,y'j) from the range of positions in the 3rd and 4th cols of the lut.
I have tried using interp2d, but have not been able to figure out how to enter the arrays in the proper format. For example, I don't understand why scipy.interpolate.interp2d(x'i, y'j, [xi,yj], kind='linear') gives me the following error:
ValueError: Invalid length for input z for non rectangular grid
This seems so simple, but I have not been able to figure it out. I will gladly provide more information if required.
interp2d requires that the interpolated function be scalar-valued, i.e. z holds one value per data point; see the docs:
z : 1-D ndarray. The values of the function to interpolate at the data points. If z is a multi-dimensional array, it is flattened before use.
So when you enter [xi,yj], it gets converted from its (2, n) shape to (2*n,), hence the error.
You can get around this by setting up two different interpolating functions, one for each coordinate. If your lut is a single array of shape (n, 4), the columns are lut[:, 0] through lut[:, 3] (lut[0] would be the first row), so you would do something like:
x_interp = scipy.interpolate.interp2d(lut[:, 0], lut[:, 1], lut[:, 2], kind = 'linear')
y_interp = scipy.interpolate.interp2d(lut[:, 0], lut[:, 1], lut[:, 3], kind = 'linear')
And you can now do things like:
new_x, new_y = x_interp(x, y), y_interp(x, y)
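As far as I know, interp2d is deprecated in newer SciPy releases, so here is a sketch of the same idea with a different tool, scipy.interpolate.LinearNDInterpolator, which accepts scattered (x, y) points and vector-valued data, so one interpolator can return both x' and y'. The lookup table below is made up (an affine mapping on a small grid) purely so the result can be checked by hand.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

# Hypothetical lookup table with columns xi, yj, x'i, y'j
gx, gy = np.meshgrid(np.arange(4.0), np.arange(3.0))
xi, yj = gx.ravel(), gy.ravel()
lut = np.column_stack([xi, yj, 2 * xi + yj, xi - 3 * yj + 1])

# Points are the first two columns, values are the last two columns
interp = LinearNDInterpolator(lut[:, :2], lut[:, 2:])

# Query two new positions inside the range of the table
query = np.array([[0.5, 0.5], [2.25, 1.5]])
new_xy = interp(query)          # shape (2, 2): one (x', y') pair per query

# For this affine table, linear interpolation reproduces the mapping
expected = np.column_stack([2 * query[:, 0] + query[:, 1],
                            query[:, 0] - 3 * query[:, 1] + 1])
```

Queries outside the convex hull of the table points come back as NaN by default, which fits the "within the range of my look-up values" requirement.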