NumPy - Excluding all zero 2D arrays from a 3D array - python

I have multiple 3D arrays with different shapes but I'm going to assume I have an array named A with shape (53, 768, 768) for an example. It consists of 53 2D arrays and some of them may be empty images. Those empty images have only 0 pixel values.
If there are N slices with all 0 values, I want to slice A into a (53 - N, 768, 768) 3D array. Is this possible with indexing?
I tried something like this a[:, ~np.all(a == 0)], but it returns an array with shape (53, 1, 768, 768).

Let's assume your data is something like this:
z = np.array([
[[1, 2, 3], [4, 5, 6]],
[[7, 8, 9], [10, 11, 12]],
[[0, 0, 0], [0, 0, 0]],
[[1, 1, 1], [1, 1, 1]]
])
The shape of z is (4, 2, 3). We therefore need a boolean vector of shape (4,), obtained by aggregating over the other two dimensions. Most NumPy reduction functions accept an axis= parameter for this:
mask = np.all(z == 0, axis=(1, 2))
z[~mask]
In this example, mask will be array([False, False, True, False]).
Axes are numbered 0, 1, 2, etc. So we use 1 and 2 to refer to the 2nd and 3rd axes.
You can also use negative numbers as in the other answer; if you write axis=(-2, -1) that refers to the last and 2nd-to-last axes, i.e. axes 1 and 2 in this example.
In general, use axis= to specify which axes are to be collapsed by aggregating. Any axis not specified in axis= will not be aggregated.
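As a minimal sketch of the axis= idea applied to the original question, the snippet below keeps every slice that contains at least one nonzero value. The array z mirrors the sample data above; the names keep and filtered are only illustrative.

```python
import numpy as np

# Four 2x3 "images"; the third one is empty (all zeros).
z = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
    [[0, 0, 0], [0, 0, 0]],
    [[1, 1, 1], [1, 1, 1]],
])

# Reduce over axes 1 and 2 so that one boolean remains per slice along axis 0.
keep = np.any(z != 0, axis=(1, 2))

# Boolean indexing on axis 0 keeps only the non-empty slices.
filtered = z[keep]

print(keep)            # [ True  True False  True]
print(filtered.shape)  # (3, 2, 3)
```

For the (53, 768, 768) array in the question, the same two lines would give an array of shape (53 - N, 768, 768).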

Use:
import numpy as np
A = np.array(A) # if A is not a NumPy array
result = A[np.sum(A, axis = (-1, -2)) != 0]
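This works as long as the pixel values are non-negative. If a slice can contain both positive and negative values, they may sum to 0, so np.any(A != 0, axis=(-1, -2)) is the safer test.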
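A small sketch of where the sum-based test and an any-based test can disagree. The data here is hypothetical and deliberately contains negative values, which image data usually does not.

```python
import numpy as np

# Hypothetical data: slice 0 has values that cancel out, slice 1 is truly empty.
A = np.array([
    [[1, -1], [2, -2]],
    [[0, 0], [0, 0]],
    [[3, 0], [0, 0]],
])

by_sum = A[np.sum(A, axis=(-1, -2)) != 0]  # also drops slice 0, whose sum is 0
by_any = A[np.any(A != 0, axis=(-1, -2))]  # keeps every slice with a nonzero entry

print(by_sum.shape)  # (1, 2, 2)
print(by_any.shape)  # (2, 2, 2)
```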

Related

Numpy: for each element in one dimension, find coordinates of maximum of sub-array

I've seen variations of this question asked a few times but so far haven't seen any answers that get to the heart of this general case. I have an n-dimensional array of shape [a, b, c, ...] . For some dimension x, I want to look at each sub-array and find the coordinates of the maximum.
For example, say b = 2, and that's the dimension I'm interested in. I want the coordinates of the maximum of [:, 0, :, ...] and [:, 1, :, ...] in the form a_max = [a_max_b0, a_max_b1], c_max = [c_max_b0, c_max_b1], etc.
I've tried to do this by reshaping my input matrix to a 2d array [b, a*c*d*...], using argmax along axis 0, and unraveling the indices, but the output coordinates don't wind up giving the maxima in my dataset. In this case, n = 3 and I'm interested in axis 1.
shape = gains_3d.shape
idx = gains_3d.reshape(shape[1], -1)
idx = idx.argmax(axis = 1)
a1, a2 = np.unravel_index(idx, [shape[0], shape[2]])
Obviously I could use a loop, but that's not very pythonic.
For a concrete example, I randomly generated a 4x2x3 array. I'm interested in axis 1, so the output should be two arrays of length 2.
testarray = np.array([[[0.17028444, 0.38504759, 0.64852725],
[0.8344524 , 0.54964746, 0.86628204]],
[[0.77089997, 0.25876277, 0.45092835],
[0.6119848 , 0.10096425, 0.627054 ]],
[[0.8466859 , 0.82011746, 0.51123959],
[0.26681694, 0.12952723, 0.94956865]],
[[0.28123628, 0.30465068, 0.29498136],
[0.6624998 , 0.42748154, 0.83362323]]])
testarray[:,0,:] is
array([[0.17028444, 0.38504759, 0.64852725],
[0.77089997, 0.25876277, 0.45092835],
[0.8466859 , 0.82011746, 0.51123959],
[0.28123628, 0.30465068, 0.29498136]])
so the first element of the first output array will be 2 and the first element of the second will be 0, pointing to 0.8466859. The second elements of the two output arrays will be 2 and 2, pointing to 0.94956865 in testarray[:,1,:].
Let's first try to get a clear idea of what you are trying to do:
Sample 3d array:
In [136]: arr = np.random.randint(0,10,(2,3,4))
In [137]: arr
Out[137]:
array([[[1, 7, 6, 2],
[1, 5, 7, 1],
[2, 2, 5, *6*]],
[[*9*, 1, 2, 9],
[2, *9*, 3, 9],
[0, 2, 0, 6]]])
After fiddling around a bit I came up with this iteration, showing the coordinates for each middle dimension, and the max value
In [151]: [(i,np.unravel_index(np.argmax(arr[:,i,:]),(2,4)),np.max(arr[:,i,:])) for i in range(3)]
Out[151]: [(0, (1, 0), 9), (1, (1, 1), 9), (2, (0, 3), 6)]
I can move the unravel outside the iteration:
In [153]: np.unravel_index([np.argmax(arr[:,i,:]) for i in range(3)],(2,4))
Out[153]: (array([1, 1, 0]), array([0, 1, 3]))
Your reshape approach does avoid this loop:
In [154]: arr1 = arr.transpose(1,0,2) # move our axis first
In [155]: arr1 = arr1.reshape(3,-1)
In [156]: arr1
Out[156]:
array([[1, 7, 6, 2, 9, 1, 2, 9],
[1, 5, 7, 1, 2, 9, 3, 9],
[2, 2, 5, 6, 0, 2, 0, 6]])
In [158]: np.argmax(arr1,axis=1)
Out[158]: array([4, 5, 3])
In [159]: np.unravel_index(_,(2,4))
Out[159]: (array([1, 1, 0]), array([0, 1, 3]))
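argmax takes only a single axis value, whereas you want the equivalent of taking the argmax along all but one axis. Some reductions (max and sum, for example) accept a tuple of axes, but argmax does not. The transpose and reshape may be the only way. The maximum values themselves can be read from the same reshaped array: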
In [163]: np.max(arr1,axis=1)
Out[163]: array([9, 9, 6])
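The transpose-and-reshape steps above can be wrapped into a small helper that works for any number of dimensions. This is only a sketch: argmax_per_index is a made-up name, and np.moveaxis is used here in place of the explicit transpose.

```python
import numpy as np

def argmax_per_index(arr, axis):
    """For each index along `axis`, return the coordinates (in the remaining
    axes) of the maximum of the corresponding sub-array."""
    moved = np.moveaxis(arr, axis, 0)         # bring the axis of interest to the front
    rest_shape = moved.shape[1:]              # shape of each sub-array
    flat = moved.reshape(moved.shape[0], -1)  # one row per sub-array
    flat_idx = flat.argmax(axis=1)            # flat position of each row's maximum
    return np.unravel_index(flat_idx, rest_shape)

# The 4x2x3 example array from the question.
testarray = np.array([[[0.17028444, 0.38504759, 0.64852725],
                       [0.8344524 , 0.54964746, 0.86628204]],
                      [[0.77089997, 0.25876277, 0.45092835],
                       [0.6119848 , 0.10096425, 0.627054  ]],
                      [[0.8466859 , 0.82011746, 0.51123959],
                       [0.26681694, 0.12952723, 0.94956865]],
                      [[0.28123628, 0.30465068, 0.29498136],
                       [0.6624998 , 0.42748154, 0.83362323]]])

a_max, c_max = argmax_per_index(testarray, axis=1)
print(a_max)  # [2 2]
print(c_max)  # [0 2]
```

The key difference from the attempt in the question is that the axis of interest is moved to the front before reshaping; reshaping alone does not reorder the data.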

Operations with Numpy arrays with zero dimensions

Is a numpy array of shape (0,10) the same as a numpy array of shape (10)? I'm writing a very simple function that will alternate between 2 and 3 dimensions, and I am wondering whether the output of something like this:
def Pick(F, R, N=0, Choice=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]):
    # parameters without defaults must come before those with defaults
    if N == 0:
        return np.array(np.random.choice(Choice, size=(F, R)))
    else:
        return np.array(np.random.choice(Choice, size=(N, F, R)))
will behave the same as the output of:
def Pick(F, R, N=0, Choice=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]):
    return np.array(np.random.choice(Choice, size=(N, F, R)))
Theoretically these should be the same but when I try.
a =np.full((10,10,10),1)
then
a+a
I get a (10,10,10) np.array of 2's. But if I try
b=np.full((0,10,10,10),1)
then
b+b
This is the only result I receive
array([], shape=(0, 10, 10, 10), dtype=int64)
any ideas as to why this is?
Abstractly, an array of shape (N,M,L) can be represented identically by an array of shape (<>,N,<>,M,<>,L,<>), where <> can be substituted for a sequence of 1s with arbitrary finite length. Consider the set of indexes corresponding to each data point — if one dimension is of length 0, what index corresponding to that dimension can data points bear? This should explain why defining a numpy array as you have yields the [] result — because you have defined an empty array. Defining
a = np.full((10,10,10),1)
b = np.full((10,10,10,1),1)
the
a+b
operation broadcasts appropriately (and) yields the expected result.
A 0 dimension has the same meaning as a 1, 2 or other positive integer:
In [437]: np.ones((2,3),int)
Out[437]:
array([[1, 1, 1], # 2*3 elements
[1, 1, 1]])
In [438]: np.ones((1,3),int)
Out[438]: array([[1, 1, 1]]) # 1*3 elements
In [439]: np.ones((0,3),int)
Out[439]: array([], shape=(0, 3), dtype=int64) # 0*3 elements
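A short sketch that ties this back to the arrays in the question: the number of elements is the product of the shape entries, so any dimension of length 0 makes the whole array empty.

```python
import numpy as np

a = np.full((10, 10, 10), 1)
b = np.full((0, 10, 10, 10), 1)

# The element count is the product of the shape entries.
print(a.size)  # 1000
print(b.size)  # 0, because 0 * 10 * 10 * 10 == 0

# Adding an empty array to itself has no elements to add, so it stays empty.
print((b + b).shape)  # (0, 10, 10, 10)

# A shape of (0, 10) is therefore not the same as a shape of (10,).
print(np.ones((0, 10)).size, np.ones(10).size)  # 0 10
```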

Numpy 3d array indexing

I have a 3d numpy array (n_samples x num_components x 2) in the example below n_samples = 5 and num_components = 7.
I have another array (indices) which is the selected component for each sample which is of shape (n_samples,).
I want to select from the data array given the indices so that the resulting array is n_samples x 2.
The code is below:
import numpy as np
np.random.seed(77)
data=np.random.randint(low=0, high=10, size=(5, 7, 2))
indices = np.array([0, 1, 6, 4, 5])
#how can I select indices from the data array?
For example for data 0, the selected component should be the 0th and for data 1 the selected component should be 1.
Note that I can't use any for loops because I'm using it in Theano and the solution should be solely based on numpy.
Is this what you are looking for?
In [36]: data[np.arange(data.shape[0]),indices,:]
Out[36]:
array([[7, 4],
[7, 3],
[4, 5],
[8, 2],
[5, 8]])
To get component #0, use
data[:, 0]
i.e. we get every entry on axis 0 (samples), and only entry #0 on axis 1 (components), and implicitly everything on the remaining axes.
This can be easily generalized to
data[:, indices]
to select all relevant components.
But what OP really wants is just the diagonal of this array, i.e. (data[0, indices[0]], data[1, indices[1]], ...). The diagonal of a higher-dimensional array can be extracted using the diagonal function:
>>> np.diagonal(data[:, indices])
array([[7, 7, 4, 8, 5],
[4, 3, 5, 2, 8]])
(You may need to transpose the result.)
You have a variety of ways to do so, but this is my loop recommendation:
selection = np.array([ datum[indices[k]] for k,datum in enumerate(data)])
The resulting array, selection, has the desired shape.
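As a quick check, the three approaches above can be compared on the same input. This is a sketch with randomly generated data; the seed and the names by_index, by_diagonal and by_loop are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
data = rng.integers(0, 10, size=(5, 7, 2))
indices = np.array([0, 1, 6, 4, 5])

# Integer-array indexing: pair sample i with component indices[i].
by_index = data[np.arange(data.shape[0]), indices]

# Diagonal of the (5, 5, 2) intermediate array, transposed to (5, 2).
by_diagonal = np.diagonal(data[:, indices]).T

# Reference result with an explicit Python loop.
by_loop = np.array([row[k] for row, k in zip(data, indices)])

print(by_index.shape)                         # (5, 2)
print(np.array_equal(by_index, by_diagonal))  # True
print(np.array_equal(by_index, by_loop))      # True
```

The integer-array version avoids building the (5, 5, 2) intermediate array that the diagonal approach needs.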

Python reshape list to ndim array

Hi I have a list flat which is length 2800, it contains 100 results for each of 28 variables: Below is an example of 4 results for 2 variables
[0,
0,
1,
1,
2,
2,
3,
3]
I would like to reshape the list to an array (2,4) so that the results for each variable are in a single element.
[[0,1,2,3],
[0,1,2,3]]
You can think of reshaping as filling the new shape row by row (the last dimension varies fastest) from the flattened original list/array.
If you want to fill an array by column instead, an easy solution is to shape the list into an array with reversed dimensions and then transpose it:
x = np.reshape(list_data, (100, 28)).T
Above snippet results in a 28x100 array, filled column-wise.
To illustrate, here are the two options of shaping a list into a 2x4 array:
np.reshape([0, 0, 1, 1, 2, 2, 3, 3], (4, 2)).T
# array([[0, 1, 2, 3],
# [0, 1, 2, 3]])
np.reshape([0, 0, 1, 1, 2, 2, 3, 3], (2, 4))
# array([[0, 0, 1, 1],
# [2, 2, 3, 3]])
You can specify the interpretation order of the axes using the order parameter:
np.reshape(arr, (2, -1), order='F')
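A brief sketch showing that, for the 8-element example list, the order='F' form gives the same result as reshaping with reversed dimensions and transposing. The variable names are only illustrative.

```python
import numpy as np

flat = [0, 0, 1, 1, 2, 2, 3, 3]

# Reshape to the reversed shape (row by row), then transpose.
via_transpose = np.reshape(flat, (4, 2)).T

# Fill the target shape column by column (Fortran order) directly.
via_order = np.reshape(flat, (2, -1), order='F')

print(via_order)
# [[0 1 2 3]
#  [0 1 2 3]]
print(np.array_equal(via_transpose, via_order))  # True
```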
Step by step:
# import numpy library
import numpy as np
# create list
my_list = [0,0,1,1,2,2,3,3]
# convert list to numpy array
np_array=np.asarray(my_list)
# reshape array into 4 rows x 2 columns, and transpose the result
reshaped_array = np_array.reshape(4, 2).T
#check the result
reshaped_array
array([[0, 1, 2, 3],
[0, 1, 2, 3]])
The answers above are good. Adding a case that I used.
This is for when you don't want to use numpy and want to keep the data as a list without changing its contents.
You can run a small loop to change the dimension from 1xN to Nx1.
tmp = []
for b in bus:
    tmp.append([b])
bus = tmp
It may not be efficient for very large lists, but it works for small ones.
Thanks

Mapping element-wise a NumPy array into an array of more dimensions

I want to map a numpy.array from NxM to NxMx3, where a vector of three elements is a function of the original entry:
lambda x: [f1(x), f2(x), f3(x)]
However, things like numpy.vectorize do not allow changing the dimensions.
Sure, I can create an array of zeros and fill it in a loop (which is what I am doing for now), but that sounds neither Pythonic nor efficient (like every loop in Python).
Is there a better way to perform an elementwise operation on numpy.array, producing a vector for each entry?
Now that I see your code: for most simple mathematical operations you can let numpy do the looping, which is often referred to as vectorization:
import numpy as np
import matplotlib.colors

def complex_array_to_rgb(X, theme='dark', rmax=None):
    '''Takes an array of complex numbers and converts it to an array of [r, g, b],
    where phase gives hue and saturation/value are given by the absolute value.
    Especially for use with imshow for complex plots.'''
    absmax = rmax or np.abs(X).max()
    # one extra trailing axis of length 3 for the (h, s, v) channels
    Y = np.zeros(X.shape + (3,), dtype='float')
    Y[..., 0] = np.angle(X) / (2 * np.pi) % 1
    if theme == 'light':
        Y[..., 1] = np.clip(np.abs(X) / absmax, 0, 1)
        Y[..., 2] = 1
    elif theme == 'dark':
        Y[..., 1] = 1
        Y[..., 2] = np.clip(np.abs(X) / absmax, 0, 1)
    Y = matplotlib.colors.hsv_to_rgb(Y)
    return Y
This code should run much faster than yours.
If I understand your problem correctly, I suggest you use np.dstack:
Docstring:
Stack arrays in sequence depth wise (along third axis).
Takes a sequence of arrays and stack them along the third axis
to make a single array. Rebuilds arrays divided by `dsplit`.
This is a simple way to stack 2D arrays (images) into a single
3D array for processing.
In [1]: a = np.arange(9).reshape(3, 3)
In [2]: a
Out[2]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
In [3]: x, y, z = a*1, a*2, a*3 # in your case f1(a), f2(a), f3(a)
In [4]: np.dstack((x, y, z))
Out[4]:
array([[[ 0, 0, 0],
[ 1, 2, 3],
[ 2, 4, 6]],
[[ 3, 6, 9],
[ 4, 8, 12],
[ 5, 10, 15]],
[[ 6, 12, 18],
[ 7, 14, 21],
[ 8, 16, 24]]])
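The same idea can be sketched for a general NxM input with np.stack and axis=-1, which places the new length-3 axis last regardless of the input's number of dimensions. The functions f1, f2 and f3 below are hypothetical stand-ins for the ones in the question and are assumed to work element-wise on whole arrays.

```python
import numpy as np

# Hypothetical element-wise functions standing in for f1, f2, f3.
def f1(x):
    return x + 1

def f2(x):
    return x * 2

def f3(x):
    return x ** 2

a = np.arange(6).reshape(2, 3)  # an NxM input with N=2, M=3

# Apply each function to the whole array, then stack the results on a new last axis.
out = np.stack([f1(a), f2(a), f3(a)], axis=-1)

print(out.shape)  # (2, 3, 3)
print(out[1, 2])  # [ 6 10 25], i.e. f1(5), f2(5), f3(5)
```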
