Given a 2D numpy array, e.g.:
import numpy as np
data = np.array([
[11,12,13],
[21,22,23],
[31,32,33],
[41,42,43],
])
I need to modify a sub-array in place, based on two masking vectors for the desired rows and columns:
rows = np.array([False, False, True, True], dtype=bool)
cols = np.array([True, True, False], dtype=bool)
Such that, e.g.:
print(data)
#[[11,12,13],
# [21,22,23],
# [0,0,33],
# [0,0,43]]
Now that you know how to access the rows/cols you want, just assign the value you want to your subarray. It's a tad trickier, though:
mask = rows[:,None]*cols[None,:]
data[mask] = 0
The reason is that data[rows][:,cols] (as illustrated in your previous question) chains two indexing operations, and indexing with a boolean array returns a copy rather than a view. Assigning to data[rows][:,cols] therefore modifies a temporary copy and leaves data untouched.
Instead, here we construct a 2D boolean array by broadcasting your two 1D arrays rows and cols against each other. The mask array has the shape (len(rows), len(cols)), the same as data. We can use mask to directly access the original items of data and set them to a new value. Note that data[mask] on its own returns a 1D array, which was not the answer you wanted in your previous question.
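To make the difference concrete, here is a minimal sketch using the arrays from the question. The names untouched, chained_had_no_effect and selected are only illustrative and do not come from the original post.

```python
import numpy as np

data = np.array([[11, 12, 13],
                 [21, 22, 23],
                 [31, 32, 33],
                 [41, 42, 43]])
rows = np.array([False, False, True, True])
cols = np.array([True, True, False])

# Chained boolean indexing writes into a temporary copy,
# so the array being indexed is left unchanged.
untouched = data.copy()
untouched[rows][:, cols] = 0
chained_had_no_effect = np.array_equal(untouched, data)
print(chained_had_no_effect)   # True

# A single 2D boolean mask with the same shape as data assigns in place.
mask = rows[:, None] & cols[None, :]
print(mask.shape)              # (4, 3)

selected = data[mask]          # reading with the mask gives a 1D array
print(selected)                # [31 32 41 42]

data[mask] = 0                 # writing with the mask modifies data itself
print(data)
```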
To construct the mask, we could have used the & operator instead of * (because we're dealing with boolean arrays), or the simpler np.outer function:
mask = np.outer(rows,cols)
Edit: props to @Marcus Jones for the np.outer solution.
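As a quick check, the three ways of building the mask mentioned above should give the same boolean array. The sketch below also shows np.ix_ as a further option for in-place assignment; the variable names and the small arange array are assumptions made for illustration.

```python
import numpy as np

rows = np.array([False, False, True, True])
cols = np.array([True, True, False])

# Three equivalent ways to build the (len(rows), len(cols)) mask.
mask_mul = rows[:, None] * cols[None, :]
mask_and = rows[:, None] & cols[None, :]
mask_outer = np.outer(rows, cols)

same = (np.array_equal(mask_mul, mask_and)
        and np.array_equal(mask_and, mask_outer))
print(same)  # True

# np.ix_ builds an open mesh from the two 1D masks, which can also be
# used on the left-hand side of an assignment.
data = np.arange(12).reshape(4, 3)
data[np.ix_(rows, cols)] = -1
print(data)
```

The mask approach reads as a flat selection, while np.ix_ keeps the 2D block structure, which may be preferable when you also need to read the sub-array.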
Related
Suppose that I have a 3d array A and a 2d array B. A has dimension (s,m,m) while B has dimension (m,s).
I want to write code for a 2d array C with dimension (m,s) such that C[:,i] = A[i,:,:] @ B[:,i].
Is there a way to do this elegantly without using a for loop in numpy?
One solution I thought of was to reshape B into a 3d array with dimension (s,m,1), multiply A and B via A@B, then reshape the resulting 3d array into a 2d array. This sounds a bit tedious, and I was wondering if tensordot or einsum can be applied here.
Suggestions appreciated. Thanks!
Using einsum is straightforward here:
A = np.arange(18).reshape(2,3,3)
B = np.arange(6).reshape(3,2)
C = np.einsum("ijk,ki->ji",A,B)
for i in range(2):
    print(A[i] @ B[:, i] == C[:, i])
# [ True  True  True]
# [ True  True  True]
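For comparison, here is a sketch of the reshape-and-matmul route described in the question, checked against the einsum result and an explicit loop. The names C_einsum, C_matmul and C_loop are illustrative only.

```python
import numpy as np

A = np.arange(18).reshape(2, 3, 3)   # shape (s, m, m)
B = np.arange(6).reshape(3, 2)       # shape (m, s)

# einsum: C[j, i] = sum_k A[i, j, k] * B[k, i]
C_einsum = np.einsum("ijk,ki->ji", A, B)

# matmul: B.T[:, :, None] has shape (s, m, 1), so A @ ... is a batch of
# s matrix-vector products with shape (s, m, 1); drop the last axis and
# transpose to get shape (m, s).
C_matmul = (A @ B.T[:, :, None])[:, :, 0].T

# Reference implementation with an explicit loop over the first axis of A.
C_loop = np.stack([A[i] @ B[:, i] for i in range(A.shape[0])], axis=1)

print(np.array_equal(C_einsum, C_matmul))  # True
print(np.array_equal(C_einsum, C_loop))    # True
```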
I would like to iterate through a subset of dimensions of a numpy array and compare the resulting array elements (which are arrays of the remaining dimension(s)).
The code below does this:
import numpy
def min(h,m):
    return h*60+m
exclude_times_default=[min(3,00),min(6,55)]
d=exclude_times_default
exclude_times_wkend=[min(3,00),min(9,00)]
w=exclude_times_wkend;
exclude_times=numpy.array([[[min(3,00),min(6,20)],d,d,d,d,d,[min(3,00),min(6,20)],d,d,[min(3,00),min(6,20)]],
[d,d,d,d,[min(3,00),min(9,30)],[min(3,00),min(9,30)],d,d,d,d],
[[min(20,00),min(7,15)],[min(3,00),min(23,15)],[min(3,00),min(7,15)],[min(3,00),min(7,15)],[min(3,00),min(23,15)],[min(3,00),min(23,15)],d,d,d,d]])
num_level=exclude_times.shape[0]
num_wind=exclude_times.shape[1]
for level in range(num_level):
    for window in range(num_wind):
        if (exclude_times[level,window,:]==d).all():
            print("Default")
            exclude_times[level][window]=w
        print(level,window,exclude_times[level][window])
The solution does not look very elegant to me, just wondering if there are more elegant solutions.
You can get a 2D mask pinpointing all the window/level combinations set to default like this:
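mask = (exclude_times == np.asarray(d)[None, None, :]).all(axis=-1)
The expression np.asarray(d)[None, None, :] converts the list d to an array (a plain list cannot be indexed with None) and introduces two new axes so that it broadcasts to the shape of exclude_times properly. Another way to do that would be with an explicit reshape: np.reshape(d, (1, 1, -1)) or np.asarray(d).reshape(1, 1, -1). There are many other ways as well; broadcasting aligns trailing axes automatically, so exclude_times == d gives the same result.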
The .all(axis=-1) operation reduces the 3D boolean mask along the last axis, giving you a 2D mask indexed by level and window.
To count the number of default entries, use np.count_nonzero:
nnz = np.count_nonzero(mask)
To count the defaults for each window:
np.count_nonzero(mask, axis=0)
To count the defaults for each level:
np.count_nonzero(mask, axis=1)
Remember, the axis parameter is the one you reduce, not the one(s) you keep.
Assigning w to the default elements takes a little care. A mask with a trailing singleton axis, mask[:, :, None], cannot be used as a boolean index here because its shape does not match exclude_times, and reading exclude_times[mask] returns a copy of the original data rather than a view.
Indexing with the 2D mask on the left-hand side of an assignment does write into the original array, though. The selection has shape (nnz, 2), and w broadcasts across it:
exclude_times[mask] = w
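The steps above can be sketched end to end on a small stand-in array. The values of d and w match the question, but the array other and the (2, 3, 2) layout are assumptions made to keep the example short. Here the assignment indexes with the 2D mask directly, so each selected (level, window) pair receives the two values of w.

```python
import numpy as np

d = np.array([3 * 60, 6 * 60 + 55])      # default window, in minutes
w = np.array([3 * 60, 9 * 60])           # weekend window, in minutes
other = np.array([3 * 60, 6 * 60 + 20])  # hypothetical non-default window

# Small stand-in for exclude_times, shape (levels, windows, 2).
exclude_times = np.array([[other, d, d],
                          [d, other, d]])

# 2D mask of the (level, window) pairs that equal the default.
mask = (exclude_times == d).all(axis=-1)

nnz = np.count_nonzero(mask)                 # total number of defaults
per_window = np.count_nonzero(mask, axis=0)  # reduce over levels
per_level = np.count_nonzero(mask, axis=1)   # reduce over windows
print(nnz, per_window, per_level)            # 4 [1 1 2] [2 2]

# Boolean-mask assignment writes into exclude_times itself; the selected
# block has shape (nnz, 2) and w broadcasts across its rows.
exclude_times[mask] = w
print(exclude_times)
```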
In a related question I learned that if I have an array of shape MxMxN, and I want to select based on a boolean matrix of shape MxM, I can simply do
data[select, ...]
and be done with it. Unfortunately, now I have my data in a different order:
import numpy as np
data = np.arange(36).reshape((3, 4, 3))
select = np.random.choice([0, 1], size=9).reshape((3, 3)).astype(bool)
An element data[i0, i1, i2] should be selected if select[i0, i2] == True.
How can I proceed with my selection without having to do something inefficient like
data.flatten()[np.repeat(select[:, None, :], 4, axis=1).flatten()]
One way would be to simply use np.broadcast_to to broadcast without actual replication and use that broadcasted mask directly for masking required elements -
mask = np.broadcast_to(select[:,None,:], data.shape)
out = data[mask]
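Another way, and probably a faster one, would be to get the indices of the True entries and then index with those. The result is then 2D, with one row per selected (i0, i2) pair holding the values along axis=1, rather than the flat 1D output of the masking approach. The implementation would look something like this -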
idx = np.argwhere(select)
out = data[idx[:,0], :, idx[:,1]]
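To see how the two approaches differ in output shape and ordering, here is a small sketch with a fixed select matrix (chosen by hand for reproducibility; the question uses a random one). The names flat and grouped are illustrative.

```python
import numpy as np

data = np.arange(36).reshape((3, 4, 3))
select = np.array([[True, False, False],
                   [False, True, True],
                   [False, False, False]])

# Approach 1: broadcast the mask to data.shape without copying it.
mask = np.broadcast_to(select[:, None, :], data.shape)
flat = data[mask]                        # 1D, in C order of data
print(flat)      # [ 0  3  6  9 13 14 16 17 19 20 22 23]

# Approach 2: index with the positions of the True entries.
idx = np.argwhere(select)                # one (i0, i2) pair per row
grouped = data[idx[:, 0], :, idx[:, 1]]  # shape (n_selected, data.shape[1])
print(grouped.shape)                     # (3, 4)

# Both contain the same elements, only arranged differently.
print(np.array_equal(np.sort(flat), np.sort(grouped.ravel())))  # True
```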
I have a 3D array, which I want to mask by a given value. The value only occurs in one 2D slice of that 3D array along the z direction (axis 0) of that array, so all other slices along the z direction will be False.
How can I extract only the 2D array in which the number occurs, and also the position in the z dimension of the 2D slice in which it occurs?
So let's say I have a small 3D array:
import numpy as np
array = np.zeros((3,3,3))
array[2] = np.array([[2,2,0],[2,2,0],[2,2,0]])
print(array==2)
So masking the 3D array with the value 2 would give me the whole 3D array, but I only need the 2D slice where it occurs; in this case that is the third slice in the 3D array.
Desired Output:
[[ True True False]
[ True True False]
[ True True False]]
slice: 2
How can I achieve this?
As long as you know that the condition is only fulfilled in a single slice along the axis you call the z direction, you can combine np.any and np.where. Use (array==2) to get a boolean array showing where the condition is met, reduce it with np.any over the other two axes to flag the matching slice, then use np.where to get that slice's index and extract it:
import numpy as np
array = np.zeros((3,3,3))
array[2] = np.array([[2,2,0],[2,2,0],[2,2,0]])
valueToFind = 2
conditionMet = (array==valueToFind)
zIndex = np.where(conditionMet.any(axis=(1, 2)))[0][0]
print(conditionMet[zIndex])
print("slice: {}".format(zIndex))
Alternatively, you can use np.compress to extract the slice.
np.compress(conditionMet.any(axis=(1, 2)), conditionMet, axis=0)[0]
Note, however, that this will not give you the z index. So if you need that, you will need to use np.where anyway.
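If the value might not occur at all, indexing [0][0] on the np.where result raises an IndexError. A minimal sketch of a helper that handles that case follows; find_slice is a hypothetical name, not a NumPy function, and it assumes a 3D input.

```python
import numpy as np

def find_slice(array, value):
    """Return (z_index, 2D boolean mask) for the first slice along axis 0
    that contains value, or None if the value does not occur."""
    condition_met = (array == value)
    # Indices of the slices along axis 0 that contain at least one match.
    hits = np.flatnonzero(condition_met.any(axis=(1, 2)))
    if hits.size == 0:
        return None
    z_index = int(hits[0])
    return z_index, condition_met[z_index]

array = np.zeros((3, 3, 3))
array[2] = np.array([[2, 2, 0], [2, 2, 0], [2, 2, 0]])

result = find_slice(array, 2)
print("slice: {}".format(result[0]))  # slice: 2
print(result[1])

missing = find_slice(array, 5)
print(missing)                        # None
```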
I have a 2D array and two masks, one for columns, and one for rows. If I try to simply do data[row_mask,col_mask], I get an error saying shape mismatch: indexing arrays could not be broadcast together with shapes .... On the other hand, data[row_mask][:,col_mask] works, but is not as pretty. Why does it expect indexing arrays to be of the same shape?
Here's a specific example:
import numpy as np
data = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
row_mask = np.array([True, True, False, True])
col_mask = np.array([True, True, False])
print(data[row_mask][:,col_mask]) # works
print(data[row_mask,col_mask]) # error
Use the np.ix_ function:
>>> data[np.ix_(row_mask,col_mask)]
array([[ 1, 2],
[ 4, 5],
[10, 11]])
Combining multiple Boolean indexing arrays or a Boolean with an integer indexing array can best be understood with the obj.nonzero() analogy. The function ix_ also supports boolean arrays and will work without any surprises.
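The nonzero() analogy can be spelled out with the arrays from the question. This is a sketch for illustration; the names row_idx, col_idx, paired_ok, rr, cc and block are not from the original post.

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
row_mask = np.array([True, True, False, True])
col_mask = np.array([True, True, False])

# Each boolean mask acts like the integer indices of its True entries.
row_idx = row_mask.nonzero()[0]   # [0 1 3]
col_idx = col_mask.nonzero()[0]   # [0 1]

# Two integer index arrays are paired element by element, so shapes
# (3,) and (2,) cannot be broadcast together.
try:
    data[row_idx, col_idx]
    paired_ok = True
except IndexError:
    paired_ok = False
print(paired_ok)                  # False

# np.ix_ reshapes the indices to (3, 1) and (1, 2), which broadcast to
# the full 3x2 block of row/column combinations.
rr, cc = np.ix_(row_mask, col_mask)
print(rr.shape, cc.shape)         # (3, 1) (1, 2)
block = data[rr, cc]
print(block)
```

This is why data[row_mask, col_mask] only works when both masks select the same number of entries, and even then it returns paired elements rather than a sub-matrix.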