Mask 2D numpy array - python

I want to apply a mask to a 2D numpy array, but it does not work correctly. Suppose I have
val(lat, lon) ---> my 2D array (20, 30)
Mask_lat = np.ma.masked_array(lat, mask=latmask) ---> masked lat (5,)
Mask_lon = np.ma.masked_array(lon, mask =lonmask) ---> masked lon (8,)
Mask_val = np.ma.masked_array(val, mask=mask_lat_lon) ---> ?
I do not know how to build a correct mask_lat_lon so that I get a masked val of shape (5, 8). I would appreciate any guidance.
Thank you in advance.

If I understand your question correctly, you have two 1D arrays that represent y and x (lat and long) positions in a 2D array. You want to mask a region based on the x/y position in the 2D array.
The key part to understand is that the mask for a 2D array is also 2D.
For example, let's mask a single element of a 2D array:
import numpy as np
z = np.arange(20).reshape(5, 4)
mask = np.zeros(z.shape, dtype=bool)
mask[3, 2] = True
print(z)
print(np.ma.masked_array(z, mask))
This yields:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]
[16 17 18 19]]
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 -- 15]
[16 17 18 19]]
In your case, you have two 1D x and y arrays that you need to create a 2D mask from. For example:
import numpy as np
x = np.linspace(-85, -78, 4)
y = np.linspace(32, 37, 5)
z = np.arange(20).reshape(5, 4)
xmask = (x > -82.6) & (x < -80)
ymask = (y > 33) & (y < 35.6)
print(xmask)
print(ymask)
We'd then need to combine them into a single 2D mask using broadcasting:
mask = xmask[np.newaxis, :] & ymask[:, np.newaxis]
Slicing with newaxis (or None, they're the same object) adds a new axis at that position, turning the 1D array into a 2D array. If you haven't seen this before, it's useful to take a quick look at what xmask[np.newaxis, :] and ymask[:, np.newaxis] look like:
In [14]: xmask
Out[14]: array([False, False, True, False], dtype=bool)
In [15]: ymask
Out[15]: array([False, True, True, False, False], dtype=bool)
In [16]: xmask[np.newaxis, :]
Out[16]: array([[False, False, True, False]], dtype=bool)
In [17]: ymask[:, np.newaxis]
Out[17]:
array([[False],
[ True],
[ True],
[False],
[False]], dtype=bool)
mask will then be (keep in mind that True elements are masked):
In [18]: xmask[np.newaxis, :] & ymask[:, np.newaxis]
Out[18]:
array([[False, False, False, False],
[False, False, True, False],
[False, False, True, False],
[False, False, False, False],
[False, False, False, False]], dtype=bool)
Finally, we can create a 2D masked array from z based on this mask:
arr = np.ma.masked_array(z, mask)
Which gives us our final result:
[[ 0 1 2 3]
[ 4 5 -- 7]
[ 8 9 -- 11]
[12 13 14 15]
[16 17 18 19]]
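Putting the steps together, here is a minimal, self-contained sketch of the approach above. The last step is an addition that is not part of the original answer: assuming you want only the selected rectangle as a smaller plain array (as the (5, 8) shape in the question suggests), np.ix_ builds the same outer selection from the two 1D boolean masks.

```python
import numpy as np

x = np.linspace(-85, -78, 4)       # "lon" positions, one per column
y = np.linspace(32, 37, 5)         # "lat" positions, one per row
z = np.arange(20).reshape(5, 4)    # data with shape (len(y), len(x))

xmask = (x > -82.6) & (x < -80)
ymask = (y > 33) & (y < 35.6)

# Broadcast the two 1D masks into one 2D mask with the same shape as z.
mask = ymask[:, np.newaxis] & xmask[np.newaxis, :]
arr = np.ma.masked_array(z, mask)  # True entries are masked (hidden)

# Assumption: you want the selected region itself as a smaller array.
sub = z[np.ix_(ymask, xmask)]
print(arr)
print(sub.shape)
```

Note that a masked array keeps the original shape and hides the True positions, while the np.ix_ selection returns only the rows and columns where both 1D masks are True.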

Related

NumPy: Find first n columns according to mask

Say I have an array arr in shape (m, n) and a boolean array mask in the same shape as arr. I would like to obtain the first N columns from arr that are True in mask as well.
An example:
arr = np.array([[1,2,3,4,5],
[6,7,8,9,10],
[11,12,13,14,15]])
mask = np.array([[False, True, True, True, True],
[True, False, False, True, False],
[True, True, False, False, False]])
N = 2
Given the above, I would like to write a (vectorized) function that outputs the following:
output = maskify_n_columns(arr, mask, N)
output = np.array(([2,3],[6,9],[11,12]))
You can use broadcasting, numpy.cumsum() and numpy.argmax().
def maskify_n_columns(arr, mask, N):
    m = (mask.cumsum(axis=1)[..., None] == np.arange(1,N+1)).argmax(axis=1)
    r = arr[np.arange(arr.shape[0])[:, None], m]
    return r
maskify_n_columns(arr, mask, 2)
Output:
[[ 2 3]
[ 6 9]
[11 12]]
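To see why this works, the one-liner can be split into its intermediate steps. This is only an illustrative sketch; the variable names counts, hits and cols are made up here, and np.take_along_axis is swapped in for the integer indexing used above.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15]])
mask = np.array([[False, True, True, True, True],
                 [True, False, False, True, False],
                 [True, True, False, False, False]])
N = 2

# Running count of True values along each row.
counts = mask.cumsum(axis=1)                     # shape (3, 5)
# hits[i, j, k] is True where the running count in row i equals k + 1.
hits = counts[..., None] == np.arange(1, N + 1)  # shape (3, 5, N)
# argmax returns the first True along the column axis, i.e. the column
# where the (k + 1)-th True value of each row appears.
cols = hits.argmax(axis=1)                       # shape (3, N)
result = np.take_along_axis(arr, cols, axis=1)
print(result)
```

One caveat: if a row contains fewer than N True values, argmax finds no match and returns 0, so the first column is silently selected for that row. Checking mask.sum(axis=1) >= N beforehand guards against that.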

Numpy array loses shape after applying mask across axis

Problem
I have an np.array and a mask of the same shape. Once I apply the mask, the array loses its shape and becomes a flattened 1D array.
Question
I want to reduce my array along some axis, based on a 1D mask whose length matches that axis.
How can I apply a mask, but keep dimensionality of the array?
Example
A small example in code:
# data ...
>>> data = np.ones((4, 4))
>>> data.shape
(4, 4)
# mask ...
>>> mask = np.ones((4, 4), dtype=bool)
>>> mask.shape
(4, 4)
# apply mask ...
>>> data[mask].shape
(16,)
My ideal shape would be (4, 4).
An example with array dimension reduction across an axis:
# data, mask ...
>>> data = np.ones((4, 4))
>>> mask = np.ones((4, 4), dtype=bool)
# remove last column from data ...
>>> mask[:, 3] = False
>>> mask
array([[ True, True, True, False],
[ True, True, True, False],
[ True, True, True, False],
[ True, True, True, False]])
# equivalent mask in 1D ...
>>> mask[0]
array([ True, True, True, False])
# apply mask ...
>>> data[mask].shape
(12,)
The ideal dimensions of the array would be (4, 3) without reshape.
Help is appreciated, thanks!
The 'correct' way of achieving your goal is to not expand the mask to 2D. Instead index with [:, mask] with the 1D mask. This indicates to numpy that you want axis 0 unchanged and mask applied along axis 1.
a = np.arange(12).reshape(3, 4)
b = np.array((1,0,1,0),'?')
a
# array([[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11]])
b
# array([ True, False, True, False])
a[:, b]
# array([[ 0, 2],
# [ 4, 6],
# [ 8, 10]])
If your mask is already 2D, numpy won't check whether all its rows are the same because that would be inefficient. But obviously you can use [:, mask[0]] in that case.
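Applied to the 4x4 example from the question, that could look like the sketch below. The explicit check that all rows of the mask are identical is an extra safety step added here, not something numpy does for you.

```python
import numpy as np

data = np.ones((4, 4))
mask = np.ones((4, 4), dtype=bool)
mask[:, 3] = False                 # drop the last column

# Only valid if every row of the 2D mask is the same.
assert (mask == mask[0]).all()

reduced = data[:, mask[0]]         # 1D mask applied along axis 1
print(reduced.shape)
```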
If your mask is 2D and just happens to have the same number of Trues in each row, then either use @tel's answer or create an index array:
B = b^b[:3, None]
B
# array([[False, True, False, True],
# [ True, False, True, False],
# [False, True, False, True]])
J = np.where(B)[1].reshape(len(B), -1)
And now either
np.take_along_axis(a, J, 1)
# array([[ 1, 3],
# [ 4, 6],
# [ 9, 11]])
or
I = np.arange(len(J))[:, None]
IJ = I, J
a[IJ]
# array([[ 1, 3],
# [ 4, 6],
# [ 9, 11]])
I believe what you want can be done by calling new_data.reshape(837, -1). Here's a brief example:
arr = np.arange(8*6).reshape(8,6)
maskpiece = np.array([True, False]*3)
mask = np.broadcast_to(maskpiece, (8,6))
print('the original array\n%s\n' % arr)
print('the flat masked array\n%s\n' % arr[mask])
print('the masked array reshaped into 2D\n%s\n' % arr[mask].reshape(8, -1))
Output:
the original array
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]
[24 25 26 27 28 29]
[30 31 32 33 34 35]
[36 37 38 39 40 41]
[42 43 44 45 46 47]]
the flat masked array
[ 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46]
the masked array reshaped into 2D
[[ 0 2 4]
[ 6 8 10]
[12 14 16]
[18 20 22]
[24 26 28]
[30 32 34]
[36 38 40]
[42 44 46]]
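The reshape only works when every row keeps the same number of elements. A minimal sketch of a guard for that condition follows; reshape_masked_rows is a hypothetical helper name introduced here for illustration, not a numpy function.

```python
import numpy as np

def reshape_masked_rows(arr, mask):
    """Hypothetical helper: boolean-index a 2D array and restore its rows."""
    counts = mask.sum(axis=1)
    # A ragged selection cannot be represented as a regular 2D array.
    if not (counts == counts[0]).all():
        raise ValueError("rows select different numbers of elements")
    return arr[mask].reshape(arr.shape[0], int(counts[0]))

arr = np.arange(8 * 6).reshape(8, 6)
mask = np.broadcast_to(np.array([True, False] * 3), (8, 6))
out = reshape_masked_rows(arr, mask)
print(out.shape)
```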

Selecting vector of 2D array elements from column index vector

I have a 2D array A:
28 39 52
77 80 66
7 18 24
9 97 68
And a vector array of column indexes B:
1
0
2
0
How, in a Pythonic way, using base Python or Numpy, can I select the elements from A which DO NOT correspond to the column indexes in B?
I should get this 2D array which contains the elements of A, Not corresponding to the column indexes stored in B:
28 52
80 66
7 18
97 68
You can make use of broadcasting and a row-wise mask to select elements not contained in your array for each row:
Setup
import numpy as np
A = np.array([[28, 39, 52], [77, 80, 66], [7, 18, 24], [9, 97, 68]])
B = np.array([1, 0, 2, 0])
cols = np.arange(A.shape[1])
Now use broadcasting to create a mask, and index your array.
mask = B[:, None] != cols
A[mask].reshape(-1, 2)
array([[28, 52],
[80, 66],
[ 7, 18],
[97, 68]])
A spin off of my answer to your other question,
Replace 2D array elements with zeros, using a column index vector
We can make a boolean mask with the same indexing used before:
In [124]: mask = np.ones(A.shape, dtype=bool)
In [126]: mask[np.arange(4), B] = False
In [127]: mask
Out[127]:
array([[ True, False, True],
[False, True, True],
[ True, True, False],
[False, True, True]])
Indexing an array with a boolean mask produces a 1d array, since in the most general case such a mask could select a different number of elements in each row.
In [128]: A[mask]
Out[128]: array([28, 52, 80, 66, 7, 18, 97, 68])
In this case the result can be reshaped back to 2d:
In [129]: A[mask].reshape(4,2)
Out[129]:
array([[28, 52],
[80, 66],
[ 7, 18],
[97, 68]])
Since you allowed for 'base Python', here's a list comprehension answer:
In [136]: [[y for i,y in enumerate(x) if i!=b] for b,x in zip(B,A)]
Out[136]: [[28, 52], [80, 66], [7, 18], [97, 68]]
If all the 0's in the other A come from the insertion, then we can also get the mask (Out[127]) with
In [142]: A!=0
Out[142]:
array([[ True, False, True],
[False, True, True],
[ True, True, False],
[False, True, True]])

Mask minimum values in matrix rows

I have this 3x3 matrix:
a = np.array([[ 1, 11, 5],
[ 3, 9, 9],
[ 5, 7, -3]])
I need to mask the minimum values in each row in order to calculate the mean of each row discarding the minimum values. Is there a general solution?
I have tried with
a_masked=np.ma.masked_where(a==np.ma.min(a,axis=1),a)
Which masks the minimum value in first and third row, but not the second row?
I would appreciate any help. Thanks!
The issue is that the comparison a == a.min(axis=1) compares the elements of each row against the full vector of row minima, position by position, instead of comparing every element of a row against that row's own minimum. This happens because a.min(axis=1) returns a 1D vector of shape (N,), and under broadcasting a 1D vector is aligned with the last axis, so it behaves like a 1 x N row rather than an N x 1 column.
a == a.min(axis=1)
# array([[ True, False, False],
# [False, False, False],
# [False, False, True]], dtype=bool)
One potential way to fix this is to resize the result of a.min(axis=1) into a column vector (e.g. a 3 x 1 2D array).
a == np.resize(a.min(axis=1), [a.shape[0],1])
# array([[ True, False, False],
# [ True, False, False],
# [False, False, True]], dtype=bool)
Or more simply, as @ColonelBeuvel has shown:
a == a.min(axis=1)[:,None]
Now applying this to your entire line of code.
a_masked = np.ma.masked_where(a == np.resize(a.min(axis=1),[a.shape[0],1]), a)
# masked_array(data =
# [[-- 11 5]
# [-- 9 9]
# [5 7 --]],
# mask =
# [[ True False False]
# [ True False False]
# [False False True]],
# fill_value = 999999)
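Since the stated goal is the mean of each row without its minimum, the masked array can be reduced directly. The sketch below uses keepdims=True as an alternative way to get the column-shaped minima; that choice is made here and is not part of the answer above.

```python
import numpy as np

a = np.array([[1, 11, 5],
              [3, 9, 9],
              [5, 7, -3]])

# keepdims=True keeps the minima as a (3, 1) column, so the comparison
# broadcasts row by row.
row_min = a.min(axis=1, keepdims=True)
a_masked = np.ma.masked_where(a == row_min, a)

# Masked entries are ignored by the mean.
row_means = a_masked.mean(axis=1)
print(row_means)
```

Keep in mind that if a row contains its minimum more than once, every occurrence is masked, so all of them are left out of that row's mean.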
What about the built-in min() function?
For every row, min(row) gives you the minimum of that row. Simply append each row's minimum to a list:
minList = []
for row in a:
    minList.append(min(row))

Getting a grid of a matrix via logical indexing in Numpy

I'm trying to rewrite a function using numpy which is originally in MATLAB. There's a logical indexing part which is as follows in MATLAB:
X = reshape(1:16, 4, 4).';
idx = [true, false, false, true];
X(idx, idx)
ans =
1 4
13 16
When I try to make it in numpy, I can't get the correct indexing:
X = np.arange(1, 17).reshape(4, 4)
idx = [True, False, False, True]
X[idx, idx]
# Output: array([6, 1, 1, 6])
What's the proper way of getting a grid from the matrix via logical indexing?
You could also write:
>>> X[np.ix_(idx,idx)]
array([[ 1, 4],
[13, 16]])
In [1]: X = np.arange(1, 17).reshape(4, 4)
In [2]: idx = np.array([True, False, False, True]) # note that here idx has to
# be an array (not a list)
# or boolean values will be
# interpreted as integers
In [3]: X[idx][:,idx]
Out[3]:
array([[ 1, 4],
[13, 16]])
In numpy this is called fancy indexing. To get the items you want you should use a 2D array of indices.
You can use an outer product to build a proper 2D boolean mask from your 1D idx. Applied to two 1D sequences, an outer operation combines each element of one sequence with each element of the other. Recalling that True*True=True and False*True=False, np.multiply.outer(), which for 1D inputs gives the same result as np.outer(), can give you the 2D indices:
idx_2D = np.outer(idx,idx)
#array([[ True, False, False, True],
# [False, False, False, False],
# [False, False, False, False],
# [ True, False, False, True]], dtype=bool)
Which you can use:
X[idx_2D]
array([ 1, 4, 13, 16])
In your real code you can write X[np.outer(idx, idx)] directly, but it does not save memory; it works the same as if you had added a del idx_2D after the indexing.
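For comparison, here is a short sketch that runs the np.ix_ and outer-mask approaches side by side. It assumes idx is a boolean numpy array. The 2D boolean mask returns a flat result, so a reshape is needed to recover the grid.

```python
import numpy as np

X = np.arange(1, 17).reshape(4, 4)
idx = np.array([True, False, False, True])

# np.ix_ builds an open mesh, so the result keeps its 2D grid shape.
grid = X[np.ix_(idx, idx)]

# A 2D boolean mask always returns a flat 1D result.
flat = X[np.outer(idx, idx)]
n = int(idx.sum())
reshaped = flat.reshape(n, n)

print(grid)
print(flat)
```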
