Index into NumPy array ignoring NaNs in the indexing array

I have an array of zeros
arr = np.zeros([5,5])
array([[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]])
I want to assign values based on an index array, so I did this:
out = np.array([[np.nan,2.,4.,1.,1.],[np.nan,3.,4.,4.,4.]])
arr[out[0].astype(int),np.arange(len(out[0]))] = 1
arr[out[1].astype(int),np.arange(len(out[1]))] = 1
The assignment works fine if there is a 0 instead of nan.
How can I skip the assignment where the index is nan? And is it possible to assign all values at once from a multidimensional index array rather than using a for loop?

Mask it -
mask = ~np.isnan(out)
arr[out[0,mask[0]].astype(int),np.flatnonzero(mask[0])] = 1
arr[out[1,mask[1]].astype(int),np.flatnonzero(mask[1])] = 1
Sample run -
In [171]: out
Out[171]:
array([[ nan, 2., 4., 1., 1.],
[ nan, 3., 4., 4., 4.]])
In [172]: mask = ~np.isnan(out)
...: arr[out[0,mask[0]].astype(int),np.flatnonzero(mask[0])] = 1
...: arr[out[1,mask[1]].astype(int),np.flatnonzero(mask[1])] = 1
...:
In [173]: arr
Out[173]:
array([[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 1., 1.],
[ 0., 1., 0., 0., 0.],
[ 0., 1., 0., 0., 0.],
[ 0., 0., 1., 1., 1.]])
Alternatively, replace the flatnonzero calls with range-masking -
r = np.arange(arr.shape[1])
arr[out[0,mask[0]].astype(int),r[mask[0]]] = 1
arr[out[1,mask[1]].astype(int),r[mask[1]]] = 1
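If you are working with many more rows than just 2 and want to assign them in a vectorized manner, here's one method using linear indexing. NaN entries propagate through the arithmetic, so they can be filtered out of the flattened indices afterwards -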
n = arr.shape[1]
linear_idx = (out*n + np.arange(n))
np.put(arr, linear_idx[~np.isnan(linear_idx)].astype(int), 1)
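As a possible variation on the same idea, the mask can also drive a single fancy-indexing assignment without computing linear indices. This is only a sketch using the arr and out from the question; the names rows and cols are introduced here for illustration.

```python
import numpy as np

arr = np.zeros((5, 5))
out = np.array([[np.nan, 2., 4., 1., 1.],
                [np.nan, 3., 4., 4., 4.]])

mask = ~np.isnan(out)          # True where a row index is present
rows = out[mask].astype(int)   # row indices into arr, NaNs dropped
cols = np.nonzero(mask)[1]     # column position of each kept entry in out
arr[rows, cols] = 1            # one assignment covers every row of out
```

Both out[mask] and np.nonzero(mask) walk the mask in row-major order, so rows and cols stay paired element by element.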

Related

Pytorch index with Tensor

I have a 2-dimensional tensor arr with all entries set to 0. I have a second tensor idx. I want to set the entries of arr at the indices given in idx to 1.
arr = torch.zeros(size = (2,10))
idx = torch.Tensor([
[0,2],
[4,5]
])
arr[idx] = 1 #This doesn't work
print(arr)
The output should look like this:
tensor([[1., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 1., 0., 0., 0., 0.]])
I was confident that someone else had already asked this on SO, but I couldn't find it. I hope it isn't a duplicate.

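Use scatter() along dim=1, which is the innermost dimension in this case, i.e. dim=-1. Note that in place of the src tensor, I just passed the constant value 1. Note also that idx is built with torch.tensor, which gives the integer (int64) index tensor that scatter() requires, whereas torch.Tensor would give floats.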
In [31]: arr = torch.zeros(size=(2, 10))
In [32]: idx = torch.tensor([
...: [0, 2],
...: [4, 5]
...: ])
In [33]: torch.scatter(arr, 1, idx, 1)
Out[33]:
tensor([[1., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 1., 0., 0., 0., 0.]])
In [34]: torch.scatter(arr, -1, idx, 1)
Out[34]:
tensor([[1., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 1., 0., 0., 0., 0.]])

How to fill numpy array of zeros with ones given indices/coordinates

Given a numpy array of zeros, say
arr = np.zeros((5, 5))
and an array of indices that represent vertices of a polygon, say
verts = np.array([[0, 2], [2, 0], [2, 4]])
1) What is the elegant way of doing
for v in verts:
    arr[v[0], v[1]] = 1
such that the resulting array is
In [108]: arr
Out[108]:
array([[ 0., 0., 1., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 1., 0., 0., 0., 1.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]])
2) How can I fill the array with ones such that the output array is
In [158]: arr
Out[158]:
array([[ 0., 0., 1., 0., 0.],
[ 0., 1., 1., 1., 0.],
[ 1., 1., 1., 1., 1.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.]])

To answer the first part of your question: arr[tuple(verts.T)] = 1
verts.T transposes your indices to a (2, n) array, where the two rows correspond to the row and column dimensions of arr. These are then unpacked into a tuple of (row_indices, col_indices), which we then use to index into arr.
We could write this a bit more verbosely as:
row_indices = verts[:, 0]
col_indices = verts[:, 1]
arr[row_indices, col_indices] = 1
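To check that the two spellings agree, here is a small self-contained sketch using the arr and verts from the question; check and same are names made up for this illustration.

```python
import numpy as np

arr = np.zeros((5, 5))
verts = np.array([[0, 2], [2, 0], [2, 4]])

# Compact form: one index array per dimension, packed into a tuple
arr[tuple(verts.T)] = 1

# Verbose form, applied to a second array for comparison
check = np.zeros((5, 5))
check[verts[:, 0], verts[:, 1]] = 1

same = np.array_equal(arr, check)
print(same)
```

The tuple() call matters: arr[verts.T] without it would treat the whole 2D array as indices into the first axis only, selecting entire rows instead of individual cells.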
For the second part, one method that will work for arbitrary polygons is to use matplotlib.path.Path.contains_points:
from matplotlib.path import Path
points = np.indices(arr.shape).reshape(2, -1).T
path = Path(verts)
mask = path.contains_points(points, radius=1e-9)
mask = mask.reshape(arr.shape).astype(arr.dtype)
print(repr(mask))
# array([[ 0., 0., 1., 0., 0.],
# [ 0., 1., 1., 1., 0.],
# [ 1., 1., 1., 1., 1.],
# [ 0., 0., 0., 0., 0.],
# [ 0., 0., 0., 0., 0.]])

Issue in numpy array loop for central difference

Input array for reference,
u = array([[ 0., 0., 0., 0., 0.],
[ 0., 1., 1., 1., 0.],
[ 0., 1., 1., 1., 0.],
[ 0., 1., 1., 1., 0.],
[ 0., 0., 0., 0., 0.]])
Python function using a for loop:
import numpy as np
u = np.zeros((5,5))
u[1:-1,1:-1]=1
def cds(n):
    for i in range(1,4):
        for j in range(1,4):
            u[i,j] = u[i,j+1] + u[i,j-1] + u[i+1,j] + u[i-1,j]
    return u
Calling cds(5) with the for loop version above gives the following result:
u=array([[ 0., 0., 0., 0., 0.],
[ 0., 2., 4., 5., 0.],
[ 0., 4., 10., 16., 0.],
[ 0., 5., 16., 32., 0.],
[ 0., 0., 0., 0., 0.]])
The same function using NumPy slicing:
def cds(n):
    u[1:-1,1:-1] = u[1:-1,2:] + u[1:-1,:-2] + u[2:,1:-1] + u[:-2,1:-1]
    return u
But for the same input array u, cds(5) using NumPy gives a different result:
u=array([[ 0., 0., 0., 0., 0.],
[ 0., 2., 3., 2., 0.],
[ 0., 3., 4., 3., 0.],
[ 0., 2., 3., 2., 0.],
[ 0., 0., 0., 0., 0.]])
The reason is that the Python for loop writes each updated u[i,j] back into the existing u array while looping, so later iterations see the new values, whereas the NumPy version computes the whole right-hand side from the old values before assigning anything.
I want the same result from NumPy as from the for loop.
Is there any way to solve this in NumPy? Thanks in advance.
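The difference described above can be reproduced side by side. This sketch only restates the two versions from the question, with one change: each function takes the array as an argument and works on a copy rather than on a global. The names cds_loop and cds_sliced are made up for this illustration. It shows the discrepancy and is not a vectorized replacement for the loop.

```python
import numpy as np

def cds_loop(u):
    """In-place sweep: each update sees neighbours already updated in this pass."""
    u = u.copy()
    for i in range(1, u.shape[0] - 1):
        for j in range(1, u.shape[1] - 1):
            u[i, j] = u[i, j + 1] + u[i, j - 1] + u[i + 1, j] + u[i - 1, j]
    return u

def cds_sliced(u):
    """Sliced update: the right-hand side is built entirely from the old values."""
    u = u.copy()
    u[1:-1, 1:-1] = u[1:-1, 2:] + u[1:-1, :-2] + u[2:, 1:-1] + u[:-2, 1:-1]
    return u

u0 = np.zeros((5, 5))
u0[1:-1, 1:-1] = 1

loop_result = cds_loop(u0)
sliced_result = cds_sliced(u0)
print(loop_result)
print(sliced_result)
```

Because every cell in the loop version depends on cells updated earlier in the same sweep, the two functions compute different quantities rather than the same one in different ways.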

Sparse Construct: Repeating Identity

Say I have, with ij being large (e.g. 5000), the following two matrices
E = np.identity((ij))
oneVector = np.ones((1, ij))
and I need to compute
np.kron(E, oneVector)
This is quite slow and inefficient. Basically, the Kronecker product of identity and a row vector of ones is repeating the identity matrix horizontally oneVector.size times.
I believe that creating a sparse product would make more sense. scipy.sparse.kron would allow me to create that product if I had both A, B as sparse. But I don't know how to create the vector of ones as a "sparse type" matrix.
Is there a simple way to generate the sparse equivalent of np.ones() or is there another way I should proceed?

The arguments to scipy.sparse.kron do not have to be sparse.
In [31]: import numpy as np
In [32]: import scipy.sparse as sp
In [33]: ij = 4
In [34]: E = sp.identity(ij) # Sparse identity matrix
In [35]: oneVector = np.ones((1, ij)) # Dense
In [36]: m = sp.kron(E, oneVector) # m is sparse.
In [37]: m
Out[37]:
<4x16 sparse matrix of type '<type 'numpy.float64'>'
with 16 stored elements (blocksize = 1x4) in Block Sparse Row format>
In [38]: m.toarray()
Out[38]:
array([[ 1., 1., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1.]])
P.S. Based on this comment:
Basically, the Kronecker product of identity and a row vector of ones is repeating the identity matrix horizontally oneVector.size times.
I wonder if you meant kron(oneVector, E):
In [39]: m = sp.kron(oneVector, E)
In [40]: m.toarray()
Out[40]:
array([[ 1., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 0.],
[ 0., 1., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0.],
[ 0., 0., 1., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 0., 1.]])
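To make the effect of the argument order concrete, here is a small sketch that compares both sparse products against dense NumPy equivalents. It assumes a small n so the dense arrays stay cheap; the variable names are chosen for this illustration only.

```python
import numpy as np
import scipy.sparse as sp

n = 4
E = sp.identity(n)            # sparse identity
one_vector = np.ones((1, n))  # dense row vector of ones

kron_E_ones = sp.kron(E, one_vector)   # each column of E repeated n times
kron_ones_E = sp.kron(one_vector, E)   # E tiled horizontally n times

# Dense references built with plain NumPy
repeated = np.repeat(np.eye(n), n, axis=1)
tiled = np.tile(np.eye(n), (1, n))

print(np.array_equal(kron_E_ones.toarray(), repeated))
print(np.array_equal(kron_ones_E.toarray(), tiled))
```

If the goal is the identity repeated side by side, the second order, kron(oneVector, E), is the one that matches that description.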

multiple condition in fancy indexing

I am new to Python and am trying to do some simple classification on a raster image.
Basically, I am reading a TIF image as a 2D array and doing some calculations and manipulation on it. For the classification part, I am trying to create 3 empty arrays for land, water, and clouds. Cells in these arrays will be assigned a value of 1 under multiple conditions, and the classes will eventually be labeled landclass=1, waterclass=2, cloudclass=3 respectively.
Apparently I can assign 1 to all values in an array that meet one condition, like this:
crop = gdal.Open(crop,GA_ReadOnly)
crop = crop.ReadAsArray()
rows,cols = crop.shape
mode = int(stats.mode(crop, axis=None)[0])
water = np.empty(shape=(rows,cols),dtype=np.float32)
land = water  # note: this binds the same array object as water, not a copy
clouds = water  # same array object again
Then I have something like this (output):
>>> land
array([[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
...,
[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.],
[ 0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
>>> land[water==0]=1
>>> land
array([[ 0., 0., 0., ..., 1., 1., 1.],
[ 0., 0., 0., ..., 1., 1., 1.],
[ 0., 0., 0., ..., 1., 1., 1.],
...,
[ 1., 1., 1., ..., 0., 0., 0.],
[ 1., 1., 1., ..., 0., 0., 0.],
[ 1., 1., 1., ..., 0., 0., 0.]], dtype=float32)
>>> land[crop>mode]=1
>>> land
array([[ 0., 0., 0., ..., 1., 1., 1.],
[ 0., 0., 0., ..., 1., 1., 1.],
[ 0., 0., 0., ..., 1., 1., 1.],
...,
[ 1., 1., 1., ..., 0., 0., 0.],
[ 1., 1., 1., ..., 0., 0., 0.],
[ 1., 1., 1., ..., 0., 0., 0.]], dtype=float32)
But how can I set the values in "land" to 1 under a combination of conditions without altering the shape of the array?
I tried to do this
land[water==0,crop>mode]=1
and I got a ValueError. Then I tried this
land[water==0 and crop>mode]=1
and Python asks me to use a.any() or a.all().
For only one condition, the result is exactly what I want, but I have to apply the conditions one at a time to get the result, e.g. (this is what I have in my actual code):
water[band6 < b6_threshold]=1
water[band7 < b7_threshold_1]=1
water[band6 > b6_threshold]=1
water[band7 < b7_threshold_2]=1
land[band6 > b6_threshold]=1
land[band7 > b7_threshold_2]=1
land[clouds == 1]=1
land[water == 1]=1
land[b1b4 < 0.5]=1
land[band3 < 0.1]=1
clouds[land == 0]=1
clouds[water == 0]=1
clouds[band6 < (b6_mode-4)]=1
I find this a bit confusing and would like to combine all the conditions within one statement. Any suggestions?
Thank you very much!

You can multiply the boolean arrays for something like "and":
>>> import numpy as np
>>> a = np.array([1,2,3,4])
>>> a[(a > 1) * (a < 3)] = 99
>>> a
array([ 1, 99, 3, 4])
And you can add them for something like "or":
>>> a[(a > 1) + (a < 3)] = 123
>>> a
array([123, 123, 123, 123])
Alternatively, if you prefer to think of boolean logic rather than True and False being 0 and 1, you can also use the operators & and | to the same effect.
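A minimal sketch of the & and | form, reusing the small array from above; the names both, either, b and c are introduced here for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# Parentheses are required: & and | bind more tightly than < and >
both = (a > 1) & (a < 3)     # elementwise "and"
either = (a > 3) | (a < 2)   # elementwise "or"

b = a.copy()
b[both] = 99
print(b)

c = a.copy()
c[either] = 0
print(c)
```

The same pattern would apply to the 2D case in the question, for example a single mask combining a condition on water with a condition on crop, as long as the arrays share a shape.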
