Find the lowest non-masked point with numpy efficiently

The application here is finding the "cloud base", but the principles apply anywhere. I have a numpy masked 3-D array (say it corresponds to a 3-D grid box with dimensions z, y, x), where I have masked out all points with a value less than 0.1. For every x,y point, I want to find the lowest z index that is not masked out (the smallest z coordinate, not the lowest value along z). I can think of a few trivial ways to do it, e.g.:
for x points:
    for y points:
        minz=-1
        for z points:
            if x,y,z is not masked:
                minz = z
                break
However, this seems really inefficient and I'm sure that there is a more efficient or more pythonic way to do this. What am I missing here?
Edit: I do not need to use masked arrays, but it seemed like the easiest way to ask the question. I can instead find the lowest point that meets a certain threshold without using masked arrays.
Edit 2: Idea for what I'm looking for (taking z=0 to be the lowest point):
input:
[[[0,1],
[1,5]],
[[3,3],
[2,4]],
[[2,1],
[4,9]]]
threshold: val >=3
output:
[[1,1],
[2,0]]

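Assuming A is the input array and thresh is the threshold, you could do the following. argmax on a boolean array returns the index of the first True along the axis, and the all(0) check maps columns with no value at or above the threshold to -1 -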
np.where((A < thresh).all(0),-1,(A >= thresh).argmax(0))
Sample runs
Run #1:
In [87]: A
Out[87]:
array([[[0, 1],
[1, 5]],
[[3, 3],
[2, 4]],
[[2, 1],
[4, 9]]])
In [88]: thresh = 3
In [89]: np.where((A < thresh).all(0),-1,(A >= thresh).argmax(0))
Out[89]:
array([[1, 1],
[2, 0]])
Run #2:
In [82]: A
Out[82]:
array([[[17, 1, 2, 3],
[ 5, 13, 11, 2],
[ 9, 16, 11, 19],
[11, 16, 6, 3],
[15, 9, 14, 14]],
[[18, 19, 5, 8],
[13, 13, 17, 2],
[17, 12, 16, 0],
[19, 14, 12, 5],
[ 7, 8, 4, 7]],
[[10, 12, 11, 2],
[10, 18, 6, 15],
[ 4, 16, 0, 16],
[16, 18, 2, 1],
[10, 19, 9, 4]]])
In [83]: thresh = 10
In [84]: np.where((A < thresh).all(0),-1,(A >= thresh).argmax(0))
Out[84]:
array([[ 0, 1, 2, -1],
[ 1, 0, 0, 2],
[ 1, 0, 0, 0],
[ 0, 0, 1, -1],
[ 0, 2, 0, 0]])
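The one-liner above can be wrapped into small helpers, including a variant for the masked-array form the question started with. This is only a sketch: the function names and the fill parameter are made up for illustration, and it assumes z is axis 0 with index 0 as the lowest level.

```python
import numpy as np

def lowest_index_meeting_threshold(a, thresh, fill=-1):
    """For each (y, x) column, return the first z index with a >= thresh, else fill."""
    hit = np.asarray(a) >= thresh
    first = hit.argmax(axis=0)  # index of first True along z (0 if there is none)
    return np.where(hit.any(axis=0), first, fill)

def lowest_unmasked_index(masked, fill=-1):
    """Same idea for a numpy masked array: first z index that is not masked."""
    valid = ~np.ma.getmaskarray(masked)
    return np.where(valid.any(axis=0), valid.argmax(axis=0), fill)

A = np.array([[[0, 1], [1, 5]],
              [[3, 3], [2, 4]],
              [[2, 1], [4, 9]]])

by_threshold = lowest_index_meeting_threshold(A, 3)
by_mask = lowest_unmasked_index(np.ma.masked_less(A, 3))
```

Both calls should give the same 2-D index array for this input, since masking values below 3 and thresholding at val >= 3 describe the same set of valid points.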

Related

splitting an array where it meets the peak values

hope doing well.
I have an extremely big numpy array and want to split it into several smaller ones. My array has three columns, and I want to split it wherever all the columns reach their maximum values:
array = [[0, 0, 0],
[0, 0, 5],
[10, 5, 10],
[1, 1, 1],
[5, 5, 15],
[10, 8, 20],
[2, 0, 0],
[10, 10, 12],
[1, 2, 0],
[2, 5, 9]]
Now, I want to split it into four arrays:
sub_array_1=[[0, 0, 0],
[0, 0, 5],
[10, 5, 10]]
sub_array_2=[[1, 1, 1],
[5, 5, 15],
[10, 8, 20]]
sub_array_3=[[2, 0, 0],
[10, 10, 12]]
sub_array_4=[[1, 2, 0],
[2, 5, 9]]
I tried to do it in a for loop with if statements that give me a sub-array whenever each element of a row is bigger than the corresponding element in both the row above and the row below. I also need to handle the last row:
import numpy as np
sub_array_1=np.array([])
for i in array:
    if array[i,:]>array[i+1,:] and array[i,:]>array[i+1,:]:
        vert_1=np.append(sub_array_1,array[0:i,:])
My code doesn't work, but it simply shows my idea.
I am quite new in Python and I could not find the way to write my idea as a code. So, I appreciate any help and contribution.
Cheers,
Ali

IIUC, one way using numpy.diff with numpy.array_split:
indices = np.argwhere(np.all(np.diff(array, axis=0) < 0, axis=1))
np.array_split(array, indices.ravel()+1, axis=0)
Output:
[array([[ 0, 0, 0],
[ 0, 0, 5],
[10, 5, 10]]),
array([[ 1, 1, 1],
[ 5, 5, 15],
[10, 8, 20]]),
array([[ 2, 0, 0],
[10, 10, 12]]),
array([[1, 2, 0],
[2, 5, 9]])]
np.diff computes the difference between each row and the next one, and np.all flags the rows where every column decreases in the next row (i.e. where a peak ends).
np.array_split then splits the array right after each of those peak rows, which is why 1 is added to the indices.
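For reference, here is the same approach as a self-contained sketch. The function name split_after_peaks is made up for illustration, and np.flatnonzero is used in place of argwhere(...).ravel(), which should be equivalent for this 1-D boolean mask.

```python
import numpy as np

def split_after_peaks(arr):
    """Split rows after every row whose successor is smaller in all columns."""
    arr = np.asarray(arr)
    # True at position i when row i+1 is strictly below row i in every column
    drops = np.all(np.diff(arr, axis=0) < 0, axis=1)
    cut_points = np.flatnonzero(drops) + 1  # cut after the peak row
    return np.split(arr, cut_points, axis=0)

array = [[0, 0, 0], [0, 0, 5], [10, 5, 10],
         [1, 1, 1], [5, 5, 15], [10, 8, 20],
         [2, 0, 0], [10, 10, 12],
         [1, 2, 0], [2, 5, 9]]

parts = split_after_peaks(array)
```

Note that a row only counts as a peak here if all three columns drop in the next row; a drop in just one or two columns does not trigger a split.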

Tensor Entry Selection Logic Divergence in PyTorch & Numpy

Description
I'm setting up a torch.Tensor for masking purposes. When selecting entries by index, it turns out that the behavior differs depending on whether the index data is held in a numpy.ndarray or a torch.Tensor. I would like to understand the design in both frameworks and find any documentation that explains the difference.
Steps to replicate
Environment
Pytorch 1.3 in container from official release: pytorch/pytorch:1.3-cuda10.1-cudnn7-devel
Example
Say I need to set up mask as torch.Tensor object with shape [3,3,3] and set values at entries (0,0,1) & (1,2,0) to 1. The code below explains the difference.
mask = torch.zeros([3,3,3])
indices = torch.tensor([[0, 1],
[0, 2],
[1, 0]])
mask[indices.numpy()] = 1 # Works
# mask[indices] = 1 # Incorrect result
I noticed that mask[indices.numpy()] returns a new torch.Tensor of shape [2], while mask[indices] returns a new torch.Tensor of shape [3, 2, 3, 3], which suggests a difference in the tensor indexing logic.

You get different results because that's how indexing is implemented in Pytorch. If you pass an array as index, then it gets "unpacked". For example:
indices = torch.tensor([[0, 1], [0, 2], [1, 0]])
mask = torch.arange(1,28).reshape(3,3,3)
# tensor([[[ 1, 2, 3],
# [ 4, 5, 6],
# [ 7, 8, 9]],
# [[10, 11, 12],
# [13, 14, 15],
# [16, 17, 18]],
# [[19, 20, 21],
# [22, 23, 24],
# [25, 26, 27]]])
The mask[indices.numpy()] is equivalent to mask[[0, 1], [0, 2], [1, 0]], i.e. the elements of the i-th row of indices.numpy() are used to select elements of mask along i-th axis. So it returns tensor([mask[0,0,1], mask[1,2,0]]), i.e. tensor([2, 16]).
On the other hand, when passing a tensor as index (I don't know the exact reason for this differentiation between arrays and tensors for indexing), it is not "unpacked" like an array, and the elements of the i-th row of the indices tensor are used for selecting the elements of mask along the axis-0. That is, mask[indices] is equivalent to mask[[[0, 1], [0, 2], [1, 0]], :, :]
>>> mask[indices]
tensor([[[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[10, 11, 12],
[13, 14, 15],
[16, 17, 18]]],
[[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[19, 20, 21],
[22, 23, 24],
[25, 26, 27]]],
[[[10, 11, 12],
[13, 14, 15],
[16, 17, 18]],
[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]]]])
which is basically torch.stack([mask[[0,1], :, :], mask[[0,2], :, :], mask[[1,0], :, :]]) and has shape indices.shape + mask[0,:,:].shape == (3,2,3,3). So whole "sheets" are selected and stacked into new dimensions. Note that reading mask[indices] produces a new tensor, but assigning through it, as in mask[indices] = 1, writes into mask itself. Therefore, with this particular indices, that assignment sets every element of mask to 1, because indices contains 0, 1 and 2, i.e. every sheet along axis 0.
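The two interpretations described above can be reproduced in plain NumPy by making the choice explicit, which avoids relying on how a given library version treats the index object. This is a NumPy-only sketch rather than PyTorch code; the variable names are made up for illustration. In PyTorch, writing the per-axis form out as mask[indices[0], indices[1], indices[2]] = 1 should express the same "one index row per axis" intent without the round trip through numpy.

```python
import numpy as np

values = np.arange(1, 28).reshape(3, 3, 3)
idx = np.array([[0, 1],
                [0, 2],
                [1, 0]])

# "Unpacked" form: row i of idx indexes axis i, so two single entries are picked
picked = values[tuple(idx)]  # entries (0, 0, 1) and (1, 2, 0)

# Array form: every element of idx selects a whole sheet along axis 0
sheets = values[idx]

# The same distinction applies to assignment
mask_points = np.zeros((3, 3, 3))
mask_points[tuple(idx)] = 1  # sets exactly two entries

mask_sheets = np.zeros((3, 3, 3))
mask_sheets[idx] = 1  # sets sheets 0, 1 and 2, i.e. everything
```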

How to create a 2D array of ranges using numpy

I have an array of start and stop indices, like this:
[[0, 3], [4, 7], [15, 18]]
and I would like to construct a 2D numpy array where each row is a range from the corresponding pair of start and stop indices, as follows:
[[0, 1, 2],
[4, 5, 6],
[15, 16, 17]]
Currently, I am creating an empty array and filling it in a for loop:
ranges = numpy.empty((3, 3))
a = [[0, 3], [4, 7], [15, 18]]
for i, r in enumerate(a):
    ranges[i] = numpy.arange(r[0], r[1])
Is there a more compact and (more importantly) faster way of doing this? Possibly something that doesn't involve using a loop?

One way is to use broadcasting to add the left-hand edges to a base arange:
In [11]: np.arange(3) + np.array([0, 4, 15])[:, None]
Out[11]:
array([[ 0, 1, 2],
[ 4, 5, 6],
[15, 16, 17]])
Note: this requires all ranges to be the same length.
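Starting from the (start, stop) pairs in the question rather than hard-coded left edges, the broadcasting idea might look like the sketch below. The function name equal_length_ranges is made up for illustration, and the equal-length check is an added safeguard, not part of the original answer.

```python
import numpy as np

def equal_length_ranges(pairs):
    """Build a 2D array whose rows are arange(start, stop) for equal-length pairs."""
    pairs = np.asarray(pairs)
    starts, stops = pairs[:, 0], pairs[:, 1]
    lengths = stops - starts
    if not (lengths == lengths[0]).all():
        raise ValueError("all ranges must have the same length")
    # (n, 1) column of starts plus (length,) row of offsets broadcasts to (n, length)
    return starts[:, None] + np.arange(lengths[0])

ranges = equal_length_ranges([[0, 3], [4, 7], [15, 18]])
```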

If the ranges were to result in different lengths, for a vectorized approach you could use n_ranges from the linked solution:
a = np.array([[0, 3], [4, 7], [15, 18]])
n_ranges(a[:,0], a[:,1], return_flat=False)
# [array([0, 1, 2]), array([4, 5, 6]), array([15, 16, 17])]
Which would also work with the following array:
a = np.array([[0, 3], [4, 9], [15, 18]])
n_ranges(*a.T, return_flat=False)
# [array([0, 1, 2]), array([4, 5, 6, 7, 8]), array([15, 16, 17])]
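The n_ranges helper comes from a linked answer that is not reproduced here. As a rough stand-in, the sketch below shows one possible way to vectorize ranges of different lengths with np.repeat and np.cumsum. The function name variable_ranges is hypothetical and this is not the linked implementation; it assumes every stop is greater than or equal to its start.

```python
import numpy as np

def variable_ranges(starts, stops):
    """Return a list of arange(start, stop) arrays without a Python-level loop."""
    starts = np.asarray(starts)
    stops = np.asarray(stops)
    lengths = stops - starts
    ends = np.cumsum(lengths)
    offsets = ends - lengths  # position of each range's first element in the flat output
    # Position of every output element inside its own range: 0, 1, 2, ...
    within = np.arange(lengths.sum()) - np.repeat(offsets, lengths)
    flat = np.repeat(starts, lengths) + within
    return np.split(flat, ends[:-1])

a = np.array([[0, 3], [4, 9], [15, 18]])
result = variable_ranges(a[:, 0], a[:, 1])
```

The flat array is built in one shot; the final np.split only creates views into it, so the per-range work stays inside NumPy.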

numpy map 2D array values

I'm trying to map values of 2D numpy array, i.e. to iterate (efficiently) over rows and append values based on row index.
One of approaches I have tried is:
source = misc.imread(fname) # Load some image
img = np.array(source, dtype=np.float64) / 255 # Cast and normalize values
w, h, d = tuple(img.shape) # Get dimensions
img = np.reshape(img, (w * h, d)) # Flatten 3D to 2D
# The actual problem:
# Map (R, G, B) pixels to (R, G, B, X, Y) to preserve position
img_data = ((px[0], px[1], px[2], idx % w, int(idx // w)) for idx, px in enumerate(img))
img_data = np.fromiter(img_data, dtype=tuple) # Get back to np.array
but the solution raises: ValueError: cannot create object arrays from iterator
Can anyone suggest how to perform this absurdly simple operation efficiently in numpy? It's beyond me how intricate this library is... And why does that code consume a few gigabytes of memory for a 7000x5000 px image?
Thanks

Maybe np.concatenate and np.indices (shown here on a small 4x5 array with 2 channels instead of an image):
np.concatenate((np.arange(40).reshape((4,5,2)), *np.indices((4,5,1))), axis=-1)[:,:,:-1]
Out[264]:
array([[[ 0, 1, 0, 0],
[ 2, 3, 0, 1],
[ 4, 5, 0, 2],
[ 6, 7, 0, 3],
[ 8, 9, 0, 4]],
[[10, 11, 1, 0],
[12, 13, 1, 1],
[14, 15, 1, 2],
[16, 17, 1, 3],
[18, 19, 1, 4]],
[[20, 21, 2, 0],
[22, 23, 2, 1],
[24, 25, 2, 2],
[26, 27, 2, 3],
[28, 29, 2, 4]],
[[30, 31, 3, 0],
[32, 33, 3, 1],
[34, 35, 3, 2],
[36, 37, 3, 3],
[38, 39, 3, 4]]])
The [:,:,:-1] slice strips the extra all-zero entry that np.indices produces for the size-1 third axis; there may be a cleaner way.
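One possibly cleaner variant is to ask np.indices for only the two spatial axes and stack the resulting grids onto the channels with np.dstack, so there is nothing to strip afterwards. This is a sketch under a few assumptions: the function name append_pixel_coords is made up, X is taken to mean the column index and Y the row index, and the image has shape (rows, cols, channels).

```python
import numpy as np

def append_pixel_coords(img):
    """Turn (rows, cols, channels) into a (rows*cols, channels+2) array with X, Y appended."""
    rows, cols = img.shape[:2]
    row_idx, col_idx = np.indices((rows, cols))  # two (rows, cols) index grids
    # dstack promotes the 2-D grids to (rows, cols, 1) and joins along the last axis
    stacked = np.dstack((img, col_idx, row_idx))
    return stacked.reshape(rows * cols, -1)

img = np.arange(40).reshape(4, 5, 2)  # small stand-in for a real image
img_data = append_pixel_coords(img)
```

Everything stays in a single numeric array, which avoids the per-pixel Python tuples that the generator version creates.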

Merging non-overlapping array blocks

I divided a (512x512) 2-dimensional array into 2x2 blocks using this function.
skimage.util.view_as_blocks (arr_in, block_shape)
>>> A = np.arange(4*4).reshape(4, 4)
>>> A
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
>>> B = view_as_blocks(A, block_shape=(2, 2))
>>> B[0, 0]
array([[0, 1],
[4, 5]])
>>> B[0, 1]
array([[2, 3],
[6, 7]])
Now I need to put the same blocks back in their original places after manipulating them, but I couldn't find any function in skimage for that.
What's the best way to merge the non-overlapping blocks back into the original layout?
Thank you!

Use transpose/swapaxes to swap the second and third axes and then reshape to have the last two axes merged -
B.transpose(0,2,1,3).reshape(-1,B.shape[1]*B.shape[3])
B.swapaxes(1,2).reshape(-1,B.shape[1]*B.shape[3])
Sample run -
In [41]: A
Out[41]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [42]: B = view_as_blocks(A, block_shape=(2, 2))
In [43]: B
Out[43]:
array([[[[ 0, 1],
[ 4, 5]],
[[ 2, 3],
[ 6, 7]]],
[[[ 8, 9],
[12, 13]],
[[10, 11],
[14, 15]]]])
In [44]: B.transpose(0,2,1,3).reshape(-1,B.shape[1]*B.shape[3])
Out[44]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
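To try the round trip without skimage installed, the blocks can be built with a plain reshape, which for non-overlapping blocks should give the same layout as view_as_blocks. The helper names below are made up for illustration, and the sketch assumes the array dimensions are exact multiples of the block size.

```python
import numpy as np

def split_into_blocks(a, bh, bw):
    """(H, W) -> (H//bh, W//bw, bh, bw) array of non-overlapping blocks."""
    h, w = a.shape
    return a.reshape(h // bh, bh, w // bw, bw).swapaxes(1, 2)

def merge_blocks(b):
    """Inverse of split_into_blocks: (n_rows, n_cols, bh, bw) -> (n_rows*bh, n_cols*bw)."""
    n_rows, n_cols, bh, bw = b.shape
    # Bring each block's row axis next to the block-row axis, then flatten pairs of axes
    return b.swapaxes(1, 2).reshape(n_rows * bh, n_cols * bw)

A = np.arange(16).reshape(4, 4)
B = split_into_blocks(A, 2, 2)
restored = merge_blocks(B * 1)  # B * 1 stands in for some block-wise manipulation
```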

This is where you'd better use einops:
from einops import rearrange
# that's how you could rewrite view_as_blocks
B = rearrange(A, '(x dx) (y dy) -> x y dx dy', dx=2, dy=2)
# that's an answer to your question
A = rearrange(B, 'x y dx dy -> (x dx) (y dy)')
See documentation for more operations on images
