Find longest consecutive zeros in 3D-array along axis - python

How can I find the longest consecutive zeros in a 3D array along a specific axis?
import numpy as np
a = np.random.randint(2, size=(10, 10, 10))
I want to find the longest sequence of 0 along axis=0 so that I get a 10x10 array.
In one dimension it works with:
import numpy as np
a = np.random.randint(2, size=100)
condition = (a==0)
L = np.diff(np.where(np.concatenate(([condition[0]],
condition[:-1] != condition[1:],
[True])))[0])[::2]
print(np.max(L))

You could use np.cumsum() to sum up the 1s and 0s along a given dimension.
The idea is that the cumulative sum stays constant across a run of consecutive zeros. The most common value in the cumsum array therefore marks the longest run, and its count is the length of that run plus one (the extra element is the nonzero entry that starts the plateau).
import numpy as np
from scipy.stats import mode
# 1D, bincount
a = np.random.randint(2, size=10)
# array([1, 1, 1, 1, 0, 0, 0, 0, 1, 1])
# ^ ^ ^ ^
b = np.cumsum(a)
# array([1, 2, 3, 4, 4, 4, 4, 4, 5, 6])
# ^ ^ ^ ^
c = np.bincount(b)
# array([0, 1, 1, 1, 5, 1, 1])
# ^
res = np.max(c) - 1
# 4
bincount unfortunately works only for 1D arrays, so for the multidimensional case, I switch to scipy.stats.mode, which returns just the modal (most common) value and its count.
# 1D, stats.mode
c2 = mode(b)
# ModeResult(mode=array([4]), count=array([5]))
res = c2[1] - 1
# 3D, stats.mode
from scipy.stats import mode
axis = 0
a = np.random.randint(2, size=(10, 10, 10))
res = mode(np.cumsum(a, axis=axis), axis=axis)[1] - 1
# Note: on older SciPy versions the resulting shape is (1, 10, 10);
# newer releases drop the reduced axis unless keepdims=True is passed.
# If the size-1 dimension is present, np.squeeze() / np.max() removes it:
# res = res.max(axis=axis)
EDIT
As @clearseplex pointed out, I had not considered the case where the array starts with 0.
a = np.array([0, 0, 0, 1, 1, 0, 1, 0, 0, 1])
b = np.cumsum(a)
# array([0, 0, 0, 1, 2, 2, 3, 3, 3, 4])
# ^ ^ ^
There are 3 zeros, but subtracting one gives the wrong result, because the plateau at value 0 does not begin with a nonzero element. So subtract one only where the most common value is not 0:
m, res = mode(np.cumsum(a, axis=axis), axis=axis)
res[m != 0] -= 1
# res = res.max(axis=axis)  # removes the size-1 axis if your SciPy version keeps it
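The same cumsum idea can also be sketched with NumPy alone, which avoids depending on how a given SciPy version shapes the output of mode. This is a minimal sketch, not the answer's own code: longest_zero_run_1d is a hypothetical helper name, and it handles the leading-zero case by leaving the count for the plateau at value 0 untouched.

```python
import numpy as np

def longest_zero_run_1d(a):
    """Length of the longest run of zeros in a 1D array (hypothetical helper)."""
    # The cumulative count of nonzero entries is constant across a run of zeros.
    groups = np.cumsum(np.asarray(a) != 0)
    counts = np.bincount(groups)
    # Every plateau except the one at value 0 begins with a nonzero element,
    # so only those counts are reduced by one.
    counts[1:] -= 1
    return int(counts.max())

a1 = np.array([0, 0, 0, 1, 1, 0, 1, 0, 0, 1])
print(longest_zero_run_1d(a1))  # 3

# Apply the 1D helper along axis 0 of a 3D array.
a3 = np.random.default_rng(0).integers(2, size=(10, 10, 10))
res = np.apply_along_axis(longest_zero_run_1d, 0, a3)
print(res.shape)  # (10, 10)
```

Because every plateau count is corrected before taking the maximum, ties between plateaus do not need special handling here.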

To perform your task, define the following function:
def longestZeroSeqLength(a):
    # Changes in "isZero" for consecutive elements
    chg = np.abs(np.diff(np.equal(a, 0).view(np.int8), prepend=[0], append=[0]))
    # Ranges of "isZero" elements
    rng = np.where(chg == 1)[0]
    if rng.size == 0:
        return 0  # All non-zero elements
    rng = rng.reshape(-1, 2)
    # Compute length of each range and return the biggest
    return np.subtract(rng[:, 1], rng[:, 0]).max()
Then apply it to your array:
result = np.apply_along_axis(longestZeroSeqLength, 0, a)
To test it, I created the following (smaller) array:
siz = (3, 4, 5)
np.random.seed(1)
a = np.random.randint(2, size=siz)
After running my code I got:
array([[1, 0, 2, 2, 0],
       [1, 2, 1, 0, 3],
       [3, 1, 1, 1, 0],
       [1, 1, 2, 2, 3]], dtype=int64)
To make it easier to see what each slice contains and what each partial
result is, you can run:
for j in range(a.shape[1]):
    for k in range(a.shape[2]):
        b = a[:, j, k]
        res = longestZeroSeqLength(b)
        print(f'{j}, {k}: {b}, {res}')
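If calling a Python function once per 1D slice becomes slow on large arrays, a different technique is to keep a running counter that is vectorized over the other axes and only loops over the axis of interest. The sketch below is an alternative under that assumption, not the method from the answer above; longest_zero_run is a hypothetical name.

```python
import numpy as np

def longest_zero_run(a, axis=0):
    """Longest run of zeros along `axis`, vectorized over the remaining axes."""
    # Move the axis of interest to the front so we can iterate over it.
    is_zero = np.moveaxis(np.asarray(a) == 0, axis, 0)
    run = np.zeros(is_zero.shape[1:], dtype=np.intp)   # current run length
    best = np.zeros_like(run)                          # longest run so far
    for plane in is_zero:
        # Extend the run where the element is zero, reset it to 0 elsewhere.
        run = (run + 1) * plane
        best = np.maximum(best, run)
    return best

a = np.random.default_rng(1).integers(2, size=(3, 4, 5))
result = longest_zero_run(a, axis=0)
print(result.shape)  # (4, 5)
```

The loop runs a.shape[axis] times regardless of how many slices there are, so the cost per step is a handful of whole-array operations.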

Related

Creating 2D numpy array of start and end indices of "streaks" in another array.

Say I have a 1D numpy array of numbers myArray = ([1, 1, 0, 2, 0, 1, 1, 1, 1, 0, 0 ,1, 2, 1, 1, 1]).
I want to create a 2D numpy array that describes the first (column 1) and last (column 2) indices of any "streak" of consecutive 1's that is longer than 2.
So for the example above, the 2D array should look like this:
indicesArray =
([5, 8],
[13, 15])
Since there are at least 3 consecutive ones in the 5th, 6th, 7th, 8th places and in the 13th, 14th, 15th places.
Any help would be appreciated.
Approach #1
Here's one approach inspired by this post -
def start_stop(a, trigger_val, len_thresh=2):
    # "Enclose" mask with sentinels to catch shifts later on
    mask = np.r_[False, np.equal(a, trigger_val), False]
    # Get the shifting indices
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    # Get lengths
    lens = idx[1::2] - idx[::2]
    # Keep streaks longer than len_thresh; subtract 1 to make the stop index inclusive
    return idx.reshape(-1, 2)[lens > len_thresh] - [0, 1]
Sample run -
In [47]: myArray
Out[47]: array([1, 1, 0, 2, 0, 1, 1, 1, 1, 0, 0, 1, 2, 1, 1, 1])
In [48]: start_stop(myArray, trigger_val=1, len_thresh=2)
Out[48]:
array([[ 5,  8],
       [13, 15]])
Approach #2
Another with binary_erosion -
from scipy.ndimage import binary_erosion  # the scipy.ndimage.morphology namespace is deprecated
mask = binary_erosion(myArray==1,structure=np.ones((3)))
idx = np.flatnonzero(mask[1:] != mask[:-1])
out = idx.reshape(-1,2)+[0,1]
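For readers who want to see the sentinel idea from Approach #1 end to end, here is a self-contained sketch that runs on the array from the question. The function name find_runs and the min_len parameter are illustrative choices, not part of the answer; min_len=3 corresponds to "longer than 2".

```python
import numpy as np

def find_runs(a, value, min_len=3):
    """Inclusive [start, stop] indices of runs of `value` with length >= min_len."""
    # Pad the mask with False on both sides so every run has a start and an end edge.
    mask = np.concatenate(([False], np.asarray(a) == value, [False]))
    edges = np.flatnonzero(mask[1:] != mask[:-1])
    starts, stops = edges[::2], edges[1::2]      # stops are exclusive here
    keep = (stops - starts) >= min_len
    return np.column_stack((starts[keep], stops[keep] - 1))

myArray = np.array([1, 1, 0, 2, 0, 1, 1, 1, 1, 0, 0, 1, 2, 1, 1, 1])
print(find_runs(myArray, value=1))
# [[ 5  8]
#  [13 15]]
```

When no run qualifies, the result is an empty array of shape (0, 2), so callers can iterate over it without a special case.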

numpy indexing: add vector to parts of rows, starting at varying position

I have this 2D array of zeros z and this 1D array of starting points starts. In addition, I have a 1D array of offsets:
z = np.zeros(35, dtype='i').reshape(5, 7)
starts = np.array([1, 5, 3, 0, 3])
offsets = np.arange(5) + 1
I would like to vectorize this little for loop here, but I seem to be unable to do it.
for i in range(z.shape[0]):
    z[i, starts[i]:] += offsets[i]
The result in this example should look like this:
z
array([[0, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 2, 2],
[0, 0, 0, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4],
[0, 0, 0, 5, 5, 5, 5]])
We could use some masking and NumPy broadcasting -
mask = starts[:,None] <= np.arange(z.shape[1])
z[mask] = np.repeat(offsets, mask.sum(1))
We could play a trick of broadcasted multiplication to get the final output -
z = offsets[:,None] * mask
Another way would be to assign the values from offsets into every row of z and then zero out the entries outside mask, like so -
z[:] = offsets[:,None]
z[~mask] = 0
Yet another way would be to start with a replicated version of offsets as z and then mask out -
z = np.repeat(offsets,z.shape[1]).reshape(z.shape[0],-1)
z[~mask] = 0
Of course, we would need the shape parameters before-hand.
If z is not initialized as zeros array, then only one of the solutions mentioned earlier would be applicable and that would need to be updated with +=, like so -
z[mask] += np.repeat(offsets, mask.sum(1))
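To see that the broadcasted mask reproduces the loop, the following sketch builds the expected result with the original for loop and compares it with an in-place broadcasted add. The in-place form z += offsets[:, None] * mask is a small variation on the snippets above, chosen here because it also works when z does not start as all zeros.

```python
import numpy as np

starts = np.array([1, 5, 3, 0, 3])
offsets = np.arange(5) + 1

# Reference result from the explicit loop in the question.
expected = np.zeros((5, 7), dtype='i')
for i in range(expected.shape[0]):
    expected[i, starts[i]:] += offsets[i]

# Broadcasted version: mask[i, j] is True where column j >= starts[i].
z = np.zeros((5, 7), dtype='i')
mask = starts[:, None] <= np.arange(z.shape[1])
z += offsets[:, None] * mask   # adds offsets[i] only where the mask is True

print(np.array_equal(z, expected))  # True
```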

Insert sections of zeros into numpy array using zip and np.insert

I cut the zeros out of a numpy array, do some processing, and want to insert them back in for visualization purposes. I have the indices of the zero sections and tried to insert the zeros back with numpy.insert and zip, but the index runs out of bounds, even though I start at the lower end. Example:
import numpy as np
a = np.array([1, 2, 4, 0, 0, 0, 3, 6, 2, 0, 0, 1, 3, 0, 0, 0, 5])
a = a[a != 0] # cut zeros out
zero_start = [3, 9, 13]
zero_end = [5, 10, 15]
# Now insert the zeros back in using the former indices
for ev in zip(zero_start, zero_end):
    a = np.insert(a, ev[0], np.zeros(ev[1]-ev[0]))
>>> IndexError: index 13 is out of bounds for axis 0 with size 12
It seems the array size is not being refreshed inside the loop. Any suggestions or other (more Pythonic) approaches to solve this problem?
Approach #1: Using indexing -
# Get all zero indices
idx = np.concatenate([range(i,j+1) for i,j in zip(zero_start,zero_end)])
# Setup output array of zeros
N = len(idx) + len(a)
out = np.zeros(N,dtype=a.dtype)
# Get mask of non-zero places and assign values from a into those
out[~np.isin(np.arange(N),idx)] = a  # np.isin is the current replacement for np.in1d
We can also generate the actual indices where a had non-zeros originally and then assign. Thus, the last step of masking could be replaced with something like this -
out[np.setdiff1d(np.arange(N),idx)] = a
Approach #2: Using np.insert given zero_start and zero_end as arrays -
insert_start = np.r_[zero_start[0], zero_start[1:] - zero_end[:-1]-1].cumsum()
out = np.insert(a, np.repeat(insert_start, zero_end - zero_start + 1), 0)
Sample run -
In [755]: a = np.array([1, 2, 4, 0, 0, 0, 3, 6, 2, 0, 0, 1, 3, 0, 0, 0, 5])
...: a = a[a != 0] # cut zeros out
...: zero_start = np.array([3, 9, 13])
...: zero_end = np.array([5, 10, 15])
...:
In [756]: s0 = np.r_[zero_start[0], zero_start[1:] - zero_end[:-1]-1].cumsum()
In [757]: np.insert(a, np.repeat(s0, zero_end - zero_start + 1), 0)
Out[757]: array([1, 2, 4, 0, 0, 0, 3, 6, 2, 0, 0, 1, 3, 0, 0, 0, 5])
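It may also help to see why the loop in the question runs out of bounds. Since zero_end holds inclusive end indices, each section has end - start + 1 zeros, while the loop inserts one zero fewer per section, so the array is too short by the time index 13 is reached. The sketch below is the question's own loop with that length corrected, under the assumption that the sections are processed in increasing order.

```python
import numpy as np

original = np.array([1, 2, 4, 0, 0, 0, 3, 6, 2, 0, 0, 1, 3, 0, 0, 0, 5])
a = original[original != 0]        # cut zeros out
zero_start = [3, 9, 13]
zero_end = [5, 10, 15]             # inclusive end indices

out = a.copy()
for start, end in zip(zero_start, zero_end):
    # Inclusive range, so the section holds end - start + 1 zeros.
    out = np.insert(out, start, np.zeros(end - start + 1, dtype=out.dtype))

print(np.array_equal(out, original))  # True
```

For many sections the vectorized approaches above avoid reallocating the array on every iteration.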

Python: Sampling from a discrete distribution defined in an n-dimensional array

Is there a function in Python that samples from an n-dimensional numpy array and returns the indices of each draw? If not, how would one go about defining such a function?
E.g.:
>>> probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])
>>> print function(probabilities, draws = 10)
([1,1],[0,2],[1,1],[1,0],[0,1],[0,1],[1,1],[0,0],[1,1],[0,1])
I know this problem can be solved in many ways with 1-D arrays. However, I will be dealing with large n-dimensional arrays and can not afford to reshape them just to do a single draw.
You can use np.unravel_index:
a = np.random.rand(3, 4, 5)
a /= a.sum()
def sample(a, n=1):
    a = np.asarray(a)
    choices = np.prod(a.shape)
    index = np.random.choice(choices, size=n, p=a.ravel())
    # The keyword is `shape` in current NumPy (older releases called it `dims`)
    return np.unravel_index(index, shape=a.shape)
>>> sample(a, 4)
(array([2, 2, 0, 2]), array([0, 1, 3, 2]), array([2, 4, 2, 1]))
This returns a tuple of arrays, one per dimension of a, each of length the number of samples requested. If you would rather have an array of shape (samples, dimensions), change the return statement to:
    return np.column_stack(np.unravel_index(index, shape=a.shape))
And now:
>>> sample(a, 4)
array([[2, 0, 0],
[2, 2, 4],
[2, 0, 0],
[1, 0, 4]])
If your array is contiguous in memory, you can change the shape of your array in place:
probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])
nrow, ncol = probabilities.shape
idx = np.arange( nrow * ncol ) # create 1D index
probabilities.shape = ( 6, ) # this is OK because your array is contiguous in memory
samples = np.random.choice( idx, 10, p=probabilities ) # sample in 1D
rowIndex = samples // ncol # convert to 2D: row = flat index // number of columns
colIndex = samples % ncol # column = flat index % number of columns
array([1, 0, 0, 0, 1, 1, 1, 1, 1, 0])
array([1, 1, 2, 0, 1, 1, 1, 1, 1, 1])
Note that since your array is contiguous in memory, reshape returns a view as well:
In [53]:
view = probabilities.reshape( 6, -1 )
view[ 0 ] = 9
probabilities[ 0, 0 ]
Out[53]:
9.0
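Putting the pieces together, here is a hedged sketch of the unravel_index approach using NumPy's Generator API. The function name sample_indices and the rng parameter are assumptions made for illustration; for a C-contiguous array, ravel() returns a view, so no copy of the probabilities is made.

```python
import numpy as np

def sample_indices(p, n=1, rng=None):
    """Draw n index tuples from an n-dimensional probability array p (sums to 1)."""
    rng = np.random.default_rng() if rng is None else rng
    p = np.asarray(p)
    # Sample flat positions, then convert them back to multi-dimensional indices.
    flat = rng.choice(p.size, size=n, p=p.ravel())
    return np.column_stack(np.unravel_index(flat, p.shape))

probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])
draws = sample_indices(probabilities, n=10, rng=np.random.default_rng(0))
print(draws.shape)  # (10, 2): one row per draw, one column per dimension
```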

python numpy roll with padding

I'd like to roll a 2D numpy array in Python, except that I'd like to pad the ends with zeros rather than roll the data as if it's periodic.
Specifically, the following code
import numpy as np
x = np.array([[1, 2, 3], [4, 5, 6]])
np.roll(x, 1, axis=1)
returns
array([[3, 1, 2],[6, 4, 5]])
but what I would prefer is
array([[0, 1, 2], [0, 4, 5]])
I could do this with a few awkward touchups, but I'm hoping that there's a way to do it with fast built-in commands.
Thanks
NumPy 1.7.0 added a function, numpy.pad, that can do this in one line. Pad is quite powerful and can do much more than a simple "roll". The tuple ((0,0),(1,0)) used in this answer gives the (before, after) padding widths for each axis: nothing on the rows, and one column before the existing columns.
import numpy as np
x = np.array([[1, 2, 3],[4, 5, 6]])
print(np.pad(x,((0,0),(1,0)), mode='constant')[:, :-1])
Giving
[[0 1 2]
[0 4 5]]
I don't think that you are going to find an easier way to do this that is built-in. The touch-up seems quite simple to me:
y = np.roll(x,1,axis=1)
y[:,0] = 0
If you want this to be more direct then maybe you could copy the roll function to a new function and change it to do what you want. The roll() function is in the site-packages\core\numeric.py file.
I just wrote the following. It could be more optimized by avoiding zeros_like and just computing the shape for zeros directly.
import numpy as np
def roll_zeropad(a, shift, axis=None):
    """
    Roll array elements along a given axis.

    Elements off the end of the array are treated as zeros.

    Parameters
    ----------
    a : array_like
        Input array.
    shift : int
        The number of places by which elements are shifted.
    axis : int, optional
        The axis along which elements are shifted. By default, the array
        is flattened before shifting, after which the original
        shape is restored.

    Returns
    -------
    res : ndarray
        Output array, with the same shape as `a`.

    See Also
    --------
    roll : Elements that roll off one end come back on the other.
    rollaxis : Roll the specified axis backwards, until it lies in a
               given position.

    Examples
    --------
    >>> x = np.arange(10)
    >>> roll_zeropad(x, 2)
    array([0, 0, 0, 1, 2, 3, 4, 5, 6, 7])
    >>> roll_zeropad(x, -2)
    array([2, 3, 4, 5, 6, 7, 8, 9, 0, 0])
    >>> x2 = np.reshape(x, (2,5))
    >>> x2
    array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]])
    >>> roll_zeropad(x2, 1)
    array([[0, 0, 1, 2, 3],
           [4, 5, 6, 7, 8]])
    >>> roll_zeropad(x2, -2)
    array([[2, 3, 4, 5, 6],
           [7, 8, 9, 0, 0]])
    >>> roll_zeropad(x2, 1, axis=0)
    array([[0, 0, 0, 0, 0],
           [0, 1, 2, 3, 4]])
    >>> roll_zeropad(x2, -1, axis=0)
    array([[5, 6, 7, 8, 9],
           [0, 0, 0, 0, 0]])
    >>> roll_zeropad(x2, 1, axis=1)
    array([[0, 0, 1, 2, 3],
           [0, 5, 6, 7, 8]])
    >>> roll_zeropad(x2, -2, axis=1)
    array([[2, 3, 4, 0, 0],
           [7, 8, 9, 0, 0]])
    >>> roll_zeropad(x2, 50)
    array([[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]])
    >>> roll_zeropad(x2, -50)
    array([[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]])
    >>> roll_zeropad(x2, 0)
    array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]])
    """
    a = np.asanyarray(a)
    if shift == 0:
        return a
    if axis is None:
        n = a.size
        reshape = True
    else:
        n = a.shape[axis]
        reshape = False
    if np.abs(shift) > n:
        res = np.zeros_like(a)
    elif shift < 0:
        shift += n
        zeros = np.zeros_like(a.take(np.arange(n-shift), axis))
        res = np.concatenate((a.take(np.arange(n-shift,n), axis), zeros), axis)
    else:
        zeros = np.zeros_like(a.take(np.arange(n-shift,n), axis))
        res = np.concatenate((zeros, a.take(np.arange(n-shift), axis)), axis)
    if reshape:
        return res.reshape(a.shape)
    else:
        return res
import numpy as np
def shift_2d_replace(data, dx, dy, constant=False):
    """
    Shifts the array in two dimensions while setting rolled values to constant
    :param data: The 2d numpy array to be shifted
    :param dx: The shift in x
    :param dy: The shift in y
    :param constant: The constant to replace rolled values with
    :return: The shifted array with "constant" where roll occurs
    """
    shifted_data = np.roll(data, dx, axis=1)
    if dx < 0:
        shifted_data[:, dx:] = constant
    elif dx > 0:
        shifted_data[:, 0:dx] = constant
    shifted_data = np.roll(shifted_data, dy, axis=0)
    if dy < 0:
        shifted_data[dy:, :] = constant
    elif dy > 0:
        shifted_data[0:dy, :] = constant
    return shifted_data
This function would work on 2D arrays and replace rolled values with a constant of your choosing.
A bit late, but feels like a quick way to do what you want in one line. Perhaps would work best if wrapped inside a smart function (example below provided just for horizontal axis):
import numpy
a = numpy.arange(1,10).reshape(3,3) # an example 2D array
print(a)
[[1 2 3]
 [4 5 6]
 [7 8 9]]
shift = 1
# dtype=a.dtype keeps the result integer; numpy.zeros defaults to float
a = numpy.hstack((numpy.zeros((a.shape[0], shift), dtype=a.dtype), a[:,:-shift]))
print(a)
[[0 1 2]
 [0 4 5]
 [0 7 8]]
You can also use ndimage.shift:
>>> from scipy import ndimage
>>> arr = np.array([[1, 2, 3], [4, 5, 6]])
>>> ndimage.shift(arr, (0,1))
array([[0, 1, 2],
[0, 4, 5]])
Elaborating on the answer by Hooked (since it took me a few minutes to understand it)
The code below first pads a certain amount of zeros in the up, down, left and right margins and then selects the original matrix inside the padded one. A perfectly useless code, but good for understanding np.pad.
import numpy as np
x = np.array([[1, 2, 3],[4, 5, 6]])
y = np.pad(x,((1,3),(2,4)), mode='constant')[1:-3,2:-4]
print(np.all(x==y))
Now, to make an upwards shift of 2 combined with a rightwards shift of 1 position, one should do
print(np.pad(x,((0,2),(1,0)), mode='constant')[2:,0:-1])
You could also use numpy's triu and scipy.linalg's circulant. Make a circulant version of your matrix. Then, select the upper triangular part starting at the first diagonal, (the default option in triu). The row index will correspond to the number of padded zeros you want.
If you don't have scipy you can generate a nXn circulant matrix by making an (n-1) X (n-1) identity matrix and stacking a row [0 0 ... 1] on top of it and the column [1 0 ... 0] to the right of it.
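One way to read the circulant idea is sketched below for a 1D vector, using NumPy only. This is an interpretation, not the answer's code: the matrix of cyclic shifts is built directly by index arithmetic instead of calling scipy.linalg.circulant, and the names v, idx, circ and shifted are illustrative.

```python
import numpy as np

v = np.array([1, 2, 3, 4])
n = v.size

# Row i of `circ` is v rolled to the right by i places.
idx = (np.arange(n)[None, :] - np.arange(n)[:, None]) % n
circ = v[idx]

# Keeping the upper triangle zeroes the entries that wrapped around,
# so row i becomes v shifted right by i with zero padding.
shifted = np.triu(circ)
print(shifted)
# [[1 2 3 4]
#  [0 1 2 3]
#  [0 0 1 2]
#  [0 0 0 1]]
```

This builds all n shifts at once, so it is mainly useful when several shift amounts of the same vector are needed.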
I faced a similar problem with shifting a 2-d array in both directions
def shift_frame(img,move_dir,fill=np.inf):
    frame = np.full_like(img,fill)
    x,y = move_dir
    size_x,size_y = np.array(img.shape) - np.abs(move_dir)
    frame_x = slice(0,size_x) if x>=0 else slice(-x,size_x-x)
    frame_y = slice(0,size_y) if y>=0 else slice(-y,size_y-y)
    img_x = slice(x,None) if x>=0 else slice(0,size_x)
    img_y = slice(y,None) if y>=0 else slice(0,size_y)
    frame[frame_x,frame_y] = img[img_x,img_y]
    return frame
test = np.arange(25).reshape((5,5))
# fill=-1 here, since the default np.inf cannot be stored in an integer array
shift_frame(test,(1,1),fill=-1)
'''
returns:
array([[ 6,  7,  8,  9, -1],
       [11, 12, 13, 14, -1],
       [16, 17, 18, 19, -1],
       [21, 22, 23, 24, -1],
       [-1, -1, -1, -1, -1]])
'''
I haven't measured the runtime of this, but it seems to work well enough for my use, although a built-in one-liner would be nice.
import numpy as np
def roll_zeropad(a, dyx):
    h, w = a.shape[:2]
    dy, dx = dyx
    pad_x, start_x, end_x = ((dx,0), 0, w) if dx > 0 else ((0,-dx), -dx, w-dx)
    pad_y, start_y, end_y = ((dy,0), 0, h) if dy > 0 else ((0,-dy), -dy, h-dy)
    return np.pad(a, (pad_y, pad_x))[start_y:end_y,start_x:end_x]
test = np.arange(25).reshape((5,5))
out = roll_zeropad(test,(1,1))
print(out)
"""
returns:
[[ 0  0  0  0  0]
 [ 0  0  1  2  3]
 [ 0  5  6  7  8]
 [ 0 10 11 12 13]
 [ 0 15 16 17 18]]
"""
