Say I have a 1D numpy array of numbers myArray = np.array([1, 1, 0, 2, 0, 1, 1, 1, 1, 0, 0, 1, 2, 1, 1, 1]).
I want to create a 2D numpy array that describes the first (column 1) and last (column 2) indices of any "streak" of consecutive 1's that is longer than 2.
So for the example above, the 2D array should look like this:
indicesArray =
([5, 8],
[13, 15])
This is because there are at least 3 consecutive ones at (zero-based) indices 5, 6, 7, 8 and at indices 13, 14, 15.
Any help would be appreciated.
Approach #1
Here's one approach inspired by this post -
import numpy as np
def start_stop(a, trigger_val, len_thresh=2):
    # "Enclose" mask with sentinels to catch shifts later on
    mask = np.r_[False,np.equal(a, trigger_val),False]
    # Get the shifting indices (alternating run starts and one-past-the-end stops)
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    # Get lengths
    lens = idx[1::2] - idx[::2]
    # Keep runs longer than len_thresh; subtract 1 to make the stops inclusive
    return idx.reshape(-1,2)[lens>len_thresh]-[0,1]
Sample run -
In [47]: myArray
Out[47]: array([1, 1, 0, 2, 0, 1, 1, 1, 1, 0, 0, 1, 2, 1, 1, 1])
In [48]: start_stop(myArray, trigger_val=1, len_thresh=2)
Out[48]:
array([[ 5, 8],
[13, 15]])
Approach #2
Another with binary_erosion -
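from scipy.ndimage import binary_erosion
# Erode with a length-3 window: True only at the interior of runs of 3 or more ones
mask = binary_erosion(myArray==1,structure=np.ones(3))
# Indices where the eroded mask switches value
idx = np.flatnonzero(mask[1:] != mask[:-1])
# Undo the erosion: start transitions already sit on the original starts, stops need +1
out = idx.reshape(-1,2)+[0,1]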
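For cross-checking either vectorized approach, a plain loop can serve as a reference. The sketch below is only illustrative; the name start_stop_reference is made up here, and it assumes the same convention as above (inclusive stop indices, runs strictly longer than len_thresh).

```python
import numpy as np

def start_stop_reference(a, trigger_val, len_thresh=2):
    """Loop-based reference: inclusive [start, stop] of runs longer than len_thresh."""
    runs = []
    start = None
    for i, v in enumerate(a):
        if v == trigger_val:
            if start is None:
                start = i                      # a new run begins here
        elif start is not None:
            if i - start > len_thresh:         # run covered indices start..i-1
                runs.append([start, i - 1])
            start = None
    # Close a run that reaches the end of the array
    if start is not None and len(a) - start > len_thresh:
        runs.append([start, len(a) - 1])
    return np.array(runs, dtype=int).reshape(-1, 2)

myArray = np.array([1, 1, 0, 2, 0, 1, 1, 1, 1, 0, 0, 1, 2, 1, 1, 1])
result = start_stop_reference(myArray, trigger_val=1)
print(result)   # [[ 5  8]
                #  [13 15]]
```

It is much slower than the NumPy versions on large inputs, so it is mainly useful in tests.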
I have this 2D array of zeros z and this 1D array of starting points starts. In addition, I have a 1D array of offsets:
z = np.zeros(35, dtype='i').reshape(5, 7)
starts = np.array([1, 5, 3, 0, 3])
offsets = np.arange(5) + 1
I would like to vectorize this little for loop here, but I seem to be unable to do it.
for i in range(z.shape[0]):
    z[i, starts[i]:] += offsets[i]
The result in this example should look like this:
z
array([[0, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 2, 2],
[0, 0, 0, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4],
[0, 0, 0, 5, 5, 5, 5]])
We could use some masking and NumPy broadcasting -
mask = starts[:,None] <= np.arange(z.shape[1])
z[mask] = np.repeat(offsets, mask.sum(1))
We could play a trick of broadcasted multiplication to get the final output -
z = offsets[:,None] * mask
Another way would be to assign values into z from offsets and then zero out everything outside mask, like so -
z[:] = offsets[:,None]
z[~mask] = 0
Yet another way would be to start with a replicated version of offsets as z and then zero out everything outside mask -
z = np.repeat(offsets,z.shape[1]).reshape(z.shape[0],-1)
z[~mask] = 0
Of course, we would need the shape parameters beforehand.
If z is not initialized as a zeros array, then only the first of the solutions mentioned earlier would be applicable, and it would need to be updated with +=, like so -
z[mask] += np.repeat(offsets, mask.sum(1))
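As a quick sanity check, the broadcasting idea can be compared against the original loop. This is a minimal sketch using the question's data; the names z_loop and z_vec are introduced here only for the comparison.

```python
import numpy as np

starts = np.array([1, 5, 3, 0, 3])
offsets = np.arange(5) + 1

# Reference: the original loop
z_loop = np.zeros((5, 7), dtype='i')
for i in range(z_loop.shape[0]):
    z_loop[i, starts[i]:] += offsets[i]

# Broadcasting: mask[i, j] is True where column j >= starts[i]
mask = starts[:, None] <= np.arange(z_loop.shape[1])
z_vec = np.zeros((5, 7), dtype='i')
z_vec += offsets[:, None] * mask   # adds offsets[i] only where mask is True

same = np.array_equal(z_loop, z_vec)
print(same)   # True
```

Because the last step is an in-place add of a masked array, it should also accumulate correctly onto a z that already holds non-zero values.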
I cut the zeros out of a numpy array, do some processing, and want to insert them back for visualization purposes. I have the indices of the zero sections and tried to insert the zeros back with numpy.insert and zip, but the index runs out of bounds, even though I start at the lower end. Example:
import numpy as np
a = np.array([1, 2, 4, 0, 0, 0, 3, 6, 2, 0, 0, 1, 3, 0, 0, 0, 5])
a = a[a != 0] # cut zeros out
zero_start = [3, 9, 13]
zero_end = [5, 10, 15]
# Now insert the zeros back in using the former indices
for ev in zip(zero_start, zero_end):
    a = np.insert(a, ev[0], np.zeros(ev[1]-ev[0]))
>>> IndexError: index 13 is out of bounds for axis 0 with size 12
It seems the array size is not being refreshed inside the loop. Any suggestions or other (more pythonic) approaches to solve this problem?
Approach #1: Using indexing -
# Get all zero indices
idx = np.concatenate([range(i,j+1) for i,j in zip(zero_start,zero_end)])
# Setup output array of zeros
N = len(idx) + len(a)
out = np.zeros(N,dtype=a.dtype)
# Get mask of non-zero places and assign values from a into those
out[~np.isin(np.arange(N),idx)] = a
We can also generate the actual indices where a had non-zeros originally and then assign. Thus, the last step of masking could be replaced with something like this -
out[np.setdiff1d(np.arange(N),idx)] = a
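If the boolean mask of non-zero positions can be kept from the moment the zeros are cut out, the index bookkeeping may not be needed at all. This is a different technique from the one above, sketched under that assumption; the names original and keep, and the multiplication standing in for "do some stuff", are made up for illustration.

```python
import numpy as np

original = np.array([1, 2, 4, 0, 0, 0, 3, 6, 2, 0, 0, 1, 3, 0, 0, 0, 5])
keep = original != 0            # remember where the non-zeros were
a = original[keep]              # cut zeros out
a = a * 10                      # hypothetical stand-in for the processing step

out = np.zeros(original.shape, dtype=a.dtype)
out[keep] = a                   # put processed values back in place
print(out)   # [10 20 40  0  0  0 30 60 20  0  0 10 30  0  0  0 50]
```

This only works when the processing step keeps the number and order of the elements.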
Approach #2: Using np.insert given zero_start and zero_end as arrays -
insert_start = np.r_[zero_start[0], zero_start[1:] - zero_end[:-1]-1].cumsum()
out = np.insert(a, np.repeat(insert_start, zero_end - zero_start + 1), 0)
Sample run -
In [755]: a = np.array([1, 2, 4, 0, 0, 0, 3, 6, 2, 0, 0, 1, 3, 0, 0, 0, 5])
...: a = a[a != 0] # cut zeros out
...: zero_start = np.array([3, 9, 13])
...: zero_end = np.array([5, 10, 15])
...:
In [756]: s0 = np.r_[zero_start[0], zero_start[1:] - zero_end[:-1]-1].cumsum()
In [757]: np.insert(a, np.repeat(s0, zero_end - zero_start + 1), 0)
Out[757]: array([1, 2, 4, 0, 0, 0, 3, 6, 2, 0, 0, 1, 3, 0, 0, 0, 5])
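For reference, the loop from the question appears to fail because of an off-by-one rather than a stale array size: the end indices are inclusive, so each gap holds end - start + 1 zeros, not end - start. A minimal sketch of the loop with that one change, assuming the gaps are listed in ascending order:

```python
import numpy as np

a = np.array([1, 2, 4, 3, 6, 2, 1, 3, 5])   # the array after cutting zeros
zero_start = [3, 9, 13]
zero_end = [5, 10, 15]                      # inclusive end indices

for start, end in zip(zero_start, zero_end):
    # end is inclusive, so each gap holds end - start + 1 zeros
    a = np.insert(a, start, np.zeros(end - start + 1, dtype=a.dtype))
print(a)   # [1 2 4 0 0 0 3 6 2 0 0 1 3 0 0 0 5]
```

Each np.insert call copies the whole array, so the single-call versions above are likely the better choice for many gaps.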
Is there a function in Python that samples from an n-dimensional numpy array and returns the indices of each draw? If not, how would one go about defining such a function?
E.g.:
>>> probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])
>>> print(function(probabilities, draws = 10))
([1,1],[0,2],[1,1],[1,0],[0,1],[0,1],[1,1],[0,0],[1,1],[0,1])
I know this problem can be solved in many ways with 1-D arrays. However, I will be dealing with large n-dimensional arrays and cannot afford to reshape them just to do a single draw.
You can use np.unravel_index:
a = np.random.rand(3, 4, 5)
a /= a.sum()
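def sample(a, n=1):
    a = np.asarray(a)
    choices = np.prod(a.shape)
    # Draw flat indices, weighted by the flattened probabilities
    index = np.random.choice(choices, size=n, p=a.ravel())
    # Convert flat indices back to one index array per dimension
    return np.unravel_index(index, shape=a.shape)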
>>> sample(a, 4)
(array([2, 2, 0, 2]), array([0, 1, 3, 2]), array([2, 4, 2, 1]))
This returns a tuple of arrays, one per dimension of a, each of length the number of samples requested. If you would rather have an array of shape (samples, dimensions), change the return statement to:
    return np.column_stack(np.unravel_index(index, shape=a.shape))
And now:
>>> sample(a, 4)
array([[2, 0, 0],
[2, 2, 4],
[2, 0, 0],
[1, 0, 4]])
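The same idea can be written with NumPy's newer Generator interface, which also makes the draws reproducible through a seed. This is a sketch rather than a drop-in replacement; the name sample_indices and the rng parameter are introduced here, and the input is assumed to sum to 1.

```python
import numpy as np

def sample_indices(p, n=1, rng=None):
    """Draw n index tuples from p, an n-dimensional array of probabilities summing to 1."""
    p = np.asarray(p)
    rng = np.random.default_rng() if rng is None else rng
    # Sample flat positions, then map them back to n-dimensional indices
    flat = rng.choice(p.size, size=n, p=p.ravel())
    return np.column_stack(np.unravel_index(flat, p.shape))

probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])
draws = sample_indices(probabilities, n=10000, rng=np.random.default_rng(0))

# Empirical frequency of each cell, to compare against the input probabilities
freq = np.zeros_like(probabilities)
np.add.at(freq, tuple(draws.T), 1)
freq /= len(draws)
print(freq.round(2))
```

With enough draws the printed frequencies should land close to probabilities. Note that ravel() returns a view for contiguous arrays, so no copy of a large array is made in that case.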
If your array is contiguous in memory, you can change the shape of your array in place:
probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])
nrow, ncol = probabilities.shape
idx = np.arange( nrow * ncol ) # create 1D index
probabilities.shape = ( 6, ) # this is OK because your array is contiguous in memory
samples = np.random.choice( idx, 10, p=probabilities ) # sample in 1D
rowIndex = samples // ncol # convert to 2D (row-major: index = row * ncol + col)
colIndex = samples % ncol
array([1, 0, 0, 0, 1, 1, 1, 1, 1, 0])
array([1, 1, 2, 0, 1, 1, 1, 1, 1, 1])
Note that since your array is contiguous in memory, reshape returns a view as well:
In [53]:
view = probabilities.reshape( 6, -1 )
view[ 0 ] = 9
probabilities[ 0, 0 ]
Out[53]:
9.0
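Whether a given reshape produced a view or a copy can be checked directly with np.shares_memory. A small sketch, with the names flat_view and flat_copy chosen here for illustration:

```python
import numpy as np

probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])

flat_view = probabilities.reshape(-1)      # C-contiguous input: reshape can return a view
flat_copy = probabilities.T.reshape(-1)    # transposed input: flattening needs a copy

is_view = np.shares_memory(probabilities, flat_view)
is_copy = not np.shares_memory(probabilities, flat_copy)
print(is_view, is_copy)   # True True
```

This makes it easy to confirm that no data is duplicated before relying on the reshape for large arrays.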
I'd like to roll a 2D numpy array in Python, except that I'd like to pad the ends with zeros rather than roll the data as if it's periodic.
Specifically, the following code
import numpy as np
x = np.array([[1, 2, 3], [4, 5, 6]])
np.roll(x, 1, axis=1)
returns
array([[3, 1, 2],[6, 4, 5]])
but what I would prefer is
array([[0, 1, 2], [0, 4, 5]])
I could do this with a few awkward touchups, but I'm hoping that there's a way to do it with fast built-in commands.
Thanks
NumPy 1.7.0 added a function, numpy.pad, that can do this in one line. Pad is quite powerful and can do much more than a simple "roll". The tuple ((0,0),(1,0)) used in this answer gives the (before, after) pad widths for each axis: nothing on axis 0, and one column before the data on axis 1.
import numpy as np
x = np.array([[1, 2, 3],[4, 5, 6]])
print(np.pad(x,((0,0),(1,0)), mode='constant')[:, :-1])
Giving
[[0 1 2]
[0 4 5]]
I don't think that you are going to find an easier way to do this that is built-in. The touch-up seems quite simple to me:
y = np.roll(x,1,axis=1)
y[:,0] = 0
If you want this to be more direct then maybe you could copy the roll function to a new function and change it to do what you want. The roll() function is in the site-packages\core\numeric.py file.
I just wrote the following. It could be more optimized by avoiding zeros_like and just computing the shape for zeros directly.
import numpy as np
def roll_zeropad(a, shift, axis=None):
    """
    Roll array elements along a given axis.
    Elements off the end of the array are treated as zeros.
    Parameters
    ----------
    a : array_like
        Input array.
    shift : int
        The number of places by which elements are shifted.
    axis : int, optional
        The axis along which elements are shifted. By default, the array
        is flattened before shifting, after which the original
        shape is restored.
    Returns
    -------
    res : ndarray
        Output array, with the same shape as `a`.
    See Also
    --------
    roll : Elements that roll off one end come back on the other.
    rollaxis : Roll the specified axis backwards, until it lies in a
        given position.
    Examples
    --------
    >>> x = np.arange(10)
    >>> roll_zeropad(x, 2)
    array([0, 0, 0, 1, 2, 3, 4, 5, 6, 7])
    >>> roll_zeropad(x, -2)
    array([2, 3, 4, 5, 6, 7, 8, 9, 0, 0])
    >>> x2 = np.reshape(x, (2,5))
    >>> x2
    array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]])
    >>> roll_zeropad(x2, 1)
    array([[0, 0, 1, 2, 3],
           [4, 5, 6, 7, 8]])
    >>> roll_zeropad(x2, -2)
    array([[2, 3, 4, 5, 6],
           [7, 8, 9, 0, 0]])
    >>> roll_zeropad(x2, 1, axis=0)
    array([[0, 0, 0, 0, 0],
           [0, 1, 2, 3, 4]])
    >>> roll_zeropad(x2, -1, axis=0)
    array([[5, 6, 7, 8, 9],
           [0, 0, 0, 0, 0]])
    >>> roll_zeropad(x2, 1, axis=1)
    array([[0, 0, 1, 2, 3],
           [0, 5, 6, 7, 8]])
    >>> roll_zeropad(x2, -2, axis=1)
    array([[2, 3, 4, 0, 0],
           [7, 8, 9, 0, 0]])
    >>> roll_zeropad(x2, 50)
    array([[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]])
    >>> roll_zeropad(x2, -50)
    array([[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]])
    >>> roll_zeropad(x2, 0)
    array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]])
    """
    a = np.asanyarray(a)
    if shift == 0: return a
    if axis is None:
        n = a.size
        reshape = True
    else:
        n = a.shape[axis]
        reshape = False
    if np.abs(shift) > n:
        res = np.zeros_like(a)
    elif shift < 0:
        shift += n
        zeros = np.zeros_like(a.take(np.arange(n-shift), axis))
        res = np.concatenate((a.take(np.arange(n-shift,n), axis), zeros), axis)
    else:
        zeros = np.zeros_like(a.take(np.arange(n-shift,n), axis))
        res = np.concatenate((zeros, a.take(np.arange(n-shift), axis)), axis)
    if reshape:
        return res.reshape(a.shape)
    else:
        return res
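A shorter variant of the same idea can be built on slice assignment instead of take and concatenate. The sketch below is an alternative, not the function above: the name shift_fill and its fill parameter are made up here, and unlike roll_zeropad it always works along an explicit axis rather than flattening.

```python
import numpy as np

def shift_fill(a, shift, axis=0, fill=0):
    """Shift a along axis by shift places, filling vacated positions with fill."""
    a = np.asarray(a)
    out = np.full_like(a, fill)
    n = a.shape[axis]
    if abs(shift) >= n:
        return out                       # everything is shifted out of the array
    src = [slice(None)] * a.ndim         # region of a that survives the shift
    dst = [slice(None)] * a.ndim         # where that region lands in out
    if shift > 0:
        src[axis] = slice(None, n - shift)
        dst[axis] = slice(shift, None)
    elif shift < 0:
        src[axis] = slice(-shift, None)
        dst[axis] = slice(None, n + shift)
    out[tuple(dst)] = a[tuple(src)]
    return out

x2 = np.arange(10).reshape(2, 5)
print(shift_fill(x2, 1, axis=1))    # [[0 0 1 2 3]
                                    #  [0 5 6 7 8]]
print(shift_fill(x2, -2, axis=1))   # [[2 3 4 0 0]
                                    #  [7 8 9 0 0]]
```

It allocates the output once and copies the surviving block in a single assignment.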
import numpy as np
def shift_2d_replace(data, dx, dy, constant=False):
    """
    Shifts the array in two dimensions while setting rolled values to constant
    :param data: The 2d numpy array to be shifted
    :param dx: The shift in x
    :param dy: The shift in y
    :param constant: The constant to replace rolled values with
    :return: The shifted array with "constant" where roll occurs
    """
    shifted_data = np.roll(data, dx, axis=1)
    if dx < 0:
        shifted_data[:, dx:] = constant
    elif dx > 0:
        shifted_data[:, 0:dx] = constant
    shifted_data = np.roll(shifted_data, dy, axis=0)
    if dy < 0:
        shifted_data[dy:, :] = constant
    elif dy > 0:
        shifted_data[0:dy, :] = constant
    return shifted_data
This function would work on 2D arrays and replace rolled values with a constant of your choosing.
A bit late, but feels like a quick way to do what you want in one line. Perhaps would work best if wrapped inside a smart function (example below provided just for horizontal axis):
import numpy
a = numpy.arange(1,10).reshape(3,3) # an example 2D array
print(a)
[[1 2 3]
 [4 5 6]
 [7 8 9]]
shift = 1
# dtype=a.dtype keeps the result integer; numpy.zeros defaults to float
a = numpy.hstack((numpy.zeros((a.shape[0], shift), dtype=a.dtype), a[:,:-shift]))
print(a)
[[0 1 2]
 [0 4 5]
 [0 7 8]]
You can also use ndimage.shift:
>>> import numpy as np
>>> from scipy import ndimage
>>> arr = np.array([[1, 2, 3], [4, 5, 6]])
>>> ndimage.shift(arr, (0,1))
array([[0, 1, 2],
[0, 4, 5]])
Elaborating on the answer by Hooked (since it took me a few minutes to understand it)
The code below first pads a certain number of zeros onto the top, bottom, left and right margins and then selects the original matrix from inside the padded one. Perfectly useless code, but good for understanding np.pad.
import numpy as np
x = np.array([[1, 2, 3],[4, 5, 6]])
y = np.pad(x,((1,3),(2,4)), mode='constant')[1:-3,2:-4]
print(np.all(x==y))
Now, to make an upwards shift of 2 combined with a rightwards shift of 1 position, one should do
print(np.pad(x,((0,2),(1,0)), mode='constant')[2:,0:-1])
You could also use numpy's triu and scipy.linalg's circulant. Make a circulant version of your matrix. Then, select the upper triangular part starting at the first diagonal, (the default option in triu). The row index will correspond to the number of padded zeros you want.
If you don't have scipy you can generate a nXn circulant matrix by making an (n-1) X (n-1) identity matrix and stacking a row [0 0 ... 1] on top of it and the column [1 0 ... 0] to the right of it.
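One way to read this suggestion is as multiplication by a shift matrix: a cyclic shift (circulant) matrix with its wrap-around entries removed, which is what keeping only the upper triangle does. The sketch below builds that matrix directly with np.eye instead of scipy.linalg.circulant; this is an interpretation of the answer, and the names S and shifted are chosen here.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])
n = x.shape[1]
shift = 1

# n x n matrix with ones on the shift-th superdiagonal: a cyclic shift
# matrix whose wrap-around entries (below the diagonal) have been dropped
S = np.eye(n, k=shift, dtype=x.dtype)
shifted = x @ S          # column j of the result is column j - shift of x
print(shifted)   # [[0 1 2]
                 #  [0 4 5]]
```

The matrix product costs far more than slicing or padding, so this is mostly of interest as an illustration.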
I faced a similar problem with shifting a 2-d array in both directions:
import numpy as np
def shift_frame(img,move_dir,fill=np.inf):
    # Start from a frame filled with the fill value, then copy the surviving block in
    frame = np.full_like(img,fill)
    x,y = move_dir
    size_x,size_y = np.array(img.shape) - np.abs(move_dir)
    frame_x = slice(0,size_x) if x>=0 else slice(-x,size_x-x)
    frame_y = slice(0,size_y) if y>=0 else slice(-y,size_y-y)
    img_x = slice(x,None) if x>=0 else slice(0,size_x)
    img_y = slice(y,None) if y>=0 else slice(0,size_y)
    frame[frame_x,frame_y] = img[img_x,img_y]
    return frame
test = np.arange(25).reshape((5,5))
# test is an integer array, so pass an integer fill; the np.inf default only fits float arrays
shift_frame(test,(1,1),fill=-1)
'''
returns:
array([[ 6, 7, 8, 9, -1],
[11, 12, 13, 14, -1],
[16, 17, 18, 19, -1],
[21, 22, 23, 24, -1],
[-1, -1, -1, -1, -1]])
'''
I haven't measured the runtime of this, but it seems to work well enough for my use, although a built-in one liner would be nice
import numpy as np
def roll_zeropad(a, dyx):
    h, w = a.shape[:2]
    dy, dx = dyx
    # Pad on the side the data moves away from, then crop back to the original size
    pad_x, start_x, end_x = ((dx,0), 0, w) if dx > 0 else ((0,-dx), -dx, w-dx)
    pad_y, start_y, end_y = ((dy,0), 0, h) if dy > 0 else ((0,-dy), -dy, h-dy)
    return np.pad(a, (pad_y, pad_x))[start_y:end_y,start_x:end_x]
test = np.arange(25).reshape((5,5))
out = roll_zeropad(test,(1,1))
print(out)
"""
returns:
[[ 0 0 0 0 0]
[ 0 0 1 2 3]
[ 0 5 6 7 8]
[ 0 10 11 12 13]
[ 0 15 16 17 18]]
"""