When slicing a numpy array we obtain a view on the corresponding data. However that doesn't seem to be the case with sparse matrices from scipy.sparse. Although the docs briefly mention slicing for the lil_matrix class it's not clear how (or if) one can obtain views on the data.
At least by using the following sample script I wasn't successful in obtaining views of sparse matrices:
import numpy as np
from scipy.sparse import lil_matrix

def test(matrix):
    print('\n=== Testing {} ==='.format(type(matrix)))
    a = matrix[:, 0]
    b = matrix[0, :]
    a[0] = 100
    # Write through the argument (not the global M) so the sparse case is exercised too
    matrix[0, 1] = 200
    matrix[1, 0] = 200
    print('a = '); print(a)
    print('b = '); print(b)

M = np.arange(4).reshape(2, 2) + 1
S = lil_matrix(M)
test(M)
test(S)
Which outputs:
=== Testing <class 'numpy.ndarray'> ===
a =
[100 200]
b =
[100 200]
=== Testing <class 'scipy.sparse.lil.lil_matrix'> ===
a =
(0, 0) 100
(1, 0) 3
b =
(0, 0) 1
(0, 1) 2
Tested on Python 3.6.6, numpy==1.14.5, scipy==1.1.0.
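For the dense half of the comparison, np.shares_memory is one way to confirm whether an indexing result is a view. The following is only a minimal sketch on a small array; the variable names col and fancy are illustrative and not part of the script above.

```python
import numpy as np

M = np.arange(4).reshape(2, 2) + 1   # [[1, 2], [3, 4]]

col = M[:, 0]          # basic slicing: a view onto M's buffer
fancy = M[[0, 1], 0]   # advanced (integer-array) indexing: a copy

M[1, 0] = 200          # write through the parent

print(np.shares_memory(M, col))    # True
print(np.shares_memory(M, fancy))  # False
print(col.tolist())                # [1, 200]
print(fancy.tolist())              # [1, 3]
```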
I'll eat my words - partially. There is a lil_matrix.getrowview method (but no getcolview).
A lil matrix has 2 object dtype array attributes, data and rows. Both contain lists, one for each row.
def getrow(self, i):
    """Returns a copy of the 'i'th row.
    """
    i = self._check_row_bounds(i)
    new = lil_matrix((1, self.shape[1]), dtype=self.dtype)
    new.rows[0] = self.rows[i][:]
    new.data[0] = self.data[i][:]
    return new

def getrowview(self, i):
    """Returns a view of the 'i'th row (without copying).
    """
    new = lil_matrix((1, self.shape[1]), dtype=self.dtype)
    new.rows[0] = self.rows[i]
    new.data[0] = self.data[i]
    return new
A little testing shows that modifying elements of the row view does affect the parent, and vice versa.
This view works because an object array contains pointers. As with pointers in a list, they can be shared. And if done right, such a list can be modified in-place.
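The pointer-sharing idea can be seen without scipy at all. This is just a sketch with plain object arrays standing in for the rows attribute; parent, row_view and row_copy are made-up names, not scipy internals.

```python
import numpy as np

# Object array holding one Python list per row, like a LIL-style structure.
parent = np.empty(2, dtype=object)
parent[0] = [1, 3]
parent[1] = [0]

row_view = np.empty(1, dtype=object)
row_view[0] = parent[0]        # same list object: behaves like a view

row_copy = np.empty(1, dtype=object)
row_copy[0] = parent[0][:]     # sliced list: an independent copy

parent[0].append(5)            # in-place change to the shared list
print(row_view[0])             # [1, 3, 5]
print(row_copy[0])             # [1, 3]

parent[0] = [9]                # rebinding the slot breaks the sharing
print(row_view[0])             # still [1, 3, 5]
```

The last step is the "if done right" part: only in-place changes to the shared list are visible on both sides, while replacing the list in one array is not.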
I found this by doing a page search for view on the lil_matrix documentation. I don't find anything similar for the other formats.
There are numerical functions on the csr format that work directly with the .data attribute. This is possible if you aren't changing sparsity, and only want to modify the nonzero values. And it is possible to modify that attribute in place. In limited cases it might be possible to construct a new sparse matrix that shares slices of the data attribute of another, but it would not be anything as general as ndarray slicing.
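As a small illustration of working directly with .data on a csr matrix (a sketch only; the matrix A is an arbitrary example), scaling the stored values in place leaves the sparsity pattern alone:

```python
import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix(np.array([[0, 2, 0],
                         [3, 0, 4]]))

A.data *= 10          # touches only the stored (nonzero) values, in place

print(A.toarray().tolist())   # [[0, 20, 0], [30, 0, 40]]
print(A.nnz)                  # 3: the sparsity pattern is unchanged
```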
In [88]: M = sparse.lil_matrix((4,10),dtype=int)
In [89]: M[0,1::2] = 1
In [90]: M[1,::2] = 2
In [91]: M1 = M.getrowview(0)
In [92]: M1[0,::2] = 3
In [94]: M.A
Out[94]:
array([[3, 1, 3, 1, 3, 1, 3, 1, 3, 1],
[2, 0, 2, 0, 2, 0, 2, 0, 2, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
In [95]: M[0,1::2] = 4
In [97]: M1.A
Out[97]: array([[3, 4, 3, 4, 3, 4, 3, 4, 3, 4]])
Following this model I could make an 'advanced-index' view, something that ndarray doesn't do:
In [98]: M2 = sparse.lil_matrix((2,10), dtype=int)
In [99]: M2.rows[:] = M.rows[[0,3]]
In [100]: M2.data[:] = M.data[[0,3]]
In [101]: M2.A
Out[101]:
array([[3, 4, 3, 4, 3, 4, 3, 4, 3, 4],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
In [102]: M2[:,::2] *= 10
In [103]: M2.A
Out[103]:
array([[30, 4, 30, 4, 30, 4, 30, 4, 30, 4],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
In [104]: M1.A
Out[104]: array([[30, 4, 30, 4, 30, 4, 30, 4, 30, 4]])
Say I have a 1D numpy array of numbers myArray = ([1, 1, 0, 2, 0, 1, 1, 1, 1, 0, 0 ,1, 2, 1, 1, 1]).
I want to create a 2D numpy array that describes the first (column 1) and last (column 2) indices of any "streak" of consecutive 1's that is longer than 2.
So for the example above, the 2D array should look like this:
indicesArray =
([5, 8],
[13, 15])
Since there are at least 3 consecutive ones in the 5th, 6th, 7th, 8th places and in the 13th, 14th, 15th places.
Any help would be appreciated.
Approach #1
Here's one approach inspired by this post -
import numpy as np

def start_stop(a, trigger_val, len_thresh=2):
    # "Enclose" mask with sentinels to catch shifts later on
    mask = np.r_[False,np.equal(a, trigger_val),False]
    # Get the shifting indices
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    # Get lengths
    lens = idx[1::2] - idx[::2]
    # Keep runs longer than len_thresh; stop indices are exclusive, so subtract 1
    return idx.reshape(-1,2)[lens>len_thresh]-[0,1]
Sample run -
In [47]: myArray
Out[47]: array([1, 1, 0, 2, 0, 1, 1, 1, 1, 0, 0, 1, 2, 1, 1, 1])
In [48]: start_stop(myArray, trigger_val=1, len_thresh=2)
Out[48]:
array([[ 5, 8],
[13, 15]])
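For sanity-checking the vectorized version on small inputs, a plain-Python reference can help. This is only a sketch using itertools.groupby; start_stop_reference is a hypothetical helper name, not part of the approach above.

```python
from itertools import groupby

def start_stop_reference(values, trigger_val, len_thresh=2):
    """Return [first, last] index pairs of runs of trigger_val longer than len_thresh."""
    out = []
    pos = 0
    for key, group in groupby(values):
        run_len = len(list(group))
        if key == trigger_val and run_len > len_thresh:
            out.append([pos, pos + run_len - 1])
        pos += run_len   # advance to the start of the next run
    return out

my_array = [1, 1, 0, 2, 0, 1, 1, 1, 1, 0, 0, 1, 2, 1, 1, 1]
print(start_stop_reference(my_array, trigger_val=1))  # [[5, 8], [13, 15]]
```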
Approach #2
Another with binary_erosion -
from scipy.ndimage import binary_erosion
mask = binary_erosion(myArray==1,structure=np.ones(3))
idx = np.flatnonzero(mask[1:] != mask[:-1])
out = idx.reshape(-1,2)+[0,1]
I have this 2d array of zeros z and this 1d array of starting points starts. In addition, I have a 1d array of offsets
z = np.zeros(35, dtype='i').reshape(5, 7)
starts = np.array([1, 5, 3, 0, 3])
offsets = np.arange(5) + 1
I would like to vectorize this little for loop here, but I seem to be unable to do it.
for i in range(z.shape[0]):
    z[i, starts[i]:] += offsets[i]
The result in this example should look like this:
z
array([[0, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 2, 2],
[0, 0, 0, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4],
[0, 0, 0, 5, 5, 5, 5]])
We could use some masking and NumPy broadcasting -
mask = starts[:,None] <= np.arange(z.shape[1])
z[mask] = np.repeat(offsets, mask.sum(1))
We could play a trick of broadcasted multiplication to get the final output -
z = offsets[:,None] * mask
Another way would be to assign the values from offsets into z and then zero out everything outside the mask, like so -
z[:] = offsets[:,None]
z[~mask] = 0
Yet another way would be to start with a replicated version of offsets as z and then mask out the rest -
z = np.repeat(offsets,z.shape[1]).reshape(z.shape[0],-1)
z[~mask] = 0
Of course, we would need the shape parameters before-hand.
If z is not initialized as zeros array, then only one of the solutions mentioned earlier would be applicable and that would need to be updated with +=, like so -
z[mask] += np.repeat(offsets, mask.sum(1))
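To make the broadcasting step concrete, here is a self-contained sketch that builds the mask and checks it against the original loop. The in-place `z += offsets[:, None] * mask` form is a small variation on the multiplication trick above, added here as an assumption about what is convenient, and the name expected is illustrative.

```python
import numpy as np

starts = np.array([1, 5, 3, 0, 3])
offsets = np.arange(5) + 1

# Reference result from the explicit loop.
expected = np.zeros((5, 7), dtype=int)
for i in range(expected.shape[0]):
    expected[i, starts[i]:] += offsets[i]

# starts as a column (5, 1) compared with column indices (7,) broadcasts
# to a (5, 7) boolean mask: True where column >= starts[row].
mask = starts[:, None] <= np.arange(expected.shape[1])

z = np.zeros((5, 7), dtype=int)
z += offsets[:, None] * mask   # adds offsets[row] only where mask is True

print(np.array_equal(z, expected))  # True
```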
Let's say we have a 1d numpy array filled with some int values. And let's say that some of them are 0.
Is there any way, using numpy array's power, to fill all the 0 values with the last non-zero values found?
for example:
arr = np.array([1, 0, 0, 2, 0, 4, 6, 8, 0, 0, 0, 0, 2])
fill_zeros_with_last(arr)
print(arr)
[1 1 1 2 2 4 6 8 8 8 8 8 2]
A way to do it would be with this function:
def fill_zeros_with_last(arr):
    last_val = None # I don't really care about the initial value
    for i in range(arr.size):
        if arr[i]:
            last_val = arr[i]
        elif last_val is not None:
            arr[i] = last_val
However, this is using a raw python for loop instead of taking advantage of the numpy and scipy power.
If we knew that a reasonably small number of consecutive zeros are possible, we could use something based on numpy.roll. The problem is that the number of consecutive zeros is potentially large...
Any ideas? or should we go straight to Cython?
Disclaimer:
I would say long ago I found a question in stackoverflow asking something like this or very similar. I wasn't able to find it. :-(
Maybe I missed the right search terms, sorry for the duplicate then. Maybe it was just my imagination...
Here's a solution using np.maximum.accumulate:
def fill_zeros_with_last(arr):
    prev = np.arange(len(arr))
    prev[arr == 0] = 0
    prev = np.maximum.accumulate(prev)
    return arr[prev]
We construct an array prev which has the same length as arr, and such that prev[i] is the index of the last non-zero entry at or before the i-th entry of arr (or 0 if there is none). For example, if:
>>> arr = np.array([1, 0, 0, 2, 0, 4, 6, 8, 0, 0, 0, 0, 2])
Then prev looks like:
array([ 0, 0, 0, 3, 3, 5, 6, 7, 7, 7, 7, 7, 12])
Then we just index into arr with prev and we obtain our result. A test:
>>> arr = np.array([1, 0, 0, 2, 0, 4, 6, 8, 0, 0, 0, 0, 2])
>>> fill_zeros_with_last(arr)
array([1, 1, 1, 2, 2, 4, 6, 8, 8, 8, 8, 8, 2])
Note: Be careful to understand what this does when the first entry of your array is zero:
>>> fill_zeros_with_last(np.array([0,0,1,0,0]))
array([0, 0, 1, 1, 1])
Inspired by jme's answer here and by Bas Swinckels' (in the linked question) I came up with a different combination of numpy functions:
def fill_zeros_with_last(arr, initial=0):
    ind = np.nonzero(arr)[0]
    cnt = np.cumsum(np.array(arr, dtype=bool))
    return np.where(cnt, arr[ind[cnt-1]], initial)
I think it's succinct and also works, so I'm posting it here for the record. Still, jme's is also succinct and easy to follow and seems to be faster, so I'm accepting it :-)
If the 0s only come in runs of length 1 (and the first element is nonzero, since index -1 would wrap around to the last element), this use of nonzero might work:
In [266]: arr=np.array([1,0,2,3,0,4,0,5])
In [267]: I=np.nonzero(arr==0)[0]
In [268]: arr[I] = arr[I-1]
In [269]: arr
Out[269]: array([1, 1, 2, 3, 3, 4, 4, 5])
I can handle your arr by applying this repeatedly until I is empty.
In [286]: arr = np.array([1, 0, 0, 2, 0, 4, 6, 8, 0, 0, 0, 0, 2])
In [287]: while True:
   .....:     I=np.nonzero(arr==0)[0]
   .....:     if len(I)==0: break
   .....:     arr[I] = arr[I-1]
   .....:
In [288]: arr
Out[288]: array([1, 1, 1, 2, 2, 4, 6, 8, 8, 8, 8, 8, 2])
If the strings of 0s are long it might be better to look for those strings and handle them as a block. But if most strings are short, this repeated application may be the fastest route.
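One possible reading of "handle them as a block" is to repeat each nonzero value over the stretch up to the next nonzero value. The sketch below assumes that interpretation; fill_zero_runs is a hypothetical name, and leading zeros are simply left as they are.

```python
import numpy as np

def fill_zero_runs(arr):
    """Forward-fill zeros by repeating each nonzero value over its whole block."""
    arr = np.asarray(arr)
    out = arr.copy()
    nz = np.flatnonzero(arr)              # positions of the nonzero entries
    if nz.size == 0:
        return out                        # nothing to propagate
    # Block length = distance from each nonzero to the next one (or to the end).
    lengths = np.diff(np.append(nz, arr.size))
    out[nz[0]:] = np.repeat(arr[nz], lengths)
    return out

arr = np.array([1, 0, 0, 2, 0, 4, 6, 8, 0, 0, 0, 0, 2])
print(fill_zero_runs(arr).tolist())
# [1, 1, 1, 2, 2, 4, 6, 8, 8, 8, 8, 8, 2]
```

Unlike the repeated-application loop, this does a fixed number of array operations regardless of how long the runs of zeros are.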
I'd like to roll a 2D numpy array in Python, except that I'd like to pad the ends with zeros rather than roll the data as if it's periodic.
Specifically, the following code
import numpy as np
x = np.array([[1, 2, 3], [4, 5, 6]])
np.roll(x, 1, axis=1)
returns
array([[3, 1, 2],[6, 4, 5]])
but what I would prefer is
array([[0, 1, 2], [0, 4, 5]])
I could do this with a few awkward touchups, but I'm hoping that there's a way to do it with fast built-in commands.
Thanks
numpy.pad, added in NumPy 1.7.0, can do this in one line. pad is quite powerful and can do much more than a simple "roll". The tuple ((0,0),(1,0)) used in this answer gives the (before, after) padding widths for each axis: nothing on the rows, and one column on the left.
import numpy as np
x = np.array([[1, 2, 3],[4, 5, 6]])
print(np.pad(x,((0,0),(1,0)), mode='constant')[:, :-1])
Giving
[[0 1 2]
[0 4 5]]
I don't think that you are going to find an easier way to do this that is built-in. The touch-up seems quite simple to me:
y = np.roll(x,1,axis=1)
y[:,0] = 0
If you want this to be more direct then maybe you could copy the roll function to a new function and change it to do what you want. The roll() function is in the site-packages\core\numeric.py file.
I just wrote the following. It could be more optimized by avoiding zeros_like and just computing the shape for zeros directly.
import numpy as np
def roll_zeropad(a, shift, axis=None):
    """
    Roll array elements along a given axis.

    Elements off the end of the array are treated as zeros.

    Parameters
    ----------
    a : array_like
        Input array.
    shift : int
        The number of places by which elements are shifted.
    axis : int, optional
        The axis along which elements are shifted. By default, the array
        is flattened before shifting, after which the original
        shape is restored.

    Returns
    -------
    res : ndarray
        Output array, with the same shape as `a`.

    See Also
    --------
    roll : Elements that roll off one end come back on the other.
    rollaxis : Roll the specified axis backwards, until it lies in a
               given position.

    Examples
    --------
    >>> x = np.arange(10)
    >>> roll_zeropad(x, 2)
    array([0, 0, 0, 1, 2, 3, 4, 5, 6, 7])
    >>> roll_zeropad(x, -2)
    array([2, 3, 4, 5, 6, 7, 8, 9, 0, 0])

    >>> x2 = np.reshape(x, (2,5))
    >>> x2
    array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]])
    >>> roll_zeropad(x2, 1)
    array([[0, 0, 1, 2, 3],
           [4, 5, 6, 7, 8]])
    >>> roll_zeropad(x2, -2)
    array([[2, 3, 4, 5, 6],
           [7, 8, 9, 0, 0]])
    >>> roll_zeropad(x2, 1, axis=0)
    array([[0, 0, 0, 0, 0],
           [0, 1, 2, 3, 4]])
    >>> roll_zeropad(x2, -1, axis=0)
    array([[5, 6, 7, 8, 9],
           [0, 0, 0, 0, 0]])
    >>> roll_zeropad(x2, 1, axis=1)
    array([[0, 0, 1, 2, 3],
           [0, 5, 6, 7, 8]])
    >>> roll_zeropad(x2, -2, axis=1)
    array([[2, 3, 4, 0, 0],
           [7, 8, 9, 0, 0]])

    >>> roll_zeropad(x2, 50)
    array([[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]])
    >>> roll_zeropad(x2, -50)
    array([[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]])
    >>> roll_zeropad(x2, 0)
    array([[0, 1, 2, 3, 4],
           [5, 6, 7, 8, 9]])

    """
    a = np.asanyarray(a)
    if shift == 0: return a
    if axis is None:
        n = a.size
        reshape = True
    else:
        n = a.shape[axis]
        reshape = False
    if np.abs(shift) > n:
        res = np.zeros_like(a)
    elif shift < 0:
        shift += n
        zeros = np.zeros_like(a.take(np.arange(n-shift), axis))
        res = np.concatenate((a.take(np.arange(n-shift,n), axis), zeros), axis)
    else:
        zeros = np.zeros_like(a.take(np.arange(n-shift,n), axis))
        res = np.concatenate((zeros, a.take(np.arange(n-shift), axis)), axis)
    if reshape:
        return res.reshape(a.shape)
    else:
        return res
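One way the optimization mentioned before the function might look is to allocate the zero-filled result once and copy a single slice into it, with no take or concatenate. This is only a sketch: roll_zeropad_slices is a hypothetical name, and unlike the function above it assumes an explicit axis (no flattening for axis=None).

```python
import numpy as np

def roll_zeropad_slices(a, shift, axis=0):
    """Shift along one axis, filling vacated positions with zeros."""
    a = np.asanyarray(a)
    if shift == 0:
        return a.copy()
    res = np.zeros_like(a)            # single allocation for the output
    n = a.shape[axis]
    if abs(shift) >= n:
        return res                    # everything is shifted out
    src = [slice(None)] * a.ndim
    dst = [slice(None)] * a.ndim
    if shift > 0:
        src[axis] = slice(0, n - shift)
        dst[axis] = slice(shift, n)
    else:
        src[axis] = slice(-shift, n)
        dst[axis] = slice(0, n + shift)
    res[tuple(dst)] = a[tuple(src)]   # copy the surviving block into place
    return res

x2 = np.arange(10).reshape(2, 5)
print(roll_zeropad_slices(x2, 1, axis=1).tolist())   # [[0, 0, 1, 2, 3], [0, 5, 6, 7, 8]]
print(roll_zeropad_slices(x2, -2, axis=1).tolist())  # [[2, 3, 4, 0, 0], [7, 8, 9, 0, 0]]
```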
import numpy as np
def shift_2d_replace(data, dx, dy, constant=False):
    """
    Shifts the array in two dimensions while setting rolled values to constant
    :param data: The 2d numpy array to be shifted
    :param dx: The shift in x
    :param dy: The shift in y
    :param constant: The constant to replace rolled values with
    :return: The shifted array with "constant" where roll occurs
    """
    shifted_data = np.roll(data, dx, axis=1)
    if dx < 0:
        shifted_data[:, dx:] = constant
    elif dx > 0:
        shifted_data[:, 0:dx] = constant

    shifted_data = np.roll(shifted_data, dy, axis=0)
    if dy < 0:
        shifted_data[dy:, :] = constant
    elif dy > 0:
        shifted_data[0:dy, :] = constant
    return shifted_data
This function would work on 2D arrays and replace rolled values with a constant of your choosing.
A bit late, but feels like a quick way to do what you want in one line. Perhaps would work best if wrapped inside a smart function (example below provided just for horizontal axis):
import numpy
a = numpy.arange(1,10).reshape(3,3) # an example 2D array
print(a)
[[1 2 3]
 [4 5 6]
 [7 8 9]]
shift = 1
# dtype=a.dtype keeps the result integer; the default float zeros would upcast it
a = numpy.hstack((numpy.zeros((a.shape[0], shift), dtype=a.dtype), a[:,:-shift]))
print(a)
[[0 1 2]
[0 4 5]
[0 7 8]]
You can also use ndimage.shift:
>>> from scipy import ndimage
>>> arr = np.array([[1, 2, 3], [4, 5, 6]])
>>> ndimage.shift(arr, (0,1))
array([[0, 1, 2],
[0, 4, 5]])
Elaborating on the answer by Hooked (since it took me a few minutes to understand it)
The code below first pads a certain amount of zeros in the up, down, left and right margins and then selects the original matrix inside the padded one. A perfectly useless code, but good for understanding np.pad.
import numpy as np
x = np.array([[1, 2, 3],[4, 5, 6]])
y = np.pad(x,((1,3),(2,4)), mode='constant')[1:-3,2:-4]
print(np.all(x==y))
Now, to make an upwards shift of 2 combined with a rightwards shift of 1 position, one should do
print(np.pad(x,((0,2),(1,0)), mode='constant')[2:,0:-1])
You could also use numpy's triu and scipy.linalg's circulant. Make a circulant version of your matrix. Then, select the upper triangular part starting at the first diagonal, (the default option in triu). The row index will correspond to the number of padded zeros you want.
If you don't have scipy you can generate a nXn circulant matrix by making an (n-1) X (n-1) identity matrix and stacking a row [0 0 ... 1] on top of it and the column [1 0 ... 0] to the right of it.
I faced a similar problem with shifting a 2-d array in both directions
import numpy as np

def shift_frame(img,move_dir,fill=np.inf):
    frame = np.full_like(img,fill)
    x,y = move_dir
    size_x,size_y = np.array(img.shape) - np.abs(move_dir)
    frame_x = slice(0,size_x) if x>=0 else slice(-x,size_x-x)
    frame_y = slice(0,size_y) if y>=0 else slice(-y,size_y-y)
    img_x = slice(x,None) if x>=0 else slice(0,size_x)
    img_y = slice(y,None) if y>=0 else slice(0,size_y)
    frame[frame_x,frame_y] = img[img_x,img_y]
    return frame

test = np.arange(25).reshape((5,5))
# The default fill=np.inf cannot be stored in an integer array, so pass an integer fill here
shift_frame(test,(1,1),fill=-1)
'''
returns:
array([[ 6,  7,  8,  9, -1],
       [11, 12, 13, 14, -1],
       [16, 17, 18, 19, -1],
       [21, 22, 23, 24, -1],
       [-1, -1, -1, -1, -1]])
'''
I haven't measured the runtime of this, but it seems to work well enough for my use, although a built-in one-liner would be nice.
import numpy as np
def roll_zeropad(a, dyx):
    h, w = a.shape[:2]
    dy, dx = dyx
    pad_x, start_x, end_x = ((dx,0), 0, w) if dx > 0 else ((0,-dx), -dx, w-dx)
    pad_y, start_y, end_y = ((dy,0), 0, h) if dy > 0 else ((0,-dy), -dy, h-dy)
    return np.pad(a, (pad_y, pad_x))[start_y:end_y,start_x:end_x]
test = np.arange(25).reshape((5,5))
out = roll_zeropad(test,(1,1))
print(out)
"""
returns:
[[ 0 0 0 0 0]
[ 0 0 1 2 3]
[ 0 5 6 7 8]
[ 0 10 11 12 13]
[ 0 15 16 17 18]]
"""