First, here's the 1-D analog of what I'm trying to do.
Suppose I have a 1d array of 0s and I want to replace every 0 from index 2 onward with a 1. I can do this as follows:
import numpy as np
x = np.array([0,0,0,0])
i = 2
x[i:] = 1
print(x) # [0 0 1 1]
Now, I'm trying to figure out the 2d version of this operation. To start, I have a 5x4 array of 0s like
foo = np.zeros(shape = (5,4))
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
and a corresponding 5 element array of column indices like
fill_locs = np.array([0, 3, 1, 1, 2])
For each row of foo, I want to fill columns i: with 1s where i is the index given by fill_locs. foo[fill_locs.reshape(-1, 1):] = 1 feels right, but doesn't work.
My desired output should look like
expected_result = np.array([
[1, 1, 1, 1],
[0, 0, 0, 1],
[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 0, 1, 1],
])
You don't need slicing, and you don't need to create the original array. You can accomplish all of this with broadcasted comparison.
a = np.array([0, 3, 1, 1, 2])
n = 4
(a[:, None] <= np.arange(n)).view('i1')
array([[1, 1, 1, 1],
[0, 0, 0, 1],
[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 0, 1, 1]], dtype=int8)
Or, using np.less_equal.outer:
np.less_equal.outer(a, np.arange(n)).view('i1')
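If you do already have the array of zeros and want to fill it in place, a minimal sketch of the masked-assignment variant (assuming foo and fill_locs as defined in the question):
import numpy as np

foo = np.zeros(shape=(5, 4))
fill_locs = np.array([0, 3, 1, 1, 2])

# The broadcasted comparison gives a boolean mask that is True from each row's
# fill location onward; assigning through the mask fills those positions with 1.
foo[fill_locs[:, None] <= np.arange(foo.shape[1])] = 1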
Related
How can I get a 2D np.array with value 1 at the column indices given by the values of a 1D np.array, in Python?
Example:
[1, 2, 5, 1, 2]
should be converted to
[[0, 1, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 1],
[0, 1, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0]]
Here you already know the width (shape[1]) of the new array beforehand.
I can do it manually, but is there any way to do it directly with NumPy methods for faster execution? The dimension of my array is quite large and I have to do this at every iteration, so doing it manually each time is computationally demanding.
You can create an array of zeros using np.zeros. The shape of the array should be (len(1D array), max(1D array)+1). Then use NumPy's integer-array indexing.
idx = [1, 2, 5, 1, 2]
shape = (len(idx), max(idx)+1)
out = np.zeros(shape)
out[np.arange(len(idx)), idx] = 1
print(out)
[[0. 1. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0.]
[0. 0. 0. 0. 0. 1.]
[0. 1. 0. 0. 0. 0.]
[0. 0. 1. 0. 0. 0.]]
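If the width of the output is known beforehand (as the question notes), a minimal sketch is to pass that width directly instead of deriving it from max(idx)+1; here width = 6 is just the value from the example:
idx = np.array([1, 2, 5, 1, 2])
width = 6                                   # assumed known ahead of time
out = np.zeros((len(idx), width))
out[np.arange(len(idx)), idx] = 1           # one 1 per row, at the given column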
I have a matrix with some rows and columns equal to zero, so it is not invertible.
I need to get the inverse of the non-zero submatrix, so that the inverse then has the same structure as the original matrix.
Expected behavior would be something like this:
>>>test
array([[1, 0, 0, 2],
[0, 0, 0, 0],
[0, 0, 0, 0],
[3, 0, 0, 4]])
>>>get_nonzero(test)
array([[1, 2],
[3, 4]])
>>>np.linalg.inv(nonzero)
array([[-2. , 1. ],
[ 1.5, -0.5]])
>>>restore_shape(inv_matrix)
array([[-2. , 0. , 0. , 1. ],
[ 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. ],
[ 1.5, 0. , 0. , -0.5]])
Perhaps it is relevant that I originally get the test matrix by zeroing rows and columns of an original matrix (whose elements are all non-zero) with boolean indexing, like:
>>>bool_index
array([False, True, True, False])
>>>original[bool_index, :] = 0
>>>original[:, bool_index] = 0
>>>original
array([[1, 0, 0, 2],
[0, 0, 0, 0],
[0, 0, 0, 0],
[3, 0, 0, 4]])
I managed to get the non-zero submatrix from the original matrix by first converting it to a pandas DataFrame and indexing with boolean arrays via .loc, like this:
>>>pd.DataFrame(original).loc[~bool_index, ~bool_index].values
array([[1, 2],
[3, 4]])
However, I am not sure how I can efficiently restore the inverted array to the original shape.
To use the function interface you propose, and assuming the non-zero elements of your input matrix form an N*N submatrix, the following works:
import numpy as np
def get_nonzero(t1):
    # keep the non-zero entries and reshape them into a square submatrix
    t2 = t1[t1 != 0]
    size = int(np.sqrt(len(t2)))
    return t2.reshape(size, size)

def restore_shape(t1, t3):
    # scatter the entries of t3 back to the positions where t1 is non-zero
    t4 = np.zeros(t1.shape)
    idx_non_zeros = np.nonzero(t1)
    for i, elt in enumerate(t3.flatten()):
        t4[idx_non_zeros[0][i], idx_non_zeros[1][i]] = elt
    return t4
t1 = np.array([[1, 0, 0, 2],
[0, 0, 0, 0],
[0, 0, 0, 0],
[3, 0, 0, 4]])
t2 = get_nonzero(t1)
t3 = np.linalg.inv(t2)
t4 = restore_shape(t1, t3)
But as proposed in the comments, np.linalg.pinv(t1) is much more elegant and efficient.
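As an alternative sketch (not part of the answer above), both the pandas step and the element-by-element restore can be avoided with np.ix_, assuming bool_index and original as defined in the question:
# select the non-zero block with an open mesh of the kept rows/columns
sub = original[np.ix_(~bool_index, ~bool_index)]
# invert the block and scatter it back into a zero matrix of the original shape
restored = np.zeros(original.shape)
restored[np.ix_(~bool_index, ~bool_index)] = np.linalg.inv(sub)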
Consider this example matrix:
[[0 1 2 1 0]
[1 1 2 1 0]
[0 1 0 0 0]
[1 2 1 0 0]
[1 2 2 3 2]]
What I need to do:
find the maximum in every row
select a small window around each row's maximum (3 values in this case)
paste these windows into a new, narrower array
For the example above, the result is:
[[ 1. 2. 1.]
[ 1. 2. 1.]
[ 0. 1. 0.]
[ 1. 2. 1.]
[ 2. 3. 2.]]
My current working code:
import numpy as np
A = np.array([
[0, 1, 2, 1, 0],
[1, 1, 2, 1, 0],
[0, 1, 0, 0, 0],
[1, 2, 1, 0, 0],
[1, 2, 2, 3, 2],
])
b = A.argmax(axis=1)
C = np.zeros((len(A), 3))
for idx, loc, row in zip(range(len(A)), b, A):
    print(idx, loc, row)
    C[idx] = row[loc-1:loc+2]
print(C)
My question:
How to get rid of the for loop and replace it with some cheaper numpy operation?
Note:
This algorithm is for straightening broken "lines" in video stream frames with thousands of rows.
Approach #1
We can build a vectorized solution by setting up sliding windows and then indexing into them with b-offset indices to get the desired output. We can use scikit-image's view_as_windows (itself based on np.lib.stride_tricks.as_strided) to get the sliding windows.
The implementation would be -
from skimage.util.shape import view_as_windows
L = 3 # window length
w = view_as_windows(A,(1,L))[...,0,:]
Cout = w[np.arange(len(b)),b-L//2]
Being a view-based method, this has the advantage of being memory-efficient and hence good on performance too.
Approach #2
Alternatively, a one-liner that creates all those indices with an outer addition would be -
A[np.arange(len(b))[:,None],b[:,None] + np.arange(-(L//2),L//2+1)]
This works by making an array with all the desired indices; however, using that 2D index array directly on A (as A[:, index]) results in a 3D array, hence the subsequent diagonal indexing... Probably not optimal, but definitely another way of doing it!
import numpy as np
A = np.array([
[0, 1, 2, 1, 0],
[1, 1, 2, 1, 0],
[0, 1, 0, 0, 0],
[1, 2, 1, 0, 0],
[1, 2, 2, 3, 2],
])
b = A.argmax(axis = 1).reshape(-1, 1)
index = b + np.arange(-1,2,1).reshape(1, -1)
A[:,index][np.arange(b.size),np.arange(b.size)]
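A slightly more direct variant of the same idea (a sketch, assuming A and index as defined just above) pairs a column of row indices with index, so no 3D intermediate or diagonal extraction is needed; it is essentially Approach #2 above. The clipping guard is an addition, not part of the original answers: if a row's maximum can sit in the first or last column, the offset indices fall outside the row (negative indices silently wrap around).
C = A[np.arange(len(A))[:, None], index]     # shape (5, 3), same result as the diagonal trick

# optional guard: keep the window indices inside the row when the maximum
# lies in the very first or very last column
safe_index = np.clip(index, 0, A.shape[1] - 1)
C_safe = A[np.arange(len(A))[:, None], safe_index]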
How do I create an n*n checkerboard matrix with the values alternating between 0 and 1, using the tile function?
For example:
when n has a value of 2, Output should be:
[[0 1]
[1 0]]
I am able to create a matrix of 0s and 1s, but the rows do not alternate. Below is what I tried:
import numpy as np
n = 4
arr = ([0,1])
print(np.tile(arr,(n,n//2)))
output I got:
[[0 1 0 1]
[0 1 0 1]
[0 1 0 1]
[0 1 0 1]]
output I want:
[[0 1 0 1]
[1 0 1 0]
[0 1 0 1]
[1 0 1 0]]
A simple way using numpy could be to define a vector of 0s and 1s of size n and take advantage of broadcasting to create an n x n checkerboard:
def checkerboard(n):
    a = np.resize([0, 1], n)
    return np.abs(a - np.array([a]).T)
Sample use -
checkerboard(2)
array([[0, 1],
[1, 0]])
checkerboard(4)
array([[0, 1, 0, 1],
[1, 0, 1, 0],
[0, 1, 0, 1],
[1, 0, 1, 0]])
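It also works for odd n; a quick check:
checkerboard(3)
array([[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]])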
Details -
The above works by initially creating a length n 1D vector of 0s and 1s using np.resize:
import numpy as np
n = 3
np.resize([0,1], n)
array([0, 1, 0])
And then subtracting its transpose (made 2D), which broadcasts to an array of shape (n, n) with negative and positive 1s (shown here for n = 4):
a-np.array([a]).T
array([[ 0, 1, 0, 1],
[-1, 0, -1, 0],
[ 0, 1, 0, 1],
[-1, 0, -1, 0]])
We just need to take the absolute value of it and we have a checkerboard matrix.
You could use numpy slice indexing; there is no need to use np.tile:
import numpy as np
def tiling(n):
    result = np.zeros((n, n))
    result[::2, 1::2] = 1
    result[1::2, ::2] = 1
    return result
print(tiling(2))
print()
print(tiling(4))
Output
[[0. 1.]
[1. 0.]]
[[0. 1. 0. 1.]
[1. 0. 1. 0.]
[0. 1. 0. 1.]
[1. 0. 1. 0.]]
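If integer output is preferred over the float zeros shown above, one small tweak (a sketch) is to give np.zeros an explicit integer dtype; the two slice assignments then fill it with integer 1s:
result = np.zeros((n, n), dtype=int)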
Here is a one line numpy solution. That said, I think Daniel's response is much more readable and probably more efficient.
If n is odd, then np.arange(n*n).reshape(n,n)%2 gives the correct result. However, if n is even, then all the rows will be identical (like your result). We can fix this by subtracting one from every other row.
tile = (np.arange(n*n).reshape(n,n)-np.arange(n).reshape(n,1)*(n%2+1))%2
Equivalently,
tile = (np.arange(n*n).reshape(n,n,order='F')-np.arange(n)*(n+1))%2
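Since the question specifically asks about np.tile, here is one possible sketch with tile (not from the original answers): repeat a 2x2 block ceil(n/2) times in each direction and trim to n x n, which also handles odd n:
import numpy as np

def checkerboard_tile(n):
    reps = (n + 1) // 2                              # ceil(n / 2) repetitions of the 2x2 block
    return np.tile([[0, 1], [1, 0]], (reps, reps))[:n, :n]

print(checkerboard_tile(4))
# [[0 1 0 1]
#  [1 0 1 0]
#  [0 1 0 1]
#  [1 0 1 0]]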
Generally, I'm trying to split a distance matrix into K folds. Specifically, for the 3 x 3 case, my distance matrix might look like this:
full = np.array([
[0, 0, 3],
[1, 0, 1],
[2, 1, 0]
])
I also have a list of randomly generated assignments, the length of which is equal to the sum over all elements in the distance matrix. For the K = 3 case, it might look like this:
assignments = np.array([0, 1, 0, 2, 1, 1, 0, 0])
I want to create K = 3 new 3 x 3 matrices of zeros, in which the values of the distance matrix are "distributed" according to the assignments list. Code is more precise than words, so here's my current attempt:
def assign(full, assignments):
    folds = [np.zeros(full.shape) for _ in xrange(np.max(assignments) + 1)]
    rows, cols = full.shape
    a = 0
    for r in xrange(rows):
        for c in xrange(cols):
            for i in xrange(full[r, c]):
                folds[assignments[a]][r, c] += 1
                a += 1
    return folds
This works (slowly), and in this example,
folds = assign(full, assignments)
for f in folds:
    print f
returns
[[ 0. 0. 2.]
[ 0. 0. 0.]
[ 1. 1. 0.]]
[[ 0. 0. 1.]
[ 0. 0. 1.]
[ 1. 0. 0.]]
[[ 0. 0. 0.]
[ 1. 0. 0.]
[ 0. 0. 0.]]
as desired. However, my current attempt is very slow, especially for the N x N case for N large. How can I improve the speed of this function? Is there some numpy magic that I should be using here?
One idea I had was converting to a sparse matrix and looping over the nonzero entries. This would only help a bit, however.
You can use np.add.at to do an unbuffered in-place operation:
import numpy as np
full = np.array([
[0, 0, 3],
[1, 0, 1],
[2, 1, 0]
])
assignments = np.array([0, 1, 0, 2, 1, 1, 0, 0])
res = np.zeros((np.max(assignments) + 1,) + full.shape, dtype=int)
r, c = np.nonzero(full)
n = full[r, c]
r = np.repeat(r, n)
c = np.repeat(c, n)
np.add.at(res, (assignments, r, c), 1)
print(res)
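For the example inputs, res should reproduce the three folds printed in the question, stacked into one (K, N, N) integer array; a quick check against those expected values:
expected = np.array([
    [[0, 0, 2], [0, 0, 0], [1, 1, 0]],
    [[0, 0, 1], [0, 0, 1], [1, 0, 0]],
    [[0, 0, 0], [1, 0, 0], [0, 0, 0]],
])
assert np.array_equal(res, expected)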
You just need to figure out what item in the flattened output would get incremented each time, then aggregate them with bincount:
def assign(full, assignments):
    assert len(assignments) == np.sum(full)
    rows, cols = full.shape
    n = np.max(assignments) + 1
    full_flat = full.reshape(-1)
    full_flat_non_zero = full_flat != 0
    full_flat_indices = np.repeat(np.where(full_flat_non_zero)[0],
                                  full_flat[full_flat_non_zero])
    folds_flat_indices = full_flat_indices + assignments*rows*cols
    return np.bincount(folds_flat_indices,
                       minlength=n*rows*cols).reshape(n, rows, cols)
>>> assign(full, assignments)
array([[[0, 0, 2],
[0, 0, 0],
[1, 1, 0]],
[[0, 0, 1],
[0, 0, 1],
[1, 0, 0]],
[[0, 0, 0],
[1, 0, 0],
[0, 0, 0]]])
You may want to print out each of those intermediate arrays for your example, to see what exactly is going on.
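For reference, here is a sketch of those intermediates for the example inputs (values worked out from full and assignments as given above):
full_flat = full.reshape(-1)                        # [0 0 3 1 0 1 2 1 0]
full_flat_non_zero = full_flat != 0
# one flat position per unit of distance: [2 2 2 3 5 6 6 7]
full_flat_indices = np.repeat(np.where(full_flat_non_zero)[0],
                              full_flat[full_flat_non_zero])
# each unit shifted into its fold's block of 9 cells: [ 2 11  2 21 14 15  6  7]
folds_flat_indices = full_flat_indices + assignments * 3 * 3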