Numpy, Reading from File with no delimiter, and most efficient way - python

I'm looking for most efficient way to load file with matrix to numpy array without delimiter.
should I use generator to convert and fill? file consist of single 1 and 0 only
000000000
011111111
111000100
110001110
000001100
001000000
110000000
111111100
to:
[
[0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 1, 1, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 0, 0, 0, 1, 0, 0]
...
]

You can use numpy.genfromtxt
import numpy as np
np.genfromtxt('matrix.txt', delimiter=1, dtype=int)
array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 0, 0, 0, 1, 0, 0],
[1, 1, 0, 0, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 0, 0]])

Related

Unexpected "list index out of range" error when extracting values of last column in 2D list

I have a 20x20 2D nested list (see end of post) composed of ones and zeros.
My aim is to store all the elements of the last column (except for the first and last rows) into a variable.
I initially thought the following code would work:
variable = nestedList[1:19][19]
However this throws me an "index out of range" error.
Why is this error coming up and how can I solve this problem?
The 2D list:
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
[0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0],
[1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1],
[1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1],
[0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0],
[1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0],
[1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1],
[1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1],
[1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0],
[1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1],
[1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1],
[1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1],
[0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]]
[19] is not indexing a column, it's indexing row number 19 in the list of rows. But you've sliced the rows so there are only 18 rows, so you get an error trying to access [19].
Use a list comprehension to get an element of each row:
[row[-1] for row in nestedList[1:-1]]
Notice that you can use negative indexes to count from the end.
While Barmar gives the technically correct answer, this solution has a clumsy taste. You can almost stick to your more intuitive syntax if you use numpy:
import numpy as np
mat = np.array(nested_list)
variable = mat[1:19,19]

Python 2d array loop

It might be stupid question but I'm starting with python and I have no clue how to write it.
So I want to print this table in loop like on the screen and then I want it to be usable but It's hard for me to write it down (look on screen pls):
table = [print([random.randint(0,1) for x in range(10)]) for y in range(10)]
a = table
print(a)
console output
The values are printed using the print inside the list. But the method print return nothing, None, so you're saving 10 None in the outer list
table = [[random.randint(0, 1) for x in range(10)] for y in range(10)]
for row in table:
print(row)
a = table
print(a)
[1, 0, 0, 1, 1, 0, 1, 1, 1, 1]
[0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
[1, 0, 1, 0, 0, 0, 0, 1, 1, 0]
[1, 1, 1, 1, 0, 1, 1, 1, 0, 0]
[0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
[1, 0, 0, 0, 0, 1, 0, 0, 0, 1]
[0, 1, 0, 1, 1, 1, 0, 0, 0, 0]
[0, 1, 0, 1, 0, 1, 1, 1, 1, 1]
[1, 0, 0, 0, 1, 1, 1, 0, 1, 1]
[1, 1, 0, 0, 0, 0, 1, 1, 0, 0]
[[1, 0, 0, 1, 1, 0, 1, 1, 1, 1], [0, 1, 1, 0, 0, 1, 0, 1, 1, 0], [1, 0, 1, 0, 0, 0, 0, 1, 1, 0], [1, 1, 1, 1, 0, 1, 1, 1, 0, 0], [0, 1, 0, 0, 1, 1, 0, 1, 0, 0], [1, 0, 0, 0, 0, 1, 0, 0, 0, 1], [0, 1, 0, 1, 1, 1, 0, 0, 0, 0], [0, 1, 0, 1, 0, 1, 1, 1, 1, 1], [1, 0, 0, 0, 1, 1, 1, 0, 1, 1], [1, 1, 0, 0, 0, 0, 1, 1, 0, 0]]

Finding groups of same value inside a 2D list

I'm currently working on something were I need to find groups of values in a 2d list that are surrounded by another value and then change the values of the surrounded elements. Start/end of a sub-list counts as surrounded.
For exemple if I have this list :
[[1, 2, 1, 1, 1, 0, 0, 0, 0],
[2, 2, 2, 2, 2, 1, 0, 0, 0],
[1, 2, 2, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 2, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]]
I want the function to change it to this:
[[1, 3, 1, 1, 1, 0, 0, 0, 0],
[3, 3, 3, 3, 3, 1, 0, 0, 0],
[1, 3, 3, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 3, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]]
I have no idea how to do it. Can anyone please help me ?
lst = [
[1, 2, 1, 1, 1, 0, 0, 0, 0],
[2, 2, 2, 2, 2, 1, 0, 0, 0],
[1, 2, 2, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 2, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
]
for row in range(len(lst)):
for col in range(len(lst[row])):
if lst[row][col] == 1:
continue
v1 = lst[row][col - 1] >= 1 if col - 1 >= 0 else 1
v2 = lst[row - 1][col] >= 1 if row - 1 >= 0 else 1
v3 = lst[row][col + 1] >= 1 if col + 1 < len(lst[row]) else 1
v4 = lst[row + 1][col] >= 1 if row + 1 < len(lst) else 1
if v1 + v2 + v3 + v4 == 4:
lst[row][col] += 1
from pprint import pprint
pprint(lst)
Prints:
[[1, 3, 1, 1, 1, 0, 0, 0, 0],
[3, 3, 3, 3, 3, 1, 0, 0, 0],
[1, 3, 3, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 3, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]]

How to permutate rows in sparse (Numpy) matrix efficiently using permutation array?

I used the Scipy Reverse Cuthill-McKee implementation (scipy.sparse.csgraph.reverse_cuthill_mckee) for creating a band matrix using a (high-dimensional) sparse csr_matrix.
The result of this method is a permutation array whichs gives me the indices of how to permutate the rows of my matrix as I understood.
Now is there any efficient solution for doing this permutation on my sparse csr_matrix in any other sparse matrix (csr, lil_matrix, etc)?
I tried a for-loop but my matrix has dimension like 200,000 x 150,000 and it takes too much time.
A = csr_matrix((data,(rowind,columnind)), shape=(200000, 150000), dtype=np.uint8)
permutation_array = csgraph.reverse_cuthill_mckee(A, false)
result_matrix = lil_matrix((200000, 150000), dtype=np.uint8)
i=0
for x in np.nditer(permutation_array):
result_matrix[x, :]=A[i, :]
i+=1
The result of the reverse_cuthill_mckee call is an array which is like a tupel containing the indices for my permutation. So this array is something like: [199999 54877 54873 ..., 12045 9191 0] (size = 200,000)
This means:
row with index 0 has now index 199999,
row with index 1 has now index 54877,
row with index 2 has now index 54873,
etc. see: https://en.wikipedia.org/wiki/Permutation#Definition_and_notations
(As I understood the return)
Thank you
I wonder if you are applying the permutation array correctly.
Make a random matrix (float) and convert it to a uint8 (beware, csr calculations might not work with this dtype):
In [963]: ran=sparse.random(10,10,.3, format='csr')
In [964]: A = sparse.csr_matrix((np.ones(ran.data.shape).astype(np.uint8),ran.indices, ran.indptr))
In [965]: A.A
Out[965]:
array([[1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 1, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 1, 0, 1],
[0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
[0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 0, 0, 0]], dtype=uint8)
(oops, used the wrong matrix here):
In [994]: permutation_array = csgraph.reverse_cuthill_mckee(A, False)
In [995]: permutation_array
Out[995]: array([9, 7, 0, 4, 6, 3, 5, 1, 8, 2], dtype=int32)
My first inclination is to use such an array to simply index rows of the original matrix:
In [996]: A[permutation_array,:].A
Out[996]:
array([[0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
[1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 1, 0, 1],
[1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
[0, 1, 1, 1, 1, 1, 1, 0, 1, 0],
[0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
I see some clustering; maybe the best we can expect from a random matrix.
You on the other hand appear to be doing:
In [997]: res = sparse.lil_matrix(A.shape,dtype=A.dtype)
In [998]: res[permutation_array,:] = A
In [999]: res.A
Out[999]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
[0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
[1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 1, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 1, 0, 1, 0],
[0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 1, 0, 0, 0]], dtype=uint8)
I don't see any improvement in clustering of 1s in res.
The docs for the MATLAB equivalent say
r = symrcm(S) returns the symmetric reverse Cuthill-McKee ordering of S. This is a permutation r such that S(r,r) tends to have its nonzero elements closer to the diagonal.
In numpy terms, that means:
In [1019]: I,J=np.ix_(permutation_array,permutation_array)
In [1020]: A[I,J].A
Out[1020]:
array([[0, 0, 0, 1, 1, 0, 1, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 1, 0, 0, 1, 0, 0],
[0, 0, 0, 1, 0, 0, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 1, 0, 0],
[0, 1, 1, 0, 0, 0, 1, 0, 0, 1],
[0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 1, 1, 1, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
And indeed there are more 0 bands in the 2 off diagonal corners.
And using the bandwidth calculation on the MATLAB page, https://www.mathworks.com/help/matlab/ref/symrcm.html
In [1028]: i,j=A.nonzero()
In [1029]: np.max(i-j)
Out[1029]: 7
In [1030]: i,j=A[I,J].nonzero()
In [1031]: np.max(i-j)
Out[1031]: 5
The MATLAB docs say that with this permutation, the eigenvalues remain the same. Testing:
In [1032]: from scipy.sparse import linalg
In [1048]: linalg.eigs(A.astype('f'))[0]
Out[1048]:
array([ 3.14518213+0.j , -0.96188843+0.j ,
-0.58978939+0.62853903j, -0.58978939-0.62853903j,
1.09950364+0.54544497j, 1.09950364-0.54544497j], dtype=complex64)
In [1049]: linalg.eigs(A[I,J].astype('f'))[0]
Out[1049]:
array([ 3.14518023+0.j , 1.09950352+0.54544479j,
1.09950352-0.54544479j, -0.58978981+0.62853914j,
-0.58978981-0.62853914j, -0.96188819+0.j ], dtype=complex64)
Eigenvalues are not the same for the row permutations we tried earlier:
In [1050]: linalg.eigs(A[permutation_array,:].astype('f'))[0]
Out[1050]:
array([ 2.95226836+0.j , -1.60117996+0.52467293j,
-1.60117996-0.52467293j, -0.01723826+1.06249797j,
-0.01723826-1.06249797j, 0.90314150+0.j ], dtype=complex64)
In [1051]: linalg.eigs(res.astype('f'))[0]
Out[1051]:
array([-0.05822830-0.97881651j, -0.99999994+0.j ,
1.17350495+0.j , -0.91237622+0.8656373j ,
-0.91237622-0.8656373j , 2.26292515+0.j ], dtype=complex64)
This [I,J] permutation works with the example matrix in http://ciprian-zavoianu.blogspot.com/2009/01/project-bandwidth-reduction.html
In [1058]: B = np.matrix('1 0 0 0 1 0 0 0;0 1 1 0 0 1 0 1;0 1 1 0 1 0 0 0;0 0 0
...: 1 0 0 1 0;1 0 1 0 1 0 0 0; 0 1 0 0 0 1 0 1;0 0 0 1 0 0 1 0;0 1 0 0 0
...: 1 0 1')
In [1059]: B
Out[1059]:
matrix([[1, 0, 0, 0, 1, 0, 0, 0],
[0, 1, 1, 0, 0, 1, 0, 1],
[0, 1, 1, 0, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 0, 0],
[0, 1, 0, 0, 0, 1, 0, 1],
[0, 0, 0, 1, 0, 0, 1, 0],
[0, 1, 0, 0, 0, 1, 0, 1]])
In [1060]: Bm=sparse.csr_matrix(B)
In [1061]: Bm
Out[1061]:
<8x8 sparse matrix of type '<class 'numpy.int32'>'
with 22 stored elements in Compressed Sparse Row format>
In [1062]: permB = csgraph.reverse_cuthill_mckee(Bm, False)
In [1063]: permB
Out[1063]: array([6, 3, 7, 5, 1, 2, 4, 0], dtype=int32)
In [1064]: Bm[np.ix_(permB,permB)].A
Out[1064]:
array([[1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 1, 1]], dtype=int32)

Scipy label erosion

How can I keep a ring of pixels around labeled regions in a numpy array?
In a simple case, I'd subtract the erosion. That approach doesn't work when the labels touch. How can I get get B from A?
A = array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
B = array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
I'm working with large arrays with many labels, so separate erosions on each label isn't an option.
New Answer
Actually, I just thought of a better way:
B = A * (np.abs(scipy.ndimage.laplace(A)) > 0)
As a full example:
import numpy as np
import scipy.ndimage
A = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
B = A * (np.abs(scipy.ndimage.laplace(A)) > 0)
I think this should work in all cases (of "labeled" arrays like A, at any rate...).
If you're worried about performance, you can split this into a few pieces to reduce memory overhead:
B = scipy.ndimage.laplace(A)
B = np.abs(B, B) # Preform abs in-place
B /= B # This will produce a divide by zero warning that you can safely ignore
B *= A
This version is a lot more verbose, but should use much less memory.
Old Answer
I can't think of a good way to do it in one step with the usual scipy.ndimage functions. (I feel like a tophat filter should do what you want, but I can't quite figure it out.)
However, doing several separate erosions is an option, as you mentioned.
You should get reasonable performance even on very large arrays if you use find_objects to extract the subregion of each label, and then just do the erosion on the subregion.
For example:
import numpy as np
import scipy.ndimage
A = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
regions = scipy.ndimage.find_objects(A)
mask = np.zeros_like(A).astype(np.bool)
for val, region in enumerate(regions, start=1):
if region is not None:
subregion = A[region]
mask[region] = scipy.ndimage.binary_erosion(subregion == val)
B = A.copy()
B[mask] = 0
This yields:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
The performance should be reasonable for large arrays, but it's going to depend strongly on how large of an area the different labeled objects span and the number of labeled objects that you have....

Categories

Resources