Finding groups of same value inside a 2D list - python

I'm currently working on something were I need to find groups of values in a 2d list that are surrounded by another value and then change the values of the surrounded elements. Start/end of a sub-list counts as surrounded.
For exemple if I have this list :
[[1, 2, 1, 1, 1, 0, 0, 0, 0],
[2, 2, 2, 2, 2, 1, 0, 0, 0],
[1, 2, 2, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 2, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]]
I want the function to change it to this:
[[1, 3, 1, 1, 1, 0, 0, 0, 0],
[3, 3, 3, 3, 3, 1, 0, 0, 0],
[1, 3, 3, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 3, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]]
I have no idea how to do it. Can anyone please help me ?

lst = [
[1, 2, 1, 1, 1, 0, 0, 0, 0],
[2, 2, 2, 2, 2, 1, 0, 0, 0],
[1, 2, 2, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 2, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
]
for row in range(len(lst)):
for col in range(len(lst[row])):
if lst[row][col] == 1:
continue
v1 = lst[row][col - 1] >= 1 if col - 1 >= 0 else 1
v2 = lst[row - 1][col] >= 1 if row - 1 >= 0 else 1
v3 = lst[row][col + 1] >= 1 if col + 1 < len(lst[row]) else 1
v4 = lst[row + 1][col] >= 1 if row + 1 < len(lst) else 1
if v1 + v2 + v3 + v4 == 4:
lst[row][col] += 1
from pprint import pprint
pprint(lst)
Prints:
[[1, 3, 1, 1, 1, 0, 0, 0, 0],
[3, 3, 3, 3, 3, 1, 0, 0, 0],
[1, 3, 3, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 3, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]]

Related

How to make a lower triangle array of 10 but repeated across a diagonal n times?

I am trying to create an array of 10 for each item I have, but then put those arrays of 10 into a larger array diagonally with zeros filling the missing spaces.
Here is an example of what I am looking for, but only with arrays of 3.
import numpy as np
arr = np.tri(3,3)
arr
This creates an array that looks like this:
[[1,0,0],
[1,1,0],
[1,1,1]]
But I need an array of 10 * n that looks like this: (using arrays a 3 for example here, with n=2)
{1,0,0,0,0,0,
1,1,0,0,0,0,
1,1,1,0,0,0,
0,0,0,1,0,0,
0,0,0,1,1,0,
0,0,0,1,1,1}
Any help would be appreciated, thanks!
I have also tried
df_arr2 = pd.concat([df_arr] * (n), ignore_index=True)
df_arr3 = pd.concat([df_arr2] *(n), axis=1, ignore_index=True)
But this repeats the matrix across all rows and columns, when I only want the diagnonal ones.
Now I got it... AFAIU, the OP wants those np.tri triangles in the diagonal of a bigger, multiple of 3 square shaped array.
As per example, for n=2:
import numpy as np
n = 2
tri = np.tri(3)
arr = np.zeros((n*3, n*3))
for i in range(0, n*3, 3):
arr[i:i+3,i:i+3] = tri
arr.astype(int)
# Out:
# array([[1, 0, 0, 0, 0, 0],
# [1, 1, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0],
# [0, 0, 0, 1, 0, 0],
# [0, 0, 0, 1, 1, 0],
# [0, 0, 0, 1, 1, 1]])
I saw #brandt's solution which is definitely the best. Incase you want to construct the them manually you can use this method:
def custom_triangle_matrix(rows, rowlen, tsize):
cm = []
for i in range(rows):
row = []
for j in range(min((i//tsize)*tsize, rowlen)):
row.append(0)
for j in range((i//tsize)*tsize, min(((i//tsize)*tsize) + i%tsize + 1, rowlen)):
row.append(1)
for j in range(((i//tsize)*tsize) + i%tsize + 1, rowlen):
row.append(0)
cm.append(row)
return cm
Here are some example executions and what they look like using ppprint:
matrix = custom_triangle_matrix(6, 6, 3)
pprint.pprint(matrix)
[[1, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 1, 1, 1]]
matrix = custom_triangle_matrix(6, 9, 3)
pprint.pprint(matrix)
[[1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 0, 0, 0]]
matrix = custom_triangle_matrix(9, 6, 3)
pprint.pprint(matrix)
[[1, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
matrix = custom_triangle_matrix(10, 10, 5)
pprint.pprint(matrix)
[[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1]]
Good Luck!

Filtering elements of a numpy array depending on their occurrences

i have the following 2D numpy array M
M = np.array([[1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,1,1],
[1,1,1,0,0,0,0,0,0,1,1],
[0,0,0,0,0,1,1,1,0,0,0],
[0,0,0,0,0,1,1,1,0,0,0],
[1,1,1,0,1,1,1,1,0,0,0],
[1,1,1,0,0,1,1,1,0,0,0],
[1,1,1,0,0,1,1,1,0,0,0]])
which I want to identify its spots (Pixels with value==1 and connected to each other).
Thanks to the function 'label' from scipy, I can identify all of my spots in the matrix. The output should seem like this:
Output, Nbr= label(M)
#Output= array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
# [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
# [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0]])
I want only to have spots with 9 elements, that means the first and fourth spot.
using a for loop like this works fine:
for i in range(Nbr+1):
Spot= np.argwhere(components[:,:]== i)
if len(Spot)!=9:
M[Spot[:, 0], Spot[:, 1]]=0
#M= array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]])
The porblem is when my Spots are more than 4, my code is slower.
Is there any faster alternative that can do the job of the for loop?
Thanks.

How to permutate rows in sparse (Numpy) matrix efficiently using permutation array?

I used the Scipy Reverse Cuthill-McKee implementation (scipy.sparse.csgraph.reverse_cuthill_mckee) for creating a band matrix using a (high-dimensional) sparse csr_matrix.
The result of this method is a permutation array whichs gives me the indices of how to permutate the rows of my matrix as I understood.
Now is there any efficient solution for doing this permutation on my sparse csr_matrix in any other sparse matrix (csr, lil_matrix, etc)?
I tried a for-loop but my matrix has dimension like 200,000 x 150,000 and it takes too much time.
A = csr_matrix((data,(rowind,columnind)), shape=(200000, 150000), dtype=np.uint8)
permutation_array = csgraph.reverse_cuthill_mckee(A, false)
result_matrix = lil_matrix((200000, 150000), dtype=np.uint8)
i=0
for x in np.nditer(permutation_array):
result_matrix[x, :]=A[i, :]
i+=1
The result of the reverse_cuthill_mckee call is an array which is like a tupel containing the indices for my permutation. So this array is something like: [199999 54877 54873 ..., 12045 9191 0] (size = 200,000)
This means:
row with index 0 has now index 199999,
row with index 1 has now index 54877,
row with index 2 has now index 54873,
etc. see: https://en.wikipedia.org/wiki/Permutation#Definition_and_notations
(As I understood the return)
Thank you
I wonder if you are applying the permutation array correctly.
Make a random matrix (float) and convert it to a uint8 (beware, csr calculations might not work with this dtype):
In [963]: ran=sparse.random(10,10,.3, format='csr')
In [964]: A = sparse.csr_matrix((np.ones(ran.data.shape).astype(np.uint8),ran.indices, ran.indptr))
In [965]: A.A
Out[965]:
array([[1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 1, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 1, 0, 1],
[0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
[0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 0, 0, 0]], dtype=uint8)
(oops, used the wrong matrix here):
In [994]: permutation_array = csgraph.reverse_cuthill_mckee(A, False)
In [995]: permutation_array
Out[995]: array([9, 7, 0, 4, 6, 3, 5, 1, 8, 2], dtype=int32)
My first inclination is to use such an array to simply index rows of the original matrix:
In [996]: A[permutation_array,:].A
Out[996]:
array([[0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
[1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 1, 0, 1],
[1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
[0, 1, 1, 1, 1, 1, 1, 0, 1, 0],
[0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
I see some clustering; maybe the best we can expect from a random matrix.
You on the other hand appear to be doing:
In [997]: res = sparse.lil_matrix(A.shape,dtype=A.dtype)
In [998]: res[permutation_array,:] = A
In [999]: res.A
Out[999]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
[0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
[1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 1, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 1, 0, 1, 0],
[0, 1, 1, 1, 0, 1, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 1, 0, 0, 0]], dtype=uint8)
I don't see any improvement in clustering of 1s in res.
The docs for the MATLAB equivalent say
r = symrcm(S) returns the symmetric reverse Cuthill-McKee ordering of S. This is a permutation r such that S(r,r) tends to have its nonzero elements closer to the diagonal.
In numpy terms, that means:
In [1019]: I,J=np.ix_(permutation_array,permutation_array)
In [1020]: A[I,J].A
Out[1020]:
array([[0, 0, 0, 1, 1, 0, 1, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
[0, 0, 1, 0, 1, 0, 0, 1, 0, 0],
[0, 0, 0, 1, 0, 0, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 1, 0, 0],
[0, 1, 1, 0, 0, 0, 1, 0, 0, 1],
[0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
[0, 0, 0, 0, 0, 1, 1, 1, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
And indeed there are more 0 bands in the 2 off diagonal corners.
And using the bandwidth calculation on the MATLAB page, https://www.mathworks.com/help/matlab/ref/symrcm.html
In [1028]: i,j=A.nonzero()
In [1029]: np.max(i-j)
Out[1029]: 7
In [1030]: i,j=A[I,J].nonzero()
In [1031]: np.max(i-j)
Out[1031]: 5
The MATLAB docs say that with this permutation, the eigenvalues remain the same. Testing:
In [1032]: from scipy.sparse import linalg
In [1048]: linalg.eigs(A.astype('f'))[0]
Out[1048]:
array([ 3.14518213+0.j , -0.96188843+0.j ,
-0.58978939+0.62853903j, -0.58978939-0.62853903j,
1.09950364+0.54544497j, 1.09950364-0.54544497j], dtype=complex64)
In [1049]: linalg.eigs(A[I,J].astype('f'))[0]
Out[1049]:
array([ 3.14518023+0.j , 1.09950352+0.54544479j,
1.09950352-0.54544479j, -0.58978981+0.62853914j,
-0.58978981-0.62853914j, -0.96188819+0.j ], dtype=complex64)
Eigenvalues are not the same for the row permutations we tried earlier:
In [1050]: linalg.eigs(A[permutation_array,:].astype('f'))[0]
Out[1050]:
array([ 2.95226836+0.j , -1.60117996+0.52467293j,
-1.60117996-0.52467293j, -0.01723826+1.06249797j,
-0.01723826-1.06249797j, 0.90314150+0.j ], dtype=complex64)
In [1051]: linalg.eigs(res.astype('f'))[0]
Out[1051]:
array([-0.05822830-0.97881651j, -0.99999994+0.j ,
1.17350495+0.j , -0.91237622+0.8656373j ,
-0.91237622-0.8656373j , 2.26292515+0.j ], dtype=complex64)
This [I,J] permutation works with the example matrix in http://ciprian-zavoianu.blogspot.com/2009/01/project-bandwidth-reduction.html
In [1058]: B = np.matrix('1 0 0 0 1 0 0 0;0 1 1 0 0 1 0 1;0 1 1 0 1 0 0 0;0 0 0
...: 1 0 0 1 0;1 0 1 0 1 0 0 0; 0 1 0 0 0 1 0 1;0 0 0 1 0 0 1 0;0 1 0 0 0
...: 1 0 1')
In [1059]: B
Out[1059]:
matrix([[1, 0, 0, 0, 1, 0, 0, 0],
[0, 1, 1, 0, 0, 1, 0, 1],
[0, 1, 1, 0, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 1, 0],
[1, 0, 1, 0, 1, 0, 0, 0],
[0, 1, 0, 0, 0, 1, 0, 1],
[0, 0, 0, 1, 0, 0, 1, 0],
[0, 1, 0, 0, 0, 1, 0, 1]])
In [1060]: Bm=sparse.csr_matrix(B)
In [1061]: Bm
Out[1061]:
<8x8 sparse matrix of type '<class 'numpy.int32'>'
with 22 stored elements in Compressed Sparse Row format>
In [1062]: permB = csgraph.reverse_cuthill_mckee(Bm, False)
In [1063]: permB
Out[1063]: array([6, 3, 7, 5, 1, 2, 4, 0], dtype=int32)
In [1064]: Bm[np.ix_(permB,permB)].A
Out[1064]:
array([[1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 0, 0, 1, 1]], dtype=int32)

Scipy label erosion

How can I keep a ring of pixels around labeled regions in a numpy array?
In a simple case, I'd subtract the erosion. That approach doesn't work when the labels touch. How can I get get B from A?
A = array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
B = array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
I'm working with large arrays with many labels, so separate erosions on each label isn't an option.
New Answer
Actually, I just thought of a better way:
B = A * (np.abs(scipy.ndimage.laplace(A)) > 0)
As a full example:
import numpy as np
import scipy.ndimage
A = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
B = A * (np.abs(scipy.ndimage.laplace(A)) > 0)
I think this should work in all cases (of "labeled" arrays like A, at any rate...).
If you're worried about performance, you can split this into a few pieces to reduce memory overhead:
B = scipy.ndimage.laplace(A)
B = np.abs(B, B) # Preform abs in-place
B /= B # This will produce a divide by zero warning that you can safely ignore
B *= A
This version is a lot more verbose, but should use much less memory.
Old Answer
I can't think of a good way to do it in one step with the usual scipy.ndimage functions. (I feel like a tophat filter should do what you want, but I can't quite figure it out.)
However, doing several separate erosions is an option, as you mentioned.
You should get reasonable performance even on very large arrays if you use find_objects to extract the subregion of each label, and then just do the erosion on the subregion.
For example:
import numpy as np
import scipy.ndimage
A = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
regions = scipy.ndimage.find_objects(A)
mask = np.zeros_like(A).astype(np.bool)
for val, region in enumerate(regions, start=1):
if region is not None:
subregion = A[region]
mask[region] = scipy.ndimage.binary_erosion(subregion == val)
B = A.copy()
B[mask] = 0
This yields:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
The performance should be reasonable for large arrays, but it's going to depend strongly on how large of an area the different labeled objects span and the number of labeled objects that you have....

Find sub list inside a list in python

I have a list of numbers
l = [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
[0, 0, 2, 1, 1, 2, 0, 0, 0, 0]
[0, 0, 2, 1, 1, 2, 2, 0, 0, 1]
[0, 0, 1, 2, 2, 0, 1, 0, 0, 2]
[1, 0, 1, 1, 1, 2, 1, 0, 2, 1]]
For example , i have to search a pattern '2,1,1,2' , as we can see that is present in row 6 and 7 .
in order to find that sequence i tried converting each list into str and tried to search the pattern , but for some reason the code isnt working.
import re
for i in l:
if re.search('2,1,1,2' , str(i).strip('[').strip(']')): print " pattern found"
am i missing something in here ?
Converting your list in string is really not a good idea.
How about something like this:
def getsubidx(x, y):
l1, l2 = len(x), len(y)
for i in range(l1):
if x[i:i+l2] == y:
return i
I suggest you to use the Knuth-Morris-Pratt algorithm. I suppose you are implicitly assuming that your pattern is present in the list just one time, or you are just interested in knowing if it's in or not.
If you want the list of each first element which starts the sequence, then you can use KMP. Think about it as a sort of string.find() for lists.
I hope this will help.
l = [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 2, 1, 1, 2, 0, 0, 0, 0],
[0, 0, 2, 1, 1, 2, 2, 0, 0, 1],
[0, 0, 1, 2, 2, 0, 1, 0, 0, 2],
[1, 0, 1, 1, 1, 2, 1, 0, 2, 1]]
import re
for i in l:
if re.search('2, 1, 1, 2' , str(i).strip('[').strip(']')):
print " pattern found"
str(list) will return the string with spaces between the elements... You should look for '2, 1, 1, 2' instead of 2,1,1,2
Here is the same idea, without regex
data = [
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 2, 1, 1, 2, 0, 0, 0, 0],
[0, 0, 2, 1, 1, 2, 2, 0, 0, 1],
[0, 0, 1, 2, 2, 0, 1, 0, 0, 2],
[1, 0, 1, 1, 1, 2, 1, 0, 2, 1],
]
pattern = '2112'
for item in data:
line = ''
for number in item:
line += str(number)
if pattern in line:
print 'pattern found: %s' % item

Categories

Resources