Scipy label erosion - python

How can I keep a ring of pixels around labeled regions in a numpy array?
In a simple case, I'd subtract the erosion. That approach doesn't work when the labels touch. How can I get B from A?
A = array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
B = array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
I'm working with large arrays with many labels, so running a separate erosion on each label isn't an option.

New Answer
Actually, I just thought of a better way:
B = A * (np.abs(scipy.ndimage.laplace(A)) > 0)
As a full example:
import numpy as np
import scipy.ndimage
A = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
B = A * (np.abs(scipy.ndimage.laplace(A)) > 0)
I think this should work in all cases (of "labeled" arrays like A, at any rate...).
If you're worried about performance, you can split this into a few pieces to reduce memory overhead:
B = scipy.ndimage.laplace(A)
np.abs(B, out=B) # Perform abs in-place
np.minimum(B, 1, out=B) # Clamp nonzero values to 1 in-place (in-place division fails on integer arrays)
B *= A
This version is a lot more verbose, but should use much less memory.
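One caveat with the Laplacian test: it can miss a border pixel when the neighbouring labels happen to cancel. A pixel labeled 2 with a 1 on one side, a 3 on the opposite side and 2s elsewhere gives 1 + 3 + 2 + 2 - 4*2 = 0. A minimal sketch of an alternative, assuming the same 4-connected neighbourhood, keeps a pixel whenever the minimum and maximum over its neighbourhood differ. The helper name label_borders is hypothetical, not part of scipy.

```python
import numpy as np
from scipy import ndimage

def label_borders(labels):
    """Keep only pixels that have a 4-connected neighbour with a different value."""
    # Cross-shaped footprint: the pixel itself plus its direct neighbours
    footprint = ndimage.generate_binary_structure(labels.ndim, 1)
    lo = ndimage.minimum_filter(labels, footprint=footprint)
    hi = ndimage.maximum_filter(labels, footprint=footprint)
    # Where min and max differ, some neighbour carries another value.
    # Background pixels next to a label also differ, but they are 0 already.
    return np.where(lo != hi, labels, 0)

A = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
              [0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
              [0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
              [0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
              [0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
B = label_borders(A)
```

This makes two filter passes instead of one, so it trades some speed for not depending on the numeric values of the labels.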
Old Answer
I can't think of a good way to do it in one step with the usual scipy.ndimage functions. (I feel like a tophat filter should do what you want, but I can't quite figure it out.)
However, doing several separate erosions is an option, as you mentioned.
You should get reasonable performance even on very large arrays if you use find_objects to extract the subregion of each label, and then just do the erosion on the subregion.
For example:
import numpy as np
import scipy.ndimage
A = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
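regions = scipy.ndimage.find_objects(A)
mask = np.zeros_like(A, dtype=bool)
for val, region in enumerate(regions, start=1):
    # find_objects returns None for label values that are absent
    if region is not None:
        subregion = A[region]
        # OR into the mask so overlapping bounding boxes don't overwrite each other
        mask[region] |= scipy.ndimage.binary_erosion(subregion == val)
B = A.copy()
B[mask] = 0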
This yields:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 0, 0, 0, 2, 0, 0, 0],
[0, 0, 2, 0, 0, 2, 2, 2, 2, 0, 0, 0],
[0, 0, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
The performance should be reasonable for large arrays, but it will depend strongly on how large an area each labeled object spans and on how many labeled objects you have.

Related

How to make a lower triangle array of 10 but repeated across a diagonal n times?

I am trying to create an array of 10 for each item I have, but then put those arrays of 10 into a larger array diagonally with zeros filling the missing spaces.
Here is an example of what I am looking for, but only with arrays of 3.
import numpy as np
arr = np.tri(3,3)
arr
This creates an array that looks like this:
[[1,0,0],
[1,1,0],
[1,1,1]]
But I need an array of 10 * n that looks like this (using arrays of 3 as the example here, with n=2):
{1,0,0,0,0,0,
1,1,0,0,0,0,
1,1,1,0,0,0,
0,0,0,1,0,0,
0,0,0,1,1,0,
0,0,0,1,1,1}
Any help would be appreciated, thanks!
I have also tried
df_arr2 = pd.concat([df_arr] * (n), ignore_index=True)
df_arr3 = pd.concat([df_arr2] *(n), axis=1, ignore_index=True)
But this repeats the matrix across all rows and columns, when I only want the diagonal ones.
Now I get it... AFAIU, the OP wants those np.tri triangles along the diagonal of a bigger square array whose side is a multiple of 3.
As per example, for n=2:
import numpy as np
n = 2
tri = np.tri(3)
arr = np.zeros((n*3, n*3))
for i in range(0, n*3, 3):
    arr[i:i+3,i:i+3] = tri
arr.astype(int)
# Out:
# array([[1, 0, 0, 0, 0, 0],
# [1, 1, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0],
# [0, 0, 0, 1, 0, 0],
# [0, 0, 0, 1, 1, 0],
# [0, 0, 0, 1, 1, 1]])
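The same block-diagonal layout can also be built without a loop. A minimal sketch, assuming a Kronecker product is acceptable here: np.kron replaces every 1 on the diagonal of an identity matrix with a copy of the triangle. The helper name block_diag_tri is hypothetical.

```python
import numpy as np

def block_diag_tri(n, size):
    """Place n lower-triangular blocks of the given size along the diagonal."""
    # Each 1 in the n x n identity becomes one size x size triangle;
    # each 0 becomes a size x size block of zeros.
    return np.kron(np.eye(n, dtype=int), np.tri(size, dtype=int))

arr = block_diag_tri(2, 3)
print(arr)
```

For the case in the question, block_diag_tri(n, 10) would give the (10 * n) x (10 * n) array.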
I saw @brandt's solution, which is definitely the best. In case you want to construct them manually, you can use this method:
def custom_triangle_matrix(rows, rowlen, tsize):
    cm = []
    for i in range(rows):
        row = []
        # Zeros to the left of this row's triangle block
        for j in range(min((i//tsize)*tsize, rowlen)):
            row.append(0)
        # Ones inside the triangle
        for j in range((i//tsize)*tsize, min(((i//tsize)*tsize) + i%tsize + 1, rowlen)):
            row.append(1)
        # Zeros to the right
        for j in range(((i//tsize)*tsize) + i%tsize + 1, rowlen):
            row.append(0)
        cm.append(row)
    return cm
Here are some example executions and what they look like using pprint:
import pprint
matrix = custom_triangle_matrix(6, 6, 3)
pprint.pprint(matrix)
[[1, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 1, 1, 1]]
matrix = custom_triangle_matrix(6, 9, 3)
pprint.pprint(matrix)
[[1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 0, 0, 0]]
matrix = custom_triangle_matrix(9, 6, 3)
pprint.pprint(matrix)
[[1, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0],
[0, 0, 0, 1, 1, 0],
[0, 0, 0, 1, 1, 1],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
matrix = custom_triangle_matrix(10, 10, 5)
pprint.pprint(matrix)
[[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1]]
Good Luck!

Filtering elements of a numpy array depending on their occurrences

I have the following 2D numpy array M:
M = np.array([[1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,1,1],
[1,1,1,0,0,0,0,0,0,1,1],
[0,0,0,0,0,1,1,1,0,0,0],
[0,0,0,0,0,1,1,1,0,0,0],
[1,1,1,0,1,1,1,1,0,0,0],
[1,1,1,0,0,1,1,1,0,0,0],
[1,1,1,0,0,1,1,1,0,0,0]])
I want to identify its spots (pixels with value == 1 that are connected to each other).
Thanks to the label function from scipy, I can identify all of the spots in the matrix. The output looks like this:
Output, Nbr = label(M)
#Output= array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
# [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
# [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0]])
I only want to keep spots with 9 elements, that is, the first and fourth spots.
Using a for loop like this works fine:
for i in range(Nbr+1):
    Spot = np.argwhere(Output == i)
    if len(Spot) != 9:
        M[Spot[:, 0], Spot[:, 1]] = 0
#M= array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]])
The problem is that when there are more than 4 spots, my code gets slower.
Is there any faster alternative that can do the job of the for loop?
Thanks.
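One loop-free possibility, sketched here under the assumption that the labels come from scipy.ndimage.label with its default connectivity: np.bincount counts the pixels of every label in a single pass, and indexing the resulting boolean table with the label array turns it back into a mask. The variable names are illustrative only.

```python
import numpy as np
from scipy.ndimage import label

M = np.array([[1,1,1,0,0,0,0,0,0,0,0],
              [1,1,1,0,0,0,0,0,0,1,1],
              [1,1,1,0,0,0,0,0,0,1,1],
              [0,0,0,0,0,1,1,1,0,0,0],
              [0,0,0,0,0,1,1,1,0,0,0],
              [1,1,1,0,1,1,1,1,0,0,0],
              [1,1,1,0,0,1,1,1,0,0,0],
              [1,1,1,0,0,1,1,1,0,0,0]])

labels, count = label(M)
# sizes[k] is the number of pixels carrying label k (k = 0 is the background)
sizes = np.bincount(labels.ravel())
keep = sizes == 9
keep[0] = False  # never keep the background, whatever its size
# keep[labels] looks up, for every pixel, whether its spot has 9 elements
filtered = np.where(keep[labels], M, 0)
```

The cost no longer grows with the number of spots, since nothing is done per label.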

Numpy, Reading from File with no delimiter, and most efficient way

I'm looking for the most efficient way to load a file containing a matrix with no delimiter into a numpy array.
Should I use a generator to convert and fill it? The file consists only of the single digits 1 and 0:
000000000
011111111
111000100
110001110
000001100
001000000
110000000
111111100
to:
[
[0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 1, 1, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 0, 0, 0, 1, 0, 0]
...
]
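You can use numpy.genfromtxt. An integer delimiter is treated as a fixed field width, so delimiter=1 reads one character per column: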
import numpy as np
np.genfromtxt('matrix.txt', delimiter=1, dtype=int)
array([[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 0, 0, 0, 1, 0, 0],
[1, 1, 0, 0, 0, 1, 1, 1, 0],
[0, 0, 0, 0, 0, 1, 1, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 0, 0]])
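For comparison, the conversion can also be written by hand, which is roughly what the question's generator idea amounts to. This is only a sketch: load_digit_matrix is a hypothetical helper, and io.StringIO stands in for an open file so the example is self-contained.

```python
import io
import numpy as np

def load_digit_matrix(lines):
    """Build a 2D int array from an iterable of strings of single digits."""
    # Blank lines are skipped; every remaining character becomes one column
    return np.array([[int(ch) for ch in line.strip()]
                     for line in lines if line.strip()])

# An open file, e.g. open('matrix.txt'), can be iterated the same way
sample = io.StringIO("000000000\n011111111\n111000100\n")
matrix = load_digit_matrix(sample)
print(matrix)
```

Which of the two is faster on a given file is best measured rather than assumed.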

Finding groups of same value inside a 2D list

I'm currently working on something where I need to find groups of values in a 2D list that are surrounded by another value and then change the values of the surrounded elements. The start or end of a sub-list counts as surrounded.
For example, if I have this list:
[[1, 2, 1, 1, 1, 0, 0, 0, 0],
[2, 2, 2, 2, 2, 1, 0, 0, 0],
[1, 2, 2, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 2, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]]
I want the function to change it to this:
[[1, 3, 1, 1, 1, 0, 0, 0, 0],
[3, 3, 3, 3, 3, 1, 0, 0, 0],
[1, 3, 3, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 3, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]]
I have no idea how to do it. Can anyone please help me?
lst = [
[1, 2, 1, 1, 1, 0, 0, 0, 0],
[2, 2, 2, 2, 2, 1, 0, 0, 0],
[1, 2, 2, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 2, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
]
for row in range(len(lst)):
    for col in range(len(lst[row])):
        if lst[row][col] == 1:
            continue
        # Each flag is true when that neighbour is non-zero; positions outside the list count as surrounded
        v1 = lst[row][col - 1] >= 1 if col - 1 >= 0 else 1
        v2 = lst[row - 1][col] >= 1 if row - 1 >= 0 else 1
        v3 = lst[row][col + 1] >= 1 if col + 1 < len(lst[row]) else 1
        v4 = lst[row + 1][col] >= 1 if row + 1 < len(lst) else 1
        if v1 + v2 + v3 + v4 == 4:
            lst[row][col] += 1
from pprint import pprint
pprint(lst)
Prints:
[[1, 3, 1, 1, 1, 0, 0, 0, 0],
[3, 3, 3, 3, 3, 1, 0, 0, 0],
[1, 3, 3, 1, 1, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 1, 0],
[0, 1, 3, 1, 0, 0, 1, 1, 0],
[0, 0, 1, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]]

Find sub list inside a list in python

I have a list of numbers
l = [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 2, 1, 1, 2, 0, 0, 0, 0],
[0, 0, 2, 1, 1, 2, 2, 0, 0, 1],
[0, 0, 1, 2, 2, 0, 1, 0, 0, 2],
[1, 0, 1, 1, 1, 2, 1, 0, 2, 1]]
For example, I have to search for the pattern '2,1,1,2', which, as we can see, is present in rows 6 and 7.
To find that sequence I tried converting each list into a str and searching for the pattern, but for some reason the code isn't working.
import re
for i in l:
    if re.search('2,1,1,2' , str(i).strip('[').strip(']')): print(" pattern found")
Am I missing something here?
Converting your list to a string is really not a good idea.
How about something like this:
def getsubidx(x, y):
    # Return the first index at which sublist y starts in x (None if it never does)
    l1, l2 = len(x), len(y)
    for i in range(l1):
        if x[i:i+l2] == y:
            return i
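As a usage sketch, getsubidx can be applied to each row in turn. The rows below are a shortened copy of the question's data, and the hits dictionary is just an illustrative way of collecting the results.

```python
def getsubidx(x, y):
    # Return the first index at which sublist y starts in x (None if it never does)
    l1, l2 = len(x), len(y)
    for i in range(l1):
        if x[i:i+l2] == y:
            return i

rows = [
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 2, 1, 1, 2, 0, 0, 0, 0],
    [0, 0, 2, 1, 1, 2, 2, 0, 0, 1],
    [0, 0, 1, 2, 2, 0, 1, 0, 0, 2],
]
pattern = [2, 1, 1, 2]

# Map row number -> start index, for the rows that contain the pattern
hits = {}
for r, row in enumerate(rows):
    idx = getsubidx(row, pattern)
    if idx is not None:
        hits[r] = idx
print(hits)
```

Comparing against None rather than relying on truthiness matters here, because a match at index 0 is falsy.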
I suggest using the Knuth-Morris-Pratt algorithm. I suppose you are implicitly assuming that your pattern occurs in the list only once, or that you are only interested in whether it occurs at all.
If you want the index of every position where the sequence starts, you can use KMP. Think of it as a sort of string.find() for lists.
I hope this will help.
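A minimal sketch of Knuth-Morris-Pratt for lists, returning every start index of the pattern (overlapping matches included). The function name kmp_find_all is hypothetical, and this is one common formulation of the algorithm, not the only one.

```python
def kmp_find_all(seq, pattern):
    """Return every index in seq at which pattern starts."""
    if not pattern:
        return []
    # failure[i]: length of the longest proper prefix of pattern[:i+1]
    # that is also a suffix of it
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    # Scan seq once; k counts how many pattern items are currently matched
    matches = []
    k = 0
    for i, item in enumerate(seq):
        while k > 0 and item != pattern[k]:
            k = failure[k - 1]
        if item == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # fall back so overlapping matches are found
    return matches

row = [0, 0, 2, 1, 1, 2, 2, 0, 0, 1]
print(kmp_find_all(row, [2, 1, 1, 2]))
```

For rows of ten items the simple slicing approach is just as practical; KMP pays off mainly on long sequences with repetitive patterns.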
l = [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 2, 1, 1, 2, 0, 0, 0, 0],
[0, 0, 2, 1, 1, 2, 2, 0, 0, 1],
[0, 0, 1, 2, 2, 0, 1, 0, 0, 2],
[1, 0, 1, 1, 1, 2, 1, 0, 2, 1]]
import re
for i in l:
    if re.search('2, 1, 1, 2' , str(i).strip('[').strip(']')):
        print(" pattern found")
str(list) returns a string with a space after each comma, so you should search for '2, 1, 1, 2' instead of '2,1,1,2'.
Here is the same idea, without regex
data = [
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 2, 1, 1, 2, 0, 0, 0, 0],
[0, 0, 2, 1, 1, 2, 2, 0, 0, 1],
[0, 0, 1, 2, 2, 0, 1, 0, 0, 2],
[1, 0, 1, 1, 1, 2, 1, 0, 2, 1],
]
pattern = '2112'
for item in data:
    # Join the digits of each row into one string, e.g. '0021120000'
    line = ''
    for number in item:
        line += str(number)
    if pattern in line:
        print('pattern found: %s' % item)
