get non zero ROI from numpy array - python

I want to extract a rectangular ROI from an image.
The image contains a single connected non zero part.
I need it to be efficient in run time.
I was thinking maybe:
Summing along each direction.
Finding first non zero and last non zero.
Slicing the image accordingly.
Is there a better way?
My code:
First is a function to find the first and last non zero:
import numpy as np
from PIL import Image
def first_last_nonzero(boolean_vector):
    first = last = -1
    for idx, val in enumerate(boolean_vector):
        if val == True and first == -1:
            first = idx
        if val == False and first != -1:
            last = idx
    return first, last
Then creating an image:
np_im = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 255, 154, 251, 60, 0, 0, 0],
                  [0, 0, 0, 0, 4, 66, 0, 0, 255, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 134, 48, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 236, 70, 0, 0, 0, 0],
                  [0, 0, 0, 0, 1, 255, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 255, 24, 24, 24, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=np.uint8)
Then running our function on the sum along each axis:
y_start, y_end = first_last_nonzero(np.sum(np_im, 1)>0)
x_start, x_end = first_last_nonzero(np.sum(np_im, 0)>0)
cropped_np_im = np_im[y_start:y_end, x_start:x_end]
# show the cropped image
Image.fromarray(cropped_np_im).show()
This works, but there are probably plenty of unnecessary calculations.
Is there a better way to do this? Or maybe a more Pythonic way?

You can make use of the functions from this post:
Numpy: How to find first non-zero value in every column of a numpy array?
def first_nonzero(arr, axis, invalid_val=-1):
    mask = arr != 0
    return np.where(mask.any(axis=axis), mask.argmax(axis=axis), invalid_val)

def last_nonzero(arr, axis, invalid_val=-1):
    mask = arr != 0
    val = arr.shape[axis] - np.flip(mask, axis=axis).argmax(axis=axis) - 1
    return np.where(mask.any(axis=axis), val, invalid_val)
arr = np.array([
[0, 0, 0, 0, 1, 1],
[0, 0, 1, 1, 0, 0],
[0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0],
[0, 0, 1, 0, 1, 0],
[0, 0, 0, 0, 0, 0] ])
y_Min, y_Max, x_Min, x_Max = (0, 0, 0, 0)
y_Min = first_nonzero(arr, axis = 0, invalid_val = -1)
y_Min = (y_Min[y_Min >= 0]).min()
x_Min = first_nonzero(arr, axis = 1, invalid_val = -1)
x_Min = (x_Min[x_Min >= 0]).min()
y_Max = last_nonzero(arr, axis = 0, invalid_val = -1)
y_Max = (y_Max[y_Max >= 0]).max()
x_Max = last_nonzero(arr, axis = 1, invalid_val = -1)
x_Max = (x_Max[x_Max >= 0]).max()
print(x_Min)
print(y_Min)
print(x_Max)
print(y_Max)
For this example, the code prints 1, 0, 5, 4 (x_Min, y_Min, x_Max, y_Max).
As a general rule of thumb for NumPy code in Python: try to avoid explicit loops over array elements. In my own experience, that holds in 99 out of 100 cases.
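For the cropping task in the question, the same idea can be condensed into a short sketch. This is not code from the linked post: crop_to_nonzero is a hypothetical helper name, and it assumes a 2D array whose background is exactly zero.

```python
import numpy as np

def crop_to_nonzero(img):
    """Return the smallest rectangular slice of a 2D array holding all non-zero values."""
    rows = np.flatnonzero(img.any(axis=1))  # indices of rows with a non-zero entry
    cols = np.flatnonzero(img.any(axis=0))  # indices of columns with a non-zero entry
    if rows.size == 0:
        # Nothing to crop to: return an empty slice.
        return img[:0, :0]
    # Slice ends are exclusive, hence the + 1 on the last index.
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

arr = np.array([
    [0, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0]])
cropped = crop_to_nonzero(arr)
print(cropped.shape)  # (5, 5): rows 0..4 and columns 1..5
```

The result is a view into the original array, so no pixel data is copied.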

Related

How do you convert a matrix into a string? [duplicate]

Let's say I have this matrix: m = [[0 for i in range(5)] for i in range(5)],
which when printed, outputs this:
[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]]
How do I make it so that it outputs something like this:
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
You can simply unpack each row:
m = [[0 for i in range(5)] for i in range(5)]
for i in m:
    print(*i)
Output:
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
You can use the code below, where m is your matrix; the separator string passed to ' '.join() controls the spacing:
for line in m:
    print(' '.join(map(str, line)))
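Since the title asks for a string rather than printed output, the join approach can also be wrapped up as below. This is only a sketch; matrix_to_string and its sep parameter are hypothetical names, not something from the answers above.

```python
def matrix_to_string(matrix, sep=" "):
    """Join each row with sep and separate the rows with newlines."""
    return "\n".join(sep.join(map(str, row)) for row in matrix)

m = [[0 for _ in range(5)] for _ in range(5)]
text = matrix_to_string(m)
print(text)
```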

How to reorder a binary list but keep 1's roughly evenly spread apart from each other in the list?

I basically want to reorder (I don't think this is a shuffling task) a list of 100 binary digits. The following properties should hold after the reorder: the number of 1's stays fixed at 10, and the 1's are spread roughly evenly, as shown below, so that every 9th, 10th, or 11th digit is a 1. I want this reordering to be random. The trivial approach I had in mind is to track the index of the first 1 in the input list and generate a new start index. Any ideas on other solutions?
x = [1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0]
My code is as follows:
def main():
    from random import shuffle
    from random import randint
    from itertools import chain
    num_of_10th = randint(0, 5) * 2
    num_of_11th = num_of_9th = int((10 - num_of_10th) / 2)
    lsts = []
    for i in range(num_of_10th):
        lsts.append([1, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    for i in range(num_of_9th):
        lsts.append([1, 0, 0, 0, 0, 0, 0, 0, 0])
    for i in range(num_of_11th):
        lsts.append([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
    shuffle(lsts)
    lsts = list(chain.from_iterable(lsts))
    print(lsts)
You can use Python's list multiplication.
My solution generates a random size between 1 and 10 using random.randint. From this size I create the repeated_part, which starts with a 1 and fills in the rest with zeros. For example, when size is 5, repeated_part will be [1, 0, 0, 0, 0].
From the size we can calculate how many times it fits in a list of 100 (100 // size), and we add one repetition for overflow. Now the list will be too long: for example, with a size of 3 the total length is (100 // 3 + 1) * 3 = 102, so we truncate the list to 100 elements with [:100].
import random
size = random.randint(1, 10)
repeated_part = [1] + [0]*(size-1)
result = (repeated_part * (100 // size + 1)) [:100]
Note: if you do not want the 1 to come first in each part, you could call random.shuffle(repeated_part) before repeating it, and all your other requirements would still hold.
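If the count of ten 1's and the 9/10/11 spacing from the question both have to hold, one possible variation is to shuffle a list of gaps and then rotate the result by a random offset. The sketch below is an assumption-laden illustration rather than either poster's method: spread_ones and its parameters are hypothetical, it requires length to be divisible by n_ones, and the spacing is measured circularly (the gap from the last 1 wraps around to the first).

```python
import random

def spread_ones(length=100, n_ones=10, seed=None):
    """Build a 0/1 list with n_ones ones whose circular gaps are base-1, base or base+1."""
    rng = random.Random(seed)
    base = length // n_ones                      # 10 for the numbers in the question
    n_pairs = rng.randint(0, n_ones // 2)        # each pair is one short and one long gap
    gaps = [base - 1, base + 1] * n_pairs + [base] * (n_ones - 2 * n_pairs)
    rng.shuffle(gaps)
    bits = []
    for gap in gaps:
        bits.extend([1] + [0] * (gap - 1))       # a 1 followed by gap - 1 zeros
    # Move some of the trailing zeros to the front so the first 1 is not always at index 0.
    shift = rng.randrange(gaps[-1])
    return bits[-shift:] + bits[:-shift] if shift else bits

x = spread_ones(seed=0)
print(len(x), sum(x))
```

Pairing every short gap with a long one keeps the total length fixed, which is the same trick the question's code uses.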

How to find longest consecutive occurrence of non-zero elements in 2D numpy array

I am simulating protein folding on a 2D grid where every angle is either ±90° or 0°, and have the following problem:
I have an n-by-n numpy array filled with zeros, except for certain places where the value is any integer from 1 to n. Every integer appears just once. Integer k is always a nearest neighbour to k-1 and k + 1, except for the endpoints. The array is saved as an object in the class Grid which I have created for doing energy calculations and folding the protein. Example array, with n=5:
>>> from Grid import Grid
>>> a = Grid(5)
>>> a.show()
[[0 0 0 0 0]
[0 0 0 0 0]
[1 2 3 4 5]
[0 0 0 0 0]
[0 0 0 0 0]]
My goal is to find the longest consecutive line of non-zero elements without any bends. In the above case, the result should be 5.
My idea so far is something like this:
def getDiameter(self):
    indexes = np.zeros((self.n, 2))
    for i in range(1, self.n + 1):
        indexes[i - 1] = np.argwhere(self.array == i)[0]
    for i in range(self.n):
        j = 1
        currentDiameter = 1
        while indexes[0][i] == indexes[0][i + j] and i + j <= self.n:
            currentDiameter += 1
            j += 1
        while indexes[i][0] == indexes[i + j][0] and i + j <= self.n:
            currentDiameter += 1
            j += 1
        if currentDiameter > diameter:
            diameter = currentDiameter
    return diameter
This has two problems: (1) it doesn't work, and (2) it is horribly inefficient if I get it to work. I am wondering if anybody has a better way of doing this. If anything is unclear, please let me know.
Edit:
Less trivial example
[[ 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 10 0 0 0]
[ 0 0 0 0 0 0 9 0 0 0]
[ 0 0 0 0 0 0 8 0 0 0]
[ 0 0 0 4 5 6 7 0 0 0]
[ 0 0 0 3 0 0 0 0 0 0]
[ 0 0 0 2 1 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0]]
The correct answer here is 4 (both the longest column and the longest row have four non-zero elements).
What I understood from your question is that you need to find the length of the longest run of consecutive elements in a numpy array (row by row).
So for the array below, the output should be 5:
[[1 2 3 4 0]
[0 0 0 0 0]
[10 11 12 13 14]
[0 1 2 3 0]
[1 0 0 0 0]]
Because [10 11 12 13 14] are consecutive elements and form the longest such run compared to the consecutive elements in any other row.
If this is what you are expecting, consider this:
import numpy as np
from itertools import groupby
a = np.array([[1, 2, 3, 4, 0],
[0, 0, 0, 0, 0],
[10, 11, 12, 13, 14],
[0, 1, 2, 3, 0],
[1, 0, 0, 0, 0]])
a = a.astype(float)
a[a == 0] = np.nan
b = np.diff(a) # Calculate the n-th discrete difference. Consecutive numbers will have a difference of 1.
counter = []
for line in b: # for each row.
    if 1 in line: # consecutive elements differ by 1.
        counter.append(max(sum(1 for _ in g) for k, g in groupby(line) if k == 1) + 1) # find the longest length of consecutive 1's for each row.
print(max(counter)) # find the max of list holding the longest length of consecutive 1's for each row.
# 5
For your particular example:
[[0 0 0 0 0]
[0 0 0 0 0]
[1 2 3 4 5]
[0 0 0 0 0]
[0 0 0 0 0]]
# 5
Start by finding the longest consecutive occurrence in a list:
def find_longest(l):
    counter = 0
    counters = []
    for i in l:
        if i == 0:
            counters.append(counter)
            counter = 0
        else:
            counter += 1
    counters.append(counter)
    return max(counters)
Now you can apply this function to each row and each column of the array, and find the maximum:
longest_occurrences = [find_longest(row) for row in a] + [find_longest(col) for col in a.T]
longest_occurrence = max(longest_occurrences)
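As a check, the row-and-column approach can be run on the less trivial 10x10 example from the question. The sketch below repeats find_longest so that it is self-contained; the array a is typed in by hand from that example. One caveat, stated as an assumption about the intent: this counts any straight run of adjacent non-zero cells, whether or not the numbers in it are neighbours along the chain.

```python
import numpy as np

def find_longest(l):
    """Length of the longest run of non-zero values in a 1D sequence."""
    counter = 0
    counters = []
    for i in l:
        if i == 0:
            counters.append(counter)
            counter = 0
        else:
            counter += 1
    counters.append(counter)
    return max(counters)

# The "less trivial example" from the question, entered by hand.
a = np.zeros((10, 10), dtype=int)
a[2, 6], a[3, 6], a[4, 6] = 10, 9, 8
a[5, 3:7] = [4, 5, 6, 7]
a[6, 3] = 3
a[7, 3], a[7, 4] = 2, 1

longest_occurrences = [find_longest(row) for row in a] + [find_longest(col) for col in a.T]
longest_occurrence = max(longest_occurrences)
print(longest_occurrence)  # 4
```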

cv2.connectedComponents doesn't work properly

I want to use the function cv2.connectedComponents to label the connected components in a binary image.
Everything works except the returned labels array. It seems to contain only zeros, rather than the sequential numbers that should identify each component.
import cv2
import numpy as np
img = cv2.imread('eGaIy.jpg', 0)
img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)[1] # ensure binary
ret, labels = cv2.connectedComponents(img)
# Map component labels to hue val
label_hue = np.uint8(179*labels/np.max(labels))
blank_ch = 255*np.ones_like(label_hue)
labeled_img = cv2.merge([label_hue, blank_ch, blank_ch])
# cvt to BGR for display
labeled_img = cv2.cvtColor(labeled_img, cv2.COLOR_HSV2BGR)
# set bg label to black
labeled_img[label_hue==0] = 0
cv2.imshow('labeled.png', labeled_img)
cv2.waitKey()
outputted labels --> labels.shape: (256L, 250L)
[[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
...,
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]]
It works for me:
Be aware that the function only finds components made of non-zero pixels. In the source image, the components are the edges. The returned labels array is a labeled image of the same size as the source.
The output of
[[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
...,
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]]
only shows the 3x3 regions at the four corners of the array, which are all zeros; it does not mean that all elements are zeros.
If you call this after you call the cv2.connectedComponents:
print(set(labels.reshape(-1).tolist()))
You will get:
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}
This means there are 14 components (edges) plus the background (label 0).
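The printing behaviour is easy to reproduce without OpenCV. In the sketch below, labels is a synthetic stand-in array with the same shape as in the question, not real connectedComponents output; np.unique with return_counts=True then lists each label and how many pixels carry it.

```python
import numpy as np

# Synthetic stand-in for a label image: background 0 plus two labelled regions.
labels = np.zeros((256, 250), dtype=np.int32)
labels[100:110, 50:60] = 1     # 100 pixels with label 1
labels[150:170, 120:130] = 2   # 200 pixels with label 2

print(labels)  # NumPy abbreviates large arrays, so only zero corners are shown

values, counts = np.unique(labels, return_counts=True)
print(values)  # the distinct labels that are actually present
print(counts)  # number of pixels per label
```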

rotate an nxnxn matrix in python

I have a binary array of size 64x64x64, where a volume of 40x40x40 is set to "1" and the rest is "0". I have been trying to rotate this cube about its center around the z-axis using skimage.transform.rotate and also OpenCV as:
import cv2
import numpy as np

def rotateImage(image, angle):
    row, col = image.shape
    center = tuple(np.array([row, col]) / 2)
    rot_mat = cv2.getRotationMatrix2D(center, angle, 1.0)
    new_image = cv2.warpAffine(image, rot_mat, (col, row))
    return new_image
In the case of OpenCV, I tried a 2D rotation of each individual slice in the cube (Cube[:,:,n=1,2,3...p]).
After rotating, the total sum of the values in the array changes. This may be caused by interpolation during rotation. How can I rotate a 3D array of this kind without adding anything to the array?
OK, so I understand now what you are asking. The closest I can come up with is scipy.ndimage. There is also a way to interface with ImageJ from Python, which might be easier. Here is what I did with scipy.ndimage:
import numpy as np
from scipy import ndimage
angle = 25  # angle should be in degrees
Rotatedim = ndimage.rotate(yourimage, angle, reshape=False, output=np.int32, order=5, prefilter=False)
This preserved the sum for some angles and not for others; perhaps by playing around more with the parameters you might be able to get your desired outcome.
One option is to convert into sparse, and transform the coordinates using a matrix rotation. Then transform back into dense. In 2 dimensions, this looks like:
import numpy as np
import scipy.sparse
import math
N = 10
space = np.zeros((N, N), dtype=np.int8)
space[3:7, 3:7].fill(1)
print(space)
print(np.sum(space))
space_coo = scipy.sparse.coo_matrix(space)
Coords = np.array(space_coo.nonzero()) - 3
theta = 30 * 3.1416 / 180
R = np.array([[math.cos(theta), math.sin(theta)], [-math.sin(theta), math.cos(theta)]])
space2_coords = R.dot(Coords)
space2_coords = np.round(space2_coords)
space2_coords += 3
space2_sparse = scipy.sparse.coo_matrix(([1] * space2_coords.shape[1], (space2_coords[0], space2_coords[1])), shape=(N, N))
space2 = space2_sparse.todense()
print(space2)
print(np.sum(space2))
Output:
[[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]
[0 0 0 1 1 1 1 0 0 0]
[0 0 0 1 1 1 1 0 0 0]
[0 0 0 1 1 1 1 0 0 0]
[0 0 0 1 1 1 1 0 0 0]
[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]]
16
[[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]
[0 0 0 1 0 0 0 0 0 0]
[0 0 1 1 1 1 0 0 0 0]
[0 0 1 1 1 1 1 0 0 0]
[0 1 1 0 1 1 0 0 0 0]
[0 0 0 1 1 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]]
16
The advantage is that you'll get exactly as many 1 values before and after the transform. The downside is that you might get 'holes', as above, and/or duplicate coordinates, giving values of '2' in the final dense matrix.
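The same coordinate-rotation idea can be carried over to the 3D case in the question with plain NumPy. The sketch below makes several assumptions: rotate_voxels_z is a hypothetical helper, the z-axis is taken to be the third array axis (as in Cube[:,:,n]), the rotation centre is the centre of the x-y plane, and voxels that would land outside the array are dropped. The holes and duplicate coordinates described above apply here too.

```python
import numpy as np

def rotate_voxels_z(volume, angle_deg):
    """Rotate the non-zero voxels of a 3D array about the z-axis (third array axis)."""
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    coords = np.argwhere(volume != 0)             # one (x, y, z) row per non-zero voxel
    values = volume[tuple(coords.T)]
    plane_shape = np.array(volume.shape[:2])
    centre = (plane_shape - 1) / 2.0              # centre of the x-y plane
    # Rotate the x-y part of each coordinate, then round to the nearest voxel.
    xy = np.rint((coords[:, :2] - centre) @ rot.T + centre).astype(int)
    inside = np.all((xy >= 0) & (xy < plane_shape), axis=1)
    out = np.zeros(volume.shape, dtype=np.int64)
    # np.add.at accumulates duplicates, so two voxels landing on one cell give 2.
    np.add.at(out, (xy[inside, 0], xy[inside, 1], coords[inside, 2]), values[inside])
    return out

cube = np.zeros((16, 16, 16), dtype=np.int8)
cube[4:12, 4:12, 4:12] = 1
rotated = rotate_voxels_z(cube, 30)
print(cube.sum(), rotated.sum())  # the two sums match as long as no voxel leaves the array
```

Because values are moved rather than interpolated, the total is preserved whenever every rotated voxel stays inside the array.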
