Fastest way to get bounding boxes around segments in a label map - python

A 3D label map is a matrix in which every pixel (voxel) has an integer label. The segments are expected to be contiguous, meaning that a segment with label k will not be fragmented.
Given such a label map (segmentation), what is the fastest way to obtain the coordinates of a minimum bounding box around each segment, in Python?
I have tried the following:
Iterate through the matrix using a multi-index iterator (from numpy.nditer) and construct a reverse-index dictionary. This means that for every label you get the 3 coordinates of every voxel where the label is present.
For every label get the max and min of each coordinate.
The good thing is that you get all the location information in one O(N) pass. The bad thing is that I don't need this detailed information; I just need the extremities, so there might be a faster way to do this, using some numpy functions that are faster than so many list appends. Any suggestions?
The one pass through the matrix takes about 8 seconds on my machine, so it would be great to get rid of it. To give an idea of the data, there are a few hundred labels in a label map. Sizes of the label map can be 700x300x30 or 300x300x200 or something similar.
Edit: I am now storing only the updated max and min per coordinate for every label. This removes the need to maintain and store all those large lists (append).
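For reference, a minimal sketch of that single-pass idea (the function name is illustrative, not from the original code, and it assumes label 0 is background):
import numpy as np

def bounding_boxes_single_pass(labelmap):
    # label -> [min_z, min_y, min_x, max_z, max_y, max_x]
    boxes = {}
    it = np.nditer(labelmap, flags=['multi_index'])
    for voxel in it:
        lab = int(voxel)
        if lab == 0:  # assumption: 0 marks background
            continue
        idx = it.multi_index
        if lab not in boxes:
            boxes[lab] = list(idx) + list(idx)
        else:
            box = boxes[lab]
            for d in range(3):
                if idx[d] < box[d]:
                    box[d] = idx[d]
                if idx[d] > box[d + 3]:
                    box[d + 3] = idx[d]
    return boxes
This keeps only the running extremities, but it still visits every voxel in pure Python, which is why the pass is slow.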

If I understood your problem correctly, you have groups of voxels, and you would like to have the extremes of a group in each axis.
Let's define:
arr: 3D array of integer labels
labels: list of labels (integers 0..labmax)
The code:
import numpy as np
# highest label value:
labmax = np.max(labels)
# first and last seen positions along each axis, both initialized to a sentinel
# (labels that never occur keep the maximum int32 value)
b_first = np.iinfo('int32').max * np.ones((3, labmax + 1), dtype='int32')
b_last = np.iinfo('int32').max * np.ones((3, labmax + 1), dtype='int32')
# run through all of the dimensions, making 2D slices and marking all labels present in each slice
for dim in range(3):
    # create a generic slice object to make the slices
    sl = [slice(None), slice(None), slice(None)]
    bf = b_first[dim]
    bl = b_last[dim]
    # go through all slices in this dimension
    for k in range(arr.shape[dim]):
        # create the slice object
        sl[dim] = k
        # update the last "seen" position of every label present in this slice
        bl[arr[tuple(sl)].flatten()] = k
        # if "last" is now smaller than "first", this is the first slice containing the label
        bf[:] = np.clip(bf, None, bl)
After this operation we have six vectors giving the smallest and largest indices for each axis. For example, the bounding values along second axis of label 13 are b_first[1][13] and b_last[1][13]. If some label is missing, all corresponding b_first and b_last will be the maximum int32 value.
I tried this on my computer, and for a (300, 300, 200) array it takes approximately 1 second to find the values.
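As a usage sketch (using the arrays computed above), the bounding box of a given label can be turned into a slice tuple like this:
lab = 13  # example label
bbox = tuple(slice(b_first[d][lab], b_last[d][lab] + 1) for d in range(3))
segment = arr[bbox]  # smallest sub-volume containing every voxel with this label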

Related

Trim an array with respect to another array with numpy

I am handling a set of data recorded by a 2D detector. The data are therefore represented by three arrays: x and y, giving the coordinates of each pixel, and intensity, storing the measured signal.
For example, a 6x6 grid will give a set of data:
xraw = np.array([0,1,2,3,4,5,0,1,2,3,4,5,...])
yraw = np.array([0,0,0,0,0,0,1,1,1,1,1,1,...])
intensity = np.array([i_00,i_01,i_02,i_03,i_04,i_05,i_10,i_11,...])
Due to various reasons, such as pixel defects, some of the data points are discarded in the raw data. Therefore, xraw, yraw, intensity have a size smaller than 36 (if that's a 6x6 grid), with, say, the point at (2,3) missing.
The intensity data needs further treatment by an element-wise multiplication with another array. This treatment array is from theoretical calculation and so it has a size of nxn (6x6 in this case). However, as some of the points in the true data are missing, the two arrays have different sizes.
I can use a loop to check for the missing points and eliminate the corresponding element in the treatment array. I wonder if there are some methods in numpy that take care of such operations. Thanks
First, construct the indices of available and all possible pixel positions by
avail_ind = yraw * w + xraw
all_ind = np.arange(0, h * w)
where h and w are the image's height and width in pixels (the index is y * w + x, matching the row-major order in which np.delete flattens the treatment array).
Then, find the indices of the missing pixels by
missing_ind = all_ind[~np.in1d(all_ind, avail_ind)]
Once having the missing indices, use np.delete to construct a copy of the treatment_array with elements at the indices removed, then simply multiply that with your intensity array.
result = intensity * np.delete(treatment_array, missing_ind)
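Put together, a minimal end-to-end sketch (with a hypothetical 3x3 grid and one missing pixel; treatment_array here is just a placeholder array of ones):
import numpy as np

h, w = 3, 3
# raw detector data with the pixel at (x=2, y=1) missing
xraw = np.array([0, 1, 2, 0, 1, 0, 1, 2])
yraw = np.array([0, 0, 0, 1, 1, 2, 2, 2])
intensity = np.arange(len(xraw), dtype=float)

treatment_array = np.ones((h, w))  # stands in for the theoretical h x w factors

avail_ind = yraw * w + xraw
all_ind = np.arange(0, h * w)
missing_ind = all_ind[~np.in1d(all_ind, avail_ind)]  # -> [5], i.e. (x=2, y=1)

result = intensity * np.delete(treatment_array, missing_ind)
print(result.shape)  # (8,), matching the measured data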

HoloViews Dataset transform to QuadMesh

Case description
I have a set of spectral maps (intensity dependent on time and frequency) for a set of detectors which can be fit into a 3D array BlockDataset of size M x N x K (here: M = number of frequencies, N number of time steps and K is the number of detectors).
The M frequencies are log-spaced and the K detectors are normally indexed by a tuple consisting of 2 angles, but for brevity I'm considering only one angle. The N time values are equidistant.
Creating a HoloViews dataset from BlockDataset with appropriate value arrays for all of the dimensions is possible, but requires me to switch from a simple hv.Image display to a hv.QuadMesh display.
Problem description
If the dataset is created with actual angle values, instead of just detector numbers, a conversion to a HoloMap fails with the following error:
DataError: The shape of the intensity value array does not match the expected dimensionality indicated by the key dimensions. Expected 2-D array, found 3-D array.
If detector numbers (integers) are used instead of angles (floating point numbers) there's no problem.
Code
timeDim = hv.Dimension("time", label="Time", unit="sec", values=times)
freqDim = hv.Dimension("frequency", label="Angular frequency", unit="$\\frac{rad}{s}$", values=omega)
polarAngleDim = hv.Dimension("angle", label="$\\varphi$", unit="rad", values=angles[:,0])
intensityDim = hv.Dimension("intensity", label="Intensity $\\frac{d^2 W}{d\\Omega d\\omega}(t,\\vartheta,\\varphi)$", unit="J/(s srad)")
hvDatasetNatural = hv.Dataset((times, angles[:,0], omega, BlockDataset.transpose()), [timeDim, polarAngleDim, freqDim], intensityDim)
subset = hvDatasetNatural.select( angle=list(angles[selectedIndices,0]) )
img = subset.to( new_type=hv.QuadMesh, kdims=[timeDim, freqDim])
The selection of a subset appears to work properly, but neither the conversion of the subset, nor of the entire dataset to QuadMesh works.
Note again: times are lin-spaced float values, angles are nonlinearly spaced floats and omega are log-spaced float values.
Query
What may be the problem here? I.e., why doesn't .to() work on the dataset when 2 of the 3 dimensions are non-equidistant, non-integer values but it works well if only omega is kept non-equidistant?
I can construct a QuadMesh for a specific angle using hv.QuadMesh( (...), kdims=[..]) and hence essentially unwrap the original object.
(an extra) Why does an aggregation along the, e.g., time dimension using subset.reduce(timeDim, np.sum) work, but subset.reduce(timeDim, np.trapz) fails with:
DataError: None of the available storage backends were able to support the supplied data format. GridInterface raised following error:
GridInterface interface requires at least one value dimension.

Numpy array - Two unknown dimensions - png files

I have a dataset consisting of a bunch of png files with different heights and widths.
I read these files with the following code to get a numpy array. In this case it is 2D, but I actually want a 3D array made up of the number of images n, the image height h and the image width w.
import os.path
import glob
import numpy as np
def open_images(images_directory):
    pattern_to_match = os.path.join(images_directory, "*.png")
    png_files = (x for x in glob.iglob(pattern_to_match)
                 if os.path.isfile(x))
    for current_png_filename in png_files:
        print("Opening file", current_png_filename)
        with open(current_png_filename, "rb") as current_png_file:
            data = current_png_file.read()
            return np.frombuffer(data, dtype=np.uint8, offset=16)\
                .reshape(-1, 3)\
                .astype(np.float32)
    pass
directory_to_search = r"C:\Users\tobis\OneDrive\Desktop\Masterarbeit\data\2017-IWT4S-HDR_LP-dataset\crop_h1"
open_images(directory_to_search)
At the moment I get an array with a shape like this:
(21559, 3). I think the first number is a combination of width and height and the last is the RGB value. I would like to get an array that looks like this one: (n, h, w).
Is there a way to get such an array? Unfortunately, I have two unknown dimensions. This seems to be the problem...
You can't just read an image file like that. You need to use a library to read it and interpret the height, width, colourspace, bits per pixel, date, the GPS data, the camera make and model and all the compressed, encoded pixels.
For example, with PIL/Pillow:
from PIL import Image
import numpy as np
# Open image and make sure it is RGB - not palette
im = Image.open('image.png').convert('RGB')
# Make into Numpy array
na = np.array(im)
# Check shape
print(na.shape) # prints (480,640,3) for height, width, channels
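For the directory loop from the question, a hedged sketch of how open_images could be rewritten with PIL (the grayscale conversion is an assumption to get single-channel (h, w) images; stacking into one (n, h, w) array only works if every file happens to have the same size, otherwise return the list):
import glob
import os.path
import numpy as np
from PIL import Image

def open_images(images_directory):
    pattern_to_match = os.path.join(images_directory, "*.png")
    arrays = []
    for current_png_filename in sorted(glob.iglob(pattern_to_match)):
        # "L" = single grey channel, so each image becomes an (h, w) array
        im = Image.open(current_png_filename).convert("L")
        arrays.append(np.array(im, dtype=np.float32))
    # only valid when all images share the same height and width
    return np.stack(arrays)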
If you have a flattened image and would like to recover the original row and column dimensions, you can apply a heuristic that tests the possible combinations and checks the "smoothness" of the image along the row axis. This smoothness can be measured, for example, via the mean squared error of consecutive rows. The approach assumes that the original image has some kind of structure along the row axis, so the change between consecutive pixels in the original image will be relatively small compared to other candidate shapes.
For example let's say the original image is 155 x 79 pixels and it has been flattened into an array of 155 * 79 == 12245. The prime factorization of this is 5, 31, 79. So the possible row dimensions are all unique combinations of these prime factors, i.e. 5, 31, 79, 155, 395, 2449. Now these possible row dimensions, in the following referred to as estimates, can be sorted into two different categories:
Estimates which are a divisor of the original row dimension: 5, 31 and 155. This means effectively that multiple row-skipped copies of the original image are stacked next to each other. So the resulting image will retain the original column grouping. Since similar columns remain together each element of the stack will have roughly the same smoothness. For example if the estimate is 31 this means that the original shape 31 x 5 , 79 is transformed to 31 , 5 x 79, i.e. only every 5-th row of the original image is considered and five such copies are stacked next to each other. For the original image (i.e. an estimate of 155) length-1 correlations are considered (i.e. each pair of consecutive rows is compared), while for an estimate of 31 length-5 correlations are considered (i.e. comparing row-pairs that have another 4 rows between them). Since the original image is expected to have some smooth structure, the smoothness should decrease when longer ranges are compared. The decrease in smoothness will be bigger when the skip-range increases, but it can also completely vanish if the image contains some degree of periodicity along the row axis.
All other estimates: 79, 395, 2449. For estimates in this category, different columns of the original image are mixed in the test image corresponding to the estimate. For example if the estimate is 79 we have 155 % 79 == 76, i.e. each new row in the test image shifts the original columns by 3 with respect to the previous row. Assuming that the original image varies along the column dimension, these shifts introduce an increasingly strong deviation between the emerging consecutive rows. Since this column shift increases from row to row, the resulting decrease in row-smoothness should be strong unless the number of rows is small. If the original image is column-periodic with the shift number of the estimate, this can however lead to a perfect agreement.
So to summarize, if we compute the smoothness for all row dimension estimates we expect the smoothness to decrease for a wrong estimate and the decrease will be small if the estimate falls in category (1) and bigger if it falls in category (2).
Important: If the images are periodic along either the row or column dimension this can lead to a false estimate.
The implementation needs to cover the following steps:
Compute the prime factorization of the length of the flattened image.
Compute all unique row dimension estimates from combinations of the prime factors.
For each estimate compute the row-smoothness of the resulting test image. For example use the mean squared error of consecutive rows (actually this will be a non-smoothness score).
Find the best estimate from the scores.
Here is some example code for the implementation:
import itertools as it
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
image = np.array(Image.open('example.jpg'))
original_shape = image.shape[:2]
image = image.reshape(-1, 3)
def compute_prime_factors(n):
    i = 2
    while i <= n:
        if n % i == 0:
            n //= i
            yield i
        else:
            i += 1
prime_factors = list(compute_prime_factors(len(image)))
combinations = it.chain.from_iterable(it.combinations(prime_factors, r=i) for i in range(1, len(prime_factors)))
row_dims = sorted({np.prod(x) for x in combinations})
def test_row_dim(r):
    c = len(image) // r
    # cast to float to avoid uint8 overflow in the squared difference
    test = image.reshape(r, c, 3).astype(np.float64)
    return np.mean((test[1:] - test[:-1])**2)
scores = [test_row_dim(r) for r in row_dims]
best_estimate = row_dims[np.argmin(scores)]
fig, ax = plt.subplots()
ax.set(xlabel='row dimension', ylabel='score')
ax.set_xscale('log')
ax.plot(row_dims, scores, '-o', label='Estimations')
ax.plot([best_estimate], [np.min(scores)], '*', ms=12, label=f'Best Estimate ({best_estimate})')
ax.axvline(original_shape[0], label=f'Actual Dim ({original_shape[0]})', color='#2ca02c', zorder=-100, lw=1.5, ls='--')
ax.legend()
plt.figure()
plt.imshow(image.reshape(205, -1, 3)) # second best score
plt.show()
Let's test it on some image (H x W: 410 x 640):
Photo by Cameron Venti on Unsplash
This produces the following estimate scores:
The peaks to the left of the best estimate are the category (1) estimates that have the smallest row-skip. The prime factorization of 410 and 640 is 2*5*41 and 2**7 * 5 respectively. So the category (1) estimates that get closest to the original row dimension are 205, 82 and 41 (the side peaks from right to left). A decreasing estimate implies an increasing row-skip range and hence an increasing MSE score. The peak to the left of the best estimate corresponds to an estimate of 205, i.e. each second row gets skipped and hence two such row-skipped versions are stacked next to each other:
As you can imagine, by skipping every second row, the image doesn't change too much and the change is the same for the two side-by-side versions. Hence the small difference to the original image's MSE score.

Counting the number of times a threshold is met or exceeded in a multidimensional array in Python

I have a numpy array that I brought in from a netCDF file with the shape (930, 360, 720), organized as (time, latitudes, longitudes).
At each lat/lon pair for each of the 930 time stamps, I need to count the number of times that the value meets or exceeds a threshold "x" (such as 0.2 or 0.5 etc.) and ultimately calculate the percentage that the threshold was exceeded at each point, then output the results so they can be plotted later on.
I have attempted numerous methods but here is my most recent:
lat_length = len(lats)
#where lats has been defined earlier when unpacked from the netCDF dataset
lon_length = len(lons)
#just as lats; also these were defined before using np.meshgrid(lons, lats)

for i in range(0, lat_length):
    for j in range(0, lon_length):
        if ice[:,i,j] >= x:
            #code to count number of occurrences here
            #code to calculate percentage here
            percent_ice[i,j] += count / len(time) #calculation

#then go on to plot percent_ice
I hope this makes sense! I would greatly appreciate any help. I'm self taught in Python so I may be missing something simple.
Would this be a time to use the any() function? What would be the most efficient way to count the number of times the threshold was exceeded and then calculate the percentage?
You can compare the input 3D array with the threshold x and then sum along the first axis with ndarray.sum(axis=0) to get the count and thereby the percentages, like so -
# Calculate count after thresholding with x and summing along the first axis
count = (ice >= x).sum(axis=0)
# Get percentages (ratios) by dividing with first axis length
percent_ice = np.true_divide(count,ice.shape[0])
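A quick sketch with synthetic data (a smaller array than the real one, and an arbitrary threshold) to show the shapes involved:
import numpy as np

x = 0.2
ice = np.random.rand(93, 36, 72)            # (time, lat, lon), toy-sized

count = (ice >= x).sum(axis=0)              # shape (36, 72)
percent_ice = np.true_divide(count, ice.shape[0])
print(count.shape, percent_ice.min(), percent_ice.max())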
Ah, look, another meteorologist!
There are probably multiple ways to do this and my solution is unlikely to be the fastest since it uses numpy's MaskedArray, which is known to be slow, but this should work:
Numpy has a data type called a MaskedArray which actually contains two normal numpy arrays. It contains a data array as well as a boolean mask. I would first mask all data that are greater than or equal to my threshold (use np.ma.masked_greater() for just greater than):
ice = np.ma.masked_greater_equal(ice, x)
You can then use ice.count() to determine how many values are below your threshold for each lat/lon point by specifying that you want to count along a specific axis:
n_good = ice.count(axis=0)
This should return a 2-dimensional array containing the number of good points. You can then calculate the number of bad by subtracting n_good from ice.shape[0]:
n_bad = ice.shape[0] - n_good
and calculate the percentage that are bad using:
perc_bad = n_bad/float(ice.shape[0])
There are plenty of ways to do this without using MaskedArray. This is just the easy way that comes to mind for me.
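Put together, a sketch of the masked-array route using the same names as above (ice and x as in the question):
import numpy as np

ice_masked = np.ma.masked_greater_equal(ice, x)  # mask everything >= x
n_good = ice_masked.count(axis=0)                # unmasked (below-threshold) values per lat/lon
n_bad = ice.shape[0] - n_good                    # values that met or exceeded the threshold
perc_bad = n_bad / float(ice.shape[0])           # 2D array of percentages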

Avoid for-loops in assignment of data values

So this is a little follow-up question to my earlier question Generate coordinates inside Polygon and my answer https://stackoverflow.com/a/15243767/1740928
In fact, I want to bin polygon data to a regular grid. Therefore, I calculate a couple of coordinates within the polygon and translate their lat/lon combination to their respective column/row combo of the grid.
Currently, the row/column information is stored in a numpy array with its number of rows corresponding to the number of data polygons and its number of columns corresponding to the coordinates in the polygon.
The rest of the code takes less than a second, but this part is the bottleneck at the moment (~7 sec):
for ii in np.arange(len(data)):
    for cc in np.arange(data_lats.shape[1]):
        final_grid[ row[ii,cc], col[ii,cc] ] += data[ii]
        final_grid_counts[ row[ii,cc], col[ii,cc] ] += 1
The array "data" simply contains the data values for each polygon (80000,). The arrays "row" and "col" contain the row and column number of a coordinate in the polygon (shape: (80000,16)).
As you can see, I am summing up all data values within each grid cell and count the number of matches. Thus, I know the average for each grid cell in case different polygons intersect it.
Still, how can these two for loops take around 7 seconds? Can you think of a faster way?
I think numpy should add an nd-bincount function; I had one lying around from a project I was working on some time ago.
import numpy as np
def two_d_bincount(row, col, weights=None, shape=None):
    if shape is None:
        shape = (row.max() + 1, col.max() + 1)
    row = np.asarray(row, 'int')
    col = np.asarray(col, 'int')
    x = np.ravel_multi_index([row, col], shape)
    out = np.bincount(x, weights, minlength=np.prod(shape))
    return out.reshape(shape)
weights = np.column_stack([data] * row.shape[1])
final_grid = two_d_bincount(row.ravel(), col.ravel(), weights.ravel())
final_grid_counts = two_d_bincount(row.ravel(), col.ravel())
I hope this helps.
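A small usage sketch with toy arrays (shapes mimic the question: data is (n,), row and col are (n, 16)); the per-cell average mentioned in the question is then the ratio of the two grids:
import numpy as np

rng = np.random.default_rng(0)
n, n_coords = 1000, 16
data = rng.random(n)
row = rng.integers(0, 50, size=(n, n_coords))
col = rng.integers(0, 80, size=(n, n_coords))

weights = np.column_stack([data] * row.shape[1])
final_grid = two_d_bincount(row.ravel(), col.ravel(), weights.ravel())
final_grid_counts = two_d_bincount(row.ravel(), col.ravel())

with np.errstate(invalid='ignore'):
    cell_average = final_grid / final_grid_counts  # NaN where no polygon touched a cell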
I might not fully understand the shapes of your different grids, but you can maybe eliminate the cc loop using something like this:
final_grid = np.empty((nrows,ncols))
for ii in range(len(data)):
    final_grid[row[ii,:],col[ii,:]] = data[ii]
This of course assumes that final_grid starts with no other info (that the count you're incrementing starts at zero), and I'm not sure how to test whether it works without understanding how your row and col arrays work.
