Instance IOU fast calculation on large image - python

I have a boolean instance mask of shape (448, 1000, 1000) for 448 instances; each instance covers around 100 pixels on average.
Now I have a prediction matrix of shape (1000, 1000) that encodes predicted instances as integers, i.e. if the matrix predicts 500 instances, np.unique(pred) will have 501 values (500 instances + 1 background).
I need to calculate the IOU (Jaccard index) of each prediction/mask pair and find the maximum IOU. I have written the code below, but it is super slow and inefficient.
c = 0  # intersection count
u = 0  # union count
pred_used = []  # record predictions used
# loop over every ground truth mask
for idx_m in range(len(mask[:, 0, 0])):
    m = mask[idx_m, :, :]  # take one mask
    intersect_list = []
    union_list = []
    # loop over every prediction
    for idx_pred in range(1, int(np.max(pred)) + 1):
        p = (pred == idx_pred)  # take one prediction mask
        intersect = np.sum(m.ravel() * p.ravel())  # calculate intersection
        union = np.sum(m.ravel() + p.ravel() - m.ravel() * p.ravel())
        intersect_list.append(intersect)
        union_list.append(union)
    if np.sum(intersect_list) > 0:
        idx_max_iou = np.argmax(np.array(intersect_list))
        c += intersect_list[idx_max_iou]
        u += union_list[idx_max_iou]
        pred_used.append(idx_max_iou)

So you have an output image of size [1000, 1000], which is the array/tensor predicted by your model.
One of the first things you can do is reshape the labels and predictions from [1000, 1000] to [1000*1000,]. Working on flat 1-D arrays avoids the 2-D indexing overhead and should speed things up noticeably.
You can also try the IoU from scikit-learn, which may be a bit faster than your version.
You can find an example here: How to find IoU from segmentation masks?
Doc: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.jaccard_similarity_score.html
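If the pairwise loops are still too slow, here is a minimal sketch of a fully vectorized alternative (my suggestion, not from the linked answer): collapse the boolean masks into a single integer label image, build a joint histogram of ground-truth vs. predicted labels with np.histogram2d, and read every pairwise intersection and union from it. This assumes the ground-truth instances do not overlap.

import numpy as np

def pairwise_iou(mask, pred):
    # mask: (K, H, W) boolean instance masks; pred: (H, W) integer labels, 0 = background
    K = mask.shape[0]
    gt = np.zeros(mask.shape[1:], dtype=np.int64)  # 0 = background
    for k in range(K):
        gt[mask[k]] = k + 1  # ground-truth labels 1..K
    n_pred = int(pred.max()) + 1
    # counts[i, j] = number of pixels with ground-truth label i and predicted label j
    counts = np.histogram2d(gt.ravel(), pred.ravel(),
                            bins=(np.arange(K + 2), np.arange(n_pred + 1)))[0]
    inter = counts[1:, 1:]  # drop the background row and column
    area_gt = counts[1:, :].sum(axis=1, keepdims=True)
    area_pred = counts[:, 1:].sum(axis=0, keepdims=True)
    union = area_gt + area_pred - inter
    return inter / np.maximum(union, 1)  # (num_masks, num_predictions) IoU matrix

# iou = pairwise_iou(mask, pred); best_iou_per_mask = iou.max(axis=1)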

Related

How to generate a Rank 5 matrix with entries Uniform?

I want to generate a rank 5 100x600 matrix in numpy with all the entries sampled from np.random.uniform(0, 20), so that all the entries are uniformly distributed over [0, 20). What would be the best way to do this in Python?
I see there is an SVD-inspired way to do so here (https://math.stackexchange.com/questions/3567510/how-to-generate-a-rank-r-matrix-with-entries-uniform), but I am not sure how to code it up. I am looking for a working example of this SVD-inspired way to get uniformly distributed entries.
I have actually managed to code up a rank 5 100x100 matrix by vertically stacking five 20x100 rank 1 matrices and then shuffling the row indices. However, the resulting 100x100 matrix does not have entries uniformly distributed over [0, 20).
Here is my code (my best attempt):
import numpy as np

def randomMatrix(m, n, p, q):
    # creates an m x n matrix with lower bound p and upper bound q, randomly.
    count = np.random.uniform(p, q, size=(m, n))
    return count

Qs = []
my_rank = 5
for i in range(my_rank):
    L = randomMatrix(20, 1, 0, np.sqrt(20))  # L is tall
    R = randomMatrix(1, 100, 0, np.sqrt(20))  # R is long
    Q = np.outer(L, R)
    Qs.append(Q)
Q = np.vstack(Qs)
# shuffle rows (preserves rank 5 [confirmed])
np.random.shuffle(Q)
Not a perfect solution, I must admit, but it's simple and comes pretty close.
I create 5 vectors that will span the matrix and fill in the rest of the matrix with random linear combinations of them.
My initial thought was that a trivial solution would be to simply copy those vectors 20 times.
To improve on that, I created linear combinations of them with weights drawn from a uniform distribution, but then the distribution of the entries becomes approximately normal, because the weighted averaging makes the central limit theorem kick in.
A middle ground between the trivial approach and the second approach that doesn't work is to use sets of weights that strongly favor one of the vectors over the others. You can generate such weight vectors by passing random vectors through the softmax function with a suitably high temperature parameter.
The resulting distribution is almost uniform, but the generated vectors are still very close to the base vectors. You can play with the temperature parameter to find a sweet spot that suits your purpose.
from scipy.stats import ortho_group
from scipy.special import softmax
import numpy as np
from matplotlib import pyplot as plt
N = 100
R = 5
low = 0
high = 20
sm_temperature = 100
p = np.random.uniform(low, high, (1, R, N))
weights = np.random.uniform(0, 1, (N-R, R, 1))
weights = softmax(weights*sm_temperature, axis = 1)
p_lc = (weights*p).sum(1)
rand_mat = np.concatenate([p[0], p_lc])
plt.hist(rand_mat.flatten())
I just couldn't get over the fact that my previous solution (the "selection" method) did not really produce strictly uniformly distributed entries, only something close enough to fool a statistical test some of the time; the asymptotic case will almost surely not be uniformly distributed. So I dreamt up another idea that is just as flawed, but in a different way - it is not really random.
In this solution, I do something similar to the OP's method of forming R rank-1 matrices and then concatenating them, but a little differently. I create each sub-matrix by stacking the base vector multiplied by 0.5 on top of the same scaled vector shifted by half the dynamic range of the uniform distribution. The process continues with multiplication by a third, two thirds and 1, and the corresponding shifts, and so on, until I have the required number of vectors for that part of the matrix.
I know it sounds incomprehensible, but unfortunately I couldn't find a way to explain it better. Hopefully, reading the code will shed some more light.
I hope this "staircase" method will be more reliable and useful.
import numpy as np
from matplotlib import pyplot as plt
'''
params:
N - base dimension
M - matrix length
R - matrix rank
high - max value of the matrix
low - min value of the matrix
'''
N = 100
M = 600
R = 5
high = 20
low = 0
# base vectors of the matrix
base = low+np.random.rand(R-1, N)*(high-low)
def build_staircase(base, num_stairs, low, high):
    '''
    Create a rank-2 matrix of 'num_stairs' different vectors whose elements
    are all uniformly distributed like the values of 'base'.
    '''
    l = levels(num_stairs)
    vectors = []
    for l_i in l:
        for i in range(l_i):
            vector_dynamic = (base - low) / l_i
            vector_bias = low + np.ones_like(base) * i * ((high - low) / l_i)
            vectors.append(vector_dynamic + vector_bias)
    return np.array(vectors)
def levels(total):
    '''
    Create a sequence of strictly increasing numbers summing up to the total.
    '''
    l = []
    sum_l = 0
    i = 1
    while sum_l < total:
        l.append(i)
        i += 1
        sum_l = sum(l)
    i = 0
    while sum_l > total:
        l[i] -= 1
        if l[i] == 0:
            l.pop(i)
        else:
            i += 1
        if i == len(l):
            i = 0
        sum_l = sum(l)
    return l
n_rm = R - 1  # number of matrix subsections
m_rm = M // n_rm
len_rms = [M // n_rm for i in range(n_rm)]
len_rms[-1] += M % n_rm
rm_list = []
for i, len_rm in enumerate(len_rms):
    # create a rank-2 matrix with uniform entries
    # out of the vector 'base[i]' and a ones vector.
    rm_list.append(build_staircase(
        base=base[i],
        num_stairs=len_rm,
        low=low,
        high=high,
    ))
rm = np.concatenate(rm_list)
plt.hist(rm.flatten(), bins=100)
A few example histograms (images omitted), including one with N = 1000, M = 6000 to empirically demonstrate the nearly asymptotic behavior.
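As a quick sanity check (my addition, not part of the original answer), one can verify the rank and the value range of the generated matrix:

# sanity checks on the staircase matrix 'rm' built above
print(np.linalg.matrix_rank(rm))  # expected: R (5 here)
print(rm.min(), rm.max())         # entries stay within [low, high)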

Python: Fastest way to perform millions of simple linear regression with 1 exogenous variable only

I am performing component-wise regression on time series data. Instead of regressing y against x1, x2, ..., xN, we regress y against x1 only, y against x2 only, ..., take the regression that reduces the sum of squared residuals the most, and add it as a base learner. This is repeated M times, so the final model is the sum of many simple linear regressions of the form y against xi (1 exogenous variable only) - basically gradient boosting with linear regressions as the base learners.
The problem is that since I am performing a rolling-window regression on the time series data, I have to do N × M × T regressions, which is more than a million OLS fits. Although each OLS is very fast, it takes a few hours to run on my weak laptop.
Currently, I am using statsmodels.OLS.fit() to get the parameters for each y against xi regression, as shown below. z_matrix is the data matrix and i is the index of the column to slice out for the regression. The number of rows is about 100 and z_matrix is about 100 × 500.
ols_model = sm.OLS(endog=endog, exog=self.z_matrix[:, i][..., None]).fit()
return ols_model.params, ols_model.ssr, ols_model.fittedvalues[..., None]
I have read in a previous post from 2016, Fastest way to calculate many regressions in python?, that repeated calls to statsmodels are not efficient, so I tried one of the answers, which suggested numpy's pinv. Unfortunately it is slower:
# slower: 40 sec vs 30 sec with statsmodels for 100 repeated runs of 150 linear regressions
params = np.linalg.pinv(self.z_matrix[:, [i]]).dot(endog)
y_hat = self.z_matrix[:, [i]] @ params
ssr = sum((y_hat - endog) ** 2)
return params, ssr, y_hat
Does anyone have any better suggestions to speed up the computation of these regressions? I just need the estimated parameters, the sum of squared residuals, and the predicted ŷ values. Thank you!
Here is one way, since you are always running regressions without a constant. This code runs around 900K models in about 0.5s. It retains the sse, the predicted values, and the estimated parameters for each of the 900K regressions.
The big idea is to exploit the math behind a regression of one variable on another without a constant: the slope is the ratio of a cross-product to an inner product. This could be modified to also include a constant by using a moving-window demean to estimate the intercept.
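Concretely, for a no-constant model y = x*b + e estimated over a window, the OLS slope is sum(x*y) / sum(x*x) over that window, which is exactly what the cumulative sums below compute. A tiny self-contained check of that identity (illustrative only, not part of the original answer):

import numpy as np
rng = np.random.default_rng(0)
x = rng.standard_normal(100)
y = 0.7 * x + rng.standard_normal(100)
b_ratio = (x * y).sum() / (x * x).sum()                     # cross-product / inner product
b_lstsq = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]  # no-constant OLS
assert np.isclose(b_ratio, b_lstsq)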
import numpy as np
from statsmodels.regression.linear_model import OLS
import datetime
gen = np.random.default_rng(20210514)
# Number of observations
n = 1000
# Number of predictors
m = 1000
# Window size
w = 100
# Simulate data
y = gen.standard_normal((n, 1))
x = gen.standard_normal((n, m))
now = datetime.datetime.now()
# Compute rolling covariance and variance-like terms
# These assume the model is y = x*b + e w/o a constant
c = np.r_[np.zeros((1, m)), np.cumsum(x * y, axis=0)]
v = np.r_[np.zeros((1, m)), np.cumsum(x * x, axis=0)]
c_trimmed = c[w:] - c[:-w]
v_trimmed = v[w:] - v[:-w]
# Parameters are just the ratio
params = c_trimmed / v_trimmed
# Build a selector array to quickly reshape y and the columns of x
step = np.arange(m - w + 1)
sel = np.arange(w)
locs = step[:, None] + sel
# Get the blocked reshape of y. It has n - w + 1 rows with window observations
# and looks like
# [[y[0],y[1],...,y[99]],
# [y[1],y[2],...,y[100]],
# ...,
# [y[900],y[901],...,y[999]],
y_block = y[locs, 0]
# Storage for the predicted values and the sse
y_pred = np.empty((x.shape[1],) + y_block.shape)
sse = np.empty((m - w + 1, n))
# Easiest to loop over columns.
# Could do broadcasting tricks, but not worth the trouble since the number of columns is modest
for i in range(x.shape[1]):
    # Reshape a column of x like y
    x_block = x[locs, i]
    # Get the parameters and make sure it is 2d with shape (m-w+1, 1)
    # so the broadcasting works
    p = params[:, i][:, None]
    # Get the predicted values
    y_pred[i] = x_block * p
    # And the sse
    sse[:, i] = ((y_block - y_pred[i]) ** 2).sum(1)
print(f"Time: {(datetime.datetime.now() - now).total_seconds()}s")
# Some test code
# Test any single observation
start = 124
assert start <= m - w
column = 342
assert column < x.shape[1]
res = OLS(y[start : start + 100], x[start : start + 100, [column]]).fit()
np.testing.assert_allclose(res.params[0], params[start, column])
np.testing.assert_allclose(res.fittedvalues, y_pred[column, start])
np.testing.assert_allclose(res.ssr, sse[start, column])

Sampling a 2D Matrix while keeping size of original matrix

I am working on Probabilistic Matrix Factorization. I want to divide the ratings matrix into a train set and a validation set. The code I have seen does NOT take a random sample.
I have been trying to figure out how to do this.
For the sake of a simple example, say the original matrix R is (10, 10). I am looking for a train matrix that is also (10, 10) but only contains, let's say, a randomly sampled 70% of the values of R, with everything else being 0, and a validation matrix that is also (10, 10) and contains the remaining 30% of the values, with the rest being 0.
The key thing here is keeping the matrices the same size.
Does anyone know how to do this?
edit:
Thinking about it more: TrainingMatrix + ValidationMatrix = Matrix R if you do an element-wise addition.
Here is my solution. Let me know if there is an easier/faster way to do this.
import numpy as np

# Figuring out matrix sampling
np.random.seed(3)
training_percent = .7
R = np.arange(100).reshape(10, 10)
print("R:")
print(R)
# Get the indexes of the non-zero elements
non_zero_is, non_zero_js = np.nonzero(R)
# Randomly sample the indexes
sampled_indexes_of_non_zeros = np.random.choice(len(non_zero_is),
    int(np.floor(training_percent * len(non_zero_is))), replace=False)
# Training mask
training_mask = np.zeros(R.shape, dtype=bool)
for position in sampled_indexes_of_non_zeros:
    training_mask[non_zero_is[position], non_zero_js[position]] = True
# Training matrix
training_matrix = R.copy()
training_matrix[~training_mask] = 0
print("\ntraining_matrix:")
print(training_matrix)
# Validation matrix
validation_matrix = R.copy()
validation_matrix[training_mask] = 0
print("\nvalidation_matrix:")
print(validation_matrix)
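For comparison, here is a shorter vectorized sketch (my suggestion, not from the original post) that draws the training mask with a single comparison. Note that it keeps roughly, rather than exactly, 70% of the non-zero entries; the choice-based version above gives an exact split.

import numpy as np

rng = np.random.default_rng(3)
R = np.arange(100).reshape(10, 10)
training_percent = .7

# True with probability training_percent, restricted to the non-zero entries
training_mask = (rng.random(R.shape) < training_percent) & (R != 0)
training_matrix = np.where(training_mask, R, 0)
validation_matrix = np.where(training_mask, 0, R)

# element-wise addition recovers R
assert np.array_equal(training_matrix + validation_matrix, R)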

Calculating Mean Squared Error through Matrix Arithmetic on Numpy Matrices of Binary Images

I have 2 binary images: one is a ground truth, and one is an image segmentation that I produced.
I am trying to calculate the mean squared distance ...
Let G = {g1, g2, . . . , gN} be the points in the ground truth boundary.
Let B = {b1, b2, . . . , bM} be the points in the segmented boundary.
Define d(p, p0) to be a measure of distance between points p and p0 (e.g. Euclidean, city block, etc.)
... between the two images, using the following algorithm.
def MSD(A, G):
    '''
    Takes a thresholded binary image and a ground truth image (binary), and computes the mean squared absolute difference
    :param A: The thresholded binary image
    :param G: The ground truth image
    :return:
    '''
    sim = np.bitwise_xor(A, G)
    sum = 0
    for i in range(0, sim.shape[0]):
        for j in range(0, sim.shape[1]):
            if (sim[i, j] == True):
                min = 9999999
                for k in range(0, sim.shape[0]):
                    for l in range(0, sim.shape[1]):
                        if (sim[k, l] == True):
                            e = abs(i - k) + abs(j - l)
                            if e < min:
                                min = e
                                mink = k
                                minl = l
                sum += min
    return sum / (sim.shape[0] * sim.shape[1])
This algorithm is too slow, though, and never completes.
This example and this example (Answer 3) might show how to get the mean squared error using matrix arithmetic, but I do not understand how these examples work or why they work.
So if I understand your formula and code correctly, you have one (binary) image B and a (ground truth) image G. "Points" are defined by the pixel positions where either image has a True (or at least nonzero) value. From your bitwise_xor I deduce that both images have the same shape, (M, N).
So the quantity d^2(b, g) is at worst an (M*N, M*N)-sized array, relating each pixel of B to each pixel of G. It's even better: we only need an array of shape (m, n) if there are m nonzeros in B and n nonzeros in G. Unless your images are huge, we can get away with keeping track of this large quantity; it costs memory, but we win a lot of CPU time through vectorization. Then we only have to find the minimum of this distance over the n possible values for each of the m, and sum up those minima. Note that the solution below uses extreme vectorization, and it can easily eat up your memory if the images are large.
Assuming Manhattan distance (with the square in d^2, which seems to be missing from your code):
import numpy as np

# generate dummy data
M, N = 100, 100
B = np.random.rand(M, N) > 0.5
G = np.random.rand(M, N) > 0.5

def MSD(B, G):
    # get indices of nonzero pixels
    nnz_B = B.nonzero()  # (x_inds, y_inds) tuple, x_inds and y_inds are shape (m,)
    nnz_G = G.nonzero()  # (x_inds', y_inds') each with shape (n,)
    # np.array(nnz_B) has shape (2, m)
    # compute squared Manhattan distance
    dist2 = abs(np.array(nnz_B)[..., None] - np.array(nnz_G)[:, None, :]).sum(axis=0)**2  # shape (m, n)
    # alternatively: Euclidean for comparison:
    # dist2 = ((np.array(nnz_B)[..., None] - np.array(nnz_G)[:, None, :])**2).sum(axis=0)
    mindist2 = dist2.min(axis=-1)  # shape (m,) of minimum square distances
    return mindist2.mean()  # sum divided by m, i.e. the MSD itself

print(MSD(B, G))
If the above uses too much memory, we can introduce a loop over the elements of nnz_B and only vectorize over the elements of nnz_G, as in the sketch below. This takes more CPU time and less memory; that trade-off is typical for vectorization.
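A minimal sketch of that loop-over-B variant (my illustration, assuming the same Manhattan distance as above):

def MSD_lowmem(B, G):
    # vectorize only over the nonzero pixels of G, loop over those of B
    bx, by = np.nonzero(B)
    gx, gy = np.nonzero(G)
    total = 0.0
    for x, y in zip(bx, by):
        d2 = (np.abs(gx - x) + np.abs(gy - y)).min() ** 2  # squared Manhattan distance
        total += d2
    return total / bx.size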
An efficient method for calculating this distance is using the Distance Transform. SciPy has an implementation in the ndimage package: scipy.ndimage.morphology.distance_transform_edt.
The idea is to compute a distance transform for the background of the ground-truth image G. This leads to a new image D that is 0 for each pixel that is nonzero in G, and for each zero pixel in G there will be the distance to the nearest nonzero pixel.
Next, for each nonzero pixel in B (or A in the code that you posted), you look at the corresponding pixel in D. This is the distance to G for that pixel. So, simply average all the values in D for which B is nonzero to obtain your result.
import numpy as np
import scipy.ndimage as nd
import matplotlib.pyplot as pp
# Create some test data
img = pp.imread('erika.tif') # a random image
G = img > 120 # the ground truth
img = img + np.random.normal(0, 20, img.shape)
B = img > 120 # the other image
D = nd.morphology.distance_transform_edt(~G)
msd = np.mean(D[B]**2)

Adding realistic noise to a gaussian distribution while keeping the number of samples above/below a threshold approximately constant

I have a normal distribution and a threshold function that determines whether a value is True.
I'd like to add noise to the truth values, such that the further above/below the threshold a value is, the less likely it is to be inverted. So at the extremes there might be a 1% chance of a flip, and right at the threshold a 50% chance.
I would also like to keep the number of samples above and below the threshold approximately constant before and after the noise is added.
I think the code below does the first half, but I am not certain about the approach for the second half (perhaps trial subtraction of a delta from rnum until the number of Trues before and after the operation is within some error margin).
import numpy as np
mean = .5
std_dev = .2
num_points = 10000
arr = np.sort( np.random.normal(loc=mean, scale= std_dev, size=(num_points)) )
threshold = .8
trues = arr >= threshold
temp = np.where(trues, 1-arr, arr)
scaling = max(temp)
temp *= .5/scaling
rnum = np.random.random(size=(num_points))
flip = rnum <= temp
trues = np.logical_xor(trues, flip)
If I got it right, you want an output vector with the following properties:
it is a boolean vector,
it has the same number of elements as the input vector,
the probability of each element being True depends on its value relative to the threshold,
the number of Trues is the same as with simple thresholding.
So, you need a probability function which gives the probability of each input value producing True in the output. With plain thresholding the probability is 1 above the threshold and 0 below it. You, however, need something softer.
Without the last requirement (the number of Trues), the algorithm would be very simple: the probability function output would be compared to random values between 0 and 1, and that would be the result. Depending on the input signal distribution and the probability function, this may produce satisfactory results.
Just an example of this:
# threshold at 0.8, below 0.7 always False, above 0.9 always True, linear in between
def prob_f(x):
    return np.clip((x - 0.8) / .2 + .5, 0., 1.)

def noisy_threshold(sig):
    p = prob_f(sig)
    return p > np.random.random(sig.shape)
But if a better match to the number of Trues is required, we need to do something afterwards: we need a function which is given the desired number of Trues and the probabilities. Of course, doing this changes some of the properties of the resulting distribution, so there is no "clean" way of doing it.
One possibility is to slightly adjust the probability threshold. For example:
def noisy_threshold(sig, threshold):
    # number of Trues with simple thresholding
    n_trues = np.sum(sig > threshold)
    # difference between random noise and our probability
    rdiff = prob_f(sig) - np.random.random(sig.shape)
    # sort the differences
    sortdiff = np.sort(rdiff)
    # a new threshold is used so that the number of Trues is correct:
    return rdiff >= sortdiff[-n_trues]
This will return exactly n_trues Trues, unless we are extremely unlucky and some of the random differences happen to be exactly equal.
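A quick usage sketch (my addition), applying this to data like the question's, assuming numpy is imported as np and prob_f is defined as above:

arr = np.sort(np.random.normal(loc=.5, scale=.2, size=10000))
noisy = noisy_threshold(arr, 0.8)
print(noisy.sum(), np.sum(arr > 0.8))  # the two counts should match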
