I have an array containing arrays of coordinates like this:
a = [[0,0,300,400],[1,1,15,59],[5,5,300,400]]
Now I want to get the overlap ratio of each rectangle to the other rectangles:
def bool_rect_intersect(A, B):
    return not (B[0]>A[2] or B[2]<A[0] or B[3]<A[1] or B[1]>A[3])
def get_overlap_ratio(A, B):
    in_ = bool_rect_intersect(A, B)
    if not in_:
        return 0
    else:
        left = max(A[0], B[0])
        top = max(A[1], B[1])
        right = min(A[2], B[2])
        bottom = min(A[3], B[3])
        intersection = [left, top, right, bottom]
        surface_intersection = (intersection[2]-intersection[0])*(intersection[3]-intersection[1])
        surface_A = (A[2]- A[0])*(A[3]-A[1]) + 0.0
        return surface_intersection / surface_A
Now I'm looking for the fastest way to compute the grid of overlaps for arrays of size 2000+.
If I loop over it, it takes more than a minute. I tried np.vectorize, but I don't think it is applicable to a multidimensional array.
Approach #1 : Here's a vectorized approach -
def pairwise_overlaps(a):
    r,c = np.triu_indices(a.shape[0],1)
    lt = np.maximum(a[r,:2], a[c,:2])
    tb = np.minimum(a[r,2:], a[c,2:])
    si_vectorized = (tb[:,0] - lt[:,0]) * (tb[:,1] - lt[:,1])
    slicedA_comps = ((a[:,2]- a[:,0])*(a[:,3]-a[:,1]) + 0.0)
    sA_vectorized = np.take(slicedA_comps, r)
    return si_vectorized/sA_vectorized
Sample run -
In [48]: a
Out[48]:
array([[ 0, 0, 300, 400],
[ 1, 1, 15, 59],
[ 5, 5, 300, 400]])
In [49]: print get_overlap_ratio(a[0], a[1]) # Looping thru pairs
...: print get_overlap_ratio(a[0], a[2])
...: print get_overlap_ratio(a[1], a[2])
...:
0.00676666666667
0.971041666667
0.665024630542
In [50]: pairwise_overlaps(a) # Proposed app to get all those in one-go
Out[50]: array([ 0.00676667, 0.97104167, 0.66502463])
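One caveat worth checking: unlike get_overlap_ratio, the vectorized version has no intersection test, so for two disjoint rectangles at least one extent comes out negative and the product can even be positive. Below is a minimal sketch of one way to guard against that, assuming it is acceptable to clip negative extents to zero. The name pairwise_overlaps_clipped is made up for illustration and is not part of the answer.

```python
import numpy as np

def pairwise_overlaps_clipped(a):
    # Hypothetical variant of pairwise_overlaps: clips negative extents to 0
    # so that disjoint rectangles yield a ratio of 0 (assumption, see lead-in).
    a = np.asarray(a, dtype=float)
    r, c = np.triu_indices(a.shape[0], 1)
    left_top = np.maximum(a[r, :2], a[c, :2])
    right_bottom = np.minimum(a[r, 2:], a[c, 2:])
    extent = np.clip(right_bottom - left_top, 0, None)   # width, height >= 0
    intersection = extent[:, 0] * extent[:, 1]
    area = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    return intersection / area[r]

sample = pairwise_overlaps_clipped([[0, 0, 300, 400], [1, 1, 15, 59], [5, 5, 300, 400]])
disjoint = pairwise_overlaps_clipped([[0, 0, 1, 1], [5, 5, 6, 6]])
```

For rectangles that do intersect, the result is unchanged, because the clipping only affects negative extents.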
Approach #2 : On closer inspection, the indexing with r and c in the previous approach is a performance bottleneck, because that kind of indexing makes copies. We can improve on this by using broadcasting to compare each element in a column against every other element in the same column, as in the implementation below -
def pairwise_overlaps_v2(a):
    rl = np.minimum(a[:,2], a[:,None,2]) - np.maximum(a[:,0], a[:,None,0])
    bt = np.minimum(a[:,3], a[:,None,3]) - np.maximum(a[:,1], a[:,None,1])
    si_vectorized2D = rl*bt
    slicedA_comps = ((a[:,2]- a[:,0])*(a[:,3]-a[:,1]) + 0.0)
    overlaps2D = si_vectorized2D/slicedA_comps[:,None]
    r = np.arange(a.shape[0])
    # True where row index < column index, i.e. the strict upper triangle
    triu_mask = r[:,None] < r
    return overlaps2D[triu_mask]
Runtime test
In [238]: n = 1000
In [239]: a = np.hstack((np.random.randint(0,100,(n,2)), \
np.random.randint(300,500,(n,2))))
In [240]: np.allclose(pairwise_overlaps(a), pairwise_overlaps_v2(a))
Out[240]: True
In [241]: %timeit pairwise_overlaps(a)
10 loops, best of 3: 35.2 ms per loop
In [242]: %timeit pairwise_overlaps_v2(a)
100 loops, best of 3: 16 ms per loop
Let's add in the original approach as a list comprehension -
In [244]: r,c = np.triu_indices(a.shape[0],1)
In [245]: %timeit [get_overlap_ratio(a[r[i]], a[c[i]]) for i in range(len(r))]
1 loops, best of 3: 2.85 s per loop
Around 180x speedup there with the second approach over the original one!
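If the full N x N grid is wanted rather than the flattened upper triangle, the same broadcasting idea can be kept as a 2D result. Here is a sketch under that assumption, with grid[i, j] holding the intersection area divided by the area of rectangle i. The name overlap_grid is hypothetical, and clipping to zero for disjoint rectangles is an addition that is not in the answer above.

```python
import numpy as np

def overlap_grid(a):
    # Hypothetical helper: full N x N grid of overlap ratios via broadcasting.
    a = np.asarray(a, dtype=float)
    width = np.minimum(a[:, None, 2], a[None, :, 2]) - np.maximum(a[:, None, 0], a[None, :, 0])
    height = np.minimum(a[:, None, 3], a[None, :, 3]) - np.maximum(a[:, None, 1], a[None, :, 1])
    # Negative width/height means no overlap along that axis; clip to 0.
    intersection = np.clip(width, 0, None) * np.clip(height, 0, None)
    area = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    return intersection / area[:, None]   # row i is divided by the area of rectangle i

grid = overlap_grid([[0, 0, 300, 400], [1, 1, 15, 59], [5, 5, 300, 400]])
```

Note that the grid is not symmetric: grid[1, 0] is 1.0 because the second rectangle lies entirely inside the first, while grid[0, 1] is small.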
Related
I have two very large numpy arrays, which are both 3D. I need to find an efficient way to check if they are overlapping, because turning them both into sets first takes too long. I tried to use another solution I found here for this same problem but for 2D arrays, but I didn't manage to make it work for 3D.
Here is the solution for 2D:
nrows, ncols = A.shape
dtype={'names':['f{}'.format(i) for i in range(ncols)],
       'formats':ncols * [A.dtype]}
C = np.intersect1d(A.view(dtype), B.view(dtype))
# This last bit is optional if you're okay with "C" being a structured array...
C = C.view(A.dtype).reshape(-1, ncols)
(where A and B are the 2D arrays)
I need to find the number of overlapping numpy arrays, but not the specific ones.
We could leverage views using a helper function that I have used across a few Q&As. To test for the presence of subarrays, we could use np.isin on the views, or take a more laborious route with np.searchsorted.
Approach #1 : Using np.isin -
# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b): # a, b are arrays
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(),  b.view(void_dt).ravel()
def isin_nd(a,b):
    # a,b are the 3D input arrays to give us "isin-like" functionality across them
    A,B = view1D(a.reshape(a.shape[0],-1),b.reshape(b.shape[0],-1))
    return np.isin(A,B)
Approach #2 : We could also leverage np.searchsorted upon the views -
def isin_nd_searchsorted(a,b):
    # a,b are the 3D input arrays
    A,B = view1D(a.reshape(a.shape[0],-1),b.reshape(b.shape[0],-1))
    sidx = A.argsort()
    sorted_index = np.searchsorted(A,B,sorter=sidx)
    sorted_index[sorted_index==len(A)] = len(A)-1
    idx = sidx[sorted_index]
    return A[idx] == B
So, these two solutions give us the mask of presence of each of the subarrays from a in b. Hence, the desired count is isin_nd(a,b).sum() or isin_nd_searchsorted(a,b).sum().
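For small inputs, the count can be cross-checked with a plain Python reference that is slow but easy to reason about. This is only a sketch for verification, not a replacement for the vectorized approaches. It assumes both arrays share the same dtype and the same trailing shape, and the name count_shared_subarrays is made up here.

```python
import numpy as np

def count_shared_subarrays(a, b):
    # Hypothetical reference implementation: compare subarrays by their raw bytes.
    # Assumes a.dtype == b.dtype and a.shape[1:] == b.shape[1:].
    a2d = np.ascontiguousarray(a).reshape(a.shape[0], -1)
    b2d = np.ascontiguousarray(b).reshape(b.shape[0], -1)
    seen = {row.tobytes() for row in b2d}
    # Number of subarrays of a that also occur somewhere in b
    return sum(row.tobytes() in seen for row in a2d)

# Setup with 3 common "subarrays"; all other subarrays are distinct by construction
a = np.arange(10 * 4 * 5).reshape(10, 4, 5)
b = -np.arange(1, 7 * 4 * 5 + 1).reshape(7, 4, 5)
b[1] = a[4]
b[3] = a[2]
b[6] = a[0]
shared = count_shared_subarrays(a, b)
```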
Sample run -
In [71]: # Setup with 3 common "subarrays"
...: np.random.seed(0)
...: a = np.random.randint(0,9,(10,4,5))
...: b = np.random.randint(0,9,(7,4,5))
...:
...: b[1] = a[4]
...: b[3] = a[2]
...: b[6] = a[0]
In [72]: isin_nd(a,b).sum()
Out[72]: 3
In [73]: isin_nd_searchsorted(a,b).sum()
Out[73]: 3
Timings on large arrays -
In [74]: # Setup
...: np.random.seed(0)
...: a = np.random.randint(0,9,(100,100,100))
...: b = np.random.randint(0,9,(100,100,100))
...: idxa = np.random.choice(range(len(a)), len(a)//2, replace=False)
...: idxb = np.random.choice(range(len(b)), len(b)//2, replace=False)
...: a[idxa] = b[idxb]
# Verify output
In [82]: np.allclose(isin_nd(a,b),isin_nd_searchsorted(a,b))
Out[82]: True
In [75]: %timeit isin_nd(a,b).sum()
10 loops, best of 3: 31.2 ms per loop
In [76]: %timeit isin_nd_searchsorted(a,b).sum()
100 loops, best of 3: 1.98 ms per loop
I wanted to create this kind of array using numpy:
[[[0,0,0], [1,0,0], ..., [1919,0,0]],
[[0,1,0], [1,1,0], ..., [1919,1,0]],
...,
[[0,1019,0], [1,1019,0], ..., [1919,1019,0]]]
which I can access via:
>>> data[25][37]
array([25, 37, 0])
I've tried to create an array this way, but it's not complete:
>>> data = np.mgrid[0:1920:1, 0:1080:1].swapaxes(0,2).swapaxes(0,1)
>>> data[25][37]
array([25, 37])
Do you have any idea how to solve this using numpy?
Approach #1 : Here's one way with np.ogrid and array-initialization -
def indices_zero_grid(m,n):
    I,J = np.ogrid[:m,:n]
    out = np.zeros((m,n,3), dtype=int)
    out[...,0] = I
    out[...,1] = J
    return out
Sample run -
In [550]: out = indices_zero_grid(1920,1080)
In [551]: out[25,37]
Out[551]: array([25, 37, 0])
Approach #2 : A modification of @senderle's cartesian_product and also inspired by @unutbu's modification to it -
import functools
def indices_zero_grid_v2(m,n):
    """
    Based on cartesian_product
    http://stackoverflow.com/a/11146645 (@senderle)
    Inspired by : https://stackoverflow.com/a/46135435 (@unutbu)
    """
    shape = m,n
    arrays = [np.arange(s, dtype='int') for s in shape]
    broadcastable = np.ix_(*arrays)
    broadcasted = np.broadcast_arrays(*broadcastable)
    rows, cols = functools.reduce(np.multiply, broadcasted[0].shape), \
                 len(broadcasted)+1
    out = np.zeros(rows * cols, dtype=int)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    return out.reshape(-1,m,n).transpose(1,2,0)
Runtime test -
In [2]: %timeit indices_zero_grid(1920,1080)
100 loops, best of 3: 8.4 ms per loop
In [3]: %timeit indices_zero_grid_v2(1920,1080)
100 loops, best of 3: 8.14 ms per loop
In [50]: data = np.mgrid[:1920, :1080, :1].transpose(1,2,3,0)[..., 0, :]
In [51]: data[25][37]
Out[51]: array([25, 37, 0])
Note that data[25][37] makes two calls to __getitem__. With NumPy arrays, you can access the same value more efficiently (with one call to __getitem__) using data[25, 37]:
In [54]: data[25, 37]
Out[54]: array([25, 37, 0])
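For comparison, here is a more compact variant built from np.indices and np.stack. It is a sketch only: the function name indices_zero_grid_stack is hypothetical, and no claim is made that it is faster than the approaches timed above.

```python
import numpy as np

def indices_zero_grid_stack(m, n):
    # Hypothetical compact variant: row index, column index, and a zero channel
    i, j = np.indices((m, n))                 # two (m, n) integer grids
    return np.stack((i, j, np.zeros_like(i)), axis=-1)   # shape (m, n, 3)

data = indices_zero_grid_stack(40, 50)
point = data[25, 37]
```

The last axis holds [row, column, 0], so data[25, 37] gives array([25, 37, 0]) as in the sample runs above.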
I have an array like this: tmp.shape = (128, 64, 64)
I am counting all nonzero values along the 128 axis like this:
nonzeros = np.count_nonzero(tmp, axis=0)  # shape = (64, 64)
I have an array c.shape = (64, 64)
Now I want to add the values of c to tmp along the 128 axis but only if the values of tmp are > 0:
for i in range(tmp.shape[0]):
    axis1 = tmp[i,:,:]
    tmp[i, :, :] += (c / nonzeros)  # only if tmp[i, :, :] > 0
Is that possible to do in a short way? Or do I have to use multiple loops?
I hope anyone can suggest a solution without another loop
Something like this does not work:
tmp[i, tmp > 0.0, tmp > 0.0] += (c / nonzeros)
IndexError: too many indices for array
LONG VERSION
for i in range(tmp.shape[0]):
    for x in range(tmp.shape[1]):
        for y in range(tmp.shape[2]):
            pixel = tmp[i, x, y]
            if pixel != 0:
                pixel += (c[x,y] / nonzeros[x,y])
You are essentially adding the broadcasted c/nonzeros into the tmp array, except at places where the tmp element is zero. So, one approach is to store the mask of zeros upfront, add c/nonzeros, and finally use the mask to reset those elements of tmp to zero.
Hence, the implementation would be -
mask = tmp==0
tmp+= c/nonzeros
tmp[mask] = 0
Runtime test
Approaches -
# @DSM's soln
def fast(tmp, c, nonzeros):
    return tmp + np.where(tmp > 0, c/nonzeros, 0)
# Proposed in this post
def fast2(tmp, c, nonzeros):
    mask = tmp==0
    tmp+= c/nonzeros
    tmp[mask] = 0
    return tmp
Timings -
In [341]: # Setup inputs
...: M,N = 128,64
...: tmp = np.random.randint(0,10,(M,N,N)).astype(float)
...: c = np.random.rand(N,N)*100
...: nonzeros = np.count_nonzero(tmp, axis=0)
...:
...: # Make copies for testing as input would be edited with the approaches
...: tmp1 = tmp.copy()
...: tmp2 = tmp.copy()
...:
In [342]: %timeit fast(tmp1, c, nonzeros)
100 loops, best of 3: 2.22 ms per loop
In [343]: %timeit fast2(tmp2, c, nonzeros)
1000 loops, best of 3: 1.61 ms per loop
Shorter alternative
If you are looking for compact code, here's a one-liner that multiplies the mask of non-zeros element-wise with the broadcasted c/nonzeros and adds the result into tmp -
tmp += (tmp!=0)*(c/nonzeros)
Note: To avoid division by 0, we could replace the zeros in nonzeros with any non-zero value, say 1, and then use the posted approaches, like so -
nonzeros = np.where(nonzeros > 0, nonzeros, 1)
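Putting the guard and the one-liner together gives a small non-in-place helper. This is a sketch under the assumption that tmp is a float array of shape (M, N, N) and c has shape (N, N). The name add_where_nonzero is hypothetical.

```python
import numpy as np

def add_where_nonzero(tmp, c):
    # Hypothetical helper: add c/nonzeros only where tmp is non-zero.
    nonzeros = np.count_nonzero(tmp, axis=0)
    # Replace zero counts by 1; those positions are all zero in tmp anyway,
    # so the mask below removes their contribution.
    safe = np.where(nonzeros > 0, nonzeros, 1)
    return tmp + (tmp != 0) * (c / safe)

tmp = np.array([[[0., 2.]], [[0., 0.]], [[0., 4.]]])   # shape (3, 1, 2)
c = np.array([[10., 6.]])                              # shape (1, 2)
result = add_where_nonzero(tmp, c)
```

In this small example the first column has no non-zero entries and stays at zero, while the second column has two non-zero entries that each receive 6 / 2 = 3.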
You could use np.where and broadcasting. After fixing your example code (adding to pixel won't modify tmp),
def fast(tmp, c, nonzeros):
    return tmp + np.where(tmp > 0, c/nonzeros, 0)
gives me
In [6]: tmp = np.random.randint(0, 5, (128, 64, 64)).astype(float)
...: c = np.random.randint(10, 15, (64, 64)).astype(float)
...: nonzeros = np.count_nonzero(tmp, axis=0)
...:
In [7]: %time slow_result = slow(tmp, c, nonzeros)
CPU times: user 488 ms, sys: 16 ms, total: 504 ms
Wall time: 553 ms
In [8]: %time fast_result = fast(tmp, c, nonzeros)
CPU times: user 4 ms, sys: 4 ms, total: 8 ms
Wall time: 16.4 ms
In [9]: np.allclose(slow_result, fast_result)
Out[9]: True
Alternatively, you can often replace np.where with a multiplication, something like tmp + (tmp > 0) * (c/nonzeros).
Modifying the code to protect against a situation in which an element of nonzeros is zero is left as an exercise for the reader. ;-)
Suppose I have an N-dimensional numpy array x and an (N-1)-dimensional index array m (for example, m = x.argmax(axis=-1)). I'd like to construct (N-1) dimensional array y such that y[i_1, ..., i_N-1] = x[i_1, ..., i_N-1, m[i_1, ..., i_N-1]] (for the argmax example above it would be equivalent to y = x.max(axis=-1)).
For N=3 I could achieve what I want by
y = x[np.arange(x.shape[0])[:, np.newaxis], np.arange(x.shape[1]), m]
The question is, how do I do this for an arbitrary N?
You can use np.indices:
firstdims=np.indices(x.shape[:-1])
And append your index array:
ind=tuple(firstdims)+(m,)
Then x[ind] is what you want.
In [228]: allclose(x.max(-1),x[ind])
Out[228]: True
Here's one approach using reshaping and linear indexing to handle multi-dimensional arrays of arbitrary dimensions -
shp = x.shape[:-1]
n_ele = np.prod(shp)
y_out = x.reshape(n_ele,-1)[np.arange(n_ele),m.ravel()].reshape(shp)
Let's take a sample case with a ndarray of 6 dimensions and let's say we are using m = x.argmax(axis=-1) to index into the last dimension. So, the output would be x.max(-1). Let's verify this for the proposed solution -
In [121]: x = np.random.randint(0,9,(4,5,3,3,2,4))
In [122]: m = x.argmax(axis=-1)
In [123]: shp = x.shape[:-1]
...: n_ele = np.prod(shp)
...: y_out = x.reshape(n_ele,-1)[np.arange(n_ele),m.ravel()].reshape(shp)
...:
In [124]: np.allclose(x.max(-1),y_out)
Out[124]: True
I liked @B. M.'s solution for its elegance. So, here's a runtime test to benchmark these two -
def reshape_based(x,m):
    shp = x.shape[:-1]
    n_ele = np.prod(shp)
    return x.reshape(n_ele,-1)[np.arange(n_ele),m.ravel()].reshape(shp)
def indices_based(x,m):  ## @B. M.'s solution
    firstdims=np.indices(x.shape[:-1])
    ind=tuple(firstdims)+(m,)
    return x[ind]
Timings -
In [152]: x = np.random.randint(0,9,(4,5,3,3,4,3,6,2,4,2,5))
...: m = x.argmax(axis=-1)
...:
In [153]: %timeit indices_based(x,m)
10 loops, best of 3: 30.2 ms per loop
In [154]: %timeit reshape_based(x,m)
100 loops, best of 3: 5.14 ms per loop
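On NumPy 1.15 or newer, np.take_along_axis offers another way to express the same lookup. The sketch below assumes that version requirement. The wrapper name take_last_axis is hypothetical, and it was not part of the timings above.

```python
import numpy as np

def take_last_axis(x, m):
    # Hypothetical wrapper: pick x[..., m[...]] along the last axis.
    # m gets a trailing length-1 axis so it has the same ndim as x,
    # and that axis is dropped again from the result.
    return np.take_along_axis(x, m[..., np.newaxis], axis=-1)[..., 0]

rng = np.random.default_rng(0)
x = rng.integers(0, 9, (4, 5, 3, 6))
m = x.argmax(axis=-1)
y = take_last_axis(x, m)
```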
Given a list of rotation angles (let's say about the X axis):
import numpy as np
x_axis_rotations = np.radians([0,10,32,44,165])
I can create an array of matrices matching these angles by doing so:
matrices = []
for angle in x_axis_rotations:
    matrices.append(np.asarray([[1 , 0 , 0],[0, np.cos(angle), -np.sin(angle)], [0, np.sin(angle), np.cos(angle)]]))
matrices = np.array(matrices)
This works, but it doesn't take advantage of numpy's strengths for dealing with large arrays. So if my array of angles is in the millions, doing it this way won't be very fast.
Is there a better (faster) way to create an array of transform matrices from an array of inputs?
Here's a direct and simple approach:
c = np.cos(x_axis_rotations)
s = np.sin(x_axis_rotations)
matrices = np.zeros((len(x_axis_rotations), 3, 3))
matrices[:, 0, 0] = 1
matrices[:, 1, 1] = c
matrices[:, 1, 2] = -s
matrices[:, 2, 1] = s
matrices[:, 2, 2] = c
Timings, for the curious:
In [30]: angles = 2 * np.pi * np.random.rand(1000)
In [31]: timeit OP(angles)
100 loops, best of 3: 5.46 ms per loop
In [32]: timeit askewchan(angles)
10000 loops, best of 3: 39.6 µs per loop
In [33]: timeit divakar(angles)
10000 loops, best of 3: 93.8 µs per loop
In [34]: timeit divakar_oneline(angles)
10000 loops, best of 3: 56.1 µs per loop
In [35]: timeit divakar_combine(angles)
10000 loops, best of 3: 43.9 µs per loop
All are much faster than your loop, so use whichever you like the most :)
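To make the direct approach easy to reuse and to sanity-check it, the assignments can be wrapped in a function and compared against the loop from the question. This is a minimal sketch. The function name x_rotation_matrices is an assumption for illustration.

```python
import numpy as np

def x_rotation_matrices(angles):
    # Hypothetical wrapper around the direct slice-assignment approach
    angles = np.asarray(angles, dtype=float)
    c, s = np.cos(angles), np.sin(angles)
    matrices = np.zeros((angles.size, 3, 3))
    matrices[:, 0, 0] = 1
    matrices[:, 1, 1] = c
    matrices[:, 1, 2] = -s
    matrices[:, 2, 1] = s
    matrices[:, 2, 2] = c
    return matrices

angles = np.radians([0, 10, 32, 44, 165])
R = x_rotation_matrices(angles)

# Reference: the loop from the question
loop = np.array([[[1, 0, 0],
                  [0, np.cos(t), -np.sin(t)],
                  [0, np.sin(t), np.cos(t)]] for t in angles])

# A rotation matrix times its transpose should give the identity
orthogonal = np.allclose(R @ R.transpose(0, 2, 1), np.eye(3))
```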
You can use linear indexing to help out, like so -
# Get cosine and sine values in one-go
cosv = np.cos(x_axis_rotations)
sinv = np.sin(x_axis_rotations)
# Get size parameter
N = x_axis_rotations.size
# Initialize output array
out = np.zeros((N,3,3))
# Set the first element in each 3D slice as 1
out[:,0,0] = 1
# Calculate the flat index of the first cosine entry in each 3D slice
# (kept 1D so that its shape matches cosv and sinv in the assignments below)
idx1 = 4 + 9*np.arange(N)
# One by one put those 4 values in 2x2 blocks across all 3D slices
out.ravel()[idx1] = cosv
out.ravel()[idx1+1] = -sinv
out.ravel()[idx1+3] = sinv
out.ravel()[idx1+4] = cosv
Alternatively, you can set the four elements in one go after you have initialized the output array with zeros and set the first element in each slice to 1, like so -
out.reshape(N,-1)[:,[4,5,7,8]] = np.column_stack((cosv,-sinv,sinv,cosv))
Between the two approaches above, two more middle-ground variants are possible, again applied right after initializing with zeros and setting the first element in each 3D slice to 1, like so -
out.reshape(N,-1)[:,[4,8]] = cosv[:,None]
out.reshape(N,-1)[:,[5,7]] = np.column_stack((-sinv[:,None],sinv[:,None]))
The last one would be -
out.reshape(N,-1)[:,[4,8]] = cosv[:,None]
out.reshape(N,-1)[:,5] = -sinv
out.reshape(N,-1)[:,7] = sinv
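The column numbers 4, 5, 7 and 8 used above are the flattened positions of the lower-right 2x2 block of a 3x3 matrix. A short sketch that checks this mapping with np.unravel_index and then fills the block in one assignment. The sample angles are taken from the question, and the variable names are illustrative only.

```python
import numpy as np

# Flat positions within a 3x3 slice mapped back to (row, col) pairs
rows, cols = np.unravel_index([4, 5, 7, 8], (3, 3))
positions = list(zip(rows.tolist(), cols.tolist()))

angles = np.radians([0, 10, 32, 44, 165])
cosv, sinv = np.cos(angles), np.sin(angles)
N = angles.size

out = np.zeros((N, 3, 3))
out[:, 0, 0] = 1
# reshape returns a view here (out is contiguous), so this writes into out
out.reshape(N, -1)[:, [4, 5, 7, 8]] = np.column_stack((cosv, -sinv, sinv, cosv))
```

This relies on out being C-contiguous. For a non-contiguous array, reshape may return a copy and the assignment would not reach the original.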