I have a 2D data array and I'm trying to get a profile of values about its center in an efficient manner. So the output should be two one-dimensional arrays: one with the distances from the center, the other with the mean of all the values in the original 2D array that are at that distance from the center.
Each index has a non-integer distance from the center, which prevents me from using some already known solutions for the problem. Allow me to explain.
Consider these matrices
data = np.random.randn(5,5)
L = 2
x = np.arange(-L,L+1,1)*2.5
y = np.arange(-L,L+1,1)*2.5
xx, yy = np.meshgrid(x, y)
r = np.sqrt(xx**2. + yy**2.)
So the matrices are
In [30]: r
Out[30]:
array([[ 7.07106781, 5.59016994, 5. , 5.59016994, 7.07106781],
[ 5.59016994, 3.53553391, 2.5 , 3.53553391, 5.59016994],
[ 5. , 2.5 , 0. , 2.5 , 5. ],
[ 5.59016994, 3.53553391, 2.5 , 3.53553391, 5.59016994],
[ 7.07106781, 5.59016994, 5. , 5.59016994, 7.07106781]])
In [31]: data
Out[31]:
array([[ 1.27603322, 1.33635284, 1.93093228, 0.76229675, -0.00956535],
[ 0.69556071, -1.70829753, 1.19615919, -1.32868665, 0.29679494],
[ 0.13097791, -1.33302719, 1.48226442, -0.76672223, -1.01836614],
[ 0.51334771, -0.83863115, -0.41541794, 0.34743342, 0.1199237 ],
[-1.02042539, 0.90739383, -2.4858624 , -0.07417987, 0.90748933]])
For this case the expected output should be array([ 0. , 2.5 , 3.53553391, 5. , 5.59016994, 7.07106781]) for the index of distances, and a second array of same length with the mean of all the values that are at those corresponding distances: array([ 0.98791323, -0.32496927, 0.37221219, -0.6209728 , 0.27986926, 0.04060628]).
From this answer there is a very nice function to compute the profile about any arbitrary point. However, the problem with his approach is that it approximates the distance r by the index distance. So his r for my case would be this:
array([[2, 2, 2, 2, 2],
[2, 1, 1, 1, 2],
[2, 1, 0, 1, 2],
[2, 1, 1, 1, 2],
[2, 2, 2, 2, 2]])
which is a pretty big difference for me, since I'm working with small matrices. This approximation, however, allows him to use np.bincount, which is pretty handy (but won't work for me).
I've been trying to expand this for float distance, like my version r, but so far no luck. bincount doesn't work with floats and histogram needs equally-spaced bins, which is not the case. Any suggestion?
Approach #1
def radial_profile_app1(data, r):
    mid = data.shape[0]//2
    ids = np.rint((r**2)/r[mid-1,mid]**2).astype(int).ravel()
    count = np.bincount(ids)
    R = data.shape[0]//2 # Radial profile radius
    R0 = R+1
    dists = np.unique(r[:R0,:R0][np.tril(np.ones((R0,R0),dtype=bool))])
    mean_data = (np.bincount(ids, data.ravel())/count)[count!=0]
    return dists, mean_data
For the given sample data -
In [475]: radial_profile_app1(data, r)
Out[475]:
(array([ 0. , 2.5 , 3.53553391, 5. , 5.59016994,
7.07106781]),
array([ 1.48226442 , -0.3297520425, -0.8820454775, -0.3605795875,
0.5696863263, 0.2883829525]))
Approach #2
def radial_profile_app2(data, r):
    R = data.shape[0]//2 # Radial profile radius
    range_arr = np.arange(-R,R+1)
    ids = (range_arr[:,None]**2 + range_arr**2).ravel()
    count = np.bincount(ids)
    R0 = R+1
    dists = np.unique(r[:R0,:R0][np.tril(np.ones((R0,R0),dtype=bool))])
    mean_data = (np.bincount(ids, data.ravel())/count)[count!=0]
    return dists, mean_data
Runtime test -
In [562]: # Setup inputs
...: N = 2001
...: data = np.random.randn(N,N)
...: L = (N-1)//2
...: x = np.arange(-L,L+1,1)*2.5
...: y = np.arange(-L,L+1,1)*2.5
...: xx, yy = np.meshgrid(x, y)
...: r = np.sqrt(xx**2. + yy**2.)
...:
In [563]: out01, out02 = radial_profile_app1(data, r)
...: out11, out12 = radial_profile_app2(data, r)
...:
...: print(np.allclose(out01, out11))
...: print(np.allclose(out02, out12))
...:
True
True
In [566]: %timeit radial_profile_app1(data, r)
...: %timeit radial_profile_app2(data, r)
...:
10 loops, best of 3: 114 ms per loop
10 loops, best of 3: 91.2 ms per loop
Got what I was expecting with this function:
def radial_prof(data, r):
    uniq = np.unique(r)
    prof = np.array([ np.mean(data[ r==un ]) for un in uniq ])
    return uniq, prof
But I'm still not happy that I had to use a list comprehension (or a Python loop), since it might be slow for very large matrices.
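One way to drop the Python loop, as a minimal sketch rather than a tuned implementation: let np.unique return the inverse mapping and feed it to np.bincount. The name radial_profile_unique is only illustrative, and the sketch assumes that pixels at the same distance have exactly equal float values in r (true for the symmetric grid in the question).

```python
import numpy as np

def radial_profile_unique(data, r):
    """Mean of `data` over each distinct value of `r`, without a Python loop."""
    # inv[k] is the position in `uniq` of the k-th flattened distance
    uniq, inv = np.unique(r.ravel(), return_inverse=True)
    sums = np.bincount(inv, weights=data.ravel())  # sum of data per distance
    counts = np.bincount(inv)                      # number of pixels per distance
    return uniq, sums / counts

# Same 5x5 grid as in the question
L = 2
x = np.arange(-L, L + 1) * 2.5
xx, yy = np.meshgrid(x, x)
r = np.sqrt(xx**2 + yy**2)
data = np.random.randn(5, 5)

dists, prof = radial_profile_unique(data, r)
```

Because the integer labels come from np.unique, bincount never sees the float distances themselves, which sidesteps the "bincount doesn't work with floats" problem.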
Here is an indirect sorting approach that should scale well if the batch size and/or the number of bins is large. The sorting is O(n log n); all the histogramming is O(n). I've also added a little unscientific speed test. For the speed test I use flat indexing, but I left the 2D index code in because it's more flexible when dealing with images of different sizes, etc.
import numpy as np
# this need only be run once per batch
def r_to_ind(r, dist_bins="auto"):
    f = np.argsort(r.ravel())
    # isinstance check avoids an ambiguous array == "auto" comparison
    if isinstance(dist_bins, str) and dist_bins == "auto":
        rs = r.ravel()[f]
        bins = np.where(np.r_[True, rs[1:]!=rs[:-1]])[0]
        dist_bins = rs[bins]
    else:
        bins = np.searchsorted(r.ravel()[f], dist_bins)
    denom = np.diff(np.r_[bins, r.size])
    return f, np.unravel_index(f, r.shape), bins, denom, dist_bins
# this is with adjustable offset
def profile_xy(image, yx, ij, bins, nynx, denom):
    (y, x), (i, j), (ny, nx) = yx, ij, nynx
    return np.add.reduceat(image[i + y - ny//2, j + x - nx//2], bins) / denom
# this is fixed
def profile_xy_no_offset(image, ij, bins, denom):
    return np.add.reduceat(image[ij], bins) / denom
# this is fixed and flat
def profile_xy_no_offset_flat(image, k, bins, denom):
    return np.add.reduceat(image.ravel()[k], bins) / denom
data = np.array([[ 1.27603322, 1.33635284, 1.93093228, 0.76229675, -0.00956535],
[ 0.69556071, -1.70829753, 1.19615919, -1.32868665, 0.29679494],
[ 0.13097791, -1.33302719, 1.48226442, -0.76672223, -1.01836614],
[ 0.51334771, -0.83863115, -0.41541794, 0.34743342, 0.1199237 ],
[-1.02042539, 0.90739383, -2.4858624 , -0.07417987, 0.90748933]])
r = np.array([[ 7.07106781, 5.59016994, 5. , 5.59016994, 7.07106781],
[ 5.59016994, 3.53553391, 2.5 , 3.53553391, 5.59016994],
[ 5. , 2.5 , 0. , 2.5 , 5. ],
[ 5.59016994, 3.53553391, 2.5 , 3.53553391, 5.59016994],
[ 7.07106781, 5.59016994, 5. , 5.59016994, 7.07106781]])
f, (i, j), bins, denom, dist_bins = r_to_ind(r)
result = profile_xy(data, (2, 2), (i, j), bins, (5, 5), denom)
print(dist_bins)
# [ 0. 2.5 3.53553391 5. 5.59016994 7.07106781]
print(result)
# [ 1.48226442 -0.32975204 -0.88204548 -0.36057959 0.56968633 0.28838295]
#########################
from timeit import timeit
n = 2001
batch = 100
fake = 10
a = np.random.random((fake, n, n))
l = np.linspace(-1, 1, n)**2
r = sum(np.ix_(l, l))
def run_all():
    f, ij, bins, denom, dist_bins = r_to_ind(r)
    for b in range(batch):
        profile_xy_no_offset_flat(a[b%fake], f, bins, denom)
print(timeit(run_all, number=10))
# 47.4157 (for 10 batches of 100 images of size 2001x2001)
# and my computer is slower than Divakar's ;-)
I've made some more benchmarks comparing mine to @Divakar's approach 3, stripping out everything precomputable into a run-once-per-batch function. The general finding: they are similar; mine has a higher upfront cost but is then faster. But they only cross over at around 100 pictures per batch.
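For readers who only want the core of the sort-then-reduce idea, here is a stripped-down sketch without the batching and offset machinery. The function name profile_by_sorting is hypothetical, and it assumes equal distances compare exactly equal as floats.

```python
import numpy as np

def profile_by_sorting(data, r):
    # Flat indices that sort the distances; equal distances become contiguous runs
    order = np.argsort(r, axis=None, kind="stable")
    rs = r.ravel()[order]
    # Start position of each run of equal distances in the sorted array
    starts = np.flatnonzero(np.r_[True, rs[1:] != rs[:-1]])
    counts = np.diff(np.r_[starts, rs.size])
    # reduceat sums data[starts[k]:starts[k+1]] for every run in one call
    sums = np.add.reduceat(data.ravel()[order], starts)
    return rs[starts], sums / counts

x = np.arange(-2, 3) * 2.5
r = np.sqrt(x[:, None]**2 + x[None, :]**2)
data = np.random.default_rng(0).standard_normal((5, 5))
dists, prof = profile_by_sorting(data, r)
```

The sort and the run boundaries depend only on r, so for a batch of images on the same grid they can be computed once and reused, which is what r_to_ind above does.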
I'm interested in a variant of the question "Increment Numpy multi-d array with repeated indices" in which the array is indexed with a cross product.
In particular, I want to perform the operation done by the following code using matrix operations to accelerate it:
def get_s(image, grid_size):
    W, H = image.shape
    s = np.zeros((W, H))
    for w in range(W):
        for h in range(H):
            i, j = int(w / grid_size), int(h / grid_size)
            s[i, j] += image[w, h]
    return s
My idea was to compute all the (i, j) indices at once and use NumPy's ix_ method to index the matrix s:
def get_s(image, grid_size):
    W, H = image.shape
    s = np.zeros((W, H))
    w_idx, h_idx = np.arange(W), np.arange(H)
    x_idx, y_idx = np.trunc(w_idx / grid_size).astype(int), np.trunc(h_idx / grid_size).astype(int)
    s[np.ix_(x_idx, y_idx)] += image
    return s
It is easier to understand the code above with NumPy's example:
Using ix_ one can quickly construct index arrays that will index the cross product. a[np.ix_([1,3],[2,5])] returns the array [[a[1,2] a[1,5]], [a[3,2] a[3,5]]].
In my case, it's likely that some indices will be repeated (for example, with grid_size=2, int(0 / grid_size) = int(1 / grid_size)). That's where the "Increment Numpy multi-d array with repeated indices" question comes in.
In case the indices are repeated, I would like to update the matrix with the image value by the same number of times. I cannot get any solution to this problem without any additional loops (e.g., zipping the indices; but you essentially have to perform the actual cross product of the indices for s and the image).
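One possibility worth trying, sketched here under the assumption that grid_size is a positive integer: np.add.at performs an unbuffered in-place add, so repeated destination indices accumulate instead of overwriting each other. The name get_s_add_at is only illustrative.

```python
import numpy as np

def get_s_add_at(image, grid_size):
    W, H = image.shape
    s = np.zeros((W, H))
    i = np.arange(W) // grid_size  # destination row for every source row
    j = np.arange(H) // grid_size  # destination column for every source column
    # i[:, None] and j[None, :] broadcast to the full (W, H) cross product;
    # repeated (i, j) pairs are summed rather than overwritten
    np.add.at(s, (i[:, None], j[None, :]), image)
    return s

image = np.arange(12).reshape(3, 4)
s = get_s_add_at(image, 2)
```

A plain `s[np.ix_(i, j)] += image` is buffered, which is why it only keeps one contribution per repeated index.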
I don't think this is the best way to do it but here's one way.
import numpy as np
image = np.arange(9).reshape(3, 3)
s = np.zeros((5, 5))
x_idx, y_idx = np.meshgrid([0, 0, 2], [1, 1, 2])
# find unique destinations
idxs = np.stack((x_idx.flatten(), y_idx.flatten())).T
idxs_unique, counts = np.unique(idxs, axis = 0, return_counts = True)
# create mask for the source and sum the source pixels headed to the same destination
idxs_repeated = idxs[None, :, :].repeat(len(idxs_unique), axis = 0)
image_mask = (idxs_repeated == idxs_unique[:, None, :]).all(-1)
pixel_sum = (image.flatten()[None, :]*image_mask).sum(-1)
# assign summed sources to destination
s[tuple(idxs_unique.T)] += pixel_sum
EDIT 1:
If you run into problems caused by memory constraints you can do the image masking and summation in batches as done in the following implementation. I set the batch size to 10 but that parameter can be set to whatever works on your machine.
import numpy as np
image = np.arange(12).reshape(3, 4)
s = np.zeros((5, 5))
x_idx, y_idx = np.meshgrid([0, 0, 2], [1, 1, 2, 1])
idxs = np.stack((x_idx.flatten(), y_idx.flatten())).T
idxs_unique, counts = np.unique(idxs, axis = 0, return_counts = True)
batch_size = 10
pixel_sum = []
for i in range(len(idxs_unique)//batch_size + ((len(idxs_unique)%batch_size)!=0)):
    batch = idxs_unique[i*batch_size:(i+1)*batch_size, None, :]
    idxs_repeated = idxs[None, :, :].repeat(len(batch), axis = 0)
    image_mask = (idxs_repeated == idxs_unique[i*batch_size:(i+1)*batch_size, None, :]).all(-1)
    pixel_sum.append((image.flatten()[None, :]*image_mask).sum(-1))
pixel_sum = np.concatenate(pixel_sum)
s[tuple(idxs_unique.T)] += pixel_sum
EDIT 2:
OP's method seems to be faster by far if you use numba.
import numpy as np
from numba import jit
@jit(nopython=True)
def get_s(image, grid_size):
    W, H = image.shape
    s = np.zeros((W, H))
    for w in range(W):
        for h in range(H):
            i, j = int(w / grid_size), int(h / grid_size)
            s[i, j] += image[w, h]
    return s
def get_s_vec(image, grid_size, batch_size = 10):
    W, H = image.shape
    s = np.zeros((W, H))
    w_idx, h_idx = np.arange(W), np.arange(H)
    x_idx, y_idx = np.trunc(w_idx / grid_size).astype(int), np.trunc(h_idx / grid_size).astype(int)
    y_idx, x_idx = np.meshgrid(y_idx, x_idx)
    idxs = np.stack((x_idx.flatten(), y_idx.flatten())).T
    idxs_unique, counts = np.unique(idxs, axis = 0, return_counts = True)
    pixel_sum = []
    for i in range(len(idxs_unique)//batch_size + ((len(idxs_unique)%batch_size)!=0)):
        batch = idxs_unique[i*batch_size:(i+1)*batch_size, None, :]
        idxs_repeated = idxs[None, :, :].repeat(len(batch), axis = 0)
        image_mask = (idxs_repeated == idxs_unique[i*batch_size:(i+1)*batch_size, None, :]).all(-1)
        pixel_sum.append((image.flatten()[None, :]*image_mask).sum(-1))
    pixel_sum = np.concatenate(pixel_sum)
    s[tuple(idxs_unique.T)] += pixel_sum
    return s
print(f'loop result = {get_s(image, 2)}')
print(f'vector result = {get_s_vec(image, 2)}')
%timeit get_s(image, 2)
%timeit get_s_vec(image, 2)
output:
loop result = [[10. 18. 0. 0.]
[17. 21. 0. 0.]
[ 0. 0. 0. 0.]]
vector result = [[10. 18. 0. 0.]
[17. 21. 0. 0.]
[ 0. 0. 0. 0.]]
The slowest run took 15.00 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 751 ns per loop
1000 loops, best of 5: 195 µs per loop
Does skimage.measure.block_reduce do what you want?
from skimage.measure import block_reduce
s = block_reduce(image, block_size=(grid_size, grid_size), func=np.sum)
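If pulling in scikit-image is not an option, a NumPy-only sketch of the same block sum is to reshape and sum over the within-block axes. This assumes both image dimensions are exact multiples of grid_size; block_sum is a hypothetical helper name.

```python
import numpy as np

def block_sum(image, grid_size):
    W, H = image.shape
    # Assumption: both dimensions are exact multiples of grid_size
    assert W % grid_size == 0 and H % grid_size == 0
    # Axes 1 and 3 index positions inside each block
    blocks = image.reshape(W // grid_size, grid_size, H // grid_size, grid_size)
    return blocks.sum(axis=(1, 3))

image = np.arange(16).reshape(4, 4)
result = block_sum(image, 2)
```

Note that the result has shape (W // grid_size, H // grid_size), not the zero-padded (W, H) array that get_s returns.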
I have to run the snippet shown below about 200000 times in a row and the snippet needs about 0.12585 seconds for 1000 iterations. Datapoints has a shape of (3, 2704, 64)
output = []
maxium = 0
for datapoint in datapoints:
    tmp = []
    for data in datapoint:
        maxium = max(data)
        if maxium == 0:
            tmp.append(data)
        else:
            tmp.append(data / maxium)
    output.append(tmp)
I have tried to rewrite it using map(), but this gives me an average of 0.23237 seconds per iteration. This is probably due to the multiple max(data) and list() calls.
np.asarray(list(map(lambda datapoint: list(map(lambda data: data / max(data) if max(data) > 0 else data, datapoint)), datapoints)))
Is there a possibility to optimize the code again to improve performance?
Well here's a short answer:
def bar(datapoints):
    m = np.amax(datapoints, axis=2)
    m[m == 0] = 1
    return datapoints / m[:,:,np.newaxis]
Here's an explanation of how you might have got there (it's how I did get there!):
Let's start off with some example data:
>>> x = np.array([[[1, 2, 3, 4], [11, -12, 13, -14]], [[26, 27, 28, 29], [0, 0, 0, 0]]])
Now check what you get on your original function:
def foo(datapoints):
    output = []
    maxium = 0
    for datapoint in datapoints:
        tmp = []
        for data in datapoint:
            maxium = max(data)
            if maxium == 0:
                tmp.append(data)
            else:
                tmp.append(data / maxium)
        output.append(tmp)
    return np.array(output)
The result is:
>>> foo(x)
array([[[ 0.25 , 0.5 , 0.75 , 1. ],
[ 0.84615385, -0.92307692, 1. , -1.07692308]],
[[ 0.89655172, 0.93103448, 0.96551724, 1. ],
[ 0. , 0. , 0. , 0. ]]])
Now let's try out amax:
>>> np.amax(x, axis=0)
array([[26, 27, 28, 29],
[11, 0, 13, 0]])
>>> np.amax(x, axis=2)
array([[ 4, 13],
[29, 0]])
Ah ha, looks like axis=2 is what we're after. Now we want to divide the original array by this, but only in the places where the max is non-zero. How do we divide only in some places? The answer is: we divide everywhere, but in some places we divide by 1 so it has no effect. So let's replace zeros with ones:
>>> m = np.amax(x, axis=2)
>>> m[m == 0] = 1
>>> m
array([[ 4, 13],
[29, 1]])
Finally, let's divide by this, broadcasting back over axis 2 which we took the maximum over earlier:
>>> x / m[:,:,np.newaxis]
array([[[ 0.25 , 0.5 , 0.75 , 1. ],
[ 0.84615385, -0.92307692, 1. , -1.07692308]],
[[ 0.89655172, 0.93103448, 0.96551724, 1. ],
[ 0. , 0. , 0. , 0. ]]])
Putting that all together you get bar() at the top.
Try something like this:
maximum = datapoints.max(axis=2, keepdims=True)
output = np.where(maximum==0, datapoints, datapoints/maximum)
You would see the warning "invalid value encountered in true_divide", but it should work as expected.
Update, as @ArthurTacca pointed out:
output = datapoints/np.where(maximum==0, 1, maximum)
will eliminate the warning.
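Another warning-free option, shown only as a sketch: np.divide takes `out` and `where` arguments, so the division can be skipped wherever the maximum is zero. The small datapoints array below is made-up example data, and the sketch assumes a float dtype.

```python
import numpy as np

# Made-up example data with one all-zero row
datapoints = np.array([[[1., 2., 4.], [0., 0., 0.]],
                       [[3., 6., 2.], [5., 5., 10.]]])

maximum = datapoints.max(axis=2, keepdims=True)
# Start from a copy of the input; entries where `where` is False keep this value
output = datapoints.copy()
np.divide(datapoints, maximum, out=output, where=maximum != 0)
```

Here the all-zero row is never divided at all, so no invalid-value warning is raised.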
Yes you can definitely speed this up w/ vectorized numpy operations. Here's how I would do it, if I understand what you're trying to do correctly:
import numpy as np
# I use a randomly initialized array here, replace this with your input
arr = np.random.random(size=(3, 2704, 64))
# Find max for 3rd dimension, returns array w/ shape (3, 2704)
max_arr = np.max(arr, axis=2)
# Set up divisor, returns array w/ shape (3, 2704)
divisor = np.where(max_arr == 0, 1, max_arr)
# Use expand_dims to add third dimension, returns array w/ shape (3, 2704, 1)
divisor = np.expand_dims(divisor, axis=2)
# Perform division, shape is (3, 2704, 64)
ans = np.divide(arr, divisor)
From your code, I gather that you intend to scale your data by the max of your 3rd axis, but in the event of the max being 0, forgo scaling instead. You seem to also want your output to have the same shape as your input, which explains the way you structured output and tmp. That's why I left the code snippet to end with the output in a NumPy array, but if you need it in its original form regardless, it's a simple loop to rearrange your data:
output = []
for i in ans:
    tmp = []
    for j in i:
        tmp.append(list(j))
    output.append(tmp)
For future reference, furnish your questions with more detail. It will make it easier for people to participate, and you'll increase the chance of getting your questions answered quickly!
Say I have 2 numpy 2D arrays, mins, and maxs, that will always be the same dimension as one another. I'd like to create a third array, results, that is the result of applying linspace to max and min value. Is there some "numpy"/vectorized way to do this? Example non-vectorized code is below to show results I would like.
import numpy as np
mins = np.random.rand(2,2)
maxs = np.random.rand(2,2)
# Number of elements in the linspace
x = 3
m, n = mins.shape
results = np.zeros((m, n, x))
for i in range(m):
    for j in range(n):
        min = mins[i][j]
        max = maxs[i][j]
        results[i][j] = np.linspace(min, max, num=x)
Here's one vectorized approach based on this post to cover for generic n-dim cases -
def create_ranges_nd(start, stop, N, endpoint=True):
    if endpoint==1:
        divisor = N-1
    else:
        divisor = N
    steps = (1.0/divisor) * (stop - start)
    return start[...,None] + steps[...,None]*np.arange(N)
Sample run -
In [536]: mins = np.array([[3,5],[2,4]])
In [537]: maxs = np.array([[13,16],[11,12]])
In [538]: create_ranges_nd(mins, maxs, 6)
Out[538]:
array([[[ 3. , 5. , 7. , 9. , 11. , 13. ],
[ 5. , 7.2, 9.4, 11.6, 13.8, 16. ]],
[[ 2. , 3.8, 5.6, 7.4, 9.2, 11. ],
[ 4. , 5.6, 7.2, 8.8, 10.4, 12. ]]])
As of Numpy version 1.16.0, non-scalar start and stop are now supported.
So, now you can do this:
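# array-valued start/stop need NumPy >= 1.16; compare numerically, not as strings
assert tuple(int(p) for p in np.__version__.split('.')[:2]) >= (1, 16)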
mins = np.random.rand(2,2)
maxs = np.random.rand(2,2)
# Number of elements in the linspace
x = 3
results = np.linspace(mins, maxs, num=x)
# And, if required
results = np.rollaxis(results, 0, 3)
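As a small variation, np.linspace also accepts an axis argument (added in the same 1.16.0 release), which should make the extra roll unnecessary. The sketch below reuses the integer sample arrays from the earlier sample run.

```python
import numpy as np

mins = np.array([[3, 5], [2, 4]])
maxs = np.array([[13, 16], [11, 12]])

# axis=-1 puts the samples on the last axis, so the result has shape (2, 2, 6)
results = np.linspace(mins, maxs, num=6, axis=-1)
```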
I was wondering if there is a more straight forward, more efficient way of generating a distance matrix given the H x W of the matrix, and the starting index location.
For simplicity, let's take a 3x3 matrix where the starting point is (0,0). Thus, the distance matrix to be generated is:
[[ 0. 1. 2. ]
[ 1. 1.41421356 2.23606798]
[ 2. 2.23606798 2.82842712]]
Index (0,1) is 1 distance away, while index (2,2) is 2.828 distance away.
The code I have so far is below:
def get_distances(start, height, width):
    matrix = np.zeros((height, width), dtype=np.float16)
    indexes = [(y, x) for y, row in enumerate(matrix) for x, val in enumerate(row)]
    to_points = np.array(indexes)
    start_point = np.array(start)
    # axis must be an integer (the original passed the float 1.)
    distances = np.linalg.norm(to_points - start_point, ord=2, axis=1)
    return distances.reshape((height, width))
height = 3
width = 3
start = [0,0]
distance_matrix = get_distances(start, height, width)
This is pretty efficient already, I think. But NumPy always surprises me with tricks that I would never think of, so I was wondering if one exists for this scenario. Thanks
You can use hypot() and broadcast:
import numpy as np
x = np.arange(3)
np.hypot(x[None, :], x[:, None])
or the outer method:
np.hypot.outer(x, x)
the result:
array([[ 0. , 1. , 2. ],
[ 1. , 1.41421356, 2.23606798],
[ 2. , 2.23606798, 2.82842712]])
To calculate the distance from every point on a grid to a fixed point, here (2, 2):
x, y = np.ogrid[0:3, 0:3]
np.hypot(x - 2, y - 2)
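Wrapped up with the same signature as the question's get_distances, this could look like the sketch below. It assumes start is given as (row, column), as in the question.

```python
import numpy as np

def get_distances(start, height, width):
    y0, x0 = start
    yy, xx = np.ogrid[0:height, 0:width]  # shapes (height, 1) and (1, width)
    return np.hypot(yy - y0, xx - x0)     # broadcasts to (height, width)

distance_matrix = get_distances((0, 0), 3, 3)
```

No list of index pairs is built; broadcasting the column and row vectors produces the full grid directly.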
I am a beginner at python and numpy and I need to compute the matrix logarithm for each "pixel" (i.e. x,y position) of a matrix-valued image of dimension NxMx3x3. 3x3 is the dimensions of the matrix at each pixel.
The function I have written so far is the following:
import numpy as np
def logm_img(im):
    from scipy import linalg
    dimx = im.shape[0]
    dimy = im.shape[1]
    res = np.zeros_like(im)
    for x in range(dimx):
        for y in range(dimy):
            # logm accepts a plain 2D array, so no matrix conversion is needed
            res[x, y, :, :] = linalg.logm(im[x, y, :, :])
    return res
Is it ok?
Is there a way to avoid the two nested loops ?
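NumPy can take the logarithm of a whole array without loops; just call numpy.log. Note that numpy.log works element-wise, so this is not the matrix logarithm that scipy.linalg.logm computes: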
>>> import numpy
>>> a = numpy.array(range(100)).reshape(10, 10)
>>> b = numpy.log(a)
__main__:1: RuntimeWarning: divide by zero encountered in log
>>> b
array([[ -inf, 0. , 0.69314718, 1.09861229, 1.38629436,
1.60943791, 1.79175947, 1.94591015, 2.07944154, 2.19722458],
[ 2.30258509, 2.39789527, 2.48490665, 2.56494936, 2.63905733,
2.7080502 , 2.77258872, 2.83321334, 2.89037176, 2.94443898],
[ 2.99573227, 3.04452244, 3.09104245, 3.13549422, 3.17805383,
3.21887582, 3.25809654, 3.29583687, 3.33220451, 3.36729583],
[ 3.40119738, 3.4339872 , 3.4657359 , 3.49650756, 3.52636052,
3.55534806, 3.58351894, 3.61091791, 3.63758616, 3.66356165],
[ 3.68887945, 3.71357207, 3.73766962, 3.76120012, 3.78418963,
3.80666249, 3.8286414 , 3.8501476 , 3.87120101, 3.8918203 ],
[ 3.91202301, 3.93182563, 3.95124372, 3.97029191, 3.98898405,
4.00733319, 4.02535169, 4.04305127, 4.06044301, 4.07753744],
[ 4.09434456, 4.11087386, 4.12713439, 4.14313473, 4.15888308,
4.17438727, 4.18965474, 4.20469262, 4.21950771, 4.2341065 ],
[ 4.24849524, 4.26267988, 4.27666612, 4.29045944, 4.30406509,
4.31748811, 4.33073334, 4.34380542, 4.35670883, 4.36944785],
[ 4.38202663, 4.39444915, 4.40671925, 4.41884061, 4.4308168 ,
4.44265126, 4.4543473 , 4.46590812, 4.47733681, 4.48863637],
[ 4.49980967, 4.51085951, 4.52178858, 4.53259949, 4.54329478,
4.55387689, 4.56434819, 4.57471098, 4.58496748, 4.59511985]])
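Since numpy.log is element-wise, a loop-free matrix logarithm needs a different route. One option, offered as a sketch under a strong assumption: if every 3x3 matrix is symmetric positive-definite, the matrix logarithm can be computed from a stacked eigendecomposition. The name logm_spd_stack is hypothetical, and this does not cover general (non-symmetric) matrices.

```python
import numpy as np

def logm_spd_stack(im):
    """Matrix logarithm of every 3x3 matrix in an (N, M, 3, 3) array.

    Assumption: each matrix is symmetric positive-definite.
    """
    # eigh works on stacks: w has shape (N, M, 3), v has shape (N, M, 3, 3)
    w, v = np.linalg.eigh(im)
    # log(A) = V diag(log w) V^T, applied to all pixels at once
    return np.einsum("...ik,...k,...jk->...ij", v, np.log(w), v)

# Tiny check: for a diagonal matrix the log is the log of the diagonal
im = np.broadcast_to(np.diag([1.0, np.e, np.e**2]), (2, 2, 3, 3)).copy()
res = logm_spd_stack(im)
```

For matrices that are not symmetric positive-definite, the per-pixel scipy.linalg.logm loop from the question remains the safe choice.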