This may be an easy problem, but I could not find a practical solution. My code has the following segment involving three nested for loops. The goal is to create a specialized intensity matrix for my algorithm from both the prediction and ground_truth image matrices, as follows:
for i in range(batch):
    for j in range(img_width):
        for k in range(img_height):
            tensor = prediction[i][j][:] - prediction[i][k][:]
            extracted_intensity_pred[i][j][k] = torch.norm(tensor, 2)
            tensor = ground_truth[i][j][:] - ground_truth[i][k][:]
            extracted_intensity_ground_truth[i][j][k] = torch.norm(tensor, 2)
This nested for-loop structure slows execution down considerably. Is there a broadcasting implementation (NumPy- or PyTorch-tensor-based) that could be used instead?
First, let's clean up some notation; the trailing [:] does nothing. But first, what are the dimensions? Mostly 3D?
for i in range(batch):
    for j in range(img_width):
        for k in range(img_height):
            tensor = prediction[i, j, :] - prediction[i, k, :]
            # looks like prediction[:, :, None] - prediction[:, None, :]; making it 4D?
            extracted_intensity_pred[i, j, k] = torch.norm(tensor, 2)
            # what can torch.norm work with?
So maybe it's just:
tensor = prediction[:, :, None] - prediction[:, None, :]
extracted_intensity_pred = torch.norm(tensor, ?)
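Assuming prediction and ground_truth both have shape (batch, N, C) and, as in the loops above, j and k index the same (second) axis, a minimal sketch of the fully broadcasted version might be:

import torch

# Pairwise differences along the second axis, broadcast to (batch, N, N, C)
diff_pred = prediction[:, :, None, :] - prediction[:, None, :, :]
extracted_intensity_pred = torch.norm(diff_pred, p=2, dim=-1)   # (batch, N, N)

diff_gt = ground_truth[:, :, None, :] - ground_truth[:, None, :, :]
extracted_intensity_ground_truth = torch.norm(diff_gt, p=2, dim=-1)

If the 4D difference tensor is too large to hold in memory, torch.cdist(prediction, prediction, p=2) should give the same batched pairwise L2 distances.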
I wrote the following function, which takes as inputs three 1D arrays (namely int_array, x, and y) and a number lim. The output is a number as well.
def integrate_to_lim(int_array, x, y, lim):
    if lim >= np.max(x):
        res = 0.0
    elif lim <= np.min(x):
        res = int_array[0]
    else:
        index = np.argmax(x > lim)  # to find the first element of x larger than lim
        partial = int_array[index]
        slope = (y[index-1] - y[index]) / (x[index-1] - x[index])
        rest = (x[index] - lim) * (y[index] + (lim - x[index]) * slope / 2.0)
        res = partial + rest
    return res
Basically, outside of the limit cases lim >= np.max(x) and lim <= np.min(x), the idea is that the function finds the index of the first value of the array x larger than lim and then uses it to make some simple calculations.
In my case, however, lim can also be a fairly big 2D array (shape of roughly 2000 × 1000 elements).
I would like to rewrite it such that it makes the same calculations for the case that lim is a 2D array.
Obviously, the output should also be a 2D array of the same shape as lim.
I am having a real struggle figuring out how to vectorize it.
I would like to stick only to the numpy package.
PS: I want to vectorize my function because efficiency is important, and as I understand it, for loops are not a good choice in this regard.
Edit: my attempt
I was not aware of the function np.take, which made the task way easier.
Here is my brutal attempt, which seems to work (suggestions on how to clean it up or make the code faster are more than welcome).
def integrate_to_lim_vect(int_array, x, y, lim_mat):
    lim_mat = np.asarray(lim_mat)  # make sure that it is an array
    shape_3d = list(lim_mat.shape) + [1]
    x_3d = np.ones(shape_3d) * x  # 3-dimensional version of x
    lim_3d = np.expand_dims(lim_mat, axis=2) * np.ones(x_3d.shape)  # also 3D
    # I use np.argmax on the 3D matrices (is there a simpler way?)
    index_mat = np.argmax(x_3d > lim_3d, axis=2)
    # Silly calculations
    partial = np.take(int_array, index_mat)
    y1_mat = np.take(y, index_mat)
    y2_mat = np.take(y, index_mat - 1)
    x1_mat = np.take(x, index_mat)
    x2_mat = np.take(x, index_mat - 1)
    slope = (y1_mat - y2_mat) / (x1_mat - x2_mat)
    rest = (x1_mat - lim_mat) * (y1_mat + (lim_mat - x1_mat) * slope / 2.0)
    res = partial + rest
    # Handle the limit cases with np.select
    condlist = [lim_mat >= np.max(x), lim_mat <= np.min(x)]
    choicelist = [0.0, int_array[0]]  # should these options be a 2D matrix?
    output = np.select(condlist, choicelist, default=res)
    return output
I am aware that if the limit is larger than the maximum value in the array, np.argmax returns index zero (leading to wrong results). This is why I used np.select to check for and correct these cases.
Is it necessary to define the three-dimensional matrices x_3d and lim_3d, or is there a simpler way to find the 2D matrix of indices index_mat?
Suggestions, especially to improve the way I expanded the dimension of the arrays, are welcome.
I think you can solve this using two tricks. First, a 2d array can be easily flattened to a 1d array, and then your answers can be converted back into a 2d array with reshape.
Next, your use of argmax suggests that your array is sorted. Then you can find your full set of indices using digitize. Thus instead of a single index, you will get a complete array of indices. All the calculations you are doing are intrinsically supported as array operations in numpy, so that should not cause any problems.
You will have to specifically look at the limiting cases. If those are rare enough, then it might be okay to let the answers be derived by the default formula (they will be garbage values), and then replace them with the actual values you desire.
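A minimal sketch of that flatten/digitize approach, assuming x is sorted in ascending order (the function name integrate_to_lim_digitize is mine; the other names follow the question):

import numpy as np

def integrate_to_lim_digitize(int_array, x, y, lim_mat):
    lim_mat = np.asarray(lim_mat)
    lim_flat = lim_mat.ravel()  # 2D -> 1D
    # On a sorted x, np.digitize gives the first index with x[index] > lim,
    # matching np.argmax(x > lim) for the non-limiting cases
    index = np.digitize(lim_flat, x)
    index = np.clip(index, 1, len(x) - 1)  # keep index-1 and index valid
    partial = int_array[index]
    slope = (y[index - 1] - y[index]) / (x[index - 1] - x[index])
    rest = (x[index] - lim_flat) * (y[index] + (lim_flat - x[index]) * slope / 2.0)
    res = partial + rest
    # Overwrite the limiting cases afterwards, as suggested above
    res = np.where(lim_flat >= np.max(x), 0.0, res)
    res = np.where(lim_flat <= np.min(x), int_array[0], res)
    return res.reshape(lim_mat.shape)  # back to 2D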
So the task is to optimise a neural network with a PSO. The PSO needs a one-dimensional list of all the weights and biases, like so: [0.1, 0.244, ..., 0.214]. The NN needs an array of arrays with different dimensions, like so: [[x,y], [m,n], ... (all the hidden-layer matrices) ..., [p,q]], where x and y are the dimensions of the input layer, then come all the hidden layers, and finally p and q are the dimensions of the output layer.
I can easily flatten the array to pass it to the PSO, but I need a method that takes the modified array and reshapes it back into the same array of arrays with the same dimensions as the starting one from the NN.
The dimensions depend on the amount of neurons in a layer, we have that information from the start.
I have tried to keep track of a shapes array and create an indices array to know when to stop, but it doesn't seem to work. I am trying something with slicing now, but no cigar yet. A modification to the NN is also possible, but how would I create it so that it takes a predefined list of weights? There might be a very nice and efficient way to do it that I just haven't thought of yet. Any suggestions?
Example:
a = np.array([1,2,3])
b = np.array([7,8,9,10])
c = np.array([12,13,14,15,16])
b = b.reshape(2,2)
arr = []
arr.append(a)
arr.append(b)
arr.append(c)
This is a very simple example of the list of weights as the NN works with it: a list of multi-dimensional arrays. arr can be converted into a NumPy array of objects if necessary with np.asarray(arr).
Flattening is easy; here is how I do it (there might be a better way that doesn't need a loop; if you know one, I'd be thankful if you shared it).
Flattening:
new_arr = np.array([])
for i in range(len(arr)):
    new_arr = np.append(new_arr, arr[i].flatten())
My question is how to take new_arr and put it back together to look like arr, and whether there is a beautiful and fast way to do it.
For a single array, you can save the shape in a variable (it's just a tuple) and restore it afterwards. Try something like:
old_shape = arr.shape
# ... do the flattening here ...
new_arr = new_arr.reshape(old_shape)
new_arr = np.array([])
shapes = []
for i in range(len(arr)):
    new_arr = np.append(new_arr, arr[i].flatten())
    shapes.append(arr[i].shape)

# do whatever

restoredArray = []
offset = 0
for i in range(len(shapes)):
    s = shapes[i]
    n = np.prod(s)
    restoredArray.append(new_arr[offset:(offset + n)].reshape(s))
    offset += n
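If you prefer to avoid the offset bookkeeping, here is a possible variant (a sketch over the same arr as above): np.concatenate replaces the repeated np.append calls, and np.split at the cumulative sizes restores the pieces in one pass.

import numpy as np

shapes = [a.shape for a in arr]
flat = np.concatenate([a.ravel() for a in arr])  # 1D vector for the PSO

# ... the PSO modifies flat here ...

split_points = np.cumsum([np.prod(s) for s in shapes])[:-1]
restored = [chunk.reshape(s)
            for chunk, s in zip(np.split(flat, split_points), shapes)]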
I have a 3-dimensional NumPy array and I want to access the short diagonal elements of it. Let's say i, j, k are the three dimensions. Is it possible to access the elements where i==j or i==k or j==k, so that I can set them to a specific value?
I tried to solve this by creating a mask variable of indices. This mask variable of indices is fed to the final array, where the values of {i==j or i==k or j==k} are set to specific values. Unfortunately, this code returns only the set where {i==j==k}, because multiplying the masks ANDs the conditions instead of ORing them:
import numpy as np
N = 3
maskXY = np.eye(N).reshape(N, N, 1)
maskYZ = np.eye(N).reshape(1, N, N)
maskXZ = np.eye(N).reshape(N, 1, N)
maskIndices = maskXY * maskYZ * maskXZ
# set the values of the final array using the above mask
finalArray[maskIndices] = # specific values
Approach #1
We could create open meshes with np.ix_ using the ranged arrays covering the dimensions of the input array and then perform OR-ing among those with a very close syntax to the one described in the question, like so -
i,j,k = np.ix_(*[np.arange(r) for r in finalArray.shape])
mask = (i==j) | (i==k) | (j==k)
finalArray[mask] = # desired values
Approach #2
It seems we can also follow the posted code in the question, use boolean versions of the masks, and then perform OR-ing to get the equivalent mask, like so -
mask = (maskXY==1) | (maskYZ==1) | (maskXZ==1)
But, this involves masks that are 2D (when squeezed) and as such won't be as memory-efficient as the previous approach that dealt with 1D arrays.
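A quick sanity check, with a hypothetical small N, that both approaches build the same (N, N, N) boolean mask:

import numpy as np

N = 4
i, j, k = np.ix_(*[np.arange(N) for _ in range(3)])
mask1 = (i == j) | (i == k) | (j == k)

maskXY = np.eye(N).reshape(N, N, 1)
maskYZ = np.eye(N).reshape(1, N, N)
maskXZ = np.eye(N).reshape(N, 1, N)
mask2 = (maskXY == 1) | (maskYZ == 1) | (maskXZ == 1)

assert np.array_equal(mask1, mask2)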
I am trying to vectorize an operation with NumPy. I use it in a Python script that I have profiled; this operation is the bottleneck, so it needs to be optimized since I will run it many times.
The operation works on a data set of two parts. First, a large set (n) of 1D vectors of different lengths (with maximum length Lmax), whose elements are integers from 1 to maxvalue. The set of vectors is arranged in a 2D array, data, of size (num_samples, Lmax), with trailing elements in each row zeroed. The second part is a set of scalar floats, one associated with each vector, which I have computed and which depend on the vector's length and the integer value at each position. The set of scalars is made into a 1D array, Y, of size num_samples.
The desired operation is to form the average of Y over the n samples, as a function of (value,position along length,length).
This entire operation can be vectorized in MATLAB with the accumarray function, by using three 2D arrays of the same size as data, whose elements are the corresponding value, position, and length indices of the desired final array:
sz_Y   = num_samples;
sz_len = Lmax;
sz_pos = Lmax;
sz_val = maxvalue;
ind_len = repmat(1:sz_len, 1, sz_samples);
ind_pos = repmat(1:sz_pos, sz_samples, 1);
ind_val = data;
ind_Y   = repmat((1:sz_Y)', 1, Lmax);
copiedY = Y(ind_Y);
mask = data > 0;
finalarr = accumarray({ind_val(mask), ind_pos(mask), ind_len(mask)}, copiedY(mask), [sz_val sz_pos sz_len]) / sz_val;
I was hoping to emulate this implementation with np.bincount. However, np.bincount differs from accumarray in two relevant ways:
both arguments must be of the same 1D size, and
there is no option to choose the shape of the output array.
In the above usage of accumarray, the list of indices, {ind_val(mask),ind_pos(mask),ind_len(mask)}, is a 1D cell array of 1x3 arrays used as index tuples, while np.bincount takes only 1D scalar indices as far as I understand. I expect np.ravel may be useful, but I am not sure how to use it here to do what I want. I am coming to Python from MATLAB, and some things do not translate directly, e.g. the colon operator, which ravels in the opposite (column-major) order to NumPy's row-major ravel. So my question is: how might I use np.bincount or any other NumPy method to achieve an efficient Python implementation of this operation?
EDIT: To avoid wasting time: for these multi-D index problems with complicated index manipulation, is the recommended route to just use Cython to implement the loops explicitly?
EDIT2: Alternative Python implementation I just came up with.
Here is a heavy-RAM solution.
First, precalculate:
Using index units for length (i.e., length 1 = 0), make a 4D bool array of size (num_samples, Lmax+1, Lmax+1, maxvalue+1) holding where the conditions are satisfied for each value in Y:
ALLcond = np.zeros((num_samples, Lmax+1, Lmax+1, maxvalue+1), dtype='bool')
for l in range(Lmax+1):
    for i in range(Lmax+1):
        for v in range(maxvalue+1):
            ALLcond[:, l, i, v] = (data[:, i] == v) & (Lvec == l)
Where Lvec = np.array([len(row) for row in data]) (it needs to be a NumPy array so that Lvec == l broadcasts). Then get the indices for these using np.where and initialize a 4D float array into which you will assign the values of Y:
[ind_Y, ind_len, ind_pos, ind_val] = np.where(ALLcond)
Yval = np.zeros(np.shape(ALLcond), dtype='float')
Now, in the loop in which I have to perform the operation, I compute it with these two lines:
Yval[ind_Y, ind_len, ind_pos, ind_val] = Y[ind_Y]
Y_avg = np.sum(Yval, axis=0) / num_samples
This gives a factor of 4 or so speed up over the direct loop implementation. I was expecting more. Perhaps, this is a more tangible implementation for Python heads to digest. Any faster suggestions are welcome :)
One way is to convert the 3 "indices" to a linear index and then apply bincount. Numpy's ravel_multi_index is essentially the same as MATLAB's sub2ind. So the ported code could be something like:
shape = (Lmax+1, Lmax+1, maxvalue+1)
posvec = np.arange(1, Lmax+1)
ind_len = np.tile(Lvec[:,None], [1, Lmax])
ind_pos = np.tile(posvec, [n, 1])
ind_val = data
Y_copied = np.tile(Y[:,None], [1, Lmax])
mask = posvec <= Lvec[:,None] # fill-value independent
lin_idx = np.ravel_multi_index((ind_len[mask], ind_pos[mask], ind_val[mask]), shape)
Y_avg = np.bincount(lin_idx, weights=Y_copied[mask], minlength=np.prod(shape)) / n
Y_avg.shape = shape
This is assuming data has shape (n, Lmax), Lvec is a NumPy array, etc. You may need to adapt the code a little to get rid of off-by-one errors.
One could argue that the tile operations are not very efficient and not very "numpythonic". Something with broadcast_arrays could be nice, but I think I prefer this way:
shape = (Lmax+1, Lmax+1, maxvalue+1)
posvec = np.arange(1, Lmax+1)
mask = posvec <= Lvec[:,None]  # fill-value independent; define before use
len_idx = np.repeat(Lvec, Lvec)
pos_idx = np.broadcast_to(posvec, data.shape)[mask]
val_idx = data[mask]
Y_copied = np.repeat(Y, Lvec)
lin_idx = np.ravel_multi_index((len_idx, pos_idx, val_idx), shape)
Y_avg = np.bincount(lin_idx, weights=Y_copied, minlength=np.prod(shape)) / n
Y_avg.shape = shape
Note broadcast_to was added in Numpy 1.10.0.
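Here is a hypothetical toy run of the second variant (made-up sizes and values, just to show the expected shapes):

import numpy as np

data = np.array([[1, 2, 0],   # zero-padded rows, values in 1..maxvalue
                 [2, 0, 0]])
Lvec = np.array([2, 1])       # true length of each row
Y = np.array([1.0, 3.0])
n, Lmax = data.shape
maxvalue = 2

shape = (Lmax + 1, Lmax + 1, maxvalue + 1)
posvec = np.arange(1, Lmax + 1)
mask = posvec <= Lvec[:, None]
lin_idx = np.ravel_multi_index(
    (np.repeat(Lvec, Lvec), np.broadcast_to(posvec, data.shape)[mask], data[mask]),
    shape)
Y_avg = np.bincount(lin_idx, weights=np.repeat(Y, Lvec),
                    minlength=np.prod(shape)) / n
print(Y_avg.reshape(shape).shape)   # (4, 4, 3)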
I am trying to create a 3D array in Python using NumPy by replicating a 2D array along a third dimension. I am quite new to NumPy multidimensional arrays, and I am basically missing something important here.
In this example I am trying to make a 10x10x20 3D array using a base 2D array (10x10) by copying it 20 times.
My starting 2D array:
a = zeros((10, 10))
for i in range(0, 9):
    a[i+1, i] = 1
What I tried in order to create the 3D array:
b = zeros(20)
for i in range(0, 19):
    b[i] = a
This approach is probably stupid. So what is the correct way to construct 3D arrays from base 2D arrays?
Cheers.
Edit
Well, I was doing things wrong, probably because of my R background.
Here is how I finally did it:
b = zeros(20*10*10)
b = b.reshape((20, 10, 10))
for i in b:
    for m in range(0, 9):
        i[m+1, m] = 1
Are there any other ways to do the same?
There are many ways to construct multidimensional arrays.
If you want to construct a 3D array from given 2D arrays, you can do something like this:
import numpy
# just some 2D arrays with shape (10,20)
a1 = numpy.ones((10,20))
a2 = 2* numpy.ones((10,20))
a3 = 3* numpy.ones((10,20))
# creating 3D array with shape (3,10,20)
b = numpy.array((a1,a2,a3))
Depending on the situation there are other ways which are faster. However, as long as you use built-in constructors instead of loops you are on the fast side.
For your concrete example in the Edit, I would use numpy.tri:
c = numpy.zeros((20,10,10))
c[:] = numpy.tri(10,10,-1) - numpy.tri(10,10,-2)
Came across a similar problem...
I needed to modify a 2D array into a 3D array, like so:
(y, x) -> (y, x, 3).
Here are a couple of solutions to this problem.
Solution 1
Using the plain Python toolset:
array_3d = numpy.zeros(list(array_2d.shape) + [3], 'f')
for z in range(3):
    array_3d[:, :, z] = array_2d.copy()
Solution 2
Using the NumPy toolset:
array_3d = numpy.stack([array_2d.copy(), ]*3, axis=2)
That is what I came up with. If someone knows NumPy well enough to give a better solution, I would love to see it! This works, but I suspect there is a better way performance-wise.
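One more loop-free option (a sketch): numpy.broadcast_to returns a read-only view, so copy it if you need to write into the result afterwards.

import numpy

array_3d = numpy.broadcast_to(array_2d[:, :, None],
                              array_2d.shape + (3,)).copy()  # (y, x, 3)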