2D version of numpy random choice with weighting

This relates to this earlier post: Numpy random choice of tuples
I have a 2D numpy array and want to choose from it using a 2D probability array. The only way I could think to do this was to flatten both arrays and then use integer division and modulo to convert the result back to a 2D index:
import numpy as np
# dummy data
x=np.arange(100).reshape(10,10)
# dummy probability array
p=np.zeros([10,10])
p[4:7,1:4]=1.0/9
xy=np.random.choice(x.flatten(),1,p=p.flatten())
index=[int(xy[0]//10),int(xy[0]%10)] # convert back to index
print(index)
which gives
[5, 2]
but is there a cleaner way that avoids flattening and the modulo? i.e. I could pass a list of coordinate tuples as x, but how can I then handle the weights?

I don't think it's possible to directly specify a 2D array of probabilities, so raveling should be fine. However, to get the corresponding 2D indices from the flat index, you can use np.unravel_index:
index= np.unravel_index(xy.item(), x.shape)
# (4, 2)
For multiple indices, you can just stack the result:
xy=np.random.choice(x.flatten(),3,p=p.flatten())
indices = np.unravel_index(xy, x.shape)
# (array([4, 4, 5], dtype=int64), array([1, 2, 3], dtype=int64))
np.c_[indices]
array([[4, 1],
[4, 2],
[5, 3]], dtype=int64)
where np.c_ stacks along the right hand axis and gives the same result as
np.column_stack(indices)
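As a minimal sketch of the same idea, the example below samples flat positions directly (rather than values of x) and converts them with np.unravel_index. The use of np.random.default_rng, the seed, and the variable names flat, rows, cols and pairs are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make the sketch repeatable

# Same dummy probability array as in the question
p = np.zeros((10, 10))
p[4:7, 1:4] = 1.0 / 9

# Draw flat positions 0..p.size-1 using the flattened weights,
# then map each flat position back to a (row, col) pair.
flat = rng.choice(p.size, size=5, p=p.ravel())
rows, cols = np.unravel_index(flat, p.shape)
pairs = np.column_stack((rows, cols))  # shape (5, 2), one (row, col) per draw
print(pairs)
```

Sampling positions instead of values means the approach does not rely on x happening to contain its own flat indices; x[rows, cols] then gives the chosen values.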

You could use numpy.random.randint to generate an index, for example:
# assumes p is a square array
ij = np.random.randint(p.shape[0], size=p.ndim) # size p.ndim = 2 generates 2 coords
# need to convert to tuple to index correctly
p[tuple(ij)]
>>> 0.0
You can also index multiple random values at once:
ij = np.random.randint(p.shape[0], size=(p.ndim, 5)) # get 5 values
p[tuple(ij)]
>>> array([0. , 0. , 0. , 0.11111111, 0. ])

Related

Dot product with numpy gives array with size (n, )

I am trying to get the dot product of two arrays in Python using the numpy package. The output is an array of shape (n,). It says that my array has no column dimension, even though I can see the results when I print it. Why does my array have no column dimension, and how do I fix this?
My goal is to calculate y - np.dot(x,b). The issue is that y is (124, 1) while np.dot(x,b) is (124,)
Thanks

It seems that you are trying to subtract two arrays of a different shape. Fortunately, it is off by a single additional axis, so there are two ways of handling it.
(1) You slice the y array to match the shape of the dot(x,b) array:
y = y[:,0]
print(y-np.dot(x,b))
(2) You add an additional axis on the np.dot(x,b) array:
dot = np.dot(x,b)
dot = dot[:,None]
print(y-dot)
Hope this helps
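A short sketch of why the mismatch matters and how both fixes behave. The array sizes follow the question; the random data, the 3-column x, and the names pred, wrong, resid_col and resid_flat are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((124, 3))
b = rng.random(3)          # 1-D coefficients, shape (3,)
y = rng.random((124, 1))   # column vector, shape (124, 1)

pred = x @ b               # same as np.dot(x, b); shape (124,)

# Subtracting directly does not fail: broadcasting pairs every element
# of y with every element of pred and returns a (124, 124) array.
wrong = y - pred

resid_col = y - pred[:, None]   # option (2): shape (124, 1)
resid_flat = y[:, 0] - pred     # option (1): shape (124,)
print(wrong.shape, resid_col.shape, resid_flat.shape)
```

Both options hold the same numbers; which one to use depends on whether the rest of the code expects a column vector or a flat array.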

It may depend on the dimensions of your arrays.
For example :
a = [1, 0]
b = [[4, 1], [2, 2]]
c = np.dot(a,b)
gives
array([4, 1])
and its shape is (2,)
but if you change a like :
a = [[1, 0],[1,1]]
then result is :
array([[4, 1],
[6, 3]])
and its shape is (2,2)

Add lists of numpy arrays element-wise

I've been working on an algorithm for backpropagation in neural networks. My program calculates the partial derivative of each weight with respect to the loss function, and stores it in an array. The weights at each layer are stored in a single 2d numpy array, and so the partial derivatives are stored as an array of numpy arrays, where each numpy array has a different size depending on the number of neurons in each layer.
When I want to average the array of partial derivatives after a number of training data has been used, I want to add each array together and divide by the number of arrays. Currently, I just iterate through each array and add each element together, but is there a quicker way? I could use ndarray with dtype=object but apparently, this has been deprecated.
For example, if I have the arrays:
arr1 = [ndarray([[1,1],[1,1],[1,1]]), ndarray([[2,2],[2,2]])]
arr2 = [ndarray([[3,3],[3,3],[3,3]]), ndarray([[4,4],[4,4]])]
How can I add these together to get the array:
arr3 = [ndarray([[4,4],[4,4],[4,4]]), ndarray([[6,6],[6,6]])]

You don't need to add the numbers in each array one element at a time. Instead, make use of numpy's vectorized operations with numpy.add, which adds two arrays of the same shape in a single call.
Here's some code to do just that:
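import numpy as np
# Keep each collection as a plain Python list of arrays: the inner shapes
# differ, so np.asarray on the whole nested list would be ragged.
arr1 = [np.array([[1,1],[1,1],[1,1]]), np.array([[2,2],[2,2]])]
arr2 = [np.array([[3,3],[3,3],[3,3]]), np.array([[4,4],[6,6]])]
ans = []
for first, second in zip(arr1, arr2):
    ans.append(np.add(first,second))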
Outputs:
>>> [array([[4, 4], [4, 4], [4, 4]]), array([[6, 6], [8, 8]])]
P.S
Could use a one-liner list-comprehension as well
ans = [np.add(first, second) for first, second in zip(arr1, arr2)]

You can use zip/map/sum:
import numpy as np
arr1 = [np.array([[1,1],[1,1],[1,1]]), np.array([[2,2],[2,2]])]
arr2 = [np.array([[3,3],[3,3],[3,3]]), np.array([[4,4],[4,4]])]
arr3 = list(map(sum, zip(arr1, arr2)))
output:
>>> arr3
[array([[4, 4],
[4, 4],
[4, 4]]),
array([[6, 6],
[6, 6]])]

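In NumPy, you can add two arrays element-wise with the + operator.
N.B.: the inner arrays here have different shapes, so the outer arrays must be created with dtype=object (recent NumPy versions reject ragged input otherwise). Alternatively, pad the inner arrays with zeros to a common shape.
arr1 = np.array([np.array([[1,1],[1,1],[1,1]]), np.array([[2,2],[2,2]])], dtype=object)
arr2 = np.array([np.array([[3,3],[3,3],[3,3]]), np.array([[4,4],[4,4]])], dtype=object)
arr3 = arr2 + arr1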

You can use a list comprehension:
[x + y for x, y in zip(arr1, arr2)]
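The question's actual goal is averaging over many training samples, which the same zip pattern covers. This is a sketch under the assumption that every sample produces a list of arrays with matching per-layer shapes; grads_per_sample and mean_grads are hypothetical names.

```python
import numpy as np

# One list of per-layer gradient arrays for each training sample
grads_per_sample = [
    [np.array([[1, 1], [1, 1], [1, 1]]), np.array([[2, 2], [2, 2]])],
    [np.array([[3, 3], [3, 3], [3, 3]]), np.array([[4, 4], [4, 4]])],
]

# zip(*...) groups the arrays layer by layer; within one layer all arrays
# share a shape, so np.mean can stack them and average over axis 0.
mean_grads = [np.mean(layer, axis=0) for layer in zip(*grads_per_sample)]
print(mean_grads)
```

The Python-level loop runs once per layer, not once per element, so the per-element work stays inside NumPy.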

NumPy array with largest value on diagonal and other values shuffled

I am trying to create a square NumPy (or PyTorch, since PyTorch code can be turned into NumPy with minimal effort) matrix which has the following property: given a set of values, the diagonal elements in each row have the largest value and the other values are randomly shuffled for the other positions.
For example, if I have [1, 2, 3, 4], a possible desired output is:
[[4, 3, 1, 2],
[1, 4, 3, 2],
[2, 1, 4, 3],
[2, 3, 1, 4]]
There can be (several) other possible outputs, as long as the diagonal elements are the largest value (4 in this case) and the off-diagonal elements in each row contain the other values but shuffled.
A hacky/inefficient way of doing this could be first creating a square matrix (4x4) of zeros and putting the largest value (4) in all the diagonal positions, and then traversing the matrix row by row, where for each row i, populate the elements except index i with shuffled remaining values (shuffled versions of [1, 2, 3]). This would be very slow as the matrix size increases. Is there a cleaner/faster/Pythonic way of doing it? Thank you.

First you can generate an array whose rows are each shuffled with np.random.shuffle(). Then I use a (not so easy to understand) mathematical trick to shift each row so that its largest value lands on the diagonal:
import numpy as np
from numpy.fft import fft, ifft
# First create your randomized array with np.random.shuffle()
x = np.array([[1,2,3,4],
[2,4,3,1],
[4,1,2,3],
[2,3,1,4]])
# We use np.where to find the column that holds the 4 in each row.
_,s = np.where(x==4)
# We compute the left shift that needs to be applied to each row to bring its 4 onto the diagonal
s = s-np.r_[0:x.shape[0]]
# And here is the trick: we can use the fast Fourier transform to left shift each row by a given amount:
L = np.real(ifft(fft(x,axis=1)*np.exp(2*1j*np.pi/x.shape[1]*s[:,None]*np.r_[0:x.shape[1]][None,:]),axis=1).round())
# Note that we could also use a right shift; we simply have to negate the exponent of the exponential:
# np.exp(-2*1j*np.pi...
And we obtain the following matrix:
[[4. 1. 2. 3.]
[2. 4. 3. 1.]
[2. 3. 4. 1.]
[2. 3. 1. 4.]]
No hidden for loop, only pure linear algebra.
To give you an idea, it takes only a few milliseconds for a 1000x1000 matrix on my computer and ~20s for a 10000x10000 matrix.
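For comparison, here is a sketch of a different technique that avoids the FFT entirely: shuffle the non-maximal values independently per row, then write them into the off-diagonal positions with a boolean mask. It assumes the number of values equals the matrix size, as in the question's example, and that the largest value is unique; rng.permuted requires a reasonably recent NumPy, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
values = np.array([1, 2, 3, 4])
n = values.size

largest = values.max()
others = np.delete(values, values.argmax())  # the n-1 remaining values

# One independent shuffle of `others` per row; result has shape (n, n-1)
shuffled = rng.permuted(np.tile(others, (n, 1)), axis=1)

out = np.empty((n, n), dtype=values.dtype)
off_diag = ~np.eye(n, dtype=bool)
# Boolean-mask assignment fills positions in row-major order, so row i
# receives exactly the n-1 values of shuffled[i].
out[off_diag] = shuffled.ravel()
np.fill_diagonal(out, largest)
print(out)
```

Because it only moves integers around, this variant keeps the input dtype and needs no rounding step.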

How can I multiply numpy matrix elementwise without for loops?

I would like to apply the same matrix (3x3) to a large list of points that are contained in a vector. The vector is of the form (40000 x 3). The below code does the job but it is too slow. Are there any numpy tricks I can use to eliminate the for loop and append function?
def apply_matrix_to_shape(Matrix,Points):
    """input a desired transformation and an array of points that are in
    the format np.array([[x1,y1,z1],[x2,y2,z2],...,]]). will output
    a new array of translated points with the same format"""
    New_shape = np.array([])
    M = Matrix
    for p in Points:
        New_shape = np.append(New_shape,[p[0]*M[0][0]+p[1]*M[0][1]+p[2]*M[0][2],
                                         p[0]*M[1][0]+p[1]*M[1][1]+p[2]*M[1][2],
                                         p[0]*M[2][0]+p[1]*M[2][1]+p[2]*M[2][2]])
    Rows = int(len(New_shape) / 3)
    return np.reshape(New_shape,(Rows,3))

You basically want the matrix multiplication of both arrays (not an element-wise one). You just need to transpose so the shapes are aligned, and transpose back the result:
m.dot(p.T).T
Or equivalently:
(m@p.T).T
m = np.random.random((3,3))
p = np.random.random((15,3))
np.allclose((m@p.T).T, apply_matrix_to_shape(m, p))
# True

Indeed, I think what you want is one of the main reasons NumPy came to life. You can use the dot product function and the transpose (simply .T or .transpose()):
import numpy as np
points = np.array([[1, 2, 3],
[4, 5, 6]])
T_matrix = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
result = points.dot(T_matrix.T)
print(result)
>>> [[ 14 32 50]
[ 32 77 122]]
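To tie the two answers together, the sketch below applies a 3x3 matrix to 40000 points in one call and checks a few rows against the per-point product. The random data and the names M, points, transformed, check and via_einsum are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((3, 3))            # the transformation
points = rng.random((40000, 3))   # one point per row

# Row i of the result is M @ points[i]; this equals (M @ points.T).T
transformed = points @ M.T

# Per-point reference for the first few rows
check = np.array([M @ pt for pt in points[:5]])

# The same contraction written out explicitly with einsum
via_einsum = np.einsum('ij,nj->ni', M, points)
print(np.allclose(transformed[:5], check), np.allclose(transformed, via_einsum))
```

Writing points @ M.T avoids the double transpose while producing the same array as (M @ points.T).T.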

sort numpy 2d array by indice of column

I am using numpy in Python. I have a 1D array (length n) and a 2D array. I used argsort to get the sorting indices of the 1D array. Now I want to use those indices to reorder the columns of my 2D array.
How can I do this?
For example:
>>> array1d = np.array([1, 3, 0])
>>> array2d = np.array([[1,2,3],[4,5,6]])
>>> array1d_indice = np.argsort(array1d)
array([2, 0, 1], dtype=int64)
I want to use array1d_indice to sort the columns of array2d to get:
[[3, 1, 2],
[6, 4, 5]]
Any easier way to achieve this is also welcome.

If what you mean is that you want the columns sorted based on the vector, then you use argsort on the vector:
vi = np.argsort(vector)
then to arrange the columns of array in the right order,
sorted = array[:, tuple(vi)]
to get rows, switch around the order of : and tuple(vi)
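Applied to the data from the question, a minimal sketch looks like this. The names order and by_columns are illustrative, and the index array is used directly since NumPy accepts it without converting to a tuple.

```python
import numpy as np

array1d = np.array([1, 3, 0])
array2d = np.array([[1, 2, 3],
                    [4, 5, 6]])

order = np.argsort(array1d)     # array([2, 0, 1])
by_columns = array2d[:, order]  # reorder the columns with the index array
print(by_columns)
# [[3 1 2]
#  [6 4 5]]
```

Reordering rows instead works the same way with array2d[order, :], provided the 1D array has one entry per row.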
