Matrix vector multiplication along array axes - python

in a current project I have a large multidimensional array of shape (I,J,K,N) and a square matrix of dim N.
I need to perform a matrix vector multiplication of the last axis of the array with the square matrix.
So the obvious solution would be:
for i in range(I):
for j in range(J):
for k in range(K):
arr[i,j,k] = mat.dot(arr[i,j,k])
but of course this is rather slow. So I also tried numpy's tensordot but had little success.
I would expect that something like:
arr = tensordot(mat,arr,axes=((0,1),(3)))
should do the trick but I get a shape mismatch error.
Has someone a better solution or knows how to correctly use tensordot?
Thank you!

This should do what your loops, but with vectorized looping:
from numpy.core.umath_tests import matrix_multiply
arr[..., np.newaxis] = matrix_multiply(mat, arr[..., np.newaxis])
matrix_multiply and its sister inner1d are hidden, undocumented, gems of numpy, although a full set of linear algebra gufuncs should see the light with numpy 1.8. matrix_multiply does matrix multiplication on the last two dimensions of its inputs, and broadcasting on the rest. The only tricky part is setting an additional dimension, so that it sees column vectors when multiplying, and adding it also on assignment back into array, so that there is no shape mismatch.

I think your for loop is wrong, and for this case dot seems to be enough:
# a is your IJKN
# b is your NN
c = dot(a, b)
Here c will be a IJKN array. If you want to sum over the last dimension to get the IJK array:
arr = dot(a,b).sum(axis=3)
BUT I'm NOT SURE IF THIS IS WHAT YOU WANT...

Related

Finding nearest pixel in defined color space - quick implementation using numpy

I have been working on a task, where I implemented median cut for image quantization – representing the whole image by only limited set of pixels. I implemented the algorithm and now I am trying to implement the part, where I assign each pixel to a representant from the set found by median cut. So, I have variable 'color_space', which is 2d ndarray of shape (n,3), where n is the number of representatives. Then I have variable 'img', which is the original image of shape (rows, columns, 3).
Now I want to find the nearest pixel (bin) for each pixel from the image based on euclidean distance. I was able to come with this solution:
for row in range(img.shape[0]):
for column in range(img.shape[1]):
img[row][column] = color_space[np.linalg.norm(color_space - img[row][column], axis=1).argmin()]
What it does is, that for each pixel from the image, it computes the vector if distances from each of the bins and then it takes the closest one.
Problem is, that this solution is quite slow and I would like to vectorize it - instead of getting vector for each pixel, I would like to get a matrix, where for example first row would be the first vector of distances computed in my code etc...
This problem could be converted into a problem, where I want to do a matrix multiplication, but instead of getting dot product of two vectors, I would get their euclidean distance. Is there some good approach to such problems? Some general solution in numpy, if we want to do 'matrix multiplication' in numpy, but the function Rn x Rn -> R does not need to be dot product, but for example euclidean distance. Of course, for the multiplication, the original image should be resized to (row*columns, 3), but that is a detail.
I have been studying the documentation and searching internet, but didn't find any good approach.
Please note that I don't want others to solve my assignment, the solution I came up with is totally ok, I am just curious whether I could speed it up as I try to learn numpy properly.
Thanks for any advices!
Below is MWE for vectorizing your problem. See comments for explanation.
import numpy
# these are just random array declaration to work with.
image = numpy.random.rand(32, 32, 3)
color_space = numpy.random.rand(10,3)
# your code. I modified it to pick indexes
result = numpy.zeros((32,32))
for row in range(image.shape[0]):
for column in range(image.shape[1]):
result[row][column] = numpy.linalg.norm(color_space - image[row][column], axis=1).argmin()
result = result.astype(numpy.int)
# here we reshape for broadcasting correctly.
image = image.reshape(1,32,32,3)
color_space = color_space.reshape(10, 1,1,3)
# compute the norm on last axis, which is RGB values
result_norm = numpy.linalg.norm(image-color_space, axis=3)
# now compute the vectorized argmin
result_vectorized = result_norm.argmin(axis=0)
print(numpy.allclose(result, result_vectorized))
Eventually, you can get the correct solution by doing color_space[result]. You may have to remove the extra dimensions that you add in color space to get correct shapes in this final operation.
I think this approach might be a bit more numpy-ish/pythonic:
import numpy as np
from typing import *
from numpy import linalg as LA
# assume color_space is defined as a constant somewhere above and is of shape (n,3)
nearest_pixel_idxs: Callable[[np.ndarray], int] = lambda rgb: return LA.norm(color_space - rgb, axis=1).argmin()
img: np.ndarray = color_space[np.apply_along_axis(nearest_pixel_idxs, 1, img.reshape((-1, 3)))]
Why this solution might be more efficient:
It relies on the parallelizable apply_along_axis function nearest_pixel_idxs() rather than the nested for-loops. This is made possible by reshaping img and thereby removing the need for double indexing.
It avoids repeated writes into color_space by only indexing into it once at the very end.
Let me know if you would like me to go into greater depth on any of this - happy to help.
You could first broadcast to get all the combinations and then calculate each norm. You could then pick the smallest from there.
a = np.array([[1,2,3],
[2,3,4],
[3,4,5]])
b = np.array([[1,2,3],
[3,4,5]])
a = np.repeat(a.reshape(a.shape[0],1,3), b.shape[0], axis = 1)
b = np.repeat(b.reshape(1,b.shape[0],3), a.shape[0], axis = 0)
np.linalg.norm(a - b, axis = 2)
Each row of the result represents the distance of the row in a to each of the representatives in b
array([[0. , 3.46410162],
[1.73205081, 1.73205081],
[3.46410162, 0. ]])
You can then use argmin to get the final results.
IMO it is better to use (what #Umang Gupta proposed) numpy's automatic broadcasting than using repeat.

House-Holder Reflection for QR Decomposition

I am trying to implement the QR decomposition via householder reflectors. While attempting this on a very simple array, I am getting weird numbers. Anyone who can tell me, also, why using the # vs * operator between vec and vec.T on the last line of the function definition gets major bonus points.
This has stumped two math/comp sci phds as of this morning.
import numpy as np
def householder(vec):
vec[0] += np.sign(vec[0])*np.linalg.norm(vec)
vec = vec/vec[0]
gamma = 2/(np.linalg.norm(vec)**2)
return np.identity(len(vec)) - gamma*(vec*vec.T)
array = np.array([1, 3 ,4])
Q = householder(array)
print(Q#array)
Output:
array([-4.06557377, -7.06557377, -6.06557377])
Where it should be:
array([5.09, 0, 0])
* is elementwise multiplication, # is matrix multiplication. Both have their uses, but for matrix calculations you most likely want the matrix product.
vec.T for an array returns the same array. A simple array only has one dimension, there is nothing to transpose. vec*vec.T just returns the elementwise squared array.
You might want to use vec=vec.reshape(-1,1) to get a proper column vector, a one-column matrix. Then vec*vec.T does "by accident" the correct thing. You might want to put the matrix multiplication operator there anyway.

Numpy: Perform Multiplication-like Addition

I wanted to define my own addition operator that takes an Nx1 vector (call it A) and a 1xN vector (B) such that the element in the i^th row and j^th column is the sum of the i^th element in A and the j^th element in B. An example is illustrated here.
I was able to write the following code for the function (and it is correct as far as I know).
def test_fn(a, b):
a_len = a.shape[0]
b_len = b.shape[1]
prod = np.array([[0]*a_len]*b_len)
for i in range(a_len):
for j in range(b_len):
prod[i, j] = a[i, 0] + b[0, j]
return prod
However, the vectors I am working with contain thousands of elements, and the function above is quite slow. I was wondering if there was a better way to approach this problem, or if there was a numpy function that could be of use. Any help would be appreciated.
According to numpy's broadcasting rules, you can use a+b to implement your own defined operator.
The first rule of broadcasting is that if all input arrays do not have the same number of dimensions, a “1” will be repeatedly prepended to the shapes of the smaller arrays until all the arrays have the same number of dimensions.
The second rule of broadcasting ensures that arrays with a size of 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension. The value of the array element is assumed to be the same along that dimension for the “broadcast” array.

Solve broadcasting error without for loop, speed up code

I may be misunderstanding how broadcasting works in Python, but I am still running into errors.
scipy offers a number of "special functions" which take in two arguments, in particular the eval_XX(n, x[,out]) functions.
See http://docs.scipy.org/doc/scipy/reference/special.html
My program uses many orthogonal polynomials, so I must evaluate these polynomials at distinct points. Let's take the concrete example scipy.special.eval_hermite(n, x, out=None).
I would like the x argument to be a matrix shape (50, 50). Then, I would like to evaluate each entry of this matrix at a number of points. Let's define n to be an a numpy array narr = np.arange(10) (where we have imported numpy as np, i.e. import numpy as np).
So, calling
scipy.special.eval_hermite(narr, matrix)
should return Hermitian polynomials H_0(matrix), H_1(matrix), H_2(matrix), etc. Each H_X(matrix) is of the shape (50,50), the shape of the original input matrix.
Then, I would like to sum these values. So, I call
matrix1 = np.sum( [scipy.eval_hermite(narr, matrix)], axis=0 )
but I get a broadcasting error!
ValueError: operands could not be broadcast together with shapes (10,) (50,50)
I can solve this with a for loop, i.e.
matrix2 = np.sum( [scipy.eval_hermite(i, matrix) for i in narr], axis=0)
This gives me the correct answer, and the output matrix2.shape = (50,50). But using this for loop slows down my code, big time. Remember, we are working with entries of matrices.
Is there a way to do this without a for loop?
eval_hermite broadcasts n with x, then evaluates Hn(x) at each point. Thus, the output shape will be the result of broadcasting n with x. So, if you want to make this work, you'll have to make n and x have compatible shapes:
import scipy.special as ss
import numpy as np
matrix = np.ones([100,100]) # example
narr = np.arange(10) # example
ss.eval_hermite(narr[:,None,None], matrix).shape # => (10, 100, 100)
But note that this might actually be faster:
out = np.zeros_like(matrix)
for n in narr:
out += ss.eval_hermite(n, matrix)
In testing, it appears to be between 5-10% faster than np.sum(...) of above.
The documentation for these functions is skimpy, and a lot of the code is compiled, so this is just based on experimentation:
special.eval_hermite(n, x, out=None)
n apparently is a scalar or array of integers. x can be an array of floats.
special.eval_hermite(np.ones(5,int)[:,None],np.ones(6)) gives me a (5,6) result. This is the same shape as what I'd get from np.ones(5,int)[:,None] * np.ones(6).
The np.ones(5,int)[:,None] is a (5,1) array, np.ones(6) a (6,), which for this purpose is equivalent of (1,6). Both can be expanded to (5,6).
So as best I can tell, broadcasting rules in these special functions is the same as for operators like *.
Since special.eval_hermite(nar[:,None,None], x) produces a (10,50,50), you just apply sum to axis 0 of that to produce the (50,50).
special.eval_hermite(nar[:,Nar,Nar], x).sum(axis=0)
Like I wrote before, the same broadcasting (and summing) rules apply for this hermite as they do for a basic operation like *.

Null matrix with constant diagonal, with same shape as another matrix

I'm wondering if there is a simple way to multiply a numpy matrix by a scalar. Essentially I want all values to be multiplied by the constant 40. This would be an nxn matrix with 40's on the diagonal, but I'm wondering if there is a simpler function to use to scale this matrix. Or how would I go about making a matrix with the same shape as my other matrix and fill in its diagonal?
Sorry if this seems a bit basic, but for some reason I couldn't find this in the doc.
If you want a matrix with 40 on the diagonal and zeros everywhere else, you can use NumPy's function fill_diagonal() on a matrix of zeros. You can thus directly do:
N = 100; value = 40
b = np.zeros((N, N))
np.fill_diagonal(b, value)
This involves only setting elements to a certain value, and is therefore likely to be faster than code involving multiplying all the elements of a matrix by a constant. This approach also has the advantage of showing explicitly that you fill the diagonal with a specific value.
If you want the diagonal matrix b to be of the same size as another matrix a, you can use the following shortcut (no need for an explicit size N):
b = np.zeros_like(a)
np.fill_diagonal(b, value)
Easy:
N = 100
a = np.eye(N) # Diagonal Identity 100x100 array
b = 40*a # Multiply by a scalar
If you actually want a numpy matrix vs an array, you can do a = np.asmatrix(np.eye(N)) instead. But in general * is element-wise multiplication in numpy.

Categories

Resources