Elementwise multiplication of tensors of unknown dimension - python

How do I do an elementwise multiplication of tensors with the following shapes? The second array here is always assumed to be 2D.
[x, y, ?, ?, ?, ...] * [x, y]
I want to broadcast over all the dimensions marked ?, of which I don't know the number a priori. Possible solutions I have considered (but don't know how to do):
Add a variable number of axes to the second array
Reverse the order of the axes of both arrays and then reverse them back again
Any pointers would be great.

The alternatives mentioned in the question (with b the 2D array):
Add a variable number of axes to the second array
a * b.reshape(b.shape + (1,)*(a.ndim-b.ndim))
Reverse the order of the axes of both arrays and then reverse them back again
(a.T * b.T).T
Another alternative with einsum:
numpy.einsum('ij...,ij->ij...', a, b)
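As a quick sanity check, here is a minimal sketch that runs the three alternatives above side by side and confirms they agree. The shapes and the random generator seed are arbitrary choices for illustration, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 4, 5, 6))   # leading axes (x, y), then two trailing axes
b = rng.random((3, 4))         # always 2D

# 1) pad b with trailing length-1 axes so it broadcasts against a
via_reshape = a * b.reshape(b.shape + (1,) * (a.ndim - b.ndim))

# 2) reverse the axes so the shared ones come last, multiply, reverse back
via_transpose = (a.T * b.T).T

# 3) let einsum broadcast over the ellipsis axes
via_einsum = np.einsum('ij...,ij->ij...', a, b)

assert np.allclose(via_reshape, via_transpose)
assert np.allclose(via_reshape, via_einsum)
print(via_reshape.shape)  # (3, 4, 5, 6)
```

All three only rely on a.ndim, so the same lines should work for any number of trailing axes.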

Not pretty, but it works:
a = np.zeros((3, 4, 5, 6))
b = np.zeros((3, 4))
c = a*b[(slice(None), slice(None), )+(None, )*(a.ndim-2)]

Let's say the input arrays are A, B with B as the 2D array. To start off, reshape A to a 3D array with the trailing non-matching dimensions merged as one dimension, then perform the broadcasted elementwise multiplication with B and finally reshape back the product to original shape of A. The implementation would look like this -
shp = A.shape # Get shape of A
out = (A.reshape(shp[0],shp[1],-1)*B[:,:,None]).reshape(shp)
Verify output -
In [96]: A = np.random.rand(2,3,4,5,7,8,4)
In [97]: B = np.random.rand(2,3)
In [98]: shp = A.shape
...: out = (A.reshape(shp[0],shp[1],-1)*B[:,:,None]).reshape(shp)
...:
In [99]: direct_out = A*B[:,:,None,None,None,None,None]
In [100]: np.allclose(out,direct_out) # Verify
Out[100]: True

Related

Python - matrix multiplication

I have an array y with shape (n,). I want to compute the inner product matrix, which is an n * n matrix.
However, when I tried to do it in Python
np.dot(y , y)
I got the answer n, this is not what I am looking for
I have also tried:
np.dot(np.transpose(y),y)
np.dot(y, np.transpose(y))
I always get the same answer n
I think you are looking for:
np.multiply.outer(y,y)
or equally:
y = y[None,:]
y.T@y
example:
y = np.array([1,2,3])[None,:]
output:
#[[1 2 3]
# [2 4 6]
# [3 6 9]]
You can try to reshape y from shape (70,) to (70,1) before multiplying the 2 matrices.
# Reshape
y = y.reshape(70,1)
# Either below code would work
y*y.T
np.matmul(y,y.T)
One-liner?
np.dot(a[:, None], a[None, :])
transpose doesn't work on 1-D arrays, because you need at least two axes to 'swap' them. This solution adds a new axis to the array: in the first argument it looks like a column vector and has two axes; in the second argument it still looks like a row vector but has two axes.
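A small sketch of that point, using a three-element array chosen only for illustration: transposing the 1-D array changes nothing, while adding an axis on each side gives the n x n result.

```python
import numpy as np

y = np.array([1, 2, 3])

# Transposing a 1-D array is a no-op, so np.dot still gives the scalar inner product
assert y.T.shape == (3,)
inner = np.dot(y.T, y)

# Adding an axis makes a (3, 1) column and a (1, 3) row;
# their matrix product is the (3, 3) outer product
outer = np.dot(y[:, None], y[None, :])

assert np.array_equal(outer, np.outer(y, y))
print(inner)        # 14
print(outer.shape)  # (3, 3)
```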
Looks like what you need is the @ matrix multiplication operator. The dot method only computes the dot product between vectors; what you want is matrix multiplication.
>>> a = np.random.rand(70, 1)
>>> (a @ a.T).shape
(70, 70)
UPDATE:
The above answer is incorrect. dot does the same thing if the arrays are 2D. See the docs here.
np.dot computes the dot product of two arrays. Specifically,
If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).
If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a @ b is preferred.
The simplest way to do what you want is to convert the vector to a matrix first using np.matrix and then use @. Although dot can also be used, @ is better because conventionally dot is used for vectors and @ for matrices.
>>> a = np.random.rand(70)
>>> a.shape
(70,)
>>> a = np.matrix(a).T
>>> a.shape
(70, 1)
>>> (a @ a.T).shape
(70, 70)
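For comparison, a sketch of the same computation with plain ndarrays, since the NumPy documentation no longer recommends np.matrix for new code. The reshape to a column and the variable names are assumptions made for this example; it also checks that @, np.dot and np.outer agree here.

```python
import numpy as np

a = np.random.rand(70)
col = a.reshape(-1, 1)          # (70, 1) column, still a plain ndarray

via_at = col @ col.T            # matrix product of column and row
via_dot = np.dot(col, col.T)    # same thing for 2-D inputs
via_outer = np.outer(a, a)      # works directly on the 1-D array

assert via_at.shape == (70, 70)
assert np.allclose(via_at, via_dot)
assert np.allclose(via_at, via_outer)
```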

add column Numpy array python

I am very new to python and am very familiar with R, but my question is very simple using Numpy Arrays:
Observe:
I have one array X of dimension (100,2) of floating point type and I want to add a 3rd column, preferably into a new Numpy array of dimension (100,3) such that the 3rd column = col(1)^2 for every row in array of X.
My understanding is Numpy arrays are generally of fixed dimension so I'm OK with creating a new array of dim 100x3, I just don't know how to do so using Numpy arrays.
Thanks!
One way to do this is by creating a new array and then concatenating it. For instance, say that M is currently your array.
You can compute col(1)^2 as C = M[:,0] ** 2 (which I'm interpreting as column 1 squared, not column 1 to the power of the values in column two). C will now be an array with shape (100, ), so we can reshape it using C = np.expand_dims(C, 1), which will create a new axis of length 1, so our new column now has shape (100, 1). This is important because we want both of our arrays to have the same number of dimensions when concatenating them.
The last step here is to concatenate them using np.concatenate. In total, our result looks like this
C = M[:, 0] ** 2
C = np.expand_dims(C, 1)
M = np.concatenate([M, C], axis=1) # third column will now be col(1) ^ 2
If you're the kind of person who likes to do things in one line, you have:
M = np.concatenate([M, np.expand_dims(M[:, 0] ** 2, 1)], axis=1)
That being said, I would recommend looking at Pandas, it supports these actions more naturally, in my opinion. In Pandas, it would be
M["your_col_3_name"] = M["your_col_1_name"] ** 2
where M is a pandas dataframe.
Append with axis=1 should work.
a = np.zeros((5,2))
b = np.ones((5,1))
print(np.append(a,b,axis=1))
This should return:
[[0. 0. 1.]
 [0. 0. 1.]
 [0. 0. 1.]
 [0. 0. 1.]
 [0. 0. 1.]]
# generate an array with shape (100,2), fill with 2.
a = np.full((100,2),2)
# calculate the square of the first column, this will be a 1-d array.
squared=a[:,0]**2
# concatenate the 1-d array to a,
# first need to convert it to a 2-d array with shape (100,1) by reshape(-1,1)
c = np.concatenate((a,squared.reshape(-1,1)),axis=1)
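Another option, sketched here as an alternative rather than as anything the answers above used, is np.column_stack, which treats a 1-D input as a column so no explicit reshape is needed. The array X and its contents are stand-in data for illustration.

```python
import numpy as np

# Stand-in for the (100, 2) float array from the question
X = np.arange(200, dtype=float).reshape(100, 2)

# column_stack turns the 1-D squared column into a (100, 1) column for us
X3 = np.column_stack((X, X[:, 0] ** 2))

assert X3.shape == (100, 3)
assert np.allclose(X3[:, 2], X3[:, 0] ** 2)
```

The original X is left untouched; X3 is a new array, which matches the "new array of dimension (100,3)" the question asks for.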

Repmat operation in python

I want to calculate the mean of a 3D array along two axes and subtract this mean from the array.
In Matlab I use the repmat function to achieve this as follows
% A is an array of size 100x50x100
mean_A = mean(mean(A,3),1); % mean_A is 1D of length 50
Am = repmat(mean_A,[100,1,100]) % Am is 3D 100x50x100
flc_A = A - Am % flc_A is 3D 100x50x100
Now, I am trying to do the same with python.
mean_A = numpy.mean(numpy.mean(A,axis=2),axis=0);
gives me the 1D array. However, I cannot find a way to copy this to form a 3D array using numpy.tile().
Am I missing something or is there another way to do this in python?
You could set keepdims to True in both cases so the resulting shape is broadcastable and use np.broadcast_to to broadcast to the shape of A:
np.broadcast_to(np.mean(np.mean(A,2,keepdims=True),axis=0,keepdims=True), A.shape)
Note that you can also specify a tuple of axes along which to take the successive means:
np.broadcast_to(np.mean(A,axis=tuple([2,0]), keepdims=True), A.shape)
numpy.tile is not the same as Matlab's repmat. You could refer to this question. However, there is an easy way to reproduce what you have done in Matlab, and you don't really have to understand how numpy.tile works in Python.
import numpy as np
A = np.random.rand(100, 50, 100)
# keep the dims of the array when calculating mean values
B = np.mean(A, axis=2, keepdims=True)
C = np.mean(B, axis=0, keepdims=True) # now the shape of C is (1, 50, 1)
# then simply duplicate C in the first and the third dimensions
D = np.repeat(C, 100, axis=0)
D = np.repeat(D, 100, axis=2)
D is the 3D array you want.
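If the only goal is the subtraction, the repeated copy may not be needed at all: with keepdims=True the mean has shape (1, 50, 1) and broadcasting handles the rest. A minimal sketch, assuming the same 100x50x100 shape as in the question (the name flc_A is borrowed from the Matlab snippet):

```python
import numpy as np

A = np.random.rand(100, 50, 100)

# Mean over the first and third axes, kept as length-1 axes
mean_A = A.mean(axis=(0, 2), keepdims=True)   # shape (1, 50, 1)

# Broadcasting stretches mean_A to A's shape without copying it
flc_A = A - mean_A                            # shape (100, 50, 100)

assert mean_A.shape == (1, 50, 1)
assert flc_A.shape == A.shape
# After centering, the mean over those two axes is numerically zero
assert np.allclose(flc_A.mean(axis=(0, 2)), 0.0)
```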

Using an ND array to select on a dimension

How do I use an ndarray matrix to select elements of an ndarray?
Here's an example of what I mean.
a = np.arange(9)
b = np.arange(5)
c = np.arange(12)
A, B, C = np.meshgrid(a, b, c, indexing='ij')
Now, for each value of a, c, I want the b that minimizes |A+C-B|. Get the indices:
idx = np.abs(A+C-B).argmin(axis=1)
Clearly, idx has shape (9, 12) - it contains the index of b for each of the 9 a, and each of the 12 c.
Now, I would like to select the matrices with the "optimized b". That is, something along the lines of
B[:, idx, :]
that supposedly has shape (9, 1, 12) - because for each of the other combinations, it has only one value of b - the minimizing one. Now, B[:, idx, :] instead gives me the mesh of all potential combinations with shape (9, 9, 12, 12). I also tried
B[np.arange(B.shape[0]), idx, np.arange(B.shape[2])]
IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (9,) (9,12) (12,)
How do I get that specific type of matrix I described above?
You just need to add an axis there with np.newaxis/None to trigger advanced-indexing -
B[np.arange(B.shape[0])[:,None], idx, np.arange(B.shape[2])]
The idea basically is to map the rows of idx with the first indexing array of np.arange(B.shape[0]) and as such we need to add an axis there. For mapping the columns of idx, we already have np.arange(B.shape[2]) aligned along the columns of it.
Alternative to np.newaxis
Another way to add that new axis would be with reshaping. Thus, we could replace np.arange(B.shape[0])[:,None] with np.arange(B.shape[0]).reshape(-1,1).
Further optimization
We could optimize the code by using open arrays to replace the huge arrays created by meshgrid, like so -
A0, B0, C0 = np.ix_(a,b,c)
idx = np.abs(A0+C0-B0).argmin(axis=1)
Thus, get the final output, like so -
B[np.arange(len(a))[:,None], idx, np.arange(len(c))]
Just to give ourselves the idea of memory saving here -
In [47]: A.nbytes + B.nbytes + C.nbytes
Out[47]: 12960
whereas A0, B0, C0 are views into the input arrays a, b, c respectively and as such don't occupy any additional memory, i.e. absolutely free -
In [49]: np.shares_memory(a,A0)
Out[49]: True
For completeness sake, a direct way to get idx would be -
np.abs(a[:,None,None]+c-b[:,None]).argmin(axis=1)
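As a cross-check of the indexing above, here is a sketch that compares it with np.take_along_axis, which is an alternative not used in the answer. take_along_axis keeps a length-1 axis, giving the (9, 1, 12) shape the question describes. The names picked and picked_keep are made up for this example.

```python
import numpy as np

a = np.arange(9)
b = np.arange(5)
c = np.arange(12)
A, B, C = np.meshgrid(a, b, c, indexing='ij')

idx = np.abs(A + C - B).argmin(axis=1)          # shape (9, 12)

# Advanced indexing as in the answer: result has shape (9, 12)
picked = B[np.arange(B.shape[0])[:, None], idx, np.arange(B.shape[2])]

# take_along_axis needs idx to have as many dimensions as B,
# so a length-1 axis is inserted at position 1
picked_keep = np.take_along_axis(B, idx[:, None, :], axis=1)   # shape (9, 1, 12)

assert picked.shape == (9, 12)
assert picked_keep.shape == (9, 1, 12)
assert np.array_equal(picked, picked_keep[:, 0, :])
# B only holds b along axis 1, so indexing b directly gives the same values
assert np.array_equal(picked, b[idx])
```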

Pairwise product between array of vectors and array of matrices

I have a 2D array A which I am representing here as [v_1, v_2, v_3, ..., v_n].
I have a 3-d tensor B which I am representing here as [m_1, m_2, m_3, ..., m_n].
A.type = numpy.ndarray
A.shape = (300, 4)
B.type = numpy.ndarray
B.shape = (300, 4, 2)
I want to get the array C = A*B such that C = [u_1, u_2, u_3, ..., u_n] where u_i = np.dot(v_i, m_i), so that C has shape (300, 2).
How can I do this without iterating over 1 to n and using numpy.tensordot() over A and B?
You can use the np.einsum function to do that. It lets you give a letter (index) to each dimension of the arrays you supply, as a string, and uses Einstein summation notation to process them. In your case I'd say something like:
np.einsum( "ik,ikl->il", A,B )
Here I name the dimensions of A i,k (300, 4), and the dimensions of B have to be i,k and something else, e.g. l (300, 4, 2). After the arrow you specify which dimensions you want in the output. Any letter (index) you leave out after the arrow is summed over, so "ik,ikl->l" would also have summed over the 300 dimension.
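To check the einsum expression, the sketch below compares it against the explicit loop the question wants to avoid, and against a batched matrix product as a further alternative. The random data and the names C_loop and C_matmul are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((300, 4))       # n vectors v_i of length 4
B = rng.random((300, 4, 2))    # n matrices m_i of shape (4, 2)

C = np.einsum('ik,ikl->il', A, B)      # shape (300, 2)

# Reference: u_i = np.dot(v_i, m_i) computed one pair at a time
C_loop = np.stack([np.dot(v, m) for v, m in zip(A, B)])

# Same contraction as a batched matrix product: (300, 1, 4) @ (300, 4, 2)
C_matmul = (A[:, None, :] @ B)[:, 0, :]

assert C.shape == (300, 2)
assert np.allclose(C, C_loop)
assert np.allclose(C, C_matmul)
```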
