Matrix multiplication over specific dimensions in tensorflow (or numpy) - python

I hope I'm not missing anything obvious here, but I've scoured the inter-webs to no avail, and finally come to ask here...
Here's a really dry and simple description of what I'd like to do:
Say I've got a tensor of shape (20, 40, 3, 5), and another tensor of shape (20, 40, 5, 7). The first two dimensions are to be kept as they are, and are purposely identical for the two tensors. The last two dimensions, on the other hand, are to be (matrix-)multiplied, matmul style, so my resulting tensor would be of shape (20, 40, 3, 7). How can that be done?
I realize I can theoretically just loop over the first two dimensions and use tf.matmul() directly, but that's an absolute no-go due to runtime, efficiency, model-trainer and GPU world-wide protests, and my conscience if that's of any weight :-).
I've unfortunately disregarded as "not what I need" the following options:
tf.tensordot would give me an output of shape (20, 40, 3, 20, 40, 7). No good.
tf.scan is only good for the first dimension if I'm reading it correctly (suitable for RNNs maybe? Not my case anyhow).
tf.matmul works for tensors of rank >= 2, but contracts the last dimension of the first tensor with the first dimension of the second. Again, not my case.
So again - how can this be done?
A numpy answer that helps me get in the right direction would also be very helpful, but I'll need a tf implementation at the end of the day.
Thanks in advance, and sorry if I'm missing something dumb.
The following is closer to what I need, but less clear and so is being written separately:
The first two dimensions are spatial dimensions of an image. The last two are actually square matrices, obtained via tf.contrib.distributions.fill_triangular, which are multiplied (along with an appropriate transpose of one of them) to obtain covariance matrices associated with each spatial coordinate. I don't know if that helps in any way, but it gives some context at the very least. Also, there might or might not be a batch dimension as well, but I'm assuming that solving the 4-D tensor case would be generalizable enough.

Posting this for future reference:
From numpy matmul doc:
If either argument is N-D, N > 2, it is treated as a stack of matrices
residing in the last two indexes and broadcast accordingly.
For arrays with more than 2 dimensions, matmul treats them as a stack of matrices and multiplies over the last two dimensions, resulting in an array of exactly the kind the OP requires.
For example:
import numpy as np
A = np.ones((1,2,1,2))
B = np.ones((1,2,2,1))
print(A.shape)
print(B.shape)
print(np.matmul(A,B).shape)
with result:
(1, 2, 1, 2)
(1, 2, 2, 1)
(1, 2, 1, 1)
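For completeness, a minimal sketch with the shapes from the question (assuming a reasonably recent TensorFlow: tf.matmul likewise treats rank > 2 tensors as stacks of matrices over the last two dimensions):

import numpy as np
import tensorflow as tf

# NumPy: matmul multiplies the last two dimensions and treats the rest as batch dims
a = np.random.rand(20, 40, 3, 5)
b = np.random.rand(20, 40, 5, 7)
print(np.matmul(a, b).shape)    # (20, 40, 3, 7)

# TensorFlow: tf.matmul behaves the same way for rank > 2 tensors
ta = tf.constant(a)
tb = tf.constant(b)
print(tf.matmul(ta, tb).shape)  # (20, 40, 3, 7)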

Related

How to concatenate tensor to another list of tensor in pytorch?

I have a tensor of shape "torch.Size([2, 2, 3])" and another tensor of shape "torch.Size([2, 1, 3])". I want a concatenated tensor of shape "torch.Size([2, 2, 6])".
For example :
a=torch.tensor([[[2,3,5],[12,13,15]],[[20,30,50],[120,130,150]]])
b=torch.tensor([[[99,99,99]],[[999,999,999]]])
I want the output as : [[[99,99,99,2,3,5],[99,99,99,12,13,15]],[[999,999,999,20,30,50],[999,999,999,120,130,150]]]
I have written an O(n²) solution using two for loops, but it is taking a lot of time with millions of calculations. Can anyone help me do this efficiently? Maybe some matrix-calculation trick for tensors?
To exactly match the example you have provided:
c = torch.cat([b.repeat([1,a.shape[1]//b.shape[1],1]),a],2)
The reasoning behind this is that the concatenate operation in pytorch (and numpy and other libraries) will complain if the dimensions of the two tensors do not match along the non-specified axes (in this case 0 and 1). Therefore, you have to repeat the tensor along the non-matching axis (axis 1, hence the second element of the repeat list) in order to make the dimensions align. Note that the solution here will only work if the middle dimension of a is evenly divisible by the middle dimension of b.
In newer versions of pytorch, this can also be done using the torch.tile() function.
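Putting this together into a runnable sketch with the tensors from the question (using expand here instead of repeat, which is equivalent for a size-1 dimension and avoids an actual copy; repeat or torch.tile would work the same way):

import torch

a = torch.tensor([[[2,3,5],[12,13,15]],[[20,30,50],[120,130,150]]])
b = torch.tensor([[[99,99,99]],[[999,999,999]]])

# broadcast b along dim 1 so its shape matches a in the non-concatenated axes,
# then concatenate along the last dimension
c = torch.cat([b.expand(-1, a.shape[1], -1), a], dim=2)
print(c.shape)  # torch.Size([2, 2, 6])
print(c)        # [[[99, 99, 99, 2, 3, 5], ...], ...]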

adding new axes to facilitate broadcast, a better way?

I am looking for a nice way to "clean up" the dimensions of two arrays which I would like to combine using broadcasting. In particular, I would like to broadcast a one-dimensional array up to the shape of a multidimensional array and then add the two arrays. My understanding of the broadcasting rules tells me that this should work fine if the last dimension of the multidimensional array matches that of the one-dimensional array. For example, arrays with shapes (3,) and (10,3) would add fine.
My problem is that, given how the array I have is built, the matching dimension happens to be the first dimension of the multidimensional array, so the broadcasting rules are not met. For reference, my one-d array has shape (3,) and the multi-dimensional array has shape (3,10,10,50).
I could correct this by reshaping the multi-dimensional array so that the compatible dimension is the last one, but I'd like to avoid this as I find the logic of reshaping tricky to follow when the different dimensions have specific meanings.
I can also add empty dimensions to the one-dimensional array until it has as many dimensions as the high-dimensional array, as in the code snippet below.
>>> import numpy as np
>>> a = np.array([[1,2],
...               [3,4],
...               [5,6]])
>>> b = np.array([10,20,30])
>>> a+b[:,None]
array([[11, 12],
       [23, 24],
       [35, 36]])
This gives me my desired output; however, in my case the high-dimensional array has 4 different axes, so I would need to add multiple empty dimensions, which starts to feel inelegant. I can do something like
b = b[tuple([slice(None)] + 3*[np.newaxis])]
and then proceed, but that doesn't seem great. More generally, one could imagine needing an arbitrary number of new axes on both sides of the original dimension and writing a helper function to generalize the above logic. Is there a nicer/clearer way to achieve this?
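One way to generalize this (a sketch, not from an answer in the thread) is to derive the number of trailing axes from the target array's ndim, either with a reshape or with np.expand_dims, which accepts a tuple of axes in NumPy 1.18 and later:

import numpy as np

b = np.array([10, 20, 30])       # shape (3,)
big = np.zeros((3, 10, 10, 50))  # shape (3, 10, 10, 50)

# append as many trailing length-1 axes as needed so b lines up with big's first axis
b_col = b.reshape(b.shape + (1,) * (big.ndim - 1))    # shape (3, 1, 1, 1)
print((big + b_col).shape)       # (3, 10, 10, 50)

# equivalently (NumPy >= 1.18), expand_dims with a tuple of axes
b_col2 = np.expand_dims(b, axis=tuple(range(1, big.ndim)))
print(np.array_equal(b_col, b_col2))  # True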

Sanity Check on Numpy Dimensions?

Just wanted to make sure I'm seeing what I'm seeing in numpy.
In english terms, I want to create an array with a height of 2, width of 3, and a depth of four.
I would assume that the shape would be (2,3,4) following logical order, but when I create the array, the shape turns out to be (4,2,3).
some = np.array([[[1,2,3],[4,5,6]], [[5,6,7],[8,9,10]],[[5,6,7],[8,9,10]],[[5,6,7],[8,9,10]]])
# some.shape = (4,2,3), not what I expect (2,3,4)
Can someone explain how the dimensions are ordered in python?
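As a quick illustration (not from an answer in the thread): the shape follows the nesting of the list you pass in, outermost level first. The array above has 4 outer blocks, each containing 2 rows of 3 values, hence (4, 2, 3); to get (2, 3, 4) the nesting has to go the other way around:

import numpy as np

# 2 blocks, each containing 3 rows of 4 values -> shape (2, 3, 4)
some = np.array([[[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [9, 10, 11, 12]],
                 [[13, 14, 15, 16],
                  [17, 18, 19, 20],
                  [21, 22, 23, 24]]])
print(some.shape)     # (2, 3, 4)
print(some[0].shape)  # (3, 4) -- indexing the first axis peels off the outermost level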

The shape of a sliced array

I have a problem with array calculations after the array has been sliced. The problem is caused by the uncertainty of the shape of the sliced array.
For example, I have a 2D array data with shape (118,3); however, when I take only the first column of data as follows, the shape I get is just (118,). The number of columns cannot be determined unless I use reshape. I do not understand why.
print shape(data), shape(data[:, 0])
The result is : (118, 3) (118,).
I have found similar question asked on stackoverflow. But it did not answer my confusion.
Giving a concrete index for a dimension reduces that dimension in the result. If you want to keep this dimension, you have to provide a one-element slice:
print data[:, 0:1].shape
results in (118, 1).
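As a small aside (not part of the original answer), indexing with a list also preserves the dimension:

import numpy as np

data = np.arange(12).reshape(4, 3)  # stand-in for the (118, 3) array

print(data[:, 0].shape)    # (4,)   -- an integer index drops the axis
print(data[:, 0:1].shape)  # (4, 1) -- a one-element slice keeps it
print(data[:, [0]].shape)  # (4, 1) -- indexing with a list keeps it too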

Writing functions that accept both 1-D and 2-D numpy arrays?

My understanding is that 1-D arrays in numpy can be interpreted as either a column-oriented vector or a row-oriented vector. For instance, a 1-D array with shape (8,) can be viewed as a 2-D array of shape (1,8) or shape (8,1) depending on context.
The problem I'm having is that the functions I write to manipulate arrays tend to generalize well in the 2-D case to handle both vectors and matrices, but not so well in the 1-D case.
As such, my functions end up doing something like this:
if arr.ndim == 1:
    # Do it this way
else:
    # Do it that way
Or even this:
# Reshape the 1-D array to a 2-D array
if arr.ndim == 1:
    arr = arr.reshape((1, arr.shape[0]))
# ... Do it the 2-D way ...
That is, I find I can generalize code to handle 2-D cases (r,1), (1,c), (r,c), but not the 1-D cases without branching or reshaping.
It gets even uglier when the function operates on multiple arrays as I would check and convert each argument.
So my question is: am I missing some better idiom? Is the pattern I've described above common to numpy code?
Also, as a related matter of API design principles, if the caller passes a 1-D array to some function that returns a new array, and the return value is also a vector, is it common practice to reshape a 2-D vector (r,1) or (1,c) back to a 1-D array or simply document that the function returns a 2-D array regardless?
Thanks
I think in general NumPy functions that require an array of shape (r,c) make no special allowance for 1-D arrays. Instead, they expect the user to either pass an array of shape (r,c) exactly, or for the user to pass a 1-D array that broadcasts up to shape (r,c).
If you pass such a function a 1-D array of shape (c,) it will broadcast to shape (1,c), since broadcasting adds new axes on the left. It can also broadcast to shape (r,c) for an arbitrary r (depending on what other array it is being combined with).
On the other hand, if you have a 1-D array, x, of shape (r,) and you need it to broadcast up to shape (r,c), then NumPy expects the user to pass an array of shape (r,1) since broadcasting will not add the new axes on the right for you.
To do that, the user must pass x[:,np.newaxis] instead of just x.
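A small illustration of both cases (a sketch, using element-wise addition as the combining operation):

import numpy as np

M = np.zeros((4, 3))                 # shape (r, c)

x = np.arange(3)                     # shape (c,)
print((M + x).shape)                 # (4, 3): x broadcasts as (1, 3), the new axis is added on the left

y = np.arange(4)                     # shape (r,)
# M + y would raise a ValueError; the new axis must be added on the right explicitly
print((M + y[:, np.newaxis]).shape)  # (4, 3)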
Regarding return values: I think it better to always return a 2-D array. If the user knows the output will be of shape (1,c), and wants a 1-D array, let her slice off the 1-D array x[0] herself.
By making the return value always the same shape, it will be easier to understand code that uses this function, since it is not always immediately apparent what the shape of the inputs are.
Also, broadcasting blurs the distinction between a 1-D array of shape (c,) and a 2-D array of shape (r,c). If your function returns a 1-D array when fed 1-D input, and a 2-D array when fed 2-D input, then your function makes the distinction strict instead of blurred. Stylistically, this reminds me of checking if isinstance(obj,type), which goes against the grain of duck-typing. Don't do it if you don't have to.
unutbu's explanation is good, but I disagree on the return dimension.
The function internal pattern depends on the type of function.
Reduce operations with an axis argument can often be written so that the number of dimensions doesn't matter.
NumPy also has an atleast_2d (and atleast_1d) function that is commonly used when you need an explicit 2d array. In statistics, I sometimes use a function like atleast_2d_cols, which reshapes 1d (r,) to 2d (r,1), for code that expects 2d input or where a 1d input is to be interpreted, and used in the linear algebra, as a column vector. (Reshaping is cheap, so this is not a problem.) A sketch of such a helper follows below.
In a third case, I might have different code paths if the lower dimensional case can be done cheaper or simpler than the higher dimensional case. (example: if 2d requires several dot products.)
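A minimal sketch of what such an atleast_2d_cols helper might look like (the name comes from the answer above; the implementation here is only an illustration):

import numpy as np

def atleast_2d_cols(x):
    # like np.atleast_2d, but a 1d input (r,) becomes a column (r, 1) instead of a row (1, r)
    x = np.asarray(x)
    if x.ndim == 1:
        x = x[:, None]
    return x

print(atleast_2d_cols([1, 2, 3]).shape)         # (3, 1)
print(atleast_2d_cols([[1, 2], [3, 4]]).shape)  # (2, 2) -- 2d input is left alone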
return dimension
I think not following the numpy convention with the return dimension can be very confusing to users for general functions. (topic specific functions can be different.)
For example, reduce operations lose one dimension.
For many other functions the output dimension matches the input dimension. I think a 1d input should have a 1d output and not an extra redundant dimension. Except for functions in linalg, I don't remember any functions that would return a redundant extra dimension. (The scalar versus 1-element array case is not always consistent.)
Stylistically this reminds me of an isinstance check:
Try going without it if you allow, for example, numpy matrices and masked arrays. You will get funny results that are not easy to debug. Although, for most numpy and scipy functions the user has to know whether the array type will work with them, since there are few isinstance checks and asarray might not always do the right thing.
As a user, I always know what kind of "array_like" I have, a list, tuple or which array subclass, especially when I use multiplication.
np.array(np.eye(3).tolist()*3)
np.matrix(range(3)) * np.eye(3)
np.arange(3) * np.eye(3)
another example: What does this do?
>>> x = np.array(tuple(range(3)), [('',int)]*3)
>>> x
array((0, 1, 2),
      dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4')])
>>> x * np.eye(3)
This question already has very good answers. Here I just want to add what I usually do (which somewhat summarizes the responses by others) when I want to write functions that accept a wide range of inputs while the operations I perform on them require a 2d row or column vector.
If I know the input is always 1d (array or list):
a. if I need a row: x = np.asarray(x)[None,:]
b. if I need a column: x = np.asarray(x)[:,None]
If the input can be either 2d (array or list) with the right shape or 1d (which needs to be converted to 2d row/column):
a. if I need a row: x = np.atleast_2d(x)
b. if I need a column: x = np.atleast_2d(np.asarray(x).T).T or x = np.reshape(x, (len(x),-1)) (the latter seems faster)
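A quick check of these recipes on a plain Python list, just to show which orientation each one produces:

import numpy as np

x = [1, 2, 3]                                  # plain 1-d input

print(np.asarray(x)[None, :].shape)            # (1, 3)  row
print(np.asarray(x)[:, None].shape)            # (3, 1)  column

print(np.atleast_2d(x).shape)                  # (1, 3)  row; 2d input passes through unchanged
print(np.atleast_2d(np.asarray(x).T).T.shape)  # (3, 1)  column
print(np.reshape(x, (len(x), -1)).shape)       # (3, 1)  column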
This is a good use for decorators
def atmost_2d(func):
    def wrapr(x):
        return func(np.atleast_2d(x)).squeeze()
    return wrapr
For example, this function will pick out the last column of its input.
@atmost_2d
def g(x):
    return x[:,-1]
But thanks to the decorator it works for:
1d:
In [46]: b
Out[46]: array([0, 1, 2, 3, 4, 5])
In [47]: g(b)
Out[47]: array(5)
2d:
In [49]: A
Out[49]:
array([[0, 1],
       [2, 3],
       [4, 5]])
In [50]: g(A)
Out[50]: array([1, 3, 5])
0d:
In [51]: g(99)
Out[51]: array(99)
This answer builds on the previous two.
