Working with variable dimension tensors in tensorflow - python

I want to build a tensor placeholder, features, with shape, say, (10, a, a), such that features[i, :, :] can be an arbitrary square tensor. For instance, features[0, :, :] may be of dimension 5*5 while features[1, :, :] is of dimension 8*8. How can we do this with tensorflow?
I found ragged tensors for this purpose. The problem is that to feed values into a ragged tensor I have to use normal Python lists. In my case the feed list is very sparse, and there is no way to compress ragged lists.

Related

How to concatenate 2d tensors with 2 different dimensions

I was wondering if it is possible to concatenate two different pytorch tensors with different shapes.
One tensor is of shape torch.Size([247, 247]) and the other is of shape torch.Size([10, 183]). Is it possible to concatenate these using torch.cat() on dim=1?
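Not directly. For torch.cat to work, the tensors must have matching sizes in every dimension except the one along which you concatenate. With dim=1 that means dimension 0 must match, and here it is 247 versus 10. Broadcasting will not help by itself, since it can only expand dimensions of size 1, so you would first have to bring the torch.Size([10, 183]) tensor to 247 rows (for example by padding or repeating it), or do the equivalent for the other dimension.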
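As a rough illustration of the shape requirement of torch.cat, here is a minimal sketch. It assumes that zero-padding the smaller tensor up to 247 rows is acceptable; the question does not say what should fill the missing rows, so that choice is only an assumption.

```python
import torch
import torch.nn.functional as F

a = torch.zeros(247, 247)
b = torch.ones(10, 183)

# Concatenating along dim=1 needs equal sizes in dim 0; 247 vs 10 is rejected.
try:
    torch.cat([a, b], dim=1)
    direct_cat_failed = False
except RuntimeError:
    direct_cat_failed = True

# Assumption: zero rows are an acceptable filler for this use case.
# The pad tuple is read from the last dimension backwards:
# (left, right) for dim 1, then (top, bottom) for dim 0.
b_padded = F.pad(b, (0, 0, 0, a.shape[0] - b.shape[0]))

merged = torch.cat([a, b_padded], dim=1)
print(direct_cat_failed)       # True
print(tuple(b_padded.shape))   # (247, 183)
print(tuple(merged.shape))     # (247, 430)
```

If the data should be repeated rather than padded, the padding step would have to be replaced by whatever fill rule fits the problem; only the requirement that dimension 0 matches is fixed.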

How can I get a view of input as a complex tensor? RuntimeError: Tensor must have a last dimension with stride 1

I have a tensor with 64 elements in PyTorch and I want to convert it to a complex tensor with 32 elements. Order is important to me, and everything should be done in PyTorch so I can use it in my customized loss function.
The first half of my primary tensor (W) holds the real parts and the second half holds the imaginary parts, so my final tensor should look like:
W_final = tensor(W[0]+jW[32], W[1]+jW[33], W[2]+jW[34], W[3]+jW[35], ... , W[31]+jW[63])
I tried this approach:
import torch
W_1 = torch.reshape(W,(2,32)) #reshape W with shape (64) to W_1 with shape (2,32)
W_2 = torch.transpose(W_1,0,1) #transpose W_1 to W_2 with shape (32,2), so I can use view_as_complex
W_final = torch.view_as_complex(W_2)
The problem is that with transpose, the stride also changes and I get this error:
RuntimeError: Tensor must have a last dimension with stride 1
Do you know how I can deal with the stride? Or is there any way to reshape with a different order, as in numpy?
Or is there any other way to convert to complex?
It has to do with the non-contiguous memory layout of W_2 after the transpose.
To handle this error you should call .contiguous() on W_2.
From Pytorch Docs:
" Strides are a list of integers: the k-th stride represents the jump in the memory necessary to go from one element to the next one in the k-th dimension of the Tensor. This concept makes it possible to perform many tensor operations efficiently."
Once you call contiguous, the returned tensor is stored in row-major order, so its last dimension has stride 1 (for shape (32,2) the strides become (2, 1)).
Here is a working sample code:
import torch
W = torch.randn(64)
W_2 = W.view(-1,32).permute(1,0).contiguous()
W_final = torch.view_as_complex(W_2)
First call view to reshape the tensor to shape (2,32), then permute the dimensions to transpose the result to shape (32,2), and finally call contiguous.
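To see what contiguous changes, the strides can be printed before and after the call. The sketch below uses torch.arange instead of random values (so that W[k] == k and the pairing is easy to read) and also compares against torch.complex, which is a different route from the one in the answer and is shown only as a cross-check.

```python
import torch

# Assumption for readability: W[k] == k, so element k of the result is k + (k+32)j.
W = torch.arange(64, dtype=torch.float32)

transposed = W.view(2, 32).permute(1, 0)  # shape (32, 2), shares memory with W
fixed = transposed.contiguous()           # copies the data into row-major layout

print(transposed.stride())  # (1, 32): last dimension has stride 32, so view_as_complex fails
print(fixed.stride())       # (2, 1): last dimension has stride 1

W_final = torch.view_as_complex(fixed)

# Cross-check: build the same tensor directly from the real and imaginary halves.
W_alt = torch.complex(W[:32], W[32:])
same_result = torch.equal(W_final, W_alt)
print(same_result)
```

Both routes copy the data once; which one reads better is a matter of taste.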

Tensorflow: Diagonal of matrix of matrices / Diagonal of 4D tensor

Given a 4D tensor x of shape (batch_size, batch_size, seq_len, feature_dim), I want to be able to retrieve the matrices along the diagonal entries, i.e. I need a way to fetch all x[diag_entry, diag_entry, :, :] slices for the values range(batch_size) producing a tensor of shape (batch_size, seq_len, feature_dim). However, I cannot explicitly loop over range(batch_size) as batch_size may vary since I work in Keras. Does Tensorflow have functionality supporting such an operation?

Different Numpy reshaping to 3D array syntax's

I'm looking at LSTM neural networks. I saw code like this below:
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
This code is meant to change a 2D array into a 3D array, but the syntax looks off to me, or at least I don't understand it. For example, I would have expected the 3D syntax to look like the code below:
np.reshape(rows , columns, dimensions)
Could someone explain what the syntax is and what it is trying to do?
The function numpy.reshape gives a new shape to an array without changing its data. First of all, it needs to know what to reshape, which is the first argument of the function (in your case, X_train).
Then it needs to know the shape of the new array. This argument is passed as a tuple: for a 2D reshape you can pass (W,H), for three dimensions (W,H,D), for four dimensions (W,H,D,T), and so on.
You can also call reshape as a method of the array, as in X_train.reshape((W,H,D)). Since reshape is then a method of the X_train object, you do not pass the array itself, only the new shape.
It is also worth mentioning that the total number of elements in the new shape must match that of the original array. For example, your 2D X_train has X_train.shape[0] x X_train.shape[1] elements, and this value must equal W x H x D.
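The points above can be checked on a small array. This is a minimal sketch in which X_train is a hypothetical 4 x 3 array standing in for the real training data.

```python
import numpy as np

# Hypothetical stand-in for the training data: 4 samples, 3 time steps.
X_train = np.arange(12).reshape(4, 3)
new_shape = (X_train.shape[0], X_train.shape[1], 1)

as_function = np.reshape(X_train, new_shape)  # array is the first argument
as_method = X_train.reshape(new_shape)        # array is implied, only the shape is passed

print(as_function.shape)                        # (4, 3, 1)
print(np.array_equal(as_function, as_method))   # True
print(X_train.size == as_function.size)         # True: 4 * 3 == 4 * 3 * 1

# A shape with a different number of elements is rejected.
try:
    X_train.reshape((5, 3, 1))
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)                        # True
```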

How to understand the trick in MNIST experiments using tensorflow?

I am a beginner with Tensorflow and a bit confused by the tutorial. The author first gives the formula y=softmax(Wx+b), but uses xW+b in the Python code and explains that it is a small trick. I do not understand the trick: why does the author need to flip the formula?
https://www.tensorflow.org/get_started/mnist/beginners
First, we multiply x by W with the expression tf.matmul(x, W). This is
flipped from when we multiplied them in our equation, where we had Wx,
as a small trick to deal with x being a 2D tensor with multiple
inputs. We then add b, and finally apply tf.nn.softmax.
As you can see from the formula,
y=softmax(Wx + b)
the weight matrix W multiplies the input x from the left, but in the doc
y = tf.nn.softmax(tf.matmul(x, W) + b)
x is multiplied by W instead, for calculation convenience, so W must be flipped from 10*784 to 784*10 to stay consistent with the formula.
In machine learning in general, and tensorflow in particular, you want your first dimension to represent your batch. The trick is only a way of ensuring that without transposing everything before and after each matrix multiplication.
x is not really a column vector of features, but a 2D matrix of shape (batch_size, n_features).
If you keep Wx, then you'll have to transpose x (to x' of shape (n_features, batch_size)) and use W of shape (n_outputs, n_features). Wx' will then be of shape (n_outputs, batch_size), so you'll have to transpose it back to (batch_size, n_outputs), which is what you want in the end.
If you're using tf.matmul(x, W), then W is of shape (n_features, n_outputs), and the result is directly of shape (batch_size, n_outputs).
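The shape bookkeeping can be verified with plain NumPy, where the @ operator plays the role of tf.matmul. This is only a sketch: the batch size of 5 and the random values are arbitrary assumptions, and 784 and 10 are the MNIST sizes mentioned above.

```python
import numpy as np

batch_size, n_features, n_outputs = 5, 784, 10
rng = np.random.default_rng(0)

x = rng.normal(size=(batch_size, n_features))      # one sample per row
W_code = rng.normal(size=(n_features, n_outputs))  # W as used in matmul(x, W)
W_math = W_code.T                                  # W as used in the formula Wx

direct = x @ W_code              # (batch_size, n_outputs) in one step
roundabout = (W_math @ x.T).T    # transpose in, multiply, transpose back

print(direct.shape)                      # (5, 10)
print((W_math @ x.T).shape)              # (10, 5)
print(np.allclose(direct, roundabout))   # True
```

Both paths give the same numbers; keeping the batch in the first dimension simply avoids the two extra transposes.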
I agree this is not clear at first.
x being a 2D tensor with multiple inputs
is a very succinct way to tell you that in tensorflow, data is stored in tensors following conventions that are not those of linear algebra.
In particular, the outermost dimension (i.e. the first axis, which indexes the rows of a matrix) is always the sample dimension: that is, its size equals your number of samples.
When you store sample features in a 2D tensor (a matrix), the features are therefore stored along the innermost dimension, i.e. each row holds the features of one sample. That is, tensor x is the transpose of variable $x$ in the equation, and so are W and b. The identity x.T*W.T = (W.x).T explains why the order of the multiplication is swapped between the linear algebra equation and its tensor implementation.
