LSTM cell input matrix dimensions - python

I'm trying to build an LSTM using just numpy to get a feel for what's going on, but I'm running into an issue with my understanding of how the LSTM matrices work. I found this image of an RNN from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
From my understanding of an RNN, x_t is multiplied with a weight matrix we will call W_xh, h_{t-1} is multiplied with a weight matrix W_hh, and the results are summed together: h_t = tanh(x_t · W_xh + h_{t-1} · W_hh).
This makes sense, as the shape of x_t is (b, d), where b is the batch size and d is the input dimensionality, and W_xh has shape (d, h), so the resulting matrix has shape (b, h).
Similarly, h_{t-1} has shape (b, h) and W_hh has shape (h, h), also resulting in a (b, h) matrix. Summing these together gives a (b, h) matrix, which is perfect, as that's the same shape as h_{t-1}.
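In numpy terms, a quick shape check of this (the dimensions here are arbitrary ones I picked, not from the post):

import numpy as np
b, d, h = 32, 10, 20
x_t = np.random.randn(b, d)
h_prev = np.random.randn(b, h)
W_xh = np.random.randn(d, h)
W_hh = np.random.randn(h, h)
h_t = np.tanh(x_t @ W_xh + h_prev @ W_hh)  # shape (b, h)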
Here is where I run into problems. Looking at this diagram, also from http://colah.github.io/posts/2015-08-Understanding-LSTMs/,
things start to not make much sense. If we look at the forget-gate equation, f_t = σ(W_f · [h_{t-1}, x_t] + b_f),
I know that stacking h_{t-1} and x_t and doing the dot product with W_f is exactly the same as splitting W_f into two smaller matrices and proceeding as we did for the RNN above; the issue comes in with the dimensions. The shape of x_t is (b, d) and the shape of h_{t-1} is (b, h). For them to be stacked on top of one another, h must equal d, which isn't always the case.
So, assuming I'm wrong about the shapes of x_t and h_{t-1}, I'm guessing we need to do the dot products before passing into the cell, so that x_t has the same shape as h_{t-1} (both being (b, h)). But even then we run into issues: by stacking the two matrices we end up with a (2b, h) matrix, and unfortunately we can't multiply this by any weight matrix to get back a (b, h) matrix. So the number of rows would grow progressively as more inputs are added, since the output h_{t+1} would have shape (2b, h).
My question is: what is wrong with my understanding? Looking at https://www.quora.com/In-LSTM-how-do-you-figure-out-what-size-the-weights-are-supposed-to-be it appears my assumption of doing the dot product before passing into the LSTM cell is correct, but for some reason that person said that stacking the two (b, h) matrices would result in a (b, h) matrix, which doesn't make much sense.
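For reference, here is a numpy sketch of the only convention I can find that makes the shapes work, namely concatenating along the feature axis so that [h_{t-1}, x_t] has shape (b, h + d) rather than (2b, anything). I am not certain this is what the diagram means:

import numpy as np
b, d, h = 32, 10, 20
x_t = np.random.randn(b, d)
h_prev = np.random.randn(b, h)
stacked = np.concatenate([h_prev, x_t], axis=1)  # shape (b, h + d)
W_f = np.random.randn(h + d, h)                  # forget-gate weights
b_f = np.zeros(h)
f_t = 1 / (1 + np.exp(-(stacked @ W_f + b_f)))   # sigmoid, shape (b, h)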

Related

How to choose the dimensions of the input matrix? Graph structure or sequence?

My data is a string. I first convert it into a graph structure whose adjacency matrix has dimension 700×700, and then I embed each character into a feature vector of dimension 28, so the graph convolution's input dimension is 700×700×28 ≈ 13.72 million. Is the dimension of my feature vector too large? I can't find any information on how to choose the dimensions. My current solution still uses the strings directly, with only one dimension, so the input matrix is only 700×28, which is much smaller, but I don't know if there are other solutions. I am a novice, thank you.

How to balance data when they look like a 3-D array?

I've got a numpy array of size (3275412, 50, 22), which represents my data reshaped for LSTM purposes, and a target vector of shape (3275412,).
I want to balance my data so that there are approximately the same number of samples with target 0 and target 1.
The way I prepared the data means that I cannot do this balancing operation before reshaping.
Firstly, I wanted to apply the make_imbalance function (see this link for details), but it only accepts 2-D arrays, so I got an error.
My question is: what's the most efficient way to do it for a 3-D array?
My thoughts: I thought about first "flattening" my 3-D array to a 2-D array by "concatenating" the second and third dimensions (but I don't know how, so please tell me), then applying make_imbalance, and then reshaping the result back to 3-D (again, I don't know how). It seems a little tricky, however...
So any help would be appreciated, either for another balancing method or for help with reshaping 3D -> 2D and vice versa.
You can use np.reshape with -1 for an unknown dimension size.
data2d = data3d.reshape(data3d.shape[0], -1)
will give you a 2d array of shape (n_samples, n_features), with the second and third dimensions merged.
data2d_new, y_new = make_imbalance(data2d, y)
After the make_imbalance call, you will get a 2d array with shape (n_samples_new, n_features), where the number of rows is "unknown", but you know the other two 'feature' dimensions of the original 3d array, so
data3d_new = data2d_new.reshape(-1, data3d.shape[1], data3d.shape[2])
will give you back the balanced 3d dataset.
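Putting it together, a self-contained sketch with dummy data; the masking step here is a hypothetical stand-in for make_imbalance, just to show the reshape round trip:

import numpy as np
data3d = np.random.randn(1000, 50, 22)
y = np.random.randint(0, 2, size=1000)
data2d = data3d.reshape(data3d.shape[0], -1)    # (1000, 1100)
keep = y == 0                                   # stand-in for make_imbalance(data2d, y)
data2d_new, y_new = data2d[keep], y[keep]
data3d_new = data2d_new.reshape(-1, data3d.shape[1], data3d.shape[2])
print(data3d_new.shape)                         # (n_samples_new, 50, 22)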

Indexing for 3 dimensional Numpy Arrays (convolutional network)

I'm trying to write a function that performs convolution, and I'm getting a little challenged trying to create the output volume using numpy. Specifically, I have an input image represented as an array of dimensions (150,150,3). Now, I want to convolve over this image with num_kernels kernels, each an array of dimension (4,4,3), and I want these kernels to move over the image with a stride of 2. My thought process has been:
(1) I'll create an output array which is comprised of taking (4,4,3) size chunks out of the input array and stretching these out into rows, and ultimately making a large matrix of these.
(2) Then, I'll create a parameter array composed of all of my (4,4,3) kernels stretched out into rows, which will also make a large matrix.
(3) Then I can dot product these matrices together and reshape the output matrix into the proper dimensions.
My rough pseudo-code start for step (1) is as follows.
import numpy as np

def convolution(image, filter_size, num_filters, stride):
    # (1) Take (filter_size, filter_size, 3) chunks out of the image,
    #     stretch each one into a row, and stack the rows into a matrix.
    rows = []
    for y in range(0, image.shape[0] - filter_size + 1, stride):
        for x in range(0, image.shape[1] - filter_size + 1, stride):
            chunk = image[y:y + filter_size, x:x + filter_size, :]
            rows.append(chunk.flatten())
    output_volume = np.array(rows)      # (num_chunks, filter_size * filter_size * 3)
    # (2) One flattened kernel per row; real weights would come from another function.
    weights = np.zeros((num_filters, filter_size * filter_size * 3))
    # (3) Dot product the two matrices; reshape afterwards as needed.
    return output_volume @ weights.T    # (num_chunks, num_filters)
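As a sanity check on the shapes, a hypothetical call (with the placeholder zero weights) would be:

out = convolution(np.random.rand(150, 150, 3), filter_size=4, num_filters=8, stride=2)
print(out.shape)  # (5476, 8): 74 x 74 chunk locations, one column per filter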
If someone could provide a specific code example of how to implement this in Python with numpy (answers to (1) and (2) would be most helpful), it would be much appreciated. Thank you!

What does the MNIST tensorflow tutorial mean with matmul flipping trick?

The tutorial on MNIST for ML Beginners, in Implementing the Regression, shows how to write the regression in a single line, followed by an explanation that mentions the use of a trick (emphasis mine):
y = tf.nn.softmax(tf.matmul(x, W) + b)
First, we multiply x by W with the expression tf.matmul(x, W). This is flipped from when we multiplied them in our equation, where we had Wx, as a small trick to deal with x being a 2D tensor with multiple inputs.
What is the trick here, and why are we using it?
Well, there's no trick here. That line basically refers to the multiplication order in a previous equation:
# Here is the order of W and x; this equation is for a single example:
y = Wx + b
# If you want to use a batch of examples, you need to change the order of multiplication instead of adding another transpose op:
y = xW + b
# hence
y = tf.matmul(x, W)
Ok, I think the main point is that if you train in batches (i.e. train with several instances of the training set at once), TensorFlow always assumes that the zeroth dimension of x indicates the number of events per batch.
Suppose you want to map a training instance of dimension M to a target instance of dimension N. You would typically do this by multiplying x (a column vector) with an N×M matrix (and, optionally, adding a bias of dimension N, also a column vector), i.e.
y = W*x + b, where y is also a column vector.
This is perfectly alright from the perspective of linear algebra. But now comes the point of training in batches, i.e. training with several training instances at once.
To understand this, it might be helpful to view x (and y) not as vectors of dimension M (and N), but as matrices of dimension M×1 (and N×1 for y).
Since TensorFlow assumes that the different training instances constituting a batch are aligned along the zeroth dimension, we get into trouble here since the zeroth dimension is occupied by the different elements of one single instance.
The trick is then to transpose the above equation (remember that transposition of a product also switches the order of the two transposed objects):
y^T = x^T * W^T + b^T
This is pretty much what has been described in short within the tutorial.
Note that y^T is now a matrix of dimension 1×N (practically a row vector), while x^T is a matrix of dimension 1×M (also a row vector), and W^T is a matrix of dimension M×N. In the tutorial, they did not write x^T or y^T, but simply defined the placeholders according to this transposed equation. The only point that is not clear to me is why they did not define b the "transposed way". I assume that the + operator automatically broadcasts b as necessary to get the correct dimensions.
The rest is now pretty easy: if you have batches of more than one instance, you just "stack" multiple of the x (1×M) matrices, say into a matrix of dimension (A×M), where A is the batch size. b will hopefully be broadcast automatically to this number of events, that is, to a matrix of dimension (A×N). If you then use
y^T = x^T * W^T + b^T,
you will get an (A×N) matrix of targets, one row for each element of the batch.
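As a minimal numpy illustration of this transposed, batched form (the dimensions here are chosen arbitrarily, not taken from the tutorial):

import numpy as np
A, M, N = 5, 784, 10           # batch size, input dim, output dim
x = np.random.randn(A, M)      # each row is one transposed training instance
W = np.random.randn(M, N)      # plays the role of W^T above
b = np.random.randn(N)         # broadcast across all A rows
y = x @ W + b                  # shape (A, N): one row of targets per instance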

How can I combine my three 2D tensors into a single 3D tensor in TensorFlow?

Hello, I am a newbie with TensorFlow and currently I am working with colour images and their PCAs.
I have extracted the PCAs in the form of "Red", "Green" and "Blue" components, and I have also computed the weights associated with each component.
After doing all of the above, I want to combine the three 2D matrices into a single 3D matrix.
For TensorFlow, that would be a 3D tensor.
def multi(h0, ppca, mu, i, scope=None):
    with tf.variable_scope(scope or "multi"):
        return (tf.matmul(ppca[:, :, 0], h0[i, :, :, 0]) + tf.reshape(mu[:, 0], [4096, 1]),
                tf.matmul(ppca[:, :, 1], h0[i, :, :, 1]) + tf.reshape(mu[:, 1], [4096, 1]),
                tf.matmul(ppca[:, :, 2], h0[i, :, :, 2]) + tf.reshape(mu[:, 2], [4096, 1]))
From the above function I get three different 2D tensors, and I want to combine them into a single 3D tensor with dimensions [4096, 1, 3].
How can I do that?
Any help is highly appreciated.
You need to concat them like this:
three_d_image = tf.concat(0, [[r], [g], [b]])
This tells TensorFlow to concat them along the first (0th) dimension and treat each tensor as a matrix.
Doing the same without the additional brackets around the r, g, b tensors would instead try to concat them into one large 2D matrix.
A clean, easy way to do it is using the tf.stack operation (tf.pack in older versions of TensorFlow); it concatenates all tensors along a new dimension. If you want the new dimension to come after all existing ones, set the axis argument to the number of dimensions of your tensors.
three_d_image = tf.stack([r,g,b], axis=2)
One solution is to add an empty dimension to each of your 2D tensors so that you have three 3D tensors of dimension [4096, 1, 1]; then you can concat these three along axis 2 with tf.concat(2, matrices), which gives you [4096, 1, 3].
A second solution is to concat along axis 1 and then reshape the result to 3D.
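For completeness, a sketch of that first route, assuming r, g, b each have shape [4096, 1] and using the older tf.concat(axis, values) argument order that the answers above use (newer TensorFlow versions take tf.concat(values, axis)):

r3 = tf.expand_dims(r, 2)                    # [4096, 1] -> [4096, 1, 1]
g3 = tf.expand_dims(g, 2)
b3 = tf.expand_dims(b, 2)
three_d_image = tf.concat(2, [r3, g3, b3])   # [4096, 1, 3]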
