I am trying to get the Jacobian for a simple parameterization function within JAX. The code is as follows:
# imports
import jax
import jax.numpy as jnp
from jax import random, vmap, jacobian
# simple parameterization function
def reparameterize(v_params):
    theta = v_params[0] + jnp.exp(v_params[1]) * eps
    return theta
Suppose I initialize eps to be a vector of shape (3,) and v_params to be of shape (3, 2):
key = random.PRNGKey(2022)
eps = random.normal(key, shape=(3,))
key, _ = random.split(key)
v_params = random.normal(key, shape=(3, 2))
I want the Jacobian to be an array of shape (3, 2), but using
jacobian(vmap(reparameterize))(v_params)
returns an array of shape (3, 3, 3, 2). If I re-initialize with only a single eps:
key, _ = random.split(key)
eps = random.normal(key, shape=(1, ))
key, _ = random.split(key)
v_params = random.normal(key, shape=(2, ))
and call jacobian(reparameterize)(v_params), I get what I want, i.e., an array of shape (2,). In effect, looping over all eps and stacking the individual Jacobians gives me the desired Jacobian (and shape). What am I missing here? Thanks for your help!
For a function f that maps an input of shape shape_in to an output of shape shape_out, the jacobian will have shape (*shape_out, *shape_in).
In your case, vmap(reparameterize) takes an array of shape (3, 2) and returns an array of shape (3, 3), so the output of the jacobian is an array of shape (3, 3, 3, 2).
It's hard to tell from your question what computation you were intending, but if you want a jacobian the same shape as the input, you need a function that maps the input to a scalar. Perhaps the sum is what you had in mind?
result = jacobian(lambda x: vmap(reparameterize)(x).sum())(v_params)
print(result.shape)
# (3, 2)
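Alternatively, if the intent was for each row of v_params to pair with a single element of eps (that's an assumption on my part), you can get a per-row Jacobian of shape (3, 2) by passing eps explicitly and vmapping the jacobian of a scalar-output function (reparameterize_row below is a hypothetical helper):
import jax.numpy as jnp
from jax import random, vmap, jacobian
def reparameterize_row(v_params_row, eps_i):
    # v_params_row has shape (2,); eps_i is a scalar
    return v_params_row[0] + jnp.exp(v_params_row[1]) * eps_i
key = random.PRNGKey(2022)
eps = random.normal(key, shape=(3,))
key, _ = random.split(key)
v_params = random.normal(key, shape=(3, 2))
# jacobian of each scalar output w.r.t. its own (2,) row, mapped over the 3 rows
per_row_jac = vmap(jacobian(reparameterize_row))(v_params, eps)
print(per_row_jac.shape)
# (3, 2)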
I am looking for a way to reduce the length of a 1D tensor by applying a pooling operation. How can I do it? If I apply MaxPool1d, I get the error max_pool1d() input tensor must have 2 or 3 dimensions but got 1.
Here is my code:
import numpy as np
import torch
import torch.nn as nn
A = np.random.rand(768)
m = nn.MaxPool1d(4,4)
A_tensor = torch.from_numpy(A)
output = m(A_tensor)
Your initialization is fine: you've defined the first two parameters of nn.MaxPool1d, kernel_size and stride. For one-dimensional max-pooling both should be integers, not tuples.
The issue is with your input: it should be two-dimensional (the batch axis is missing):
>>> m = nn.MaxPool1d(4, 4)
>>> A_tensor = torch.rand(1, 768)
Then inference will result in:
>>> output = m(A_tensor)
>>> output.shape
torch.Size([1, 192])
I think you meant the following instead:
m = nn.MaxPool1d((4,), 4)
As mentioned in the docs, the arguments are:
torch.nn.MaxPool1d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
As you can see, there is a single kernel_size argument; it is not split into separate parameters like kernel_size1 and kernel_size2.
For posterity: the solution is to reshape the tensor using A_tensor.reshape(1, 768), so the pooling runs along the length-768 axis.
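Putting the pieces together, here is a minimal sketch of the fix, assuming the goal is simply to pool the 768-element vector down to 192 values:
import numpy as np
import torch
import torch.nn as nn
A = np.random.rand(768)
m = nn.MaxPool1d(4, 4)
# add a leading channel axis so the input has shape (1, 768)
A_tensor = torch.from_numpy(A).float().reshape(1, 768)
output = m(A_tensor)
print(output.shape)
# torch.Size([1, 192])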
I ran a conv1D on an X matrix of shape (2000, 20, 28) for a batch size of 2000, 20 time steps and 28 features.
I would like to move forward to a conv2D CNN and increase the dimensionality of my matrix to (2000, 20, 28, 10) having 10 elements for which I can build a (2000, 20, 28) X matrix. Similarly, I want to get a y array of size (2000, 10) i.e. 5 times the y array of size (2000, ) that I used to get for LSTM and Conv1D networks.
The code I used to create the 20 time-steps from input dataX, dataY, was
def LSTM_create_dataset(dataX, dataY, seq_length, step):
    Xs, ys = [], []
    for i in range(0, len(dataX) - seq_length, step):
        v = dataX.iloc[i:(i + seq_length)].values
        Xs.append(v)
        ys.append(dataY.iloc[i + seq_length])
    return np.array(Xs), np.array(ys)
I use this function within the loop I prepared to create the data of my conv2D NN :
Xconv, yconv = [], []
for ric in rics:
    dataX, dataY = get_model_data(dbInput, dbList, ric, horiz, drop_rows, triggerUp1, triggerLoss, triggerUp2 = 0)
    dataX = get_model_cleanXset(dataX, trigger)  # clean X matrix for insufficient data
    Xs, ys = LSTM_create_dataset(dataX, dataY, seq_length, step)  # slide over seq_length for a 3D matrix
    Xconv.append(Xs)
    yconv.append(ys)
I obtain a (10, 2000, 20, 28) Xconv matrix instead of the (2000, 20, 28, 10) targeted output matrix X and a (10, 2000) matrix y instead of the targeted (2000, 10).
I know that I can easily reshape yconv with yconv = np.reshape(yconv, (2000, 5)). But reshaping Xconv with Xconv = np.reshape(Xconv, (2000, 20, 28, 10)) seems hazardous, as I cannot visualize the output, and may even be erroneous.
How could I do it safely (or could you confirm my first attempt)?
Thanks a lot in advance.
If your matrix for y has shape (10, 2000), then you will not be able to reshape it to your desired (2000, 5). I've demonstrated this below.
# create array of same shape as your original y
arr_1 = np.arange(0,2000*10).reshape(10,2000)
print(arr_1.shape) # returns (10,2000)
arr_1 = arr_1.reshape(2000,5)
This raises the following error, because the total number of elements must be the same before and after the reshape.
ValueError: cannot reshape array of size 20000 into shape (2000,5)
I do not fully understand the statement that you cannot visualize the output. You could manually check that the reshape has done what you expect, for your dataset (or a small part of it, to confirm the function works correctly), using print statements as below and comparing the output to your original data and to what you expect it to look like afterwards.
import numpy as np
arr = np.arange(0,2000)
arr = arr.reshape(20,10,10,1) # reshape array to shape (20, 10, 10, 1)
# these statements let you examine the array contents at varying depths
print(arr[0][0][0])
print(arr[0][0])
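As a side note, if the intent is to move the first axis to the end rather than to reinterpret the flat element order, np.moveaxis (or a transpose) is a safer tool than reshape, since reshape keeps elements in their flat order and would scramble the samples. A small sketch, using random placeholders with the shapes from the question:
import numpy as np
Xconv = np.random.rand(10, 2000, 20, 28)  # placeholder for the stacked X data
yconv = np.random.rand(10, 2000)          # placeholder for the stacked y data
X = np.moveaxis(Xconv, 0, -1)  # (10, 2000, 20, 28) -> (2000, 20, 28, 10)
y = yconv.T                    # (10, 2000) -> (2000, 10)
print(X.shape, y.shape)
# (2000, 20, 28, 10) (2000, 10)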
I want to transform general tensors / vectors in Tensorflow, but to have a concrete example let's say rotate images.
For this I would like to have a rotation matrix R, which is learned by my network, i.e. there should be gradients computable.
How would you do this?
I found tf.contrib.image.transform, but for that it is documented that no gradients are computed with respect to the transformation parameters.
With py_func the gradients are also unavailable, or would have to be calculated by hand. Before writing a long custom solution for this (if that is even possible): are there any ready-to-use solutions?
I cannot be the first one trying to do this.
For the requested code: I just want to feed an image as input, maybe apply some convolutional layer and in the end get a 2x2 matrix representing my transformation:
conv_1 = tf.layers.conv2d(conv1, 16, [3, 3], strides=(2, 2), padding='same', activation=tf.nn.leaky_relu)
...
M = tf.contrib.layers.fully_connected(conv_n, 4, activation_fn=tf.nn.tanh)
The matrix M then describes how my indices are transformed (imagining each pixel in the image as a vector with endpoint x, y), and I move each pixel then to its new location.
In numpy I could for example do this:
indices = []
for i in range(28):
    for j in range(28):
        indices.append([i, j])
indices = np.repeat(np.expand_dims(np.asarray(indices), 0), self.batch_size, 0)
transformed = []
for b in range(self.batch_size):
    transformed.append(tf.matmul(indices[b], M[b]))
transformed = tf.stack(transformed)
transformed_img = np.zeros((self.batch_size, 28, 28))
for b in range(self.batch_size):
    transformed_img[b, transformed[b, :, :, 0].astype(np.int32), transformed[b, :, :, 1].astype(np.int32)] = input_img[b, :, :, 0]
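For reference, here is a pure-numpy restatement of the index-transformation idea above, only to make the intended computation concrete (the sizes, the identity matrices, and the clipping are placeholder assumptions; this is not a differentiable solution):
import numpy as np
batch_size, H, W = 4, 28, 28                     # hypothetical sizes
input_img = np.random.rand(batch_size, H, W, 1)  # placeholder images
M = np.tile(np.eye(2), (batch_size, 1, 1))       # one 2x2 matrix per image
# pixel coordinate grid of shape (H*W, 2)
indices = np.stack(np.meshgrid(np.arange(H), np.arange(W), indexing='ij'), -1).reshape(-1, 2)
transformed_img = np.zeros((batch_size, H, W))
for b in range(batch_size):
    # map every (i, j) coordinate through the 2x2 matrix
    new_idx = indices @ M[b]
    rows = np.clip(new_idx[:, 0].astype(np.int32), 0, H - 1)
    cols = np.clip(new_idx[:, 1].astype(np.int32), 0, W - 1)
    transformed_img[b, rows, cols] = input_img[b, indices[:, 0], indices[:, 1], 0]
print(transformed_img.shape)
# (4, 28, 28)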
I'm new to TensorFlow. I already searched for similar questions, but I couldn't understand the answers. Here is the code. Hope you can help me.
Code:
import tensorflow as tf
w1 = tf.Variable(tf.random_normal([2,3],stddev=1,seed=1))
w2 = tf.Variable(tf.random_normal([3,3],stddev=1,seed=1))
x = tf.constant([0.7,0.9])
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)
sess = tf.Session()
sess.run(w1.initializer)
sess.run(w2.initializer)
print(sess.run(y))
sess.close()
The shape of the constant x is (2,), i.e. a one-dimensional array, and you are trying to multiply it with the two-dimensional array w1 of shape (2, 3). That is not valid matrix multiplication, where the number of columns of the first operand must equal the number of rows of the second. Also, tf.matmul requires both arguments to have rank of at least 2.
One of the many ways to fix this is to change your declaration of x to
x = tf.constant([[0.7], [0.9]])
This will create a two-dimensional constant tensor of shape (2, 1). Then multiply it as
a = tf.matmul(tf.transpose(x), w1)
tf.transpose() turns x from shape (2, 1) into shape (1, 2), so the multiplication (1, 2) x (2, 3) produces a result of shape (1, 3).
Hope this helps.
In your case, the rank of the variable x is 1; hence the issue. The reason is documented in the TensorFlow API reference: https://www.tensorflow.org/api_docs/python/tf/matmul
tf.matmul(a, b, transpose_a=False, transpose_b=False, adjoint_a=False, adjoint_b=False,
          a_is_sparse=False, b_is_sparse=False, name=None)
Args:
a: Tensor of type float16, float32, float64, int32, complex64, complex128 and rank > 1.
b: Tensor with same type and rank as a.
The shape (2,) of x does not satisfy this: it has rank 1, while w1 has the rank-2 shape (2, 3).
You should change
x = tf.constant([0.7,0.9])
to
x = tf.constant([[0.7,0.9]])
Now the shape of x is (1, 2), the multiplication (1, 2) x (2, 3) works fine, and y comes out with shape (1, 3).
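Putting that together, the original snippet with only the declaration of x changed (kept in the TF1 style of the question) would look like this sketch:
import tensorflow as tf
w1 = tf.Variable(tf.random_normal([2,3],stddev=1,seed=1))
w2 = tf.Variable(tf.random_normal([3,3],stddev=1,seed=1))
# x is now rank 2 with shape (1, 2), so (1, 2) x (2, 3) -> (1, 3)
x = tf.constant([[0.7, 0.9]])
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)  # (1, 3) x (3, 3) -> (1, 3)
sess = tf.Session()
sess.run(w1.initializer)
sess.run(w2.initializer)
print(sess.run(y))  # a single row of 3 values
sess.close()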
So, I want to multiply a matrix with a matrix. When I try an array with a matrix, it works:
import tensorflow as tf
x = tf.placeholder(tf.float32, [None, 3])
W = tf.Variable(tf.ones([3, 3]))
y = tf.matmul(x, W)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    curr_y = sess.run(y, feed_dict={x: [[1,2,3],[0,4,5]]})
    print(curr_y)
So the array has batch size 2 and each element has shape 3x1, and I can multiply the 3x3 matrix with each 3x1 array. But when I again have the 3x3 weight matrix and this time feed a batch (batch size 2) of matrices instead of arrays, it's not working: multiplying a matrix with a batch of matrices fails.
import tensorflow as tf
x = tf.placeholder(tf.float32, [None, 3, 3])
W = tf.Variable(tf.ones([3, 3]))
y = tf.matmul(x, W)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    curr_y = sess.run(y, feed_dict={x: [[[1,2,3],[1,2,3]],[[1,1,4],[0,4,5]]]})
    print(curr_y)
ValueError: Shape must be rank 2 but is rank 3 for 'MatMul' (op:
'MatMul') with input shapes: [?,3,3], [3,3].
######## EDIT
Sorry, what I want to do is matmul a matrix with a batch of matrices or arrays. So I don't want to do
y = tf.matmul(x, W)
What I actually want is
y = tf.matmul(W, x)
Your input to the tensor 'x' has shape (2, 2, 3).
You're trying to do matrix multiplication of (2, 2, 3) and (3, 3). They don't have the same rank, and that's the reason for the error.
From the official TensorFlow documentation:
https://www.tensorflow.org/api_docs/python/tf/matmul
Args:
a: Tensor of type float16, float32, float64, int32, complex64, complex128 and rank > 1.
b: Tensor with same type and rank as a.
When you do matrix multiplication, the shapes of the matrices need to follow the rule
(a, b) * (b, c) = (a, c)
Keep in mind the shape of W as you defined is (3, 3).
The feed_dict={x: [[1,2,3],[0,4,5]]} in your first example is a 2D array; its shape is (2, 3):
In [67]: x = [[1, 2, 3], [0, 4, 5]]
In [68]: x = np.array(x)
In [69]: x.shape
Out[69]: (2, 3)
It follows the rule (2, 3) * (3, 3) => (2, 3)
But in your second example, the shape doesn't follow the multiplication rule. The shape of your input is (2, 2, 3), which doesn't even have the same rank as your defined W, so it won't work:
In [70]: foo = [[[1,2,3],[1,2,3]],[[1,1,4],[0,4,5]]]
In [71]: foo = np.array(foo)
In [72]: foo.shape
Out[72]: (2, 2, 3)
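As an aside, regarding the EDIT: if the goal is to apply the same (3, 3) W to every matrix in the batch, one sketch (not the only way, and assuming the batch is fed as full 3x3 matrices so it matches the placeholder) is to tile W along a batch axis so both operands of tf.matmul are rank 3:
import tensorflow as tf
x = tf.placeholder(tf.float32, [None, 3, 3])
W = tf.Variable(tf.ones([3, 3]))
# replicate W along the batch axis, then batched matmul computes y[b] = W @ x[b]
W_batched = tf.tile(tf.expand_dims(W, 0), [tf.shape(x)[0], 1, 1])
y = tf.matmul(W_batched, x)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    curr_y = sess.run(y, feed_dict={x: [[[1,2,3],[1,2,3],[1,2,3]],[[1,1,4],[0,4,5],[0,0,1]]]})
    print(curr_y.shape)
    # (2, 3, 3)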