2D convolution with reparameterization using just tf.nn.convolution - python

I want to do something like what tfp.layers.Conv2DReparameterization does but simpler - no priors etc.
Given an augmented input x of shape [num_particles, batch, in_height, in_width, in_channels] and a filter with mean f_mean and standard deviation f_std, both of shape [filter_height, filter_width, in_channels, out_channels] and both trainable variables, I use the reparameterization trick to get filter samples:
filter_samples = f_mean + f_std * tf.random_normal([num_particles] + f_mean.shape.as_list())
Thus, filter_samples is of shape [num_particles, filter_height, filter_width, in_channels, out_channels].
Then, I want to do:
output = tf.nn.conv2d(x, filter_samples, padding='SAME') # or VALID
where output should be of shape [num_particles] + standard convolution output shape.
For dense layers, it works to just do a tf.matmul(x, filter_samples), but for conv2d I'm not sure about the results and I can't find the implementation code to check it. Implementing it myself would end up slower than TF code, so I want to avoid it.
For SAME padding the resulting shape seems okay; for VALID the batch dimension is changed, making me believe it doesn't work as I expect.
Just to make it clear, I need the output to have the num_particles dim. Code is TF1.x
Any ideas on how to get that?

I think there is some code that does something similar in tfp.experimental.nn. We can follow up in the GitHub issues you filed/responded to.
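If all you need in TF1.x is a per-particle output without priors, one workaround is to map a standard tf.nn.conv2d over the particle axis. This is only a minimal sketch under the shapes given in the question (the function name and the stride of 1 are my own choices), not the tfp implementation:

import tensorflow as tf

def particle_conv2d(x, filter_samples, padding='SAME'):
    # x: [num_particles, batch, in_height, in_width, in_channels]
    # filter_samples: [num_particles, filter_height, filter_width, in_channels, out_channels]
    def one_particle(args):
        xi, fi = args  # xi: [batch, H, W, in_ch]; fi: [fh, fw, in_ch, out_ch]
        return tf.nn.conv2d(xi, fi, strides=[1, 1, 1, 1], padding=padding)
    # Map over the particle axis; output: [num_particles, batch, out_h, out_w, out_channels]
    return tf.map_fn(one_particle, (x, filter_samples), dtype=x.dtype)

tf.map_fn runs a while_loop over the particles, so it is slower than a single fused convolution, but it keeps the num_particles dimension and works for both SAME and VALID padding.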

Related

Softmax Output Layer. Which dimension?

I have a question regarding neural nets used for image segmentation. I am using a 3D implementation of Deeplab that can be found here.
I am using softmax, so the output layer is the following:
elif self.last_activation.lower() == 'softmax':
    output = nn.Softmax()(output)
No dimension is defined, so I want to set it manually, but I am not sure which dimension I need to set. The shape of the output tensor is the following:
[batch_size, num_classes, width, height, depth]
So I would think that dim=1 would be correct. Is that correct?
Thanks!
Indeed it should be 1, as you want this axis to sum to 1.
Be careful if you need to train your network with a cross-entropy loss, as the latter already includes a softmax.
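A minimal sketch (PyTorch; the shapes are illustrative) of what dim=1 does for an output laid out as [batch_size, num_classes, width, height, depth]:

import torch
import torch.nn as nn

logits = torch.randn(2, 4, 8, 8, 8)   # [batch_size, num_classes, width, height, depth]
probs = nn.Softmax(dim=1)(logits)     # normalize over the class axis
print(probs.sum(dim=1))               # all ones: per-voxel class probabilities sum to 1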

Use tf.gradients or tf.hessians on flattened parameter tensor

Let's say I want to compute the Hessian of a scalar-valued function with respect to some parameters W (e.g the weights and biases of a feed-forward neural network).
If you consider the following code, implementing a two-dimensional linear model trained to minimize a MSE loss:
import numpy as np
import tensorflow as tf
x = tf.placeholder(dtype=tf.float32, shape=[None, 2]) #inputs
t = tf.placeholder(dtype=tf.float32, shape=[None,]) #labels
W = tf.Variable(np.eye(2), dtype=tf.float32) #weights
preds = tf.matmul(x, W) #linear model
loss = tf.reduce_mean(tf.square(preds-t), axis=0) #mse loss
params = tf.trainable_variables()
hessian = tf.hessians(loss, params)
you'd expect session.run(hessian, feed_dict={...}) to return a 2x2 matrix (equal to W). It turns out that because params is a 2x2 tensor, the output is rather a tensor with shape [2, 2, 2, 2]. While I can easily reshape the tensor to obtain the matrix I want, it seems that this operation might be extremely cumbersome when params becomes a list of tensors of varying sizes (i.e. when the model is a deep neural network, for instance).
It seems that there are two ways around this:
Flatten params to be a 1D tensor called flat_params:
flat_params = tf.concat([tf.reshape(p, [-1]) for p in params], axis=0)
so that tf.hessians(loss, flat_params) naturally returns a 2x2 matrix. However, as noted in Why does Tensorflow Reshape tf.reshape() break the flow of gradients? for tf.gradients (it also holds for tf.hessians), TensorFlow is not able to see the symbolic link in the graph between params and flat_params, so tf.hessians(loss, flat_params) will raise an error because the gradients will be seen as None.
In https://afqueiruga.github.io/tensorflow/2017/12/28/hessian-mnist.html, the author goes the other way: he first creates the flat parameter vector and then reshapes its parts into self.params. This trick does work and gets you the Hessian with its expected shape (a 2x2 matrix). However, it seems to me that this will be cumbersome to use when you have a complex model, and impossible to apply if you create your model via built-in functions (like tf.layers.dense, ...).
Is there no straightforward way to get the Hessian matrix (the 2x2 matrix in this example) from tf.hessians when self.params is a list of tensors of arbitrary shapes? If not, how can you automate the reshaping of the output tensor of tf.hessians?
It turns out (as of TensorFlow r1.13) that if len(xs) > 1, then tf.hessians(ys, xs) returns tensors corresponding only to the block-diagonal submatrices of the full Hessian matrix. Full story and solutions in this paper https://arxiv.org/pdf/1905.05559, and code at https://github.com/gknilsen/pyhessian
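For reference, here is a minimal sketch of the "flat parameter first" trick described in the question, for a tiny linear model (the sizes and variable names are illustrative); because the model is built from slices of a single 1-D variable, tf.hessians returns the full matrix in one piece:

import tensorflow as tf

n_in, n_out = 2, 1
n_params = n_in * n_out + n_out                    # weights + biases
flat_params = tf.Variable(tf.zeros([n_params]))    # single trainable 1-D vector

W = tf.reshape(flat_params[:n_in * n_out], [n_in, n_out])
b = tf.reshape(flat_params[n_in * n_out:], [n_out])

x = tf.placeholder(tf.float32, [None, n_in])
t = tf.placeholder(tf.float32, [None, n_out])
preds = tf.matmul(x, W) + b
loss = tf.reduce_mean(tf.square(preds - t))

hessian = tf.hessians(loss, flat_params)[0]        # shape [n_params, n_params]

The price is exactly what the question points out: the model has to be assembled from slices of flat_params, which rules out stock constructors like tf.layers.dense.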

Why tf.pad return shape [?,?,?]

I tried to do some customized padding before feeding into a conv1D net, as follows.
x=tf.placeholder("float",[None,50,1])
padding=tf.constant([[0,0],[5,0],[0,0]])
y=tf.pad(x,padding)
However, after the above manipulation, y is a tensor of shape (?, ?, ?), so when feeding it to tf.layers.conv1d I get the error "The channel dimension of the inputs should be defined. Found 'None'".
My question is: why does the result of tf.pad have a fully unknown shape? It should not be hard to calculate; my guess is that it is only computed at run time, but that is not convenient, right? And can I use reshape before passing it to conv1d?
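To the last point: yes, restoring a static shape before conv1d is a reasonable workaround. A minimal sketch (the filter count and kernel size are illustrative):

import tensorflow as tf

x = tf.placeholder("float", [None, 50, 1])
padding = tf.constant([[0, 0], [5, 0], [0, 0]])
y = tf.pad(x, padding)
y = tf.reshape(y, [-1, 55, 1])                         # 50 + 5 padded steps, 1 channel
out = tf.layers.conv1d(y, filters=8, kernel_size=3)    # channel dimension is now defined

Alternatively, y.set_shape([None, 55, 1]) adds the same static shape information without inserting a reshape op.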

Convert Tensorflow conv2d_transpose to Caffe Deconvolution

I've been trying to convert a TensorFlow model to Caffe which contains conv2d_transpose layers to upscale an image. For the TF layer, I constructed the kernel shape as (3, 3, X, X), where X is the number of channels from the previous layer (I didn't change channels with the deconv), and specified the parameters padding='SAME', strides=[1,2,2,1], output_shape=(N, 2 * input_shape[1], 2 * input_shape[2], X), where input_shape was the NHWC-format output of a previous conv layer.
The conversion I attempted followed the pattern I've seen/used successfully for converting a Caffe Convolution layer before:
layer = caffe.layers.Deconvolution(prev_layer, name=node.name,
    convolution_param=dict(num_output=X, kernel_size=var.shape[0],
                           stride=2, pad=0))
... construct net ...
net.params[layer][0].data[:] = tf_weights.transpose((3,2,0,1))
net.params[layer][1].data[:] = tf_biases
The problem I'm seeing is that the output is not the correct size. As is, the code and network produce an output that is too large by 3 pixels in each dimension (I have two conv2d_transpose/Deconvolution layers). By changing pad=0 to pad=1, the output becomes similarly too small by 3. Otherwise the output looks more or less like it does in TensorFlow, but the boundaries appear messed up, which I assume results from this padding issue.
I'm not sure this conversion is even possible, as I've read that deconvolution does not necessarily describe the same operation as a transposed convolution. Let me know if it's possible and where this goes wrong. Thanks.
P.S. TF 1.5, and freshly installed caffe (as of commit 87e151281d)
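The +3/-3 behaviour is consistent with the two output-size formulas, at least under the usual conventions (a sketch; kernel_size=3 and stride=2 come from the question, the input size is illustrative):

# TF conv2d_transpose with padding='SAME', stride 2:  out = 2 * in
# Caffe Deconvolution: out = stride * (in - 1) + kernel_size - 2 * pad
def tf_same_out(n, stride=2):
    return stride * n

def caffe_deconv_out(n, kernel_size=3, stride=2, pad=0):
    return stride * (n - 1) + kernel_size - 2 * pad

n = 16
print(tf_same_out(tf_same_out(n)))                           # 64
print(caffe_deconv_out(caffe_deconv_out(n, pad=0), pad=0))   # 67: 3 pixels too large
print(caffe_deconv_out(caffe_deconv_out(n, pad=1), pad=1))   # 61: 3 pixels too small

No integer pad reproduces TF's 'SAME' output size exactly, since 'SAME' effectively pads/crops asymmetrically; that asymmetry would also explain the messed-up boundaries.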

Output of Tensorflow LSTM-Cell

I've got a question on Tensorflow LSTM-Implementation. There are currently several implementations in TF, but I use:
cell = tf.contrib.rnn.BasicLSTMCell(n_units)
where n_units is the number of 'parallel' LSTM-Cells.
Then to get my output I call:
rnn_outputs, rnn_states = tf.nn.dynamic_rnn(cell, x,
initial_state=initial_state, time_major=False)
where (as time_major=False) x is of shape (batch_size, time_steps, input_length)
where batch_size is my batch_size
where time_steps is the number of timesteps my RNN will go through
where input_length is the length of one of my input vectors (the vector fed into the network at one specific timestep in one specific batch)
I expect rnn_outputs to be of shape (batch_size, time_steps, n_units, input_length) as I have not specified another output size.
Documentation of nn.dynamic_rnn tells me that output is of shape (batch_size, input_length, cell.output_size).
The documentation of tf.contrib.rnn.BasicLSTMCell does have a property output_size, which is defaulted to n_units (the amount of LSTM-cells I use).
So does each LSTM-Cell only output a scalar for every given timestep? I would expect it to output a vector of the length of the input vector. That seems not to be the case from how I understand it right now, so I am confused. Can you tell me whether that's the case, or how I could change it so that each LSTM-Cell outputs a vector the size of the input vector?
I think the primary confusion is about the terminology of the LSTM cell's argument num_units. Unfortunately it doesn't mean, as the name suggests, "the number of LSTM cells" (which would then have to equal your number of time steps). It is actually the dimensionality of the hidden state vector (and of the cell state).
The call to dynamic_rnn() returns a tensor of shape [batch_size, time_steps, output_size], where (please note) output_size = num_units if num_proj is None in the LSTM cell, and output_size = num_proj if it is defined.
Now, typically, you will extract the last time step's result and project it to the size of your output dimension, either manually with a matmul + bias, or by using the num_proj argument of the LSTM cell.
I have been through the same confusion and had to look really deep to get it cleared. Hope this answer clears some of it.
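A minimal sketch of that manual projection in TF1.x (the sizes and variable names are illustrative):

import tensorflow as tf

n_units, time_steps, input_length, output_dim = 128, 20, 50, 10

cell = tf.contrib.rnn.BasicLSTMCell(n_units)
x = tf.placeholder(tf.float32, [None, time_steps, input_length])
rnn_outputs, rnn_states = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32, time_major=False)

last_output = rnn_outputs[:, -1, :]                      # [batch_size, n_units]
W_out = tf.get_variable('W_out', [n_units, output_dim])
b_out = tf.get_variable('b_out', [output_dim])
logits = tf.matmul(last_output, W_out) + b_out           # [batch_size, output_dim]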
