Say I have a CNN model that outputs N probability maps as masks of the same size as the input image, in a U-Net-like fashion. I would then like to apply, for example, a least-squares fit on top of each mask to obtain function coefficients as the output instead, and use these to calculate my model's loss.
def unet_model(...):
    # init unet model
    ...
    ...
    # final layer
    mask_out = layers.Conv2D(output_channels, (1, 1), activation='softmax')(conv9)
    # start applying e.g. a least-squares fit here
    eq_list = tf.Variable((x_map, y_map, mask_out))
    transp = tf.transpose(eq_list)
    ...
The transp line raises the following error when I initialize the model. I have tested the least-squares fit operations elsewhere.
FailedPreconditionError: Error while reading resource variable _AnonymousVar1423 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar1423/N10tensorflow3VarE does not exist. name: transpose/
My assumption is that transpose cannot deal with the placeholder axis used for the batch size, but I am generally clueless about this.
Before adding each variable I needed to make sure that x_map and y_map are also batched, by expanding their dims along axis -1.
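For reference, the expansion step looks roughly like this (a sketch only; I assume here that x_map and y_map are coordinate grids already broadcast to (batch, H, W), and I use tf.concat purely to illustrate the resulting shapes):

x_map_b = tf.expand_dims(x_map, axis=-1)    # (batch, H, W, 1)
y_map_b = tf.expand_dims(y_map, axis=-1)    # (batch, H, W, 1)
# the coordinate maps now line up with mask_out, which has shape (batch, H, W, output_channels)
eq_list = tf.concat([x_map_b, y_map_b, mask_out], axis=-1)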
I have a working CNN-LSTM model that predicts keypoints of human body parts in videos.
Currently, I have four keypoints as labels: right hand, left hand, head and pelvis.
The problem is that in some frames I can't see all four parts of the human that I want to label, so by default I set those values to (0,0) (a null coordinate).
The problem I faced was that the model took those points into account and tried to regress on them while being in a sequence.
Thus, I removed the (0,0) points from the loss calculation and the gradient backpropagation, and it works much better.
The problem is that the four points are still predicted, so I am trying to find a way to make the model predict a variable number of keypoints.
I thought of adding a third parameter (is it visible?), but it will probably add some complexity and confuse the model.
I think that you'll have to write a custom loss function that computes the loss between points only when the target coordinates are not null.
See PyTorch custom loss function on writing custom losses.
Something like:
def loss(outputs, labels):
    err = 0.0
    n = 0
    for xo, xt in zip(outputs, labels):
        if torch.all(xt == 0):  # null coordinate (0, 0): skip this point
            continue
        err = err + torch.nn.functional.mse_loss(xo, xt)
        n += 1
    return err / n
This is pseudo-code only! An alternative form that avoids the loop is to use an explicit binary visibility vector (as suggested by @leleogere) that you can multiply with the per-coordinate loss before reducing.
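A minimal sketch of that vectorized form (assuming outputs and labels have shape (batch, num_keypoints, 2) and that visibility is derived from the (0, 0) convention above):

import torch

def masked_keypoint_loss(outputs, labels):
    # visible: 1 where the target keypoint is labeled, 0 for the (0, 0) null points
    visible = (labels.abs().sum(dim=-1) > 0).float()      # (batch, num_keypoints)
    per_point = ((outputs - labels) ** 2).mean(dim=-1)    # squared error per keypoint
    masked = per_point * visible                          # drop the null keypoints
    return masked.sum() / visible.sum().clamp(min=1)      # mean over visible points only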
I have successfully trained a multi-output Gaussian process model using a GPy.models.GPCoregionalizedRegression model from the GPy package. The model has ~25 inputs and 6 outputs.
The underlying kernel is a GPy.util.multioutput.ICM kernel consisting of a RationalQuadratic kernel (GPy.kern.RatQuad) and the GPy.kern.Coregionalize kernel.
I am now interested in the feature importance for each individual output. The RatQuad kernel provides an ARD=True (Automatic Relevance Determination) keyword, which makes it possible to obtain the feature importance in a single-output model (this is also what the get_most_significant_input_dimensions() method of the GPy model exploits).
However, calling get_most_significant_input_dimensions() on the GPy.models.GPCoregionalizedRegression model gives me a single list of indices, which I assume are the most significant inputs for all outputs combined.
How can I calculate/obtain the lengthscale values or most significant features for each individual output of the model?
The problem is the model itself. The intrinsic coregionalization model (ICM) is set up such that all outputs are determined by a shared underlying "latent" Gaussian process. Thus, calling get_most_significant_input_dimensions() on a GPy.models.GPCoregionalizedRegression model can only give you one set of input dimensions that is significant for all outputs together.
The solution is to use a GPy.util.multioutput.LCM kernel, which is defined as a sum of ICM kernels built from a list of individual (latent) GP kernels. It works as follows:
import GPy

# Your data
# x = ...
# y = ...

# ICM case
# kernel = GPy.util.multioutput.ICM(input_dim=x.shape[1],
#                                   num_outputs=y.shape[1],
#                                   kernel=GPy.kern.RatQuad(input_dim=x.shape[1], ARD=True))

# LCM case: one RatQuad kernel with ARD per output
rank = 1  # rank of the coregionalization matrix W (choose as appropriate)
k_list = [GPy.kern.RatQuad(input_dim=x.shape[1], ARD=True) for _ in range(y.shape[1])]
kernel = GPy.util.multioutput.LCM(input_dim=x.shape[1], num_outputs=y.shape[1],
                                  W_rank=rank, kernels_list=k_list)
A reshaping of the data is needed (this is also necessary for the ICM model and is thus independent of the scope of this question; see here for details):
# Reshaping data to fit GPCoregionalizedRegression
xx = reshape_for_coregionalized_regression(x)
yy = reshape_for_coregionalized_regression(y)
m = GPy.models.GPCoregionalizedRegression(xx, yy, kernel=kernel)
m.optimize()
After the optimization has converged, one can call get_most_significant_input_dimensions() on an individual latent GP (here, the one for output 0):
sig_inputs_0 = m.sum.ICM0.get_most_significant_input_dimensions()
or loop over all kernels:
sig_inputs = []
for part in m.kern.parts:
    sig_inputs.append(part.get_most_significant_input_dimensions())
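If you also want the raw ARD lengthscales for each output rather than only the ranked indices, they should be readable from each latent RatQuad kernel. The attribute path below is an assumption based on GPy's default naming for the LCM built above (sum, ICM0, RatQuad):

# Lengthscales of the latent kernel tied to output 0
# (with ARD, a large lengthscale means that input dimension matters little)
ls_0 = m.sum.ICM0.RatQuad.lengthscale.values

# Collect the lengthscales for every output
lengthscales = [part.RatQuad.lengthscale.values for part in m.kern.parts]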
I am working with the REINFORCE algorithm in PyTorch. I noticed that the batch predictions of my simple network with Softmax do not sum to 1 (not even close to 1). I am attaching a minimal working example so that you can reproduce it. What am I missing here?
import numpy as np
import torch

obs_size = 9
HIDDEN_SIZE = 9
n_actions = 2

np.random.seed(0)

model = torch.nn.Sequential(
    torch.nn.Linear(obs_size, HIDDEN_SIZE),
    torch.nn.ReLU(),
    torch.nn.Linear(HIDDEN_SIZE, n_actions),
    torch.nn.Softmax(dim=0)
)

state_transitions = np.random.rand(3, obs_size)
state_batch = torch.Tensor(state_transitions)

pred_batch = model(state_batch)  # WRONG PREDICTIONS!
print('wrong predictions:\n', *pred_batch.detach().numpy())
# [0.34072137 0.34721774] [0.30972624 0.30191955] [0.3495524 0.3508627]
# DOES NOT SUM TO 1 !!!

pred_batch = [model(s).detach().numpy() for s in state_batch]  # CORRECT PREDICTIONS
print('correct predictions:\n', *pred_batch)
# [0.5955179 0.40448207] [0.6574412 0.34255883] [0.624833 0.37516695]
# DOES SUM TO 1 AS EXPECTED
Although PyTorch lets us get away with it, we don’t actually provide an input with the right dimensionality. We have a model that takes one input and produces one output, but PyTorch nn.Module and its subclasses are designed to do so on multiple samples at the same time. To accommodate multiple samples, modules expect the zeroth dimension of the input to be the number of samples in the batch.
Deep Learning with PyTorch
That your model works on each individual sample is an implementation nicety. You have incorrectly specified the dimension for the softmax (across batches instead of across the variables), and hence when given a batch dimension it is computing the softmax across samples instead of within samples:
nn.Softmax requires us to specify the dimension along which the softmax function is applied:
softmax = nn.Softmax(dim=1)
In this case, we have two input vectors in two rows (just like when we work with batches), so we initialize nn.Softmax to operate along dimension 1.
Change torch.nn.Softmax(dim=0) to torch.nn.Softmax(dim=1) to get appropriate results.
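For completeness, a quick check with the corrected dimension (same toy model as above, only the softmax dim changed):

model = torch.nn.Sequential(
    torch.nn.Linear(obs_size, HIDDEN_SIZE),
    torch.nn.ReLU(),
    torch.nn.Linear(HIDDEN_SIZE, n_actions),
    torch.nn.Softmax(dim=1)   # softmax within each sample, not across the batch
)

pred_batch = model(state_batch)
print(pred_batch.sum(dim=1))  # each row now sums to 1 (up to floating-point error)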
Currently, I am working on universal-perturbation-type research, where I use the gradient of the layer before the activation function to retrace the gradient step taken in the last iteration.
However, when I try to extract the gradient using K.gradients, I can't seem to extract the right thing.
Either I get a symbolic tensor, which I don't want, or I get [zero]. What I want are the exact gradients of that second-to-last layer, given the input image. This is what I currently have:
f_image = np.array(model.predict(image)).flatten()
I = (np.array(f_image)).flatten().argsort()[::-1]
I = I[0:num_classes]
pert_image = image
gradients = np.asarray(grads(pert_image,I))
Here grads should be the gradient function to get the exact gradients. When I use the following code, I get a tensor:
gradients = K.gradients(model.layers[-2].output, model.layers[0].input)[0]
Here the output is I, which gives the classes with the largest influence before the activation used for classification, and the input is the perturbed image, which starts off as the original image.
Could someone tell me what is wrong with my K.gradients implementation?
K.gradients computes the gradient symbolically; you need to evaluate the gradient with actual inputs in order to get numerical values. You can do this by using K.function to build a callable:
import keras.backend as K
gradients = K.gradients(model.layers[-2].output, model.layers[0].input)[0]
grad_fn = K.function([model.input], [gradients])
You can then call grad_fn with an appropriate input (including the batch dimension), and it will return the numerical values of the gradient:
actual_gradients = grad_fn([image])
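If image is a single image without a batch dimension, you may need to add one before the call (a small sketch, assuming image is a NumPy array of shape (H, W, C)):

import numpy as np

image_batch = np.expand_dims(image, axis=0)   # (1, H, W, C)
actual_gradients = grad_fn([image_batch])[0]  # gradient array with the same shape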
I'm getting an error when attempting to load the Caltech TensorFlow dataset. I'm using the standard code found in the tensorflow-datasets GitHub repository.
The error is this:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot batch tensors with different shapes in component 0. First element had shape [204,300,3] and element 1 had shape [153,300,3]. [Op:IteratorGetNextSync]
The error points to the line for features in ds_train.take(1)
Code:
import tensorflow_datasets as tfds

ds_train, ds_test = tfds.load(name="caltech101", split=["train", "test"])
ds_train = ds_train.shuffle(1000).batch(128).prefetch(10)

for features in ds_train.take(1):
    image, label = features["image"], features["label"]
The issue comes from the fact that the dataset contains variable-sized images (see the dataset description here). Tensorflow can only batch together things with the same shape, so you first need to either reshape the images to a common shape (e.g., the input shape of your network) or pad them accordingly.
If you want to resize, use tf.image.resize_images:
def preprocess(features):
    features['image'] = tf.image.resize_images(features['image'], YOUR_TARGET_SIZE)
    # Other transformations as needed (e.g., converting to float, normalizing to [0, 1])
    return features
If, instead, you want to pad, use tf.image.pad_to_bounding_box (just replace it in the above preprocess function and adapt the parameters as needed).
Normally, for most of the networks I'm aware of, resizing is used.
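For reference, a padding-based variant of the preprocess function could look like the sketch below; the 300x300 target size is only an assumption taken from the shapes in the error message, so adapt it to your data:

def preprocess_pad(features):
    # Pad every image to a fixed 300x300 canvas so that batching works
    features['image'] = tf.image.pad_to_bounding_box(
        features['image'],
        offset_height=0, offset_width=0,
        target_height=300, target_width=300)
    return features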
Finally, map the function on your dataset:
ds_train = (ds_train
            .map(preprocess)
            .shuffle(1000)
            .batch(128)
            .prefetch(10))
Note: the specific shapes reported in the error message vary from run to run because of the shuffle call.