Custom Integral Loss Term Keras - python

I am trying to define a custom loss function in Keras where I have an additional term that is an integral over the domain of the neural network output. So this would look like:
The key point is that the integral runs over an entire domain that I've specified, not just training data. I don't mind using any form of quadrature to evaluate the integral, I just need to be able to evaluate it. Currently, as far as the documentation indicates, this is not possible to do with a custom loss as it only provides access to y_pred and y_true.
Is there any way of achieving this in Keras?

If the idea is just defining extra variables, you can do this either inside (locally) or outside (globally) the loss function, using keras backend functions:
import keras.backend as K
myDomain = K.variable(range(100)) / 10  # for instance
def custom_loss(y_true, y_pred):
    localVar = K.variable([[1, 2], [3, 1]])
    # calculationsWith is a placeholder for your own backend computation
    return calculationsWith(y_true, y_pred, localVar, myDomain)
It's important that you use backend functions for the calculations (either from K or directly from TensorFlow, Theano or CNTK), so that the loss stays differentiable.
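As a rough sketch of how the integral term itself could be built, the loss can close over the model and evaluate it on a fixed grid of domain points, then apply trapezoidal weights. Everything specific here is an assumption for illustration: the 1-D domain [0, 10], the squared network output as integrand, the small model, and the names make_loss and lam.

```python
import numpy as np
import tensorflow as tf

# Hypothetical 1-D domain [0, 10], sampled at 101 equally spaced points.
domain = np.linspace(0.0, 10.0, 101, dtype="float32").reshape(-1, 1)
dx = float(domain[1, 0] - domain[0, 0])

# Trapezoidal quadrature weights: half weight at the two end points.
weights = np.full((101, 1), dx, dtype="float32")
weights[0, 0] = weights[-1, 0] = dx / 2.0

# Small illustrative model; any model mapping the domain to an output works.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(16, activation="tanh"),
    tf.keras.layers.Dense(1),
])

def make_loss(model, lam=0.1):
    """Return a Keras-style loss: MSE on the batch plus lam * integral term."""
    x = tf.constant(domain)
    w = tf.constant(weights)

    def loss(y_true, y_pred):
        data_term = tf.reduce_mean(tf.square(y_true - y_pred))
        # Assumed integrand: squared network output over the whole domain,
        # independent of which samples are in the current batch.
        integral = tf.reduce_sum(w * tf.square(model(x)))
        return data_term + lam * integral

    return loss

loss_fn = make_loss(model)
```

The returned function has the usual (y_true, y_pred) signature, so it should be usable as the loss argument of model.compile. Because the integral is computed from model(x) with TensorFlow operations, its gradient flows back to the model weights.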

Related

Pytorch - predict multiple parameters in neural network

I have two parameters which I want a neural network to predict. What is the best or most conventional method to implement the loss function? Currently I just define the loss, torch.nn.L1Loss(), which automatically computes the mean for both parameters such that it becomes a scalar.
Another plausible method would be to create two loss functions, one for each parameter, and successively backpropagate.
I don't really see whether both methods compute the same thing and whether one method is better (or plain wrong).
The problem can be seen as a multi-task problem, where the two parameters represent task A and task B respectively.
In multi-task learning, two loss functions are often used, one per task.
The usual form is as follows:
$$\text{total\_loss} = \alpha \cdot \text{A\_loss}(\hat{y}_1, y_1) + \beta \cdot \text{B\_loss}(\hat{y}_2, y_2)$$
Here $\alpha$ and $\beta$ are the weights of the two loss terms. Usually they are both 1 or both 0.5.
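To check whether the two approaches from the question agree, here is a minimal sketch that compares a single L1Loss over both outputs with two per-parameter losses weighted 0.5 each and backpropagated one after the other. The linear model, the batch of 8 samples, and the random data are assumptions for illustration only.

```python
import torch

torch.manual_seed(0)
net = torch.nn.Linear(3, 2)   # hypothetical model predicting two parameters
x = torch.randn(8, 3)
y = torch.randn(8, 2)
l1 = torch.nn.L1Loss()        # default reduction: mean over all elements

# Method 1: one loss over both outputs at once.
net.zero_grad()
joint = l1(net(x), y)
joint.backward()
grad_joint = net.weight.grad.clone()

# Method 2: one loss per parameter, weighted 0.5 each, backpropagated in turn.
# Gradients accumulate in .grad across the two backward calls.
net.zero_grad()
pred = net(x)
loss_a = l1(pred[:, 0], y[:, 0])
loss_b = l1(pred[:, 1], y[:, 1])
(0.5 * loss_a).backward(retain_graph=True)
(0.5 * loss_b).backward()
grad_split = net.weight.grad.clone()

same_value = torch.allclose(joint, 0.5 * loss_a + 0.5 * loss_b, atol=1e-6)
same_grad = torch.allclose(grad_joint, grad_split, atol=1e-6)
```

With equal numbers of elements per parameter, the mean over both outputs equals the 0.5/0.5 weighted sum, and the accumulated gradients match. Separate losses only change the result once the weights differ from that implicit 0.5 each.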

What is keras loss class vs function?

On the Keras losses page, which is here, the main distinction I saw is between loss classes and loss functions. Can anyone explain why these two APIs are provided for the same losses? Is it just for class initialization, or are there other purposes? It would also be great if anyone could explain in which cases we should use which one.
Thanks in advance.
A deep learning model can be built and trained in multiple ways.
The simplest approach to build a model would be to use Keras functional API or sequential API to build the model, use compile method to specify optimizer, loss, metrics, etc, and use the fit method for training the model.
If you choose to build the model this way the compile method accepts loss class.
Note: You can use the loss function as well
The actual logic that computes the loss is present in the special call method of a class which is used internally in the fit method.
However, there are cases (mostly in research) where the training loop has to be written in a certain way from scratch, in that case, you can use loss function to compute losses.
Note: You can use the loss class as well
The loss class gives you some extra functionality, such as specifying whether the predictions are logits, the reduction technique, etc. So if your code requires any of that functionality, use the loss class to compute the losses.
If you do not require any such functionality, you can simply use the loss function.
Note: Under the hood, both APIs run the same TensorFlow computation.
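A small sketch of the difference, using mean squared error as the example (the input values are made up): the function returns one loss value per sample, while the class instance is configured once and, with its default reduction, returns a single batch-averaged scalar.

```python
import tensorflow as tf

y_true = tf.constant([[0.0, 1.0], [0.0, 0.0]])
y_pred = tf.constant([[1.0, 1.0], [3.0, 0.0]])

# Loss function: mean over the last axis, so one value per sample.
per_sample = tf.keras.losses.mean_squared_error(y_true, y_pred)

# Loss class: create an instance, then call it.
# The default reduction averages the per-sample values over the batch.
mse = tf.keras.losses.MeanSquaredError()
batch_loss = mse(y_true, y_pred)

print(per_sample.numpy())  # per-sample losses: 0.5 and 4.5
print(float(batch_loss))   # their average: 2.5
```

In a hand-written training loop either form works. The per-sample output of the function is convenient when you want to apply your own weighting or reduction.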

How does TensorFlow compute the gradient of vgg19.preprocess_input?

I am following the tutorial on neural style transfer. The style transfer is done by minimizing a loss function with respect to an image (initialized with the content image). What confuses me is the following piece of code:
preprocessed_input = tf.keras.applications.vgg19.preprocess_input(inputs)
which is part of the call method in the StyleContentModel class. How does TensorFlow know the gradient of this operation? I have checked if this operation has a gradient function using get_gradient_function in the module tensorflow.python.framework.ops, and as far as I can tell it does not.
It is very simple: the function internally uses symbolic tensor operations that are differentiable. TensorFlow can compute gradients through any function that is built from TensorFlow operations, so there is no need to manually define a gradient for each function.
You can confirm this by looking at the code of that function here, especially _preprocess_symbolic_input here, which uses normal scalar operations and Keras backend functions (which are just TensorFlow functions in tf.keras).
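This can also be checked empirically with tf.GradientTape. The sketch below assumes TensorFlow 2 in eager mode and uses a made-up 4x4 image. Since VGG19 preprocessing only reorders the channels and adds a constant offset, the gradient of the summed output with respect to the input should be 1 everywhere.

```python
import tensorflow as tf

# Hypothetical batch with a single 4x4 RGB image; the values are arbitrary.
image = tf.random.uniform((1, 4, 4, 3), maxval=255.0)

with tf.GradientTape() as tape:
    tape.watch(image)  # plain tensors are not tracked unless watched
    preprocessed = tf.keras.applications.vgg19.preprocess_input(image)
    loss = tf.reduce_sum(preprocessed)

# The gradient exists because preprocess_input is composed of
# differentiable TensorFlow operations.
grad = tape.gradient(loss, image)
print(grad is not None, grad.shape)
```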
This has nothing to do with the model or gradients. What this function does is convert the input images into the form the pretrained model expects. For VGG19 that means converting RGB to BGR and zero-centering each color channel with respect to the ImageNet means, without scaling; for other models such as MobileNet it scales the pixels to the range from -1 to +1. This kind of preprocessing is a common requirement for models used in transfer learning. If you use the ImageDataGenerator, it has a parameter preprocessing_function which the generator calls to preprocess the images. Make sure that if you preprocess the training images, you do the same for the test and validation images.

Keras custom loss coupling different samples

I am trying to understand how Keras actually computes the gradients of a custom loss in a general setting.
Normally losses are defined as a sum over the samples of independent contributions. This allows the computation of the gradients to be parallelised properly.
However, if I add a global non-linearity on top of it, thus coupling the contributions of the individual samples, is Keras able to treat the differentiation properly?
In practice, is it actually minimising f(sum_i(x_i)), or does it compute the loss one sample at a time, thus reducing it to sum_i(f(x_i))?
Below is an example in the case of a log function.
def custom_loss(y_true, y_pred):
    return K.log(1 + K.mean((y_pred - y_true) * (y_pred - y_true)))
I have checked for documentation but I couldn't find any precise answer.
It minimizes whatever you tell it to minimize.
If you want to minimize the log of the whole sum, then apply the log after the sum.
If you want to minimize the log of each sample and sum later, then apply the log before the sum:
def log_of_sum(y_true, y_pred):
    return K.log(1 + K.mean(K.square(y_true - y_pred)))
def sum_of_logs(y_true, y_pred):
    # mean is optional here - you can return all the samples and Keras will handle it
    # returning all the samples allows other features to work, like sample_weight
    return K.mean(K.log(1 + K.square(y_true - y_pred)))
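To see that the two orderings really are different objectives, here is a small sketch with two made-up samples, written with plain TensorFlow operations and tf.GradientTape instead of the K backend. In the log-of-mean version the gradient of each sample depends on the other sample through the shared mean; in the mean-of-logs version it does not.

```python
import tensorflow as tf

y_true = tf.constant([[0.0], [0.0]])
y_pred = tf.Variable([[1.0], [3.0]])   # two hypothetical predictions

with tf.GradientTape(persistent=True) as tape:
    sq = tf.square(y_true - y_pred)                       # [1, 9]
    log_of_mean = tf.math.log(1.0 + tf.reduce_mean(sq))   # log(1 + 5)
    mean_of_logs = tf.reduce_mean(tf.math.log(1.0 + sq))  # (log 2 + log 10) / 2

# d log_of_mean / d p_i  = p_i / (1 + mean)  -> [1/6, 1/2]  (coupled via the mean)
# d mean_of_logs / d p_i = p_i / (1 + p_i^2) -> [1/2, 3/10] (each sample on its own)
grad_log_of_mean = tape.gradient(log_of_mean, y_pred)
grad_mean_of_logs = tape.gradient(mean_of_logs, y_pred)
```

Automatic differentiation follows whatever expression was written, so the coupling between samples is handled without any special treatment.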

Tensorflow, tf.gradients calculations

I am learning how to use Tensorflow, and at this one particular point I am really stuck and cannot make sense of it. Imagine I have a 5 layer network and the output is represented by output. Now suppose I want to find the gradient of output with respect to layer_2. For that purpose, the code I will write in Tensorflow will be something like:
gradients_i_want = tf.gradients(output, layer_2)
Theoretically, this gradient should be calculated via the chain rule. I want to ask whether Tensorflow calculates these gradients via the chain rule, or whether it just takes the derivative of output with respect to layer_2 directly.
Tensorflow will create a graph for your model, where each node is an operation (e.g. addition, multiplication, or a combination of them). Basic ops have manually defined gradient functions, and those functions will be used when applying the chain rule while traveling backwards through the graph.
If you write your own custom op, you might need to also write the corresponding gradient function.
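A tiny sketch of the chain rule at work, assuming TensorFlow 2 in eager mode, where tf.GradientTape takes the place of tf.gradients (which is only available in graph mode). The two-step "network" and its values are made up for illustration.

```python
import tensorflow as tf

x = tf.constant(2.0)

with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    layer_2 = 3.0 * x            # intermediate value: 6
    output = tf.square(layer_2)  # final value: 36

# Gradient with respect to the intermediate tensor: 2 * layer_2 = 12.
d_output_d_layer_2 = tape.gradient(output, layer_2)

# Gradient with respect to the input: the chain rule multiplies the
# local derivatives, 2 * layer_2 * 3 = 36.
d_output_d_x = tape.gradient(output, x)
```

Asking for the gradient with respect to layer_2 stops the backward traversal at that node, so only the operations between layer_2 and output contribute.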
