On the Keras losses page, the main distinction I see is between loss classes and loss functions. Can anyone explain why two APIs are provided for the same losses? Is it just for class initialization, or does it serve some other purpose? It would also be great if anyone could explain which one to use in which cases.
Thanks in advance.
A deep learning model can be built and trained in multiple ways.
The simplest approach is to build the model with the Keras functional API or sequential API, use the compile method to specify the optimizer, loss, metrics, etc., and use the fit method to train the model.
If you choose to build the model this way, the compile method accepts an instance of a loss class.
Note: You can use the loss function as well
The actual logic that computes the loss lives in the call method of the class, which the fit method invokes internally.
However, there are cases (mostly in research) where the training loop has to be written from scratch. In that case, you can use the loss function to compute the losses.
Note: You can use the loss class as well
The loss class gives you some extra functionality, such as choosing the reduction technique and fixing options like from_logits once at construction time. So if your code needs that functionality, use the loss class to compute the losses.
If you do not require any such functionality you can simply use the loss function.
Note: Under the hood, the loss class wraps the loss function, so both APIs run the same TensorFlow computation.
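To make the difference concrete, here is a minimal framework-free sketch of the pattern. It is not Keras code: the names mse_per_sample and MeanSquaredErrorSketch are made up for illustration, and the reduction options are simplified. The idea it shows is that the function returns one value per sample, while the class stores its configuration once and then applies a reduction on top of the same function.

```python
# Framework-free sketch of the "loss function vs. loss class" pattern.
# mse_per_sample and MeanSquaredErrorSketch are hypothetical names,
# not Keras identifiers.

def mse_per_sample(y_true, y_pred):
    """Loss function: one value per sample, no reduction over the batch."""
    return [
        sum((t - p) ** 2 for t, p in zip(row_t, row_p)) / len(row_t)
        for row_t, row_p in zip(y_true, y_pred)
    ]


class MeanSquaredErrorSketch:
    """Loss class: keeps its configuration, then wraps the function."""

    def __init__(self, reduction="mean"):
        if reduction not in ("mean", "sum", "none"):
            raise ValueError(f"unknown reduction: {reduction}")
        self.reduction = reduction

    def __call__(self, y_true, y_pred):
        # Same computation as the plain function ...
        values = mse_per_sample(y_true, y_pred)
        # ... plus the configurable reduction over the batch.
        if self.reduction == "none":
            return values
        total = sum(values)
        return total / len(values) if self.reduction == "mean" else total


y_true = [[0.0, 1.0], [1.0, 1.0]]
y_pred = [[0.0, 0.0], [1.0, 0.0]]

per_sample = mse_per_sample(y_true, y_pred)                          # list, one entry per sample
batch_mean = MeanSquaredErrorSketch()(y_true, y_pred)                # single scalar
batch_sum = MeanSquaredErrorSketch(reduction="sum")(y_true, y_pred)  # single scalar
```

In a custom training loop you would typically pick whichever of the two already returns the shape you need: per-sample values if you want to weight or mask them yourself, a reduced scalar if you just want to backpropagate.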
Related
I am trying to optimize a convolutional neural network using both Adam and L-BFGS for comparison purposes. However, I am having a hard time implementing the wrapper function needed to use a Keras Sequential model inside the TensorFlow Probability lbfgs_minimize function (https://www.tensorflow.org/probability/api_docs/python/tfp/optimizer/lbfgs_minimize).
Can anyone please give me some direction? Here is the reference on the wrapper function:
https://pychao.com/2019/11/02/optimize-tensorflow-keras-models-with-l-bfgs-from-tensorflow-probability/
I am following the tutorial on neural style transfer. The style transfer is done by minimizing a loss function with respect to an image (initialized with the content image). What confuses me is the following piece of code:
preprocessed_input = tf.keras.applications.vgg19.preprocess_input(inputs)
which is part of the call method in the StyleContentModel class. How does TensorFlow know the gradient of this operation? I have checked if this operation has a gradient function using get_gradient_function in the module tensorflow.python.framework.ops, and as far as I can tell it does not.
It is very simple: the function internally uses symbolic tensor operations that are differentiable. TensorFlow can compute gradients through any function that is built from TensorFlow operations, so there is no need to manually define a gradient for each function.
You can confirm this by looking at the code of that function, especially _preprocess_symbolic_function, which uses normal scalar operations and Keras backend functions (which are just TensorFlow functions in tf.keras).
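The principle can be shown with a tiny pure-Python sketch. Note the assumptions: this uses forward-mode dual numbers, which is not how TensorFlow does it (TensorFlow records operations and differentiates in reverse mode), and Dual, preprocess and loss are hypothetical names with made-up constants. The point carries over, though: only the primitive operations know their derivatives, and a function composed of them gets its gradient for free.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """A value together with its derivative w.r.t. the input."""
    value: float
    grad: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.grad + other.grad)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.value - other.value, self.grad - other.grad)

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: the only place where "the gradient of *" is defined.
        return Dual(self.value * other.value,
                    self.grad * other.value + self.value * other.grad)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def preprocess(pixel, mean=100.0):
    # Hypothetical stand-in for a preprocessing step: subtract a channel mean.
    # No gradient is registered for this function itself.
    return pixel - mean


def loss(pixel):
    feature = preprocess(pixel) * 0.5
    return feature * feature


# Seed the input with derivative 1.0 to get d(loss)/d(pixel).
out = loss(Dual(104.0, 1.0))
```

Here out.value is the loss and out.grad its derivative with respect to the pixel, even though neither preprocess nor loss ever declared a gradient.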
This has nothing to do with the model or gradients. What this function does is put the input images into the format the pretrained network expects. For VGG19 that means converting the images from RGB to BGR and zero-centering each color channel with respect to the ImageNet dataset, without scaling; other models used in transfer learning, such as MobileNet, instead scale the pixels to the range from -1 to +1. If you use the ImageDataGenerator, it has a parameter preprocessing_function which the generator calls to preprocess the images. If you preprocess the training images, make sure you do the same for the test and validation images.
I have an autoencoder model with multiple outputs and weighting which I want to extend into a variational autoencoder.
I followed the official Keras tutorial: https://keras.io/examples/generative/vae/
But if I manually adapt the train_step function, I lose the majority of my original implementation details:
I have two weighted optimization goals: reconstruction (decoder) and classification (softmax)
accuracy metrics for the classification
the original fit method also takes care of the validation data and corresponding metrics
Adding the suggested sampling layer according to the Keras link is no problem. The difficulty is correctly implementing the Kullback-Leibler loss, because it depends on the additional parameters z_mu and z_log_var, which standard Keras losses do not support.
I searched for workarounds to solve this issue, but none of them was successful:
re-writing the train_step: it's hard to fully re-implement all details (weighting, multiple losses with different inputs -> decoder: data, classifier: labels, etc.)
adding a pseudo layer to the encoder that calculates the loss, like here: https://tiao.io/post/tutorial-on-variational-autoencoders-with-a-concise-keras-implementation/. The problem with this is that add_loss does not specify under which key and how the KL loss is added to the model's total loss
adding everything as a global/top-level element to make z_mu and z_log_var accessible for the loss calculation, like here: https://www.machinecurve.com/index.php/2019/12/30/how-to-create-a-variational-autoencoder-with-keras/. This is the approach I like the least, as my current architecture is parametrized so that I can, e.g., perform hyperopt tuning
I was not able to find a pleasing solution to this problem. As VAEs are more and more popular, I am surprised that there is no extended tutorial about this, especially when dealing with multiple inputs and outputs. Or maybe I am just unable to find the right answers with my queries.
Any opinions welcome!
After a couple of redesigns and some bug-ticket tracing I found this recent example:
here
The VAE examples can be found at the very bottom of the post.
Solution: write your own train_step: the cleanest but also the hardest solution, depending on how complex your loss calculation is.
Solution: use a functional approach to access the necessary variables and add the loss with .add_loss: not very clean but straightforward to implement (you will not get a separate loss tracker for the KL loss)
To achieve my weighting, I multiplied the KL loss by the weight of my decoder loss before adding it via .add_loss.
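For reference, a framework-free sketch of the quantity being added. It assumes the usual VAE setup from the Keras tutorial (a diagonal Gaussian posterior with parameters z_mu and z_log_var against a standard normal prior); the function names and the exact way the terms are combined in weighted_total are my own illustration, not the code from the model above.

```python
import math


def kl_divergence(z_mu, z_log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over the latent dimensions.

    z_mu and z_log_var are the encoder outputs for ONE sample.
    """
    return -0.5 * sum(
        1.0 + log_var - mu ** 2 - math.exp(log_var)
        for mu, log_var in zip(z_mu, z_log_var)
    )


def weighted_total(recon_loss, class_loss, kl, recon_weight, class_weight):
    """Hypothetical combination: the KL term is scaled by the decoder weight."""
    return recon_weight * (recon_loss + kl) + class_weight * class_loss


# Posterior equal to the prior: the KL term vanishes.
kl_zero = kl_divergence([0.0, 0.0], [0.0, 0.0])

# Mean shifted by 1 in one dimension: KL = 0.5 * mu^2 = 0.5.
kl_shifted = kl_divergence([1.0, 0.0], [0.0, 0.0])

total = weighted_total(recon_loss=2.0, class_loss=1.0, kl=kl_shifted,
                       recon_weight=0.5, class_weight=2.0)
```

In the functional model, the value passed to .add_loss corresponds to recon_weight * kl here, typically averaged over the batch.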
Note: The first solution I tested was to define a custom loss function for the mse+kl loss and add it to my functionally designed model. This works if you turn off TF eager execution. But be careful: this really slows down your network, and you will lose the ability to monitor your training via TensorBoard if you don't have admin rights for your NVIDIA GPU (profile_batch=0 does not turn off profiling if eager mode is switched off, so you run into INSUFFICIENT_PRIVILEGES errors with the CUPTI driver).
I am new to tensorflow.
Question: How can I add a custom loss function to a canned estimator like DNNLinearCombinedRegressor?
Thank you in advance for your help!
I did not find a loss_fn argument to pass in to DNNLinearCombinedRegressor. But it looks straightforward to define any custom estimator:
A good reference is the code for DNNLinearCombinedRegressor.
I believe you need the following steps:
define a custom loss_fn (whatever you like)
define or use a RegressionHead that uses the custom loss_fn defined above.
pass the custom head to your own Regressor. Of course, you might need to copy/build your own model_fn as well.
I can't find how Keras defines "accuracy" and "loss". I know I can specify different metrics (e.g. mse, cross entropy) - but keras prints out a standard "accuracy". How is that defined? Likewise for loss: I know I can specify different types of regularization -- are those in the loss?
Ideally, I'd like to print out the equation used to define it; if not, I'll settle for an answer here.
Have a look at metrics.py; there you can find the definitions of all available metrics, including the different types of accuracy. Accuracy is not printed unless you add it to the list of desired metrics when you compile your model.
Regularizers are by definition added to the loss. For example, see the add_loss method of the Layer class.
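As an equation, the value reported as "loss" is the objective plus the sum of all regularization penalties. A minimal pure-Python sketch, assuming an L2 penalty of the form factor * sum(w^2) per layer (the function names and numbers are illustrative only):

```python
def l2_penalty(weights, factor):
    """L2 regularization term for one layer: factor * sum of squared weights."""
    return factor * sum(w ** 2 for w in weights)


def total_loss(data_loss, layer_weights, factor):
    """Objective value plus the penalties contributed by every layer."""
    penalties = [l2_penalty(weights, factor) for weights in layer_weights]
    return data_loss + sum(penalties)


# Two hypothetical layers with flattened weights.
layer_weights = [[1.0, -2.0], [0.5]]

# data_loss would be e.g. the cross entropy on the current batch.
loss_value = total_loss(data_loss=0.25, layer_weights=layer_weights, factor=0.01)
```

So with regularizers attached, the printed loss is larger than the raw objective even when the predictions are identical.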
Update
The type of accuracy is determined based on the objective function, see training.py. The default choice is categorical_accuracy. Other types like binary_accuracy and sparse_categorical_accuracy are selected when the objective function is the binary or the sparse variant of cross entropy, respectively.
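Since the question asks for the equations, here is what the three variants compute, written as a pure-Python sketch rather than the code from metrics.py (the helper argmax and the sample data are my own). In each case accuracy is the fraction of samples whose predicted class matches the label; the variants differ only in how labels and predictions are encoded.

```python
def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


def categorical_accuracy(y_true, y_pred):
    """Labels are one-hot vectors, predictions are class scores."""
    hits = [argmax(t) == argmax(p) for t, p in zip(y_true, y_pred)]
    return sum(hits) / len(hits)


def sparse_categorical_accuracy(y_true, y_pred):
    """Labels are integer class ids, predictions are class scores."""
    hits = [t == argmax(p) for t, p in zip(y_true, y_pred)]
    return sum(hits) / len(hits)


def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Labels are 0/1, predictions are single probabilities."""
    hits = [t == (1 if p > threshold else 0) for t, p in zip(y_true, y_pred)]
    return sum(hits) / len(hits)


scores = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]
one_hot = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
class_ids = [0, 1, 2, 1]

cat_acc = categorical_accuracy(one_hot, scores)
sparse_acc = sparse_categorical_accuracy(class_ids, scores)
bin_acc = binary_accuracy([1, 0, 1, 1], [0.9, 0.4, 0.3, 0.6])
```

The third sample is misclassified in each example, so all three calls give the same fraction of correct predictions.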