As I understand it, we optimize our model by changing the weight parameters over the iterations.
The aim is to minimize the loss and maximize the accuracy.
I don't understand why we use the loss as a training signal when we already have the accuracy.
Can we use only the accuracy and drop the loss from our model?
Can we also change the model weights using the accuracy?
In short, training a neural network is all about minimizing the difference between the intended result and the produced result. That difference is known as the cost/loss. So the smaller the cost/loss, the closer the output is to the intended value, and the higher the accuracy.
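To make the distinction concrete, here is a minimal Keras sketch (layer sizes and names are illustrative): the loss is the differentiable objective the optimizer pushes down through the weights, while accuracy is only reported as a metric.

# Minimal sketch: the loss drives the weight updates; accuracy is only monitored.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy",  # differentiable objective used for the gradients
              optimizer="adam",
              metrics=["accuracy"])        # reported each epoch, never differentiated

Accuracy is a step function of the predictions, so it has no useful gradient; that is why the smooth loss is what actually gets optimized.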
I suggest you watch 3Blue1Brown's video series on neural networks on YouTube.
So I want to evaluate the best values of:
Learning rate
Batch size
Number of epochs
Activation function
Optimizer
Loss function
Number of hidden layers
Number of nodes in the hidden layers
to solve a regression problem.
I have done this so far using randomized search. I have to do this individually for each parameter: set the Keras model to a standard configuration, test one parameter, then repeat.
Can anyone advise why my loss is > 50,000 despite using the best network? The mean squared error is > 40,000,000,000. This is a network fit using the optimal parameters from the randomized search, but it isn't performing well at all. The data has been pre-processed and scaled. I added L2 regularisation but it doesn't improve things much.
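For reference, a joint randomized search over these hyperparameters can be wired up roughly as follows. This is only a sketch, assuming the scikeras wrapper around Keras together with scikit-learn's RandomizedSearchCV; the parameter ranges, feature count and variable names are illustrative.

# Minimal sketch: randomized hyperparameter search for a Keras regression model.
import numpy as np
from scikeras.wrappers import KerasRegressor          # assumes scikeras is installed
from sklearn.model_selection import RandomizedSearchCV
from tensorflow import keras

def build_model(n_hidden=1, n_nodes=32, activation="relu"):
    model = keras.Sequential([keras.layers.Input(shape=(10,))])   # 10 input features (placeholder)
    for _ in range(n_hidden):
        model.add(keras.layers.Dense(n_nodes, activation=activation))
    model.add(keras.layers.Dense(1))                               # single regression output
    return model

reg = KerasRegressor(model=build_model, loss="mse", optimizer="adam", verbose=0)

param_distributions = {
    "model__n_hidden": [1, 2, 3],
    "model__n_nodes": [16, 32, 64, 128],
    "model__activation": ["relu", "tanh"],
    "optimizer__learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [16, 32, 64],
    "epochs": [50, 100],
}

search = RandomizedSearchCV(reg, param_distributions, n_iter=20, cv=3,
                            scoring="neg_mean_squared_error")
# search.fit(X, y)   # X, y: the pre-processed, scaled data
# print(search.best_params_, -search.best_score_)

Searching on the wrapped estimator lets the architecture parameters (layers, nodes, activation) and the training parameters (learning rate, batch size, epochs) be sampled jointly, instead of tuning one parameter at a time against a fixed standard model.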
I am trying to train a deep neural network (DNN) with labeled data. The labels are encoded in such a way that they only contain the values 0 and 1. The shape of the encoded label is 5 x 5 x 232. About 95% of the values in the labels are 0 and the rest are 1. Currently, I am using the binary_crossentropy loss function to train the network.
What is the best technique to train the DNN in such a scenario? Is binary_crossentropy an appropriate choice of loss function in this case? Any suggestions to improve the performance of the model?
You can try MSE loss. If you want to stick to binary cross-entropy (used in binary classification), consider using label smoothing.
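If you go the label-smoothing route, here is a minimal sketch (the smoothing factor of 0.1 is illustrative):

import tensorflow as tf

# Binary cross-entropy with label smoothing: hard 0/1 targets are softened towards 0.5.
loss_fn = tf.keras.losses.BinaryCrossentropy(label_smoothing=0.1)
# model.compile(loss=loss_fn, optimizer="adam",
#               metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])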
You may use two alternative loss functions instead of binary cross-entropy. They are:
Hinge Loss
An alternative to cross-entropy for binary classification problems is the hinge loss function, primarily developed for use with Support Vector Machine (SVM) models.
It is intended for binary classification where the target values are in the set {-1, 1}.
Squared Hinge Loss
For more detail on loss functions with examples, click here.
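A minimal sketch of using these losses in Keras, assuming a model whose output is compatible with the {-1, 1} target convention described above (variable names are illustrative):

import tensorflow as tf

y_train_pm1 = 2.0 * y_train - 1.0     # remap 0/1 labels to the {-1, 1} convention
model.compile(loss=tf.keras.losses.Hinge(), optimizer="adam")
# or: loss=tf.keras.losses.SquaredHinge()
# model.fit(X_train, y_train_pm1, ...)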
Hope this is helpful, happy learning.
binary_crossentropy as the loss is fine.
Don't use accuracy as your metric, because the model will just predict everything as label 0 and still get 95% accuracy. Instead use the F1 score (or precision or recall).
Use a weighted loss: i.e. penalize class 1 heavily when it is wrong, compared to class 0.
Instead of class weights you can also use methods like oversampling from the minority class (techniques like SMOTE).
How to calculate class weights
You can use sklearn.utils.class_weight to calculate weights from your labels. Check this answer.
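A minimal sketch of that class-weight computation, assuming y_train holds the 0/1 labels; note that for multi-dimensional targets such as 5 x 5 x 232 you may need to convert the result into per-element sample weights rather than passing class_weight to fit():

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y_flat = y_train.reshape(-1)                       # flatten the binary label array
weights = compute_class_weight(class_weight="balanced",
                               classes=np.array([0, 1]),
                               y=y_flat)
class_weights = dict(zip([0, 1], weights))         # roughly {0: 0.53, 1: 10} for a 95/5 split
print(class_weights)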
In scenarios where you have highly imbalanced data, I would suggest going with a Random Forest combined with up-sampling. This approach up-samples the minority class and hence improves the model's performance on that class.
How is the accuracy calculated when the problem is a regression one?
I'm working on a regression problem to predict how much electricity each user uses each day. I use Keras to build an LSTM model for this time-series prediction. At the beginning, I used 'accuracy' as the metric, and when I run
model.fit(..., verbose=2, ...)
val_acc has a value after every epoch, but in my results the value never changes; it is always the same.
Then I realized that a regression problem has no real concept of accuracy, and I started to wonder: how is that accuracy calculated?
My guess is that when the metric is 'accuracy' in a regression problem, the accuracy is calculated in a similar way to a classification problem: the number of predicted values equal to the true values, divided by the total sample size.
Am I right?
In regression you use metrics that measure how far your prediction is from the actual value, such as Squared Error, Mean Squared Error, etc.
Please see How to determine the accuracy of regression? Which measure should be used?
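For illustration, here is how such error-based metrics look on a tiny made-up example:

import numpy as np

# Illustrative values only.
y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.5, 8.0])

mae  = np.mean(np.abs(y_true - y_pred))    # mean absolute error      -> ~0.67
mse  = np.mean((y_true - y_pred) ** 2)     # mean squared error       -> 0.5
rmse = np.sqrt(mse)                        # root mean squared error  -> ~0.71
print(mae, mse, rmse)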
When building a Keras LSTM model, you usually build a "skeleton" first, then you compile, fit and, at the end, predict. During the compile step you need to define your loss function (see the Keras documentation on Sequential models) and a metric, so you could do e.g.
model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['mean_squared_error'])
(see Keras documentation on metrics). Therefore, if you put accuracy as a metric in a regression setting, you would not get reasonable results, as this metric is designed only for categorical tasks.
Yes, the accuracy is computed in exactly the same way as in classification; Keras does not do any kind of adjustment. As you say, it makes no sense to use accuracy (which is a classification metric) for a regression problem.
Is there any way to target a specific metric to optimize using the built-in TensorFlow optimizers? If not, how can this be achieved? For example, if I want to focus specifically on maximizing the F-score of my classifier, is it possible to do so in TensorFlow?
estimator = tf.estimator.LinearClassifier(
    feature_columns=feature_cols,
    config=my_checkpointing_config,
    model_dir=output_dir,
    optimizer=lambda: tf.train.FtrlOptimizer(
        learning_rate=tf.train.exponential_decay(
            learning_rate=0.1,
            global_step=tf.train.get_or_create_global_step(),
            decay_steps=1000,
            decay_rate=0.96)))
I am trying to optimize my classifier specifically on the basis of getting a better F-score. Despite using the decaying learning_rate and 300 training steps, I am getting inconsistent results. While checking the metrics in the logs, I found the behavior of precision, recall and accuracy to be very erratic. Despite increasing the number of training steps, there was no significant improvement. So I thought that if I could make the optimizer focus more on improving the F-score as a whole, I might get better results. Hence the question. Is there something I am missing?
In classification settings, optimizers minimize the loss, e.g. cross-entropy; quantities like accuracy, F-score, precision, recall etc. are essentially business metrics, and they are not (and cannot be) directly targeted by the optimization process.
This is a question that pops up rather frequently here in SO in various disguises; here are some threads which will hopefully help you disentangle the concepts (although they refer to accuracy, precision, and recall, the argument is exactly the same for the F-score):
Loss & accuracy - Are these reasonable learning curves?
Cost function training target versus accuracy desired goal
Is there an optimizer in keras based on precision or recall instead of loss?
The bottom line, adapting one of my own (linked) answers:
Loss and metrics like accuracy or F-score are different things; roughly speaking, metrics like accuracy & F-score are what we are actually interested in from a business perspective, while the loss is the objective function that the learning algorithms (optimizers) are trying to minimize from a mathematical perspective. Even more roughly speaking, you can think of the loss as the "translation" of the business objective (accuracy, F-score etc) to the mathematical domain, a translation which is necessary in classification problems (in regression ones, usually the loss and the business objective are the same, or at least can be the same in principle, e.g. the RMSE)...
One could technically adjust the threshold parameter that distinguishes between class 1 and class 0. For example, in logistic regression, if the threshold is lowered from 0.5 to 0.3, recall would increase and precision would typically decrease, and vice versa. But as others have mentioned, this is not the same as optimizing ("minimizing") the loss function.
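A minimal sketch of this threshold trade-off on made-up predicted probabilities (the numbers are illustrative):

import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true  = np.array([0, 0, 0, 1, 1, 1, 0, 1])
y_proba = np.array([0.10, 0.35, 0.45, 0.90, 0.70, 0.55, 0.40, 0.30])

for threshold in (0.5, 0.3):
    y_pred = (y_proba >= threshold).astype(int)
    print(threshold,
          precision_score(y_true, y_pred),   # 1.00 at 0.5, ~0.57 at 0.3
          recall_score(y_true, y_pred))      # 0.75 at 0.5, 1.00 at 0.3

Note that this only moves along the precision-recall trade-off of an already trained model; it does not change what the optimizer minimizes.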
I am using the TensorFlow DNNRegressor Estimator model to build a neural network, but calling the estimator.train() function gives output as follows:
I.e. my loss value varies a lot with every step, but as far as I know the loss should decrease with the number of iterations. Also, see the attached screenshot of the TensorBoard visualisation of the loss function:
The doubts I'm not able to figure out are:
Is it the overall loss function value (the combined loss for every step processed so far) or just that step's loss value?
If it is that step's loss value, how do I get the value of the overall loss function and see its trend, which I feel should decrease as the number of iterations increases? In my understanding, that is the value we should look at while training on a dataset.
If this is the overall loss value, then why is it fluctuating so much? Am I missing something?
First of all, let me point out that tf.contrib.learn.DNNRegressor uses a linear regression head with mean_squared_loss, i.e. simple L2 loss.
Is it the overall loss function value (the combined loss for every step processed so far) or just that step's loss value?
Each point on the chart is the value of the loss function on the most recent step, i.e. after the learning done so far.
If it is that step's loss value, how do I get the value of the overall loss function and see its trend, which I feel should decrease as the number of iterations increases?
There's no overall loss function; you probably mean a chart of how the loss changed after each step. That's exactly what TensorBoard is showing you. You are right, its trend is not downwards as it should be. This indicates that your neural network is not learning.
If this is the overall loss value, then why is it fluctuating so much? Am I missing something?
A common reason for a neural network not learning is a poor choice of hyperparameters (though there are many more mistakes you can possibly make). For example:
the learning rate is too large
it's also possible that the learning rate is too small, which means the network is learning, but very, very slowly, so that you can't see it
the weight initialization is probably too large; try decreasing it
the batch size may be too large as well
you're passing the wrong labels for the inputs
the training data contains missing values, or is not normalized
...
What I usually do to check whether a neural network is at least somehow working is to reduce the training set to a few examples and try to overfit the network. This experiment is very fast, so I can try various learning rates, initialization variances and other parameters to find a sweet spot. Once I have a steadily decreasing loss chart, I go on with a bigger set.
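A rough sketch of that sanity check with the TF 1.x-era Estimator API used in the question; the feature name, shapes and step count are illustrative:

import numpy as np
import tensorflow as tf

small_x, small_y = train_x[:32], train_y[:32]        # a handful of training examples
input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": small_x}, y=small_y,
    batch_size=8, num_epochs=None, shuffle=True)

estimator = tf.estimator.DNNRegressor(
    feature_columns=[tf.feature_column.numeric_column("x", shape=small_x.shape[1:])],
    hidden_units=[32, 32])
estimator.train(input_fn=input_fn, steps=2000)       # the loss should drop towards zero on 32 examples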
Though the previous answer is very informative and good, it doesn't quite address your issue. When you instantiate the DNNRegressor, add:
loss_reduction=tf.losses.Reduction.MEAN
to the constructor, and you'll see your average loss converge.
estimator = tf.estimator.DNNRegressor(
    feature_columns=feat_clmns,
    hidden_units=[32, 64, 32],
    weight_column=weight_clmn,
    loss_reduction=tf.losses.Reduction.MEAN)