keras evaluating best parameters for regression - python

So I want to evaluate the best values of:
Learning rate
Batch size
Number of epochs
Activation function
Optimizer
Loss function
Number of hidden layers
Number of nodes in the hidden layers
to solve a regression problem.
I have done this so far using randomized search, but I have to do it individually for each parameter: I fix the Keras model to a standard configuration, test one parameter, then repeat for the next.
Can anyone advise why my loss is >50,000 despite using the best network? And the mean squared error is >40000000000.0000? This is a network fit using the optimal parameters from the randomized search, but it isn't performing well at all. The data has been pre-processed and scaled. I added L2 regularisation but it doesn't improve things much.
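One way to avoid tuning each parameter in isolation is to search over all of them jointly. Below is a minimal sketch of such a joint randomized search, assuming the older tf.keras.wrappers.scikit_learn.KerasRegressor wrapper (scikeras provides a modern replacement) and hypothetical, already-scaled arrays X_train and y_train; the candidate values are illustrative, not recommendations.

from sklearn.model_selection import RandomizedSearchCV
from tensorflow import keras
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor

def build_model(n_hidden=2, n_units=64, activation="relu",
                learning_rate=1e-3, optimizer="adam", loss="mse"):
    model = keras.Sequential()
    # first hidden layer fixes the input dimension; X_train is assumed to exist
    model.add(keras.layers.Dense(n_units, activation=activation,
                                 input_shape=(X_train.shape[1],)))
    for _ in range(n_hidden - 1):
        model.add(keras.layers.Dense(n_units, activation=activation))
    model.add(keras.layers.Dense(1))  # linear output for regression
    opt_cls = {"adam": keras.optimizers.Adam,
               "sgd": keras.optimizers.SGD,
               "rmsprop": keras.optimizers.RMSprop}[optimizer]
    model.compile(optimizer=opt_cls(learning_rate=learning_rate),
                  loss=loss, metrics=["mae"])
    return model

param_dist = {
    "n_hidden": [1, 2, 3],
    "n_units": [32, 64, 128],
    "activation": ["relu", "tanh"],
    "optimizer": ["adam", "sgd", "rmsprop"],
    "loss": ["mse", "mae"],
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [16, 32, 64],
    "epochs": [50, 100],
}

reg = KerasRegressor(build_fn=build_model, verbose=0)
search = RandomizedSearchCV(reg, param_dist, n_iter=20, cv=3,
                            scoring="neg_mean_squared_error")
search.fit(X_train, y_train)
print(search.best_params_)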

Related

analyze the train-validation accuracy learning curve

I am building a two-layer neural network from scratch on the Fashion MNIST dataset. In between I use ReLU as the activation, and on the last layer I use softmax cross-entropy. I am getting the learning curve below for train and validation accuracy, which is obviously wrong. But if you look at my loss curve, it is decreasing, yet my model is not learning. I cannot wrap my head around where I am going wrong. Could anyone explain these two graphs and where I could possibly be going wrong?
I don't know exactly what you are doing, and I don't know anything about your architecture, but it's wrong to use ReLU on the last layer.
Usually you leave the last layer as linear (no activation). This will produce the logits that enter the softmax. The output of the softmax will try to approximate the probability distribution over the classes.
This could be a reason for your results.
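For concreteness, here is a minimal sketch of the head described above, written with Keras rather than from scratch: a ReLU hidden layer, a linear (no-activation) output layer producing logits, and softmax cross-entropy applied to those logits. The layer sizes are assumptions.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(10),  # linear layer: outputs logits, no ReLU or softmax here
])
model.compile(
    optimizer="adam",
    # softmax is folded into the loss via from_logits=True
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)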

Keras Create New Loss Function

I am looking to design a custom loss function for a Keras model. The model itself is a neural network that accepts a set of images and is supposed to run a regression to produce an output value. Due to the physical conditions of the problem, I need to add a regularization term to the regular MSE, calculated as $\cos(y_{pred}) \cdot f(X_i)$, where $y_{pred}$ is the output of the neural network, $X_i$ is the training example used to calculate $y_{pred}$, and $f$ is some function that computes a value based on the image.
My problem is how to get $X_i$ from the model, since a loss function is supposed to accept just two inputs, $y_{true}$ and $y_{pred}$, which are tensors.
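One common workaround, sketched below rather than prescribed: instead of forcing $X_i$ through the two-argument loss signature, add the extra term with add_loss from a small custom layer that sees both the input images and the prediction, and keep plain MSE as the compiled loss. The function f_of_x, the input shape, and the convolutional body are hypothetical placeholders.

import tensorflow as tf

def f_of_x(images):
    # hypothetical stand-in for f(X_i); replace with the real physics-based term
    return tf.reduce_mean(images, axis=[1, 2, 3])

class PhysicsRegularizer(tf.keras.layers.Layer):
    # adds cos(y_pred) * f(X_i) to the model's losses, on top of the compiled MSE
    def call(self, inputs):
        images, y_pred = inputs
        self.add_loss(tf.reduce_mean(tf.cos(tf.squeeze(y_pred, axis=-1)) * f_of_x(images)))
        return y_pred

inputs = tf.keras.Input(shape=(64, 64, 3))              # image shape is an assumption
x = tf.keras.layers.Conv2D(16, 3, activation="relu")(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
y_pred = tf.keras.layers.Dense(1)(x)
y_pred = PhysicsRegularizer()([inputs, y_pred])

model = tf.keras.Model(inputs, y_pred)
model.compile(optimizer="adam", loss="mse")             # MSE still compares y_pred with y_true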

How is accuracy calculated when the problem is a regression one? [duplicate]

This question already has answers here: What function defines accuracy in Keras when the loss is mean squared error (MSE)?
How is the accuracy calculated when the problem is a regression one?
I'm working on a regression problem to predict how much electricity each user uses each day. I use Keras to build an LSTM model for this time-series prediction. At the beginning I used 'accuracy' as the metric, and when I run
model.fit(...,verbose=2,...)
val_acc has a value after every epoch. And in my results the value never changes; it's always the same value.
Then I realised that there is no concept of accuracy in a regression problem, and I started to wonder: how is that accuracy calculated?
My guess is that when the metric is 'accuracy' in a regression problem, accuracy is calculated in a similar way to the classification case: the number of predicted values equal to the true values divided by the total sample size.
Am I right?
In regression you use metrics that measure how far your prediction is from the actual value, such as Squared Error, Mean Squared Error, etc.
Please see How to determine the accuracy of regression? Which measure should be used?
When building a Keras LSTM model, you usually build a "skeleton" first, then you compile, fit, and finally predict. During the compile step you need to define your loss function (see the Keras documentation on Sequential models) and a metric, so you could do e.g.
model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['mean_squared_error'])
(see Keras documentation on metrics). Therefore, if you put accuracy as a metric in a regression setting, you would not get reasonable results, as this metric is designed only for categorical tasks.
Yes, the accuracy is computed in exactly the same way as in classification; Keras does not do any kind of adjustment. As you say, it makes no sense to use accuracy (which is a classification metric) for a regression problem.
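To make that concrete, here is a tiny sketch of the calculation the asker guessed and the answer above confirms, on made-up numbers; continuous predictions almost never match the targets exactly, which is consistent with the frozen val_acc described in the question.

import numpy as np

y_true = np.array([3.2, 5.1, 4.8, 6.0])
y_pred = np.array([3.1, 5.1, 4.9, 5.7])   # continuous outputs rarely match exactly

# fraction of predictions exactly equal to the targets
accuracy = np.mean(y_pred == y_true)
print(accuracy)                            # 0.25 here; typically 0.0 for a real model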

Why does my TensorFlow NN model's predicted values have upper limit?

I have a neural network with three layers. I've tried using tanh and sigmoid functions for my activations, and the output layer is just a simple linear function (I'm trying to model a regression problem).
For some reason my model seems to have a hard cut-off where it will never predict a value above some threshold (even though it should). What could be the reason for this?
Here is what predictions from the model look like (with sigmoid activations):
Update:
With ReLU activation, switching from gradient descent to Adam, and adding L2 regularization... the model predicts the same value for every input...
A linear layer regressing a single value will have outputs of the form
output = bias + sum(kernel * inputs)
If inputs comes from a tanh, then -1 <= inputs <= 1, and hence
bias - sum(abs(kernel)) <= output <= bias + sum(abs(kernel))
If you want an unbounded output, consider using an unbounded activation on all intermediate layers, e.g. relu.
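As an illustration of that suggestion, here is a minimal sketch of a regression network with unbounded (ReLU) hidden activations and a plain linear output, so the prediction is not capped the way a tanh- or sigmoid-fed linear layer is; the layer sizes and n_features are assumptions.

import tensorflow as tf

n_features = 10   # hypothetical input dimension

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(n_features,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),  # linear output: no activation, so no hard ceiling
])
model.compile(optimizer="adam", loss="mse")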
I think your problem concerns the generalization/expressiveness of the model. Regression is a basic task, so there should be no problem with the method itself, only with its execution. @DomJack explained how the output is restricted for a specific set of parameters, but that only happens with anomalous data; in general, during training the parameters are tuned so that the model predicts the output correctly.
So the first point is the quality of the training data. Make sure you have enough training data (and that it is split randomly if you split train/test from one dataset). Also, perhaps trivially, make sure you didn't mix up input and output values during preprocessing.
Another point is the size of the network. Make sure you use a large enough hidden layer.

How to interpret loss function in Tensorflow DNNRegressor Estimator model?

I am using the TensorFlow DNNRegressor Estimator model to build a neural network. But calling the estimator.train() function gives output as follows:
I.e. my loss function varies a lot with every step. But as far as I know, the loss should decrease with the number of iterations. Also see the attached screenshot of the TensorBoard visualisation of the loss function:
The doubts I am not able to figure out are:
Is it the overall loss function value (the combined loss for every step processed so far) or just that step's loss value?
If it is that step's loss value, then how do I get the value of the overall loss function and see its trend, which I feel should decrease with an increasing number of iterations? To my knowledge, that is the value we should look at while training on a dataset.
If it is the overall loss value, then why is it fluctuating so much? Am I missing something?
First of all, let me point out that tf.contrib.learn.DNNRegressor uses a linear regression head with mean_squared_loss, i.e. simple L2 loss.
Is it the overall loss function value (the combined loss for every step processed so far) or just that step's loss value?
Each point on the chart is the value of the loss function on the most recent step, i.e. the per-step loss at that point in training.
If it is that step's loss value, then how do I get the value of the overall loss function and see its trend, which I feel should decrease with an increasing number of iterations?
There is no "overall" loss function; you probably mean a chart of how the loss changes after each step, and that is exactly what TensorBoard is showing you. You are right that its trend is not downwards, as it should be. This indicates that your neural network is not learning.
If it is the overall loss value, then why is it fluctuating so much? Am I missing something?
A common reason for a neural network not learning is a poor choice of hyperparameters (though there are many more mistakes you can possibly make). For example:
the learning rate is too large
it's also possible that the learning rate is too small, which means the neural network is learning, but very, very slowly, so that you can't see it
the weight initialization is probably too large; try decreasing it
the batch size may be too large as well
you're passing the wrong labels for the inputs
the training data contains missing values, or is not normalized
...
What I usually do to check whether the neural network is at least somehow working is to reduce the training set to a few examples and try to overfit the network. This experiment is very fast, so I can try various learning rates, initialization variances and other parameters to find a sweet spot. Once I have a steadily decreasing loss chart, I move on to a bigger set.
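For illustration, here is a minimal sketch of that overfitting check, written with a plain Keras model rather than the Estimator API; X_train and y_train are hypothetical, already-preprocessed arrays.

import tensorflow as tf

X_small, y_small = X_train[:32], y_train[:32]   # a handful of examples

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(X_small.shape[1],)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
history = model.fit(X_small, y_small, epochs=500, verbose=0)

# if the network and hyperparameters are sane, this loss should get close to zero
print(history.history["loss"][-1])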
Though the previous answer is very informative and good, it doesn't quite address your issue. When you instantiate DNNRegressor, add:
loss_reduction=tf.losses.Reduction.MEAN
to the constructor, and you'll see your average loss converge.
estimator = tf.estimator.DNNRegressor(
    feature_columns=feat_clmns,
    hidden_units=[32, 64, 32],
    weight_column=weight_clmn,
    loss_reduction=tf.losses.Reduction.MEAN)
