High train accuracy poor test accuracy


I have a neural network that classifies images into 3 classes. My dataset is very small: I have 340 images for training and 60 images for testing. I built a model, and when I train it, this is the result:
Epoch 97/100
306/306 [==============================] - 46s 151ms/step - loss: 0.2453 - accuracy: 0.8824 - val_loss: 0.3557 - val_accuracy: 0.8922
Epoch 98/100
306/306 [==============================] - 47s 152ms/step - loss: 0.2096 - accuracy: 0.9031 - val_loss: 0.3795 - val_accuracy: 0.8824
Epoch 99/100
306/306 [==============================] - 47s 153ms/step - loss: 0.2885 - accuracy: 0.8627 - val_loss: 0.4501 - val_accuracy: 0.7745
Epoch 100/100
306/306 [==============================] - 46s 152ms/step - loss: 0.1998 - accuracy: 0.9150 - val_loss: 0.4586 - val_accuracy: 0.8627
When I predict on the test images, the test accuracy is poor.
What should I do? I also use ImageDataGenerator for data augmentation, but the result is the same. Is it because I have a small dataset?

You can use regularization on the fully connected layers. But given that your validation accuracy is already high, the problem is probably your data: your training data might not fully represent your test data. Try to analyze that, and make sure you apply exactly the same preprocessing to the test data before testing as you did to the training data.
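To make the preprocessing point concrete, here is a minimal, framework-free sketch. It assumes the preprocessing is a simple standardization; the pixel values and the helper names fit_scaler and apply_scaler are hypothetical and not taken from the question. The idea is that the statistics are computed once on the training split and then reused unchanged on the test split.

```python
from statistics import mean, pstdev

def fit_scaler(train_values):
    """Compute normalization statistics from the training data only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # fall back to 1.0 for constant data
    return mu, sigma

def apply_scaler(values, mu, sigma):
    """Apply the same statistics to any split (train, validation, or test)."""
    return [(v - mu) / sigma for v in values]

# Hypothetical pixel intensities, for illustration only.
train_pixels = [0.0, 64.0, 128.0, 255.0]
test_pixels = [32.0, 200.0]

mu, sigma = fit_scaler(train_pixels)                   # statistics come from train only
train_scaled = apply_scaler(train_pixels, mu, sigma)
test_scaled = apply_scaler(test_pixels, mu, sigma)     # reuse them, do not refit on test
```

If the test images were instead scaled with their own statistics (or not scaled at all), the model would see inputs from a different range than it was trained on, which is one possible cause of a gap between validation and test accuracy.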

Related

Negative huge loss in tensorflow

I am trying to predict price values from datasets using keras. I am following this tutorial: https://keras.io/examples/structured_data/structured_data_classification_from_scratch/, but when I get to the part of fitting the model, I am getting a huge negative loss and very small accuracy
Epoch 1/50
1607/1607 [==============================] - ETA: 0s - loss: -117944.7500 - accuracy: 3.8897e-05
2022-05-22 11:14:28.922065: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:113] Plugin optimizer for device_type GPU is enabled.
1607/1607 [==============================] - 15s 10ms/step - loss: -117944.7500 - accuracy: 3.8897e-05 - val_loss: -123246.0547 - val_accuracy: 7.7791e-05
Epoch 2/50
1607/1607 [==============================] - 15s 9ms/step - loss: -117944.7734 - accuracy: 3.8897e-05 - val_loss: -123246.0547 - val_accuracy: 7.7791e-05
Epoch 3/50
1607/1607 [==============================] - 15s 10ms/step - loss: -117939.4844 - accuracy: 3.8897e-05 - val_loss: -123245.9922 - val_accuracy: 7.7791e-05
Epoch 4/50
1607/1607 [==============================] - 16s 10ms/step - loss: -117944.0859 - accuracy: 3.8897e-05 - val_loss: -123245.9844 - val_accuracy: 7.7791e-05
Epoch 5/50
1607/1607 [==============================] - 15s 10ms/step - loss: -117944.7422 - accuracy: 3.8897e-05 - val_loss: -123246.0547 - val_accuracy: 7.7791e-05
Epoch 6/50
1607/1607 [==============================] - 15s 10ms/step - loss: -117944.8203 - accuracy: 3.8897e-05 - val_loss: -123245.9766 - val_accuracy: 7.7791e-05
Epoch 7/50
1607/1607 [==============================] - 15s 10ms/step - loss: -117944.8047 - accuracy: 3.8897e-05 - val_loss: -123246.0234 - val_accuracy: 7.7791e-05
Epoch 8/50
1607/1607 [==============================] - 15s 10ms/step - loss: -117944.7578 - accuracy: 3.8897e-05 - val_loss: -123245.9766 - val_accuracy: 7.7791e-05
Epoch 9/50
As for the code, it looks like the one from the example, but adapted:
# Categorical feature encoded as string
desc = keras.Input(shape=(1,), name="desc", dtype="string")
# Numerical features
date = keras.Input(shape=(1,), name="date")
quant = keras.Input(shape=(1,), name="quant")
all_inputs = [
    desc,
    quant,
    date,
]
# String categorical features
desc_encoded = encode_categorical_feature(desc, "desc", train_ds)
# Numerical features
quant_encoded = encode_numerical_feature(quant, "quant", train_ds)
date_encoded = encode_numerical_feature(date, "date", train_ds)
all_features = layers.concatenate(
    [
        desc_encoded,
        quant_encoded,
        date_encoded,
    ]
)
x = layers.Dense(32, activation="sigmoid")(all_features)
x = layers.Dropout(0.5)(x)
output = layers.Dense(1, activation="relu")(x)
model = keras.Model(all_inputs, output)
model.compile("adam", "binary_crossentropy", metrics=["accuracy"])
And the dataset looks like this:
         date                       desc      quant      price
0  20140101.0      CARBONATO DE DIMETILO     999.00    1428.57
1  20140101.0               HIDROQUINONA     137.00    1314.82
2  20140101.0      1,5 PENTANODIOL TECN.     495.00    2811.60
3  20140101.0  SOSA CAUSTICA LIQUIDA 50%  567160.61  113109.14
4  20140101.0         BOROHIDRURO SODICO       6.24     299.27
Also I am converting the date from being YYYY-MM-DD to being numbers using:
dataset['date'] = pd.to_datetime(dataset["date"]).dt.strftime("%Y%m%d").astype('float64')
What am I doing wrong? :(
EDIT: I thought the encoder function from the tutorial was normalizing the data, but it wasn't. Is there any other tutorial you know of that could guide me better? The loss problem has been fixed! (It was due to normalization.)

You seem to be quite confused by the components of your model.
Binary cross-entropy is a classification loss, but your problem is regression, so use MSE instead. "accuracy" also makes no sense for regression; replace it with a regression metric such as MSE.
Your data is huge, and thus your loss is huge. You have a price of 113109.14 in the data; what if your model is bad initially and predicts 0? You get a squared error of ~100,000^2 = 10,000,000,000. Normalise your data, in your case the output variable (the target, price), to between -1 and 1.
There are some use cases where an output neuron should have an activation function, but unless you know why you are doing this, leaving it linear is a much safer choice.
Dropout is a method for regularising your model. Do not start with it: always start with the simplest possible model, and make sure it can learn before trying to maximise the test score.
Neural networks will not extrapolate, so feeding in an ever-growing signal (date) in raw format will almost surely cause problems.
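The first two points can be checked with plain arithmetic. The sketch below is not Keras code: it writes out the per-sample binary cross-entropy and squared-error formulas by hand, uses the five prices from the question's table, and assumes min-max scaling as one way to bring the target into [-1, 1]. The helper names are hypothetical.

```python
import math

def binary_crossentropy(y_true, p):
    """Per-sample binary cross-entropy; only meaningful when y_true is 0 or 1."""
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))

def squared_error(y_true, y_pred):
    """Per-sample squared error, the quantity that MSE averages."""
    return (y_true - y_pred) ** 2

def scale_to_unit_range(values):
    """Min-max scale values linearly into [-1, 1]."""
    lo, hi = min(values), max(values)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

# With a genuine 0/1 label the loss is non-negative.
loss_label = binary_crossentropy(1.0, 0.9)

# With a raw price used as the "label", the same formula goes far below zero.
loss_price = binary_crossentropy(1428.57, 0.9)

# Prices copied from the table in the question.
prices = [1428.57, 1314.82, 2811.60, 113109.14, 299.27]

# Squared error of predicting 0 for the largest raw price: on the order of 1e10.
raw_error = squared_error(max(prices), 0.0)

# After scaling, the worst possible squared error for a prediction of 0 is 1.
scaled_prices = scale_to_unit_range(prices)
scaled_error = max(squared_error(t, 0.0) for t in scaled_prices)
```

This is consistent with the huge negative loss in the question's log: binary cross-entropy assumes targets in [0, 1], and a target in the thousands turns it into an unbounded negative number.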

Keras: val_loss is increasing and evaluate loss is too high

I'm new to Keras and I'm using it to build a plain neural network to classify the digits in the MNIST dataset.
Beforehand, I split the data into 3 parts (55000 for training, 5000 for validation, and 10000 for testing) and scaled the pixel intensities down by dividing them by 255.0.
My model looks like this:
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28,28]))
model.add(keras.layers.Dense(100, activation='relu'))
model.add(keras.layers.Dense(10, activation='softmax'))
And here is the compile:
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='Adam',
              metrics=['accuracy'])
I train the model:
his = model.fit(xTrain, yTrain, epochs = 20, validation_data=(xValid, yValid))
At first the val_loss decreases, then it increases although the accuracy is increasing.
Train on 55000 samples, validate on 5000 samples
Epoch 1/20
55000/55000 [==============================] - 5s 91us/sample - loss: 0.2822 - accuracy: 0.9199 - val_loss: 0.1471 - val_accuracy: 0.9588
Epoch 2/20
55000/55000 [==============================] - 5s 82us/sample - loss: 0.1274 - accuracy: 0.9626 - val_loss: 0.1011 - val_accuracy: 0.9710
Epoch 3/20
55000/55000 [==============================] - 5s 83us/sample - loss: 0.0899 - accuracy: 0.9734 - val_loss: 0.0939 - val_accuracy: 0.9742
Epoch 4/20
55000/55000 [==============================] - 5s 84us/sample - loss: 0.0674 - accuracy: 0.9796 - val_loss: 0.0760 - val_accuracy: 0.9770
Epoch 5/20
55000/55000 [==============================] - 5s 94us/sample - loss: 0.0541 - accuracy: 0.9836 - val_loss: 0.0842 - val_accuracy: 0.9742
Epoch 15/20
55000/55000 [==============================] - 4s 82us/sample - loss: 0.0103 - accuracy: 0.9967 - val_loss: 0.0963 - val_accuracy: 0.9788
Epoch 16/20
55000/55000 [==============================] - 5s 84us/sample - loss: 0.0092 - accuracy: 0.9973 - val_loss: 0.0956 - val_accuracy: 0.9774
Epoch 17/20
55000/55000 [==============================] - 5s 82us/sample - loss: 0.0081 - accuracy: 0.9977 - val_loss: 0.0977 - val_accuracy: 0.9770
Epoch 18/20
55000/55000 [==============================] - 5s 85us/sample - loss: 0.0076 - accuracy: 0.9977 - val_loss: 0.1057 - val_accuracy: 0.9760
Epoch 19/20
55000/55000 [==============================] - 5s 83us/sample - loss: 0.0063 - accuracy: 0.9980 - val_loss: 0.1108 - val_accuracy: 0.9774
Epoch 20/20
55000/55000 [==============================] - 5s 85us/sample - loss: 0.0066 - accuracy: 0.9980 - val_loss: 0.1056 - val_accuracy: 0.9768
And when I evaluate the loss is too high:
model.evaluate(xTest, yTest)
Result:
10000/10000 [==============================] - 0s 41us/sample - loss: 25.7150 - accuracy: 0.9740
[25.714989705941953, 0.974]
Is this ok, or is it a sign of overfitting? Should I do something to improve it? Thanks in advance.

Usually, it is not OK. You want the loss to be as small as possible. Your result is typical of overfitting: your network 'knows' its training data but isn't capable of analysing new images. You may want to add some layers, such as convolutional layers or a Dropout layer. Another idea would be to augment your training images; the ImageDataGenerator class provided by Keras might help you here.
Another thing to look at is your hyperparameters. Why do you use 100 nodes in the first dense layer? Something like 784 (28*28) may be more interesting if you want to start with a dense layer. I would suggest some combination of convolutional, dropout, and dense layers; then your dense layer may not need that many nodes.

Keras: My model loss and accuracy randomly drop to zero

I have a rather complex sequence to sequence encoder decoder model. I run into an issue where my loss and accuracy drop to zero and I can't reproduce this error. It has nothing to do with the training data as it happens with different sets.
It seems to be learning as the loss slowly drops. Below is what it is like just before:
Epoch 1/2
5000/5000 [==============================] - 235s 47ms/step - loss: 0.9825 - acc: 0.7077
Epoch 2/2
5000/5000 [==============================] - 235s 47ms/step - loss: 0.9443 - acc: 0.7177
And here is what it is like during the next model.fit() iteration:
Epoch 1/2
2882/2882 [==============================] - 136s 47ms/step - loss: 0.7033 - acc: 0.4399
Epoch 2/2
2882/2882 [==============================] - 136s 47ms/step - loss: 1.1921e-07 - acc: 0.0000e+00
After this, the loss and accuracy remain the same:
Epoch 1/2
5000/5000 [==============================] - 278s 56ms/step - loss: 1.1921e-07 - acc: 0.0000e+00
Epoch 2/2
5000/5000 [==============================] - 279s 56ms/step - loss: 1.1921e-07 - acc: 0.0000e+00
The reason I have to train in such a manner is because I have variable input sizes and output sizes. So I have to make batches of my training data with fixed input size before I train.
sgd = optimizers.SGD(lr= 0.015, decay=0.002)
out2 = model.compile(loss='categorical_crossentropy',
                     optimizer=sgd,
                     metrics=['accuracy'])
I need to use curriculum learning to reach sentence level predictions, so I am doing the following:
I initially train my model to output "1 word + end" token. Training on this works fine. When I start to train on "2 words + end", this problem starts to arise.
After training on 1 word, I save the model. Then I define a new model with output size for 2 words, and use the following:
new_model = createModel(...,num_output_words)
new_model.set_weights(old_model.get_weights())
I have to do this as I can't define a model with variable output length.
I can provide more information if needed. I can't find any information online.

How to understand loss acc val_loss val_acc in Keras model fitting

I'm new to Keras and have some questions on how to understand my model results. Here is my result (for your convenience, I only paste the loss, acc, val_loss, and val_acc after each epoch here):
Train on 4160 samples, validate on 1040 samples as below:
Epoch 1/20
4160/4160 - loss: 3.3455 - acc: 0.1560 - val_loss: 1.6047 - val_acc: 0.4721
Epoch 2/20
4160/4160 - loss: 1.7639 - acc: 0.4274 - val_loss: 0.7060 - val_acc: 0.8019
Epoch 3/20
4160/4160 - loss: 1.0887 - acc: 0.5978 - val_loss: 0.3707 - val_acc: 0.9087
Epoch 4/20
4160/4160 - loss: 0.7736 - acc: 0.7067 - val_loss: 0.2619 - val_acc: 0.9442
Epoch 5/20
4160/4160 - loss: 0.5784 - acc: 0.7690 - val_loss: 0.2058 - val_acc: 0.9433
Epoch 6/20
4160/4160 - loss: 0.5000 - acc: 0.8065 - val_loss: 0.1557 - val_acc: 0.9750
Epoch 7/20
4160/4160 - loss: 0.4179 - acc: 0.8296 - val_loss: 0.1523 - val_acc: 0.9606
Epoch 8/20
4160/4160 - loss: 0.3758 - acc: 0.8495 - val_loss: 0.1063 - val_acc: 0.9712
Epoch 9/20
4160/4160 - loss: 0.3202 - acc: 0.8740 - val_loss: 0.1019 - val_acc: 0.9798
Epoch 10/20
4160/4160 - loss: 0.3028 - acc: 0.8788 - val_loss: 0.1074 - val_acc: 0.9644
Epoch 11/20
4160/4160 - loss: 0.2696 - acc: 0.8923 - val_loss: 0.0581 - val_acc: 0.9856
Epoch 12/20
4160/4160 - loss: 0.2738 - acc: 0.8894 - val_loss: 0.0713 - val_acc: 0.9837
Epoch 13/20
4160/4160 - loss: 0.2609 - acc: 0.8913 - val_loss: 0.0679 - val_acc: 0.9740
Epoch 14/20
4160/4160 - loss: 0.2556 - acc: 0.9022 - val_loss: 0.0599 - val_acc: 0.9769
Epoch 15/20
4160/4160 - loss: 0.2384 - acc: 0.9053 - val_loss: 0.0560 - val_acc: 0.9846
Epoch 16/20
4160/4160 - loss: 0.2305 - acc: 0.9079 - val_loss: 0.0502 - val_acc: 0.9865
Epoch 17/20
4160/4160 - loss: 0.2145 - acc: 0.9185 - val_loss: 0.0461 - val_acc: 0.9913
Epoch 18/20
4160/4160 - loss: 0.2046 - acc: 0.9183 - val_loss: 0.0524 - val_acc: 0.9750
Epoch 19/20
4160/4160 - loss: 0.2055 - acc: 0.9120 - val_loss: 0.0440 - val_acc: 0.9885
Epoch 20/20
4160/4160 - loss: 0.1890 - acc: 0.9236 - val_loss: 0.0501 - val_acc: 0.9827
Here are my understandings:
The two losses (both loss and val_loss) are decreasing and the two accuracies (acc and val_acc) are increasing. So this indicates the model is being trained in a good way.
The val_acc is the measure of how good the predictions of your model are. So in my case, it looks like the model was trained pretty well after 6 epochs, and the rest of the training is not necessary.
My Questions are:
The acc (the accuracy on the training set) is always smaller, actually much smaller, than val_acc. Is this normal? Why does this happen? In my mind, acc should usually be similar to or better than val_acc.
After 20 epochs, the acc is still increasing. So should I use more epochs and stop when acc stops increasing? Or should I stop where val_acc stops increasing, regardless of the trend of acc?
Are there any other thoughts on my results?
Thanks!

Answering your questions:
As described in the official Keras FAQ:
the training loss is the average of the losses over each batch of training data. Because your model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. On the other hand, the testing loss for an epoch is computed using the model as it is at the end of the epoch, resulting in a lower loss.
Training should be stopped when val_acc stops increasing; otherwise your model will probably overfit. You can use the EarlyStopping callback to stop training.
Your model seems to achieve very good results. Keep up the good work.
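The stopping rule can be sketched without any framework. The function below is a hypothetical, simplified version of the "patience" logic; it is not Keras's actual EarlyStopping implementation (which, among other things, can also restore the best weights). The val_acc values are copied from the question's 20-epoch log.

```python
def early_stopping_epoch(val_acc_history, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once `patience` consecutive epochs pass
    without a new best val_acc.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, acc in enumerate(val_acc_history, start=1):
        if acc > best:
            best = acc
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# val_acc per epoch, copied from the log in the question.
val_acc = [
    0.4721, 0.8019, 0.9087, 0.9442, 0.9433, 0.9750, 0.9606, 0.9712, 0.9798, 0.9644,
    0.9856, 0.9837, 0.9740, 0.9769, 0.9846, 0.9865, 0.9913, 0.9750, 0.9885, 0.9827,
]

stop_patience_2 = early_stopping_epoch(val_acc, patience=2)  # stops at epoch 8
stop_patience_3 = early_stopping_epoch(val_acc, patience=3)  # stops at epoch 14
stop_patience_5 = early_stopping_epoch(val_acc, patience=5)  # never triggers (None)
```

Because val_acc is noisy from epoch to epoch, a small patience stops well before the best epoch (17) in this log, so the patience value is a judgment call rather than a fixed rule.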

What are loss and val_loss?
In deep learning, the loss is the value that a neural network is trying to minimize: it's the distance between the ground truth and the predictions. In order to minimize this distance, the neural network learns by adjusting weights and biases in a manner that reduces the loss.
For instance, in regression tasks, you have a continuous target, e.g., height. What you want to minimize is the difference between your predictions, and the actual height. You can use mean_absolute_error as loss so the neural network knows this is what it needs to minimize.
In classification, it's a little more complicated, but very similar. Predicted classes are based on probability, so the loss is also based on probability. The neural network is penalized for assigning a low probability to the actual class, and training minimizes that penalty. The loss is typically categorical_crossentropy.
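As a worked example of that penalty, the sketch below computes categorical cross-entropy for a single sample by hand. It assumes one-hot labels, in which case the loss reduces to the negative log of the probability given to the true class. The probability vectors are made up for illustration, and the function is a hand-written formula, not the Keras implementation.

```python
import math

def categorical_crossentropy(true_class, probs):
    """Cross-entropy for one sample with a one-hot label: -log(p[true_class])."""
    return -math.log(probs[true_class])

# The true class is index 0 in all three hypothetical predictions.
confident_right = categorical_crossentropy(0, [0.90, 0.05, 0.05])
unsure = categorical_crossentropy(0, [0.40, 0.30, 0.30])
confident_wrong = categorical_crossentropy(0, [0.01, 0.98, 0.01])
```

The loss grows as the probability assigned to the true class shrinks, so a confident wrong prediction costs far more than an unsure one, even though both may count the same (or differently) in terms of accuracy.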
loss and val_loss differ because the former is computed on the training set and the latter on the validation set. As such, the latter is a good indication of how the model performs on unseen data. You can get a validation set by using validation_data=(x_val, y_val) or validation_split=0.2.
It's best to rely on the val_loss to prevent overfitting. Overfitting is when the model fits the training data too closely: the loss keeps decreasing while the val_loss stagnates or increases.
In Keras, you can use EarlyStopping to stop training when the val_loss stops decreasing.
Read more about deep learning losses here: Loss and Loss Functions for Training Deep Learning Neural Networks.
What are acc and val_acc?
Accuracy is a metric only for classification. It makes no sense on a task with a continuous target. It gives the percentage of instances that are correctly classified.
Once again, acc is on the training data, and val_acc is on the validation data. It's best to rely on val_acc for a fair representation of model performance, because a sufficiently large neural network can end up fitting the training data at close to 100% while still performing poorly on unseen data.
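For reference, accuracy itself is simple to compute by hand. The sketch below uses made-up labels and predictions for a 3-class problem; in Keras the predicted class would typically be the index of the largest output probability, which is assumed to have been done already here.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical class labels and predictions.
labels = [0, 1, 2, 1, 0]
predictions = [0, 1, 1, 1, 0]

acc = accuracy(labels, predictions)  # 4 of 5 correct -> 0.8
```

Running the same function on training predictions gives acc, and on validation predictions gives val_acc; only the data differs.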

What does the standard Keras model output mean? What is epoch and loss in Keras?

I have just built my first model using Keras and this is the output. It looks like the standard output you get after building any Keras artificial neural network. Even after looking in the documentation, I do not fully understand what the epoch is and what the loss is which is printed in the output.
What is epoch and loss in Keras?
(I know it's probably an extremely basic question, but I couldn't seem to locate the answer online, and if the answer is really that hard to glean from the documentation I thought others would have the same question and thus decided to post it here.)
Epoch 1/20
1213/1213 [==============================] - 0s - loss: 0.1760
Epoch 2/20
1213/1213 [==============================] - 0s - loss: 0.1840
Epoch 3/20
1213/1213 [==============================] - 0s - loss: 0.1816
Epoch 4/20
1213/1213 [==============================] - 0s - loss: 0.1915
Epoch 5/20
1213/1213 [==============================] - 0s - loss: 0.1928
Epoch 6/20
1213/1213 [==============================] - 0s - loss: 0.1964
Epoch 7/20
1213/1213 [==============================] - 0s - loss: 0.1948
Epoch 8/20
1213/1213 [==============================] - 0s - loss: 0.1971
Epoch 9/20
1213/1213 [==============================] - 0s - loss: 0.1899
Epoch 10/20
1213/1213 [==============================] - 0s - loss: 0.1957
Epoch 11/20
1213/1213 [==============================] - 0s - loss: 0.1923
Epoch 12/20
1213/1213 [==============================] - 0s - loss: 0.1910
Epoch 13/20
1213/1213 [==============================] - 0s - loss: 0.2104
Epoch 14/20
1213/1213 [==============================] - 0s - loss: 0.1976
Epoch 15/20
1213/1213 [==============================] - 0s - loss: 0.1979
Epoch 16/20
1213/1213 [==============================] - 0s - loss: 0.2036
Epoch 17/20
1213/1213 [==============================] - 0s - loss: 0.2019
Epoch 18/20
1213/1213 [==============================] - 0s - loss: 0.1978
Epoch 19/20
1213/1213 [==============================] - 0s - loss: 0.1954
Epoch 20/20
1213/1213 [==============================] - 0s - loss: 0.1949

Just to answer the questions more specifically, here's a definition of epoch and loss:
Epoch: A full pass over all of your training data.
For example, in your output above, you have 1213 observations. So an epoch concludes when it has finished a training pass over all 1213 of your observations.
Loss: A scalar value that we attempt to minimize during our training of the model. The lower the loss, the closer our predictions are to the true labels.
This is usually Mean Squared Error (MSE), as David Maust said above, or often in Keras, categorical cross-entropy.
What you'd expect to see from running fit on your Keras model, is a decrease in loss over n number of epochs. Your training run is rather abnormal, as your loss is actually increasing. This could be due to a learning rate that is too large, which is causing you to overshoot optima.
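The overshooting effect can be reproduced on a toy problem. The sketch below runs plain gradient descent on loss(w) = w**2, which is an assumption chosen for illustration and has nothing to do with the asker's actual model; the two learning rates are likewise arbitrary.

```python
def gradient_descent(lr, steps=10, w=1.0):
    """Minimise loss(w) = w**2 with plain gradient descent.

    Returns the loss after each step.
    """
    losses = []
    for _ in range(steps):
        grad = 2.0 * w        # derivative of w**2
        w = w - lr * grad     # each step multiplies w by (1 - 2 * lr)
        losses.append(w * w)
    return losses

small_lr_losses = gradient_descent(lr=0.1)  # w shrinks by a factor of 0.8 per step
large_lr_losses = gradient_descent(lr=1.1)  # w flips sign and grows by 1.2 per step
```

With the small learning rate the loss falls every step; with the large one each update jumps past the minimum and lands farther away, so the loss rises from epoch to epoch, much like the log in the question.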
As jaycode mentioned, you will want to look at your model's performance on unseen data, as this is the general use case of Machine Learning.
As such, you should include a list of metrics in your compile method, which could look like:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
As well as run your model on validation during the fit method, such as:
model.fit(data, labels, validation_split=0.2)
There's a lot more to explain, but hopefully this gets you started.

One epoch ends when your model has run all of the training data through the network once, updating the weights along the way to move toward the optimal loss value. For the loss, smaller is better. In your case, since the loss is higher in later epochs, it "seems" the model is better in the first epoch.
I say "seems" because we can't tell for sure yet: the model has not been tested with a proper cross-validation method, i.e. it is evaluated only against its training data.
Ways to improve your model:
Use cross-validation with your Keras model to find out how it actually performs: does it generalize well when predicting new data it has never seen before?
Adjust your learning rate, the structure of the network, the number of hidden units / layers, and the init, optimizer, and activation parameters used in your model, among myriad other things.
Combining sklearn's GridSearchCV with Keras can automate this process.
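To show what cross-validation does to the data, here is a minimal, framework-free sketch of a k-fold split over the 1213 observations from the question. The function name is hypothetical and the split is unshuffled for simplicity; in practice a library utility such as scikit-learn's KFold would normally be used instead.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for each of k roughly equal folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

folds = list(k_fold_indices(1213, 5))
val_sizes = [len(val) for _, val in folds]  # [243, 243, 243, 242, 242]
```

A fresh model would be trained on each train split and scored on the matching validation split; averaging the k scores gives an estimate of performance on unseen data that does not depend on a single lucky or unlucky split.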
