Why is my callback not being invoked in TensorFlow? - python

Below is my TensorFlow and Python code, which should end training when accuracy reaches 99% using the callback function. But the callback is not being invoked. Where is the problem?
def train_mnist():
    class myCallback(tf.keras.callbacks.Callback):
        def on_epoc_end(self, epoch, logs={}):
            if (logs.get('accuracy') > 0.99):
                print("Reached 99% accuracy so cancelling training!")
                self.model.stop_training = True

    mnist = tf.keras.datasets.mnist
    (x_train, y_train), (x_test, y_test) = mnist.load_data(path=path)
    x_train = x_train / 255.0
    x_test = x_test / 255.0
    callbacks = myCallback()
    model = tf.keras.models.Sequential([
        # YOUR CODE SHOULD START HERE
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(256, activation=tf.nn.relu),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    # model fitting
    history = model.fit(x_train, y_train, epochs=10, callbacks=[callbacks])
    # model fitting
    return history.epoch, history.history['acc'][-1]

You're misspelling epoch: on_epoc_end should be on_epoch_end. Also, you should return history.history['accuracy'], not 'acc'.
from tensorflow.keras.layers import Input, Dense, Add, Activation, Flatten
from tensorflow.keras.models import Model, Sequential
import tensorflow as tf
import numpy as np
import random
from tensorflow.python.keras.layers import Input, GaussianNoise, BatchNormalization

def train_mnist():
    class myCallback(tf.keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs={}):
            print(logs.get('accuracy'))
            if (logs.get('accuracy') > 0.9):
                print("Reached 90% accuracy so cancelling training!")
                self.model.stop_training = True

    mnist = tf.keras.datasets.mnist
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train = x_train / 255.0
    x_test = x_test / 255.0
    callbacks = myCallback()
    model = tf.keras.models.Sequential([
        # YOUR CODE SHOULD START HERE
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(256, activation=tf.nn.relu),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    # model fitting
    history = model.fit(x_train, y_train, epochs=10, callbacks=[callbacks])
    # model fitting
    return history.epoch, history.history['accuracy'][-1]

train_mnist()
Epoch 1/10
1859/1875 [============================>.] - ETA: 0s - loss: 0.2273 - accuracy: 0.93580.93586665391922
Reached 90% accuracy so cancelling training!
1875/1875 [==============================] - 3s 2ms/step - loss: 0.2265 - accuracy: 0.9359
([0], 0.93586665391922)

Unfortunately I don't have enough reputation to comment on one of the answers above, but I wanted to point out that on_epoch_end is invoked directly by TensorFlow when an epoch ends. Here we only implement it inside a custom Python class; the underlying framework calls it automatically. I'm working from the TensorFlow in Practice deeplearning.ai week 2 course on Coursera, which appears to be the same place the callback above comes from.
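For illustration, here is a minimal sketch (my own reconstruction, not the exact course code) of a callback whose on_epoch_end does nothing but print; the "Inside callback" text in the run below was presumably produced by a hook along these lines, invoked by Keras itself at the end of every epoch:
import tensorflow as tf

class ProofCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # Keras calls this hook automatically at the end of each epoch;
        # we never call it ourselves.
        print("Inside callback")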
Here's some proof from my most recent run:
Epoch 1/20
59968/60000 [============================>.] - ETA: 0s - loss: 1.0648 - acc: 0.9491Inside callback
60000/60000 [==============================] - 34s 575us/sample - loss: 1.0645 - acc: 0.9491
Epoch 2/20
59968/60000 [============================>.] - ETA: 0s - loss: 0.0560 - acc: 0.9825Inside callback
60000/60000 [==============================] - 35s 583us/sample - loss: 0.0560 - acc: 0.9825
Epoch 3/20
59840/60000 [============================>.] - ETA: 0s - loss: 0.0457 - acc: 0.9861Inside callback
60000/60000 [==============================] - 31s 512us/sample - loss: 0.0457 - acc: 0.9861
Epoch 4/20
59840/60000 [============================>.] - ETA: 0s - loss: 0.0428 - acc: 0.9873Inside callback
60000/60000 [==============================] - 32s 528us/sample - loss: 0.0428 - acc: 0.9873
Epoch 5/20
59808/60000 [============================>.] - ETA: 0s - loss: 0.0314 - acc: 0.9909Inside callback
60000/60000 [==============================] - 30s 507us/sample - loss: 0.0315 - acc: 0.9909
Epoch 6/20
59840/60000 [============================>.] - ETA: 0s - loss: 0.0271 - acc: 0.9924Inside callback
60000/60000 [==============================] - 32s 532us/sample - loss: 0.0270 - acc: 0.9924
Epoch 7/20
59968/60000 [============================>.] - ETA: 0s - loss: 0.0238 - acc: 0.9938Inside callback
60000/60000 [==============================] - 33s 555us/sample - loss: 0.0238 - acc: 0.9938
Epoch 8/20
59936/60000 [============================>.] - ETA: 0s - loss: 0.0255 - acc: 0.9934Inside callback
60000/60000 [==============================] - 33s 550us/sample - loss: 0.0255 - acc: 0.9934
Epoch 9/20
59872/60000 [============================>.] - ETA: 0s - loss: 0.0195 - acc: 0.9953Inside callback
60000/60000 [==============================] - 33s 557us/sample - loss: 0.0194 - acc: 0.9953
Epoch 10/20
59744/60000 [============================>.] - ETA: 0s - loss: 0.0186 - acc: 0.9959Inside callback
60000/60000 [==============================] - 33s 551us/sample - loss: 0.0185 - acc: 0.9959
Epoch 11/20
59968/60000 [============================>.] - ETA: 0s - loss: 0.0219 - acc: 0.9954Inside callback
60000/60000 [==============================] - 32s 530us/sample - loss: 0.0219 - acc: 0.9954
Epoch 12/20
59936/60000 [============================>.] - ETA: 0s - loss: 0.0208 - acc: 0.9960Inside callback
60000/60000 [==============================] - 33s 558us/sample - loss: 0.0208 - acc: 0.9960
Epoch 13/20
59872/60000 [============================>.] - ETA: 0s - loss: 0.0185 - acc: 0.9968Inside callback
60000/60000 [==============================] - 31s 520us/sample - loss: 0.0184 - acc: 0.9968
Epoch 14/20
59872/60000 [============================>.] - ETA: 0s - loss: 0.0181 - acc: 0.9970Inside callback
60000/60000 [==============================] - 35s 587us/sample - loss: 0.0181 - acc: 0.9970
Epoch 15/20
59936/60000 [============================>.] - ETA: 0s - loss: 0.0193 - acc: 0.9971Inside callback
60000/60000 [==============================] - 33s 555us/sample - loss: 0.0192 - acc: 0.9972
Epoch 16/20
59968/60000 [============================>.] - ETA: 0s - loss: 0.0176 - acc: 0.9972Inside callback
60000/60000 [==============================] - 33s 558us/sample - loss: 0.0176 - acc: 0.9972
Epoch 17/20
59968/60000 [============================>.] - ETA: 0s - loss: 0.0183 - acc: 0.9974Inside callback
60000/60000 [==============================] - 33s 555us/sample - loss: 0.0182 - acc: 0.9974
Epoch 18/20
59872/60000 [============================>.] - ETA: 0s - loss: 0.0225 - acc: 0.9970Inside callback
60000/60000 [==============================] - 34s 570us/sample - loss: 0.0224 - acc: 0.9970
Epoch 19/20
59808/60000 [============================>.] - ETA: 0s - loss: 0.0185 - acc: 0.9975Inside callback
60000/60000 [==============================] - 33s 548us/sample - loss: 0.0185 - acc: 0.9975
Epoch 20/20
59776/60000 [============================>.] - ETA: 0s - loss: 0.0150 - acc: 0.9979Inside callback
60000/60000 [==============================] - 34s 565us/sample - loss: 0.0149 - acc: 0.9979
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-25-1ff3c304aec3> in <module>
----> 1 _, _ = train_mnist_conv()
<ipython-input-24-b469df35dac0> in train_mnist_conv()
38 )
39 # model fitting
---> 40 return history.epoch, history.history['accuracy'][-1]
41
KeyError: 'accuracy'
The KeyError occurs because the history object does not have the key 'accuracy' (older TensorFlow versions record the metric as 'acc'), so I wanted to address that as a source of concern before continuing.
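One defensive way to avoid that KeyError regardless of TensorFlow version (this snippet is my own addition, not from the course notebook) is to check which key the history dictionary actually contains:
# inside train_mnist_conv(), replace the final return with:
# history.history is a plain dict; older TF versions use 'acc', newer use 'accuracy'
print(history.history.keys())
final_acc = history.history.get('accuracy', history.history.get('acc'))[-1]
return history.epoch, final_acc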

Related

Keras Dense Neural Network Accuracy stuck at 0.5

My accuracy is stuck at 0.5. I have already tried varying different parameters, such as the learning rate, optimizer, and loss function, but the accuracy always stays the same. Any ideas how to fix this? This is my code:
import tensorflow as tf
import numpy as np

imdb = tf.keras.datasets.imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

def vectorize_sequences(sequences, dimension=10000):
    result = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        result[i, sequence] = 1.
    return result

x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)
y_train = np.asarray(train_labels).astype('float32')
y_test = np.asarray(test_labels).astype('float32')

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(10000,)),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
              loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10)
And this is my output:
Epoch 1/10
782/782 [==============================] - 3s 3ms/step - loss: 0.6931 - accuracy: 0.5047
Epoch 2/10
782/782 [==============================] - 3s 3ms/step - loss: 0.6932 - accuracy: 0.5000
Epoch 3/10
782/782 [==============================] - 3s 3ms/step - loss: 0.6931 - accuracy: 0.5000
Epoch 4/10
782/782 [==============================] - 3s 4ms/step - loss: 0.6931 - accuracy: 0.4982
Epoch 5/10
782/782 [==============================] - 3s 3ms/step - loss: 0.6931 - accuracy: 0.4993
Epoch 6/10
782/782 [==============================] - 3s 3ms/step - loss: 0.6931 - accuracy: 0.4980
Epoch 7/10
782/782 [==============================] - 3s 3ms/step - loss: 0.6931 - accuracy: 0.5000
Epoch 8/10
782/782 [==============================] - 3s 4ms/step - loss: 0.6931 - accuracy: 0.4979
Epoch 9/10
782/782 [==============================] - 3s 3ms/step - loss: 0.6931 - accuracy: 0.4941
Epoch 10/10
782/782 [==============================] - 3s 3ms/step - loss: 0.6931 - accuracy: 0.4988

Same training errors in all nodes when using tensorflow MultiWorkerMirroredStrategy

I am trying to train a model on three machines using TensorFlow's MultiWorkerMirroredStrategy. The script is based on the TensorFlow tutorial Multi-worker training with Keras (https://www.tensorflow.org/tutorials/distribute/multi_worker_with_keras#dataset_sharding_and_batch_size):
import tensorflow_datasets as tfds
import tensorflow as tf
tfds.disable_progress_bar()
import os
import json

strategy = tf.distribute.MultiWorkerMirroredStrategy()
BUFFER_SIZE = 10000
BATCH_SIZE = 64

def make_datasets_unbatched():
    # scale MNIST data from (0, 255] to (0., 1.]
    def scale(image, label):
        image = tf.cast(image, tf.float32)
        image /= 255
        return image, label
    # data download to /home/pzs/tensorflow_datasets/mnist/
    datasets, info = tfds.load(name='mnist',
                               with_info=True,
                               as_supervised=True)
    return datasets['train'].map(scale).cache().shuffle(BUFFER_SIZE)

def build_and_compile_cnn_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(10)
    ])
    model.compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
        metrics=['accuracy'])
    return model

NUM_WORKERS = 3
GLOBAL_BATCH_SIZE = 64 * NUM_WORKERS
train_datasets = make_datasets_unbatched().batch(GLOBAL_BATCH_SIZE)
options = tf.data.Options()
options.experimental_distribute.auto_shard_policy = tf.data.experimental.AutoShardPolicy.OFF
train_datasets = make_datasets_unbatched().batch(BATCH_SIZE)
train_datasets = train_datasets.with_options(options)

with strategy.scope():
    multi_worker_model = build_and_compile_cnn_model()

multi_worker_model.fit(x=train_datasets, epochs=30, steps_per_epoch=5)
I run this script separately on the three nodes:
on node 1:
TF_CONFIG='{"cluster": {"worker": ["192.168.4.36:12346", "192.168.4.83:12346", "192.168.4.83:12346"]}, "task": {"index": 0, "type": "worker"}}' python3 multi_worker_with_keras.py
on node 2:
TF_CONFIG='{"cluster": {"worker": ["192.168.4.36:12346", "192.168.4.83:12346", "192.168.4.83:12346"]}, "task": {"index": 1, "type": "worker"}}' python3 multi_worker_with_keras.py
on node 3:
TF_CONFIG='{"cluster": {"worker": ["192.168.4.36:12346", "192.168.4.83:12346", "192.168.4.83:12346"]}, "task": {"index": 2, "type": "worker"}}' python3 multi_worker_with_keras.py
and the results of training error and accuracy are:
Epoch 1/30
2022-02-16 11:52:25.060362: I tensorflow/stream_executor/cuda/cuda_dnn.cc:366] Loaded cuDNN version 8201
5/5 [==============================] - 7s 195ms/step - loss: 2.3010 - accuracy: 0.0719
Epoch 2/30
5/5 [==============================] - 1s 181ms/step - loss: 2.2984 - accuracy: 0.0688
Epoch 3/30
5/5 [==============================] - 1s 182ms/step - loss: 2.2993 - accuracy: 0.0781
Epoch 4/30
5/5 [==============================] - 1s 182ms/step - loss: 2.2917 - accuracy: 0.0594
Epoch 5/30
5/5 [==============================] - 1s 182ms/step - loss: 2.2987 - accuracy: 0.0969
Epoch 6/30
5/5 [==============================] - 1s 183ms/step - loss: 2.2992 - accuracy: 0.0906
Epoch 7/30
5/5 [==============================] - 1s 181ms/step - loss: 2.2978 - accuracy: 0.1000
Epoch 8/30
5/5 [==============================] - 1s 183ms/step - loss: 2.2887 - accuracy: 0.0969
Epoch 9/30
5/5 [==============================] - 1s 182ms/step - loss: 2.2887 - accuracy: 0.0969
Epoch 10/30
5/5 [==============================] - 1s 183ms/step - loss: 2.2930 - accuracy: 0.0844
Epoch 11/30
5/5 [==============================] - 1s 184ms/step - loss: 2.2905 - accuracy: 0.1000
Epoch 12/30
5/5 [==============================] - 1s 184ms/step - loss: 2.2884 - accuracy: 0.0812
Epoch 13/30
5/5 [==============================] - 1s 186ms/step - loss: 2.2837 - accuracy: 0.1250
Epoch 14/30
5/5 [==============================] - 1s 189ms/step - loss: 2.2842 - accuracy: 0.1094
Epoch 15/30
5/5 [==============================] - 1s 190ms/step - loss: 2.2856 - accuracy: 0.0750
Epoch 16/30
5/5 [==============================] - 1s 192ms/step - loss: 2.2911 - accuracy: 0.0719
Epoch 17/30
5/5 [==============================] - 1s 188ms/step - loss: 2.2805 - accuracy: 0.1031
Epoch 18/30
5/5 [==============================] - 1s 187ms/step - loss: 2.2800 - accuracy: 0.1219
Epoch 19/30
5/5 [==============================] - 1s 190ms/step - loss: 2.2799 - accuracy: 0.1063
Epoch 20/30
5/5 [==============================] - 1s 192ms/step - loss: 2.2769 - accuracy: 0.1187
Epoch 21/30
5/5 [==============================] - 1s 193ms/step - loss: 2.2768 - accuracy: 0.1344
Epoch 22/30
5/5 [==============================] - 1s 190ms/step - loss: 2.2754 - accuracy: 0.1187
Epoch 23/30
5/5 [==============================] - 1s 190ms/step - loss: 2.2821 - accuracy: 0.1187
Epoch 24/30
5/5 [==============================] - 1s 188ms/step - loss: 2.2832 - accuracy: 0.0844
Epoch 25/30
5/5 [==============================] - 1s 190ms/step - loss: 2.2793 - accuracy: 0.1125
Epoch 26/30
5/5 [==============================] - 1s 191ms/step - loss: 2.2762 - accuracy: 0.1406
Epoch 27/30
5/5 [==============================] - 1s 194ms/step - loss: 2.2696 - accuracy: 0.1344
Epoch 28/30
5/5 [==============================] - 1s 192ms/step - loss: 2.2717 - accuracy: 0.1406
Epoch 29/30
5/5 [==============================] - 1s 191ms/step - loss: 2.2680 - accuracy: 0.1500
Epoch 30/30
5/5 [==============================] - 1s 193ms/step - loss: 2.2696 - accuracy: 0.1500
All results are exactly the same on the 3 nodes.
My question is:
When using tf.distribute.MultiWorkerMirroredStrategy to train a model across multiple machines, each process does forward and backward propagation independently on a different slice of a batch of training data, so why are the training errors exactly the same for the corresponding epoch on all 3 nodes? I tried running a different script and found the same behaviour.
This is expected. The metric values are allreduced (aggregated across workers) inside the fit method.
https://github.com/tensorflow/tensorflow/issues/39343#issuecomment-627008557
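If you want to see which worker each line comes from, a small sketch like the following (my own addition, not part of the tutorial) reads the worker index from TF_CONFIG and tags the per-epoch logs; every worker should print identical allreduced loss/accuracy values:
import os
import json
import tensorflow as tf

# read this worker's index from the TF_CONFIG environment variable
task = json.loads(os.environ.get('TF_CONFIG', '{}')).get('task', {})
worker_index = task.get('index', 0)

class WorkerLogCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # logs already holds the allreduced metrics, so all workers print the same numbers
        print(f"worker {worker_index} epoch {epoch}: {logs}")

# multi_worker_model.fit(x=train_datasets, epochs=30, steps_per_epoch=5,
#                        callbacks=[WorkerLogCallback()])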

Validation Accuracy not improving CNN

I am fairly new to deep learning and right now am trying to predict consumer choices based on EEG data. The total dataset consists of 1045 EEG recordings, each with a corresponding label indicating Like or Dislike for a product. Classes are distributed as 44% Likes and 56% Dislikes. I read that convolutional neural networks are suitable for working with raw EEG data, so I tried to implement a network in Keras with the following structure:
from sklearn.model_selection import train_test_split
import numpy as np

X_train, X_test, y_train, y_test = train_test_split(full_data, target, test_size=0.20, random_state=42)
y_train = np.asarray(y_train).astype('float32').reshape((-1, 1))
y_test = np.asarray(y_test).astype('float32').reshape((-1, 1))
# X_train.shape = ((836, 512, 14))
# y_train.shape = ((836, 1))

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense
from keras.optimizers import Adam
from keras.optimizers import SGD

model = Sequential()
model.add(Conv1D(16, kernel_size=3, activation="relu", input_shape=(512, 14)))
model.add(MaxPooling1D())
model.add(Conv1D(8, kernel_size=3, activation="relu"))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer=Adam(lr=0.001), loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=20, batch_size=64)
When I fit the model, however, the validation accuracy does not change at all, producing the following output:
Epoch 1/20
14/14 [==============================] - 0s 32ms/step - loss: 292.6353 - accuracy: 0.5383 - val_loss: 0.7884 - val_accuracy: 0.5407
Epoch 2/20
14/14 [==============================] - 0s 7ms/step - loss: 1.3748 - accuracy: 0.5598 - val_loss: 0.8860 - val_accuracy: 0.5502
Epoch 3/20
14/14 [==============================] - 0s 6ms/step - loss: 1.0537 - accuracy: 0.5598 - val_loss: 0.7629 - val_accuracy: 0.5455
Epoch 4/20
14/14 [==============================] - 0s 6ms/step - loss: 0.8827 - accuracy: 0.5598 - val_loss: 0.7010 - val_accuracy: 0.5455
Epoch 5/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7988 - accuracy: 0.5598 - val_loss: 0.8689 - val_accuracy: 0.5407
Epoch 6/20
14/14 [==============================] - 0s 6ms/step - loss: 1.0221 - accuracy: 0.5610 - val_loss: 0.6961 - val_accuracy: 0.5455
Epoch 7/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7415 - accuracy: 0.5598 - val_loss: 0.6945 - val_accuracy: 0.5455
Epoch 8/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7381 - accuracy: 0.5574 - val_loss: 0.7761 - val_accuracy: 0.5455
Epoch 9/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7326 - accuracy: 0.5598 - val_loss: 0.6926 - val_accuracy: 0.5455
Epoch 10/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7338 - accuracy: 0.5598 - val_loss: 0.6917 - val_accuracy: 0.5455
Epoch 11/20
14/14 [==============================] - 0s 7ms/step - loss: 0.7203 - accuracy: 0.5610 - val_loss: 0.6916 - val_accuracy: 0.5455
Epoch 12/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7192 - accuracy: 0.5610 - val_loss: 0.6914 - val_accuracy: 0.5455
Epoch 13/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7174 - accuracy: 0.5610 - val_loss: 0.6912 - val_accuracy: 0.5455
Epoch 14/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7155 - accuracy: 0.5610 - val_loss: 0.6911 - val_accuracy: 0.5455
Epoch 15/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7143 - accuracy: 0.5610 - val_loss: 0.6910 - val_accuracy: 0.5455
Epoch 16/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7129 - accuracy: 0.5610 - val_loss: 0.6909 - val_accuracy: 0.5455
Epoch 17/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7114 - accuracy: 0.5610 - val_loss: 0.6907 - val_accuracy: 0.5455
Epoch 18/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7103 - accuracy: 0.5610 - val_loss: 0.6906 - val_accuracy: 0.5455
Epoch 19/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7088 - accuracy: 0.5610 - val_loss: 0.6906 - val_accuracy: 0.5455
Epoch 20/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7075 - accuracy: 0.5610 - val_loss: 0.6905 - val_accuracy: 0.5455
Thanks in advance for any insights!
The phenomenon you are running into is called underfitting. It happens when the amount or quality of your training data is insufficient, or when your network architecture is too small and not capable of learning the problem.
Try normalizing your input data and experimenting with different network architectures, learning rates, and activation functions.
As @Muhammad Shahzad stated in his comment, adding some Dense layers after flattening would be a concrete architecture adaptation to try.
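As a concrete sketch of that suggestion (the layer sizes here are my own guesses, not a tuned architecture), you could insert a couple of Dense layers between Flatten and the sigmoid output:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense
from keras.optimizers import Adam

model = Sequential()
model.add(Conv1D(16, kernel_size=3, activation="relu", input_shape=(512, 14)))
model.add(MaxPooling1D())
model.add(Conv1D(8, kernel_size=3, activation="relu"))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(64, activation="relu"))  # extra Dense layers after flattening
model.add(Dense(32, activation="relu"))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer=Adam(lr=0.001),
              loss='binary_crossentropy', metrics=['accuracy'])
Normalizing the EEG channels (for example, z-scoring X_train per channel) before fitting, as suggested above, is worth combining with this.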
You can also increase the number of epochs, and you should increase the size of the data set. You can also use:
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    vertical_flip=True,
    channel_shift_range=0.2,
    fill_mode='nearest'
)
to feed the model more (augmented) data, which I hope will increase the validation accuracy.

Tensorflow linear model returns nan

I have a very simple model where I try to predict the value of the expression 2x - 2.
It works well, but here is my question.
So far I have trained it on just 20 values (-10 to 10), and it works fine. What I don't understand is that when I train it on more values, say (-10 to 25), my prediction returns [[nan]]. Even the model weights are [<tf.Variable 'dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[nan]], dtype=float32)>, <tf.Variable 'dense/bias:0' shape=(1,) dtype=float32, numpy=array([nan], dtype=float32)>].
Why does adding more training data result in nan?
import tensorflow as tf
import numpy as np
from tensorflow import keras

def gen_vals(x):
    return x*2 - 2

model = tf.keras.Sequential([
    keras.layers.InputLayer(input_shape=(1,)),
    keras.layers.Dense(units=1)
])
model.compile(optimizer='sgd', loss='mean_squared_error', metrics=['accuracy'])
xs = []
ys = []
for x in range(-10, 10):
    xs.append(x)
    ys.append(gen_vals(x))
xs = np.array(xs, dtype=float)
ys = np.array(ys, dtype=float)
model.fit(xs, ys, epochs=500)
print(model.predict([20]))
So I checked your code and the problem is in your loss function. You are using mean_squared_error; because of this, your loss grows until it reaches infinity.
Epoch 1/15
7/7 [==============================] - 0s 1ms/step - loss: 22108.5449 - accuracy: 0.0000e+00
Epoch 2/15
7/7 [==============================] - 0s 1ms/step - loss: 2046332.6250 - accuracy: 0.0286
Epoch 3/15
7/7 [==============================] - 0s 1ms/step - loss: 18862860288.0000 - accuracy: 0.0000e+00
Epoch 4/15
7/7 [==============================] - 0s 1ms/step - loss: 8550264864768.0000 - accuracy: 0.0286
Epoch 5/15
7/7 [==============================] - 0s 1ms/step - loss: 24012283831123968.0000 - accuracy: 0.0000e+00
Epoch 6/15
7/7 [==============================] - 0s 1ms/step - loss: 22680820415763316736.0000 - accuracy: 0.0000e+00
Epoch 7/15
7/7 [==============================] - 0s 1ms/step - loss: 1655609635839244500992.0000 - accuracy: 0.0000e+00
Epoch 8/15
7/7 [==============================] - 0s 1ms/step - loss: 611697420191128514199552.0000 - accuracy: 0.0000e+00
Epoch 9/15
7/7 [==============================] - 0s 1ms/step - loss: 229219278753403035799519232.0000 - accuracy: 0.0286
Epoch 10/15
7/7 [==============================] - 0s 1ms/step - loss: 2146224141449145393293494845440.0000 - accuracy: 0.0000e+00
Epoch 11/15
7/7 [==============================] - 0s 1ms/step - loss: 1169213631609383639522618269237248.0000 - accuracy: 0.0000e+00
Epoch 12/15
7/7 [==============================] - 0s 1ms/step - loss: 1042864695227246165669313090114551808.0000 - accuracy: 0.0000e+00
Epoch 13/15
7/7 [==============================] - 0s 1ms/step - loss: inf - accuracy: 0.0286
Epoch 14/15
7/7 [==============================] - 0s 3ms/step - loss: inf - accuracy: 0.0286
Epoch 15/15
7/7 [==============================] - 0s 1ms/step - loss: inf - accuracy: 0.0286
Since the MSE loss function squares the error, with a toy dataset like yours the loss can blow up to inf, as happened in your case.
I suggest using MAE (mean absolute error) for your toy example and toy network.
I checked, and the network then produces decent results:
import tensorflow as tf
import numpy as np
from tensorflow import keras

def gen_vals(x):
    return x*2 - 2

model = tf.keras.Sequential([
    keras.layers.InputLayer(input_shape=(1,)),
    keras.layers.Dense(units=1)
])
model.compile(optimizer='sgd', loss='mae', metrics=['accuracy'])
xs = []
ys = []
for x in range(-10, 25):
    xs.append(x)
    ys.append(gen_vals(x))
xs = np.array(xs, dtype=float)
ys = np.array(ys, dtype=float)
model.fit(xs, ys, epochs=15)
print(model.predict([20]))
Epoch 1/15
7/7 [==============================] - 0s 1ms/step - loss: 14.5341 - accuracy: 0.0000e+00
Epoch 2/15
7/7 [==============================] - 0s 2ms/step - loss: 7.5144 - accuracy: 0.0000e+00
Epoch 3/15
7/7 [==============================] - 0s 2ms/step - loss: 2.0986 - accuracy: 0.0000e+00
Epoch 4/15
7/7 [==============================] - 0s 1ms/step - loss: 1.4349 - accuracy: 0.0000e+00
Epoch 5/15
7/7 [==============================] - 0s 1ms/step - loss: 1.3424 - accuracy: 0.0000e+00
Epoch 6/15
7/7 [==============================] - 0s 1ms/step - loss: 1.5290 - accuracy: 0.0000e+00
Epoch 7/15
7/7 [==============================] - 0s 1ms/step - loss: 1.4349 - accuracy: 0.0000e+00
Epoch 8/15
7/7 [==============================] - 0s 1ms/step - loss: 1.2839 - accuracy: 0.0000e+00
Epoch 9/15
7/7 [==============================] - 0s 1ms/step - loss: 1.4003 - accuracy: 0.0000e+00
Epoch 10/15
7/7 [==============================] - 0s 1ms/step - loss: 1.4593 - accuracy: 0.0000e+00
Epoch 11/15
7/7 [==============================] - 0s 1ms/step - loss: 1.4561 - accuracy: 0.0000e+00
Epoch 12/15
7/7 [==============================] - 0s 1ms/step - loss: 1.4761 - accuracy: 0.0000e+00
Epoch 13/15
7/7 [==============================] - 0s 2ms/step - loss: 1.3080 - accuracy: 0.0000e+00
Epoch 14/15
7/7 [==============================] - 0s 1ms/step - loss: 1.1885 - accuracy: 0.0000e+00
Epoch 15/15
7/7 [==============================] - 0s 1ms/step - loss: 1.2665 - accuracy: 0.0000e+00
[[38.037006]]
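Another option, if you want to keep mean squared error (this is my own sketch, not the original poster's code): shrink the SGD learning rate so the squared-error gradients on the wider input range no longer explode. Something along these lines should converge to roughly 38 for an input of 20 (I also drop the accuracy metric, since it is not meaningful for regression):
import numpy as np
import tensorflow as tf
from tensorflow import keras

xs = np.arange(-10, 25, dtype=float)
ys = xs * 2 - 2

model = tf.keras.Sequential([
    keras.layers.InputLayer(input_shape=(1,)),
    keras.layers.Dense(units=1)
])
# a smaller step size keeps the MSE gradients from blowing up on inputs up to 24
model.compile(optimizer=keras.optimizers.SGD(learning_rate=1e-3),
              loss='mean_squared_error')
model.fit(xs, ys, epochs=500, verbose=0)
print(model.predict(np.array([[20.0]])))  # expect a value close to 38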

Why does the ETA increase so much when I define steps_per_epoch?

This is my training function:
model.fit(treinar_estados, treinar_mov, epochs=numEpochs,
          validation_data=(testar_estados, testar_mov))
which generates this:
Train on 78800 samples, validate on 33780 samples
Epoch 1/100
32/78800 [..............................] - ETA: 6:37 - loss: 4.8805 - acc: 0.0000e+00
640/78800 [..............................] - ETA: 26s - loss: 4.1140 - acc: 0.0844
1280/78800 [..............................] - ETA: 16s - loss: 3.7132 - acc: 0.1172
1920/78800 [..............................] - ETA: 12s - loss: 3.5422 - acc: 0.1354
2560/78800 [..............................] - ETA: 11s - loss: 3.4102 - acc: 0.1582
3200/78800 [>.............................] - ETA: 10s - loss: 3.3105 - acc: 0.1681
3840/78800 [>.............................] - ETA: 9s - loss: 3.2102 - acc: 0.1867
...
but when I define steps_per_epoch:
model.fit(treinar_estados, treinar_mov, epochs=numEpochs,
          validation_data=(testar_estados, testar_mov),
          steps_per_epoch=78800//32,
          validation_steps=33780//32)
this happens:
Epoch 1/100
1/2462 [..............................] - ETA: 2:53:46 - loss: 4.8079 - acc: 9.3909e-04
2/2462 [..............................] - ETA: 2:02:31 - loss: 4.7448 - acc: 0.0116
3/2462 [..............................] - ETA: 1:45:10 - loss: 4.6837 - acc: 0.0437
4/2462 [..............................] - ETA: 1:36:48 - loss: 4.6196 - acc: 0.0583
5/2462 [..............................] - ETA: 1:30:55 - loss: 4.5496 - acc: 0.0666
6/2462 [..............................] - ETA: 1:26:40 - loss: 4.4721 - acc: 0.0718
7/2462 [..............................] - ETA: 1:23:43 - loss: 4.3886 - acc: 0.0752
So I really want to understand: is this normal? If not, what could be the cause?
This is the model:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(8, 4, 4)),
    keras.layers.Dense(300, activation=tf.nn.relu),
    keras.layers.Dense(300, activation=tf.nn.relu),
    keras.layers.Dense(300, activation=tf.nn.relu),
    keras.layers.Dense(128, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
