I have been training a Keras model for 150 epochs:
tbCallBack = tensorflow.keras.callbacks.TensorBoard(log_dir='./Graph', histogram_freq=0, write_graph=True, write_images=True)
my_model.fit(X_train, X_train,
             epochs=150,
             validation_data=(X_test, X_test),
             callbacks=[tbCallBack])
# Save the model
my_model.save("my_model.hdf5")
Then, I will load the Keras model
my_model = load_model("my_model.hdf5")
Is there a way to load the logs of all epochs (loss, accuracy, ...)?
You can use the keras callback called CSVLogger.
According to the documentation, it streams the results from each epoch into a csv file.
This is the example from its documentation:
from keras.callbacks import CSVLogger
csv_logger = CSVLogger('training.log')
model.fit(X_train, Y_train, callbacks=[csv_logger])
You can then manipulate it as a normal CSV file, for your needs.
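As a minimal sketch of reading the log back, the snippet below uses only the standard library. It assumes the file was written with CSVLogger's default comma separator and a header row. The names load_training_log and _to_number are illustrative, not part of Keras, and the available columns depend on the metrics the model was compiled with.

```python
import csv


def _to_number(text):
    """Convert a CSV cell to float, or None when it is not numeric."""
    try:
        return float(text)
    except ValueError:
        # Assumption: a non-numeric placeholder marks an unavailable value
        return None


def load_training_log(path):
    """Read a per-epoch CSV log into a dict: column name -> list of values."""
    history = {}
    with open(path, newline="") as f:
        # One row per epoch, keyed by the header line
        for row in csv.DictReader(f):
            for name, value in row.items():
                history.setdefault(name, []).append(_to_number(value))
    return history
```

For example, load_training_log('training.log')['loss'] would give the list of per-epoch losses, provided a loss column exists in the file.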
I'm learning Tensorflow and I'm trying to pass a list of metrics to be evaluated when I compile classification models in a loop, like the example in the documentation.
However, when I pass a list like:
keras.metrics.AUC(name='prc', curve='PR')
def compile_model(model, X_train, y_train, X_val=None, y_val=None, callbacks=None, batch_size=1000, epochs = 10, optimizer='adam', loss_func=keras.losses.BinaryCrossentropy(), metrics=['accuracy']):
    # Note: this only creates another reference to the model, not a copy
    mdl = model
    mdl.compile(optimizer = optimizer, loss = loss_func, metrics = metrics)
    # Fit model, using the validation data when it is provided
    if (X_val is not None) and (y_val is not None):
        mdl.fit(X_train, y_train, epochs = epochs, batch_size=batch_size, callbacks=callbacks,
                validation_data = (X_val, y_val))
    else:
        mdl.fit(X_train, y_train, epochs = epochs, batch_size=batch_size, callbacks=callbacks)
    return mdl
# Compile model
model = compile_model(model, X_train, y_train, X_test, y_test, callbacks=callbacks, batch_size=1000, epochs = 10, metrics=METRICS)
I get the error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'metrics/tp/AssignAddVariableOp/resource_1' with dtype resource
[[node metrics/tp/AssignAddVariableOp/resource_1 (defined at /opt/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1751) ]] [Op:__inference_keras_scratch_graph_6163]
If I try to compile the model with a list of metrics like
model = compile_model(model, X_train, y_train, X_test, y_test, callbacks=callbacks, batch_size=1000, \
epochs = 10, metrics=[keras.metrics.AUC(name='auc'), keras.metrics.AUC(name='prc', curve='PR')] )
the models are able to compile without any issues. None of the related questions I've seen about these placeholder tensors involve Keras' metrics. Why does the error only occur when I try to pass the metrics that way?
I'm training a ResNet model to classify car brands.
I saved the weights during training for every epoch.
For a test, I stopped the training at epoch 3.
# checkpoint = ModelCheckpoint("best_model.hdf5", monitor='loss', verbose=1)
checkpoint_path = "weights/cp-{epoch:04d}.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
# Save a checkpoint at the end of every epoch (the default save frequency).
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path, verbose=1)
history = model.fit_generator(
    training_set,
    validation_data = test_set,
    epochs = 50,
    steps_per_epoch = len(training_set),
    validation_steps = len(test_set),
    callbacks = [cp_callback]
)
However, when loading them, I am unsure whether training resumes from the last saved epoch, since it says epoch 1/50 again. Below is the code I use to load the last saved model.
from keras.models import Sequential, load_model
# load the model
new_model = load_model('./weights/cp-0003.ckpt')
# fit the model
history = new_model.fit_generator(
    training_set,
    validation_data = test_set,
    epochs = 50,
    steps_per_epoch = len(training_set),
    validation_steps = len(test_set),
    callbacks = [cp_callback]
)
This is what it looks like:
[Image: training output after loading the saved checkpoint, starting again from epoch 1/50]
Can someone please help?
You can use the initial_epoch argument of fit_generator. It defaults to 0, but you can set it to the number of epochs that have already been completed:
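import os
import tensorflow as tf
from keras.models import Sequential, load_model
checkpoint_path = "weights/cp-{epoch:04d}.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
# Save a checkpoint at the end of every epoch (the default save frequency).
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path, verbose=1)
# First run, interrupted after epoch 3
history = model.fit_generator(
    training_set, validation_data = test_set, epochs = 50,
    steps_per_epoch = len(training_set), validation_steps = len(test_set),
    callbacks = [cp_callback])
# Load the checkpoint written at the end of epoch 3
new_model = load_model('./weights/cp-0003.ckpt')
# fit the model, resuming after the 3 completed epochs
history = new_model.fit_generator(
    training_set, validation_data = test_set, epochs = 50,
    steps_per_epoch = len(training_set), validation_steps = len(test_set),
    callbacks = [cp_callback],
    initial_epoch = 3)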
This will train your model for 50 - 3 = 47 additional epochs.
Some remarks regarding your code if you use TensorFlow 2.x:
fit_generator is deprecated, since fit now supports generators
you should replace your imports from keras.... with from tensorflow.keras...
If training is stopped early, the best model is saved to the file best_model.h5. But loading the model from the file takes a relatively long time. Is there another way to get the best model?
For example, by creating the file in memory and reading it from memory. Or by putting each model for each epoch into a list and getting the best model from the list by using EarlyStopping.stopped_epoch to access the corresponding list item.
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.models import load_model
from sklearn.model_selection import train_test_split
df = pd.DataFrame(np.random.randint(0,100,size=(1000, 3))/100, columns=['x_1', 'x_2','y'])
x_train, x_test, y_train, y_test = train_test_split(df[['x_1', 'x_2']], df[['y']], test_size=0.2, random_state=0)
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.2, random_state=0)
callbacks = [EarlyStopping(monitor='val_loss', patience=2),
ModelCheckpoint(filepath='best_model.h5', monitor='val_loss', save_best_only=True)]
model = Sequential()
model.add(Dense(units=1, activation='linear', input_dim=2))
model.compile(loss='mean_squared_error',optimizer='adam',metrics=['mean_absolute_error', 'mean_squared_error'])
model.fit(x_train, y_train, epochs=100, batch_size=32, callbacks=callbacks, validation_data=(x_val, y_val))
model = load_model('best_model.h5')
print(model.evaluate(x_test, y_test, batch_size=32))
You need to store the model's current state at some point because further training will change this state.
If you are not going to save to disk (ModelCheckpoint being the easiest option), then you need a custom callback that runs stored_weights = model.get_weights().
Option 1
You can try to use save_weights_only=True in the ModelCheckpoint and later, with the same model use model.load_weights(path). This will not create a new model, only load the weights.
Option 2
If that is still too slow, you can create your own callback and keep the weights in RAM:
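from keras.callbacks import EarlyStopping, LambdaCallback
bestLoss = float('inf')
bestWeights = None
def storeWeights(e, logs):
    # Declare the module-level variables so the assignments below update them
    global bestLoss, bestWeights
    if logs['val_loss'] < bestLoss:
        bestLoss = logs['val_loss']
        bestWeights = model.get_weights()
callbacks = [EarlyStopping(monitor='val_loss', patience=2),
             LambdaCallback(on_epoch_end=storeWeights)]
#train here
#finished train: restore the best weights seen during training
model.set_weights(bestWeights)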
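The same idea can be written without module-level variables. The sketch below is one possible arrangement, not part of Keras: the class name BestWeightsKeeper and its attributes are hypothetical. It does not import Keras at all. It only assumes it is handed a callable that returns the current weights (such as model.get_weights) and that its on_epoch_end method is called with the epoch index and the logs dict.

```python
import math


class BestWeightsKeeper:
    """Keep an in-memory copy of the weights from the best epoch so far."""

    def __init__(self, get_weights, monitor="val_loss"):
        # get_weights: callable returning the current weights,
        # for example model.get_weights (assumption: lower monitor is better)
        self.get_weights = get_weights
        self.monitor = monitor
        self.best_value = math.inf
        self.best_weights = None
        self.best_epoch = None

    def on_epoch_end(self, epoch, logs=None):
        value = (logs or {}).get(self.monitor)
        # Skip epochs where the monitored value is missing
        if value is not None and value < self.best_value:
            self.best_value = value
            self.best_weights = self.get_weights()
            self.best_epoch = epoch
```

With Keras, this could be wired in as keeper = BestWeightsKeeper(model.get_weights) together with LambdaCallback(on_epoch_end=keeper.on_epoch_end), followed by model.set_weights(keeper.best_weights) once training has finished.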
I encountered some problems when I continued training my model and visualized the progress on tensorboard.
My question is: how do I resume training from the same step without specifying the epoch manually? Ideally, loading the saved model would read the global_step from the saved optimizer and continue training from there.
I have provided some code below to reproduce the problem.
import tensorflow as tf
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.models import load_model
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
# Integer labels with a softmax output call for sparse categorical cross-entropy
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, callbacks=[TensorBoard()])
model.save('./final_model.h5', include_optimizer=True)
del model
model = load_model('./final_model.h5')
model.fit(x_train, y_train, epochs=10, callbacks=[TensorBoard()])
You can run the tensorboard by using the command:
tensorboard --logdir ./logs
You can set the initial_epoch parameter of model.fit() to the epoch at which training should resume. Note that epochs is the index of the final epoch, not the number of additional epochs to run: training covers epochs - initial_epoch epochs.
In your example, 10 epochs are already done, so to train for 10 more it should be:
model.fit(x_train, y_train, initial_epoch=10, epochs=20, callbacks=[TensorBoard()])
This lets TensorBoard plot the resumed run as a continuation of the first one.
More extensive information about these parameters can be found in the docs.
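The arithmetic can be wrapped in a small helper. This is only a sketch: the function name resume_fit_args and its parameter names are made up for illustration. It assumes that epochs are counted from 0, so the number of completed epochs is also the index of the next epoch to run.

```python
def resume_fit_args(completed_epochs, additional_epochs):
    """Return the initial_epoch/epochs pair for continuing a training run.

    completed_epochs: how many epochs have already finished
    additional_epochs: how many more epochs to train
    """
    if completed_epochs < 0 or additional_epochs < 0:
        raise ValueError("epoch counts must be non-negative")
    return {
        # Index of the next epoch to run
        "initial_epoch": completed_epochs,
        # Index at which training stops
        "epochs": completed_epochs + additional_epochs,
    }
```

The result can be unpacked into the call, for example model.fit(x_train, y_train, **resume_fit_args(10, 10)).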
Here is sample code in case someone needs it. It implements the idea proposed by Abhinav Anand:
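from glob import glob
from os.path import join
from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard
from tensorflow.keras.models import load_model
# fold_dir is the directory that holds both the checkpoints and the logs
mca = ModelCheckpoint(join(fold_dir, 'model_{epoch:03d}.h5'),
                      monitor = 'loss',
                      save_best_only = False)
tb = TensorBoard(log_dir = join(fold_dir, 'logs'),
                 write_graph = True,
                 write_images = True)
files = sorted(glob(join(fold_dir, 'model_???.h5')))
if files:
    # Resume from the most recent checkpoint; its epoch is encoded in the file name
    model_file = files[-1]
    initial_epoch = int(model_file[-6:-3])
    print('Resuming using saved model %s.' % model_file)
    model = load_model(model_file)
else:
    model = nn.model()
    initial_epoch = 0
model.fit(x_train, y_train,  # placeholder names for your own training data
          epochs = 100,
          initial_epoch = initial_epoch,
          callbacks = [mca, tb])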
Replace nn.model() with your own function for defining the model.
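The epoch lookup can also be done with a regular expression instead of fixed string slicing, which keeps working if the epoch number grows beyond three digits. The helper below is a sketch under that assumption. The name find_latest_checkpoint is hypothetical, and the model_<epoch>.h5 naming is taken from the checkpoint pattern used above.

```python
import re
from pathlib import Path

# Matches names such as model_007.h5 and captures the epoch number
CHECKPOINT_PATTERN = re.compile(r"model_(\d+)\.h5$")


def find_latest_checkpoint(directory):
    """Return (path, epoch) of the highest-numbered checkpoint.

    Returns (None, 0) when the directory contains no checkpoint.
    """
    best_path, best_epoch = None, 0
    for path in Path(directory).glob("model_*.h5"):
        match = CHECKPOINT_PATTERN.search(path.name)
        # Compare epochs numerically, not alphabetically
        if match and int(match.group(1)) >= best_epoch:
            best_path, best_epoch = path, int(match.group(1))
    return best_path, best_epoch
```

The returned epoch can be passed directly as initial_epoch, and the path (converted with str() if needed) to load_model.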
It's very simple. Create checkpoints while training the model and then use those checkpoints to resume training from where you left off.
import tensorflow as tf
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.models import load_model
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
# Integer labels with a softmax output call for sparse categorical cross-entropy
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, callbacks=[TensorBoard()])
model.save('./final_model.h5', include_optimizer=True)
model = load_model('./final_model.h5')
callbacks = list()
tensorboard = TensorBoard()
callbacks.append(tensorboard)
file_path = "model-{epoch:02d}-{loss:.4f}.hdf5"
# now here you can create checkpoints and save according to your need
# here period is the number of epochs between two saves during training
# another option is save_weights_only, for your case it should be false
checkpoints = ModelCheckpoint(file_path, monitor='loss', verbose=1, period=1, save_weights_only=False)
callbacks.append(checkpoints)
model.fit(x_train, y_train, epochs=10, callbacks=callbacks)
After this, just load the checkpoint from which you want to resume training:
model = load_model(checkpoint_of_choice)
model.fit(x_train, y_train, epochs=10, callbacks=callbacks)
And you are done.
Let me know if you have more questions about this.
What have I done?
I implemented a Keras model as follows:
train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.2, random_state=np.random.seed(7), shuffle=True)
train_X = np.reshape(train_X, (train_X.shape[0], 1, train_X.shape[1]))
test_X = np.reshape(test_X, (test_X.shape[0], 1, test_X.shape[1]))
model = Sequential()
model.add(LSTM(100, return_sequences=False, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(train_Y.shape[1], activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
model.fit(train_X, train_Y, validation_split=.20,
epochs=1000, batch_size=50)
What do I want?
I want to feed the output of the penultimate layer (the LSTM) to a support vector machine (SVM) at every epoch (1000 in total), so that the SVM is trained as well.
But I do not know how to do this.
Any ideas?
I use ModelCheckpoint as follows:
model = Sequential()
model.add(LSTM(100, return_sequences=False, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(train_Y.shape[1], activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
# checkpoint
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
model.fit(train_X, train_Y, validation_split=.20,
epochs=1000, batch_size=50, callbacks=callbacks_list, verbose=0)
Epoch 00991: val_acc did not improve
Epoch 00992: val_acc improved from 0.93465 to 0.93900, saving model to weights-992-0.94.hdf5
Epoch 00993: val_acc did not improve
Epoch 00994: val_acc did not improve
Epoch 00995: val_acc did not improve
Epoch 00996: val_acc did not improve
Epoch 00997: val_acc did not improve
Epoch 00998: val_acc improved from 0.93900 to 0.94543, saving model to weights-998-0.94.hdf5
Epoch 00999: val_acc did not improve
How can I load all these models to obtain the output of the LSTM layer at each epoch, as @IonicSolutions said?
What works best in your situation depends on how exactly you set up and train your SVM, but there are at least two options using callbacks:
You could use the ModelCheckpoint callback to save a copy of the model you are training at each epoch and then later load all these models to obtain the output of the LSTM layer.
You can also create your own callback by implementing the Callback base class. Within the callback, the model can be accessed and you can use on_epoch_end to extract the LSTM output at the end of each epoch.
Edit: To get convenient access to the penultimate layer, you can do the following:
# Create the model with the functional API
inp = Input((train_X.shape[1], train_X.shape[2],))
lstm = LSTM(100, return_sequences=False)(inp)
dense = Dense(train_Y.shape[1], activation='softmax')(lstm)
# Create the full model
model = Model(inputs=inp, outputs=dense)
# Create the model for access to the LSTM layer
access = Model(inputs=inp, outputs=lstm)
Then, you can pass access to your callback when you instantiate it. The key thing to note here is that model and access share the very same LSTM layer, whose weights will change when training model.
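One way to structure such a callback is sketched below. It is deliberately independent of Keras: LayerOutputCollector and its attribute names are hypothetical, and the only assumption is that the object passed in (for example the access model above) exposes a predict method.

```python
class LayerOutputCollector:
    """Record the output of an intermediate model at the end of every epoch."""

    def __init__(self, access_model, inputs):
        # access_model: any object with a predict(inputs) method,
        # for example the `access` model built with the functional API
        self.access_model = access_model
        # inputs: the data whose intermediate features should be recorded
        self.inputs = inputs
        self.outputs_per_epoch = {}

    def on_epoch_end(self, epoch, logs=None):
        # The access model shares its layers with the model being trained,
        # so predict() reflects the weights as of this epoch
        self.outputs_per_epoch[epoch] = self.access_model.predict(self.inputs)
```

With Keras, it could be registered through LambdaCallback(on_epoch_end=collector.on_epoch_end). The features stored in collector.outputs_per_epoch can then be used to fit an SVM for each epoch.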
In order to get prediction output at each epoch here is what we can do:
import tensorflow as tf
# define your custom callback for prediction
class PredictionCallback(tf.keras.callbacks.Callback):
    def __init__(self, x_val):
        super().__init__()
        # Keep the inputs explicitly: tf.keras in TF 2.x does not populate self.validation_data
        self.x_val = x_val
    def on_epoch_end(self, epoch, logs=None):
        y_pred = self.model.predict(self.x_val)
        print('prediction: {} at epoch: {}'.format(y_pred, epoch))
# ...
# register the callback before training starts
model.fit(X_train, y_train, batch_size=32, epochs=25,
          validation_data=(X_valid, y_valid),
          callbacks=[PredictionCallback(X_valid)])