Tensorboard Display Validation Data and Training Data in two Graphs

I am trying to display the accuracy and loss of my net as graphs in Tensorboard, but the training and validation data are shown as separate runs. I am still relatively inexperienced with Tensorflow and Tensorboard, so I hope you can see the reason for this.
Here is my code:
import os
import time
import pickle
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.callbacks import TensorBoard
print("Loading Data via Pickel")
X = pickle.load(open("X.pickle", "rb"))
y = pickle.load(open("y.pickle", "rb"))
print(len(X))
print(len(y))
startTime = time.time()
hidden_dense_layers = [0,1,2]
hidden_dense_layer_size = [64, 128, 256, 512, 1024]
for dense_layer_ammount in hidden_dense_layers:
    for dense_layer_size in hidden_dense_layer_size:
        NAME = "{}-hidden_layers-{}-layersize".format(dense_layer_ammount, dense_layer_size)
        print("----------", NAME, "----------")
        print("Building Model")
        # model = keras.Sequential([
        #     keras.layers.Flatten(input_shape=(200, 200)),
        #     keras.layers.Dense(500, activation="relu"),
        #     keras.layers.Dense(1, activation="sigmoid")
        # ])
        model = keras.Sequential()
        model.add(keras.layers.Flatten(input_shape=(75, 75)))
        for i in range(dense_layer_ammount):
            model.add(keras.layers.Dense(dense_layer_size, activation="relu"))
        model.add(keras.layers.Dense(1, activation="sigmoid"))
        model.compile(loss='binary_crossentropy',
                      optimizer='adam',
                      metrics=['accuracy'])
        print("Creating Callbacks")
        print("Creating Checkpoint Callback")
        checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
        checkpoint_dir = os.path.dirname(checkpoint_path)
        # Create a callback that saves the model's weights
        checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
            filepath=checkpoint_path,
            save_weights_only=True,
            verbose=1
        )
        print("Creating Tensorboard Callback")
        tensorboard_callback = TensorBoard(log_dir="logs/{}".format(NAME))
        print("Training Model")
        model.fit(
            X,
            y,
            # batch_size=32,
            epochs=10,
            callbacks=[
                # checkpoint_callback,
                tensorboard_callback
            ],
            validation_split=0.3
        )
Here is how the runs are Displayed for me
Here is how the Graphs are displayed to me

It is completely normal to have two curves on each graph. One curve corresponds to the training data and the other to the validation data (orange and blue, respectively, on your plots). At each epoch you get a two-step process:
First comes the actual tuning of the model parameters with gradient descent: the training step. The training curve tells you whether the model is learning something (e.g., is the model complex enough for the given task?).
Second, you need to make sure that the trained model performs well on data that was not used to tune the parameters: the validation step. The validation curve tells you how close you are to overfitting (meaning that you get good performance on the data used for tuning, but the model does badly when fed "new data").
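The two-curve reading described above can be sketched in a few lines. This is only an illustration: it assumes the per-epoch training and validation losses are available as plain lists (for example taken from the history returned by model.fit), and the function name and sample numbers are made up.

```python
def summarize_curves(train_loss, val_loss):
    """Return the best validation epoch, the final gap, and an overfitting flag."""
    if len(train_loss) != len(val_loss) or not train_loss:
        raise ValueError("need two non-empty loss lists of equal length")
    # Epoch (1-based) at which the validation loss was lowest.
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
    # Difference between validation and training loss after the last epoch.
    final_gap = val_loss[-1] - train_loss[-1]
    # Flag the case where the validation minimum is in the past while the
    # training loss has kept falling since then.
    overfitting = (best_epoch < len(val_loss)
                   and train_loss[-1] < train_loss[best_epoch - 1])
    return best_epoch, final_gap, overfitting


# Hypothetical losses for five epochs.
train = [0.70, 0.50, 0.35, 0.25, 0.18]
val = [0.72, 0.55, 0.48, 0.52, 0.60]
best_epoch, gap, overfitting = summarize_curves(train, val)
print(best_epoch, round(gap, 2), overfitting)
```

With these sample numbers the validation loss bottoms out at epoch 3 and then rises while the training loss keeps dropping, which is the pattern the validation curve is meant to reveal.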

Related

No hparams data was found when using tensorboard with keras-tuner

versions: tensorboard==2.9.0, keras-tuner==1.1.2
Here is a simple binary classification model, with the hyperparameters to search defined inside the model using keras-tuner.
def build_model(hp):
    n_layers = 4
    n_features = len(X_train.columns)
    inputs = tf.keras.Input(shape=(n_features,))
    dense = tf.keras.layers.Dense(hp.Int("input_units", min_value=128, max_value=256, step=32),
                                  activation=hp.Choice("activation", ['relu', 'tanh'])
                                  )(inputs)
    dense = tf.keras.layers.Dropout(0.2)(dense)
    # num_layer as hyperparameter
    for i in range(hp.Int("dense_layer", 1, n_layers)):
        dense = tf.keras.layers.Dense(hp.Int(f"hidden_unit_{i}", 128, 256, 32),
                                      activation=hp.Choice("activation", ['relu', 'tanh'])
                                      )(dense)
    output = tf.keras.layers.Dense(1, activation='sigmoid')(dense)
    model = tf.keras.Model(inputs=inputs, outputs=output)
    lr = hp.Float("lr", min_value=1e-4, max_value=1e-1, sampling="log")
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss=tf.keras.losses.BinaryCrossentropy(),
                  metrics=metrics)
    return model
The hyperparameter search space would be:
{neurons: [128, 160, 192, 224, 256],
 num_hidden_layers: [1, 2, 3],
 activation_function: ['relu', 'tanh'],
 learning_rate: [0.0001, 0.001, 0.01]}
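To get a feel for the size of the search space as listed, the combinations can be enumerated with plain Python. This is only a sketch: the dictionary keys are the informal names from the list above, not the names passed to hp.Int or hp.Choice in build_model, and the learning rate is treated as three discrete values even though the model code samples it continuously.

```python
from itertools import product

# The discrete search space as listed in the post (informal names).
search_space = {
    "neurons": list(range(128, 256 + 1, 32)),  # 128, 160, 192, 224, 256
    "num_hidden_layers": [1, 2, 3],
    "activation_function": ["relu", "tanh"],
    "learning_rate": [0.0001, 0.001, 0.01],
}

names = list(search_space)
combinations = [dict(zip(names, values))
                for values in product(*search_space.values())]

print(len(combinations))  # number of distinct configurations
print(combinations[0])    # first configuration in the grid
```

A random search with max_trials=3 would therefore look at only a small sample of these configurations.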
Now begin the search:
tuner = RandomSearch(
    build_model,
    objective=kt.Objective("val_binary_accuracy", direction="max"),
    max_trials=3,
    executions_per_trial=1,
    directory=LOG_DIR
)
tensorboard_cb = tf.keras.callbacks.TensorBoard('logs/hyp_tune/')
tuner.search(X_train, y_train, epochs=10, batch_size=512,
             validation_data=(X_test, y_test),
             callbacks=[tensorboard_cb]
             )
According to the keras-tuner guide (https://keras.io/guides/keras_tuner/visualize_tuning/), this should work and show the hyperparameters when TensorBoard is opened.
However, when I select the HPARAMS tab, it shows the message below:
No hparams data was found.
Probable causes:
You haven’t written any hparams data to your event files.
Event files are still being loaded (try reloading this page).
TensorBoard can’t find your event files.
If you’re new to using TensorBoard, and want to find out how to add data and set up your event files, check out the README and perhaps the TensorBoard tutorial.
If you think TensorBoard is configured properly, please see the section of the README devoted to missing data problems and consider filing an issue on GitHub.
I've tried re-running the search and restarting the notebook, but still no luck.
[EDIT]
When I load TensorBoard with tensorboard --logdir='logs/t1', it should show logs/t1 on the left side of the screen under Runs. Instead it shows logs/t0, which is the previous run (a simple model run without hyperparameter tuning). I think that because it is showing the previous run, which has no hyperparameter tuning, there is no data in the HPARAMS tab. How can I delete the previous log and load the new one? (Overwriting 'logs/t0' with the hyperparameter tuning model works fine.)
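For the "delete the previous log" part of the question, one possible approach is to remove the old log directory before starting a new run. The sketch below uses only the standard library; the helper name reset_log_dir is made up, and the demonstration works in a temporary directory so that no real logs are touched.

```python
import shutil
import tempfile
from pathlib import Path


def reset_log_dir(log_dir):
    """Delete log_dir and everything in it, then recreate it empty."""
    path = Path(log_dir)
    if path.exists():
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path


# Demonstration in a temporary directory so nothing real is deleted.
base = Path(tempfile.mkdtemp())
old_run = base / "logs" / "t0"
old_run.mkdir(parents=True)
(old_run / "events.out.tfevents.example").write_text("stale")

fresh = reset_log_dir(old_run)
print(list(fresh.iterdir()))
```

A TensorBoard instance that is already running may still need to be restarted or reloaded afterwards, and it is worth double-checking that --logdir points at the directory the new run actually writes to.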
I wrote the code below and it runs correctly.
At the end, use these two commands to get your output:
%load_ext tensorboard
%tensorboard --logdir /logs/hyp_tune/
Full code:
# !pip install keras-tuner -q
import numpy as np
import keras_tuner
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
(x_train, y_train), (x_test, y_test) = (np.random.rand(1000,4), np.random.rand(1000)) , (np.random.rand(100,4), np.random.rand(100))
def build_model(hp):
    n_layers = 4
    n_features = x_train.shape[1]
    inputs = tf.keras.Input(shape=(n_features,))
    dense = tf.keras.layers.Dense(hp.Int("input_units", min_value=128, max_value=256, step=32),
                                  activation=hp.Choice("activation", ['relu', 'tanh'])
                                  )(inputs)
    dense = tf.keras.layers.Dropout(0.2)(dense)
    # num_layer as hyperparameter
    for i in range(hp.Int("dense_layer", 1, n_layers)):
        dense = tf.keras.layers.Dense(hp.Int(f"hidden_unit_{i}", 128, 256, 32),
                                      activation=hp.Choice("activation", ['relu', 'tanh'])
                                      )(dense)
    output = tf.keras.layers.Dense(1, activation='sigmoid')(dense)
    model = tf.keras.Model(inputs=inputs, outputs=output)
    lr = hp.Float("lr", min_value=1e-4, max_value=1e-1, sampling="log")
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss=tf.keras.losses.BinaryCrossentropy(),
                  metrics=["accuracy"])
    return model
hp = keras_tuner.HyperParameters()
model = build_model(hp)
model.summary()
tuner = keras_tuner.RandomSearch(
    build_model,
    max_trials=10,
    overwrite=True,
    objective="val_accuracy",
    # Set a directory to store the intermediate results.
    directory="/logs/hyp_tune/",
)
tensorboard_cb = tf.keras.callbacks.TensorBoard('/logs/hyp_tune/')
tuner.search(
    x_train,
    y_train,
    validation_data=(x_test, y_test),
    batch_size=512,
    epochs=10,
    callbacks=[tensorboard_cb],
)
output:
%load_ext tensorboard
%tensorboard --logdir /logs/hyp_tune/

Issues with Keras load_model function

I am building a CNN in Keras using a Tensorflow backend for speaker identification, and currently I am attempting to train the model and then save it as an .hdf5 file. The program trains the model for 100 epochs with early stopping and checkpoints, saving only the best model to a file, as illustrated in the code below:
class BuildModel:
    # Create First Model in Ensemble
    def createModel(self, model_input, n_outputs, first_session=True):
        if first_session != True:
            model = load_model('SI_ideal_model_fixed.hdf5')
            return model
        # Define Input Layer
        inputs = model_input
        # Define Densely Connected Layers
        conv = Dense(16, activation='relu')(inputs)
        conv = Dense(64, activation='relu')(conv)
        conv = Dense(16, activation='relu')(conv)
        conv = Reshape((conv.shape[1]*conv.shape[2]*conv.shape[3],))(conv)
        outputs = Dense(n_outputs, activation='softmax')(conv)
        # Create Model
        model = Model(inputs, outputs)
        model.summary()
        return model
    # Train the Model
    def evaluateModel(self, x_train, x_val, y_train, y_val, num_classes, first_session=True):
        # Model Parameters
        verbose, epochs, batch_size, patience = 1, 100, 64, 10
        # Determine Input and Output Dimensions
        x = x_train[0].shape[0]  # Number of MFCC rows
        y = x_train[0].shape[1]  # Number of MFCC columns
        c = 1  # Number of channels
        # Create Model
        inputs = Input(shape=(x, y, c), name='input')
        model = self.createModel(model_input=inputs,
                                 n_outputs=num_classes,
                                 first_session=first_session)
        # Compile Model
        model.compile(loss='categorical_crossentropy',
                      optimizer='adam',
                      metrics=['accuracy'])
        # Callbacks
        es = EarlyStopping(monitor='val_loss',
                           mode='min',
                           verbose=verbose,
                           patience=patience,
                           min_delta=0.0001)  # Stop training at right time
        mc = ModelCheckpoint('SI_ideal_model_fixed.hdf5',
                             monitor='val_accuracy',
                             verbose=verbose,
                             save_best_only=True,
                             mode='max')  # Save best model after each epoch
        reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                      factor=0.2,
                                      patience=patience//2,
                                      min_lr=1e-3)  # Reduce learning rate once learning stagnates
        # Evaluate Model
        model.fit(x_train, y=y_train, epochs=epochs,
                  callbacks=[es,mc,reduce_lr], batch_size=batch_size,
                  validation_data=(x_val, y_val))
        accuracy = model.evaluate(x=x_train, y=y_train,
                                  batch_size=batch_size,
                                  verbose=verbose)
        # Load Best Model
        model = load_model('SI_ideal_model_fixed.hdf5')
        return (accuracy[1], model)
However, it appears that the load_model function is not working properly, since the model achieved a validation accuracy of 0.56193 after the first training session but then started with a validation accuracy of only 0.2508 at the beginning of the second training session. (From what I have seen, the first epoch of the second training session should have a validation accuracy much closer to that of the best model.)
Moreover, I then attempted to test the trained model on a set of unseen samples with model.predict, and it failed on all six, often with high probabilities, which leads me to believe that it was using minimally trained (or untrained) weights.
So, my question is could this be an issue from loading and saving the models using the load_model and ModelCheckpoint functions? If so, what is the best alternative method? If not, what are some good troubleshooting tips for improving the model's prediction functionality?
I am not sure what you mean by training session. What I would do is first train for a few epochs and note the validation accuracy. Then load the model and use evaluate() to get the same accuracy. If it differs, then yes, something is wrong with your loading. Here is what I would do:
def createModel(self, model_input, n_outputs):
    # Define Input Layer
    inputs = model_input
    # Define Densely Connected Layers
    conv = Dense(16, activation='relu')(inputs)
    conv2 = Dense(64, activation='relu')(conv)
    conv3 = Dense(16, activation='relu')(conv2)
    conv4 = Reshape((conv3.shape[1]*conv3.shape[2]*conv3.shape[3],))(conv3)
    outputs = Dense(n_outputs, activation='softmax')(conv4)
    # Create Model
    model = Model(inputs, outputs)
    return model
# Train the Model
def evaluateModel(self, x_train, x_val, y_train, y_val, num_classes, first_session=True):
    # Model Parameters
    verbose, epochs, batch_size, patience = 1, 100, 64, 10
    # Determine Input and Output Dimensions
    x = x_train[0].shape[0]  # Number of MFCC rows
    y = x_train[0].shape[1]  # Number of MFCC columns
    c = 1  # Number of channels
    # Create Model
    inputs = Input(shape=(x, y, c), name='input')
    model = self.createModel(model_input=inputs,
                             n_outputs=num_classes)
    # Compile Model
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    # Callbacks
    es = EarlyStopping(monitor='val_loss',
                       mode='min',
                       verbose=verbose,
                       patience=patience,
                       min_delta=0.0001)  # Stop training at right time
    mc = ModelCheckpoint('SI_ideal_model_fixed.h5',
                         monitor='val_accuracy',
                         verbose=verbose,
                         save_best_only=True,
                         save_weights_only=False)  # Save best model after each epoch
    reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                  factor=0.2,
                                  patience=patience//2,
                                  min_lr=1e-3)  # Reduce learning rate once learning stagnates
    # Evaluate Model
    model.fit(x_train, y=y_train, epochs=5,
              callbacks=[es,mc,reduce_lr], batch_size=batch_size,
              validation_data=(x_val, y_val))
    # Keep the result so that it can be returned below
    accuracy = model.evaluate(x=x_val, y=y_val,
                              batch_size=batch_size,
                              verbose=verbose)
    # Load Best Model
    model2 = load_model('SI_ideal_model_fixed.h5')
    model2.evaluate(x=x_val, y=y_val,
                    batch_size=batch_size,
                    verbose=verbose)
    return (accuracy[1], model)
The two evaluations should print the same results.
P.S. TF might change the order of your computations, so I used different names for the layers in the model (conv, conv2, conv3, ...) to prevent that.
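When comparing the in-memory model with the file on disk, it can help to reason about which epoch a best-only checkpoint actually holds. The following is a simplified plain-Python model of that bookkeeping, not Keras' implementation; the function name and the accuracy values are made up for illustration.

```python
def epochs_that_save(monitored_values, mode="max"):
    """Return the 1-based epochs at which a best-only checkpoint would be written."""
    best = None
    saved = []
    for epoch, value in enumerate(monitored_values, start=1):
        if best is None:
            improved = True  # a fresh tracker has no best value yet
        elif mode == "max":
            improved = value > best
        else:
            improved = value < best
        if improved:
            best = value
            saved.append(epoch)
    return saved


# Hypothetical validation accuracies for five epochs.
history = [0.25, 0.41, 0.56, 0.52, 0.49]
saved = epochs_that_save(history)
print(saved)
```

In this sketch the file ends up holding the epoch 3 weights while the in-memory model has the epoch 5 weights, so the two evaluations only agree exactly when the final epoch was also the best one. The sketch also shows that a tracker starting without a best value always saves on its first epoch, which may be worth keeping in mind when a second training session writes to the same file name.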

Tensorboard does not show all training data

Hello, I am rather new to neural networks. I am trying to plot the MAE of my training and validation data in TensorBoard. However, TensorBoard only shows a small portion of the training data, although it shows all the validation data. I also plotted the history using matplotlib, and everything there is correct. Below are my code and the graphs. What is the problem and how do I fix it? Thank you in advance.
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
from tensorflow import keras
from datetime import datetime
# ---------train the neural network-----------
model = keras.Sequential([  # Sequential training function
    keras.layers.Dense(30, activation="relu", kernel_initializer='random_uniform',
                       bias_initializer='zeros', input_shape=(2,), name="input"),  # input layer
    keras.layers.Dense(30, activation="relu", name="hidden_1"),  # the first hidden layer with 30 neurons
    keras.layers.Dense(30, activation="relu", name="hidden_2"),  # the second hidden layer with 30 neurons
    keras.layers.Dense(1, name="output")  # the output layer with 1 neuron since there is only one output
])
epochs = 500
optimizer = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, amsgrad=True)  # define the optimizer
tb = keras.callbacks.TensorBoard(log_dir=r"C:\Users\Beichao\Desktop\BEICHAO\Neural Network\python code\regression"
                                 .format(datetime.now()))
model.compile(loss="mean_squared_error", optimizer=optimizer, metrics=['mae'])  # define some parameters
history = model.fit(x_train, y_train, batch_size=32, validation_data=[x_validation, y_validation],
                    epochs=epochs, verbose=0, callbacks=[tb])  # train the NN
error = model.evaluate(x_validation, y_validation)
print(f"mean absolute error is {error}")
tensorboard figure
matplotlib figure
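One detail in the code above that may be worth checking: the log_dir string contains no "{}" placeholder, so the .format(datetime.now()) call returns it unchanged and every run writes into the same directory. Whether that explains the figure is an assumption, but giving each run its own directory is a cheap thing to try. The sketch below uses only the standard library; make_run_dir and the shortened paths are made up for illustration.

```python
from datetime import datetime
from pathlib import Path


def make_run_dir(base_dir, now=None):
    """Build a per-run, filesystem-safe log directory path."""
    now = now or datetime.now()
    # strftime avoids the ':' characters of str(datetime), which Windows
    # does not allow in file names.
    return Path(base_dir) / now.strftime("%Y%m%d-%H%M%S")


# A string without "{}" is returned unchanged by .format().
template = r"C:\logs\regression"  # hypothetical shortened path
unchanged = template.format(datetime.now()) == template
print(unchanged)

run_dir = make_run_dir("logs/regression", datetime(2020, 1, 2, 3, 4, 5))
print(run_dir.name)
```

The resulting path could then be passed as log_dir to the TensorBoard callback, so that event files from different runs do not end up in one directory.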

Use tensorflow learning-rate decay in a Keras-to-TPU model

I'm following the "How to train Keras model x20 times faster with TPU for free" guide to run a Keras model on Google's Colab TPU. It works perfectly. But I like to use cosine-restart learning-rate decay when I fit my models. I've coded up my own as a Keras callback, but it won't work within this framework because the tensorflow TFOptimizer class doesn't have a learning-rate variable that can be reset. I see that tensorflow itself has a bunch of decay functions in tf.train, like tf.train.cosine_decay, but I can't figure out how to embed one within my model.
Here's the basic code from that blog post. Does anyone have a fix?
import tensorflow as tf
import os
from tensorflow.python.keras.layers import Input, LSTM, Bidirectional, Dense, Embedding
def make_model(batch_size=None):
    source = Input(shape=(maxlen,), batch_size=batch_size,
                   dtype=tf.int32, name='Input')
    embedding = Embedding(input_dim=max_features,
                          output_dim=128, name='Embedding')(source)
    lstm = LSTM(32, name='LSTM')(embedding)
    predicted_var = Dense(1, activation='sigmoid', name='Output')(lstm)
    model = tf.keras.Model(inputs=[source], outputs=[predicted_var])
    model.compile(
        optimizer=tf.train.RMSPropOptimizer(learning_rate=0.01),
        loss='binary_crossentropy',
        metrics=['acc'])
    return model
training_model = make_model(batch_size=128)
# This address identifies the TPU we'll use when configuring TensorFlow.
TPU_WORKER = 'grpc://' + os.environ['COLAB_TPU_ADDR']
tf.logging.set_verbosity(tf.logging.INFO)
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
    training_model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(TPU_WORKER)))
history = tpu_model.fit(x_train, y_train,
                        epochs=20,
                        batch_size=128 * 8,
                        validation_split=0.2)
One option is to set the learning rates manually; there is a Keras+TPU example with a callback here: https://github.com/tensorflow/tpu/blob/master/models/experimental/resnet50_keras/resnet50.py#L197-L201
The following seems to work, where lr is the initial learning rate you choose and n_steps is the number of initial steps over which you want the cosine decay to run.
def make_model(batch_size=None, lr=1.e-3, n_steps=2000):
    source = Input(shape=(maxlen,), batch_size=batch_size,
                   dtype=tf.int32, name='Input')
    embedding = Embedding(input_dim=max_features,
                          output_dim=128, name='Embedding')(source)
    lstm = LSTM(32, name='LSTM')(embedding)
    predicted_var = Dense(1, activation='sigmoid', name='Output')(lstm)
    model = tf.keras.Model(inputs=[source], outputs=[predicted_var])
    # implement cosine decay or other learning rate decay here
    global_step = tf.Variable(0)
    # NOTE: the next line rebinds global_step to the constant 1, so the
    # schedule below is always evaluated at step 1
    global_step = 1
    learning_rate = tf.train.cosine_decay_restarts(
        learning_rate=lr,
        global_step=global_step,
        first_decay_steps=n_steps,
        t_mul=1.5,
        m_mul=1.,
        alpha=0.1
    )
    # now feed this into the optimizer as shown below
    model.compile(
        optimizer=tf.train.RMSPropOptimizer(learning_rate=learning_rate),
        loss='binary_crossentropy',
        metrics=['acc'])
    return model
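To see what values such a schedule produces, here is a plain-Python sketch of cosine decay with warm restarts. It follows the usual SGDR-style formula and reuses the parameter names of tf.train.cosine_decay_restarts, but it is an independent illustration, not TensorFlow's code, and it does not touch the TPU setup at all.

```python
import math


def cosine_decay_restarts(step, initial_lr, first_decay_steps,
                          t_mul=2.0, m_mul=1.0, alpha=0.0):
    """Learning rate at a given step for cosine decay with warm restarts."""
    period = float(first_decay_steps)  # length of the current decay period
    start = 0.0                        # step at which the current period began
    restarts = 0                       # number of restarts so far
    while step >= start + period:
        start += period
        period *= t_mul                # each period is t_mul times longer
        restarts += 1
    fraction = (step - start) / period  # progress within the period, in [0, 1)
    # m_mul scales the peak after each restart; alpha sets the floor
    # as a fraction of initial_lr.
    cosine = 0.5 * (m_mul ** restarts) * (1.0 + math.cos(math.pi * fraction))
    return initial_lr * ((1.0 - alpha) * cosine + alpha)


# Same settings as in the snippet above.
for step in (0, 1000, 2000, 3500):
    lr = cosine_decay_restarts(step, 1e-3, 2000, t_mul=1.5, m_mul=1.0, alpha=0.1)
    print(step, lr)
```

The rate starts at the initial value, falls towards alpha times that value over the first 2000 steps, and jumps back up at the restart. The schedule only has an effect if the step actually advances during training; evaluated at a fixed step it is just a constant.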

Save trained model in Keras

I am following this tutorial to create a classifier:
Tutorial Link
# MLP for Pima Indians Dataset with grid search via sklearn
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
import numpy
# Function to create model, required for KerasClassifier
def create_model(optimizer='rmsprop', init='glorot_uniform'):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer=init, activation='relu'))
    model.add(Dense(8, kernel_initializer=init, activation='relu'))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load pima indians dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]
# create model
model = KerasClassifier(build_fn=create_model, verbose=0)
# grid search epochs, batch size and optimizer
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform', 'normal', 'uniform']
epochs = [50, 100, 150]
batches = [5, 10, 20]
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches, init=init)
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X, Y)
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
The code works fine. I wish to save the trained model to an external file; please guide me.
I know about Keras model.save for saving a model, but here we have done some external work on the model. How do I save the model with all the changes?
In your call to model.fit(), you have to include a callback. The callback saves the model to a file using ModelCheckpoint. After training, you can load the model back using Keras load_model.
from keras.callbacks import ModelCheckpoint
epochs = 10
batch_size = 64
filepath = "checkpoint/model.{epoch:02d}.hdf5"
checkpoint = ModelCheckpoint(filepath=filepath, verbose=1,
                             save_best_only=False)
callbacks = [checkpoint]
model.fit(X, Y, epochs=epochs,
          batch_size=batch_size,
          shuffle=True, callbacks=callbacks)
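The {epoch:02d} part of the file path is ordinary Python format syntax that gets filled in for each epoch. As a small illustration, assuming epochs are numbered from 1, the template expands like this:

```python
filepath = "checkpoint/model.{epoch:02d}.hdf5"

# Expand the template for the first three epochs (numbered from 1 here).
names = [filepath.format(epoch=epoch) for epoch in range(1, 4)]
for name in names:
    print(name)
```

With save_best_only=False this means one file per epoch; any of them can later be passed to load_model.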
