Keras, get output of a layer at each epoch - python

What I have done:
I implemented a Keras model as follows:
train_X, test_X, train_Y, test_Y = train_test_split(X, Y, test_size=0.2, random_state=np.random.seed(7), shuffle=True)
train_X = np.reshape(train_X, (train_X.shape[0], 1, train_X.shape[1]))
test_X = np.reshape(test_X, (test_X.shape[0], 1, test_X.shape[1]))
model = Sequential()
model.add(LSTM(100, return_sequences=False, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(train_Y.shape[1], activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
model.fit(train_X, train_Y, validation_split=.20,
          epochs=1000, batch_size=50)
What I want:
I want to feed the output of the penultimate layer (the LSTM) to a support vector machine (SVM) at every epoch (all 1000 of them), so that the SVM is trained as well.
But I do not know how to do this.
Any idea?
UPDATED:
I used ModelCheckpoint as follows:
model = Sequential()
model.add(LSTM(100, return_sequences=False, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(train_Y.shape[1], activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
# checkpoint
filepath="weights-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
model.fit(train_X, train_Y, validation_split=.20,
          epochs=1000, batch_size=50, callbacks=callbacks_list, verbose=0)
OUTPUT:
Epoch 00991: val_acc did not improve
Epoch 00992: val_acc improved from 0.93465 to 0.93900, saving model to weights-992-0.94.hdf5
Epoch 00993: val_acc did not improve
Epoch 00994: val_acc did not improve
Epoch 00995: val_acc did not improve
Epoch 00996: val_acc did not improve
Epoch 00997: val_acc did not improve
Epoch 00998: val_acc improved from 0.93900 to 0.94543, saving model to weights-998-0.94.hdf5
Epoch 00999: val_acc did not improve
PROBLEM:
How can I load all these models to obtain the output of the LSTM layer at each epoch, as #IonicSolutions suggested?

What works best in your situation depends on how exactly you set up and train your SVM, but there are at least two options using callbacks:
You could use the ModelCheckpoint callback to save a copy of the model you are training at each epoch and then later load all these models to obtain the output of the LSTM layer.
You can also create your own callback by implementing the Callback base class. Within the callback, the model can be accessed and you can use on_epoch_end to extract the LSTM output at the end of each epoch.
Edit: To get convenient access to the penultimate layer, you can do the following:
# Create the model with the functional API
inp = Input((train_X.shape[1], train_X.shape[2],))
lstm = LSTM(100, return_sequences=False)(inp)
dense = Dense(train_Y.shape[1], activation='softmax')(lstm)
# Create the full model
model = Model(inputs=inp, outputs=dense)
# Create the model for access to the LSTM layer
access = Model(inputs=inp, outputs=lstm)
Then, you can pass access to your callback when you instantiate it. The key thing to note here is that model and access share the very same LSTM layer, whose weights will change when training model.
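For illustration, a minimal callback along these lines might look like the sketch below. It is only an outline, not the exact setup from the question: scikit-learn's SVC, the RBF kernel, and converting the one-hot labels with argmax are all assumptions, and refitting an SVM after every one of 1000 epochs can get expensive.
import numpy as np
from keras.callbacks import Callback   # or tensorflow.keras.callbacks.Callback
from sklearn.svm import SVC            # assumed SVM implementation

class SVMOnLSTMOutput(Callback):
    """Fit an SVM on the LSTM features at the end of every epoch (sketch)."""
    def __init__(self, access_model, inputs, labels):
        super().__init__()
        self.access = access_model   # the Model(inputs=inp, outputs=lstm) from above
        self.inputs = inputs         # e.g. train_X
        self.labels = labels         # class indices, e.g. train_Y.argmax(axis=1)
        self.svms = []

    def on_epoch_end(self, epoch, logs=None):
        features = self.access.predict(self.inputs)   # LSTM output with the current weights
        svm = SVC(kernel='rbf')
        svm.fit(features, self.labels)
        self.svms.append(svm)

# usage, assuming model, access, train_X and train_Y exist as above:
# model.fit(train_X, train_Y, validation_split=0.2, epochs=1000, batch_size=50,
#           callbacks=[SVMOnLSTMOutput(access, train_X, train_Y.argmax(axis=1))])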

In order to get the prediction output at each epoch, here is what we can do:
import tensorflow as tf

# define your custom callback for prediction
class PredictionCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        y_pred = self.model.predict(self.validation_data[0])
        print('prediction: {} at epoch: {}'.format(y_pred, epoch))

# ...
# register the callback before training starts
model.fit(X_train, y_train, batch_size=32, epochs=25,
          validation_data=(X_valid, y_valid),
          callbacks=[PredictionCallback()])
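Note that self.validation_data is not always exposed to callbacks in recent tf.keras releases. If that attribute is missing in your version, a workaround (a sketch, reusing the X_valid name from the snippet above) is to hand the validation inputs to the callback yourself:
import tensorflow as tf

class PredictionCallback(tf.keras.callbacks.Callback):
    def __init__(self, val_inputs):
        super().__init__()
        self.val_inputs = val_inputs   # keep our own reference to the validation inputs

    def on_epoch_end(self, epoch, logs=None):
        y_pred = self.model.predict(self.val_inputs)
        print('prediction: {} at epoch: {}'.format(y_pred, epoch))

# model.fit(X_train, y_train, batch_size=32, epochs=25,
#           validation_data=(X_valid, y_valid),
#           callbacks=[PredictionCallback(X_valid)])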

Related

How to apply Attention layer to LSTM model

I am training a model for speech emotion recognition.
I wish to apply an attention layer to the model. The documentation page is hard to understand.
def bi_duo_LSTM_model(X_train, y_train, X_test, y_test, num_classes, batch_size=68, units=128,
                      learning_rate=0.005, epochs=20, dropout=0.2, recurrent_dropout=0.2):

    class myCallback(tf.keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs={}):
            # depending on the Keras version the metric key may be 'accuracy' instead of 'acc'
            if (logs.get('acc') > 0.95):
                print("\nReached 95% accuracy so cancelling training!")
                self.model.stop_training = True

    callbacks = myCallback()

    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Masking(mask_value=0.0, input_shape=(X_train.shape[1], X_train.shape[2])))
    model.add(tf.keras.layers.Bidirectional(LSTM(units, dropout=dropout, recurrent_dropout=recurrent_dropout, return_sequences=True)))
    model.add(tf.keras.layers.Bidirectional(LSTM(units, dropout=dropout, recurrent_dropout=recurrent_dropout)))
    # model.add(tf.keras.layers.Bidirectional(LSTM(32)))
    model.add(Dense(num_classes, activation='softmax'))

    adamopt = tf.keras.optimizers.Adam(lr=learning_rate, beta_1=0.9, beta_2=0.999, epsilon=1e-8)
    RMSopt = tf.keras.optimizers.RMSprop(lr=learning_rate, rho=0.9, epsilon=1e-6)
    SGDopt = tf.keras.optimizers.SGD(lr=learning_rate, momentum=0.9, decay=0.1, nesterov=False)

    model.compile(loss='binary_crossentropy',
                  optimizer=adamopt,
                  metrics=['accuracy'])

    history = model.fit(X_train, y_train,
                        batch_size=batch_size,
                        epochs=epochs,
                        validation_data=(X_test, y_test),
                        verbose=1,
                        callbacks=[callbacks])

    score, acc = model.evaluate(X_test, y_test,
                                batch_size=batch_size)
    yhat = model.predict(X_test)

    return history, yhat
How can I apply it to my model?
And are use_scale, causal and dropout the only arguments?
If there is dropout in the attention layer, how do we deal with it, given that we already have dropout in the LSTM layers?
Attention can be interpreted as a soft vector retrieval.
You have some query vectors. For each query, you want to retrieve some values and compute a weighted sum of them, where the weights are obtained by comparing the query with keys (the number of keys must be the same as the number of values, and often the keys and values are the same vectors).
In sequence-to-sequence models, the query is the decoder state and the keys and values are the encoder states.
In a classification task, you do not have such an explicit query. The easiest way to get around this is to train a "universal" query that is used to collect relevant information from the hidden states (something similar to what was originally described in this paper).
If you approach the problem as sequence labeling, assigning a label not to an entire sequence, but to individual time steps, you might want to use a self-attentive layer instead.
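As a concrete (hypothetical) illustration of the trainable-query idea for a classification model like the one above, the sketch below scores every BiLSTM time step with a small learned projection, normalises the scores over time, and takes the weighted sum of the states. It is only one possible implementation, not the exact layer from the docs; masking is left out for brevity because propagating the mask into the attention weights takes extra care.
import tensorflow as tf
from tensorflow.keras import layers

def build_attention_model(timesteps, features, num_classes, units=128):
    inputs = layers.Input(shape=(timesteps, features))
    x = layers.Bidirectional(layers.LSTM(units, return_sequences=True))(inputs)
    # keep return_sequences=True on the last LSTM so the attention sees every time step
    x = layers.Bidirectional(layers.LSTM(units, return_sequences=True))(x)

    # score each time step with a learned projection (acting as the "universal" query),
    # normalise the scores over time, and take the weighted sum of the hidden states
    scores = layers.Dense(1)(x)                    # (batch, timesteps, 1)
    weights = layers.Softmax(axis=1)(scores)       # attention weights over time
    context = layers.Lambda(
        lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([x, weights])  # (batch, 2*units)

    outputs = layers.Dense(num_classes, activation='softmax')(context)
    return tf.keras.Model(inputs, outputs)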

Issues with Keras load_model function

I am building a CNN in Keras using a TensorFlow backend for speaker identification, and currently I am attempting to train the model and then save it as an .hdf5 file. The program trains the model for 100 epochs with early stopping and checkpoints, saving only the best model to a file, as illustrated in the code below:
class BuildModel:
    # Create First Model in Ensemble
    def createModel(self, model_input, n_outputs, first_session=True):
        if first_session != True:
            model = load_model('SI_ideal_model_fixed.hdf5')
            return model
        # Define Input Layer
        inputs = model_input
        # Define Densely Connected Layers
        conv = Dense(16, activation='relu')(inputs)
        conv = Dense(64, activation='relu')(conv)
        conv = Dense(16, activation='relu')(conv)
        conv = Reshape((conv.shape[1]*conv.shape[2]*conv.shape[3],))(conv)
        outputs = Dense(n_outputs, activation='softmax')(conv)
        # Create Model
        model = Model(inputs, outputs)
        model.summary()
        return model

    # Train the Model
    def evaluateModel(self, x_train, x_val, y_train, y_val, num_classes, first_session=True):
        # Model Parameters
        verbose, epochs, batch_size, patience = 1, 100, 64, 10
        # Determine Input and Output Dimensions
        x = x_train[0].shape[0]  # Number of MFCC rows
        y = x_train[0].shape[1]  # Number of MFCC columns
        c = 1  # Number of channels
        # Create Model
        inputs = Input(shape=(x, y, c), name='input')
        model = self.createModel(model_input=inputs,
                                 n_outputs=num_classes,
                                 first_session=first_session)
        # Compile Model
        model.compile(loss='categorical_crossentropy',
                      optimizer='adam',
                      metrics=['accuracy'])
        # Callbacks
        es = EarlyStopping(monitor='val_loss',
                           mode='min',
                           verbose=verbose,
                           patience=patience,
                           min_delta=0.0001)  # Stop training at right time
        mc = ModelCheckpoint('SI_ideal_model_fixed.hdf5',
                             monitor='val_accuracy',
                             verbose=verbose,
                             save_best_only=True,
                             mode='max')  # Save best model after each epoch
        reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                      factor=0.2,
                                      patience=patience//2,
                                      min_lr=1e-3)  # Reduce learning rate once learning stagnates
        # Evaluate Model
        model.fit(x_train, y=y_train, epochs=epochs,
                  callbacks=[es, mc, reduce_lr], batch_size=batch_size,
                  validation_data=(x_val, y_val))
        accuracy = model.evaluate(x=x_train, y=y_train,
                                  batch_size=batch_size,
                                  verbose=verbose)
        # Load Best Model
        model = load_model('SI_ideal_model_fixed.hdf5')
        return (accuracy[1], model)
However, it appears that the load_model function is not working properly, since the model achieved a validation accuracy of 0.56193 after the first training session but then started with a validation accuracy of only 0.2508 at the beginning of the second training session. (From what I have seen, the first epoch of the second training session should have a validation accuracy much closer to that of the best model.)
Moreover, I then attempted to test the trained model on a set of unseen samples with model.predict, and it failed on all six, often with high probabilities, which leads me to believe that it was using minimally trained (or untrained) weights.
So, my question is: could this be an issue with loading and saving the models using the load_model and ModelCheckpoint functions? If so, what is the best alternative method? If not, what are some good troubleshooting tips for improving the model's prediction functionality?
I am not sure what you mean by training session. What I would do is first train for a few epochs and note the validation accuracy. Then load the model and use evaluate() to get the same accuracy. If it differs, then yes, something is wrong with your loading. Here is what I would do:
def createModel(self, model_input, n_outputs):
    # Define Input Layer
    inputs = model_input
    # Define Densely Connected Layers
    conv = Dense(16, activation='relu')(inputs)
    conv2 = Dense(64, activation='relu')(conv)
    conv3 = Dense(16, activation='relu')(conv2)
    conv4 = Reshape((conv.shape[1]*conv.shape[2]*conv.shape[3],))(conv3)
    outputs = Dense(n_outputs, activation='softmax')(conv4)
    # Create Model
    model = Model(inputs, outputs)
    return model

# Train the Model
def evaluateModel(self, x_train, x_val, y_train, y_val, num_classes, first_session=True):
    # Model Parameters
    verbose, epochs, batch_size, patience = 1, 100, 64, 10
    # Determine Input and Output Dimensions
    x = x_train[0].shape[0]  # Number of MFCC rows
    y = x_train[0].shape[1]  # Number of MFCC columns
    c = 1  # Number of channels
    # Create Model
    inputs = Input(shape=(x, y, c), name='input')
    model = self.createModel(model_input=inputs,
                             n_outputs=num_classes)
    # Compile Model
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    # Callbacks
    es = EarlyStopping(monitor='val_loss',
                       mode='min',
                       verbose=verbose,
                       patience=patience,
                       min_delta=0.0001)  # Stop training at right time
    mc = ModelCheckpoint('SI_ideal_model_fixed.h5',
                         monitor='val_accuracy',
                         verbose=verbose,
                         save_best_only=True,
                         save_weights_only=False)  # Save best model after each epoch
    reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                  factor=0.2,
                                  patience=patience//2,
                                  min_lr=1e-3)  # Reduce learning rate once learning stagnates
    # Evaluate Model
    model.fit(x_train, y=y_train, epochs=5,
              callbacks=[es, mc, reduce_lr], batch_size=batch_size,
              validation_data=(x_val, y_val))
    accuracy = model.evaluate(x=x_val, y=y_val,
                              batch_size=batch_size,
                              verbose=verbose)
    # Load Best Model and evaluate it the same way
    model2 = load_model('SI_ideal_model_fixed.h5')
    model2.evaluate(x=x_val, y=y_val,
                    batch_size=batch_size,
                    verbose=verbose)
    return (accuracy[1], model)
The two evaluations should print the same thing really.
P.S. TF might change the order of your computations, so I used different names in the model (e.g. conv2, conv3, ...) to prevent that.

Keep training Keras model with loading and saving the weights

Since I cannot install h5py due to a package inconsistency, I am wondering whether it is possible to save and load the weights in Keras so I can keep training my model on new data. I know I can do the following:
old_weights = model.get_weights()
del model
new_model.set_weights(old_weights)
where model is the old model and new_model is the new one. Here is a complete example:
for i in training_data:
    model = Sequential()
    model.add(Dense(20, activation='tanh', input_dim=Input))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse')
    model.fit(X, y, epochs=8, batch_size=16, shuffle=False, verbose=0)

    new_model = Sequential()
    new_model.add(Dense(20, activation='tanh', input_dim=Input))
    new_model.add(Dense(1))
    new_model.compile(optimizer='adam', loss='mse')

    old_weights = model.get_weights()
    del model
    new_model.set_weights(old_weights)
    model = new_model
After reading each training example (X and y are different at each iteration), I want to save the weights, load them again, and continue from the pre-trained model. I am not sure whether my code does that, since I define the optimizer and call model.compile again at every iteration. Can anyone tell me whether the code above saves the model so that every iteration starts from the pre-trained one?
You don't need to keep recompiling the model. Instead just fit your model multiple times after loading your samples.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(20, activation='tanh', input_dim=Input))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

# load the data into training_data
for data in training_data:
    model.fit(data[0], data[1], epochs=8, batch_size=16, shuffle=False, verbose=0)
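If the weights also have to survive between program runs and h5py really cannot be installed, get_weights() returns a plain list of NumPy arrays, so you can persist them with NumPy instead of HDF5. A minimal sketch (the weights.npz filename is just an example):
import numpy as np

# save: get_weights() is just a list of NumPy arrays
weights = model.get_weights()
np.savez('weights.npz', *weights)          # stored as arr_0, arr_1, ...

# later, in a new run: rebuild the same architecture, then restore the arrays in order
data = np.load('weights.npz')
model.set_weights([data['arr_%d' % i] for i in range(len(data.files))])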

Keras - Save output of intermediate layer

What I have done:
I'm using the following code to implement my Keras model:
X, tx, Y, ty = train_test_split(X, Y, test_size=0.2, random_state=np.random.seed(7), shuffle=True)
X = np.reshape(X, (X.shape[0], 1, X.shape[1]))
tx = np.reshape(tx, (tx.shape[0], 1, tx.shape[1]))
model = Sequential()
model.add(LSTM(100, return_sequences=False, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(Y.shape[1], activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
filepath="weights-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
model.fit(X, Y, validation_split=.20,
          epochs=1000, batch_size=50, callbacks=callbacks_list, verbose=0)
OUTPUT:
Below is a part of the program's output:
Epoch 00993: val_acc did not improve
Epoch 00994: val_acc did not improve
Epoch 00995: val_acc did not improve
Epoch 00996: val_acc did not improve
Epoch 00997: val_acc did not improve
Epoch 00998: val_acc improved from 0.93900 to 0.94543, saving model to weights-998-0.94.hdf5
Epoch 00999: val_acc did not improve
PROBLEM:
I need to save the output of the LSTM layer at every epoch, but I do not know how.
Any idea?
You can use the Keras functional API. You will have to rewrite your model creation, but it's not much work. Then, when you write something like this:
lstm_output = LSTM(128, ...)(x)
the output of the LSTM layer will be in the lstm_output variable and you can save it in every iteration of every epoch.
I hope this answers your question.
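For a concrete sketch of that approach (only one possible setup, not the only way to do it: the second Model sharing the LSTM layer, the per-epoch .npy file names, and predicting on the full training set X are all assumptions), you could combine the functional API with a small callback:
import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense
from keras.callbacks import Callback

# rebuild the model with the functional API so the LSTM output is reachable
inp = Input(shape=(X.shape[1], X.shape[2]))
lstm_output = LSTM(100, return_sequences=False)(inp)
out = Dense(Y.shape[1], activation='softmax')(lstm_output)

model = Model(inputs=inp, outputs=out)
feature_model = Model(inputs=inp, outputs=lstm_output)   # shares the LSTM weights with model
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

class SaveLSTMOutput(Callback):
    """Writes the LSTM layer's output for the given inputs to disk after every epoch (sketch)."""
    def __init__(self, feature_model, inputs):
        super().__init__()
        self.feature_model = feature_model
        self.inputs = inputs

    def on_epoch_end(self, epoch, logs=None):
        features = self.feature_model.predict(self.inputs)
        np.save('lstm_output_epoch_{:04d}.npy'.format(epoch), features)

model.fit(X, Y, validation_split=0.2, epochs=1000, batch_size=50,
          callbacks=[SaveLSTMOutput(feature_model, X)], verbose=0)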

Keras earlystopping: print selected epoch

Simple question. I am using Keras earlystopping in the following form:
Earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=1, mode='auto')
How can I get Keras to print the selected epoch once the model has been fit? I think you have to use logs but don't quite know how.
Thanks.
Edit:
The full code is very long! Let me add a bit more than I gave. Hopefully it will help.
# Define model
def design_flexiNN(m_type, neurons, shape_timestep, shape_feature, activation, kernel_ini):
    model = Sequential()
    model.add(Dense(neurons, input_dim=shape_feature, activation=activation, use_bias=True, kernel_initializer=kernel_ini))
    model.add(Dense(1, use_bias=True))
    model.compile(loss='mae', optimizer='Adam')
    return model

# fit model
def fit_flexiNN(m_type, train_X, train_y, epochs, batch_size, test_X, test_y):
    history = model.fit(train_X, train_y, epochs=epochs, batch_size=batch_size, callbacks=callbacks_list, validation_data=(test_X, test_y), verbose=0, shuffle=False)
    return history

Earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=1, mode='auto')
callbacks_list = [Earlystop]
model = design_flexiNN(m_type, neurons, neurons_step, train_X_feature_shape, activation, kernel_ini)
history = fit_flexiNN(m_type, train_X, train_y, ini_epochs, batch_size, test_X, test_y)
I've been able to infer the selected epoch by doing len(history.history['val_loss']) minus 1, but that doesn't work if you have a patience above zero.
I've been trying to solve this myself and realised that the len(history.history['val_loss']) method is almost correct. All you need to do is subtract the patience:
len(history.history['val_loss']) - patience
which should give you the epoch number of the selected model (assuming that the model didn't run for the full number of epochs).
A slightly more thorough method would be:
model_loss = history.history["val_loss"]
epoch_chosen = model_loss.index(min(model_loss)) +1
print(epoch_chosen)
Hope this helps!
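Another option, assuming a reasonably recent Keras version, is to keep a reference to the EarlyStopping callback itself: it records the epoch at which training stopped in its stopped_epoch attribute (0 if training ran to completion). A sketch, reusing the names from the question:
Earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=1, mode='auto')
history = model.fit(train_X, train_y, epochs=ini_epochs, batch_size=batch_size,
                    callbacks=[Earlystop], validation_data=(test_X, test_y),
                    verbose=0, shuffle=False)

if Earlystop.stopped_epoch > 0:
    # the best model came roughly `patience` epochs before the stop;
    # whether stopped_epoch is 0- or 1-based differs between Keras versions
    print('Stopped at epoch:', Earlystop.stopped_epoch)
    print('Selected epoch (approx.):', Earlystop.stopped_epoch - 5)
else:
    print('Ran for the full number of epochs')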
