Python/Keras - accessing ModelCheckpoint callback

I'm using Keras to predict a time series. As standard, I'm using 20 epochs. I want to know what my neural network predicted for each of the 20 epochs.
Using model.predict, I only get the last prediction. However, I want all predictions, or at least the last 10 (which have acceptable error levels).
To access those, I'm trying the ModelCheckpoint callback from Keras, but I'm having trouble accessing it afterwards. I'm using the following code:
model = Sequential()
model.add(GRU(input_dim=col, init='uniform', output_dim=20))
model.add(Dense(10))
model.add(Dense(5))
model.add(Activation("softmax"))
model.add(Dense(1))
model.compile(loss="mae", optimizer="RMSprop")
checkpoint = ModelCheckpoint(filepath='/Users/Alex/checkpoint.hdf5')
model.fit(X=predictor_train, y=target_train, nb_epoch=20, batch_size=batch,
          validation_split=0.1)  # best validation split at 0.1
model.evaluate(X=predictor_train, y=target_train, batch_size=batch,
               show_accuracy=True)
print checkpoint
Objectively, my questions are:
I expected that after running the code I would find a file named checkpoint.hdf5 inside the folder /Users/Alex; however, I didn't. What am I missing?
When I print checkpoint, what I get is a keras.callbacks.ModelCheckpoint object at 0x117471290. Is there a way to print what I want? What would the code look like?
Your help is very much appreciated :)

There are two problems in this code:
You are not passing the callback to the model's fit method. This is done with the keyword argument callbacks.
The filepath should contain placeholders (like "{epoch:02d}-{val_loss:.2f}") that Keras fills in with str.format in order to save each epoch to a different file.
So the correct version should be something like:
checkpoint = ModelCheckpoint(filepath='/Users/Alex/checkpoint-{epoch:02d}-{val_loss:.2f}.hdf5')
model.fit(X=predictor_train, y=target_train, nb_epoch=20,
          batch_size=batch, validation_split=0.1, callbacks=[checkpoint])
You can also add other kinds of callbacks in the list that is assigned to that keyword.
Unfortunately, the callback object doesn't store the history information, so it cannot be recovered from it; but you can load each saved checkpoint back and predict with it, as sketched below.
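A minimal sketch of that, assuming the filename pattern above and that your Keras version saves the full model from the checkpoint (if yours saves only the weights, rebuild the model and use load_weights instead):

import glob
from keras.models import load_model

# The zero-padded epoch number in the filename makes lexicographic
# sorting equal to epoch order.
for path in sorted(glob.glob('/Users/Alex/checkpoint-*.hdf5')):
    epoch_model = load_model(path)
    predictions = epoch_model.predict(predictor_train)
    print(path, predictions[:5])  # first few predictions for that epoch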

Related

keras problems loading custom model from yolov2

I've searched around for a couple of answers regarding load_model from Keras, but I still have a question.
I am following this model really closely (https://github.com/experiencor/keras-yolo2), and am training on a custom dataset.
I have done the training, which gives me a yolov2.h5 file, basically the model weights to fit into the Keras model. But I am encountering some problems with loading the model.
After loading the model (in a separate .py file):
model = load_model('file_dir/yolov2.h5')
First I encounter the issue
NameError: name 'tf' is not defined
I then searched around and modified my code to add custom objects, as such:
model = load_model('file_dir/yolov2.h5', custom_objects={'tf':tf})
This clears the first error but results in another
ValueError: Unknown loss function : custom_loss
I used the custom_loss function from yolov2 (https://github.com/experiencor/keras-yolo2/blob/master/frontend.py), so I tried to solve it with:
from frontend import YOLO
model = load_model('file_dir/yolov2.h5', custom_objects={'tf': tf, 'custom_loss': YOLO.custom_loss})
But ran into another error:
TypeError: custom_loss() missing 1 required positional argument
I got rather stuck here because I have no idea how to fill in the parameters for custom_loss (I don't particularly understand this part, since I'm loading my model in a different Python script, separate.py). I'm seeking some help regarding this. Thank you so much!
(Edit: this fix doesn't work for me either)
model = load_model('file_dir/yolov2.h5', compile = False)
To resolve this problem, since you already have the network at hand, only save the trained weights (like what the Keras checkpoint callback does).
For testing, build the model, with no need to compile it, and then load the trained weights using model.load_weights(path/to/saved/weights), as in the sketch below.
You can also use by_name=True if you build the network in a different way; in that case you must keep the layer names.
Another option is to set the weights manually: load the .h5 file with h5py (h5py.File(path/to/weights, mode='r'), for example; have a look at how Keras does it), then match the layer names of the model to those of the loaded weights.
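A minimal sketch of the weights-only route, assuming you rebuild the network with the keras-yolo2 YOLO class (the constructor arguments below are placeholders; fill in whatever configuration you trained with):

from frontend import YOLO

# Rebuild the same architecture; no compile is needed for inference.
yolo = YOLO(backend='Full Yolo',          # placeholder settings: use your
            input_size=416,               # own training configuration here
            labels=['label_a', 'label_b'],
            max_box_per_image=10,
            anchors=[0.57, 0.67, 1.87, 2.06])

# Load only the trained weights into the rebuilt model.
yolo.model.load_weights('file_dir/yolov2.h5')
# If you rebuilt the network differently but kept the layer names:
# yolo.model.load_weights('file_dir/yolov2.h5', by_name=True)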

Why do I have such inconsistent results when training my model?

I am using Keras to train my model.
I have initialised the numpy and tensorflow seeds. I made a 50-iteration loop where I train and test my Keras deep learning architecture (always the same) on the same training, validation and test sets. I get these results:
print(np.mean(train_accuracy_vec))
print(np.std(train_accuracy_vec))
print()
print(np.mean(valid_accuracy_vec))
print(np.std(valid_accuracy_vec))
print()
print(np.mean(test_accuracy_vec))
print(np.std(test_accuracy_vec))
print()
The standard deviations are high: sometimes the model gives unacceptable false positive rates, while at other times it works quite well. I used EarlyStopping based on val_acc behaviour.
So, what could cause such great instability?
Also, isn't it a bit odd that the validation score is far below the test score?
Thanks
EDIT: Despite @Thomas Pinetz's kind answer, I don't get better results on the second try: still a high std...
To be more precise, here is how my loop is made:
# tf, random and numpy seeds...
# lots of data reading, preprocessing, ... (including the split between train, valid and test sets)
for k in range(0, 50):
    print(k)
    model = Sequential()
    model.add(Dense(200, activation='elu', input_dim=trainX.shape[1],
                    init=keras.initializers.glorot_uniform(1)))
    model.add(Dropout(0.3))
    # some additional layers...
    model.compile(loss='binary_crossentropy', metrics=['accuracy'], optimizer='adam')
    model.fit(trainX, trainY, validation_data=(validX, validY), epochs=100,
              verbose=0, callbacks=callbacks_list)

    train_score = model.evaluate(trainX, trainY)
    train_accuracy_vec.append(train_score[1])
    print(train_score)
    trainPredict = model.predict(trainX)
    print(confusion_matrix(trainY, trainPredict.round()))
    # and the same for valid and test...
What causes the differences between runs is the random initialization of the weights. Gradient-descent-based methods get stuck in local minima, so the best solution found on each run depends on the initial weights. There's not much you can do about that; it's an inherent problem of neural networks. It might help to take a look at Xavier/He initialization, though.
As to why your validation error is considerably worse than your test error: it is indeed weird. However, if your dataset is relatively small and you use the same split on every run, it may just have happened that the test set has patterns similar to the training set while the validation set's differ. You'd better re-split on each run.
To obtain reproducible results in Keras, follow these instructions: https://keras.io/getting-started/faq/#how-can-i-obtain-reproducible-results-using-keras-during-development.
It might be that multi-threading is causing problems.
Edit:
Every time you call a method that returns a random number, the number depends on your initial seed. So while your script as a whole always returns the same values, e.g. the same mean and std for your training/validation/test evaluations, it will not use the same random numbers in each iteration of the for loop.
What you can try is to set the random seed at the beginning of each iteration of the loop; maybe then you will get the same results, as shown in the sketch below.
There is all kinds of randomness in generating and training a DL model, from the initialization of the weights to the order of the training set, which by default is shuffled. That initialization will not be the same if you do not reset the random seed, and the same goes for the dataset order: in each epoch your training data is shuffled, and this will differ on every run of the loop. There are also layers that use stochastic elements, such as Dropout, which need the same seed to guarantee the same performance.
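A minimal sketch of that re-seeding, following the Keras FAQ linked above (TF 1.x-era API, since the question predates TF 2; the single-threaded session config is what the FAQ recommends to rule out multi-threading):

import os
import random
import numpy as np
import tensorflow as tf
from keras import backend as K

os.environ['PYTHONHASHSEED'] = '0'  # ideally set before Python even starts

for k in range(0, 50):
    # Re-seed every source of randomness at the top of each iteration.
    np.random.seed(42)
    random.seed(42)
    tf.set_random_seed(42)
    # Force single-threaded execution so op order is deterministic.
    session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                                  inter_op_parallelism_threads=1)
    K.set_session(tf.Session(graph=tf.get_default_graph(), config=session_conf))
    # ... build, train and evaluate the model as in the loop above ...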

Keras Functional API changing layer names in every call

When I run the functional API in the model for k-fold cross-validation, the number in the dense layer's name increases in the fitted model returned for each fold.
For example, in the first fold it's dense_2_acc, then in the 2nd fold it's dense_5_acc.
But my model summary shows the model is correct. Why is it changing the names in the fitted model's history object for each fold?
This is a really good question, and it shows something really important about Keras: the reason the names change in this manner is that Keras does not clear previously defined variables even when you overwrite the model. You can easily check that the variables are still in session.graph by calling:
from keras import backend as K
K.get_session().graph.get_collection('variables')
In order to clear previous model variables one may call:
K.clear_session()
However, be careful, as you might lose an existing model. If you want to keep the names the same, you can simply name your layers by adding the name parameter to your layer instantiation, e.g.:
Dense(10, activation='softmax', name='output')
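A minimal sketch combining both suggestions in a k-fold loop (the layer sizes and fold count here are illustrative, not from the question):

from keras import backend as K
from keras.layers import Input, Dense
from keras.models import Model

for fold in range(5):
    K.clear_session()  # drop the previous fold's variables from the graph

    inputs = Input(shape=(100,))
    x = Dense(64, activation='relu', name='hidden')(inputs)
    outputs = Dense(10, activation='softmax', name='output')(x)
    model = Model(inputs, outputs)
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    # The layers are now named 'hidden' and 'output' in every fold,
    # instead of dense_2, dense_5, ... from the ever-growing global graph.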

Does calling the model.fit method again reinitialize the already trained weights?

I am using Keras to train a network. Let's say that after 20 epochs I want to stop the training to check that everything is fine, then continue from the 21st epoch. Does calling the model.fit method a second time reinitialize the already trained weights?
Does calling the model.fit method a second time reinitialize the already trained weights?
No, it will use your model's preexisting weights and perform updates on them. This means you can make consecutive calls to fit if you want to, and manage the process properly.
This is also true because in Keras you can save a model (with the save and load_model methods), load it back, and call fit on it. For more info on that, check this question.
Another option you got is to use the train_on_batch method instead:
train_on_batch(self, x, y, sample_weight=None, class_weight=None)
Runs a single gradient update on a single batch of data.
This way I think you may have more control between the updates of your model: you can check whether everything is fine with the training, and then continue to the next gradient update. A sketch of both options follows.
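A minimal sketch, assuming a compiled model and existing X_train/y_train (the batches() generator is a hypothetical stand-in for your own batching code):

# Option 1: consecutive fit calls. The second call resumes from the
# weights the first one produced; it does not reinitialize them.
model.fit(X_train, y_train, epochs=20)
# ... inspect weights, metrics or predictions here ...
model.fit(X_train, y_train, epochs=1)  # effectively the 21st epoch

# Option 2: step manually with train_on_batch for per-update control.
for X_batch, y_batch in batches(X_train, y_train, batch_size=32):
    loss = model.train_on_batch(X_batch, y_batch)
    # check `loss` here and decide whether to keep going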

Output the loss/cost function in keras

I am trying to find the cost function in Keras. I am running an LSTM with the categorical_crossentropy loss function, and I added a regularizer. How do I output what the cost function looks like after my regularizer is applied, for my own analysis?
model = Sequential()
model.add(LSTM(NUM_HIDDEN_UNITS,
               return_sequences=True,
               input_shape=(PHRASE_LEN, SYMBOL_DIM),
               kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(0.3))
model.add(LSTM(NUM_HIDDEN_UNITS, return_sequences=False))
model.add(Dropout(0.3))
model.add(Dense(SYMBOL_DIM))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(lr=1e-03, rho=0.9, epsilon=1e-08))
How do I output what the cost function looks like after my regularizer is applied, for my own analysis?
Surely you can achieve this by obtaining the output (yourlayer.output) of the layer you want to see and printing it (see here, and the sketch below). However, there are better ways to visualize these things.
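A minimal sketch of that printing route first, assuming a compiled model like the one above (the learning-phase flag is needed because of the Dropout layers; x_batch is a batch of inputs you supply):

from keras import backend as K

# Build a function mapping the model input to the last layer's output.
get_output = K.function([model.layers[0].input, K.learning_phase()],
                        [model.layers[-1].output])
print(get_output([x_batch, 0])[0])  # 0 = test mode, so Dropout is disabled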
Meet Tensorboard.
This is a powerful visualization tool that enables you to track and visualize your metrics, outputs, architecture, kernel initializations, etc. The good news is that there is already a Tensorboard Keras callback that you can use for this purpose; you just have to import it. To use it, pass an instance of the callback to your fit method, something like this:
from keras.callbacks import TensorBoard

# indicate the folder to save to, plus other options
tensorboard = TensorBoard(log_dir='./logs/run1', histogram_freq=1,
                          write_graph=True, write_images=False)
# save it in your callback list
callbacks_list = [tensorboard]
# then pass to fit as a callback; remember to use validation_data also
model.fit(X, Y, callbacks=callbacks_list, epochs=64,
          validation_data=(X_test, Y_test), shuffle=True)
After that, start your Tensorboard server (it runs locally on your PC) by executing:
tensorboard --logdir=logs/run1
For example, this is what my kernels look like on two different models I tested (to compare them you have to save separate runs and then start Tensorboard on the parent directory instead). This is on the Histograms tab, on my second layer:
The model on the left I initialized with kernel_initializer='random_uniform', thus its shape is that of a uniform distribution. The model on the right I initialized with kernel_initializer='normal', which is why it appears as a Gaussian distribution throughout my epochs (about 30).
This way you can visualize how your kernels and layers look, in a more interactive and understandable way than printing outputs. This is just one of the great features Tensorboard has, and it can help you develop your deep learning models faster and better.
Of course there are more options for the Tensorboard callback and for Tensorboard in general, so I do suggest you read the links provided thoroughly if you decide to attempt this. For more information you can check this and also this question.
Edit: So, you comment that you want to know how your regularized loss "looks" analytically. Let's remember that adding a regularizer to a loss function basically extends the loss function to include some "penalty" or preference in it. So, if you are using categorical cross-entropy as your loss function and adding an l2 regularizer (that is, the squared Euclidean norm) with a weight of 0.01, your whole loss function would look something like:
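loss(y, \hat{y}) = -\sum_{i} y_i \log(\hat{y}_i) + 0.01 \sum_{l} \lVert W_l \rVert_2^2

where the first term is the categorical cross-entropy and the second term sums the squared l2 norms of the kernel weight matrices W_l of the regularized layers (in the code above, only the first LSTM's kernel).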
