Apply any callback in a custom training loop in TF 2.0

I am writing a custom CycleGAN training loop following TF's documentation. I'd like to use several existing callbacks, and would prefer not to re-write their logic.
I've found this SO question, whose answers unfortunately focus on EarlyStopping rather than on the broad range of callbacks.
I also found this reddit post, which suggested calling them manually. Several callbacks, however, work with an internal model object (they call self.model, which is the model they are applied to).
How can I simply use the existing callbacks in a custom training loop? I'd appreciate any code outlines or further recommendations!
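One way to do what the reddit post describes is to drive the callbacks' hooks yourself. The sketch below is only an outline of that pattern; the model, the logs dict, and the hook sequence are illustrative, not taken from either linked post. set_model() is what populates a callback's internal self.model:

import tensorflow as tf

# Illustrative sketch: drive stock callbacks by hand in a custom loop.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer='adam', loss='mse')

callbacks = [
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=1),
    tf.keras.callbacks.ModelCheckpoint('ckpt.h5', monitor='val_loss',
                                       save_best_only=True),
]
for cb in callbacks:
    cb.set_model(model)   # this is what backs the callback's self.model
    cb.set_params({'epochs': 10, 'verbose': 1})

for cb in callbacks:
    cb.on_train_begin()
for epoch in range(10):
    for cb in callbacks:
        cb.on_epoch_begin(epoch)
    # ... your custom train / eval steps go here; fill logs from them ...
    logs = {'loss': 0.0, 'val_loss': 0.0}   # placeholder values
    for cb in callbacks:
        cb.on_epoch_end(epoch, logs=logs)
    if model.stop_training:                 # EarlyStopping sets this flag
        break
for cb in callbacks:
    cb.on_train_end()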

If you want to use the existing callbacks, create the callbacks you want and pass them to the callbacks argument of model.fit().
For example, to use ModelCheckpoint and EarlyStopping:
from tensorflow import keras

checkpoint = keras.callbacks.ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
es = keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=1)
Create a callbacks list:
callbacks_list = [checkpoint, es]
Pass the callbacks list to the callbacks argument while training the model:
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=10, callbacks=callbacks_list)
Please refer to this gist for a complete code example. Thank you.

Related

keras model.save() isn't saving

I have a Keras NN that I want to train and validate using two sets of data, and then test its final performance on a third set. To avoid having to rerun the training every time I restart my Google Colab runtime or want to change my test data, I want to save the final state of the model after training in one script and then load it again in another script.
I've looked everywhere and it seems that model.save("content/drive/My Drive/Directory/ModelName", save_format='tf') should do the trick, but even though it outputs INFO:tensorflow:Assets written to: content/drive/My Drive/Directory/ModelName/assets, nothing appears in my Google Drive, so I assume it isn't actually saving.
Please can someone help me solve this issue?
Thanks in advance!
The standard way of saving and retrieving your model's state after Google Colab terminates your connection is a feature called ModelCheckpoint. This is a Keras callback that runs after each epoch and saves your model, for instance whenever there's an improvement. Here are the steps needed to accomplish what you want:
Connect to Google Drive
Use this code to connect to Google Drive:
from google.colab import drive
drive.mount('/content/gdrive')
Give access to Google Colab
You'll then be presented with a link; follow it and authorize Google Colab by copying the given code into the text box.
Define your ModelCheckpoint
This is how you could define your ModelCheckpoint callback:
from keras.callbacks import ModelCheckpoint
filepath="/content/gdrive/My Drive/MyCNN/epochs:{epoch:03d}-val_acc:{val_acc:.3f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
Use it as a callback while training the model
Then you need to tell your model to run this callback after each epoch so that the model's state is saved:
model.fit(X_train, y_train,
          batch_size=64,
          epochs=epochs,
          verbose=1,
          validation_data=(X_val, y_val),
          callbacks=callbacks_list)
Load the model after Google Colab terminates
Finally, after your session has terminated, you can load your previous model's state by simply running the following code. Don't forget to re-define your model first; at this stage you only load the weights.
model.load_weights('/content/gdrive/My Drive/MyCNN/epochs:047-val_acc:0.905.hdf5')
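As a minimal sketch of that last step, build_model() below is a hypothetical helper; the architecture shown is illustrative and must match whatever actually produced the checkpoint:

from keras.models import Sequential
from keras.layers import Dense

def build_model():
    # Must re-create the exact architecture that produced the checkpoint,
    # otherwise load_weights() fails with a shape mismatch.
    model = Sequential()
    model.add(Dense(128, activation='relu', input_shape=(784,)))
    model.add(Dense(10, activation='softmax'))
    model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = build_model()
model.load_weights('/content/gdrive/My Drive/MyCNN/epochs:047-val_acc:0.905.hdf5')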
Hope that this answers your question.

Is there a standard way to implement callbacks with GridSearchCV

I love using GridSearchCV for hyperparameter tuning of machine learning models (mostly using sklearn here).
Is there a way to pass a function/lambda as a callback that would get called after every search combination? I need this to add custom logging and even send events in certain scenarios.
In fact, I'm looking for a pattern similar to Keras callbacks, where every callback is executed after every epoch.
Thanks for any insights.
I was searching for a way to get the current parameters in my callback and found your question; hope this helps someone. Extra keyword arguments to grid.fit() are forwarded to the underlying estimator's fit(), so when the estimator is a Keras model wrapped for scikit-learn, the callbacks reach model.fit():
grid = GridSearchCV(estimator=model, param_grid=param_grid, verbose=0, n_jobs=1)
grid_result = grid.fit(X_train, Y_train, callbacks=[YourCallback()])
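A fuller sketch of that pattern, assuming the older keras.wrappers.scikit_learn.KerasClassifier wrapper; build_model(), the grid values, the dummy data, and the callback body are all illustrative:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import Callback
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

def build_model(units=32):
    model = Sequential()
    model.add(Dense(units, activation='relu', input_shape=(20,)))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

class LoggingCallback(Callback):
    # Runs inside every fit of every parameter combination.
    def on_train_begin(self, logs=None):
        print('starting a fit for one parameter combination')

X_train = np.random.rand(100, 20)
Y_train = np.random.randint(0, 2, size=100)

model = KerasClassifier(build_fn=build_model, epochs=5, batch_size=32, verbose=0)
param_grid = {'units': [16, 32]}
grid = GridSearchCV(estimator=model, param_grid=param_grid, verbose=0, n_jobs=1)
grid_result = grid.fit(X_train, Y_train, callbacks=[LoggingCallback()])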

Python crashes when saving Keras model

I am building a model in Keras that contains roughly 4.2M parameters. When I try to save the model using ModelCheckpoint or using model.save('best_model.hdf5'), Python crashes.
The model runs without any issues when I comment out the code that saves the model, so there isn't any other issue that could be causing Python to crash.
My reasoning here is that the large number of parameters is causing Python to crash.
I have looked around but haven't been able to find any solution.
Are there any alternatives available to save my model and reuse it in Keras? Or is there a way to fix this issue?
checkpoint = ModelCheckpoint(filepath, monitor='val_mean_squared_error', verbose=1, save_best_only=True, mode='max')
model.save(filepath)
Python doesn't print any error. All that appears is a generic Python error dialog (screenshot: PythonErrorPopup).
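One commonly used Keras alternative (a general pattern, not specific to this thread) is to save the architecture and the weights separately, which sometimes succeeds where a full model.save() fails:

# General Keras pattern: persist the architecture as JSON, weights as HDF5.
with open('model.json', 'w') as f:
    f.write(model.to_json())
model.save_weights('model_weights.h5')

# Later, in another session:
from keras.models import model_from_json
with open('model.json') as f:
    model = model_from_json(f.read())
model.load_weights('model_weights.h5')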

Keras - method on_batch_end is slow but only callback I have is checkpoint

I set up a network with Keras using the TensorFlow backend.
When I train my network, I often keep getting the message:
UserWarning: Method on_batch_end() is slow compared to the batch update (0.195523). Check your callbacks.
% delta_t_median)
The issue is that my network is set up with only the checkpoint callback:
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
As far as I can see in the documentation, this callback is only invoked at the end of each epoch, so it can't slow down the on_batch_end method. Can anyone provide some information on what the issue is?
This is most probably a generator (fit_generator()) issue. When using a generator as the data source, it has to be called at the end of every batch. Consider revisiting your generator code, using multiprocessing (workers > 1), or a higher batch size (if possible).
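For illustration, a sketch of the multiprocessing route; the generator, step count, and values here are placeholders, not from the original post:

# Pre-fetch batches off the main thread so batch-end bookkeeping isn't starved:
model.fit_generator(
    train_generator,            # placeholder: your data generator
    steps_per_epoch=1000,       # placeholder value
    epochs=10,
    callbacks=callbacks_list,
    workers=4,                  # number of generator worker processes/threads
    use_multiprocessing=True,
    max_queue_size=20)          # batches buffered ahead of training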

TensorBoard Distributions and Histograms with Keras and fit_generator

I'm using Keras to train a CNN using the fit_generator function.
It seems to be a known issue that TensorBoard doesn't show histograms and distributions in this setup.
Did anybody figure out a way to make it work anyway?
There is no easy way to just plug it in with one line of code; you have to write your summaries by hand.
The good news is that it's not difficult and you can use the TensorBoard callback code in Keras as a reference.
(There is also a version 2 ready for TensorFlow 2.x.)
Basically, write a function, e.g. write_summaries(model), and call it whenever you want to write your summaries (e.g. just after your fit_generator()).
Inside your write_summaries(model) function, use tf.summary.histogram and the other tf.summary functions to log the data you want to see on TensorBoard.
If you don't know exactly how, check the official tutorial and this great example of MNIST with summaries.
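One way such a function might be structured (a TF 1.x-style sketch; the builder pattern, log directory, and tensor naming are my own choices, not from the answer above):

import tensorflow as tf
from keras import backend as K

def make_summary_writer(model, logdir='logs/manual'):
    # Build one histogram op per weight tensor, once, up front.
    for weight in model.weights:
        tf.summary.histogram(weight.name.replace(':', '_'), weight)
    merged = tf.summary.merge_all()
    writer = tf.summary.FileWriter(logdir, K.get_session().graph)

    def write_summaries(step):
        # Evaluate the merged summary op and flush it to disk.
        writer.add_summary(K.get_session().run(merged), global_step=step)
        writer.flush()

    return write_summaries

# Usage, e.g. after fit_generator():
# write_summaries = make_summary_writer(model)
# write_summaries(step=epoch)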
I believe bartgras's explanation is superseded in more recent versions of Keras (I'm using Keras 2.2.2). To get histograms in TensorBoard, all I did was the following (where bg is a data-wrangling class which exposes a generator for bg.training_batch(); bg.validation_batch(), however, is NOT a generator):
from datetime import datetime
import keras

NAME = "Foo_{}".format(datetime.now().isoformat(timespec='seconds')).replace(':', '-')
tensorboard = keras.callbacks.TensorBoard(
    log_dir="logs/{}".format(NAME),
    histogram_freq=1,
    write_images=True)
callbacks = [tensorboard]
history = model.fit_generator(
    bg.training_batch(),
    validation_data=bg.validation_batch(),
    epochs=EPOCHS,
    steps_per_epoch=bg.steps_per_epoch,
    validation_steps=bg.validation_steps,
    verbose=1,
    shuffle=False,
    callbacks=callbacks)
