How to log Keras loss output to a file - python

When you run a Keras neural network model you might see something like this in the console:
Epoch 1/3
6/1000 [..............................] - ETA: 7994s - loss: 5111.7661
As time goes on the loss hopefully improves. I want to log these losses to a file over time so that I can learn from them. I have tried:
logging.basicConfig(filename='example.log', filemode='w', level=logging.DEBUG)
but this doesn't work. I am not sure what level of logging I need in this situation.
I have also tried using a callback, like:
def generate_train_batch():
    while 1:
        for i in xrange(0, dset_X.shape[0], 3):
            yield dset_X[i:i+3, :, :, :], dset_y[i:i+3, :, :]

class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.losses = []

    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))

logloss = LossHistory()
colorize.fit_generator(generate_train_batch(), samples_per_epoch=1000, nb_epoch=3, callbacks=[logloss])
but obviously this isn't writing to a file. Whatever the method (a callback, the logging module, or anything else), I would love to hear your solutions for logging the loss of a Keras neural network to a file. Thanks!

You can use the CSVLogger callback. For example:
from keras.callbacks import CSVLogger
csv_logger = CSVLogger('log.csv', append=True, separator=';')
model.fit(X_train, Y_train, callbacks=[csv_logger])
Look at: Keras Callbacks

There is a simple solution to your problem. Every time one of the fit methods is used, it returns a special callback called the History callback. It has a history attribute, which is a dictionary of all metrics recorded after every epoch. To get the list of loss values after every epoch, you can easily do:
history_callback = model.fit(params...)
loss_history = history_callback.history["loss"]
It's easy to save such a list to a file (e.g. by converting it to a numpy array and using the savetxt method).
UPDATE:
Try:
import numpy
numpy_loss_history = numpy.array(loss_history)
numpy.savetxt("loss_history.txt", numpy_loss_history, delimiter=",")
UPDATE 2:
How to record the loss after every batch is described in the Keras Callbacks documentation, in the "Create a callback" paragraph.
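For reference, a minimal sketch of such a per-batch logging callback could look like the following (the file name batch_loss.csv and the CSV layout are my own choices, not taken from the documentation):
import csv
import keras

class BatchLossLogger(keras.callbacks.Callback):
    # Appends the loss of every batch to a CSV file.
    def __init__(self, path='batch_loss.csv'):
        super(BatchLossLogger, self).__init__()
        self.path = path

    def on_train_begin(self, logs=None):
        # open the file once per training run and write a header
        self.file = open(self.path, 'w')
        self.writer = csv.writer(self.file)
        self.writer.writerow(['batch', 'loss'])

    def on_batch_end(self, batch, logs=None):
        logs = logs or {}
        self.writer.writerow([batch, logs.get('loss')])

    def on_train_end(self, logs=None):
        self.file.close()

# usage: model.fit(..., callbacks=[BatchLossLogger()])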

Old question, but here goes. The Keras history output maps neatly onto a pandas DataFrame.
If you want the entire history to csv in one line:
pandas.DataFrame(model.fit(...).history).to_csv("history.csv")
Cheers

You can redirect the sys.stdout object to a file before calling model.fit and restore it to the standard console afterwards, as follows:
import sys

oldStdout = sys.stdout
file = open('logFile', 'w')
sys.stdout = file
model.fit(Xtrain, Ytrain)
sys.stdout = oldStdout
file.close()
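A slightly safer variant of the same idea (my own sketch, assuming Python 3) is to use contextlib.redirect_stdout, which restores stdout even if model.fit raises an exception:
import contextlib

with open('logFile', 'w') as file:
    with contextlib.redirect_stdout(file):
        # everything Keras prints (progress bar, per-epoch loss) goes to logFile
        model.fit(Xtrain, Ytrain)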

In TensorFlow 2.0 it is quite easy to get the loss and accuracy of each epoch because model.fit returns a History object. Its History.history attribute is a record of training loss values and metric values at successive epochs, as well as validation loss values and validation metric values.
If you have validation data:
History = model.fit(trainX,trainY,validation_data = (testX,testY),batch_size= 100, epochs = epochs,verbose = 1)
train_loss = History.history['loss']
val_loss = History.history['val_loss']
acc = History.history['accuracy']
val_acc = History.history['val_accuracy']
If you don't have validation data:
History = model.fit(trainX,trainY,batch_size= 100, epochs = epochs,verbose = 1)
train_loss = History.history['loss']
acc = History.history['accuracy']
Then, to save the list to a text file, use the code below:
import numpy as np
train_loss = np.array(train_loss)  # the list taken from History.history['loss'] above
np.savetxt("train_loss.txt", train_loss, delimiter=",")

It is best to create a LambdaCallback:
from keras.callbacks import LambdaCallback
import json

txt_log = open('loss_log.txt', mode='wt', buffering=1)
save_op_callback = LambdaCallback(
    on_epoch_end=lambda epoch, logs: txt_log.write(
        json.dumps({'epoch': epoch, 'loss': logs['loss']}) + '\n'),
    on_train_end=lambda logs: txt_log.close()
)
Now, just add it to the callbacks list in model.fit:
model.fit(...,callbacks = [save_op_callback])

Related

Saving accuracy and loss with callback on colab

So I'm trying to train a model on Colab, and it is going to take roughly 70-72 hours of continuous running. I have a free account, so I get kicked off due to over-use or inactivity pretty frequently, which means I can't just dump the history into a pickle file at the end.
history = model.fit_generator(custom_generator(train_csv_list,batch_size), steps_per_epoch=len(train_csv_list[:13400])//(batch_size), epochs=1000, verbose=1, callbacks=[stop_training], validation_data=(x_valid,y_valid))
I found the CSVLogger callback and added it to my callback as below, but it won't create model_history_log.csv for some reason. I don't get any error or warning. What part am I doing wrong?
My goal is to save only the accuracy and loss throughout the training process.
class stop_(Callback):
    def on_epoch_end(self, epoch, logs={}):
        model.save(Path("/content/drive/MyDrive/.../model" + str(int(epoch))))
        CSVLogger("/content/drive/MyDrive/.../model_history_log.csv", append=True)
        if logs.get('accuracy') > ACCURACY_THRESHOLD:
            print("\nReached %2.2f%% accuracy, so stopping training!!" % (ACCURACY_THRESHOLD * 100))
            self.model.stop_training = True

stop_training = stop_()
Also, since I'm saving the model at every epoch, does the model save this information? So far I haven't found anything, and I doubt it saves accuracy, loss, val accuracy, etc.
I think you want to write your callback as follows:
import os
import pandas as pd
import tensorflow as tf

class STOP(tf.keras.callbacks.Callback):
    def __init__(self, model, csv_path, model_save_dir, epochs, acc_thld):  # initialization of the callback
        # model is your compiled model
        # csv_path is the path where the csv file will be stored
        # model_save_dir is the path to the directory where model files will be saved
        # epochs is the number of epochs you set in model.fit
        # acc_thld is the accuracy threshold at which training stops
        self.model = model
        self.csv_path = csv_path
        self.model_save_dir = model_save_dir
        self.epochs = epochs
        self.acc_thld = acc_thld
        self.acc_list = []    # create empty list to store accuracy
        self.loss_list = []   # create empty list to store loss
        self.epoch_list = []  # create empty list to store the epochs

    def on_epoch_end(self, epoch, logs=None):  # method runs at the end of each epoch
        savestr = '_' + str(epoch + 1) + '.h5'  # model will be saved as an .h5 file named _epoch.h5
        save_path = os.path.join(self.model_save_dir, savestr)
        acc = logs.get('accuracy')  # get the accuracy for this epoch
        loss = logs.get('loss')  # get the loss for this epoch
        self.model.save(save_path)  # save the model
        self.acc_list.append(acc)
        self.loss_list.append(loss)
        self.epoch_list.append(epoch + 1)
        if acc > self.acc_thld or epoch + 1 == self.epochs:  # check if acc > threshold or if this was the last epoch
            self.model.stop_training = True  # stop training
            Eseries = pd.Series(self.epoch_list, name='Epoch')
            Accseries = pd.Series(self.acc_list, name='accuracy')
            Lseries = pd.Series(self.loss_list, name='loss')
            df = pd.concat([Eseries, Lseries, Accseries], axis=1)  # create a dataframe with columns Epoch, loss, accuracy
            df.to_csv(self.csv_path, index=False)  # convert the dataframe to a csv file and save it
            if acc > self.acc_thld:
                print('\nTraining halted on epoch ', epoch + 1, ' when accuracy exceeded the threshold')
Then, before you run model.fit, use the code:
epochs = 20  # set the number of epochs for model.fit and the callback
sdir = r'C:\Temp\stooges'  # set the directory where saved model files and the csv file will be stored
acc_thld = .98  # set the accuracy threshold
csv_path = os.path.join(sdir, 'traindata.csv')  # name of the csv file to be saved in sdir
callbacks = STOP(model, csv_path, sdir, epochs, acc_thld)  # instantiate the callback
Remember to set callbacks=[callbacks] in model.fit. I tested this on a simple dataset. It ran for only 3 epochs before the accuracy exceeded the threshold of .98. Since it ran for 3 epochs, it created 3 saved model files in sdir, labeled
_1.h5
_2.h5
_3.h5
It also created the csv file labelled traindata.csv. The csv file content was:
Epoch   loss       accuracy
1       8.086007   .817778
2       6.911876   .974444
3       6.129871   .987778
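As an aside, the reason the original attempt produced no csv file is that CSVLogger(...) was only instantiated inside on_epoch_end and never passed to fit, so Keras never invoked it. A minimal sketch of that simpler route (reusing the asker's fit_generator call and paths, not a tested fix) would be:
from tensorflow.keras.callbacks import CSVLogger

csv_logger = CSVLogger("/content/drive/MyDrive/.../model_history_log.csv", append=True)
history = model.fit_generator(custom_generator(train_csv_list, batch_size),
                              steps_per_epoch=len(train_csv_list[:13400]) // batch_size,
                              epochs=1000,
                              verbose=1,
                              callbacks=[stop_training, csv_logger],  # CSVLogger must be in the callbacks list
                              validation_data=(x_valid, y_valid))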

Evaluating the model during training affects its performance in PyTorch

In PyTorch, I want to evaluate my model on the validation set every eval_step during training, and I wrote code like this:
def tune(model, loader_train, loader_dev, optimizer, epochs, eval_step):
    for epoch in range(epochs):
        for step, x in enumerate(loader_train):
            optimizer.zero_grad()
            loss = model(x)
            loss.backward()
            optimizer.step()
            if step % eval_step == 0:
                model.eval()
                test(model, loader_dev)
                model.train()
With eval_step = int(len(loader_train)/2) and eval_step = int(len(loader_train)/8), I get quite different metric results after training through one whole epoch (that is, the second output of the former differs from the eighth output of the latter).
Could anyone explain why?
The length of loader_train is 20000 (it depends on batch size), and here is my testing script:
def test(model, loader_dev):
    preds = []
    labels = []
    for step, x in enumerate(loader_dev):
        preds.append(model(x).view(-1))
        labels.append(x['label'].view(-1))
    metric = cal_metric(preds, labels)
    logger.info(metric)
I think you probably set shuffle=True in your DataLoader. Even though you fix the random seed, a PyTorch DataLoader will produce different results if you iterate another DataLoader while using the current one. In the scenario you describe, this can cause your model to receive the training data in a different order, which then leads to different metric results.
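Following that suggestion, one way to decouple the two loaders (a sketch of my own; the dataset names are placeholders, and it assumes a PyTorch version whose DataLoader accepts a generator argument) is to give the training DataLoader its own torch.Generator so that its shuffle order no longer depends on the global RNG state:
import torch
from torch.utils.data import DataLoader

train_gen = torch.Generator()
train_gen.manual_seed(42)  # the training shuffle order now depends only on this generator

loader_train = DataLoader(train_dataset, batch_size=32, shuffle=True,
                          generator=train_gen)  # train_dataset is a placeholder name
loader_dev = DataLoader(dev_dataset, batch_size=32,
                        shuffle=False)  # no shuffling needed for evaluation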

Tensorflow 2.0: Custom metric (balanced accuracy score) for modelcheckpoint not working

I would like to implement a model checkpoint callback based on the balanced accuracy score. For this, I implemented the following class:
class BalAccScore(keras.callbacks.Callback):
    def __init__(self, validation_data=None):
        super(BalAccScore, self).__init__()
        self.validation_data = validation_data

    def on_train_begin(self, logs={}):
        self.balanced_accuracy = []

    def on_epoch_end(self, epoch, logs={}):
        y_predict = tf.argmax(self.model.predict(self.validation_data[0]), axis=1)
        y_true = tf.argmax(self.validation_data[1], axis=1)
        balacc = balanced_accuracy_score(y_true, y_predict)
        self.balanced_accuracy.append(round(balacc, 6))
        logs["val_bal_acc"] = balacc
        keys = list(logs.keys())
        print("\n ------ validation balanced accuracy score: %f ------\n" % balacc)
I then define the following callbacks:
balAccScore = BalAccScore(validation_data=(X_2, y_2))
mc = ModelCheckpoint(filepath=callback_path, monitor="val_bal_acc", verbose=1, save_best_only=True, save_freq='epoch')
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['val_bal_acc'])
history = model.fit(X_1, y_1, epochs=5, batch_size=512,
                    callbacks=[balAccScore, mc],
                    validation_data=(X_2, y_2))
I then get the error
ValueError: Unknown metric function: val_bal_acc
despite the fact that I do find it under history when using, for example, accuracy instead, i.e. by setting metrics=["acc"] when compiling. In that case I get the expected warning:
WARNING:tensorflow:Can save best model only with val_bal_acc available, skipping.
but otherwise the model runs perfectly. I am not sure why it does not run otherwise.
You should just remove the quotation marks in compile:
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=[val_bal_acc])
or at least this is how it works in R.
You're getting that error because you're not passing the balanced_accuracy_score as a value for the metrics argument when compiling the model. The string 'val_bal_acc' that you passed in the metrics argument doesn't work because it's not a known metric: you can access metrics by their string name only for metrics already implemented in tf.keras.metrics. If you want to monitor the validation balanced accuracy during training, you should implement a custom metric class (you can look here to see how) and then pass it to the metrics argument. Once you've done this, you can monitor your custom metric using the name you gave it with the prefix 'val_' if you want to monitor it at validation time. There is no need for a supplementary custom callback as you did; the logs are updated automatically once you've defined the metric. For this particular case, you can find some implementations of this metric in the answers to this question.
If you prefer a callback approach instead you don't need to define a custom metric but take advantage of the already logged metrics. You can find an implementation of that in my answer here.
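To make the callback route concrete: the asker's BalAccScore already writes logs['val_bal_acc'], so a sketch of that approach (my own illustration, not a verified fix) is to drop the unknown string from metrics, keep the callback, and let ModelCheckpoint monitor the key the callback adds, listing the metric callback before the checkpoint so the value exists when the checkpoint runs:
balAccScore = BalAccScore(validation_data=(X_2, y_2))
mc = ModelCheckpoint(filepath=callback_path, monitor="val_bal_acc", mode="max",
                     verbose=1, save_best_only=True, save_freq='epoch')

# compile with built-in metrics only; 'val_bal_acc' is produced by the callback, not by compile()
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

history = model.fit(X_1, y_1, epochs=5, batch_size=512,
                    callbacks=[balAccScore, mc],  # BalAccScore runs first, so the log entry exists for ModelCheckpoint
                    validation_data=(X_2, y_2))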

CSVLogger does not work with model.evaluate in Keras

I want to log the evaluation result to a file with TensorFlow 2.0 (Keras). I use the callback method. It works well with model.fit but does not seem to work with model.evaluate.
This is my code:
csv_logger = CSVLogger(logfile, append=True, separator=';')
model.fit(train_dataset, epochs=50, callbacks=[csv_logger])  # works
model.evaluate(test_dataset, callbacks=[csv_logger])  # does not work
Am I going about logging the evaluation result to a file the wrong way?
This is expected behaviour. CSVLogger implements callback methods only for training.
For evaluate, the on_test_begin(), on_test_batch_begin(), on_test_batch_end() and on_test_end() callback methods are invoked by the Keras training module; these are not implemented by CSVLogger, and hence no csv file is created when it is used with model.evaluate().
https://github.com/tensorflow/tensorflow/blob/v2.0.0-alpha0/tensorflow/python/keras/callbacks.py#L1514
A custom CSV callback could be implemented like this:
class MyCSVLogger(Callback):
    def __init__(self, filename):
        self.filename = filename
        print(filename)

    def on_test_begin(self, logs=None):
        # open csv file
        print('test begin')

    def on_test_batch_begin(self, batch, logs=None):
        pass

    def on_test_batch_end(self, batch, logs=None):
        # write the contents of the dictionary logs to the csv file
        # sample content of logs: {'batch': 0, 'size': 2, 'loss': -0.0, 'accuracy': 1.0}
        print(logs)

    def on_test_end(self, logs=None):
        # close csv file
        print('test end')

csv_logger = MyCSVLogger('abc.csv')
model.evaluate(X_eval, y_eval, callbacks=[csv_logger])
Just here to add my 2 cents to Manoj Mohan's great answer. It is somewhat hacky, but I found that adding the following lines of code makes it work for TF2.1:
from tensorflow.keras.callbacks import CSVLogger
CSVLogger.on_test_begin = CSVLogger.on_train_begin
CSVLogger.on_test_batch_end = CSVLogger.on_epoch_end
CSVLogger.on_test_end = CSVLogger.on_train_end
Basically it just adds test-case functions to the CSVLogger class that are identical to the train-case functions. The only caveat is that it will also create an epoch column in your CSV file; the saved statistics are from the batches, though.
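For instance, after applying the patch above, the evaluate call from the question would then write to a file (the name eval_log.csv is my own choice):
from tensorflow.keras.callbacks import CSVLogger

csv_logger = CSVLogger('eval_log.csv', append=True, separator=';')
model.evaluate(test_dataset, callbacks=[csv_logger])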

Callback returning train and validation performance

Could you help me understand how to implement a callback function which determines the performance of a model on the test and validation data?
I got a bit confused reading this in a nice blog entry:
len(self.model.validation_data) == 3,
because validation_data[0] ==> train_x (that you input in
model.fit()),
validation_data[1] ==> train_y,
validation_data[2]=sample_weight.
As you can see, the blogger mentions that validation_data is here the training data. From the keyword "validation_data", which is also used in model.fit, I would assume that self.model.validation_data returns the validation data, and that something like self.model.x would return the training data (as model.fit(x=...,) is the input for the training data).
Could someone shed light on this?
Thanks in advance for your help
edit:
I checked with dir(self) whether there is something like an x for the training data. But indeed, there is only validation_data. Could someone explain to me how I can differentiate between test and validation data?
Is validation_data always the training data when used in def on_train_end(self, logs={}):, and does validation_data become the actual validation set when using def on_epoch_end(self, epoch, logs={}):?
edit 2:
In the same blog the author talks about self.model.training_data, which he found but could not retrieve. I searched for this, but apparently it has been removed.
So my question, put more precisely: how can you access, in a callback, the training data used at the end of an epoch?
The answer is quite simple.
Before you call model.fit() you can attach anything to the model.
I just attached X_train and Y_train to it. You can attach the validation or test data instead, whatever you want.
model.X_train=X_train
model.Y_train=Y_train
Then you use those values with the sklearn library to calculate any performance metric (accuracy, F1, kappa, ...).
First, add variables to self so that each epoch's result can later be appended to them:
class yourowncallbackname(Callback):
    def on_train_begin(self, logs={}):
        self.val_f1 = []      # define the variables that collect the results
        self.val_kappa = []
        self.train_f1 = []
        self.train_kappa = []
Then define what happens at the end of each epoch in the same class yourowncallbackname (on_epoch_end works best). Here you use the data attached to the model (e.g. model.X_train) to get the results from model.predict:
def on_epoch_end(self, epoch, logs={}):
    # LOAD DATA
    train_predict = np.asarray(self.model.predict(self.model.X_train))
    train_true = self.model.Y_train
    val_predict = np.asarray(self.model.predict(self.validation_data[0]))
    val_true = self.validation_data[1]
Maybe, depending on what you want to predict, you have to round the predictions with .round():
val_predict = (np.asarray(self.model.predict(self.model.X_train))).round()
Then calculate any performance metric you like on the predicted data (matching the variables defined in on_train_begin), e.g. for training and validation:
_train_f1= f1_score(train_true, train_predict, labels=...)
_train_k= cohen_kappa_score(train_true.ravel(),train_predict,labels=...)
_val_f1= f1_score(val_true, val_predict, labels=...)
_val_k= cohen_kappa_score(val_true.ravel(), val_predict, labels=...)
To collect them, now append them to self. They are stored on the callback itself, which you can access again later:
self.train_f1.append(_train_f1)
self.train_kappa.append(_train_k)
self.val_f1.append(_val_f1)
self.val_kappa.append(_val_k)
You now have to instantiate this callback wherever you want to fit your model:
callbackmetric=yourowncallbackname()
Then you can put it into the fit:
history = model.fit(...,
                    callbacks=[callbackmetric])
If you now want to use the appended per-epoch results, you can access them as attributes of the callback:
Yourcallbackresults.val_f1=callbackmetric.val_f1
Yourcallbackresults.train_f1=callbackmetric.train_f1
For completeness, I will post the full callback class again. You just have to fill in the correct performance calculations with all their parameters, etc.:
class yourowncallbackname(Callback):
    def on_train_begin(self, logs={}):
        self.val_f1 = []
        self.val_k = []
        self.train_f1 = []
        self.train_k = []

    def on_epoch_end(self, epoch, logs={}):
        # LOAD DATA
        train_predict = np.asarray(self.model.predict(self.model.X_train))
        train_true = self.model.Y_train
        val_predict = np.asarray(self.model.predict(self.validation_data[0]))
        val_true = self.validation_data[1]
        # CALCULATE PERFORMANCE
        _train_f1 = f1_score(...)
        _train_k = cohen_kappa_score(...)
        _val_f1 = f1_score(...)
        _val_k = cohen_kappa_score(...)
        self.train_f1.append(_train_f1)
        self.train_k.append(_train_k)
        self.val_f1.append(_val_f1)
        self.val_k.append(_val_k)
        print(" val_f1: %f val_k: %f" % (_val_f1, _val_k))
        return
