Trial ID of the best model - python

I'm using AutoKeras to find the best regression model and I need to plot the learning curves of the best model. However, I cannot find the trial ID of the best model. Here is a part of my code:
# Path for saving the logs
import datetime
%load_ext tensorboard
path = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=path)
# Creating the Keras search type
regressor = ak.StructuredDataRegressor(
    output_dim=2,
    loss="mean_squared_error",
    metrics=["mae", "mean_squared_error"],
    project_name="structured_data_regressor",
    max_trials=3,
    directory=None,
    objective="val_mean_squared_error",
    tuner=None,
    overwrite=False,
    seed=100,
)
# Fitting the ANN to the training set, holding out validation data via validation_split
history = regressor.fit(x=X_train, y=y_train, epochs=5, callbacks=[tensorboard_callback], validation_split=0.2)
# Learning curves
%tensorboard --logdir logs/fit
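One possible way to recover it, assuming the fitted regressor exposes its underlying KerasTuner instance as regressor.tuner (an AutoKeras internal that may change between versions), is to ask the oracle for the best trial:

# Hedged sketch: regressor.tuner and oracle.get_best_trials are assumptions
# about AutoKeras/KerasTuner internals, not a documented AutoKeras API.
best_trial = regressor.tuner.oracle.get_best_trials(num_trials=1)[0]
print(best_trial.trial_id)  # this ID appears in the per-trial log directories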

Related

Receiving validation data from the validation_split within model training with TensorFlow

I have a problem. I used validation_split to split my data, and I got the accuracy and loss after each epoch.
However, I would like to visualize a ROC curve based on the validation data. Is there a way to get this validation data back and create the ROC curve from it?
loss = keras.losses.categorical_crossentropy
optim = keras.optimizers.Adam(learning_rate=0.0009)
metrics = ["accuracy"]
model_lstm.compile(loss=loss, optimizer=optim, metrics=metrics)
history = model_lstm.fit(train_X, train_y, batch_size=32, epochs=10, validation_split=0.15, callbacks=CALLBACKS)
# What I want
X_val, y_val = history.validation_data
y_true = model.predict(X_val)
...
From what I see in the code (https://github.com/keras-team/keras/blob/HEAD/keras/engine/training.py#L1304),
it's not possible to get the validation data back to compute your ROC curve when using validation_split.
However, you can split the dataset into train, validation, and test sets beforehand and pass the validation set to the fit function via the validation_data parameter, as sketched below.
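A minimal sketch of that approach, reusing the names from the question and assuming the labels are one-hot encoded NumPy arrays (the 15% split mirrors the original validation_split; the ROC is computed for one class):

from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, auc

# Hold out the validation set yourself so it stays available afterwards
X_tr, X_val, y_tr, y_val = train_test_split(train_X, train_y, test_size=0.15)
history = model_lstm.fit(X_tr, y_tr, batch_size=32, epochs=10,
                         validation_data=(X_val, y_val), callbacks=CALLBACKS)
# ROC curve for class 0 of the one-hot encoded labels
y_score = model_lstm.predict(X_val)
fpr, tpr, _ = roc_curve(y_val[:, 0], y_score[:, 0])
print("AUC:", auc(fpr, tpr))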

Train many neural networks and pick best one

I'm working on a classification task, trying to reconstruct a network from a paper. In that paper, they do a train/test split 300 times, train the network on each split, and then take the mean of all the networks' predictions for a given input.
So here's the question: what is the best way to do that? I've already reconstructed their network and am thinking about using a for loop and saving the outputs of each network in a data frame, but I can't get it right.
Here's the code :
# Set X and Y for training
X = dum_bll_fsrq.drop(['type2', 'name', 'Type_is_bll', 'Type_is_fsrq'], axis=1)
Y = dum_bll_fsrq.iloc[:, -2:]
# Train test split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, stratify=Y)
# Create model
model_two_neuron = tf.keras.Sequential([
    tf.keras.layers.Dense(40, input_shape=(15,)),  # input shape required
    tf.keras.layers.Dense(2, activation=tf.nn.sigmoid)
])
model_two_neuron.compile(optimizer=tf.keras.optimizers.Adam(),
                         loss=tf.keras.losses.MeanSquaredError(),
                         metrics=[tf.keras.metrics.Precision()])
# Train
model_two_neuron.fit(X_train, y_train, epochs=20)
You can use callbacks to save the best weights for each of your models, then evaluate the best results saved by the callbacks after training.
Here is a basic example, provided in the documentation:
model.compile(loss=..., optimizer=..., metrics=['accuracy'])
EPOCHS = 10
checkpoint_filepath = '/tmp/checkpoint'
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor='val_accuracy',
    mode='max',
    save_best_only=True)
# Model weights are saved at the end of every epoch, if it's the best seen so far.
model.fit(epochs=EPOCHS, callbacks=[model_checkpoint_callback])
# The model weights (that are considered the best) are loaded into the model.
model.load_weights(checkpoint_filepath)
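For the 300 repeated splits described in the question, a minimal sketch could look like the following, where build_model() is an assumed helper that recreates model_two_neuron and X_new stands for the specific input data to score (both names are illustrative):

import numpy as np
from sklearn.model_selection import train_test_split

all_preds = []
for i in range(300):
    # Fresh split and fresh network on every iteration
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, stratify=Y)
    model = build_model()                    # assumed helper rebuilding the network
    model.fit(X_train, y_train, epochs=20, verbose=0)
    all_preds.append(model.predict(X_new))   # X_new: the inputs to score
# Mean of all per-network predictions, as described in the paper
mean_pred = np.mean(all_preds, axis=0)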

Tensorflow 2.0: How are metrics computed when the output is sequential?

I have been working with binary sequential inputs and outputs using TensorFlow 2.0, and I've been wondering which approach TensorFlow uses to compute metrics such as recall or accuracy during training in those scenarios.
Each sample to my network consists of 60 timesteps, each with 300 features, so my expected output is a (60, 1) array of 1s and 0s. Suppose I have 2000 validation samples. When evaluating the validation set for each epoch, does TensorFlow concatenate all of the 2000 samples into a single (2000*60=120000, 1) array and then compare it to the concatenated ground-truth labels, or does it evaluate each of the (60, 1) arrays individually and then return the mean of those values? Is there any way to modify this behavior?
TensorFlow/Keras by default computes the metrics batch-wise on the training data, while it computes the same metrics on ALL the data passed in the validation_data parameter of the fit method.
This means that the metric printed during fitting for the training data is the mean of that score calculated over all the batches. In other words, for the training set Keras evaluates each batch individually and then returns the mean of those values. For the validation data it is different: Keras takes all the validation samples and then compares them with the "concatenated" ground-truth labels.
To demonstrate this behavior with code, I propose a dummy example. I provide a custom callback that computes the accuracy score on ALL the data passed to it at the end of each epoch (for train and, optionally, validation). This is useful for understanding the behavior of TensorFlow during training.
import numpy as np
from sklearn.metrics import accuracy_score
import tensorflow as tf
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
from tensorflow.keras.callbacks import *

class ACC_custom(tf.keras.callbacks.Callback):
    def __init__(self, train, validation=None):
        super(ACC_custom, self).__init__()
        self.validation = validation
        self.train = train

    def on_epoch_end(self, epoch, logs={}):
        logs['ACC_score_train'] = float('-inf')
        X_train, y_train = self.train[0], self.train[1]
        y_pred = (self.model.predict(X_train).ravel() > 0.5) + 0
        score = accuracy_score(y_train.ravel(), y_pred)
        if self.validation:
            logs['ACC_score_val'] = float('-inf')
            X_valid, y_valid = self.validation[0], self.validation[1]
            y_val_pred = (self.model.predict(X_valid).ravel() > 0.5) + 0
            val_score = accuracy_score(y_valid.ravel(), y_val_pred)
            logs['ACC_score_train'] = np.round(score, 5)
            logs['ACC_score_val'] = np.round(val_score, 5)
        else:
            logs['ACC_score_train'] = np.round(score, 5)
Create dummy data:
x_train = np.random.uniform(0, 1, (1000, 60, 10))
y_train = np.random.randint(0, 2, (1000, 60, 1))
x_val = np.random.uniform(0, 1, (500, 60, 10))
y_val = np.random.randint(0, 2, (500, 60, 1))
Fit the model:
inp = Input(shape=(60, 10), dtype='float32')
x = Dense(32, activation='relu')(inp)
out = Dense(1, activation='sigmoid')(x)
model = Model(inp, out)
es = EarlyStopping(patience=10, verbose=1, min_delta=0.001,
                   monitor='ACC_score_val', mode='max', restore_best_weights=True)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=10, verbose=2,
                    callbacks=[ACC_custom(train=(x_train, y_train), validation=(x_val, y_val)), es],
                    validation_data=(x_val, y_val))
In the graphs below I compare the accuracies computed by our callback with the accuracies computed by Keras:
import matplotlib.pyplot as plt

plt.plot(history.history['ACC_score_train'], label='accuracy_callback_train')
plt.plot(history.history['accuracy'], label='accuracy_default_train')
plt.legend(); plt.title('train accuracy')
plt.show()

plt.plot(history.history['ACC_score_val'], label='accuracy_callback_valid')
plt.plot(history.history['val_accuracy'], label='accuracy_default_valid')
plt.legend(); plt.title('validation accuracy')
plt.show()
As we can see, the accuracy on the training data (first plot) differs between the default method and our callback; this means that the training accuracy is calculated batch-wise.
The validation accuracy (second plot) calculated by our callback and by the default method is the same; this means that the score on the validation data is computed in one shot.

Saving TensorFlow model (ensemble) to one pickle file, instead of multiple pickles, in Python

I am looking to deploy a 50 model ensemble for a regression problem, with each model being a Keras.Sequential Neural Network.
Below is a (simplified to 3 models) version of my code, that runs and works fine.
However, I don't want to create a pickle file for every individual model. Is there a way of creating a class with a list of all the models, so that I only have to save/load one pickle file?
from __future__ import absolute_import, division, print_function
import tensorflow as tf
from tensorflow import keras
import pandas as pd
import numpy as np

train = pd.read_csv("Training Data.csv").fillna(0)
X_train = train.drop(['ID_NUMBER', 'DATE', 'X', 'Y', 'Z'], axis=1)
Y_train = train[['X', 'Y', 'Z']]
EPOCHS = 1500
BATCH_SIZE = 256

# Defining the 3-layered neural network
def build_model():
    model = keras.Sequential([
        keras.layers.Dense(1000, activation=tf.nn.softplus,
                           input_shape=(X_train.shape[1],)),
        keras.layers.Dense(500, activation=tf.nn.softplus),
        keras.layers.Dense(3)
    ])
    model.compile(loss='mse', optimizer='adam', metrics=['mse'])
    return model

model0 = build_model()
# Store training stats
history0 = model0.fit(X_train, Y_train, epochs=EPOCHS, batch_size=BATCH_SIZE,
                      validation_split=0.0, verbose=1)
model1 = build_model()
# Store training stats
history1 = model1.fit(X_train, Y_train, epochs=EPOCHS, batch_size=BATCH_SIZE,
                      validation_split=0.0, verbose=1)
model2 = build_model()
# Store training stats
history2 = model2.fit(X_train, Y_train, epochs=EPOCHS, batch_size=BATCH_SIZE,
                      validation_split=0.0, verbose=1)
model0.save("model0.pkl")
model1.save("model1.pkl")
model2.save("model2.pkl")
For making new predictions, my code would look something like this:
# Loading models
model0 = tf.keras.models.load_model("model0.pkl")
model1 = tf.keras.models.load_model("model1.pkl")
model2 = tf.keras.models.load_model("model2.pkl")
# Finding weights (based on train score)
train_nn_predictions = model0.predict(X_train)
train['X'], train['Y'], train['Z'] = train_nn_predictions[:, 0], train_nn_predictions[:, 1], train_nn_predictions[:, 2]
nn0 = ...  # training score metric (how it is calculated is irrelevant here)
print("Average Train Score for Model 0 is:", nn0)
train_nn_predictions = model1.predict(X_train)
train['X'], train['Y'], train['Z'] = train_nn_predictions[:, 0], train_nn_predictions[:, 1], train_nn_predictions[:, 2]
nn1 = ...  # training score metric
print("Average Train Score for Model 1 is:", nn1)
train_nn_predictions = model2.predict(X_train)
train['X'], train['Y'], train['Z'] = train_nn_predictions[:, 0], train_nn_predictions[:, 1], train_nn_predictions[:, 2]
nn2 = ...  # training score metric
print("Average Train Score for Model 2 is:", nn2)
# Apply the weightings for each of the models
w0, w1, w2 = 1/nn0, 1/nn1, 1/nn2
# New predictions
new_record = np.array([my variables])
target_predictions = (w0*model0.predict(new_record) + w1*model1.predict(new_record) + w2*model2.predict(new_record)) / (w0 + w1 + w2)
You could try to:
Merge all the models using layers.Concatenate, which creates a single output covering all 50 models. According to the Keras documentation (https://keras.io/api/layers/merging_layers/concatenate/):
x = np.arange(20).reshape(2, 2, 5)
y = np.arange(20, 30).reshape(2, 1, 5)
tf.keras.layers.Concatenate(axis=1)([x, y])
Then use KerasPickleWrapper to pickle the merged model.
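A minimal sketch of that idea with the three trained models from the question: one shared input feeds every sub-model, their outputs are concatenated, and the merged model is saved as a single artifact (which KerasPickleWrapper could then wrap):

import tensorflow as tf
from tensorflow import keras

ensemble_input = keras.Input(shape=(X_train.shape[1],))
# Each sub-model emits 3 values; concatenating three of them gives shape (batch, 9)
merged = keras.layers.Concatenate(axis=1)(
    [m(ensemble_input) for m in (model0, model1, model2)])
ensemble = keras.Model(ensemble_input, merged)
ensemble.save("ensemble.h5")  # one file for the whole ensemble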

Keras LSTM - Validation Loss Increasing From Epoch #1

I'm currently undertaking my first 'real' DL project of (surprise) predicting stock movements. I know that I'm 1000:1 against making anything useful, but I'm enjoying it and want to see it through; I've learnt more in my few weeks of attempting this than I have in the prior 6 months of completing MOOCs.
I'm building an LSTM using Keras to predict the next step forward, and I have attempted the task both as classification (up/down/steady) and now as a regression problem. Both hit a similar roadblock: my validation loss never improves from epoch #1.
I can get the model to overfit such that the training loss approaches zero with MSE (or 100% accuracy if classification), but at no stage does the validation loss decrease. This screams overfitting to my untrained eye, so I added varying amounts of dropout, but all that does is stifle the model's learning/training accuracy and shows no improvement in validation accuracy.
I have attempted to change a significant number of hyperparameters (learning rate, optimiser, batch size, lookback window, number of layers, number of units, dropout, number of samples, etc.), and also tried with a subset of the data and a subset of the features, but I just can't get it to work, so I'm very thankful for any help.
Code Below (it's not pretty I know):
# Import saved full dataframe ~ 200 features
import feather
df = feather.read_dataframe('df_feathered')
df.set_index('time', inplace=True)

# Difference the dataset to make it stationary
df = df.diff(periods=1, axis=0)

# MAKE LARGE SAMPLE FOR TESTING
df_train = df.loc['2017-3-1':'2017-6-30']
df_val = df.loc['2017-7-1':'2017-8-31']
df_test = df.loc['2017-9-1':'2017-9-30']

# Make x_train, x_val sets by dropping target variable
x_train = df_train.drop('close+1', axis=1)
x_val = df_val.drop('close+1', axis=1)

# Fit the scaler on the training data, then apply the same transform to the validation set
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_val)
# scaler = MinMaxScaler(feature_range=(0,1))
# x_train = scaler.fit_transform(df_train1)
# x_test = scaler.transform(df_val1)

# Create y_train, y_test: simply the target variable for regression
y_train = df_train['close+1']
y_test = df_val['close+1']

# Define lookback window for LSTM input
sliding_window = 15

# Convert x_train, x_test, y_train, y_test into 3d arrays
# (samples, timesteps, features) for LSTM input
dataXtrain = []
for i in range(len(x_train) - sliding_window - 1):
    a = x_train[i:(i + sliding_window), 0:(x_train.shape[1])]
    dataXtrain.append(a)

dataXtest = []
for i in range(len(x_test) - sliding_window - 1):
    a = x_test[i:(i + sliding_window), 0:(x_test.shape[1])]
    dataXtest.append(a)

dataYtrain = []
for i in range(len(y_train) - sliding_window - 1):
    dataYtrain.append(y_train[i + sliding_window])

dataYtest = []
for i in range(len(y_test) - sliding_window - 1):
    dataYtest.append(y_test[i + sliding_window])
# Make data the divisible by a variety of batch_sizes for training
# Started at 1000 to not include replaced NaN values
dataXtrain = np.array(dataXtrain[1000:172008])
dataYtrain = np.array(dataYtrain[1000:172008])
dataXtest = np.array(dataXtest[1000:83944])
dataYtest = np.array(dataYtest[1000:83944])
# Checking input shapes
print('dataXtrain size is: {}'.format((dataXtrain).shape))
print('dataXtest size is: {}'.format((dataXtest).shape))
print('dataYtrain size is: {}'.format((dataYtrain).shape))
print('dataYtest size is: {}'.format((dataYtest).shape))
### ACTUAL LSTM MODEL
batch_size = 256
timesteps = dataXtrain.shape[1]
features = dataXtrain.shape[2]

# Model set-up: stacked 4-layer stateful LSTM
model = Sequential()
model.add(LSTM(512, return_sequences=True, stateful=True,
               batch_input_shape=(batch_size, timesteps, features)))
model.add(LSTM(256, stateful=True, return_sequences=True))
model.add(LSTM(256, stateful=True, return_sequences=True))
model.add(LSTM(128, stateful=True))
model.add(Dense(1, activation='linear'))
model.summary()

reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.9, patience=5, min_lr=0.000001, verbose=1)

def coeff_determination(y_true, y_pred):
    from keras import backend as K
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return 1 - SS_res / (SS_tot + K.epsilon())

model.compile(loss='mse',
              optimizer='nadam',
              metrics=[coeff_determination, 'mse', 'mae', 'mape'])
history = model.fit(dataXtrain, dataYtrain, validation_data=(dataXtest, dataYtest),
                    epochs=100, batch_size=batch_size, shuffle=False, verbose=1, callbacks=[reduce_lr])
score = model.evaluate(dataXtest, dataYtest, batch_size=batch_size, verbose=1)
print(score)
predictions = model.predict(dataXtest, batch_size=batch_size)
print(predictions)
import matplotlib.pyplot as plt
%matplotlib inline
#plt.plot(history.history['mean_squared_error'])
#plt.plot(history.history['val_mean_squared_error'])
plt.plot(history.history['coeff_determination'])
plt.plot(history.history['val_coeff_determination'])
#plt.plot(history.history['mean_absolute_error'])
#plt.plot(history.history['mean_absolute_percentage_error'])
#plt.plot(history.history['val_mean_absolute_percentage_error'])
#plt.title("MSE")
plt.ylabel("R2")
plt.xlabel("epoch")
plt.legend(["train", "val"], loc="best")
plt.show()
plt.plot(history.history["loss"][5:])
plt.plot(history.history["val_loss"][5:])
plt.title("model loss")
plt.ylabel("loss")
plt.xlabel("epoch")
plt.legend(["train", "val"], loc="best")
plt.show()
plt.figure(figsize=(20,8))
plt.plot(dataYtest)
plt.plot(predictions)
plt.title("Prediction")
plt.ylabel("Price")
plt.xlabel("Time")
plt.legend(["Truth", "Prediction"], loc="best")
plt.show()
Maybe you should remember that you are predicting stock returns, which are very likely not predictable at all. So an increasing val_loss is not overfitting at all. Instead of adding more dropout, maybe you should think about adding more layers to increase its capacity.
Try reducing the learning rate a lot (and remove the dropout for now), as sketched below.
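A minimal sketch of that suggestion, reusing the compile call from the question (the value 1e-5 is only an illustrative starting point):

from tensorflow.keras.optimizers import Nadam

# Explicitly small learning rate instead of the optimizer's default
model.compile(loss='mse', optimizer=Nadam(learning_rate=1e-5),
              metrics=[coeff_determination, 'mse', 'mae', 'mape'])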
Also: why do you use shuffle=False in the fit() function?
