Selecting the best performing model over multiple iterations - Python

I want to create a for loop that runs my model several times and keeps the best performing model from each run. I've noticed that each time I train the model it can perform well on one run and much worse on another, so I want to store each model in a list, or simply select the best one.
My current process is below, but I'm not sure it is the most adequate approach, and I'm also not sure how to select the best performing model across all these iterations. Here I run only 10 iterations, but I'd like to know whether there is a better way of doing this.
My Code Implementation
def build_model(input1, input2):
    """
    Creates a multi-channel ANN, capable of accepting multiple inputs.
    :param input1: first input array (weight)
    :param input2: second input array (word embeddings)
    :return: the compiled ANN model with a single output
    """
    input1 = np.expand_dims(input1, 1)
    # Define Inputs for ANN
    input1 = Input(shape=(input1.shape[1],), name="input1")
    input2 = Input(shape=(input2.shape[1],), name="input2")
    # First Branch of ANN (Weight)
    x = Dense(units=1, activation="relu")(input1)
    x = BatchNormalization()(x)
    # Second Branch of ANN (Word Embeddings)
    y = Dense(units=36, activation="relu")(input2)
    y = BatchNormalization()(y)
    # Merge the input branches into a single large vector
    combined = Concatenate()([x, y])
    # Apply Final Output Layer
    outputs = Dense(1, name="output")(combined)
    # Create an Interpretation Model (accepts the inputs from the previous branches and has a single output)
    model = Model(inputs=[input1, input2], outputs=outputs)
    # Compile the Model
    model.compile(loss='mse', optimizer=Adam(lr=0.01), metrics=['mse'])
    # Print the Model Summary
    model.summary()
    return model
test_outcomes = []  # list of model scores
r2_outcomes = []    # list of r2 scores
stored_models = []  # list of stored models

for i in range(10):
    model = build_model(x_train['input1'], x_train['input2'])
    print("Model Training")
    model.fit([x_train['input1'], x_train['input2']], y_train,
              batch_size=25, epochs=60, verbose=0,  # validation_split=0.2
              validation_data=([x_valid['input1'], x_valid['input2']], y_valid))
    # Determine Model Predictions
    print("Model Predictions")
    y_pred = model.predict([x_valid['input1'], x_valid['input2']])
    y_pred = y_pred.flatten()
    # Evaluate the Model
    print("Model Evaluations")
    score = model.evaluate([x_valid['input1'], x_valid['input2']], y_valid, verbose=1)
    test_loss = round(score[0], 3)
    print('Test loss:', test_loss)
    test_outcomes.append(test_loss)
    # Calculate R-squared
    r_squared = r2_score(y_valid, y_pred)
    print(r_squared)
    r2_outcomes.append(r_squared)
    # Store Final Model
    print("Model Stored")
    stored_models.append(model)  # list of stored models

mean_test = np.mean(test_outcomes)
r2_means = np.mean(r2_outcomes)
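For illustration, a minimal (untested) sketch of one way to select the best run from these lists afterwards, assuming a lower validation loss marks the better model:

# Pick the run with the lowest validation loss (use np.argmax(r2_outcomes) for R² instead)
best_idx = int(np.argmin(test_outcomes))
best_model = stored_models[best_idx]
print("Best run:", best_idx, "loss:", test_outcomes[best_idx], "R²:", r2_outcomes[best_idx])
# Optionally persist it so the remaining models can be discarded ("best_model.h5" is just an example name)
best_model.save("best_model.h5")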
You should use callbacks. You can stop training using a callback.
Here is an example of how you can create a custom callback that stops training once a certain accuracy threshold is reached:
# example
acc_threshold = 0.95

class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get('acc') > acc_threshold:
            print("\nReached %2.2f%% accuracy, so stopping training!!" % (acc_threshold * 100))
            self.model.stop_training = True

my_callback = myCallback()

model.fit([x_train['input1'], x_train['input2']], y_train,
          batch_size=25, epochs=60, verbose=0,  # validation_split=0.2
          validation_data=([x_valid['input1'], x_valid['input2']], y_valid),
          callbacks=[my_callback])
You can also use EarlyStopping to monitor metrics (for example, stopping when the loss isn't improving).
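As a rough sketch of that idea (the file name and monitored metric here are assumptions), EarlyStopping can be combined with ModelCheckpoint so that each run keeps only its best weights:

from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
# "best_run.h5" is an assumed filename; save_best_only keeps only the best epoch's weights
checkpoint = ModelCheckpoint('best_run.h5', monitor='val_loss',
                             save_best_only=True, mode='min', verbose=1)

model.fit([x_train['input1'], x_train['input2']], y_train,
          batch_size=25, epochs=60, verbose=0,
          validation_data=([x_valid['input1'], x_valid['input2']], y_valid),
          callbacks=[early_stop, checkpoint])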

Related

Train an LSTM model using multiple datasets in a for loop

I am training LSTM neural networks that are supposed to predict quintiles of stock price distributions. Since I would like to train the model on not just one stock but a sample of 500, I wrote the training loop below, which fits the model to each stock, saves the model parameters, and then loads the parameters again when training on the next stock. My question is whether I can write the code in a for loop like below, or whether I could instead just use one complete dataset including all 500 stocks, with the data concatenated along axis 0.
The idea is that the model iterates over each stock, the best model is saved by the checkpoint callback, and it is reloaded again for fitting the next stock.
This is the training loop I would like to use:
def compile_and_fit(model_type, model, checkpoint_path, config, stock_data, macro_data, factor_data, patience,
                    batch_size, num_epochs, train_set_ratio, val_set_ratio, Y_name):
    """
    model = NN model,
    data = stock data, factor data, macro data,
    batch_size = timesteps per batch
    alpha adam = learning rate optimizer
    data set ratios = train_set_ratio, val_set_ratio (e.g. 0.5)
    """
    early_stopping = tf.keras.callbacks.EarlyStopping(
        monitor='loss',
        patience=patience,
        mode='min')
    cp_callback = tf.keras.callbacks.ModelCheckpoint(
        checkpoint_path,
        monitor='loss',
        verbose=True,
        save_best_only=True,
        save_freq=batch_size,
        mode='min')
    permno_list = stock_data.permno.unique()
    test_data = pd.DataFrame()
    counter = 0
    for p in permno_list:
        # checkpoints
        if counter == 0:
            trained_model = model
            cp_callback = cp_callback
        else:
            trained_model = tf.keras.models.load_model(checkpoint_path)
            cp_callback = tf.keras.callbacks.ModelCheckpoint(checkpoint_path, monitor='loss', verbose=True,
                                                             save_best_only=True, save_freq=batch_size, mode='min')
        stock_data_length = len(stock_data.loc[stock_data.permno == p])
        train_data_stocks = stock_data.loc[stock_data.permno == p][0:int(stock_data_length * train_set_ratio)]
        val_data_stocks = stock_data.loc[stock_data.permno == p][int(stock_data_length * train_set_ratio):int(stock_data_length * (val_set_ratio + train_set_ratio))]
        test_data_stocks = stock_data.loc[stock_data.permno == p][int(stock_data_length * (val_set_ratio + train_set_ratio)):]
        test_data = pd.concat([test_data, test_data_stocks], axis=0)
        train_date_index = train_data_stocks.index.values.tolist()
        val_date_index = val_data_stocks.index.values.tolist()
        train_data_factors = factor_data.loc[factor_data.index.isin(train_date_index)]
        train_data_macro = macro_factors.loc[macro_factors.index.isin(train_date_index)]
        train_data_macro_norm = train_data_macro.copy(deep=True)
        for c in train_data_macro_norm.columns:
            train_data_macro_norm[c] = MinMaxScaler([-1, 1]).fit_transform(pd.DataFrame(train_data_macro_norm[c]))
        train_data_merged = pd.concat([train_data_factors, train_data_macro_norm], axis=1)
        val_data_factors = factor_data.loc[factor_data.index.isin(val_date_index)]
        val_data_macro = macro_factors.loc[macro_factors.index.isin(val_date_index)]
        val_data_macro_norm = val_data_macro.copy(deep=True)
        for c in val_data_macro_norm.columns:
            val_data_macro_norm[c] = MinMaxScaler([-1, 1]).fit_transform(pd.DataFrame(val_data_macro_norm[c]))
        val_data_merged = pd.concat([val_data_factors, val_data_macro_norm], axis=1)
        if model_type == 'combined':
            x_train_factors = []
            x_train_macro = []
            y_train = []
            for i in range(batch_size, len(train_data_factors)):
                x_train_factors.append(train_data_factors.values[i - batch_size:i, :])
                x_train_macro.append(train_data_macro_norm.values[i - batch_size:i, :])
                y_train.append(train_data_stocks[Y_name].values[i])
            x_train_factors, x_train_macro, y_train = np.array(x_train_factors), np.array(x_train_macro), np.array(y_train)
            x_val_factors = []
            x_val_macro = []
            y_val = []
            for i in range(batch_size, len(val_data_factors)):
                x_val_factors.append(val_data_factors.values[i - batch_size:i, :])
                x_val_macro.append(val_data_macro_norm.values[i - batch_size:i, :])
                y_val.append(val_data_stocks[Y_name].values[i])
            x_val_factors, x_val_macro, y_val = np.array(x_val_factors), np.array(x_val_macro), np.array(y_val)
            score = trained_model.evaluate([x_train_macro, x_train_factors], y_train, batch_size=batch_size)
            score = list(score)
            score.sort(reverse=True)
            score = score[-2]
            cp_callback.best = score
            trained_model.fit(x=[x_train_macro, x_train_factors], y=y_train, batch_size=batch_size, epochs=num_epochs,
                              validation_data=[[x_val_macro, x_val_factors], y_val],
                              callbacks=[early_stopping, cp_callback])
        if model_type == 'merged':
            x_train_merged = []
            y_train = []
            for i in range(batch_size, len(train_data_merged)):
                x_train_merged.append(train_data_merged.values[i - batch_size:i, :])
                y_train.append(train_data_stocks[Y_name].values[i])
            x_train_merged, y_train = np.array(x_train_merged), np.array(y_train)
            x_val_merged = []
            y_val = []
            for i in range(batch_size, len(val_data_merged)):
                x_val_merged.append(val_data_merged.values[i - batch_size:i, :])
                y_val.append(val_data_stocks[Y_name].values[i])
            x_val_merged, y_val = np.array(x_val_merged), np.array(y_val)
            score = trained_model.evaluate(x_train_merged, y_train, batch_size=batch_size)
            score = list(score)
            score.sort(reverse=True)
            score = score[-2]
            cp_callback.best = score
            trained_model.fit(x=x_train_merged, y=y_train, batch_size=batch_size, epochs=num_epochs,
                              validation_data=[x_val_merged, y_val], callbacks=[early_stopping, cp_callback])
    return trained_model, test_data
If someone has an idea whether this works or not, I would be incredibly grateful!
In my testing I could see the MSE constantly decreasing; however, when the loop continues to the next stock, the MSE starts at a very high value again.
According to this answer
How can I use multiple datasets with one model in Keras?
you can repeatedly fit the same model on more datasets.
If you want to save the model and load it at each iteration, that should also work, with the caveat that you lose the optimizer state (see Loading a trained Keras model and continue training).
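A minimal sketch of the first option (repeatedly calling fit on the same model object, so the optimizer state carries over between stocks); prepare_windows is just an assumed placeholder for the per-stock windowing done in the loop above:

for p in permno_list:
    # prepare_windows is a hypothetical helper standing in for the slicing/windowing code above
    x_train_p, y_train_p, x_val_p, y_val_p = prepare_windows(stock_data, p)
    model.fit(x_train_p, y_train_p,
              batch_size=batch_size, epochs=num_epochs,
              validation_data=(x_val_p, y_val_p),
              callbacks=[early_stopping, cp_callback])
# The single checkpoint file then holds the best weights seen across all stocks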

Word2Vec embedding to LSTM layers?

I am now working on a neural network that should predict the next activity and the outcome (both or just one, depending on the self.net_out parameter) of a trace (a sequence of events taken from an event log). The inputs of the net are windows (prefixes) of a trace of a specific size. Right now it looks like this:
def nn(self, params):
    # done in this function so that, in case, win_size can easily become a parameter
    X_train, Y_train, Z_train = self.build_windows(self.traces_train, self.win_size)
    if self.net_embedding == 0:
        if self.net_out != 2:
            Y_train = self.leA.fit_transform(Y_train)
            Y_train = to_categorical(Y_train)
            label = Y_train
        if self.net_out != 1:
            Z_train = self.leO.fit_transform(Z_train)
            Z_train = to_categorical(Z_train)
            label = Z_train
    unique_events = len(self.act_dictionary)
    input_act = Input(shape=self.win_size, dtype='int32', name='input_act')
    if self.net_embedding == 0:
        x_act = Embedding(output_dim=params["output_dim_embedding"], input_dim=unique_events + 1,
                          input_length=self.win_size)(input_act)
    else:
        print("WIP")
    n_layers = int(params["n_layers"]["n_layers"])
    l1 = LSTM(params["shared_lstm_size"], return_sequences=True, kernel_initializer='glorot_uniform',
              dropout=params['dropout'])(x_act)
    l1 = BatchNormalization()(l1)
    if self.net_out != 2:
        l_a = LSTM(params["lstmA_size_1"], return_sequences=(n_layers != 1), kernel_initializer='glorot_uniform',
                   dropout=params['dropout'])(l1)
        l_a = BatchNormalization()(l_a)
    elif self.net_out != 1:
        l_o = LSTM(params["lstmO_size_1"], return_sequences=(n_layers != 1), kernel_initializer='glorot_uniform',
                   dropout=params['dropout'])(l1)
        l_o = BatchNormalization()(l_o)
    for i in range(2, n_layers + 1):
        if self.net_out != 2:
            l_a = LSTM(params["n_layers"]["lstmA_size_%s_%s" % (i, n_layers)], return_sequences=(n_layers != i),
                       kernel_initializer='glorot_uniform', dropout=params['dropout'])(l_a)
            l_a = BatchNormalization()(l_a)
        if self.net_out != 1:
            l_o = LSTM(params["n_layers"]["lstmO_size_%s_%s" % (i, n_layers)], return_sequences=(n_layers != i),
                       kernel_initializer='glorot_uniform', dropout=params['dropout'])(l_o)
            l_o = BatchNormalization()(l_o)
    outputs = []
    if self.net_out != 2:
        output_l = Dense(self.outsize_act, activation='softmax', name='act_output')(l_a)
        outputs.append(output_l)
    if self.net_out != 1:
        output_o = Dense(self.outsize_out, activation='softmax', name='outcome_output')(l_o)
        outputs.append(output_o)
    model = Model(inputs=input_act, outputs=outputs)
    print(model.summary())
    opt = Adam(lr=params["learning_rate"])
    if self.net_out == 0:
        loss = {'act_output': 'categorical_crossentropy', 'outcome_output': 'categorical_crossentropy'}
        loss_weights = [params['gamma'], 1 - params['gamma']]
    if self.net_out == 1:
        loss = {'act_output': 'categorical_crossentropy'}
        loss_weights = [1, 1]
    if self.net_out == 2:
        loss = {'outcome_output': 'categorical_crossentropy'}
        loss_weights = [1, 1]
    model.compile(loss=loss, optimizer=opt, loss_weights=loss_weights, metrics=['accuracy'])
    early_stopping = EarlyStopping(monitor='val_loss', patience=20)
    lr_reducer = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10, verbose=0, mode='auto',
                                   min_delta=0.0001, cooldown=0, min_lr=0)
    if self.net_out == 0:
        history = model.fit(X_train, [Y_train, Z_train], epochs=3, batch_size=2**params['batch_size'], verbose=2,
                            callbacks=[early_stopping, lr_reducer], validation_split=0.2)
    else:
        history = model.fit(X_train, label, epochs=300, batch_size=2**params['batch_size'], verbose=2,
                            callbacks=[early_stopping, lr_reducer], validation_split=0.2)
    scores = [history.history['val_loss'][epoch] for epoch in range(len(history.history['loss']))]
    score = min(scores)
    # global best_score, best_model
    if self.best_score > score:
        self.best_score = score
        self.best_model = model
    return {'loss': score, 'status': STATUS_OK}
As can be seen, I need to consider two types of embeddings. For the one I have already implemented and tested (self.net_embedding=0), each activity/event in each trace (and consequently each window) is mapped to an integer; I then apply fit_transform and to_categorical.
The second type of embedding I have to try uses word2vec. To do so, I have already changed the format of the input, keeping each activity as a string (the actual name of the activity, standardized to just numbers and letters) instead of converting it to an integer. I don't know how to proceed though: I guess I should do something like
w2vModel= Word2Vec(X_train, size=params['word2vec_size'], min_count=1)
to get the embedded windows via w2vModel.wv, but how do I then pass these to the LSTM layers? What should I change the embedding layer after the input into (where I put print("WIP") for now)?
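A minimal sketch of one common pattern (not necessarily what the original code intends, and assuming the gensim 3 API used in the snippet above): train Word2Vec on the activity-name windows, build a weight matrix from w2vModel.wv, and use it to initialise a frozen Keras Embedding layer in place of the print("WIP") branch. The vocabulary/index mapping and X_train_str are assumptions:

from gensim.models import Word2Vec
import numpy as np

# X_train_str: windows as lists of activity-name strings (assumed format)
w2vModel = Word2Vec(X_train_str, size=params['word2vec_size'], min_count=1)

# Build an index and a weight matrix from the learned vectors (index 0 reserved for padding)
vocab = {word: i + 1 for i, word in enumerate(w2vModel.wv.index2word)}
emb_matrix = np.zeros((len(vocab) + 1, params['word2vec_size']))
for word, idx in vocab.items():
    emb_matrix[idx] = w2vModel.wv[word]

# Encode each window as integer indices via `vocab`, then the embedding layer becomes:
x_act = Embedding(input_dim=emb_matrix.shape[0], output_dim=params['word2vec_size'],
                  weights=[emb_matrix], input_length=self.win_size, trainable=False)(input_act)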

ValueError: No gradients provided for any variable when using model.fit

I'm trying to use the features extracted from two pre-trained models (ResNet and MobileNet) as inputs to train a functional model using Keras. I need to classify images as category 1, 2 or 3 using a softmax layer.
My model.fit function is giving me the following error:
ValueError: No gradients provided for any variable: ['dense_66/kernel:0', 'dense_66/bias:0',
'dense_64/kernel:0', 'dense_64/bias:0', 'dense_67/kernel:0', 'dense_67/bias:0',
'dense_65/kernel:0', 'dense_65/bias:0', 'dense_68/kernel:0', 'dense_68/bias:0',
'dense_69/kernel:0', 'dense_69/bias:0', 'dense_70/kernel:0', 'dense_70/bias:0'].
Here's the relevant part of the code:
Creating the dataset
def datasetgenerator(url, BATCH_SIZE, IMG_SIZE):
    data = image_dataset_from_directory(url,
                                        shuffle=True,
                                        batch_size=BATCH_SIZE,
                                        image_size=IMG_SIZE,
                                        label_mode='int')
    return data

BATCH_SIZE = 20
IMG_SIZE = (160, 160)
train_dir = 'wound_dataset2/train'
train_dataset = datasetgenerator(url=train_dir, BATCH_SIZE=BATCH_SIZE, IMG_SIZE=IMG_SIZE)
val_dir = 'wound_dataset2/val'
validation_dataset = datasetgenerator(url=val_dir, BATCH_SIZE=BATCH_SIZE, IMG_SIZE=IMG_SIZE)
test_dir = 'wound_dataset2/test'
test_dataset = datasetgenerator(url=test_dir, BATCH_SIZE=BATCH_SIZE, IMG_SIZE=IMG_SIZE)
print(train_dataset)
Feature extraction
mobilenet_features = np.empty([20, 1280])
resnet_features = np.empty([20, 2048])

for data in train_dataset:
    image_batch, label_batch = data
    image_batch = data_augmentation(image_batch)
    preprocess_input_image_resnet = preprocess_input_resnet(image_batch)
    preprocess_input_image_mobilenet = preprocess_input_mobilenet(image_batch)
    feature_batch_resnet = base_model_resnet(preprocess_input_image_resnet)
    feature_batch_average_resnet = global_average_layer(feature_batch_resnet)
    feature_batch_mobilenet = base_model_mobilenet(preprocess_input_image_mobilenet)
    feature_batch_average_mobilenet = global_average_layer(feature_batch_mobilenet)
    mobilenet_features = np.concatenate((mobilenet_features, np.array(feature_batch_average_mobilenet)))
    resnet_features = np.concatenate((resnet_features, np.array(feature_batch_average_resnet)))
Model Generation
from tensorflow.keras.layers import concatenate

# define two sets of inputs
inputA = tf.keras.Input(shape=(1280,))
inputB = tf.keras.Input(shape=(2048,))
# the first branch operates on the first input
x = tf.keras.layers.Dense(8, activation="relu")(inputA)
x = tf.keras.layers.Dense(4, activation="relu")(x)
x = tf.keras.Model(inputs=inputA, outputs=x)
# the second branch operates on the second input
y = tf.keras.layers.Dense(64, activation="relu")(inputB)
y = tf.keras.layers.Dense(32, activation="relu")(y)
y = tf.keras.layers.Dense(4, activation="relu")(y)
y = tf.keras.Model(inputs=inputB, outputs=y)
# combine the output of the two branches
combined = concatenate([x.output, y.output])
fc_layers = [1024, 1024]
dropout = 0.5
# apply FC layers and then a softmax prediction on the combined outputs
z = Flatten()(combined)
for fc in fc_layers:
    # New FC layer, random init
    z = Dense(fc, activation='relu')(z)
    z = Dropout(dropout)(z)
# New softmax layer
predictions = Dense(3, activation='softmax')(z)
# our model will accept the inputs of the two branches and then output a single value
model = tf.keras.Model(inputs=[x.input, y.input], outputs=z)
Training
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit((mobilenet_features, resnet_features), batch_size=20, epochs=10)
I'm trying this as a method to improve accuracy over what I got using transfer learning. Any help would be appreciated.
The model's output should be the predictions layer rather than z, and model.fit also needs a target tensor alongside the inputs:
z = Flatten()(combined)
z = Dense(fc, activation='relu')(z)
z = Dropout(dropout)(z)
z = Dense(fc, activation='relu')(z)
z = Dropout(dropout)(z)
predictions = Dense(3, activation='softmax')(z)
# use the prediction layer as the output layer
model = tf.keras.Model(inputs=[x.input, y.input], outputs=predictions)
# add a target tensor to the fit method
history = model.fit((mobilenet_features, resnet_features), youTarget, batch_size=20, epochs=10)

How to set up an LSTM network for multi-step prediction?

I am learning how to set up an RNN-LSTM network for prediction. I have created a dataset with one input variable.
x y
1 2.5
2 6
3 8.6
4 11.2
5 13.8
6 16.4
...
With the following Python code, I have created the window data, like [x(t-2), x(t-1), x(t)] to predict [y(t)]:
df = pd.read_excel('dataset.xlsx')

# split a univariate dataset into train/test sets
def split_dataset(data):
    train, test = data[:-328], data[-328:-6]
    return train, test

train, test = split_dataset(df.values)

# scale train and test data
def scale(train, test):
    # fit scaler
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaler = scaler.fit(train)
    # transform train
    #train = train.reshape(train.shape[0], train.shape[1])
    train_scaled = scaler.transform(train)
    # transform test
    #test = test.reshape(test.shape[0], test.shape[1])
    test_scaled = scaler.transform(test)
    return scaler, train_scaled, test_scaled

scaler, train_scaled, test_scaled = scale(train, test)

def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            x_input = data[in_start:in_end, 0]
            x_input = x_input.reshape((len(x_input), 1))
            X.append(x_input)
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return np.array(X), np.array(y)

train_x, train_y = to_supervised(train_scaled, n_input=3, n_out=1)
test_x, test_y = to_supervised(test_scaled, n_input=3, n_out=1)

verbose, epochs, batch_size = 0, 20, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]

model = Sequential()
model.add(LSTM(200, return_sequences=False, input_shape=(train_x.shape[1], train_x.shape[2])))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
history = model.fit(train_x, train_y, epochs=epochs, verbose=verbose, validation_data=(test_x, test_y))
However, I have other questions about this:
Q1: What is the meaning of units in LSTM? [model.add(LSTM(units, ...))]
(I have tried different numbers of units for the model; it becomes more accurate as the number of units increases.)
Q2: How many layers should I set?
Q3: How can I predict multiple steps? E.g. based on (x(t), x(t-1)), predict y(t) and y(t+1). I have tried setting n_out = 2 in the to_supervised function, but when I applied the same method it returned an error:
train_x, train_y = to_supervised(train_scaled, n_input=3, n_out=2)
test_x, test_y = to_supervised(test_scaled, n_input=3, n_out=2)

verbose, epochs, batch_size = 0, 20, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]

model = Sequential()
model.add(LSTM(200, return_sequences=False, input_shape=(train_x.shape[1], train_x.shape[2])))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
history = model.fit(train_x, train_y, epochs=epochs, verbose=verbose, validation_data=(test_x, test_y))
ValueError: Error when checking target: expected dense_27 to have shape (1,) but got array with shape (2,)
Q3(cont): What should I add or change in the model setting?
Q3(cont): What is return_sequences? When should I set it to True?
Q1. Units in LSTM is the number of neurons in your LSTM layer.
Q2. That depends on your model / data. Try changing them around to see the effect.
Q3. That depends on which approach you take.
Q4. Ideally you'll want to predict a single time step every time.
It is possible to predict several at a time, but in my experience you will get better results with the recursive approach described below:
e.g
use y(t-1), y(t) to predict y_hat(t+1)
THEN
use y(t), y_hat(t+1) to predict y_hat(t+2)
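A rough sketch of that recursive idea, assuming a model trained on windows of n_input past values that predicts one step at a time:

import numpy as np

# Predict n_steps ahead by feeding each prediction back in as the newest input
def predict_recursively(model, last_window, n_steps):
    window = last_window.copy()                        # shape: (n_input, 1)
    preds = []
    for _ in range(n_steps):
        y_hat = model.predict(window[np.newaxis, ...], verbose=0)[0, 0]
        preds.append(y_hat)
        window = np.vstack([window[1:], [[y_hat]]])    # drop oldest value, append prediction
    return preds

# e.g. two (scaled) steps ahead from the last training window
print(predict_recursively(model, train_x[-1], n_steps=2))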
Are you sure you're actually using X to predict Y in this case?
What do train x/y and test x/y look like?
Re Q1: It is the number of LSTM cells (=LSTM units), which consist of several neurons themselves but have (in the standard case as given) only one output each. Thus, the number of units corresponds directly to the dimensionality of your output.
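As a small illustration of both points (units and return_sequences), the output shapes of an LSTM layer can be inspected directly; the 200 units and 3 timesteps here are just the values used above:

import tensorflow as tf

x = tf.zeros((1, 3, 1))                                            # (batch, timesteps, features)
print(tf.keras.layers.LSTM(200)(x).shape)                          # (1, 200): last step only, one 200-dim output
print(tf.keras.layers.LSTM(200, return_sequences=True)(x).shape)   # (1, 3, 200): one 200-dim output per timestep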

Neural network has <0.001 validation and testing loss but 0% accuracy when doing a prediction

I've been training an MLP to predict the time remaining in an assembly sequence. The training loss, validation loss and MSE are all less than 0.001; however, when I try to do a prediction with one of the datasets I trained the network on, it can't correctly identify any of the outputs from the set of inputs. What am I doing wrong that produces this error?
I am also struggling to understand how, when the model is deployed, I should scale the result for a single prediction. scaler.inverse_transform won't work, because the data the scaler was fitted on during training has been lost: the prediction is done in a separate script from the training, using the model that training produced. Is this information saved in the model builder?
I have tried changing the batch size during training, rounding the time column of the dataset to the nearest second (previously it was 0.1 seconds), and training over 50, 100 and 200 epochs, and I always end up with no correct predictions. I am also training an LSTM to see which is more accurate, but it has the same issue. The dataset is split 70-30 training-testing, and the training set is then split 75-25 into training and validation.
Data scaling and model training code:
def scale_data(training_data, training_data_labels, testing_data, testing_data_labels):
    # Create X and Y scalers between 0 and 1
    x_scaler = MinMaxScaler(feature_range=(0, 1))
    y_scaler = MinMaxScaler(feature_range=(0, 1))
    # Scale training data
    x_scaled_training = x_scaler.fit_transform(training_data)
    y_scaled_training = y_scaler.fit_transform(training_data_labels)
    # Scale testing data
    x_scaled_testing = x_scaler.transform(testing_data)
    y_scaled_testing = y_scaler.transform(testing_data_labels)
    return x_scaled_training, y_scaled_training, x_scaled_testing, y_scaled_testing

def train_model(training_data, training_labels, testing_data, testing_labels, number_of_epochs, number_of_columns):
    model_hidden_neuron_number_list = []
    model_repeat_list = []
    model_error_rate_list = []
    for hidden_layer_1_units in range(int(np.floor(number_of_columns / 2)), int(np.ceil(number_of_columns * 2))):
        print("Training starting, number of hidden units = %d" % hidden_layer_1_units)
        for repeat in range(1, 6):
            print("Repeat %d" % repeat)
            model = k.Sequential()
            model.add(Dense(hidden_layer_1_units, input_dim=number_of_columns,
                            activation='relu', name='hidden_layer_1'))
            model.add(Dense(1, activation='linear', name='output_layer'))
            model.compile(loss='mean_squared_error', optimizer='adam')
            # Train Model
            model.fit(
                training_data,
                training_labels,
                epochs=number_of_epochs,
                shuffle=True,
                verbose=2,
                callbacks=[logger],
                batch_size=1024,
                validation_split=0.25
            )
            # Test Model
            test_error_rate = model.evaluate(testing_data, testing_labels, verbose=0)
            print("Error on testing data is %.3f" % test_error_rate)
            model_hidden_neuron_number_list.append(hidden_layer_1_units)
            model_repeat_list.append(repeat)
            model_error_rate_list.append(test_error_rate)
            # Save Model
            model_builder = tf.saved_model.builder.SavedModelBuilder(
                "MLP/models/{hidden_layer_1_units}/{repeat}".format(hidden_layer_1_units=hidden_layer_1_units,
                                                                    repeat=repeat))
            inputs = {
                'input': tf.saved_model.build_tensor_info(model.input)
            }
            outputs = {
                'time_remaining': tf.saved_model.utils.build_tensor_info(model.output)
            }
            signature_def = tf.saved_model.signature_def_utils.build_signature_def(
                inputs=inputs,
                outputs=outputs,
                method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME
            )
            model_builder.add_meta_graph_and_variables(
                K.get_session(),
                tags=[tf.saved_model.tag_constants.SERVING],
                signature_def_map={
                    tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
                }
            )
            model_builder.save()
And then to do a prediction:
file_name = top_level_file_path + "./MLP/models/19/1/"
testing_dataset = pd.read_csv(file_path + os.listdir(file_path)[0])
number_of_rows = len(testing_dataset.index)
number_of_columns = len(testing_dataset.columns)
newcol = [number_of_rows]
max_time = testing_dataset['Time'].max()

for j in range(0, number_of_rows - 1):
    newcol.append(max_time - testing_dataset.iloc[j].iloc[number_of_columns - 1])

x_scaler = MinMaxScaler(feature_range=(0, 1))
y_scaler = MinMaxScaler(feature_range=(0, 1))

# Scale training data
data_scaled = x_scaler.fit_transform(testing_dataset)
labels = pd.read_csv("Labels.csv")
labels_scaled = y_scaler.fit_transform(labels)

signature_key = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
input_key = 'input'
output_key = 'time_remaining'

with tf.Session(graph=tf.Graph()) as sess:
    saved_model = tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], file_name)
    signature = saved_model.signature_def
    x_tensor_name = signature[signature_key].inputs[input_key].name
    y_tensor_name = signature[signature_key].outputs[output_key].name
    x = sess.graph.get_tensor_by_name(x_tensor_name)
    y = sess.graph.get_tensor_by_name(y_tensor_name)
    #np.expand_dims(data_scaled[600], axis=0)
    predictions = sess.run(y, {x: data_scaled})
    predictions = y_scaler.inverse_transform(predictions)
    #print(np.round(predictions, 2))

correct_result = 0
for i in range(0, number_of_rows):
    correct_result = 0
    print(np.round(predictions[i]), " ", np.round(newcol[i]))
    if np.round(predictions[i]) == np.round(newcol[i]):
        correct_result += 1
print((correct_result / number_of_rows) * 100)
The output for the first row should be 96.0 but it produces 110.0; the last should be 0.1 but is -40.0, even though no negative values appear in the dataset.
You can't compute accuracy when you do regression. Compute the mean squared error on the test set as well.
Second, when it comes to the scalers, you always call scaler.fit_transform on the training data, so the scaler computes its parameters (in this case the min and max, if you use a min-max scaler) on the training data. Then, when performing inference on the test set, you should only call scaler.transform before feeding the data to the model.
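A minimal sketch of how the fitted scalers could be carried over to the separate prediction script (using joblib for persistence; the file names are assumptions):

import joblib

# In the training script, after fitting:
joblib.dump(x_scaler, "x_scaler.pkl")
joblib.dump(y_scaler, "y_scaler.pkl")

# In the prediction script: reload and only transform / inverse_transform, never fit again
x_scaler = joblib.load("x_scaler.pkl")
y_scaler = joblib.load("y_scaler.pkl")
data_scaled = x_scaler.transform(testing_dataset)
predictions = y_scaler.inverse_transform(predictions)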
