During evaluation, I want to store the unique ids of the wrongly predicted samples so I can do some more processing on them.
It is a multiclass prediction problem.
Here is the code during the evaluation:
outputs = model(imgs)
loss = criterion(outputs, targets) # Prediction error
val_loss += loss.item()
predicted = torch.argmax(outputs, dim=1)
t_predicted += predicted.cpu().tolist()
total += targets.size(0)
good_answers = (predicted == targets)
correct += good_answers.sum().item()
Knowing that ids is a list of the ids of the images, when I try to get the ids that are wrong:
wrong_ids += ids[~(good_answers.to('cpu'))]
I get this error:
add(): argument 'other' (position 1) must be Tensor
The question originally contained a tensorflow tag, so I was preparing an answer. After completing my write-up, I found that the tag had been removed. However, I believe my answer can still give insight into this general question, whether they're using TF or PyTorch.
Data
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
# train set / data
x_train = x_train.astype('float32') / 255
# validation set / data
x_test = x_test.astype('float32') / 255
# train set / target
y_train = tf.keras.utils.to_categorical(y_train, num_classes=10)
# validation set / target
y_test = tf.keras.utils.to_categorical(y_test, num_classes=10)
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)
Train
import tensorflow as tf
# declare input shape
input = tf.keras.Input(shape=(32,32,3))
# Block 1
x = tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu")(input)
x = tf.keras.layers.MaxPooling2D(3)(x)
# Now apply global max pooling.
gap = tf.keras.layers.GlobalMaxPooling2D()(x)
# Finally, we add a classification layer.
output = tf.keras.layers.Dense(10, activation='softmax')(gap)
# bind all
func_model = tf.keras.Model(input, output)
print('\nFunctional API')
func_model.compile(
metrics=['accuracy'],
loss = 'categorical_crossentropy',
optimizer = tf.keras.optimizers.Adam()
)
func_model.fit(x_train, y_train)
Error Prediction
import numpy as np

# Predict the values from the validation dataset
y_pred = func_model.predict(x_test)
# Convert prediction probabilities to class labels
y_pred_classes = np.argmax(y_pred, axis = 1)
y_test = np.argmax(y_test, axis=1)
# Errors are difference between predicted labels and true labels
errors = (y_pred_classes - y_test != 0)
y_pred_classes_errors = y_pred_classes[errors]
y_pred_errors = y_pred[errors]
y_true_errors = y_test[errors]
x_test_errors = x_test[errors]
# Probabilities of the wrong predicted numbers
y_pred_errors_prob = np.max(y_pred_errors, axis = 1)
# Predicted probabilities of the true values in the error set
true_prob_errors = np.diagonal(np.take(y_pred_errors, y_true_errors, axis=1))
# Difference between the probability of the predicted label and the true label
delta_pred_true_errors = y_pred_errors_prob - true_prob_errors
# Sorted list of the delta prob errors
sorted_delta_errors = np.argsort(delta_pred_true_errors)
# Top 6 errors
most_important_errors = sorted_delta_errors[-6:]
Display
import matplotlib.pyplot as plt
def display_errors(errors_index, img_errors, pred_errors, obs_errors):
    """ This function shows 6 images with their predicted and real labels"""
    n = 0
    nrows = 2
    ncols = 3
    fig, ax = plt.subplots(nrows, ncols, sharex=True, sharey=True)
    for row in range(nrows):
        for col in range(ncols):
            error = errors_index[n]
            ax[row, col].imshow((img_errors[error]).reshape((32, 32, 3)))
            ax[row, col].set_title("Predicted label :{}\nTrue label :{}".format(pred_errors[error], obs_errors[error]))
            n += 1
# Show the top 6 errors
display_errors(most_important_errors, x_test_errors, y_pred_classes_errors, y_true_errors)
What worked for me is to declare empty lists and then fill them with the predictions:
# TO GET ONLY THE WRONGLY LABELED ITEMS
wrong_ids.append(ids[~(predicted == targets).to('cpu')])
# TO ALSO STORE ALL TESTED LABELS PREDICTED AND TARGETS IN SEPARATE LIST
tested_labels.append(ids.to('cpu'))
pred_labels.append(predicted.to('cpu'))
true_labels.append(targets.to('cpu'))
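Note that boolean-mask indexing like ids[~mask] only works when ids is itself a tensor; if it is a plain Python list, convert it first:

ids = torch.as_tensor(ids)  # a boolean mask can index a tensor, not a list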
Then convert the resulting list of tensors into one flat list:
wrongs = []
for i, j in enumerate(wrong_ids):
    for k, l in enumerate(j):
        wrongs.append(l.item())
# and so on for the other lists of tensors
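Alternatively (a sketch, assuming wrong_ids is a list of 1-D CPU tensors as built above), torch.cat flattens the whole list in one call:

wrongs = torch.cat(wrong_ids).tolist()  # one flat Python list of wrong ids
# likewise: torch.cat(pred_labels).tolist(), torch.cat(true_labels).tolist()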
To show all tested ids with the predictions and the true labels (flatten tested_labels, pred_labels and true_labels the same way first):
df = pd.DataFrame({'id': tested_labels,
                   'predicted': pred_labels,
                   'true_label': true_labels})
print(df)
Related
I've been trying to predict Google stock prices between certain dates, but when I use the trained network to predict future values I get an output similar to the target but on a different scale (screenshots are below).
I coded an LSTM neural network using PyTorch. The Google stock prices were obtained from the yfinance library (Python).
The neural network is:
LSTM:
input size = 1
hidden size = 200
number of layers = 1
The output of the LSTM is passed to a fully connected layer:
input size = 200
output size = 1
Before training the network I use MinMaxScaler.fit_transform() to scale the training and testing data. Then I use the network to predict future values, and the output is converted back to the original scale using MinMaxScaler.inverse_transform().
Nonetheless, the predicted output has a different scale than the target output, although they are similar in shape.
Screenshot: plot of y_target (blue) and y_pred (orange)
If I zoom to y_pred I can see the next plot
Screenshot: plot of y_pred
What is happening? Why are the predicted values similar to the target values but on a reduced scale? What am I doing wrong?
Input data code:
# get close prices from dataset
df_close = pd.DataFrame(df['Close'])
df_close_values = df_close.values
# normalize data using MinMaxScaler
mmscaler = MinMaxScaler(feature_range=(0,1))
df_close_scaled = mmscaler.fit_transform(df_close_values)
# Sequence Length
sequence_length = 25
# divide data into train, validation and test data
len_data = df_close_values.shape[0]
len_train_data = int(len_data * 0.8)
len_val_data = int((len_data - len_train_data)/2)
len_test_data = len_data - len_train_data - len_val_data
train_data = df_close_scaled[0:len_train_data]
val_data = df_close_scaled[len_train_data-sequence_length:len_train_data+len_val_data]
test_data = df_close_scaled[len_train_data+len_val_data-sequence_length:]
# Function to divide data into x and y
def partition_dataset(sequence_length, train_df):
    x, y = [], []
    data_len = train_df.shape[0]
    for i in range(sequence_length, data_len):
        x.append(train_df[i-sequence_length:i, :])
        y.append(train_df[i, 0])
    # Convert the x and y to numpy arrays
    x = np.array(x, dtype=np.float32)
    y = np.array(y, dtype=np.float32)
    return x, y
x_train, y_train = partition_dataset(sequence_length, train_data)
x_val, y_val = partition_dataset(sequence_length, val_data)
x_test, y_test = partition_dataset(sequence_length, test_data)
shapes result:
x_train.shape , y_train.shape = (2554, 25, 1) (2554,)
x_val.shape , y_val.shape = (322, 25, 1) (322,)
x_test.shape , y_test.shape = (323, 25, 1) (323,)
LSTM code:
class LSTMPredictor(nn.Module):
    def __init__(self, input_size=50, hidden_size=1, num_layers=1, output_size=1, bidirectional=1, dropout=1.0, device='cuda'):
        super().__init__()
        # Attributes
        self.device = device
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.D = True if bidirectional == 2 else False
        self.output_size = output_size
        self.dropout = dropout
        # define LSTM layer
        self.lstm = nn.LSTM(input_size=input_size,
                            hidden_size=self.hidden_size,
                            num_layers=self.num_layers,
                            bidirectional=self.D,
                            batch_first=True,
                            dropout=dropout)
        # define fully connected (MLP) output layer
        self.fully_connected = nn.Linear(self.hidden_size,
                                         self.output_size)
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x, hidden=None):
        # Propagate input through LSTM
        output, (h, _) = self.lstm(x)
        out = self.fully_connected(output[:, -1])
        return out
UPDATE
I used MinMaxScaler separately for each set (train, validation and test); the scale did indeed change, but the result is similar:
Plot y_target (blue) and y_pred (orange)
Plot y_pred only
The problem with your code is that
df_close_scaled = mmscaler.fit_transform(df_close_values)
is applied to the entire dataset first, and only then do you split the data into train and test sets. This is wrong, as it leaks information about the test data into the training data.
First split the data, then fit the MinMaxScaler on the training data only. Keep the fitted object in a variable, and at test time use that same scaler to transform the inputs to the model and then to convert the predictions back.
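A minimal sketch of that order of operations, reusing the variable names from the question (and ignoring the validation split for brevity):

from sklearn.preprocessing import MinMaxScaler

# split first, on the raw (unscaled) values
train_values = df_close_values[:len_train_data]
test_values = df_close_values[len_train_data:]

# fit the scaler on the training slice only, and keep the fitted object around
mmscaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = mmscaler.fit_transform(train_values)
test_scaled = mmscaler.transform(test_values)  # transform only, no fit

# ... train, predict, then invert with the SAME fitted scaler:
# y_pred = mmscaler.inverse_transform(y_pred_scaled)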
I'm not sure this alone accounts for such a drastic change in scale, but it is definitely a problem as it stands.
I'll inspect the code more and update if I notice anything else.
As stated in the title, I'm trying to get matplotlib to use the ticker I designate earlier in the parameters as the title, instead of manually changing the plt.title("TSLA") command every time. I've tried a few different things like plt.title("ticker()"), but it says a str object can't be called.
Any ideas would be greatly appreciated! The plot commands are near the bottom.
Here's my code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import time
import os
import random
from collections import deque
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, Bidirectional
from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import yfinance as yf
### QQQ 10 day price predictor
# Reproducibility
np.random.seed(42)
tf.random.set_seed(42)
random.seed(42)
#units = neurons
def load_data(ticker, period, interval, n_steps=200, scale=True, shuffle=True, lookup_step=10, test_size=.2,
              feature_columns=['Close', 'Volume', 'Open', 'High', 'Low']):
    '''
    :param ticker: Ticker you want to load, dtype: str
    :param period: Time period you want data from, dtype: str (options in program)
    :param interval: Interval for data, dtype: str
    :param n_steps: Past sequence length used to predict, default = 200, dtype: int
    :param scale: Whether to scale data b/w 0 and 1, default = True, dtype: bool
    :param shuffle: Whether to shuffle data, default = True, dtype: bool
    :param lookup_step: Future lookup step to predict, default = 10, dtype: int
    :param test_size: Ratio for test data, default is .2 (20% test data), dtype: float
    :param feature_columns: List of features fed into the model, default is OHLCV, dtype: list
    :return:
    '''
    df = yf.download(tickers=ticker, period=period, interval=interval,
                     group_by='ticker',
                     # adjust all OHLC automatically
                     auto_adjust=True, prepost=True, threads=True, proxy=None)
    result = {}
    result['df'] = df.copy()
    for col in feature_columns:
        assert col in df.columns, f"'{col}' does not exist in the dataframe."
    if scale:
        column_scaler = {}
        # scale the data (prices) from 0 to 1
        for column in feature_columns:
            scaler = preprocessing.MinMaxScaler()
            df[column] = scaler.fit_transform(np.expand_dims(df[column].values, axis=1))
            column_scaler[column] = scaler
        # add the MinMaxScaler instances to the result returned
        result["column_scaler"] = column_scaler
    # add the target column (label) by shifting by `lookup_step`
    df['future'] = df['Close'].shift(-lookup_step)
    # last `lookup_step` rows contain NaN in the future column
    # get them before dropping NaNs
    last_sequence = np.array(df[feature_columns].tail(lookup_step))
    # drop NaNs
    df.dropna(inplace=True)
    sequence_data = []
    sequences = deque(maxlen=n_steps)
    for entry, target in zip(df[feature_columns].values, df['future'].values):
        sequences.append(entry)
        if len(sequences) == n_steps:
            sequence_data.append([np.array(sequences), target])
    # get the last sequence by appending the last `n_step` sequence with `lookup_step` sequence
    # for instance, if n_steps=50 and lookup_step=10, last_sequence should be of 60 (that is 50+10) length
    # this last_sequence will be used to predict future stock prices not available in the dataset
    last_sequence = list(sequences) + list(last_sequence)
    last_sequence = np.array(last_sequence)
    # add to result
    result['last_sequence'] = last_sequence
    # construct the X's and y's
    X, y = [], []
    for seq, target in sequence_data:
        X.append(seq)
        y.append(target)
    # convert to numpy arrays
    X = np.array(X)
    y = np.array(y)
    # reshape X to fit the neural network
    X = X.reshape((X.shape[0], X.shape[2], X.shape[1]))
    # split the dataset
    result["X_train"], result["X_test"], result["y_train"], result["y_test"] = train_test_split(X, y, test_size=test_size, shuffle=shuffle)
    # return the result
    return result
def create_model(sequence_length, units=256, cell=LSTM, n_layers=3, dropout=0.3,
                 loss="mean_absolute_error", optimizer="adam", bidirectional=False):
    model = Sequential()
    for i in range(n_layers):
        if i == 0:
            # first layer
            if bidirectional:
                model.add(Bidirectional(cell(units, return_sequences=True), input_shape=(None, sequence_length)))
            else:
                model.add(cell(units, return_sequences=True, input_shape=(None, sequence_length)))
        elif i == n_layers - 1:
            # last layer
            if bidirectional:
                model.add(Bidirectional(cell(units, return_sequences=False)))
            else:
                model.add(cell(units, return_sequences=False))
        else:
            # hidden layers
            if bidirectional:
                model.add(Bidirectional(cell(units, return_sequences=True)))
            else:
                model.add(cell(units, return_sequences=True))
        # add dropout after each layer
        model.add(Dropout(dropout))
    model.add(Dense(1, activation="linear"))
    model.compile(loss=loss, metrics=["mean_absolute_error"], optimizer=optimizer)
    return model
N_STEPS = 40
# valid periods: 1d,5d,1mo,3mo,6mo,1y,2y,5y,10y,ytd,max
PERIOD = '6mo'
# valid intervals: 1m,2m,5m,15m,30m,60m,90m,1h,1d,5d,1wk,1mo,3mo
INTERVAL = '1h'
# Lookup step, 1 is the next day
LOOKUP_STEP = 10
# test ratio size, 0.2 is 20%
TEST_SIZE = 0.3
# features to use
FEATURE_COLUMNS = ["Close", "Volume", "Open", "High", "Low"]
# date now
date_now = time.strftime("%Y-%m-%d")
# > model parameters <
N_LAYERS = 3
#Type of model
CELL = LSTM
# Number of neurons
UNITS = 256
# Dropout rate
DROPOUT = 0.3
# whether to use bidirectional RNNs
BIDIRECTIONAL = False
# > training parameters <
# LOSS = "mae"
# huber loss
LOSS = "huber_loss"
OPTIMIZER = "adam"
BATCH_SIZE = 64
EPOCHS = 100
ticker = "QQQ"
#save model
model_name = f"{date_now}_{ticker}-{LOSS}-{OPTIMIZER}-{CELL.__name__}-seq-{N_STEPS}-step-{LOOKUP_STEP}-layers-{N_LAYERS}-units-{UNITS}"
if BIDIRECTIONAL:
    model_name += "-b"
# folders that store results
if not os.path.isdir("results"):
    os.mkdir("results")
if not os.path.isdir("logs"):
    os.mkdir("logs")
if not os.path.isdir("data"):
    os.mkdir("data")
data = load_data(ticker, PERIOD, INTERVAL, N_STEPS, lookup_step=LOOKUP_STEP, test_size=TEST_SIZE, feature_columns=FEATURE_COLUMNS)
# save the dataframe
data["df"].to_csv()  # note: to_csv() needs a file path to actually write to disk
# construct the model
model = create_model(N_STEPS, loss=LOSS, units=UNITS, cell=CELL, n_layers=N_LAYERS,
dropout=DROPOUT, optimizer=OPTIMIZER, bidirectional=BIDIRECTIONAL)
# some tensorflow callbacks
checkpointer = ModelCheckpoint(os.path.join("results", model_name + ".h5"), save_weights_only=True, save_best_only=True, verbose=1)
tensorboard = TensorBoard(log_dir=os.path.join("logs", model_name))
history = model.fit(data["X_train"], data["y_train"],
                    batch_size=BATCH_SIZE,
                    epochs=EPOCHS,
                    validation_data=(data["X_test"], data["y_test"]),
                    callbacks=[checkpointer, tensorboard],
                    verbose=1)
model.save(os.path.join("results", model_name) + ".h5")
# >> Testing the Model <<
data = load_data(ticker, PERIOD, INTERVAL, N_STEPS, lookup_step=LOOKUP_STEP, test_size=TEST_SIZE,
feature_columns=FEATURE_COLUMNS, shuffle=False)
# construct the model
model = create_model(N_STEPS, loss=LOSS, units=UNITS, cell=CELL, n_layers=N_LAYERS,
dropout=DROPOUT, optimizer=OPTIMIZER, bidirectional=BIDIRECTIONAL)
model_path = os.path.join("results", model_name) + ".h5"
model.load_weights(model_path)
# evaluate the model
mse, mae = model.evaluate(data["X_test"], data["y_test"], verbose=0)
# calculate the mean absolute error (inverse scaling)
mean_absolute_error = data["column_scaler"]["Close"].inverse_transform([[mae]])[0][0]
print("Mean Absolute Error:", mean_absolute_error)
def predict(model, data):
    last_sequence = data["last_sequence"][-N_STEPS:]
    column_scaler = data["column_scaler"]
    last_sequence = last_sequence.reshape((last_sequence.shape[1], last_sequence.shape[0]))
    last_sequence = np.expand_dims(last_sequence, axis=0)
    # get the prediction (scaled from 0 to 1)
    prediction = model.predict(last_sequence)
    # get the price (by inverting the scaling)
    predicted_price = column_scaler["Close"].inverse_transform(prediction)[0][0]
    return predicted_price
# predict the future price
future_price = predict(model, data)
print(f"Future price after {LOOKUP_STEP} days is {future_price:.2f}$")
def plot_graph(model, data):
    y_test = data["y_test"]
    X_test = data["X_test"]
    y_pred = model.predict(X_test)
    y_test = np.squeeze(data["column_scaler"]["Close"].inverse_transform(np.expand_dims(y_test, axis=0)))
    y_pred = np.squeeze(data["column_scaler"]["Close"].inverse_transform(y_pred))
    # currently last 200 days
    plt.plot(y_test[-200:], c='b')
    plt.plot(y_pred[-200:], c='r')
    plt.xlabel("Days")
    plt.ylabel("Price")
    ### added plot title -KH
    plt.title("QQQ Price Chart")
    plt.legend(["Actual Price", "Predicted Price"])
    plt.show()
plot_graph(model, data)
def accuracy(model, data):
    y_test = data["y_test"]
    X_test = data["X_test"]
    y_pred = model.predict(X_test)
    y_test = np.squeeze(data["column_scaler"]["Close"].inverse_transform(np.expand_dims(y_test, axis=0)))
    y_pred = np.squeeze(data["column_scaler"]["Close"].inverse_transform(y_pred))
    y_pred = list(map(lambda current, future: int(float(future) > float(current)), y_test[:-LOOKUP_STEP], y_pred[LOOKUP_STEP:]))
    y_test = list(map(lambda current, future: int(float(future) > float(current)), y_test[:-LOOKUP_STEP], y_test[LOOKUP_STEP:]))
    return accuracy_score(y_test, y_pred)
print(str(LOOKUP_STEP) + ":", "Accuracy Score:", accuracy(model, data))
Use an f-string to format the string literal and put the variable name in curly brackets, as shown below:
plt.title(f'{ticker}')
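Since ticker is already a string, plt.title(ticker) works just as well; the f-string form is mainly useful when you want extra text around the variable, e.g. plt.title(f'{ticker} Price Chart').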
Basically this is the VGG-16 model. I performed transfer learning and fine-tuned the model two weeks ago, and I found both the test and train accuracy, but now I also need the class-wise accuracy of the model. I am trying to compute the confusion matrix and want to plot it too. Training code:
# Training the model again from the last CNN Block to The End of the Network
dataset = 'C:\\Users\\Sara Latif Khan\\OneDrive\\Desktop\\FYP_\\Scene15\\15-Scene'
model = model.to(device)
optimizer = Adam(filter(lambda p: p.requires_grad, model.parameters()))
# Training the fixed feature extractor
num_epochs = 5
batch_loss = 0
cum_epoch_loss = 0 #cumulative loss for each batch
for e in range(num_epochs):
    cum_epoch_loss = 0
    for batch, (images, labels) in enumerate(trainloader, 1):
        images = images.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        logps = model(images)
        loss = criterion(logps, labels)
        loss.backward()
        optimizer.step()
        batch_loss += loss.item()
        print(f'Epoch({e}/{num_epochs}) : Batch number({batch}/{len(trainloader)}) : Batch loss : {loss.item()}')
    torch.save(model, dataset+'_model_'+str(e)+'.pt')
    print(f'Training loss : {batch_loss/len(trainloader)}')
This is the code I am using to check the accuracy of my model based on data from the test loader.
model.to('cpu')
model.eval()
with torch.no_grad():
    num_correct = 0
    total = 0
    # set_trace()
    for batch, (images, labels) in enumerate(testloader, 1):
        logps = model(images)
        output = torch.exp(logps)
        pred = torch.argmax(output, 1)
        total += labels.size(0)
        num_correct += (pred == labels).sum().item()
        print(f'Batch ({batch} / {len(testloader)})')
        # to check the accuracy of the model on 5 batches
        # if batch == 5:
        #     break
print(f'Accuracy of the model on {total} test images: {num_correct * 100 / total}%')
Next, I need to find the class-wise accuracy of the model. I am working in a Jupyter Notebook. Should I reload a saved model and compute the confusion matrix from it, or what would be the appropriate way of doing this?
You have to save all the predictions and targets of the test set.
predictions, targets = [], []
for images, labels in testloader:
    logps = model(images)
    output = torch.exp(logps)
    pred = torch.argmax(output, 1)
    # convert to numpy arrays
    pred = pred.detach().cpu().numpy()
    labels = labels.detach().cpu().numpy()
    for i in range(len(pred)):
        predictions.append(pred[i])
        targets.append(labels[i])
Now you have all the predictions and actual targets of the test set stored.
The next step is to create the confusion matrix. I think I can just give you the function I always use:
def create_confusion_matrix(y_true, y_pred, classes):
    """ Creates and plots a confusion matrix given two lists (targets and predictions)
    :param list y_true: list of all targets (in this case integers bc. they are indices)
    :param list y_pred: list of all predictions (in this case one-hot encoded)
    :param dict classes: a dictionary of the class names with their index representation
    """
    amount_classes = len(classes)
    confusion_matrix = np.zeros((amount_classes, amount_classes))
    for idx in range(len(y_true)):
        target = y_true[idx][0]
        output = y_pred[idx]
        output = list(output).index(max(output))
        confusion_matrix[target][output] += 1
    fig, ax = plt.subplots(1)
    ax.matshow(confusion_matrix)
    ax.set_xticks(np.arange(len(list(classes.keys()))))
    ax.set_yticks(np.arange(len(list(classes.keys()))))
    ax.set_xticklabels(list(classes.keys()))
    ax.set_yticklabels(list(classes.keys()))
    plt.setp(ax.get_xticklabels(), rotation=45, ha="left", rotation_mode="anchor")
    plt.setp(ax.get_yticklabels(), rotation=45, ha="right", rotation_mode="anchor")
    plt.show()
So y_true are all the targets, y_pred all the predictions and classes is a dictionary which maps the labels to the actual class-names, for example:
classes = {"dog": [1, 0], "cat": [0, 1]}
Then simply call:
create_confusion_matrix(targets, predictions, classes)
Probably you will have to adapt it to your code a little but I hope this works for you. :)
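As an aside, if you already have flat integer predictions and targets (as collected in the loop above), scikit-learn can build the matrix and the class-wise accuracy for you. A minimal sketch, assuming targets and predictions are the lists from above:

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(targets, predictions)
# class-wise accuracy = correct predictions per class / samples per class
per_class_accuracy = cm.diagonal() / cm.sum(axis=1)
print(cm)
print(per_class_accuracy)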
I've been training an MLP to predict the time remaining on an assembly sequence. The training loss, validation loss and MSE are all less than 0.001. However, when I try to do a prediction with one of the datasets I trained the network on, it can't correctly identify any of the outputs from the set of inputs. What am I doing wrong that is producing this error?
I am also struggling to understand how, when the model is deployed, I should scale the result of a single prediction. scaler.inverse_transform won't work, because the scaler fitted during training is lost: the prediction is done in a separate script from the training, using the model the training produced. Is this information saved by the model builder?
I have tried changing the batch size during training, rounding the time column of the dataset to the nearest second (previously it was 0.1 seconds), and training over 50, 100 and 200 epochs, but I always end up with no correct predictions. I am also training an LSTM to see which is more accurate, but it has the same issue. The dataset is split 70-30 training-testing, and the training portion is then split 75-25 into training and validation.
Data scaling and model training code:
def scale_data(training_data, training_data_labels, testing_data, testing_data_labels):
    # Create X and Y scalers between 0 and 1
    x_scaler = MinMaxScaler(feature_range=(0, 1))
    y_scaler = MinMaxScaler(feature_range=(0, 1))
    # Scale training data
    x_scaled_training = x_scaler.fit_transform(training_data)
    y_scaled_training = y_scaler.fit_transform(training_data_labels)
    # Scale testing data
    x_scaled_testing = x_scaler.transform(testing_data)
    y_scaled_testing = y_scaler.transform(testing_data_labels)
    return x_scaled_training, y_scaled_training, x_scaled_testing, y_scaled_testing
def train_model(training_data, training_labels, testing_data, testing_labels, number_of_epochs, number_of_columns):
    model_hidden_neuron_number_list = []
    model_repeat_list = []
    model_error_rate_list = []
    for hidden_layer_1_units in range(int(np.floor(number_of_columns / 2)), int(np.ceil(number_of_columns * 2))):
        print("Training starting, number of hidden units = %d" % hidden_layer_1_units)
        for repeat in range(1, 6):
            print("Repeat %d" % repeat)
            model = k.Sequential()
            model.add(Dense(hidden_layer_1_units, input_dim=number_of_columns,
                            activation='relu', name='hidden_layer_1'))
            model.add(Dense(1, activation='linear', name='output_layer'))
            model.compile(loss='mean_squared_error', optimizer='adam')
            # Train Model
            model.fit(
                training_data,
                training_labels,
                epochs=number_of_epochs,
                shuffle=True,
                verbose=2,
                callbacks=[logger],
                batch_size=1024,
                validation_split=0.25
            )
            # Test Model
            test_error_rate = model.evaluate(testing_data, testing_labels, verbose=0)
            print("Error on testing data is %.3f" % test_error_rate)
            model_hidden_neuron_number_list.append(hidden_layer_1_units)
            model_repeat_list.append(repeat)
            model_error_rate_list.append(test_error_rate)
            # Save Model
            model_builder = tf.saved_model.builder.SavedModelBuilder("MLP/models/{hidden_layer_1_units}/{repeat}".format(hidden_layer_1_units=hidden_layer_1_units, repeat=repeat))
            inputs = {
                'input': tf.saved_model.utils.build_tensor_info(model.input)
            }
            outputs = {
                'time_remaining': tf.saved_model.utils.build_tensor_info(model.output)
            }
            signature_def = tf.saved_model.signature_def_utils.build_signature_def(
                inputs=inputs,
                outputs=outputs,
                method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME
            )
            model_builder.add_meta_graph_and_variables(
                K.get_session(),
                tags=[tf.saved_model.tag_constants.SERVING],
                signature_def_map={tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def}
            )
            model_builder.save()
And then to do a prediction:
file_name = top_level_file_path + "./MLP/models/19/1/"
testing_dataset = pd.read_csv(file_path + os.listdir(file_path)[0])
number_of_rows = len(testing_dataset.index)
number_of_columns = len(testing_dataset.columns)
newcol = [number_of_rows]
max_time = testing_dataset['Time'].max()
for j in range(0, number_of_rows - 1):
    newcol.append(max_time - testing_dataset.iloc[j].iloc[number_of_columns - 1])
x_scaler = MinMaxScaler(feature_range=(0, 1))
y_scaler = MinMaxScaler(feature_range=(0, 1))
# Scale training data
data_scaled = x_scaler.fit_transform(testing_dataset)
labels = pd.read_csv("Labels.csv")
labels_scaled = y_scaler.fit_transform(labels)
signature_key = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
input_key = 'input'
output_key = 'time_remaining'
with tf.Session(graph=tf.Graph()) as sess:
    saved_model = tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], file_name)
    signature = saved_model.signature_def
    x_tensor_name = signature[signature_key].inputs[input_key].name
    y_tensor_name = signature[signature_key].outputs[output_key].name
    x = sess.graph.get_tensor_by_name(x_tensor_name)
    y = sess.graph.get_tensor_by_name(y_tensor_name)
    # np.expand_dims(data_scaled[600], axis=0)
    predictions = sess.run(y, {x: data_scaled})
    predictions = y_scaler.inverse_transform(predictions)
#print(np.round(predictions, 2))
correct_result = 0
for i in range(0, number_of_rows):
    print(np.round(predictions[i]), " ", np.round(newcol[i]))
    if np.round(predictions[i]) == np.round(newcol[i]):
        correct_result += 1
print((correct_result/number_of_rows)*100)
The output of the first row should be 96.0 but it produces 110.0; the last should be 0.1 but is -40.0, even though no negatives appear in the dataset.
You can't compute accuracy for a regression problem. Compute the mean squared error on the test set instead.
Second, when it comes to the scalers: you always call scaler.fit_transform on the training data, so the scaler computes its parameters (min and max, in the case of a min-max scaler) on the training data only. Then, when performing inference on the test set, you should only call scaler.transform before feeding the data to the model.
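Regarding the deployment question: the scaler's min/max parameters are not stored by the SavedModel builder, so persist the fitted scaler objects yourself and load them in the prediction script. A sketch using joblib, assuming the x_scaler and y_scaler from scale_data() are in scope at training time:

import joblib

# at training time, right after fitting
joblib.dump(x_scaler, 'x_scaler.pkl')
joblib.dump(y_scaler, 'y_scaler.pkl')

# in the separate prediction script
x_scaler = joblib.load('x_scaler.pkl')
y_scaler = joblib.load('y_scaler.pkl')
data_scaled = x_scaler.transform(testing_dataset)       # transform only, no fit
predictions = y_scaler.inverse_transform(predictions)   # invert with the fitted scaler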
I am using this Kaggle guide to do time series forecasting (sample data attached).
Here's the code:
def create_dataset(dataset, window_size=1):
    data_X, data_Y = [], []
    for i in range(len(dataset) - window_size - 1):
        a = dataset[i:(i + window_size), 0]
        data_X.append(a)
        data_Y.append(dataset[i + window_size, 0])
    return (np.array(data_X), np.array(data_Y))
def fit_model(train_X, train_Y, window_size=1):
    model = Sequential()
    model.add(LSTM(4, input_shape=(1, window_size)))
    model.add(Dense(1))
    model.compile(loss="mean_squared_error", optimizer="adam")
    model.fit(train_X, train_Y, epochs=100, batch_size=1, verbose=0)
    return model
def predict_and_score(model, X, Y):
    # Make predictions on the original scale of the data.
    pred = MinMaxScaler(feature_range=(0, 1)).inverse_transform(model.predict(X))
    # Prepare Y data to also be on the original scale for interpretability.
    orig_data = MinMaxScaler(feature_range=(0, 1)).inverse_transform([Y])
    # Calculate RMSE.
    score = math.sqrt(mean_squared_error(orig_data[0], pred[:, 0]))
    return (score, pred)
This entire thing is being used in the following function:
def nnet(time_series, window_size=1):
    cmi_total_raw = vstack((time_series.values.astype('float32')))
    scaler = MinMaxScaler(feature_range=(0, 1))
    cmi_total_scaled = scaler.fit_transform(cmi_total_raw)
    cmi_train_sc = cmi_total_scaled[0:int(cmi_split*len(cmi_total_scaled))]
    cmi_test_sc = cmi_total_scaled[int(cmi_split*len(cmi_total_scaled)):len(cmi_total_scaled)]
    # Create test and training sets for one-step-ahead regression.
    window_size = 1
    train_X, train_Y = create_dataset(cmi_train_sc, window_size)
    test_X, test_Y = create_dataset(cmi_test_sc, window_size)
    # Reshape the input data into appropriate form for Keras.
    train_X = np.reshape(train_X, (train_X.shape[0], 1, train_X.shape[1]))
    test_X = np.reshape(test_X, (test_X.shape[0], 1, test_X.shape[1]))
    model = fit_model(train_X, train_Y, window_size)
    rmse_train, train_predict = predict_and_score(model, train_X, train_Y)
    mape_test, test_predict = predict_and_score(model, test_X, test_Y)
    return (mape_test, test_predict)
As far as I understand, it creates a model from the training data, predicts on the in-sample test set, and finally calculates the error.
The input data has 209 rows and I want to predict the next row(s).
Here's what I tried:
Since the same thing is done in auto-ARIMA using the forecast(steps=n_steps) method, I looked for something similar in Keras.
From Keras documentation:
predict(x, batch_size=None, verbose=0, steps=None)
Arguments:
x: The input data, as a Numpy array (or list of Numpy arrays if the model has multiple inputs).
steps: Total number of steps (batches of samples) before declaring the prediction round finished. Ignored with the default value of None.
I tried changing steps, and it predicted very absurd values on the order of 100,000. Moreover, the length of test_predict was nowhere near the number of steps I gave, so I am assuming steps means something else here.
Question
- Can Keras even be used to forecast time series data (out of sample)?
- If yes, is there a forecast method, just like the aforementioned predict method?
- If no, can the existing predict method be used in any way to get an out-of-sample forecast?
Sample data (cmi_total):
2014-05-25 272.459887
2014-06-01 272.446022
2014-06-08 330.301260
2014-06-15 656.838394
2014-06-22 670.575110
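For what it's worth on the third question: out-of-sample forecasting with a one-step model is typically done recursively, feeding each prediction back in as input. A sketch, assuming model and scaler are the fitted objects from nnet() and last_window is the last window_size scaled observations shaped (1, 1, window_size) — these names are illustrative, not from the guide:

import numpy as np

n_steps_ahead = 5
window = last_window.copy()
forecasts_scaled = []
for _ in range(n_steps_ahead):
    next_scaled = model.predict(window)[0, 0]  # one-step-ahead prediction
    forecasts_scaled.append(next_scaled)
    # slide the window: drop the oldest value, append the new prediction
    window = np.append(window[:, :, 1:], [[[next_scaled]]], axis=2)

# convert back to the original scale with the scaler fitted in nnet()
forecasts = scaler.inverse_transform(np.array(forecasts_scaled).reshape(-1, 1))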