I have created a Keras neural network. The neural network was trained during eight epochs, and it outputs this loss value and accuracy:
Epoch 1/8
2009/2009 [==============================] - 0s 177us/step - loss: 0.0824 - acc: 4.9776e-04
Epoch 2/8
2009/2009 [==============================] - 0s 34us/step - loss: 0.0080 - acc: 4.9776e-04
Epoch 3/8
2009/2009 [==============================] - 0s 37us/step - loss: 0.0071 - acc: 4.9776e-04
Epoch 4/8
2009/2009 [==============================] - 0s 38us/step - loss: 0.0071 - acc: 4.9776e-04
Epoch 5/8
2009/2009 [==============================] - 0s 35us/step - loss: 0.0070 - acc: 4.9776e-04
Epoch 6/8
2009/2009 [==============================] - 0s 38us/step - loss: 0.0071 - acc: 4.9776e-04
Epoch 7/8
2009/2009 [==============================] - 0s 36us/step - loss: 0.0068 - acc: 4.9776e-04
Epoch 8/8
2009/2009 [==============================] - 0s 40us/step - loss: 0.0070 - acc: 4.9776e-04
How do I interpret the loss function provided within the output?
Is there any way to find the variation percentage between the actual price and prediction for every single day in the data set?
Here is the neural network:
import tensorflow as tf
import keras
import numpy as np
#import quandle
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import pandas as pd
import sklearn
import math
import pandas_datareader as web
def func_stock_prediction(stockdata, start, end):
start = start
end = end
df = web.DataReader(stockdata, "yahoo", start, end)
df = df[['Close']]
previous = 5
def create_dataset(df, previous):
dataX, dataY = [], []
for i in range(len(df)-previous-1):
a = df[i:(i+previous), 0]
dataX.append(a)
dataY.append(df[i + previous, 0])
return np.array(dataX), np.array(dataY)
scaler = sklearn.preprocessing.MinMaxScaler(feature_range = (0, 1))
df = scaler.fit_transform(df)
train_size = math.ceil(len(df) * 0.5)
train, val = df[0:train_size,:], df[train_size:len(df),:]
X_train, Y_train = create_dataset(train, previous)
print(X_train)
print(Y_train)
print(X_train.shape)
print(Y_train.shape)
X_val, Y_val = create_dataset(val, previous)
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
X_val = np.reshape(X_val, (X_val.shape[0], 1, X_val.shape[1]))
model = keras.models.Sequential()
model.add(keras.layers.Dense(units = 64, activation = 'relu', input_shape = (1, 5)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(units = 1, activation = 'linear'))
model.compile(loss='mean_absolute_error',
optimizer='adam',
metrics=['accuracy'])
history = model.fit(X_train, Y_train, epochs=8)
train = model.predict(X_train)
val = model.predict(X_val)
train = scaler.inverse_transform(train)
Y_train = scaler.inverse_transform([Y_train])
val = scaler.inverse_transform(val)
Y_val = scaler.inverse_transform([Y_val])
predictions = val
trainPlot = np.empty_like(df)
trainPlot[:, :] = np.nan
trainPlot[previous:len(train)+previous, :] = train
valPlot = np.empty_like(df)
valPlot[:, :] = np.nan
valPlot[len(train)+(previous*2)+1:len(df)-1, :] = val
inversetransform, =plt.plot(scaler.inverse_transform(df))
train, =plt.plot(trainPlot)
val, =plt.plot(valPlot)
plt.xlabel('Number of Days')
plt.ylabel('Stock Price')
plt.title("Predicted vs. Actual Stock Price Per Day")
plt.show()
func_stock_prediction("PLAY", 2010-1-1, 2020-1-1)
You are using accuracy as a metric. Accuracy measures the proportion of predicted labels that match the true labels. Accuracy is used mostly (to my knowledge) for classification tasks. As far as I know, the accuracy is not really interpretable when you're predicting a continuous outcome variable.
Based on your code, it looks like you're using the neural network for a regression problem (you're predicting a continuous variable). For regression problem meterics, people often use "mean squared error", "root mean squared error", "mean absolute error", "R^2", etc.
If you're interested in percentage differences, then maybe you could try the keras loss, "mean_absolute_percentage_error".
Related
I have tried over and over with different approaches to building this model however I continue to run into this issue where my training accuracy steadily increases just fine but my validation and evaluation accuracy remains very low (55% - 65%).
Epoch 95/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6326 - accuracy: 0.8057 - val_loss: 2.0461 - val_accuracy: 0.5985
Epoch 96/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6485 - accuracy: 0.7990 - val_loss: 1.9512 - val_accuracy: 0.5909
Epoch 97/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6263 - accuracy: 0.8032 - val_loss: 2.0344 - val_accuracy: 0.5682
Epoch 98/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6249 - accuracy: 0.7990 - val_loss: 2.0183 - val_accuracy: 0.5682
Epoch 99/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6189 - accuracy: 0.8007 - val_loss: 2.0818 - val_accuracy: 0.5758
Epoch 100/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6261 - accuracy: 0.8024 - val_loss: 2.0591 - val_accuracy: 0.5833
18/18 [==============================] - 0s 1ms/step - loss: 2.2385 - accuracy: 0.5628
EVAL:
[2.238506317138672, 0.5628318786621094]
The entire script is as follows:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from keras import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam
import numpy as np
from keras.utils import to_categorical
def count_classes(y: list):
counts = {}
for i in y:
if i in counts.keys():
counts[i] += 1
else:
counts[i] = 1
return counts
dataset = pd.read_csv('Dementia-data.csv')
X= dataset.iloc[:,1:]
y= dataset.iloc[:,0] # roughly 2562 input variables
X.head(2)
#standardizing the input feature
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle = True, random_state = 0, test_size=0.2)
print("TRAIN:")
print(count_classes(list(y_train)))
print("TEST:")
print(count_classes(list(y_test)))
sc = StandardScaler()
scaler = sc.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
optimizer = Adam(lr=0.00005)
classifier = Sequential()
#First Hidden Layer
classifier.add(Dense(32, activation='relu', kernel_initializer='random_normal', kernel_regularizer=regularizers.l2(0.005), input_dim=2562))
#Second Hidden Layer
classifier.add(Dropout(0.2))
classifier.add(Dense(64, activation='relu', kernel_initializer='random_normal', kernel_regularizer=regularizers.l2(0.005)))
#Output Layer
classifier.add(Dropout(0.2))
classifier.add(Dense(64, activation='relu', kernel_initializer='random_normal'))
classifier.add(Dense(7, activation='softmax', kernel_initializer='random_normal'))
#Compiling the neural network
classifier.compile(optimizer =optimizer,loss='sparse_categorical_crossentropy', metrics =['accuracy'])
#Fitting the data to the training dataset
classifier.fit(X_train,y_train, batch_size=20, epochs=200, validation_split=0.1, shuffle=True)
eval_model=classifier.evaluate(X_test, y_test)
print("EVAL:")
print(eval_model)
I have tried all sorts of things to try to combat overfittings such as regulators, dropouts, and data splitting because that seems to be the lead cause for problems like this one. I do not know if I am missing something.
The class distribution for the test and train data is as follows, it doesn't look like there is any inconsistencies between the 2 datasets in terms of ratios of classses:
TRAIN:
{2: 822, 4: 136, 6: 229, 5: 184, 3: 76, 1: 57}
TEST:
{2: 199, 6: 59, 5: 45, 1: 26, 3: 15, 4: 33}
The plotted accuracy for training and testing:
The plotted loss for training and testing, the loss clearly increases for the testing data:
I have been struggling with this for weeks now and I'd truly appreciate any help to get this working. I have also tried using learning rates between 0.05 and 0.00005 with little to no improvement.
Some context about my project: I intend to study various parameters about bullets and how they affect the ballistics coefficient (i.e. bullet performance) of the projectile. I have different parameters, such as weight, caliber, sectional density, etc. I feel that I did this all wrong though; I am just reading through tutorials and applying what I feel could be useful and relevant in my project.
The output of my regression model looks a bit off to me; the trained model continuously outputs 0.0201 as MSE throughout the model.fit() part of my program.
Also, the model.predict(X) seems to have an accuracy of 100%, however, this does not seem right; I borrowed some code from a tutorial describing Keras models to display the model output while displaying the expected output.
This is the program constructing the model and training it
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.utils import shuffle
import tensorflow as tf
from tensorflow.keras.callbacks import TensorBoard
from pandas.plotting import scatter_matrix
import time
name = 'Bullet Database Analysis v2-{}'.format(int(time.time()))
tensorboard = TensorBoard(log_dir='logs/{}'.format(name))
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)
df = pd.read_csv('Bullet Optimization\ShootForum Bullet DB_2.csv')
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
dataset = df.values
X = dataset[:,0:12]
X = np.asarray(X).astype(np.float32)
y = dataset[:,13]
y = np.asarray(y).astype(np.float32)
X_train, X_val_and_test, y_train, y_val_and_test = train_test_split(X, y, test_size=0.3, shuffle=True)
X_val, X_test, y_val, y_test = train_test_split(X_val_and_test, y_val_and_test, test_size=0.5)
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization
model = Sequential(
[
#2430 is the shape of X_train
#BatchNormalization(axis=-1, momentum = 0.1),
Dense(2430, activation='relu'),
Dense(32, activation='relu'),
Dense(1),
]
)
model.compile(loss='mse', metrics=['mse'])
history = model.fit(X_train, y_train,
batch_size=64,
epochs=20,
validation_data=(X_val, y_val),
#callbacks = [tensorboard]
)
# plt.plot(history.history['loss'],'r')
# plt.plot(history.history['val_loss'],'m')
plt.plot(history.history['mse'],'b')
plt.show()
model.summary()
model.save("Bullet Optimization\Bullet Database Analysis.h5")
Here is my code, loading my previously trained model via h5
import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras.models import load_model
import pandas as pd
df = pd.read_csv('Bullet Optimization\ShootForum Bullet DB_2.csv')
model = load_model('Bullet Optimization\Bullet Database Analysis.h5')
dataset = df.values
X = dataset[:,0:12]
y = dataset[:,13]
model.fit(X,y, epochs=10)
#predictions = np.argmax(model.predict(X), axis=-1)
predictions = model.predict(X)
# summarize the first 5 cases
for i in range(5):
print('%s => %d (expected %d)' % (X[i].tolist(), predictions[i], y[i]))
This is the output
Epoch 1/10
2021-03-09 10:38:06.372303: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-03-09 10:38:07.747241: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
109/109 [==============================] - 2s 4ms/step - loss: 0.0201 - mse: 0.0201
Epoch 2/10
109/109 [==============================] - 1s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 3/10
109/109 [==============================] - 0s 4ms/step - loss: 0.0201 - mse: 0.0201
Epoch 4/10
109/109 [==============================] - 0s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 5/10
109/109 [==============================] - 1s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 6/10
109/109 [==============================] - 1s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 7/10
109/109 [==============================] - 1s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 8/10
109/109 [==============================] - 0s 4ms/step - loss: 0.0201 - mse: 0.0201
Epoch 9/10
109/109 [==============================] - 1s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 10/10
109/109 [==============================] - 0s 4ms/step - loss: 0.0201 - mse: 0.0201
[0.314, 7.9756, 100.0, 100.0, 31.4, 0.00314, 318.4713376, 6.480041472000001, 0.51, 12.95400001, 4.067556004, 0.145] => 0 (expected 0)
[0.358, 9.0932, 148.0, 148.0, 52.983999999999995, 0.002418919, 413.4078212, 9.590461379, 0.635, 16.12900002, 5.774182006, 0.165] => 0 (expected 0)
[0.313, 7.9502, 83.0, 83.0, 25.979, 0.003771084, 265.1757188, 5.378434422000001, 0.504, 12.80160001, 4.006900804, 0.121] => 0 (expected 0)
[0.251, 6.3754, 50.0, 50.0, 12.55, 0.00502, 199.20318730000002, 3.2400207360000004, 0.4, 10.16000001, 2.5501600030000002, 0.113] => 0 (expected 0)
[0.251, 6.3754, 50.0, 50.0, 12.55, 0.00502, 199.20318730000002, 3.2400207360000004, 0.41, 10.41400001, 2.613914003, 0.113] => 0 (expected 0)
Here is a link to my training dataset. Within my code, I used train_test_split to create both the test and train dataset.
Lastly, is there a way within Tensorboard to visualize the model fitting with the dataset? I really feel that although my model is training, it is not making any significant fitting even though the MSE error is reduced.
Because you have nan values in your dataset. Before splitting up you can check it with df.isna().sum(). These can have a negative impact on your network. Here I just simply dropped them (df.dropna(inplace = True, axis = 0)) but you can use some imputation techniques to replace them.
Also 2430 neurons can be overkill for this data, start with less neurons.
model = tf.keras.models.Sequential(
[
tf.keras.layers.Dense(512, activation='relu'),
tf.keras.layers.Dense(32, activation='relu'),
tf.keras.layers.Dense(1),
]
)
Here is the last epoch:
Epoch 20/20
27/27 [==============================] - 0s 8ms/step - loss: 8.2077e-04 - mse: 8.2077e-04 -
val_loss: 8.5023e-04 - val_mse: 8.5023e-04
While doing regression, calculating accuracy straight forward is not a valid option. You can use model.evaluate(X_test, y_test) or when you get predictions by model.predict, you can use other regression metrics to compute how close your predictions are.
My problem is to predict the output as which has 3 class label,
Lets say I have 20000 samples in my dataset with each sample is associated with label (0,1,2).
As this is multiclass classification problem.
Can I only give input as Labels which are ( 0, 1,2) to the network and get prediction based on the labels.
Will the data feeded to the network is sufficient to learn and predict the output
Please help me with your inputs
# Below is the code
X_train, X_test, y_train, y_test = train_test_split(values_train[:, 0],
values_train[:, 1],
test_size=0.25,
random_state=42)
print(" X Training Set size is",X_train.shape )
print(" y Training Set size is",y_train.shape )
print(" X Test Set size is",X_test.shape)
print(" y Test Set size is",y_test.shape )
'X Training Set size is (165081,)'
'y Training Set size is (165081,)'
'X Test Set size is (55028,)'
'y Test Set size is (55028,)'
# convert to LSTM friendly format
X_train = X_train.reshape(len(X_train),1, 1)
X_test = X_test.reshape(len(X_test),1,1)
print(X_train.shape, X_test.shape)
(165081, 1, 1) (55028, 1, 1)
# configure network
n_batch = 1
n_epoch = 100
n_neurons = 10
from keras.optimizers import SGD
opt = SGD(lr=0.01)
# design network
model = Sequential()
model.add(LSTM(n_neurons, batch_input_shape=(n_batch, X_train.shape[1],
X_train.shape[2]),
stateful=True))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
# fit network
for i in range(n_epoch):
model.fit(X_train, y_train ,validation_data=(X_test, y_test),
epochs=1, batch_size=n_batch, verbose=1, shuffle= False)
model.reset_states()
df_actual = []
dp_predict = []
for i in range(len(X_test)):
testX,testy = X_test[i],y_test[i]
testX = testX.reshape(1, 1, 1)
yhat = model.predict(testX, batch_size=1)
df_actual.append(testy)
dp_predict.append(yhat)
print('>Actual =%.1f, Predicted=%.1f' % (testy, yhat))
I am not able to get correct prediction in this model.
Update:
Please find the below Validation accuracy and Training accuracy with the loss
Train on 154076 samples, validate on 66033 samples
Epoch 1/5
154076/154076 [==============================] - 289s 2ms/step - loss: 1.0033 - accuracy: 0.3816 - val_loss: 1.0018 - val_accuracy: 0.4286
Epoch 2/5
154076/154076 [==============================] - 291s 2ms/step - loss: 1.0021 - accuracy: 0.3817 - val_loss: 1.0020 - val_accuracy: 0.4286
Epoch 3/5
154076/154076 [==============================] - 293s 2ms/step - loss: 1.0018 - accuracy: 0.3804 - val_loss: 1.0014 - val_accuracy: 0.4286
Epoch 4/5
154076/154076 [==============================] - 290s 2ms/step - loss: 1.0016 - accuracy: 0.3812 - val_loss: 1.0012 - val_accuracy: 0.4286
Epoch 5/5
154076/154076 [==============================] - 290s 2ms/step - loss: 1.0015 - accuracy: 0.3814 - val_loss: 1.0012 - val_accuracy: 0.4286
Can anyone suggest me what can be improvement
Note: - I have normalized the input data with MinMaxScalar and used the scaled data, but there is no change in the output
Class labels are of categorical type. Neural networks can't learn on categorical data. You have to one-hot encode it with e.g. keras.utils.to_categorical:
x = values_train[:, 0]
y = values_train[:, 1]
y = keras.utils.to_categorical(y)
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=42)
I've built a simplistic multi-layer NN using Keras with precipitation data in Australia. The code takes 4 input columns: ['MinTemp', 'MaxTemp', 'Rainfall', 'WindGustSpeed'] and trains against the RainTomorrow output.
I've partitioned the data into training/test buckets, transformed all values into 0 <= n <= 1. When I trying to run model.fit, my loss values steady at ~13.2, but my accuracy is always 0.0. An example of logged fitting intervals are:
...
Epoch 37/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.1274 - acc: 0.0000e+00 - val_loss: -16.1168 - val_acc: 0.0000e+00
Epoch 38/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.1457 - acc: 0.0000e+00 - val_loss: -16.1168 - val_acc: 0.0000e+00
Epoch 39/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.1315 - acc: 0.0000e+00 - val_loss: -16.1168 - val_acc: 0.0000e+00
Epoch 40/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.1797 - acc: 0.0000e+00 - val_loss: -16.1168 - val_acc: 0.0000e+00
Epoch 41/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.1844 - acc: 0.0000e+00 - val_loss: -16.1169 - val_acc: 0.0000e+00
Epoch 42/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.2205 - acc: 0.0000e+00 - val_loss: -16.1169 - val_acc: 0.0000e+00
Epoch 43/200
...
How can I amend the following script, so my accuracy grows, and my predication output returns a value between 0 and 1 (0: no rain, 1: rain)?
import keras
import sklearn.model_selection
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import MinMaxScaler
labelencoder = LabelEncoder()
# read data, replace NaN with 0.0
csv_data = pd.read_csv('weatherAUS.csv', header=0)
csv_data = csv_data.replace(np.nan, 0.0, regex=True)
# Input/output columns scaled to 0<=n<=1
x = csv_data.loc[:, ['MinTemp', 'MaxTemp', 'Rainfall', 'WindGustSpeed']]
y = labelencoder.fit_transform(csv_data['RainTomorrow'])
scaler_x = MinMaxScaler(feature_range =(-1, 1))
x = scaler_x.fit_transform(x)
scaler_y = MinMaxScaler(feature_range =(-1, 1))
y = scaler_y.fit_transform([y])[0]
# Partitioned data for training/testing
x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x, y, test_size=0.2)
# model
model = keras.models.Sequential()
model.add( keras.layers.normalization.BatchNormalization(input_shape=tuple([x_train.shape[1]])))
model.add(keras.layers.core.Dense(4, activation='relu'))
model.add(keras.layers.core.Dropout(rate=0.5))
model.add(keras.layers.normalization.BatchNormalization())
model.add(keras.layers.core.Dense(4, activation='relu'))
model.add(keras.layers.core.Dropout(rate=0.5))
model.add(keras.layers.normalization.BatchNormalization())
model.add(keras.layers.core.Dense(4, activation='relu'))
model.add(keras.layers.core.Dropout(rate=0.5))
model.add(keras.layers.core.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=["accuracy"])
callback_early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, verbose=0, mode='auto')
model.fit(x_train, y_train, batch_size=1024, epochs=200, validation_data=(x_test, y_test), verbose=1, callbacks=[callback_early_stopping])
y_test = model.predict(x_test.values)
As you can see, the sigmoid activation function that you are using in your neural network output (the last layer) range from 0 to 1.
Note that your label (y) is rescaled to -1 to 1.
I suggest you change the y range to 0 to 1 and keep the sigmoid output.
So the sigmoid Ranges from 0 to 1.
Your MinMaxscaler scales data from -1 to 1.
You can fix it by replacing 'sigmoid' in the output layer with 'tanh', as tanh has output ranging from -1 to 1
Both the other answers can be used to address the fact that your network ouput is not in the same range as your y vector values. Either adjust your final layer to a tanh activation, or change the y-vector range to [0,1].
However, your network loss function and metric is defined for classification purposes, where as you are attempting regression (continuous values between [-1, 1]). The most common loss function and accuracy metric to use is the mean sqaured error, or mean absolute errtr. So I suggest you change the following:
model.compile(loss='mse', optimizer='rmsprop', metrics=['mse, 'mae'])
first of all, I'm running with the following setup:
Running on windows 10
Python 3.6.2
TensorFlow 1.8.0
Keras 2.1.6
I'm trying to predict, or at least guesstimate the following number sequence:
https://codepen.io/anon/pen/RJRPPx (limited to 20,000 for testing), the full sequence contains about one million records.
And here is the code (run.py)
import lstm
import time
import matplotlib.pyplot as plt
def plot_results(predicted_data, true_data):
fig = plt.figure(facecolor='white')
ax = fig.add_subplot(111)
ax.plot(true_data, label='True Data')
plt.plot(predicted_data, label='Prediction')
plt.legend()
plt.show()
def plot_results_multiple(predicted_data, true_data, prediction_len):
fig = plt.figure(facecolor='white')
ax = fig.add_subplot(111)
ax.plot(true_data, label='True Data')
#Pad the list of predictions to shift it in the graph to it's correct start
for i, data in enumerate(predicted_data):
padding = [None for p in range(i * prediction_len)]
plt.plot(padding + data, label='Prediction')
plt.legend()
plt.show()
#Main Run Thread
if __name__=='__main__':
global_start_time = time.time()
epochs = 10
seq_len = 50
print('> Loading data... ')
X_train, y_train, X_test, y_test = lstm.load_data('dice_amplified/primeros_20_mil.csv', seq_len, True)
print('> Data Loaded. Compiling...')
model = lstm.build_model([1, 50, 100, 1])
model.fit(
X_train,
y_train,
batch_size = 512,
nb_epoch=epochs,
validation_split=0.05)
predictions = lstm.predict_sequences_multiple(model, X_test, seq_len, 50)
#predicted = lstm.predict_sequence_full(model, X_test, seq_len)
#predicted = lstm.predict_point_by_point(model, X_test)
print('Training duration (s) : ', time.time() - global_start_time)
plot_results_multiple(predictions, y_test, 50)
I have tried to:
increase and decrease epochs.
increase and decrease batch size.
amplify the data.
The following plot represents:
epochs = 10
batch_size = 512
validation_split = 0.05
As well, as far as I understand, the loss should be decreasing with more epochs? Which doesn't seem to be happening!
Using TensorFlow backend.
> Loading data...
> Data Loaded. Compiling...
> Compilation Time : 0.03000473976135254
Train on 17056 samples, validate on 898 samples
Epoch 1/10
17056/17056 [==============================] - 31s 2ms/step - loss: 29927.0164 - val_loss: 289.8873
Epoch 2/10
17056/17056 [==============================] - 29s 2ms/step - loss: 29920.3513 - val_loss: 290.1069
Epoch 3/10
17056/17056 [==============================] - 29s 2ms/step - loss: 29920.4602 - val_loss: 292.7868
Epoch 4/10
17056/17056 [==============================] - 27s 2ms/step - loss: 29915.0955 - val_loss: 286.7317
Epoch 5/10
17056/17056 [==============================] - 26s 2ms/step - loss: 29913.6961 - val_loss: 298.7889
Epoch 6/10
17056/17056 [==============================] - 26s 2ms/step - loss: 29920.2068 - val_loss: 287.5138
Epoch 7/10
17056/17056 [==============================] - 28s 2ms/step - loss: 29914.0650 - val_loss: 295.2230
Epoch 8/10
17056/17056 [==============================] - 25s 1ms/step - loss: 29912.8860 - val_loss: 295.0592
Epoch 9/10
17056/17056 [==============================] - 28s 2ms/step - loss: 29907.4067 - val_loss: 286.9338
Epoch 10/10
17056/17056 [==============================] - 46s 3ms/step - loss: 29914.6869 - val_loss: 289.3236
Any recommendations? How could I improve it? Thanks!
Lstm.py contents:
import os
import time
import warnings
import numpy as np
from numpy import newaxis
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from keras.models import Sequential
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' #Hide messy TensorFlow warnings
warnings.filterwarnings("ignore") #Hide messy Numpy warnings
def load_data(filename, seq_len, normalise_window):
f = open(filename, 'rb').read()
data = f.decode().split('\n')
sequence_length = seq_len + 1
result = []
for index in range(len(data) - sequence_length):
result.append(data[index: index + sequence_length])
if normalise_window:
result = normalise_windows(result)
result = np.array(result)
row = round(0.9 * result.shape[0])
train = result[:int(row), :]
np.random.shuffle(train)
x_train = train[:, :-1]
y_train = train[:, -1]
x_test = result[int(row):, :-1]
y_test = result[int(row):, -1]
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))
return [x_train, y_train, x_test, y_test]
def normalise_windows(window_data):
normalised_data = []
for window in window_data:
normalised_window = [((float(p) / float(window[0])) - 1) for p in window]
normalised_data.append(normalised_window)
return normalised_data
def build_model(layers):
model = Sequential()
model.add(LSTM(
input_shape=(layers[1], layers[0]),
output_dim=layers[1],
return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(
layers[2],
return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(
output_dim=layers[3]))
model.add(Activation("linear"))
start = time.time()
model.compile(loss="mse", optimizer="rmsprop")
print("> Compilation Time : ", time.time() - start)
return model
def predict_point_by_point(model, data):
#Predict each timestep given the last sequence of true data, in effect only predicting 1 step ahead each time
predicted = model.predict(data)
predicted = np.reshape(predicted, (predicted.size,))
return predicted
def predict_sequence_full(model, data, window_size):
#Shift the window by 1 new prediction each time, re-run predictions on new window
curr_frame = data[0]
predicted = []
for i in range(len(data)):
predicted.append(model.predict(curr_frame[newaxis,:,:])[0,0])
curr_frame = curr_frame[1:]
curr_frame = np.insert(curr_frame, [window_size-1], predicted[-1], axis=0)
return predicted
def predict_sequences_multiple(model, data, window_size, prediction_len):
#Predict sequence of 50 steps before shifting prediction run forward by 50 steps
prediction_seqs = []
for i in range(int(len(data)/prediction_len)):
curr_frame = data[i*prediction_len]
predicted = []
for j in range(prediction_len):
predicted.append(model.predict(curr_frame[newaxis,:,:])[0,0])
curr_frame = curr_frame[1:]
curr_frame = np.insert(curr_frame, [window_size-1], predicted[-1], axis=0)
prediction_seqs.append(predicted)
return prediction_seqs
Addendum:
At nuric suggestion I modified the model as following:
def build_model(layers):
model = Sequential()
model.add(LSTM(input_shape=(layers[1], layers[0]), output_dim=layers[1], return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(layers[2], return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(output_dim=layers[3]))
model.add(Activation("linear"))
model.add(Dense(64, input_dim=50, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1))
start = time.time()
model.compile(loss="mse", optimizer="rmsprop")
print("> Compilation Time : ", time.time() - start)
return model
Still a bit lost on this one...
Even though you normalise the input, you don't normalise the output. The LSTM by default has a tanh output which means you will have a limited feature space, ie the dense layer won't be able to regress to large numbers.
You have a fixed length numerical input (50,), directly pass that to Dense layers with relu activation and will perform better on regression tasks, something simple like:
model = Sequential()
model.add(Dense(64, input_dim=50, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1))
For regression it is also preferable to use l2 regularizers instead of Dropout because you are not really feature extracting for classification etc.