I'm learning to work with neural networks applied to time series, so I tuned an LSTM example that I found to predict daily temperature data. However, the results are extremely poor, as shown in the image. (I only predict the last 92 days in order to save time for now.)
This is the code I implemented. The data are a 3-column dataframe (minimum, maximum, and mean daily temperatures), but I only use one of the columns at a time.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tools.eval_measures import rmse
from sklearn.preprocessing import MinMaxScaler
from keras.preprocessing.sequence import TimeseriesGenerator
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
import warnings
warnings.filterwarnings("ignore")
input_file2 = "TemperaturasCampillos.txt"
seriesT = pd.read_csv(input_file2,sep = "\t", decimal = ".", names = ["Minimas","Maximas","Medias"])
seriesT[seriesT==-999]=np.nan
date1 = '2010-01-01'
date2 = '2010-09-01'
date3 = '2020-05-17'
date4 = '2020-12-31'
mydates = pd.date_range(date2, date3).tolist()
seriesT['Fecha'] = mydates
seriesT.set_index('Fecha',inplace=True) # Set dates as the index so they are used on the x-axis by default
seriesT.index = seriesT.index.to_pydatetime()
df = seriesT.drop(seriesT.columns[[1, 2]], axis=1) # df.columns is zero-based pd.Index
n_input = 92
train, test = df[:-n_input], df[-n_input:]
scaler = MinMaxScaler()
scaler.fit(train)
train = scaler.transform(train)
test = scaler.transform(test)
#n_input = 365
n_features = 1
generator = TimeseriesGenerator(train, train, length=n_input, batch_size=1)
model = Sequential()
model.add(LSTM(200, activation='relu', input_shape=(n_input, n_features)))
model.add(Dropout(0.15))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit_generator(generator,epochs=150)
#create an empty list to collect our n_input (92) predictions
#create the batch that our model will predict off of
#save the prediction to our list
#add the prediction to the end of the batch to be used in the next prediction
pred_list = []
batch = train[-n_input:].reshape((1, n_input, n_features))
for i in range(n_input):
    pred_list.append(model.predict(batch)[0])
    batch = np.append(batch[:,1:,:],[[pred_list[i]]],axis=1)
df_predict = pd.DataFrame(scaler.inverse_transform(pred_list),
                          index=df[-n_input:].index, columns=['Prediction'])
df_test = pd.concat([df,df_predict], axis=1)
plt.figure(figsize=(20, 5))
plt.plot(df_test.index, df_test['Minimas'])
plt.plot(df_test.index, df_test['Prediction'], color='r')
plt.legend(loc='best', fontsize='xx-large')
plt.xticks(fontsize=18)
plt.yticks(fontsize=16)
plt.show()
As you can see if you click on the image link, the prediction is too smooth: it captures the seasonality well, but it is not what I am looking for.
In addition, I tried to add more layers to the neural network shown, so the network looks something like:
#n_input = 365
n_features = 1
generator = TimeseriesGenerator(train, train, length=n_input, batch_size=1)
model = Sequential()
model.add(LSTM(200, activation='relu', input_shape=(n_input, n_features)))
model.add(LSTM(128, activation='relu'))
model.add(LSTM(256, activation='relu'))
model.add(LSTM(128, activation='relu'))
model.add(LSTM(64, activation='relu'))
model.add(LSTM(n_features, activation='relu'))
model.add(Dropout(0.15))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit_generator(generator,epochs=100)
but I get this error:
ValueError: Input 0 is incompatible with layer lstm_86: expected ndim=3, found ndim=2
Of course, since the model performs badly, I cannot expect out-of-sample predictions to be accurate.
Why can't I add more layers to the network? How could I improve the performance?
You are missing one argument: return_sequences.
When you stack more than one LSTM layer, you should set it to True on every LSTM layer except the last one. Otherwise, a layer outputs only its last hidden state (ndim=2), while the next LSTM layer expects the full sequence (ndim=3).
model.add(LSTM(128, activation='relu', return_sequences=True))
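Applied to the model in the question, a minimal corrected sketch could look like this (the layer sizes are illustrative, trimmed from the original stack):
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM

n_input, n_features = 92, 1
model = Sequential()
# every LSTM except the last returns the full sequence (ndim=3)
model.add(LSTM(200, activation='relu', return_sequences=True,
               input_shape=(n_input, n_features)))
model.add(LSTM(128, activation='relu', return_sequences=True))
model.add(LSTM(64, activation='relu'))  # last LSTM returns only the final hidden state
model.add(Dropout(0.15))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')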
About the poor performance: my guess is that you have a small amount of data for this application (and the data seem pretty noisy), so adding layers won't help much.
I am trying to craft a neural network based on two research papers, but after cobbling something together using some examples online, it runs yet seems stuck at 57% accuracy no matter what I modify. I am quite sure there is something wrong with how I shaped the input to the model, but I don't understand enough to see why it's wrong and how to fix it.
The first paper (Quantitative Trading on Stock Market Based on Deep Reinforcement Learning by Jia WU, Chen WANG, Lidong XIONG, Hongyong SUN) states:
The LSTM-based agent is composed of an input layer, 5-layer LSTM with 31 hidden units on each layer, a dense layer and a soft-max layer. The update rate of the baseline function b is 0.8. Weights of the agent are initialized uniformly between -0.2 and 0.2. The neural network of the agent is trained with Adam optimizer [30] with a learning rate of 0.001.
The second paper (Effects of Activation Functions and Optimizers on Stock Price Prediction using LSTM Recurrent Networks by Masud Rana, Md. Mohsin Uddin, Md. Mohaimnul Hoque) states:
The result is shown that linear activation function and adamax optimizer and, tanh activation function and adam optimizer make almost the same prediction that is better than other activation functions and optimizers’ prediction.
I want to try and implement both models, or one that takes inspiration from both (I couldn't really understand what the model in the second paper looks like so I focused on the activation functions).
My data set is a csv with a date, some daily Forex data including technical indicators and world bank data, and a Target Column that is 1 if the closing price 30 days from that day is higher than that day's closing price.
My code is:
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM, InputLayer
from keras.optimizers import Adam, Adamax
from tensorflow.keras import layers
from tensorflow.keras import initializers
from keras.callbacks import EarlyStopping
# Load the data into a Pandas dataframe
df = pd.read_csv(r'C:...categorical.csv', index_col=False)
df.index = df.pop('Date')
RowNum = len(df.axes[0])
ColNum = len(df.axes[1])-1 #Number of columns - 1 for target
X = df.iloc[:, :ColNum].values
y = df.iloc[:, ColNum].values
X = np.reshape(X, (RowNum, ColNum, 1))
es = EarlyStopping(monitor='val_accuracy', mode='max', min_delta=1)
model = Sequential()
model.add(InputLayer(input_shape=(ColNum, 1)))
model.add(LSTM(31, return_sequences=True, input_shape=(ColNum, 1), activation='tanh'))
model.add(LSTM(31, return_sequences=True, activation='tanh'))
model.add(LSTM(31, return_sequences=True, activation='tanh'))
model.add(LSTM(31, return_sequences=True, activation='tanh'))
model.add(LSTM(31, activation='tanh'))
model.add(Dropout(0.2))
model.add(Dense(units=1, activation='tanh', kernel_initializer=initializers.RandomUniform(minval=-0.2, maxval=0.2)))
# Compile the model
model.compile(optimizer=Adam(lr=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
# Fit the model to the data
model.fit(X, y, batch_size=32, epochs=90, validation_split=0.2, callbacks=[es])
What can I try to fix this?
It is my first day with tf and keras. I did a quick tutorial which worked fine, but it left me with a lot of questions.
Can someone show me how to use two data inputs instead of one?
import keras
import numpy as np
model = keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd',loss='mean_squared_error')
xs = np.array([1,2,3,4,5,6,7], dtype=int) # input data 1
ys = np.array([8,11,14,17,20,23,26], dtype=int)
# formula is: 3*x + 5
model.fit(xs, ys, epochs=500)
print(model.predict([10.0]))
Add a few hidden layers for feature detection. If you want multiple features, you will need to change the shape of X and the input shape:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
from keras import layers

xs = np.array([1,2,3,4,5,6,7], dtype=int)
#ys = np.array([8,11,14,17,20,23,26], dtype=int)
ys = np.array(list(map(lambda x: 3*x + 5, xs.tolist())))
plt.plot(xs, ys)
X_train, X_test, y_train, y_test = train_test_split(xs, ys, test_size=0.3)
model = Sequential()
model.add(layers.Input(shape=(1,), name='main_input'))
model.add(Dense(200, activation='tanh'))
model.add(Dense(100, activation='tanh'))
model.add(Dense(32, activation='tanh'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse', metrics=['mse'])
history = model.fit(X_train, y_train, epochs=1000, verbose=0)
predictionResults = model.predict(X_test)
index = 0
results = predictionResults.flatten()
for value in X_test:
    plt.scatter(value, results[index])
    index += 1
plt.plot(xs, ys)
plt.show()
Going off of Golden's answer, here is an example of adding another feature "x". You should just mess around with the layers and sizes.
import keras
import numpy as np
xs = np.array([1,2,3,4,5,6,7], dtype=int) # input data 1
x = np.array([3,5,7,9,11,13,15], dtype=int) # input data 2
ys = np.array([3,10,21,36,55,78,105], dtype=int)
# formula is: ys = xs * x
input_data = np.array([xs, x]).T  # shape (7, 2): one row per sample, one column per feature
model = keras.models.Sequential()
model.add(keras.Input(shape=input_data.shape[1:]))
model.add(keras.layers.Dense(500, activation='tanh'))
model.add(keras.layers.Dense(200, activation='tanh'))
model.add(keras.layers.Dense(1))
model.compile(optimizer='adam', loss='mse', metrics=['mse'])
model.fit(input_data, ys, epochs=500)
print(model(np.array([[10, 21]])).numpy())
See https://www.pyimagesearch.com/2019/02/04/keras-multiple-inputs-and-mixed-data/ for how to create a simple feedforward neural network with 10 inputs:
model = Sequential()
model.add(Dense(8, input_shape=(10,), activation="relu"))
model.add(Dense(4, activation="relu"))
model.add(Dense(1, activation="linear"))
This network is a simple feedforward neural network with 10 inputs, a first hidden layer with 8 nodes, a second hidden layer with 4 nodes, and a final output layer used for regression.
Keras also allows you to create multiple networks, each with its own input, and concatenate them into a dense layer with one or more outputs.
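For example, here is a minimal sketch using the Keras functional API (the input names and shapes are illustrative, not taken from the article above):
from tensorflow.keras import Input, Model, layers

# two separate inputs (names and shapes are illustrative)
in_a = Input(shape=(1,), name='input_a')
in_b = Input(shape=(1,), name='input_b')
merged = layers.concatenate([in_a, in_b])
hidden = layers.Dense(8, activation='relu')(merged)
output = layers.Dense(1, activation='linear')(hidden)
model = Model(inputs=[in_a, in_b], outputs=output)
model.compile(optimizer='adam', loss='mse')
# model.fit([array_a, array_b], targets, epochs=500)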
I'm trying to learn how to use RNNs for time-series prediction, and in all the examples I'm seeing out there a sequence of prices is used to predict the following price. In the examples, each target (y_train[n]) is associated with a sequence of the last 30 prices/steps (X_train[n-1], X_train[n-2], ..., X_train[n-30]).
However, in the real world, to predict accurately you need more than the sequence of the last 30 prices; you would also need other... should I say features? Like the last 30 values of volume, or the last 30 values of a sentiment index.
So my question is:
How do you shape the input of an RNN with two sequences for each target (last 30 prices and last 30 volume values)? This is the example code I'm using with only 1 sequence to use as reference:
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
# Dividing Dataset (Test and Train)
train_lim = int(len(df) * 2 / 3)
training_set = df[:train_lim][['Close']]
test_set = df[train_lim:][['Close']]
# Normalizing
sc = MinMaxScaler(feature_range=(0, 1))
training_set_scaled = sc.fit_transform(training_set)
# Shaping Input
X_train = []
y_train = []
X_test = []
for i in range(30, training_set_scaled.size):
    X_train.append(training_set_scaled[i - 30:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)
for i in range(30, len(test_set)):
    X_test.append(test_set.iloc[i - 30:i, 0])
X_test = np.array(X_test)
# Adding extra dimension ???
X_train = np.reshape(X_train, [X_train.shape[0], X_train.shape[1], 1])
X_test = np.reshape(X_test, [X_test.shape[0], X_test.shape[1], 1])
regressor = Sequential()
# LSTM layer 1
regressor.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
# LSTM layers 2-5
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
# LSTM layer 6
regressor.add(LSTM(units=50))
regressor.add(Dropout(0.2))
# Fully connected layer
regressor.add(Dense(units=1))
# Compiling the RNN
regressor.compile(optimizer='adam', loss='mean_squared_error')
# Fitting the RNN model
regressor.fit(X_train, y_train, epochs=120, batch_size=32)
The dataframe that I'm using is a standard OHLCV with a datetime index so it will look like this:
Datetime Open High Low Close Volume
01/01/2021 102.42 103.33 100.57 101.23 1990
02/01/2021 101.23 105.22 99.45 100.11 1970
... ... ... ... ... ...
01/12/2021 203.22 210.34 199.22 201.11 2600
You can follow exactly the same process; the only difference is that the length of the last dimension of the input arrays (X_train and X_test) will be greater than one, as it equals the number of external regressors plus one (the plus one comes from the fact that the past values of the target are also used as an input).
import pandas as pd
import numpy as np
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
pd.options.mode.chained_assignment = None
# define the target and features
target = ['Close']
features = ['Volume', 'High', 'Low']
# download the data
df = yf.download(tickers=['AAPL'], period='1y')
df = df[features + target]
# split the data
split = int(df.shape[0] * 2 / 3)
df_train = df.iloc[:split, :].copy()
df_test = df.iloc[split:, :].copy()
# scale the data
target_scaler = MinMaxScaler().fit(df_train[target])
df_train[target] = target_scaler.transform(df_train[target])
df_test[target] = target_scaler.transform(df_test[target])
features_scaler = MinMaxScaler().fit(df_train[features])
df_train[features] = features_scaler.transform(df_train[features])
df_test[features] = features_scaler.transform(df_test[features])
# extract the input sequences and output values
sequence_length = 30
X_train, y_train = [], []
for i in range(sequence_length, df_train.shape[0]):
    X_train.append(df_train[features + target].iloc[i - sequence_length: i])
    y_train.append(df_train[target].iloc[i])
X_train, y_train = np.array(X_train), np.array(y_train)
X_test, y_test = [], []
for i in range(sequence_length, df_test.shape[0]):
    X_test.append(df_test[features + target].iloc[i - sequence_length: i])
    y_test.append(df_test[target].iloc[i])
X_test, y_test = np.array(X_test), np.array(y_test)
print(X_train.shape)
# (138, 30, 4)
print(X_test.shape)
# (55, 30, 4)
# build and train the model
model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=X_train.shape[1:]))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50))
model.add(Dropout(0.2))
model.add(Dense(units=1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train, epochs=120, batch_size=32)
model.evaluate(X_test, y_test)
# generate the test set predictions
y_pred = model.predict(X_test)
y_pred = target_scaler.inverse_transform(y_pred)
# plot the test set predictions
df['Predicted Close'] = np.nan
df['Predicted Close'].iloc[- y_pred.shape[0]:] = y_pred.flatten()
df[['Close', 'Predicted Close']].plot()
I understand the vanishing and exploding gradient problems in vanilla RNNs and why they happen. However, I would like to create this problem on purpose in order to understand it better. I took the code below from https://www.datatechnotes.com/2018/12/rnn-example-with-keras-simplernn-in.html.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, SimpleRNN
# convert into dataset matrix
def convertToMatrix(data, step):
    X, Y = [], []
    for i in range(len(data)-step):
        d = i+step
        X.append(data[i:d,])
        Y.append(data[d,])
    return np.array(X), np.array(Y)
step = 4
N = 1000
Tp = 800
t=np.arange(0,N)
x=np.sin(0.02*t)+2*np.random.rand(N)
df = pd.DataFrame(x)
df.head()
plt.plot(df)
plt.show()
values=df.values
train,test = values[0:Tp,:], values[Tp:N,:]
# add step elements into train and test
test = np.append(test,np.repeat(test[-1,],step))
train = np.append(train,np.repeat(train[-1,],step))
trainX,trainY =convertToMatrix(train,step)
testX,testY =convertToMatrix(test,step)
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
model = Sequential()
model.add(SimpleRNN(units=32, input_shape=(1,step), activation="relu"))
model.add(Dense(8, activation="relu"))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='rmsprop')
model.summary()
model.fit(trainX,trainY, epochs=100, batch_size=16, verbose=2)
trainPredict = model.predict(trainX)
testPredict= model.predict(testX)
predicted=np.concatenate((trainPredict,testPredict),axis=0)
trainScore = model.evaluate(trainX, trainY, verbose=0)
print(trainScore)
How should I modify this code to create this problem? Thank you.
The vanishing gradient problem typically appears when we use a saturating activation function such as sigmoid. If you change relu to sigmoid, you may encounter the vanishing gradient problem.
model = Sequential()
model.add(SimpleRNN(units=32, input_shape=(1, step), activation="sigmoid"))
model.add(Dense(8, activation="sigmoid"))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='rmsprop')
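If you want to observe the effect directly rather than infer it from training behaviour, here is a minimal sketch (an assumption on my part: TensorFlow 2.x, where a Keras model can be called inside tf.GradientTape) that prints the per-layer gradient norms of the sigmoid model above, reusing the step and model variables:
import numpy as np
import tensorflow as tf

# dummy batch with the shape the model expects: (batch, 1, step)
x = tf.constant(np.random.rand(16, 1, step), dtype=tf.float32)
y = tf.constant(np.random.rand(16, 1), dtype=tf.float32)
with tf.GradientTape() as tape:
    pred = model(x, training=True)
    loss = tf.reduce_mean(tf.square(y - pred))
# with sigmoid activations, the earlier layers' gradient norms should be noticeably smaller
grads = tape.gradient(loss, model.trainable_variables)
for var, g in zip(model.trainable_variables, grads):
    print(var.name, float(tf.norm(g)))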
This is a regression problem. Below is my code
import os
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score, KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
os.chdir(r'C:\Users\Swapnil\Desktop\RP TD\first\Changes')
## Load the dataset
dataset1 = pd.read_csv("Main Lane Plaza 1.csv")
X_train = dataset1.iloc[:,0:11].values
Y_train = dataset1.iloc[:,11].values
dataset2 = pd.read_csv("Main Lane Plaza 1_070416010117.csv")
X_test = dataset2.iloc[:,0:11].values
Y_test = dataset2.iloc[:,11].values
##Define base model
def base_model():
    model = Sequential()
    model.add(Dense(11, input_dim=11, kernel_initializer='normal', activation='sigmoid'))
    model.add(Dense(7, kernel_initializer='normal', activation='sigmoid'))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
seed = 7
np.random.seed(seed)
clf = KerasRegressor(build_fn=base_model, epochs=100, batch_size=5, verbose=0)
clf.fit(X_train, Y_train)
res = clf.predict(X_train)
##Result
clf.score(X_test, Y_test)
Not sure if the score should be negative?
Kindly advise if I am doing something wrong. Thanks in advance.
I am not able to figure it out. Could this be a problem due to feature scaling? I did the feature scaling in R and saved the CSV files to use in Python.
When you get a negative score for a regression problem, it usually means that the model you chose can't fit your data well.
Your first and second layers both use sigmoid activations, followed by a final layer with one output.
Change the activations to relu: sigmoid squashes values between 0 and 1, making the numbers really small and causing the vanishing gradient problem over the two hidden layers.
def base_model():
    model = Sequential()
    model.add(Dense(11, input_dim=11, kernel_initializer='normal', activation='relu'))
    model.add(Dense(7, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model