I am learning how to set up an RNN-LSTM network for prediction. I have created a dataset with one input variable.
x y
1 2.5
2 6
3 8.6
4 11.2
5 13.8
6 16.4
...
The relationship is y(t) = 2.5*x(t) + x(t-1) - 0.9*x(t-2). I am trying to set up an RNN-LSTM to learn this pattern, but my program raises an error. My program is below:
df= pd.read_excel('dataset.xlsx')
def split_dataset(data):
    # split into standard weeks
    train, test = data[:-328], data[-328:-6]
    # restructure into windows of weekly data
    train = np.array(np.split(train, len(train)/1))
    test = np.array(np.split(test, len(test)/1))
    return train, test
verbose, epochs, batch_size = 0, 20, 16
train, test = split_dataset(df.values)
train_x, train_y = train[:,:,0], train[:,:,1]
model = Sequential()
model.add(LSTM(200, return_sequences=True, input_shape = train_x.shape))
model.compile(loss='mse', optimizer='adam')
It raises the following ValueError:
ValueError: Error when checking input: expected lstm_35_input to have 3 dimensions, but got array with shape (8766, 1)
Could anyone experienced in data science or Python show me how to set up the network?
Thanks
For an LSTM-based RNN, the input should have 3 dimensions (batch, time, data_point). I assume the index of your x variable is its time. In that case, you have to transform your input into batches of some window size. With a window of 3, for example, your input is:
batch # input target
0 x[0:3] y[3]
1 x[1:4] y[4]
2 x[2:5] y[5]
Note: your y's start from t=3 since you are using last 3 time steps to predict the next 4th value. If your y's are already calculated from the last three time steps as you have said, then y's should start from 0 index, i.e. at batch 0 you have y[0] as the target
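The table above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the helper name make_windows and the placeholder series y = 10 * x are mine, not part of the question, and each window x[i:i+3] is paired with y[i+3] exactly as in the table.

```python
import numpy as np

def make_windows(x, y, window=3):
    """Pair each run of `window` consecutive x values with the y value that follows it."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x) - window  # number of complete (input, target) pairs
    inputs = np.stack([x[i:i + window] for i in range(n)])
    targets = y[window:window + n]
    # LSTM layers expect (batch, time, features); here there is one feature
    return inputs[..., np.newaxis], targets

x = np.arange(1, 11)
y = 10 * x  # placeholder series, not the data from the question
X, t = make_windows(x, y, window=3)
print(X.shape, t.shape)  # (7, 3, 1) (7,)
```

If the targets are already aligned with the last step of each window, the slice that builds `targets` would need to shift accordingly.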
UPDATE as per below comment
If you want to predict multiple values, you can model this as a sequence-to-sequence problem, i.e. an N-to-M mapping. For example, using five x values to predict three y's:
batch # input target
0 x[0:5] y[3:6]
1 x[1:6] y[4:7]
2 x[2:7] y[5:8]
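A minimal sketch of the N-to-M windows in the table above, assuming NumPy arrays. The function name make_seq2seq_windows and the placeholder data are hypothetical; the offsets are chosen so that the first pair is x[0:5] and y[3:6], as in the table.

```python
import numpy as np

def make_seq2seq_windows(x, y, n_in=5, n_out=3):
    """Build (input, target) pairs where the target block ends one step after the input block."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X, Y = [], []
    for start in range(len(x)):
        in_end = start + n_in
        out_end = in_end + 1          # target block extends one step past the input
        out_start = out_end - n_out
        if out_end > len(y):
            break                     # not enough data left for a full target block
        X.append(x[start:in_end])
        Y.append(y[out_start:out_end])
    # inputs get a trailing feature axis: (batch, time, features)
    return np.array(X)[..., np.newaxis], np.array(Y)

x = np.arange(10)
y = 10 * np.arange(10)  # placeholder series for illustration
X, Y = make_seq2seq_windows(x, y)
print(X.shape, Y.shape)  # (5, 5, 1) (5, 3)
```

With targets of shape (batch, 3), the final layer of the model has to produce three values per sample, for example a Dense layer of width 3 rather than 1.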
I have now created the windowed data, and it seems to work for the case I mentioned.
Below is my code:
df= pd.read_excel('dataset.xlsx')
# split a univariate dataset into train/test sets
def split_dataset(data):
    train, test = data[:-328], data[-328:-6]
    return train, test
train, test = split_dataset(df.values)
# scale train and test data to [0, 1]
def scale(train, test):
    # fit scaler
    scaler = MinMaxScaler(feature_range=(0,1))
    scaler = scaler.fit(train)
    # transform train
    #train = train.reshape(train.shape[0], train.shape[1])
    train_scaled = scaler.transform(train)
    # transform test
    #test = test.reshape(test.shape[0], test.shape[1])
    test_scaled = scaler.transform(test)
    return scaler, train_scaled, test_scaled
scaler, train_scaled, test_scaled = scale(train, test)
def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            x_input = data[in_start:in_end, 0]
            x_input = x_input.reshape((len(x_input), 1))
            X.append(x_input)
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return np.array(X), np.array(y)
train_x, train_y = to_supervised(train_scaled, n_input = 3, n_out = 1)
test_x, test_y = to_supervised(test_scaled, n_input = 3, n_out = 1)
verbose, epochs, batch_size = 0, 20, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
model = Sequential()
model.add(LSTM(200, return_sequences= False, input_shape = (train_x.shape[1],train_x.shape[2])))
model.add(Dense(1))
model.compile(loss = 'mse', optimizer = 'adam')
history = model.fit(train_x, train_y, epochs=epochs, verbose=verbose, validation_data = (test_x, test_y))
However, I have other questions about this:
Q1: What is the meaning of units in LSTM? [model.add(LSTM(units, ...))]
(I have tried different units for the model, it would be more accurate as units increased.)
Q2: How many layers should I set?
Q3: How can I predict multiple steps? E.g. based on (x(t), x(t-1)), predict y(t), y(t+1).
I tried setting n_out = 2 in the to_supervised function, but when I applied the same method, it raised an error:
train_x, train_y = to_supervised(train_scaled, n_input = 3, n_out = 2)
test_x, test_y = to_supervised(test_scaled, n_input = 3, n_out = 2)
verbose, epochs, batch_size = 0, 20, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
model = Sequential()
model.add(LSTM(200, return_sequences= False, input_shape = (train_x.shape[1],train_x.shape[2])))
model.add(Dense(1))
model.compile(loss = 'mse', optimizer = 'adam')
history = model.fit(train_x, train_y, epochs=epochs, verbose=verbose, validation_data = (test_x, test_y))
ValueError: Error when checking target: expected dense_27 to have shape (1,) but got array with shape (2,)
Q3(cont): What should I add or change in the model setting?
Q3(cont): What is return_sequences? When should I set it to True?
I am wondering why this error is occurring. My hunch is that TensorDataset reads the last column as the labels, but I don't know why it would behave that way when I pass a separate dataset for labels as the second argument. Also, can someone explain exactly how one-hot encoding works and how I can fix this problem, given that I only want one label per item?
Error: return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: 1D target tensor expected, multi-target not supported
Code:
if __name__ == '__main__':
    inputs_file = pd.read_csv('dataset.csv')
    targets_file = pd.read_csv('labels.csv')
    inputs = inputs_file.iloc[1:1001].values
    targets = targets_file.iloc[1:1001].values
    inputs = torch.tensor(inputs, dtype=torch.float32)
    targets = torch.tensor(targets)
    dataset = TensorDataset(inputs, targets)
    val_size = 200
    test_size = 100
    train_size = len(dataset) - (val_size + test_size)
    # Divide dataset into 3 unique random subsets
    training_data, validation_data, test_data = random_split(dataset, [train_size, val_size, test_size])
    batch_size = 50
    train_loader = DataLoader(training_data, batch_size, shuffle=True, num_workers=4, pin_memory=True)
    valid_loader = DataLoader(validation_data, batch_size*2, num_workers=4, pin_memory=True)
From what I gather from the comments discussion, the error is reproduced by the following.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split
inputs = torch.randn(999, 11, dtype=torch.float32)
targets = torch.randint(5, (999, 1), dtype=torch.long)
# you need this to adapt from pandas, but not for this example code
# inputs = torch.tensor(inputs, dtype=torch.float32)
# targets = torch.tensor(targets)
dataset = TensorDataset(inputs, targets)
val_size = 200
test_size = 100
train_size = len(dataset) - (val_size + test_size)
# Divide dataset into 3 unique random subsets
training_data, validation_data, test_data = random_split(dataset, [train_size, val_size, test_size])
batch_size = 50
train_loader = DataLoader(training_data, batch_size, shuffle=True, num_workers=4, pin_memory=True)
valid_loader = DataLoader(validation_data, batch_size*2, num_workers=4, pin_memory=True)
# guess model. More on this in a moment
model = nn.Sequential(
    nn.Linear(11, 8),
    nn.Linear(8, 5),
)
loss_func = nn.CrossEntropyLoss()
for features, labels in train_loader:
    out = model(features)
    loss = loss_func(out, labels)
    print(f"{loss = }")
    break
Solution 1
Add labels.squeeze(-1) to the loop body, like so:
for features, labels in train_loader:
    out = model(features)
    labels = labels.squeeze(-1)  # (batch, 1) -> (batch,)
    loss = loss_func(out, labels)
    print(f"{loss = }")
    break
Solution 2
Flatten your targets initially with
targets = torch.tensor(targets[:, 0])
In response to
Now I am getting this error: RuntimeError: mat1 and mat2 shapes cannot be multiplied (11x1 and 11x8) I should also add that I am using a hidden layer of size 8 and i have 5 classes
My architecture is a guess at what you're using, but since the target reshape resolves the code above, I'll need more detail to be more helpful.
The documentation for CrossEntropyLoss may help: its example code shows class-index targets with shape (N,), rather than (N, 1).
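To make the shape requirement concrete, here is a NumPy stand-in for cross-entropy with integer class targets. It is a sketch for illustration only, not PyTorch's implementation: the function cross_entropy and the random data are assumptions of mine. It shows why labels of shape (N, 1) need to be flattened to (N,) before they can index one log-probability per row.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross-entropy for integer class targets of shape (N,)."""
    targets = np.asarray(targets)
    if targets.ndim != 1:
        raise ValueError(f"expected 1D targets, got shape {targets.shape}")
    # subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # pick the log-probability of the true class in each row
    return -log_probs[np.arange(len(targets)), targets].mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(50, 5))          # 50 samples, 5 classes
labels = rng.integers(0, 5, size=(50, 1))  # shape (N, 1), as read from a one-column file
loss = cross_entropy(logits, labels.squeeze(-1))  # squeeze to shape (N,)
print(labels.shape, labels.squeeze(-1).shape)  # (50, 1) (50,)
```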
I have a classification problem (0 or 1) with 78 features.
Let's say I have that data in a dataframe with 78 columns and 202 rows, each cell value holding an integer in the range [0, infinity).
Trying to achieve prediction with TensorFlow, when I fit my model I get the following warning:
WARNING:tensorflow:Model was constructed with shape (None, 101, 78) for input Tensor("input_2:0", shape=(None, 101, 78), dtype=float32), but it was called on an input with incompatible shape (101, 78).
I would have thought my shape definition was correct, so why am I getting this warning? Perhaps I'm misunderstanding how to use the framework.
X_train, X_test, y_train, y_test = train_test_split(df_x, series_y, random_state=1, test_size=0.5)
numpy_x_train = X_train.to_numpy()
numpy_y_train = y_train.to_numpy()
numpy_x_test = X_test.to_numpy()
shape_x = len(X_train)
shape_y = len(X_train.columns)
inputs = keras.Input(shape=(shape_x, shape_y))
x = Rescaling(scale=1.0 / 255)(inputs)
num_classes = 1
outputs = layers.Dense(num_classes, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
processed_data = model(numpy_x_train)
optimiser = keras.optimizers.RMSprop(learning_rate=1e-3)
loss = keras.losses.CategoricalCrossentropy()
model.compile(optimizer=optimiser, loss=loss)
history = model.fit(numpy_x_train, numpy_y_train, batch_size=32, epochs=10)
Here is a full example based on your problem. Your data is 2D, so in the input layer you only specify the feature dimension, not the sample dimension. You are also carrying out a binary classification task, so the best choice is a final dense layer with 1 unit, a sigmoid activation function, and binary cross-entropy as the loss. The predicted class is 1 if the probability is > 0.5, otherwise 0.
from tensorflow import keras
import numpy as np
# create dummy data
train_size, test_size = 101, 101
n_features = 78
num_classes = 2 # 0 or 1
numpy_x_train = np.random.uniform(0,256, (train_size,n_features))
numpy_y_train = np.random.randint(0,num_classes,train_size)
numpy_x_test = np.random.uniform(0,256, (test_size,n_features))
numpy_y_test = np.random.randint(0,num_classes,test_size)
# rescaling data
numpy_x_train = numpy_x_train / 255
numpy_x_test = numpy_x_test / 255
# define model
inputs = keras.layers.Input(shape=(numpy_x_train.shape[1],))
outputs = keras.layers.Dense(1, activation="sigmoid")(inputs)
optimiser = keras.optimizers.RMSprop(learning_rate=1e-3)
loss = keras.losses.BinaryCrossentropy()
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=optimiser, loss=loss)
history = model.fit(numpy_x_train, numpy_y_train, batch_size=32, epochs=10)
# get test predictions
test_prob = model.predict(numpy_x_test).ravel()
test_class = (test_prob>0.5)+0 # if prob > 0.5 is 1 else is 0
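One reason a sigmoid fits a single output unit better than a softmax: a softmax normalises across the units of one sample, so with only one unit it always returns 1.0. A small NumPy sketch of that (the helper functions and the example logits are mine, for illustration):

```python
import numpy as np

def softmax(z):
    # normalise across the last axis (the output units of each sample)
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([[-3.0], [0.0], [4.0]])  # one output unit per sample
print(softmax(logits).ravel())             # every entry is 1.0
probs = sigmoid(logits).ravel()            # values between 0 and 1
classes = (probs > 0.5).astype(int)        # threshold at 0.5
print(classes)                             # [0 0 1]
```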
I am constructing an LSTM predictor with Keras. My input array is historical price data. I segment the data into blocks of window_size in order to predict prediction_length steps ahead. My data is a list of 4246 floating point numbers. I separate it into 4055 arrays, each of length 168, in order to predict 24 units ahead.
This gives me an x_train set with dimension (4055,168). I then scale my data and try to fit the data but run into a dimension error.
df = pd.DataFrame(data)
print(f"Len of df: {len(df)}")
min_max_scaler = MinMaxScaler()
H = 24
window_size = 7*H
num_pred_blocks = len(df)-window_size-H+1
x_train = []
y_train = []
for i in range(num_pred_blocks):
    x_train_block = df['C'][i:(i + window_size)]
    x_train.append(x_train_block)
    y_train_block = df['C'][(i + window_size):(i + window_size + H)]
    y_train.append(y_train_block)
LEN = int(len(x_train)*window_size)
x_train = min_max_scaler.fit_transform(x_train)
batch_size = 1
def build_model():
    model = Sequential()
    model.add(LSTM(input_shape=(window_size,batch_size),
                   return_sequences=True,
                   units=num_pred_blocks))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model
num_epochs = epochs
model= build_model()
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
The error returned is:
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 4055 arrays: [array([[0.00630006],
Am I not segmenting correctly? Loading correctly? Should the number of units be different than the number of prediction blocks? I appreciate any help. Thanks.
Edit
The suggestion to convert them to NumPy arrays is correct, although MinMaxScaler() already returns a NumPy array. I reshaped the arrays into the proper dimensions, but now my computer is hitting a CUDA memory error. I consider the problem solved. Thank you.
df = pd.DataFrame(data)
min_max_scaler = MinMaxScaler()
H = prediction_length
window_size = 7*H
num_pred_blocks = len(df)-window_size-H+1
x_train = []
y_train = []
for i in range(num_pred_blocks):
    x_train_block = df['C'][i:(i + window_size)].values
    x_train.append(x_train_block)
    y_train_block = df['C'][(i + window_size):(i + window_size + H)].values
    y_train.append(y_train_block)
x_train = min_max_scaler.fit_transform(x_train)
y_train = min_max_scaler.fit_transform(y_train)
x_train = np.reshape(x_train, (len(x_train), 1, window_size))
y_train = np.reshape(y_train, (len(y_train), 1, H))
batch_size = 1
def build_model():
    model = Sequential()
    model.add(LSTM(batch_input_shape=(batch_size, 1, window_size),
                   return_sequences=True,
                   units=100))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model
num_epochs = epochs
model = build_model()
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
I don't think you passed the batch size in the model.
input_shape=(window_size, batch_size) describes the data dimensions, which is the right idea, but you should write input_shape=(window_size, 1).
If you want to fix the batch size, you have to add another dimension, like this: LSTM(n_neurons, batch_input_shape=(n_batch, X.shape[1], X.shape[2])) (cited from the Keras documentation).
In your case:
def build_model():
    model = Sequential()
    # a shape that includes the batch dimension goes in batch_input_shape
    model.add(LSTM(batch_input_shape=(batch_size, 1, window_size),
                   return_sequences=True,
                   units=num_pred_blocks))
    model.add(TimeDistributed(Dense(H)))
    model.add(Activation("linear"))
    model.compile(loss="mse", optimizer="rmsprop")
    return model
You also need to use np.reshape to change the dimensions of your data; it should be (batch_dim, data_dim_1, data_dim_2). I use numpy, so numpy.reshape() will work.
First, your data should be row-wise, so each row has a shape of (1, 168); then add the batch dimension, giving (batch_n, 1, 168).
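As a rough sketch of that reshape, assuming a stand-in array of 10 windows instead of the real 4055 (the variable names are mine):

```python
import numpy as np

window_size = 168
# stand-in for the scaled x_train: 10 windows of 168 values each
windows = np.arange(10 * window_size, dtype=float).reshape(10, window_size)

# layout described above: (samples, 1 timestep, 168 features)
as_one_step = windows.reshape(len(windows), 1, window_size)
# alternative layout: (samples, 168 timesteps, 1 feature)
as_sequence = windows.reshape(len(windows), window_size, 1)

print(as_one_step.shape)  # (10, 1, 168)
print(as_sequence.shape)  # (10, 168, 1)
```

Both layouts hold the same numbers. In the first, the LSTM sees the whole window as a single timestep; in the second, it steps through the window one value at a time, and the layer's input shape would be (window_size, 1) instead. Which one suits the data is a modelling choice.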
Hope this helps.
That's probably because x_train and y_train were not updated to numpy arrays. Take a closer look at this issue on github.
model = build_model()
x_train, y_train = np.array(x_train), np.array(y_train)
model.fit(x_train, y_train, batch_size = batch_size, epochs = 50)
I am learning how to set up the RNN-LSTM network for prediction. I have created the dataset with one input variable.
x y
1 2.5
2 6
3 8.6
4 11.2
5 13.8
6 16.4
...
With the following Python code, I have created windowed data such as [x(t-2), x(t-1), x(t)] to predict [y(t)]:
df= pd.read_excel('dataset.xlsx')
# split a univariate dataset into train/test sets
def split_dataset(data):
    train, test = data[:-328], data[-328:-6]
    return train, test
train, test = split_dataset(df.values)
# scale train and test data
def scale(train, test):
    # fit scaler
    scaler = MinMaxScaler(feature_range=(0,1))
    scaler = scaler.fit(train)
    # transform train
    #train = train.reshape(train.shape[0], train.shape[1])
    train_scaled = scaler.transform(train)
    # transform test
    #test = test.reshape(test.shape[0], test.shape[1])
    test_scaled = scaler.transform(test)
    return scaler, train_scaled, test_scaled
scaler, train_scaled, test_scaled = scale(train, test)
def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            x_input = data[in_start:in_end, 0]
            x_input = x_input.reshape((len(x_input), 1))
            X.append(x_input)
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return np.array(X), np.array(y)
train_x, train_y = to_supervised(train_scaled, n_input = 3, n_out = 1)
test_x, test_y = to_supervised(test_scaled, n_input = 3, n_out = 1)
verbose, epochs, batch_size = 0, 20, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
model = Sequential()
model.add(LSTM(200, return_sequences= False, input_shape = (train_x.shape[1],train_x.shape[2])))
model.add(Dense(1))
model.compile(loss = 'mse', optimizer = 'adam')
history = model.fit(train_x, train_y, epochs=epochs, verbose=verbose, validation_data = (test_x, test_y))
However, I have other questions about this:
Q1: What is the meaning of units in LSTM? [model.add(LSTM(units, ...))]
(I have tried different units for the model, it would be more accurate as units increased.)
Q2: How many layers should I set?
Q3: How can I predict multiple steps? E.g. based on (x(t), x(t-1)), predict y(t), y(t+1). I tried setting n_out = 2 in the to_supervised function, but when I applied the same method, it raised an error:
train_x, train_y = to_supervised(train_scaled, n_input = 3, n_out = 2)
test_x, test_y = to_supervised(test_scaled, n_input = 3, n_out = 2)
verbose, epochs, batch_size = 0, 20, 16
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
model = Sequential()
model.add(LSTM(200, return_sequences= False, input_shape = (train_x.shape[1],train_x.shape[2])))
model.add(Dense(1))
model.compile(loss = 'mse', optimizer = 'adam')
history = model.fit(train_x, train_y, epochs=epochs, verbose=verbose, validation_data = (test_x, test_y))
ValueError: Error when checking target: expected dense_27 to have shape (1,) but got array with shape (2,)
Q3(cont): What should I add or change in the model setting?
Q3(cont): What is return_sequences? When should I set it to True?
Q1. Units in LSTM is the number of neurons in your LSTM layer.
Q2. That depends on your model / data. Try changing them around to see the effect.
Q3. That depends on which approach you take.
Q4. Ideally you'll want to predict a single time step at a time.
It is possible to predict several at once, but in my experience you will get better results with the approach described below,
e.g.
use y(t-1), y(t) to predict y_hat(t+1)
THEN
use y(t), y_hat(t+1) to predict y_hat(t+2)
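A minimal sketch of this feed-the-prediction-back loop. The names recursive_forecast and toy_model are hypothetical; toy_model is a stand-in for a trained one-step predictor (for a Keras model it would wrap model.predict on a suitably reshaped window).

```python
from collections import deque

def recursive_forecast(predict_one_step, history, n_steps, window=2):
    """Forecast n_steps ahead by feeding each prediction back in as input."""
    buffer = deque(history[-window:], maxlen=window)
    forecasts = []
    for _ in range(n_steps):
        y_hat = predict_one_step(list(buffer))
        forecasts.append(y_hat)
        buffer.append(y_hat)  # the oldest value drops out automatically
    return forecasts

def toy_model(window_values):
    # stand-in for a trained model: linear extrapolation from the last two values
    return 2 * window_values[-1] - window_values[-2]

print(recursive_forecast(toy_model, [1.0, 2.0, 3.0], n_steps=3))  # [4.0, 5.0, 6.0]
```

Because each step consumes earlier predictions, errors can compound as the horizon grows.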
Are you sure you're actually using X to predict Y in this case?
What do train x/y and test x/y look like?
Re Q1: It is the number of LSTM cells (=LSTM units), which consist of several neurons themselves but have (in the standard case as given) only one output each. Thus, the number of units corresponds directly to the dimensionality of your output.
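To see how units maps to the output dimensionality, here is a toy LSTM forward pass in NumPy. It is an untrained sketch with random weights for illustration only, not the Keras implementation; the function lstm_forward and its arguments are my own naming. It also shows what return_sequences changes: the output for every timestep versus only the last one.

```python
import numpy as np

def lstm_forward(x, units, return_sequences=False, seed=0):
    """Run a randomly initialised LSTM over x of shape (batch, time, features)."""
    rng = np.random.default_rng(seed)
    batch, steps, features = x.shape
    # one weight block per gate: input, forget, cell candidate, output
    W = rng.normal(scale=0.1, size=(features, 4 * units))
    U = rng.normal(scale=0.1, size=(units, 4 * units))
    b = np.zeros(4 * units)
    h = np.zeros((batch, units))  # hidden state, one value per unit
    c = np.zeros((batch, units))  # cell state, one value per unit

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    outputs = []
    for t in range(steps):
        z = x[:, t, :] @ W + h @ U + b
        i, f, g, o = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    # all timesteps, or only the final hidden state
    return np.stack(outputs, axis=1) if return_sequences else h

x = np.zeros((16, 3, 1))  # 16 samples, 3 timesteps, 1 feature
print(lstm_forward(x, units=200).shape)                         # (16, 200)
print(lstm_forward(x, units=200, return_sequences=True).shape)  # (16, 3, 200)
```

A following Dense layer then maps the 200 values per sample to however many targets are needed. Returning the full sequence is typically wanted when another recurrent layer or a per-timestep output layer comes next.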
I am using this Kaggle guide to do time series forecasting (sample data attached).
Here's the code:
def create_dataset(dataset, window_size = 1):
    data_X, data_Y = [], []
    for i in range(len(dataset) - window_size - 1):
        a = dataset[i:(i + window_size), 0]
        data_X.append(a)
        data_Y.append(dataset[i + window_size, 0])
    return(np.array(data_X), np.array(data_Y))
def fit_model(train_X, train_Y, window_size = 1):
    model = Sequential()
    model.add(LSTM(4,
                   input_shape = (1, window_size)))
    model.add(Dense(1))
    model.compile(loss = "mean_squared_error",
                  optimizer = "adam")
    model.fit(train_X,
              train_Y,
              epochs = 100,
              batch_size = 1,
              verbose = 0)
    return(model)
def predict_and_score(model, X, Y):
    # Make predictions on the original scale of the data.
    pred = MinMaxScaler(feature_range = (0,1)).inverse_transform(model.predict(X))
    # Prepare Y data to also be on the original scale for interpretability.
    orig_data = MinMaxScaler(feature_range = (0,1)).inverse_transform([Y])
    # Calculate RMSE.
    score = math.sqrt(mean_squared_error(orig_data[0], pred[:, 0]))
    return(score, pred)
This entire thing is being used in the following function:
def nnet(time_series, window_size=1):
    cmi_total_raw = vstack((time_series.values.astype('float32')))
    scaler = MinMaxScaler(feature_range = (0,1))
    cmi_total_scaled = scaler.fit_transform(cmi_total_raw)
    cmi_train_sc = (cmi_total_scaled[0:int(cmi_split*len(cmi_total_scaled))])
    cmi_test_sc = cmi_total_scaled[int(cmi_split*len(cmi_total_scaled)) : len(cmi_total_scaled)]
    # Create test and training sets for one-step-ahead regression.
    window_size = 1
    train_X, train_Y = create_dataset(cmi_train_sc, window_size)
    test_X, test_Y = create_dataset(cmi_test_sc, window_size)
    # Reshape the input data into appropriate form for Keras.
    train_X = np.reshape(train_X, (train_X.shape[0], 1, train_X.shape[1]))
    test_X = np.reshape(test_X, (test_X.shape[0], 1, test_X.shape[1]))
    model = fit_model(train_X, train_Y, window_size)
    rmse_train, train_predict = predict_and_score(model, train_X, train_Y)
    mape_test, test_predict = predict_and_score(model, test_X, test_Y)
    return (mape_test, test_predict)
As far as I understand, it creates a model from the training data, predicts on the in-sample test set, and finally calculates the error.
The input data has 209 rows and I want to predict the next row(s).
Here's what I tried:
Since the same thing is done in Auto-Arima using forecast(steps= n_steps) method, I looked for something similar in Keras.
From Keras documentation:
predict(x, batch_size=None, verbose=0, steps=None)
Arguments:
x: The input data, as a Numpy array (or list of Numpy arrays if the model has multiple inputs).
steps: Total number of steps (batches of samples) before declaring the prediction round finished. Ignored with the default value of None.
I tried changing steps and it predicted absurd values of the order of 100,000. Moreover, the length of test_predict was nowhere near the steps I gave. So I assume steps means something else here.
Question
- Can Keras even be used to forecast time series data (out of sample)?
- If yes, is there a forecast method, just as there is the aforementioned predict method?
- If no, can the existing predict method be used in any way to get an out-of-sample forecast?
Sample data (cmi_total):
2014-05-25 272.459887
2014-06-01 272.446022
2014-06-08 330.301260
2014-06-15 656.838394
2014-06-22 670.575110