I am working on an LSTM project for learning purposes, using time-series data that has 3 columns [current, sma, target], where sma is the simple moving average. I extracted these values from the dataframe like so:
data = df[['current', 'sma', 'target']].values
# normalize the data
scaler = MinMaxScaler(feature_range=(0,1))
dataset = scaler.fit_transform(data)
# then split inputs from targets
X = dataset[:, :2]
y = dataset[:, 2]
# split into xtrain ytrain xtest ytest
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
Everything works fine so far and I understand it, but the uncharted territory for me is converting the X_* and y_* arrays into 3-D arrays to feed the model. I am using a simple model just to make this work; I am not looking for impressive results, this is purely educational.
The model that I will use:
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(128, input_shape=(timesteps, features), return_sequences=True))
model.add(tf.keras.layers.LSTM(64, return_sequences=False))
model.add(tf.keras.layers.Dense(features))
model.compile(loss='mean_squared_error', optimizer='adam')
How do I reshape the data to feed it to the model?
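One common way to get the 3-D shape (samples, timesteps, features) that the LSTM expects is to slide a window over the rows; a minimal sketch, where timesteps = 10 is an assumed window length rather than something taken from the question:
import numpy as np

timesteps = 10           # assumed window length, tune as needed
features = X.shape[1]    # 2: current and sma

def make_windows(X, y, timesteps):
    Xs, ys = [], []
    for i in range(len(X) - timesteps):
        Xs.append(X[i:i + timesteps])   # one window of shape (timesteps, features)
        ys.append(y[i + timesteps])     # target that follows the window
    return np.array(Xs), np.array(ys)

X_win, y_win = make_windows(X, y, timesteps)
print(X_win.shape)   # (samples, timesteps, features), ready for the LSTM
For time-series data the windows are usually built before the train/test split, and the split is typically done without shuffling (train_test_split(..., shuffle=False)) so that the model never trains on values that come after its test windows.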
Hello everyone, I have a question: is it sensible, and is it possible, to implement a time-series forecasting model using TensorFlow and a Long Short-Term Memory (LSTM) neural network to predict the monthly sales of a retail company?
Here is an example of how I am trying to train my model. Is this the correct way of doing it, or should I be doing something else? Any hints, tips, or help would be greatly appreciated.
"""My code uses the TensorFlow library to create an LSTM neural network that predicts the monthly sales of a retail company. The data is loaded from a CSV file, normalized using the MinMaxScaler, split into training and testing sets, and prepared for the model. The LSTM model is then built, trained, and evaluated on the test data."""
import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
# Load the retail sales data
sales_data = pd.read_csv("sales.csv")
# Normalize the data using MinMaxScaler
scaler = MinMaxScaler()
sales_data = scaler.fit_transform(sales_data)
# Split the data into training and testing sets
train_data = sales_data[:int(sales_data.shape[0]*0.8),:]
test_data = sales_data[int(sales_data.shape[0]*0.8):,:]
# Create the input data for the model
def create_input_data(data, window_size=12):
    X = []
    y = []
    for i in range(data.shape[0] - window_size):
        X.append(data[i:i+window_size, 0])
        y.append(data[i+window_size, 0])
    return np.array(X), np.array(y)
X_train, y_train = create_input_data(train_data)
X_test, y_test = create_input_data(test_data)
# Build the LSTM model
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(units=50, input_shape=(12, 1)))
model.add(tf.keras.layers.Dense(units=1))
model.compile(optimizer="adam", loss="mean_squared_error")
# Train the model
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
model.fit(X_train, y_train, epochs=100, batch_size=32)
# Evaluate the model on the test data
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
test_loss = model.evaluate(X_test, y_test)
print("Test Loss:", test_loss)
I have not tested my model yet because I am uncertain whether I am going about this the right way.
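Once the model is trained, one quick sanity check is to undo the scaling and compare a few predictions with the actual values; a minimal sketch, assuming sales.csv holds a single monthly-sales column so the scaler can be inverted directly:
predictions = model.predict(X_test)                   # scaled predictions, shape (n, 1)
predictions = scaler.inverse_transform(predictions)   # back to original sales units
actuals = scaler.inverse_transform(y_test.reshape(-1, 1))

for predicted, actual in zip(predictions[:5], actuals[:5]):
    print("predicted: %.1f  actual: %.1f" % (predicted[0], actual[0]))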
I have a table with 1799 users and 31 features, arranged in rows and columns respectively. The last column is a two-class condition feature that tells the model which condition each user belongs to. I understood that to use an LSTM I need to make my input 3-D, so I used reshape(31, 1) since I don't have time-series data. I also understood that input_shape takes the number of features.
My issue is that I want to predict on a new set of users who have the same 30 features and get a classification result telling me which condition each user belongs to; it would be better if the result also gave the probability of each predicted condition. I tried to use model.predict for this, and it returned a numpy array predict_prob with shape (200, 31, 1). I am confused: the data structure should be (31 x 1) x 200 and the output should be the users' conditions, i.e. shape (200,). Why is the result 3-D, and how should I convert it to a dataframe so that I can save it in .csv format? Thank you in advance.
X = raw_data[feature_names]
P = predict_data_raw[feature_names]
P1 = predict_data_raw[feature_names1]
#Training
y = raw_data['Conditions']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=22, test_size=0.1)
X_test = np.expand_dims(X_test, axis=2)
# fit and evaluate a model
model = Sequential()
model.add(Reshape((31,1)))
model.add(Bidirectional(LSTM(10, return_sequences=True),input_shape=(31,)))
model.add(Dropout(0.5))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
history = model.fit(X_train, y_train, epochs=5, batch_size=10)
# evaluate the keras model
_, accuracy = model.evaluate(X_test, y_test)
print('Accuracy: %.2f' % (accuracy*100))
predict_prob=model.predict([X_test])
df = pd.DataFrame(predict_prob, columns=["Prediction"])
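The (200, 31, 1) shape comes from return_sequences=True: the Dense layers are then applied to each of the 31 timesteps, so you get one prediction per timestep per user. Below is a minimal sketch of one way to get a single probability per user; it reworks the question's model and is not the original code:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Reshape, Bidirectional, LSTM, Dropout, Dense
import pandas as pd

model = Sequential()
model.add(Reshape((31, 1)))
model.add(Bidirectional(LSTM(10, return_sequences=False)))  # keep only the final timestep -> 2-D output
model.add(Dropout(0.5))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=10)

predict_prob = model.predict(X_test)   # shape (n_users, 1): probability of condition 1
df = pd.DataFrame(predict_prob.ravel(), columns=["Prediction"])
df.to_csv("predictions.csv", index=False)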
I want to make predictions on an entire test set. Here the test set is only 20% of datasetA, which I understand is because it is only used for training purposes. When I save the weights and then make predictions on another dataset, datasetB, will datasetB also be split into a test set?
How can I make predictions on the entire datasetB using the weights learned from datasetA?
Thanks.
# Independent variables:
X = dataset.iloc[:, :-1].values
# Dependent Variable:
y = dataset.iloc[:, -1].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Initialising the ANN
classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 27, kernel_initializer = 'uniform', activation = 'relu', input_dim = 6))
# Adding the second hidden layer
classifier.add(Dense(units = 27, kernel_initializer = 'uniform', activation = 'relu'))
# Adding the output layer
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# Fitting the ANN to the Training set
classifier.fit(X_train, y_train, batch_size = 10, epochs = 20)
#making predictions on test data
classifier.predict(X_test)
If I am understanding correctly, you want to use your trained model on a completely new dataset?
Keras provides several ways to do this, but I think the most common one would be to export your trained model into an .h5 file using the command
model.save("filepath/model.h5")
Now you can load and use your model wherever you want using the commands
model = tf.keras.models.load_model("filepath/model.h5")
score = model.evaluate(X, Y)
where X is the feature columns of dataset B and Y is the response, to get your score. If dataset B is available in the same session, you can always just use
model.predict(X)
where X is now the feature columns of dataset B.
From what I understand, you are asking two questions here:
First, the splitting of a dataset into a train and test set is something you do manually, in the line
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0).
If, when you use your "dataset B", you want to test your classifier on ALL the data points of "dataset B", you do not have to do this train test split, and can simply pass the X values of "dataset B" to your classifier.
As for how to do this, as per your second question, it is the same as what you have already done with "dataset A"'s test set:
classifier.predict(X) will make predictions using the fit it already learned on "dataset A", assuming you do not recompile or call .fit() again.
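As a concrete sketch of that second point (dataset_b is an assumed name for a DataFrame holding dataset B with the same feature columns as dataset A; it is not from the question):
X_b = dataset_b.iloc[:, :-1].values
X_b = sc.transform(X_b)                          # reuse the scaler fitted on dataset A's training data
probabilities = classifier.predict(X_b)          # sigmoid outputs in [0, 1]
predictions = (probabilities > 0.5).astype(int)  # threshold into class labels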
I have a problem involving airfoil velocity and pressure prediction, given AOA, x, y. I'm using Keras with an MLP. I have 3 inputs (AOA, x, y) and I have to predict 3 outputs (u, v, p). I initially had code that outputs the MSE loss as a single value. I then modified the code so that I get an MSE for each output. However, the average MSE of the 3 outputs (u_mean_squared_error: 73.63%, v_mean_squared_error: 1.13%, p_mean_squared_error: 2.16%) does not equal the earlier single MSE loss (mean_squared_error: 5.81%). Hence, I'm wondering if my new code is wrong, or whether I'm doing this the right way. Can someone help?
Old code:
# load the airfoil dataset
dataset = numpy.loadtxt("S1020_data.csv", delimiter=",")
# split into input and output variables
X = dataset[:,0:3]
Y = dataset[:,3:6]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
# create model
input_data = layers.Input(shape=(3,))
#create the layers and pass them the input tensor to get the output tensor:
hidden1Out = Dense(units=12, activation='relu')(input_data)
hidden2Out = Dense(units=8, activation='relu')(hidden1Out)
finalOut = Dense(units=3, activation='relu')(hidden2Out)
#define the model's start and end points
model = Model(input_data, finalOut)
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mean_squared_error'])
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test,y_test), epochs=10, batch_size=1000)
# evaluate the model
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
New code:
# load the airfoil dataset
dataset = numpy.loadtxt("S1020_data.csv", delimiter=",")
# split into input and output variables
X = dataset[:,0:3]
Y = dataset[:,3:6]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
# create model
input_data = layers.Input(shape=(3,))
#create the layers and pass them the input tensor to get the output tensor:
hidden1Out = Dense(units=12, activation='relu')(input_data)
hidden2Out = Dense(units=8, activation='relu')(hidden1Out)
u_out = Dense(1, activation='relu', name='u')(hidden2Out)
v_out = Dense(1, activation='relu', name='v')(hidden2Out)
p_out = Dense(1, activation='relu', name='p')(hidden2Out)
#define the model's start and end points
model = Model(input_data,outputs = [u_out, v_out, p_out])
# Compile model
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mean_squared_error'])
# Fit the model
model.fit(X_train, [y_train[:,0], y_train[:,1], y_train[:,2]], validation_data=(X_test,[y_test[:,0], y_test[:,1], y_test[:,2]]), epochs=10, batch_size=1000)
# evaluate the model
scores = model.evaluate(X, [Y[:,0], Y[:,1], Y[:,2]])
for i in range(7):
print("\n%s: %.2f%%" % (model.metrics_names[i], scores[i]*100))
I think the difference comes from how the loss and the metrics are computed.
In your old code there is a single 3-unit output, so the mean_squared_error loss (and the reported metric) is the average of the squared errors over all three components:
((u_true - u_pred)^2 + (v_true - v_pred)^2 + (p_true - p_pred)^2) / 3
which penalizes the squared 2-norm of the [u_pred, v_pred, p_pred] error vector as a whole.
In the new code, each of u, v and p has its own output with its own mean_squared_error, and the total loss that is optimized is the sum of the three per-output MSEs (each with a default weight of 1):
(u_true - u_pred)^2 + (v_true - v_pred)^2 + (p_true - p_pred)^2
So the new model reports three separate MSEs instead of one averaged value, and since the two models are separate training runs there is no reason the average of the three per-output MSEs should match the single MSE from the earlier run.
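To see the bookkeeping concretely, here is a small synthetic check (the numbers are made up, not from the question's data):
import numpy as np

y_true = np.array([[1.0, 2.0, 3.0]])   # one sample of [u, v, p]
y_pred = np.array([[1.5, 1.0, 2.0]])

sq_err = (y_true - y_pred) ** 2                 # per-component squared errors
single_output_mse = sq_err.mean()               # old model: one combined MSE -> 0.75
per_output_mse = sq_err.mean(axis=0)            # new model's metrics: [0.25, 1.0, 1.0]
total_multi_output_loss = per_output_mse.sum()  # new model's training loss -> 2.25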
I want to simultaneously augment an X (500, 28, 28, 1) and Y (500, 28, 28, 1) image set in Keras and store the results in an array for visualizing them (before I train a network). The output y is not a label but an image.
I copy X_train into y_train (MNIST dataset) and want to apply the same effects to both x and y for training a network. However, I am unable to apply the transformation to both X and y; I am getting ZCA on X only. My code is:
import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.preprocessing.image import ImageDataGenerator

(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
X_train = X_train.astype('float32')
y_train = X_train
datagen = ImageDataGenerator(zca_whitening=True)
datagen.fit(X_train)
datagen.fit(y_train)
training_set = datagen.flow(X_train, y_train, batch_size=100)
temp = np.asarray(training_set[0])
temp[0, ...] has ZCA applied, whereas temp[1, ...] shows no effect.
Passing pairs of X_train, y_train and X_test, y_test as arguments to datagen's flow method keeps each image paired with its target. Here's an example:
datagen = ImageDataGenerator(zca_whitening=True)
datagen.fit(X_train) # to compute quantities required for featurewise normalization
training_set = datagen.flow(X_train, y_train, batch_size=100)
test_set = datagen.flow(X_test, y_test, batch_size=100)
classifier.fit_generator(training_set, validation_data=test_set, epochs=100)
This keeps the input X and the corresponding ground-truth images Y aligned and shuffled together for training the neural network. Note, however, that flow applies the transformations (including ZCA whitening) only to X, which matches what you observed: y passes through untouched.
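If the target y must receive the same preprocessing as X, as in this image-to-image setup, one common pattern (a sketch on my part, not part of the original answer) is to fit two generators and flow them with the same seed, then zip the two streams:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

seed = 1  # shared seed keeps the two generators in sync
x_datagen = ImageDataGenerator(zca_whitening=True)
y_datagen = ImageDataGenerator(zca_whitening=True)
x_datagen.fit(X_train, seed=seed)
y_datagen.fit(y_train, seed=seed)

x_gen = x_datagen.flow(X_train, batch_size=100, seed=seed)
y_gen = y_datagen.flow(y_train, batch_size=100, seed=seed)
training_set = zip(x_gen, y_gen)  # yields (augmented X batch, augmented y batch) pairs

x_batch, y_batch = next(training_set)  # both batches now have ZCA applied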
Hope this helps!