LSTM model on a 3-class label as a classification problem - Python

My problem is to predict an output that has 3 class labels.
Say I have 20,000 samples in my dataset, each associated with a label (0, 1, 2).
Since this is a multiclass classification problem:
Can I give only the labels (0, 1, 2) as input to the network and get predictions based on those labels?
Will the data fed to the network be sufficient to learn and predict the output?
Please help me with your inputs.
# Below is the code
X_train, X_test, y_train, y_test = train_test_split(values_train[:, 0],
                                                    values_train[:, 1],
                                                    test_size=0.25,
                                                    random_state=42)
print(" X Training Set size is",X_train.shape )
print(" y Training Set size is",y_train.shape )
print(" X Test Set size is",X_test.shape)
print(" y Test Set size is",y_test.shape )
'X Training Set size is (165081,)'
'y Training Set size is (165081,)'
'X Test Set size is (55028,)'
'y Test Set size is (55028,)'
# convert to LSTM friendly format
X_train = X_train.reshape(len(X_train), 1, 1)
X_test = X_test.reshape(len(X_test), 1, 1)
print(X_train.shape, X_test.shape)
(165081, 1, 1) (55028, 1, 1)
# configure network
n_batch = 1
n_epoch = 100
n_neurons = 10
from keras.optimizers import SGD
opt = SGD(lr=0.01)
# design network
model = Sequential()
model.add(LSTM(n_neurons, batch_input_shape=(n_batch, X_train.shape[1],
                                             X_train.shape[2]),
               stateful=True))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
# fit network
for i in range(n_epoch):
    model.fit(X_train, y_train, validation_data=(X_test, y_test),
              epochs=1, batch_size=n_batch, verbose=1, shuffle=False)
    model.reset_states()
df_actual = []
dp_predict = []
for i in range(len(X_test)):
    testX, testy = X_test[i], y_test[i]
    testX = testX.reshape(1, 1, 1)
    yhat = model.predict(testX, batch_size=1)
    df_actual.append(testy)
    dp_predict.append(yhat)
    print('>Actual=%.1f, Predicted=%.1f' % (testy, yhat))
I am not able to get correct predictions from this model.
Update:
Please find below the validation accuracy and training accuracy along with the loss:
Train on 154076 samples, validate on 66033 samples
Epoch 1/5
154076/154076 [==============================] - 289s 2ms/step - loss: 1.0033 - accuracy: 0.3816 - val_loss: 1.0018 - val_accuracy: 0.4286
Epoch 2/5
154076/154076 [==============================] - 291s 2ms/step - loss: 1.0021 - accuracy: 0.3817 - val_loss: 1.0020 - val_accuracy: 0.4286
Epoch 3/5
154076/154076 [==============================] - 293s 2ms/step - loss: 1.0018 - accuracy: 0.3804 - val_loss: 1.0014 - val_accuracy: 0.4286
Epoch 4/5
154076/154076 [==============================] - 290s 2ms/step - loss: 1.0016 - accuracy: 0.3812 - val_loss: 1.0012 - val_accuracy: 0.4286
Epoch 5/5
154076/154076 [==============================] - 290s 2ms/step - loss: 1.0015 - accuracy: 0.3814 - val_loss: 1.0012 - val_accuracy: 0.4286
Can anyone suggest what could be improved?
Note: I have normalized the input data with MinMaxScaler and used the scaled data, but there is no change in the output.

Class labels are of categorical type, and with categorical_crossentropy the network can't learn from raw integer labels: that loss expects one-hot encoded targets. You have to one-hot encode the labels, e.g. with keras.utils.to_categorical:
x = values_train[:, 0]
y = values_train[:, 1]
y = keras.utils.to_categorical(y)
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=42)
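If you'd rather keep the integer labels 0, 1, 2, a minimal alternative sketch (my addition, not part of the original answer) is to switch to a sparse loss, which accepts integer targets directly:
# same model and optimizer as in the question; only the loss changes
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
# y_train and y_test then stay as plain integer arrays, no to_categorical needed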

Related

ValueError: Shapes (None, 1) and (None, 64) are incompatible

I'm currently trying to build a classification model in Keras, but I keep getting a shape error. This is my model right now. Is there anything I am doing wrong?
predictors=["Length", "Diameter", "Height", "Shucked weight", "Viscera weight", "Shell weight", "Rings"]
x_train, x_test, y_train, y_test =train_test_split(db[predictors], db["Sex"], test_size=.2)
x_train= x_train.to_numpy()
x_test = x_test.to_numpy()
y_train = y_train.to_numpy()
y_test = y_test.to_numpy()
model = models.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(7,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(64, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'],
              )
x_val = x_train[:1000]
partial_x_train = x_train[1000:]
y_val = y_train[:1000]
partial_y_train = y_train[1000:]
partial_x_train.shape
history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))
ValueError: Shapes (None, 1) and (None, 64) are incompatible
Data Source https://www.kaggle.com/rodolfomendes/abalone-dataset
The output of the last layer consists of 64 different values, while your labels consist of only 1 value each.
This error occurs because you have 3 classes (labels) in your dataset and you are not defining those in your model's last layer (as mentioned by @subspring):
model = Sequential()
model.add(layers.Dense(64, activation='relu', input_shape=(7,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(3))  # the last Dense layer must match the number of classes (3)
The label data in this dataset is not numeric:
y_train.unique()  # array(['I', 'M', 'F'], dtype=object)
For that, you can use LabelEncoder as below:
from sklearn.preprocessing import LabelEncoder

def Labels(y_train, y_test):
    LabEnc = LabelEncoder()
    LabEnc.fit(y_train)
    Enc_y_train = LabEnc.transform(y_train)
    Enc_y_test = LabEnc.transform(y_test)
    return Enc_y_train, Enc_y_test

y_train, y_test = Labels(y_train, y_test)
y_train  # array([1, 1, 2, ..., 2, 2, 0])
Now train the model, converting the input data (x_train, x_test) into arrays first:
x_train = np.array(x_train)
x_test = np.array(x_test)

# compile the model
model.compile(optimizer='rmsprop',
              loss=tf.keras.losses.MeanSquaredError(),
              metrics=['accuracy'])

x_val = x_train[:1000]
partial_x_train = x_train[1000:]
y_val = y_train[:1000]
partial_y_train = y_train[1000:]
partial_x_train.shape

# train the model
history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=5,
                    batch_size=512,
                    validation_data=(x_val, y_val))
Output:
Epoch 1/5
5/5 [==============================] - 2s 80ms/step - loss: 0.8610 - accuracy: 0.3302 - val_loss: 0.7966 - val_accuracy: 0.2350
Epoch 2/5
5/5 [==============================] - 0s 13ms/step - loss: 0.7997 - accuracy: 0.2563 - val_loss: 0.7491 - val_accuracy: 0.4620
Epoch 3/5
5/5 [==============================] - 0s 16ms/step - loss: 0.7917 - accuracy: 0.3315 - val_loss: 0.7883 - val_accuracy: 0.2680
Epoch 4/5
5/5 [==============================] - 0s 15ms/step - loss: 0.7949 - accuracy: 0.3405 - val_loss: 0.7499 - val_accuracy: 0.3390
Epoch 5/5
5/5 [==============================] - 0s 13ms/step - loss: 0.7884 - accuracy: 0.3306 - val_loss: 0.7605 - val_accuracy: 0.3670
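As a follow-up note: a mean squared error loss on integer class labels is an unusual choice for a 3-class problem. A more conventional compile step (my suggestion, untested on this dataset) keeps the encoded integer labels and uses sparse categorical crossentropy on the logits:
# the Dense(3) layer above has no activation, so its outputs are logits;
# from_logits=True makes the loss apply softmax internally
model.compile(optimizer='rmsprop',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])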

Keras model cannot fit with given data

I'm trying to predict the next number in a sequence.
You can see the data sample in google colab here:
https://colab.research.google.com/drive/1QnkNtIo56V9wdQ4CMTm3LRSQaa6A9VmP?usp=sharing
(51 columns c0-c49 and the last 'y' is the first value from the next row)
The data is scaled with StandardScaler:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_features = df.copy()
features = scaled_features[columns_a]
scaler = StandardScaler().fit(features.values)
features = scaler.transform(features.values)
scaled_features[columns] = features
After that it is split into train and test data:
from sklearn.model_selection import train_test_split
train, test = train_test_split(scaled_features, test_size=0.2, shuffle=False)
and reshaped for LSTM input:
Y_train = train["y"]
X_train = train.drop("y", axis=1)
Y_test = test["y"]
X_test = test.drop("y", axis=1)
X_train = X_train.to_numpy()
X_test = X_test.to_numpy()
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
X_train.shape
Creating the model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Dense, Dropout
from matplotlib import pyplot
model = Sequential()
model.add(LSTM(64, input_shape=(X_train.shape[1], X_train.shape[2]), activation='relu', return_sequences=True))
model.add(LSTM(32, activation='relu'))
model.add(Dense(1))
#model.add(Dropout(0.2))
model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
#print(model.summary())
model.fit(X_train, Y_train, epochs=5, batch_size=32, verbose=2)
scores = model.evaluate(X_test, Y_test, batch_size=32, verbose=0)
print("Model Accuracy: %.2f%%" % (scores[1]*100))
The output:
Epoch 1/5
749/749 - 34s - loss: 1.2380 - accuracy: 0.0000e+00
Epoch 2/5
749/749 - 31s - loss: 1.2382 - accuracy: 0.0000e+00
Epoch 3/5
749/749 - 31s - loss: 1.2381 - accuracy: 0.0000e+00
Epoch 4/5
749/749 - 31s - loss: 1.2385 - accuracy: 0.0000e+00
Epoch 5/5
749/749 - 31s - loss: 1.2384 - accuracy: 0.0000e+00
Model Accuracy: 0.00%
I'm pretty new at machine learning/AI and I don't know what's wrong in the code.
Any ideas? Thank you.

Deep learning model is training on very little data

I'm training a deep learning model on 100,000 rows, with 80% of the data for training and 20% for testing. The data splits correctly; however, my model's training output shows only 2242. Below is the training code with the model and the output. Any help will be highly appreciated.
Training Code:
import time
start_time = time.time()
from sklearn.feature_extraction.text import TfidfVectorizer

tweet_table = cleaning_table(tweet_table)

def tokenization_tweets(dataset, features):
    tokenization = TfidfVectorizer(max_features=features)
    tokenization.fit(dataset)
    dataset_transformed = tokenization.transform(dataset).toarray()
    return dataset_transformed
def splitting(table):
    X_train, X_test, y_train, y_test = train_test_split(table.tweet, table.test, test_size=0.2, shuffle=True)
    return X_train, X_test, y_train, y_test

if __name__ == "__main__":
    tweet_table['test'] = tweet_table['Overall_Sentiment'].apply(lambda x: 1 if x == 'Positive' else (0 if x == 'Negative' else 2))

if __name__ == "__main__":
    X_train, X_test, y_train, y_test = splitting(tweet_table)

#print(tweet_table["test"].value_counts())
#print(tweet_table["Overall_Sentiment"].value_counts())
#print(list(set(y_train)))
#print(list(set(y_test)))
# Create a neural network
# Create the model
def train(X_train_mod, y_train, features, shuffle, drop, layer1, layer2, epoch, lr, epsilon, validation):
    model_nn = Sequential()
    model_nn.add(Dense(layer1, input_shape=(features,), activation='relu'))
    model_nn.add(Dropout(drop))
    model_nn.add(Dense(layer2, activation='sigmoid'))
    model_nn.add(Dropout(drop))
    model_nn.add(Dense(3, activation='softmax'))
    optimizer = keras.optimizers.Adam(lr=lr, beta_1=0.9, beta_2=0.999, epsilon=epsilon, decay=0.0, amsgrad=False)
    model_nn.compile(loss='sparse_categorical_crossentropy',
                     optimizer=optimizer,
                     metrics=['accuracy'])
    model_nn.fit(np.array(X_train_mod), y_train,
                 batch_size=32,
                 epochs=epoch,
                 verbose=1,
                 validation_split=validation,
                 shuffle=shuffle)
    return model_nn

def test(X_test, model_nn):
    prediction = model_nn.predict(X_test)
    return prediction
def model1(X_train, y_train):
    features = 3500
    shuffle = True
    drop = 0.5
    layer1 = 512
    layer2 = 256
    epoch = 5
    lr = 0.001
    epsilon = None
    validation = 0.1
    X_train_mod = tokenization_tweets(X_train, features)
    model = train(X_train_mod, y_train, features, shuffle, drop, layer1, layer2, epoch, lr, epsilon, validation)
    return model

#model1(X_train, y_train)
#model11(X_train, y_train)

def save_model(model):
    # let's assume `model` is the main model
    model_json = model.to_json()
    with open("model.json", "w") as json_file:
        json.dump(model_json, json_file)
    model.save_weights("model_weights.h5")

#print(len(X_train))
#print(len(y_train))
model_final = model1(X_train, y_train)
Output:
Epoch 1/5
2242/2242 [==============================] - 6s 3ms/step - loss: 0.3426 - accuracy: 0.8476 - val_loss: 0.2690 - val_accuracy: 0.8857
Epoch 2/5
2242/2242 [==============================] - 6s 3ms/step - loss: 0.2399 - accuracy: 0.9015 - val_loss: 0.2471 - val_accuracy: 0.8991
Epoch 3/5
2242/2242 [==============================] - 6s 3ms/step - loss: 0.1912 - accuracy: 0.9205 - val_loss: 0.2447 - val_accuracy: 0.9028
Epoch 4/5
2242/2242 [==============================] - 6s 3ms/step - loss: 0.1454 - accuracy: 0.9399 - val_loss: 0.2547 - val_accuracy: 0.9083
Epoch 5/5
2242/2242 [==============================] - 6s 3ms/step - loss: 0.1046 - accuracy: 0.9552 - val_loss: 0.2874 - val_accuracy: 0.9084
--- 192.1562056541443 seconds ---
Many Thanks

Calculating the Accuracy of A Keras Neural Network in Python

I have created a Keras neural network. The neural network was trained for eight epochs, and it outputs this loss value and accuracy:
Epoch 1/8
2009/2009 [==============================] - 0s 177us/step - loss: 0.0824 - acc: 4.9776e-04
Epoch 2/8
2009/2009 [==============================] - 0s 34us/step - loss: 0.0080 - acc: 4.9776e-04
Epoch 3/8
2009/2009 [==============================] - 0s 37us/step - loss: 0.0071 - acc: 4.9776e-04
Epoch 4/8
2009/2009 [==============================] - 0s 38us/step - loss: 0.0071 - acc: 4.9776e-04
Epoch 5/8
2009/2009 [==============================] - 0s 35us/step - loss: 0.0070 - acc: 4.9776e-04
Epoch 6/8
2009/2009 [==============================] - 0s 38us/step - loss: 0.0071 - acc: 4.9776e-04
Epoch 7/8
2009/2009 [==============================] - 0s 36us/step - loss: 0.0068 - acc: 4.9776e-04
Epoch 8/8
2009/2009 [==============================] - 0s 40us/step - loss: 0.0070 - acc: 4.9776e-04
How do I interpret the loss function provided within the output?
Is there any way to find the percentage variation between the actual price and the prediction for every single day in the data set?
Here is the neural network:
import tensorflow as tf
import keras
import numpy as np
#import quandle
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import pandas as pd
import sklearn
import math
import pandas_datareader as web
def func_stock_prediction(stockdata, start, end):
    start = start
    end = end
    df = web.DataReader(stockdata, "yahoo", start, end)
    df = df[['Close']]
    previous = 5

    def create_dataset(df, previous):
        dataX, dataY = [], []
        for i in range(len(df)-previous-1):
            a = df[i:(i+previous), 0]
            dataX.append(a)
            dataY.append(df[i + previous, 0])
        return np.array(dataX), np.array(dataY)
    scaler = sklearn.preprocessing.MinMaxScaler(feature_range=(0, 1))
    df = scaler.fit_transform(df)
    train_size = math.ceil(len(df) * 0.5)
    train, val = df[0:train_size, :], df[train_size:len(df), :]
    X_train, Y_train = create_dataset(train, previous)
    print(X_train)
    print(Y_train)
    print(X_train.shape)
    print(Y_train.shape)
    X_val, Y_val = create_dataset(val, previous)
    X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
    X_val = np.reshape(X_val, (X_val.shape[0], 1, X_val.shape[1]))
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(units=64, activation='relu', input_shape=(1, 5)))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(units=1, activation='linear'))
    model.compile(loss='mean_absolute_error',
                  optimizer='adam',
                  metrics=['accuracy'])
    history = model.fit(X_train, Y_train, epochs=8)
    train = model.predict(X_train)
    val = model.predict(X_val)
    train = scaler.inverse_transform(train)
    Y_train = scaler.inverse_transform([Y_train])
    val = scaler.inverse_transform(val)
    Y_val = scaler.inverse_transform([Y_val])
    predictions = val
    trainPlot = np.empty_like(df)
    trainPlot[:, :] = np.nan
    trainPlot[previous:len(train)+previous, :] = train
    valPlot = np.empty_like(df)
    valPlot[:, :] = np.nan
    valPlot[len(train)+(previous*2)+1:len(df)-1, :] = val
    inversetransform, = plt.plot(scaler.inverse_transform(df))
    train, = plt.plot(trainPlot)
    val, = plt.plot(valPlot)
    plt.xlabel('Number of Days')
    plt.ylabel('Stock Price')
    plt.title("Predicted vs. Actual Stock Price Per Day")
    plt.show()
func_stock_prediction("PLAY", 2010-1-1, 2020-1-1)
You are using accuracy as a metric. Accuracy measures the proportion of predicted labels that match the true labels, and it is used mostly (to my knowledge) for classification tasks. As far as I know, accuracy is not really interpretable when you're predicting a continuous outcome variable.
Based on your code, it looks like you're using the neural network for a regression problem (you're predicting a continuous variable). For regression metrics, people often use mean squared error, root mean squared error, mean absolute error, R^2, etc.
If you're interested in percentage differences, then you could try the Keras loss mean_absolute_percentage_error.
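For instance, a minimal sketch (my adaptation of the compile call above, not from the original answer; 'mape' is the built-in Keras alias for mean_absolute_percentage_error):
# report regression-style metrics instead of accuracy
model.compile(loss='mean_absolute_error',
              optimizer='adam',
              metrics=['mean_squared_error', 'mape'])
history = model.fit(X_train, Y_train, epochs=8)
# history.history['mape'] then holds the average percentage deviation
# between predictions and targets for each epoch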

Neural network in keras not converging

I'm building a simple neural network in Keras, like the following:
# create model
model = Sequential()
model.add(Dense(1000, input_dim=x_train.shape[1], activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
model.compile(loss='mean_squared_error', metrics=['accuracy'], optimizer='RMSprop')
# Fit the model
model.fit(x_train, y_train, epochs=20, batch_size=700, verbose=2)
# evaluate the model
scores = model.evaluate(x_test, y_test, verbose=0)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
The shape of the used data is:
x_train = (49972, 601)
y_train = (49972, 1)
My problem is that the network is not converging; the accuracy is stuck around 0.0168, as below:
Epoch 1/20
- 1s - loss: 3.2222 - acc: 0.0174
Epoch 2/20
- 1s - loss: 3.1757 - acc: 0.0187
Epoch 3/20
- 1s - loss: 3.1731 - acc: 0.0212
Epoch 4/20
- 1s - loss: 3.1721 - acc: 0.0220
Epoch 5/20
- 1s - loss: 3.1716 - acc: 0.0225
Epoch 6/20
- 1s - loss: 3.1711 - acc: 0.0235
Epoch 7/20
- 1s - loss: 3.1698 - acc: 0.0245
Epoch 8/20
- 1s - loss: 3.1690 - acc: 0.0251
Epoch 9/20
- 1s - loss: 3.1686 - acc: 0.0257
Epoch 10/20
- 1s - loss: 3.1679 - acc: 0.0261
Epoch 11/20
- 1s - loss: 3.1674 - acc: 0.0267
Epoch 12/20
- 1s - loss: 3.1667 - acc: 0.0277
Epoch 13/20
- 1s - loss: 3.1656 - acc: 0.0285
Epoch 14/20
- 1s - loss: 3.1653 - acc: 0.0288
Epoch 15/20
- 1s - loss: 3.1653 - acc: 0.0291
I used the sklearn library to build the same structure with the same data, and it works perfectly, showing an accuracy higher than 0.5:
model = Pipeline([
    ('classifier', MLPClassifier(hidden_layer_sizes=(1000), activation='relu',
                                 max_iter=20, verbose=2, batch_size=700, random_state=0))
])
I'm totally sure that I used the same data for both models, and this is how I prepared it:
def load_data():
    le = preprocessing.LabelEncoder()
    with open('_DATA_train.txt', 'rb') as fp:
        train = pickle.load(fp)
    with open('_DATA_test.txt', 'rb') as fp:
        test = pickle.load(fp)
    x_train = train[:, 0:(train.shape[1]-1)]
    y_train = train[:, (train.shape[1]-1)]
    y_train = le.fit_transform(y_train).reshape([-1, 1])
    x_test = test[:, 0:(test.shape[1]-1)]
    y_test = test[:, (test.shape[1]-1)]
    y_test = le.fit_transform(y_test).reshape([-1, 1])
    print(x_train.shape, ' ', y_train.shape)
    print(x_test.shape, ' ', y_test.shape)
    return x_train, y_train, x_test, y_test
What is the problem with the Keras structure?
Edited:
It's a multi-class classification problem: the training labels take values [0, 1, 2, 3].
For a multiclass problem your labels should be one-hot encoded. For example, if the options are [0, 1, 2, 3] and the label is 1, then it should be [0, 1, 0, 0].
Your final layer should be a Dense layer with 4 units and a softmax activation:
model.add(Dense(4, activation='softmax'))
And your loss should be categorical_crossentropy:
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='RMSprop')
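The encoding step itself could look like the following minimal sketch (my addition, applied to the integer labels produced by load_data):
from keras.utils import to_categorical
# turn the integer labels 0..3 into one-hot rows of length 4,
# e.g. label 1 becomes [0, 1, 0, 0]; result shape is (n_samples, 4)
y_train = to_categorical(y_train, num_classes=4)
y_test = to_categorical(y_test, num_classes=4)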
