Deep learning model appears to train on very little data - python

I'm training a deep learning model on 100,000 rows, using 80% of the data for training and 20% for testing. The split itself works, but the training output shows only 2242 per epoch instead of the number of training rows. The training code, the model, and the output are given below. Any help will be highly appreciated.
Training code:
import time
start_time = time.time()
import keras
from keras.models import Sequential
from keras.layers import Dense
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
tweet_table = cleaning_table(tweet_table)
def tokenization_tweets(dataset, features):
    # The vectorizer has to be fitted before it can transform anything
    tokenization = TfidfVectorizer(max_features=features)
    dataset_transformed = tokenization.fit_transform(dataset).toarray()
    return dataset_transformed
def splitting(table):
    X_train, X_test, y_train, y_test = train_test_split(table.tweet, table.test, test_size=0.2, shuffle=True)
    return X_train, X_test, y_train, y_test
if __name__ == "__main__":
    # Encode sentiment as an integer class: Positive -> 1, Negative -> 0, anything else -> 2
    tweet_table['test'] = tweet_table['Overall_Sentiment'].apply(lambda x: 1 if x == 'Positive' else (0 if x == 'Negative' else 2))
    X_train, X_test, y_train, y_test = splitting(tweet_table)
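# Create a neural network
# Create the model
def train(X_train_mod, y_train, features, shuffle, drop, layer1, layer2, epoch, lr, epsilon, validation):
    model_nn = Sequential()
    model_nn.add(Dense(layer1, input_shape=(features,), activation='relu'))
    model_nn.add(Dense(layer2, activation='sigmoid'))
    model_nn.add(Dense(3, activation='softmax'))
    optimizer = keras.optimizers.Adam(lr=lr, beta_1=0.9, beta_2=0.999, epsilon=epsilon, decay=0.0, amsgrad=False)
    # The compile/fit calls are cut off in the post; the loss and the fit arguments
    # below are assumptions consistent with integer labels and the parameters passed in
    model_nn.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer,
                     metrics=['accuracy'])
    model_nn.fit(X_train_mod, y_train, epochs=epoch, verbose=1,
                 validation_split=validation, shuffle=shuffle)
    return model_nn
def test(X_test, model_nn):
    prediction = model_nn.predict(X_test)
    return prediction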
def model1(X_train, y_train):
    features = 3500
    shuffle = True
    drop = 0.5
    layer1 = 512
    layer2 = 256
    epoch = 5
    lr = 0.001
    epsilon = None
    validation = 0.1
    X_train_mod = tokenization_tweets(X_train, features)
    model = train(X_train_mod, y_train, features, shuffle, drop, layer1, layer2, epoch, lr, epsilon, validation)
    return model
#model1(X_train, y_train)
#model11(X_train, y_train)
def save_model(model):
    # let's assume `model` is the main model
    # to_json() already returns a JSON string, so write it out directly
    model_json = model.to_json()
    with open("model.json", "w") as json_file:
        json_file.write(model_json)
model_final = model1(X_train, y_train)
Epoch 1/5
2242/2242 [==============================] - 6s 3ms/step - loss: 0.3426 - accuracy: 0.8476 - val_loss: 0.2690 - val_accuracy: 0.8857
Epoch 2/5
2242/2242 [==============================] - 6s 3ms/step - loss: 0.2399 - accuracy: 0.9015 - val_loss: 0.2471 - val_accuracy: 0.8991
Epoch 3/5
2242/2242 [==============================] - 6s 3ms/step - loss: 0.1912 - accuracy: 0.9205 - val_loss: 0.2447 - val_accuracy: 0.9028
Epoch 4/5
2242/2242 [==============================] - 6s 3ms/step - loss: 0.1454 - accuracy: 0.9399 - val_loss: 0.2547 - val_accuracy: 0.9083
Epoch 5/5
2242/2242 [==============================] - 6s 3ms/step - loss: 0.1046 - accuracy: 0.9552 - val_loss: 0.2874 - val_accuracy: 0.9084
--- 192.1562056541443 seconds ---
Many Thanks
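One possible explanation, sketched under assumptions: recent versions of Keras report progress in batches (steps) per epoch rather than in rows, and fit uses batch_size=32 when none is given. The batch size actually used and the row count left after cleaning_table are not shown in the post, so the helper name and the numbers below are illustrative only.

```python
import math

def steps_per_epoch(n_fit_rows, validation_split=0.1, batch_size=32):
    """Batches per epoch for n_fit_rows rows passed to fit (assumed defaults)."""
    n_val = int(n_fit_rows * validation_split)        # rows held out for validation
    return math.ceil((n_fit_rows - n_val) / batch_size)

print(steps_per_epoch(80000))   # 80% of 100,000 rows
print(steps_per_epoch(79700))   # a slightly smaller training set
```

Under these assumptions 80,000 training rows would give 2250 steps, and about 79,700 rows would give the 2242 seen in the log, which would fit a cleaning step that drops a few rows. If that is what happened, the model is seeing all of the training data and 2242 is simply the number of batches.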


What is the calculation process of loss functions in multi-class multi-label classification problems using deep learning?

Dataset description:
(1) X_train: (6000,4) shape
(2) y_train: (6000,4) shape
(3) X_validation: (2000,4) shape
(4) y_validation: (2000,4) shape
(5) X_test: (2000,4) shape
(6) y_test: (2000,4) shape
Relationship between X and Y is shown here
For single-label classification, the activation function of the last layer is softmax and the loss function is categorical_crossentropy.
I know how that loss is calculated mathematically.
For multi-class multi-label classification problems, the activation function of the last layer is sigmoid and the loss function is binary_crossentropy.
I want to know how this loss function is calculated mathematically.
It would be a great help if you could let me know.
def MinMaxScaler(data):
    numerator = data - np.min(data)
    denominator = np.max(data) - np.min(data)
    return numerator / (denominator + 1e-5)
kki = pd.read_csv(filename, names=['UE0','UE1','UE2','UE3','selected_UE0','selected_UE1','selected_UE2','selected_UE3'])
def LoadData(file):
    xy = np.loadtxt(file, delimiter=',', dtype=np.float32)
    print("Data set length:", len(xy))
    tr_set_size = int(len(xy) * 0.6)
    xy[:, 0:-number_of_UEs] = MinMaxScaler(xy[:, 0:-number_of_UEs])  # number_of_UEs: 4
    X_train = xy[:tr_set_size, 0:-number_of_UEs]  # 6000 rows
    y_train = xy[:tr_set_size, number_of_UEs:number_of_UEs*2]
    X_valid = xy[tr_set_size:int((tr_set_size/3) + tr_set_size), 0:-number_of_UEs]
    y_valid = xy[tr_set_size:int((tr_set_size/3) + tr_set_size), number_of_UEs:number_of_UEs*2]
    X_test = xy[int((tr_set_size/3) + tr_set_size):, 0:-number_of_UEs]
    y_test = xy[int((tr_set_size/3) + tr_set_size):, number_of_UEs:number_of_UEs*2]
    print("Training X shape:", X_train.shape)
    print("Training Y shape:", y_train.shape)
    print("validation x shape:", X_valid.shape)
    print("validation y shape:", y_valid.shape)
    print("Test X shape:", X_test.shape)
    print("Test Y shape:", y_test.shape)
    return X_train, y_train, X_valid, y_valid, X_test, y_test, tr_set_size
X_train, y_train, X_valid, y_valid, X_test, y_test, tr_set_size = LoadData(filename)
model = Sequential()
model.add(Dense(64,activation='relu', input_shape=(X_train.shape[1],)))
model.add(Dense(46, activation='relu'))
model.add(Dense(24, activation='relu'))
model.add(Dense(12, activation='relu'))
model.add(Dense(4, activation= 'sigmoid'))
model.compile( loss ='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
hist = model.fit(X_train, y_train, epochs=5, batch_size=1, verbose=1, validation_data=(X_valid, y_valid), callbacks=es)
This is the training log. Accuracy does not improve even as the epochs are repeated.
Epoch 1/10
6000/6000 [==============================] - 14s 2ms/step - loss: 0.2999 - accuracy: 0.5345 - val_loss: 0.1691 - val_accuracy: 0.5465
Epoch 2/10
6000/6000 [==============================] - 14s 2ms/step - loss: 0.1554 - accuracy: 0.4883 - val_loss: 0.1228 - val_accuracy: 0.4710
Epoch 3/10
6000/6000 [==============================] - 14s 2ms/step - loss: 0.1259 - accuracy: 0.4710 - val_loss: 0.0893 - val_accuracy: 0.4910
Epoch 4/10
6000/6000 [==============================] - 13s 2ms/step - loss: 0.1094 - accuracy: 0.4990 - val_loss: 0.0918 - val_accuracy: 0.5540
Epoch 5/10
6000/6000 [==============================] - 13s 2ms/step - loss: 0.0967 - accuracy: 0.5223 - val_loss: 0.0671 - val_accuracy: 0.5405
Epoch 6/10
6000/6000 [==============================] - 13s 2ms/step - loss: 0.0910 - accuracy: 0.5198 - val_loss: 0.0836 - val_accuracy: 0.5380
Epoch 7/10
6000/6000 [==============================] - 13s 2ms/step - loss: 0.0870 - accuracy: 0.5348 - val_loss: 0.0853 - val_accuracy: 0.5775
Epoch 8/10
6000/6000 [==============================] - 13s 2ms/step - loss: 0.0859 - accuracy: 0.5518 - val_loss: 0.0515 - val_accuracy: 0.6520
Epoch 9/10
6000/6000 [==============================] - 13s 2ms/step - loss: 0.0792 - accuracy: 0.5508 - val_loss: 0.0629 - val_accuracy: 0.4350
Epoch 10/10
6000/6000 [==============================] - 13s 2ms/step - loss: 0.0793 - accuracy: 0.5638 - val_loss: 0.0632 - val_accuracy: 0.6270
Mistake 1 -
The shape of y_train, y_validation and y_test should be (6000,), (2000,) and (2000,) respectively.
Mistake 2 -
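For multi-class classification, the loss should be categorical_crossentropy and the activation should be softmax. Note that categorical_crossentropy expects one-hot labels; with integer labels of shape (6000,), the matching loss is sparse_categorical_crossentropy. So change these two lines as follows: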
model.add(Dense(4, activation= 'softmax'))
model.compile(loss ='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Suggestion -
Why are you splitting the data yourself? Use scikit-learn's train_test_split. The following code gives a 60/20/20 train/validation/test split:
from sklearn.model_selection import train_test_split
x, x_test, y, y_test = train_test_split(xtrain, labels, test_size=0.2, train_size=0.8)
x_train, x_validation, y_train, y_validation = train_test_split(x, y, test_size = 0.25, train_size =0.75)
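On the calculation asked about in the question: the usual definition of binary cross-entropy for multi-label targets treats each output unit as an independent yes/no problem, computes -(y*log(p) + (1-y)*log(1-p)) per label, and averages over the labels of a sample and then over the samples. Below is a plain-Python sketch of that definition. The function names, the example values, and the clipping constant eps are assumptions made for illustration, and Keras applies its own small epsilon internally.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over the labels of one sample."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # clip so log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def batch_loss(Y_true, Y_pred):
    """Average of the per-sample losses over a batch."""
    losses = [binary_crossentropy(t, p) for t, p in zip(Y_true, Y_pred)]
    return sum(losses) / len(losses)

# One sample with 4 labels (hypothetical values), sigmoid outputs in y_pred
y_true = [1, 0, 0, 1]
y_pred = [0.9, 0.2, 0.1, 0.6]
print(binary_crossentropy(y_true, y_pred))
```

Here the four per-label terms are -log(0.9), -log(0.8), -log(0.9) and -log(0.6), and the sample loss is their mean.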

LSTM model for a 3-class classification problem

My problem is to predict an output that has 3 class labels.
Let's say I have 20000 samples in my dataset, and each sample is associated with a label (0, 1, 2).
This is a multi-class classification problem.
Can I give only the labels (0, 1, 2) as input to the network and get predictions based on them?
Will the data fed to the network be sufficient to learn and predict the output?
Please help me with your input.
# Below is the code
X_train, X_test, y_train, y_test = train_test_split(values_train[:, 0],
                                                    values_train[:, 1],
                                                    test_size=0.25)
print(" X Training Set size is",X_train.shape )
print(" y Training Set size is",y_train.shape )
print(" X Test Set size is",X_test.shape)
print(" y Test Set size is",y_test.shape )
'X Training Set size is (165081,)'
'y Training Set size is (165081,)'
'X Test Set size is (55028,)'
'y Test Set size is (55028,)'
# convert to LSTM friendly format
X_train = X_train.reshape(len(X_train),1, 1)
X_test = X_test.reshape(len(X_test),1,1)
print(X_train.shape, X_test.shape)
(165081, 1, 1) (55028, 1, 1)
# configure network
n_batch = 1
n_epoch = 100
n_neurons = 10
from keras.optimizers import SGD
opt = SGD(lr=0.01)
# design network
model = Sequential()
model.add(LSTM(n_neurons, batch_input_shape=(n_batch, X_train.shape[1], X_train.shape[2])))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
# fit network
for i in range(n_epoch):
    model.fit(X_train, y_train, validation_data=(X_test, y_test),
              epochs=1, batch_size=n_batch, verbose=1, shuffle=False)
df_actual = []
dp_predict = []
for i in range(len(X_test)):
    testX, testy = X_test[i], y_test[i]
    testX = testX.reshape(1, 1, 1)
    yhat = model.predict(testX, batch_size=1)
    # yhat holds one probability per class; report the most likely class
    print('>Actual=%.1f, Predicted=%d' % (testy, yhat.argmax()))
This model does not give correct predictions.
The training and validation accuracy and loss are shown below:
Train on 154076 samples, validate on 66033 samples
Epoch 1/5
154076/154076 [==============================] - 289s 2ms/step - loss: 1.0033 - accuracy: 0.3816 - val_loss: 1.0018 - val_accuracy: 0.4286
Epoch 2/5
154076/154076 [==============================] - 291s 2ms/step - loss: 1.0021 - accuracy: 0.3817 - val_loss: 1.0020 - val_accuracy: 0.4286
Epoch 3/5
154076/154076 [==============================] - 293s 2ms/step - loss: 1.0018 - accuracy: 0.3804 - val_loss: 1.0014 - val_accuracy: 0.4286
Epoch 4/5
154076/154076 [==============================] - 290s 2ms/step - loss: 1.0016 - accuracy: 0.3812 - val_loss: 1.0012 - val_accuracy: 0.4286
Epoch 5/5
154076/154076 [==============================] - 290s 2ms/step - loss: 1.0015 - accuracy: 0.3814 - val_loss: 1.0012 - val_accuracy: 0.4286
Can anyone suggest what could be improved?
Note: I have normalized the input data with MinMaxScaler and used the scaled data, but the output does not change.
Class labels are categorical. With categorical_crossentropy, the network cannot learn from raw integer class labels; you have to one-hot encode them, e.g. with keras.utils.to_categorical:
x = values_train[:, 0]
y = values_train[:, 1]
y = keras.utils.to_categorical(y)
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=42)
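To make the encoding step concrete, here is a plain-Python sketch of one-hot encoding. The helper name one_hot is made up for illustration; it shows the kind of rows the answer obtains from keras.utils.to_categorical, but it is not that function.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[int(label)] = 1.0      # mark the position of the true class
        rows.append(row)
    return rows

print(one_hot([0, 2, 1], num_classes=3))
# [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
```

Each label becomes a row with as many entries as there are classes, which matches the 3-unit softmax output of the model in the question.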

Constant Validation Accuracy with a high loss in machine learning

I'm currently trying to create an image classification model using Inception V3 with 2 classes. I have 1428 images, with the two classes split about 70/30. When I run my model I get a fairly high loss as well as a constant validation accuracy. What might be causing this constant value?
data = np.array(data, dtype="float")/255.0
labels = np.array(labels,dtype ="uint8")
(trainX, testX, trainY, testY) = train_test_split(
img_width, img_height = 320, 320 #InceptionV3 size
train_samples = 1145
validation_samples = 287
epochs = 20
batch_size = 32
base_model = keras.applications.InceptionV3(
    weights='imagenet',
    include_top=False,  # needed so the pooling head below can attach to the convolutional output
    input_shape=(img_width, img_height, 3))
model_top = keras.models.Sequential()
model_top.add(keras.layers.GlobalAveragePooling2D(input_shape=base_model.output_shape[1:], data_format=None))
model_top.add(keras.layers.Dense(1, activation='sigmoid'))
model = keras.models.Model(inputs=base_model.input, outputs=model_top(base_model.output))
for layer in model.layers[:30]:
    layer.trainable = False
model.compile(optimizer = keras.optimizers.Adam(
#Image Processing and Augmentation
train_datagen = keras.preprocessing.image.ImageDataGenerator(
    zoom_range = 0.05,
    #width_shift_range = 0.05,
    height_shift_range = 0.05,
    horizontal_flip = True,
    vertical_flip = True,
    fill_mode = 'nearest')
val_datagen = keras.preprocessing.image.ImageDataGenerator()
train_generator = train_datagen.flow(
validation_generator = val_datagen.flow(
history = model.fit_generator(
    train_generator,
    steps_per_epoch = train_samples//batch_size,
    epochs = epochs,
    validation_data = validation_generator,
    validation_steps = validation_samples//batch_size,
    callbacks = [ModelCheckpoint])
This is my log when I run my model:
Epoch 1/20
35/35 [==============================] - 52s 1s/step - loss: 0.6347 - acc: 0.6830 - val_loss: 0.6237 - val_acc: 0.6875
Epoch 2/20
35/35 [==============================] - 14s 411ms/step - loss: 0.6364 - acc: 0.6756 - val_loss: 0.6265 - val_acc: 0.6875
Epoch 3/20
35/35 [==============================] - 14s 411ms/step - loss: 0.6420 - acc: 0.6743 - val_loss: 0.6254 - val_acc: 0.6875
Epoch 4/20
35/35 [==============================] - 14s 414ms/step - loss: 0.6365 - acc: 0.6851 - val_loss: 0.6289 - val_acc: 0.6875
Epoch 5/20
35/35 [==============================] - 14s 411ms/step - loss: 0.6359 - acc: 0.6727 - val_loss: 0.6244 - val_acc: 0.6875
Epoch 6/20
35/35 [==============================] - 15s 415ms/step - loss: 0.6342 - acc: 0.6862 - val_loss: 0.6243 - val_acc: 0.6875
I think your learning rate is too low and you are training for too few epochs. Try lr = 0.001 and epochs = 100.
Your validation accuracy is 68.75%. Given that your classes are split roughly 70/30, it is likely that your model is just predicting the same class every time, ignoring the input. That would give the accuracy you are seeing. Your model has not yet learned from your data.
As Novak said, your learning rate seems very low, so try increasing that first to see if it helps.
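The "predicting the same thing every time" argument can be checked with a majority-class baseline. The sketch below uses made-up labels with a 70/30 split, since the real validation labels are not shown in the post, and the function name is illustrative.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy of a model that always predicts the most common class."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)

# Hypothetical validation labels with a 70/30 class split
val_labels = [0] * 70 + [1] * 30
print(majority_baseline_accuracy(val_labels))   # 0.7
```

A validation accuracy that sits at this baseline and never moves is a sign that the network output does not yet depend on the input image.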

Neural network in keras not converging

I'm building a simple Neural network in Keras, like the following:
# create model
model = Sequential()
model.add(Dense(1000, input_dim=x_train.shape[1], activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
model.compile(loss='mean_squared_error', metrics=['accuracy'], optimizer='RMSprop')
# Fit the model
model.fit(x_train, y_train, epochs=20, batch_size=700, verbose=2)
# evaluate the model
scores = model.evaluate(x_test, y_test, verbose=0)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
The shape of the used data is:
x_train = (49972, 601)
y_train = (49972, 1)
My problem is that the network is not converging; the accuracy stays very low (around 0.0168), as shown below:
Epoch 1/20
- 1s - loss: 3.2222 - acc: 0.0174
Epoch 2/20
- 1s - loss: 3.1757 - acc: 0.0187
Epoch 3/20
- 1s - loss: 3.1731 - acc: 0.0212
Epoch 4/20
- 1s - loss: 3.1721 - acc: 0.0220
Epoch 5/20
- 1s - loss: 3.1716 - acc: 0.0225
Epoch 6/20
- 1s - loss: 3.1711 - acc: 0.0235
Epoch 7/20
- 1s - loss: 3.1698 - acc: 0.0245
Epoch 8/20
- 1s - loss: 3.1690 - acc: 0.0251
Epoch 9/20
- 1s - loss: 3.1686 - acc: 0.0257
Epoch 10/20
- 1s - loss: 3.1679 - acc: 0.0261
Epoch 11/20
- 1s - loss: 3.1674 - acc: 0.0267
Epoch 12/20
- 1s - loss: 3.1667 - acc: 0.0277
Epoch 13/20
- 1s - loss: 3.1656 - acc: 0.0285
Epoch 14/20
- 1s - loss: 3.1653 - acc: 0.0288
Epoch 15/20
- 1s - loss: 3.1653 - acc: 0.0291
I used the sklearn library to build the same structure with the same data, and it works well, giving an accuracy higher than 0.5:
model = Pipeline([
    ('classifier', MLPClassifier(hidden_layer_sizes=(1000,), activation='relu',
                                 max_iter=20, verbose=2, batch_size=700, random_state=0))
])
I'm sure that I used the same data for both models. This is how I prepare it:
def load_data():
    le = preprocessing.LabelEncoder()
    with open('_DATA_train.txt', 'rb') as fp:
        train = pickle.load(fp)
    with open('_DATA_test.txt', 'rb') as fp:
        test = pickle.load(fp)
    x_train = train[:, 0:(train.shape[1]-1)]
    y_train = train[:, (train.shape[1]-1)]
    y_train = le.fit_transform(y_train).reshape([-1, 1])
    x_test = test[:, 0:(test.shape[1]-1)]
    y_test = test[:, (test.shape[1]-1)]
    # Reuse the encoder fitted on the training labels so both sets share one mapping
    y_test = le.transform(y_test).reshape([-1, 1])
    print(x_train.shape, ' ', y_train.shape)
    print(x_test.shape, ' ', y_test.shape)
    return x_train, y_train, x_test, y_test
What is the problem with the Keras structure?
It's a multi-class classification problem: the labels in y_train take the values [0, 1, 2, 3].
For a multi-class problem your labels should be one-hot encoded. For example, if the options are [0, 1, 2, 3] and the label is 1, then it should be [0, 1, 0, 0].
Your final layer should be a dense layer with 4 units and a softmax activation:
model.add(Dense(4, activation='softmax'))
And your loss should be categorical_crossentropy:
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='RMSprop')
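To illustrate why softmax and categorical cross-entropy go together with one-hot labels, here is a plain-Python sketch of both calculations for a single sample with 4 classes. The function names and the example logits are illustrative assumptions, not Keras internals.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_label, probs):
    """Negative log-probability assigned to the true class."""
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs))

probs = softmax([0.0, 0.0, 0.0, 0.0])          # equal scores: every class gets 0.25
loss = categorical_crossentropy([0, 1, 0, 0], probs)
print(probs, loss)
```

With equal scores the loss is log(4), roughly 1.386, which is the value to expect from a 4-class network before it has learned anything. A single sigmoid unit trained with mean squared error has no comparable way to represent four classes.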

How to get both score and accuracy after training

model.fit(X_train, y_train, batch_size = batch_size,
          nb_epoch = 4, validation_data = (X_test, y_test),
          show_accuracy = True)
score = model.evaluate(X_test, y_test,
                       batch_size = batch_size, show_accuracy = True, verbose=0)
model.evaluate gives a scalar output, and hence the following code doesn't work.
print("Test score", score[0])
print("Test accuracy:", score[1])
The output that I get is:
Train on 20000 samples, validate on 5000 samples
Epoch 1/4
20000/20000 [==============================] - 352s - loss: 0.4515 - val_loss: 0.4232
Epoch 2/4
20000/20000 [==============================] - 381s - loss: 0.2592 - val_loss: 0.3723
Epoch 3/4
20000/20000 [==============================] - 374s - loss: 0.1513 - val_loss: 0.4329
Epoch 4/4
20000/20000 [==============================] - 380s - loss: 0.0838 - val_loss: 0.5044
Keras version 1.0
How can I get the accuracy as well? Please help
If you use a Sequential model you can try (CODE UPDATED):
nb_epochs = 4
history = model.fit(X_train, y_train, batch_size = batch_size,
                    nb_epoch = nb_epochs, validation_data = (X_test, y_test),
                    show_accuracy = True)
print("Test score", history.history["val_loss"][nb_epochs - 1])
print("Test acc", history.history["val_acc"][nb_epochs - 1])
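Thanks Marcin, you are correct.
The code needs to look like this, with metrics = ['accuracy'] passed to model.compile next to the existing arguments:
optimizer = 'adam',
metrics = ['accuracy'],
show_accuracy serves no purpose in model.fit and needs to be removed from there.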
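A related point when reading the history: the last epoch is not necessarily the best one. The sketch below uses the four val_loss values from the log in the question, with a plain dict standing in for history.history. No accuracy values appear in that log, so none are included here.

```python
# val_loss values copied from the four epochs in the log above;
# the dict is a stand-in for `history.history`
history_dict = {"val_loss": [0.4232, 0.3723, 0.4329, 0.5044]}

val_loss = history_dict["val_loss"]
last_epoch_loss = val_loss[-1]     # same entry as val_loss[nb_epochs - 1]
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

print("last epoch val_loss:", last_epoch_loss)
print("best epoch:", best_epoch, "val_loss:", val_loss[best_epoch - 1])
```

In this log the validation loss is lowest at epoch 2 and rises afterwards while the training loss keeps falling, so reporting only the final epoch would understate the best validation score.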

