Python: Improve accuracy of a Keras sequential model

While training the model, the accuracy stays low relative to the loss. I have tried changing the number of epochs and the loss function, but the accuracy is still not improving.
# imports assumed; not shown in the original snippet
import numpy as np
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Embedding, SpatialDropout1D, Dense, LSTM
from keras.callbacks import ModelCheckpoint

# x, y and data are assumed to be defined earlier
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)
num_class = len(np.unique(data.IsAddedToReport.values))

model = Sequential()
model.add(Embedding(2000, 128, input_length=x.shape[1]))
model.add(SpatialDropout1D(0.5))
model.add(Dense(32, activation='relu'))
model.add(LSTM(196, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(num_class, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])

filepath = 'best_model.hdf5'  # assumed; filepath was not defined in the original snippet
checkpointer = ModelCheckpoint(filepath, monitor='val_acc', verbose=2, save_best_only=True, mode='max')
history = model.fit(X_train, y_train, batch_size=128, epochs=10, validation_split=0.2, callbacks=[checkpointer])
Output:
Epoch 10/10
17013/17013 [==============================] - 1s 57us/step - loss: 0.6295 - acc: 0.6115 - val_loss: 0.7401 - val_acc: 0.5461
Epoch 00010: val_acc did not improve from 0.55125

Related

Keras model cannot fit with given data

I'm trying to predict the next number in a sequence.
You can see the data sample in google colab here:
https://colab.research.google.com/drive/1QnkNtIo56V9wdQ4CMTm3LRSQaa6A9VmP?usp=sharing
(51 columns: c0-c49 plus a final 'y' column, where 'y' is the first value from the next row)
The data is scaled with StandardScaler:
from sklearn.preprocessing import StandardScaler

scaled_features = df.copy()
features = scaled_features[columns_a]
scaler = StandardScaler().fit(features.values)
features = scaler.transform(features.values)
scaled_features[columns_a] = features  # the original wrote scaled_features[columns]; the same column list is meant on both sides
After that, the data is split into train and test sets:
from sklearn.model_selection import train_test_split
train, test = train_test_split(scaled_features, test_size=0.2, shuffle=False)
and reshaped for LSTM input (samples, timesteps, features):
Y_train = train["y"]
X_train = train.drop("y", axis=1)
Y_test = test["y"]
X_test = test.drop("y", axis=1)
X_train = X_train.to_numpy()
X_test = X_test.to_numpy()
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
X_train.shape
Creating the model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Dense, Dropout
from matplotlib import pyplot
model = Sequential()
model.add(LSTM(64, input_shape=(X_train.shape[1], X_train.shape[2]), activation='relu', return_sequences=True))
model.add(LSTM(32, activation='relu' ))
model.add(Dense(1))
#model.add(Dropout(0.2))
model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
#print(model.summary())
model.fit(X_train, Y_train, epochs=5, batch_size=32, verbose=2)
scores = model.evaluate(X_test, Y_test, batch_size=32, verbose=0)
print("Model Accuracy: %.2f%%" % (scores[1]*100))
The output:
Epoch 1/5
749/749 - 34s - loss: 1.2380 - accuracy: 0.0000e+00
Epoch 2/5
749/749 - 31s - loss: 1.2382 - accuracy: 0.0000e+00
Epoch 3/5
749/749 - 31s - loss: 1.2381 - accuracy: 0.0000e+00
Epoch 4/5
749/749 - 31s - loss: 1.2385 - accuracy: 0.0000e+00
Epoch 5/5
749/749 - 31s - loss: 1.2384 - accuracy: 0.0000e+00
Model Accuracy: 0.00%
I'm pretty new to machine learning/AI and I don't know what's wrong in the code.
Any idea? Thank you.

Keras neural network takes only a few samples to train

# imports assumed; not shown in the original snippet
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

data = np.random.random((10000, 150))
labels = np.random.randint(10, size=(10000, 1))
labels = to_categorical(labels, num_classes=10)

model = Sequential()
model.add(Dense(units=32, activation='relu', input_shape=(150,)))
model.add(Dense(units=10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels, epochs=30, validation_split=0.2)
I created 10000 random samples to train my net, but it uses only a few of them (250/10000).
Example of the 1st epoch:
Epoch 1/30
250/250 [==============================] - 0s 2ms/step - loss: 2.1110 - accuracy: 0.2389 - val_loss: 2.2142 - val_accuracy: 0.1800
Your data is split into training and validation subsets (validation_split=0.2).
The training subset has 8000 samples and the validation subset has 2000.
Training proceeds in batches of 32 samples each (the default batch size).
So one epoch takes 8000/32 = 250 batches, which is exactly the 250 shown in the progress bar: it counts batches, not samples.
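To make the arithmetic explicit, here is a plain-Python sketch of the numbers above (no Keras needed; batch size 32 is the model.fit default):

import math

n_samples = 10000
val_split = 0.2
batch_size = 32                                # Keras default in model.fit

train_size = int(n_samples * (1 - val_split))  # 8000 training samples
steps = math.ceil(train_size / batch_size)     # 250 batches per epoch
print(train_size, steps)                       # prints: 8000 250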
Try code like the following example:
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Generate dummy data
data = np.random.random((1000, 100))
labels = np.random.randint(10, size=(1000, 1))

# Convert labels to categorical one-hot encoding
one_hot_labels = keras.utils.to_categorical(labels, num_classes=10)

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, one_hot_labels, epochs=10, batch_size=32)

Keras - Save output of intermediate layer

What have I done?
I'm using this code to implement my Keras model:
# imports assumed; not shown in the original snippet
import numpy as np
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.callbacks import ModelCheckpoint

X, tx, Y, ty = train_test_split(X, Y, test_size=0.2, random_state=np.random.seed(7), shuffle=True)
X = np.reshape(X, (X.shape[0], 1, X.shape[1]))
tx = np.reshape(tx, (tx.shape[0], 1, tx.shape[1]))

model = Sequential()
model.add(LSTM(100, return_sequences=False, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(Y.shape[1], activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

filepath = "weights-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
model.fit(X, Y, validation_split=.20,
          epochs=1000, batch_size=50, callbacks=callbacks_list, verbose=0)
OUTPUT:
Below is a part of the program's output:
Epoch 00993: val_acc did not improve
Epoch 00994: val_acc did not improve
Epoch 00995: val_acc did not improve
Epoch 00996: val_acc did not improve
Epoch 00997: val_acc did not improve
Epoch 00998: val_acc improved from 0.93900 to 0.94543, saving model to weights-998-0.94.hdf5
Epoch 00999: val_acc did not improve
PROBLEM:
I need to save the output of the LSTM layer at every epoch, but I do not know how.
Any idea?
You can use the Keras functional API. You will have to rewrite your model creation, but it's not much work. Then, when you write something like this:
lstm_output = LSTM(128, ...)(x)
the output of the LSTM layer will be in the lstm_output variable, and you can save it in every iteration of every epoch.
I hope this answers your question.
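For illustration, here is a minimal sketch of that idea: build the model with the functional API, create a second Model that shares the same layers but ends at the LSTM output, and dump those activations from a callback at the end of each epoch. The shapes, layer sizes, and file names below are assumptions for the sketch, not values from the question.

import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.callbacks import Callback

inputs = Input(shape=(1, 40))               # (timesteps, features), assumed
lstm_output = LSTM(100)(inputs)             # the tensor holding the LSTM layer's output
outputs = Dense(5, activation='softmax')(lstm_output)
model = Model(inputs, outputs)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

# A second model sharing the same layers, ending at the LSTM output:
lstm_model = Model(inputs, lstm_output)

class SaveLSTMOutput(Callback):
    """Saves the LSTM activations for the given data at the end of every epoch."""
    def __init__(self, data):
        super().__init__()
        self.data = data

    def on_epoch_end(self, epoch, logs=None):
        activations = lstm_model.predict(self.data, verbose=0)
        np.save('lstm_output_epoch_%02d.npy' % epoch, activations)  # assumed file name

X = np.random.random((100, 1, 40))          # dummy data, for the sketch only
Y = np.random.random((100, 5))
model.fit(X, Y, epochs=3, batch_size=50, callbacks=[SaveLSTMOutput(X)], verbose=0)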

How to fix a constant validation accuracy in machine learning?

I'm trying to do image classification with DICOM images that have balanced classes, using the pre-trained InceptionV3 model.
# imports assumed; not shown in the original snippet
import os
import cv2
import keras
import pydicom
from PIL.Image import fromarray

def convertDCM(PathDCM):
    data = []
    for dirName, subdir, files in os.walk(PathDCM):
        for filename in sorted(files):
            ds = pydicom.dcmread(PathDCM + '/' + filename)
            im = fromarray(ds.pixel_array)
            im = keras.preprocessing.image.img_to_array(im)
            im = cv2.resize(im, (299, 299))
            data.append(im)
    return data
PathDCM = '/home/Desktop/FULL_BALANCED_COLOURED/'
data = convertDCM(PathDCM)

# scale the raw pixel intensities to the range [0,1]
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels, dtype="int")

# splitting data into training and testing
# test_size is the percentage to split into test/train data
(trainX, testX, trainY, testY) = train_test_split(
    data, labels,
    test_size=0.2,
    random_state=42)
img_width, img_height = 299, 299  # InceptionV3 input size
train_samples = 300
validation_samples = 50
epochs = 25
batch_size = 15

base_model = keras.applications.InceptionV3(
    weights='imagenet',
    include_top=False,
    input_shape=(img_width, img_height, 3))

model_top = keras.models.Sequential()
model_top.add(keras.layers.GlobalAveragePooling2D(input_shape=base_model.output_shape[1:], data_format=None))
model_top.add(keras.layers.Dense(300, activation='relu'))
model_top.add(keras.layers.Dropout(0.5))
model_top.add(keras.layers.Dense(1, activation='sigmoid'))

model = keras.models.Model(inputs=base_model.input, outputs=model_top(base_model.output))

# Compiling model
model.compile(optimizer=keras.optimizers.Adam(lr=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Image Processing and Augmentation
train_datagen = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    zoom_range=0.1,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

val_datagen = keras.preprocessing.image.ImageDataGenerator()

train_generator = train_datagen.flow(
    trainX,
    trainY,
    batch_size=batch_size,
    shuffle=True)

validation_generator = train_datagen.flow(
    testX,
    testY,
    batch_size=batch_size,
    shuffle=True)
When I train the model, I always get a constant validation accuracy of 0.3889 with the validation loss fluctuating.
# Training the model
history = model.fit_generator(
    train_generator,
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_samples // batch_size)
Epoch 1/25
20/20 [==============================] - 195s 49s/step - loss: 0.7677 - acc: 0.4020 - val_loss: 0.7784 - val_acc: 0.3889
Epoch 2/25
20/20 [==============================] - 187s 47s/step - loss: 0.7016 - acc: 0.4848 - val_loss: 0.7531 - val_acc: 0.3889
Epoch 3/25
20/20 [==============================] - 191s 48s/step - loss: 0.6566 - acc: 0.6304 - val_loss: 0.7492 - val_acc: 0.3889
Epoch 4/25
20/20 [==============================] - 175s 44s/step - loss: 0.6533 - acc: 0.5529 - val_loss: 0.7575 - val_acc: 0.3889
predictions= model.predict(testX)
print(predictions)
model.predict also returns only an array with one value per image:
[[0.457804 ]
[0.45051473]
[0.48343503]
[0.49180537]...
Why is it that the model only predicts one of the two classes? Does this have to do with the constant val accuracy or possibly overfitting?
If you have two classes, every image belongs to one or the other, so the probability of one class is enough: the probabilities for each image must sum to 1, so if you have the probability p for one class, the probability of the other is 1 - p.
If you want the possibility of classifying images that belong to neither of those two classes, you should create a third one.
Also, this line:
model_top.add(keras.layers.Dense(1, activation='sigmoid'))
means that the output is a vector of shape (nb_samples, 1), which has the same shape as your training labels.
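To make that concrete, a single sigmoid output p can be turned into a class label by thresholding, with 1 - p as the probability of the other class. A minimal sketch, using the values shown in the question and assuming the usual 0.5 cutoff (the cutoff is an assumption, not from the question):

import numpy as np

# Sigmoid outputs as in the question: one probability per image
predictions = np.array([[0.457804], [0.45051473], [0.48343503], [0.49180537]])

p_class1 = predictions.ravel()            # P(class 1)
p_class0 = 1.0 - p_class1                 # P(class 0): the two must sum to 1
labels = (p_class1 >= 0.5).astype(int)    # threshold at 0.5 (assumed)
print(labels)                             # [0 0 0 0] for the values above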

How to get both score and accuracy after training

model.fit(X_train, y_train, batch_size=batch_size,
          nb_epoch=4, validation_data=(X_test, y_test),
          show_accuracy=True)
score = model.evaluate(X_test, y_test,
                       batch_size=batch_size, show_accuracy=True, verbose=0)
gives a scalar output, and hence the following code doesn't work:
print("Test score", score[0])
print("Test accuracy:", score[1])
The output that I get is:
Train on 20000 samples, validate on 5000 samples
Epoch 1/4
20000/20000 [==============================] - 352s - loss: 0.4515 - val_loss: 0.4232
Epoch 2/4
20000/20000 [==============================] - 381s - loss: 0.2592 - val_loss: 0.3723
Epoch 3/4
20000/20000 [==============================] - 374s - loss: 0.1513 - val_loss: 0.4329
Epoch 4/4
20000/20000 [==============================] - 380s - loss: 0.0838 - val_loss: 0.5044
Keras version 1.0
How can I get the accuracy as well? Please help.
If you use a Sequential model, you can try (CODE UPDATED):
nb_epochs = 4
history = model.fit(X_train, y_train, batch_size=batch_size,
                    nb_epoch=nb_epochs, validation_data=(X_test, y_test),
                    show_accuracy=True)
print("Test score", history.history["val_loss"][nb_epochs - 1])
print("Test acc", history.history["val_acc"][nb_epochs - 1])
Thanks Marcin, you are correct. The code needs to be like this:
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=["accuracy"])
show_accuracy serves no purpose in model.fit and needs to be removed from there.
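For reference, here is a minimal sketch of the same flow against the modern tf.keras API (the question uses Keras 1.0, where the nb_epoch and show_accuracy arguments existed; both are gone today). With a metric compiled in, model.evaluate returns the loss followed by each metric, so the score[0]/score[1] indexing works. The data and layer sizes here are dummy values for the sketch:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Dummy data for the sketch only
X_train = np.random.random((200, 20))
y_train = np.random.randint(2, size=(200, 1))
X_test = np.random.random((50, 20))
y_test = np.random.randint(2, size=(50, 1))

model = Sequential()
model.add(Dense(16, activation='relu', input_shape=(20,)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])   # this is what makes evaluate() return accuracy

model.fit(X_train, y_train, epochs=4, batch_size=32,
          validation_data=(X_test, y_test), verbose=0)

# evaluate() now returns [loss, accuracy]
score = model.evaluate(X_test, y_test, verbose=0)
print("Test score", score[0])
print("Test accuracy:", score[1])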
