Convert BasicLSTMCell to bidirectional LSTM

Recently I tried to use the BasicLSTMCell API from TensorFlow to generate video captions. I am working with code that builds BasicLSTMCell in the following way:
self.lstm1 = tf.compat.v1.nn.rnn_cell.BasicLSTMCell(dim_hidden, state_is_tuple=False)
self.lstm2 = tf.compat.v1.nn.rnn_cell.BasicLSTMCell(dim_hidden, state_is_tuple=False)
It is then used later as follows:
with tf.compat.v1.variable_scope("Encoding") as scope:
for i in range(0, self.n_video_lstm_step):
if i > 0:
scope.reuse_variables()
with tf.compat.v1.variable_scope("LSTM1"):
output1, state1 = self.lstm1(image_emb[:,i,:], state1)
with tf.compat.v1.variable_scope("LSTM2"):
output2, state2 = self.lstm2(tf.concat([padding, output1], 1), state2)
out_list.append(tf.concat([output1, output2], 1))
I want these LSTM cells to be bidirectional for my requirements. I have tried using
keras.layers.Bidirectional(keras.layers.LSTM(dim_hidden, unit_forget_bias=True, unroll=True))
but it didn't work. Can anyone let me know how to make this work with a bidirectional LSTM?

You can use the Bidirectional RNN wrapper directly, as shown in the code below. Please clarify how you are modifying the LSTM layer class and what error you are facing; I'll update my answer accordingly.
import numpy as np
from tensorflow.keras import layers, Model, utils
X = np.random.random((100,10,3))
y = np.random.random((100,))
inp = layers.Input((10,3))
x = layers.Bidirectional(layers.LSTM(8, return_sequences=True))(inp)
x = layers.Bidirectional(layers.LSTM(8))(x)
out = layers.Dense(1, activation='sigmoid')(x)  # sigmoid (not softmax) for a single-unit binary output
model = Model(inp, out)
utils.plot_model(model, show_layer_names=False, show_shapes=True)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(X, y, epochs=3)
Epoch 1/3
4/4 [==============================] - 5s 10ms/step - loss: 0.6963
Epoch 2/3
4/4 [==============================] - 0s 22ms/step - loss: 0.6965
Epoch 3/3
4/4 [==============================] - 0s 11ms/step - loss: 0.6976
<tensorflow.python.keras.callbacks.History at 0x7f91066bf4c0>
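If you want to stay closer to the question's setup of two stacked recurrent layers over video features, a rough sketch looks like this (the feature size 4096 and the dim_hidden value are illustrative assumptions, not taken from the question):
import tensorflow as tf
from tensorflow.keras import layers

dim_hidden = 256                    # hypothetical; use your own dim_hidden
inp = layers.Input((None, 4096))    # assumed (time, feature) video embeddings
x = layers.Bidirectional(layers.LSTM(dim_hidden, return_sequences=True))(inp)
x = layers.Bidirectional(layers.LSTM(dim_hidden, return_sequences=True))(x)
model = tf.keras.Model(inp, x)      # output shape: (batch, time, 2 * dim_hidden)
model.summary()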

Related

Model accuracy and loss not improving in CNN

I am using the LeNet architecture below to train my image classification model. I have noticed that neither the training nor the validation accuracy improves across iterations. Can anyone with expertise in this area explain what might have gone wrong?
training samples - 110 images belonging to 2 classes.
validation - 50 images belonging to 2 classes.
#LeNet
import keras
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
#import dropout class if needed
from keras.layers import Dropout
from keras import regularizers

model = Sequential()

#Layer 1
#Conv Layer 1
model.add(Conv2D(filters = 6,
                 kernel_size = 5,
                 strides = 1,
                 activation = 'relu',
                 input_shape = (32,32,3)))
#Pooling layer 1
model.add(MaxPooling2D(pool_size = 2, strides = 2))
#Layer 2
#Conv Layer 2
model.add(Conv2D(filters = 16,
                 kernel_size = 5,
                 strides = 1,
                 activation = 'relu',
                 input_shape = (14,14,6)))
#Pooling Layer 2
model.add(MaxPooling2D(pool_size = 2, strides = 2))
#Flatten
model.add(Flatten())
#Layer 3
#Fully connected layer 1
model.add(Dense(units=128, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))
#Layer 4
#Fully connected layer 2
model.add(Dense(units=64, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))
#Layer 5
#Fully connected layer 3
model.add(Dense(units=64, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))
#Layer 6
#Fully connected layer 4
model.add(Dense(units=64, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))
#Layer 7
#Output Layer
model.add(Dense(units = 2, activation = 'softmax'))

model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

from keras.preprocessing.image import ImageDataGenerator

#Image Augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,  #rescaling pixel values between 0 and 1
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

#Just feature scaling
test_datagen = ImageDataGenerator(rescale=1./255)

training_set = train_datagen.flow_from_directory(
    '/Dataset/Skin_cancer/training',
    target_size=(32, 32),
    batch_size=32,
    class_mode='categorical')

test_set = test_datagen.flow_from_directory(
    '/Dataset/Skin_cancer/testing',
    target_size=(32, 32),
    batch_size=32,
    class_mode='categorical')

model.fit_generator(
    training_set,
    steps_per_epoch=50,  # number of input images
    epochs=25,
    validation_data=test_set,
    validation_steps=10)  # number of validation steps
Epoch 1/25
50/50 [==============================] - 52s 1s/step - loss: 0.8568 - accuracy: 0.4963 - val_loss: 0.7004 - val_accuracy: 0.5000
Epoch 2/25
50/50 [==============================] - 50s 1s/step - loss: 0.6940 - accuracy: 0.5000 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 3/25
50/50 [==============================] - 48s 967ms/step - loss: 0.6932 - accuracy: 0.5065 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 4/25
50/50 [==============================] - 50s 1s/step - loss: 0.6932 - accuracy: 0.4824 - val_loss: 0.6933 - val_accuracy: 0.5000
Epoch 5/25
50/50 [==============================] - 49s 974ms/step - loss: 0.6932 - accuracy: 0.4949 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 6/25
50/50 [==============================] - 51s 1s/step - loss: 0.6932 - accuracy: 0.4854 - val_loss: 0.6931 - val_accuracy: 0.5000
Epoch 7/25
50/50 [==============================] - 49s 976ms/step - loss: 0.6931 - accuracy: 0.5015 - val_loss: 0.6918 - val_accuracy: 0.5000
Epoch 8/25
50/50 [==============================] - 51s 1s/step - loss: 0.6932 - accuracy: 0.4986 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 9/25
50/50 [==============================] - 49s 973ms/step - loss: 0.6932 - accuracy: 0.5000 - val_loss: 0.6929 - val_accuracy: 0.5000
Epoch 10/25
50/50 [==============================] - 50s 1s/step - loss: 0.6931 - accuracy: 0.5044 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 11/25
50/50 [==============================] - 49s 976ms/step - loss: 0.6931 - accuracy: 0.5022 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 12/25
Most importantly, you are using loss = 'categorical_crossentropy'; change it to loss = 'binary_crossentropy', as you have just 2 classes. Also change class_mode='categorical' to class_mode='binary' in flow_from_directory.
As @desertnaut rightly mentioned, categorical_crossentropy goes hand in hand with a softmax activation in the last layer, and if you change the loss to binary_crossentropy the last activation should also be changed to sigmoid.
Other improvements:
You have very limited data (160 images), and you have used almost 50% of it as validation data.
As you are building a model for image classification, you have just two Conv2D layers and 4 Dense layers. The Dense layers add a huge number of weights to be learnt. Add a few more Conv2D layers and reduce the number of Dense layers.
Set batch_size = 1 and remove steps_per_epoch. As you have very little input, let every epoch have the same number of steps as there are input records.
Use the default glorot_uniform kernel initializer.
To further tune your model, build it using multiple Conv2D layers, followed by a GlobalAveragePooling2D layer, FC layers, and a final softmax layer (see the sketch after this list).
Use data augmentation techniques like horizontal_flip, vertical_flip, shear_range, and zoom_range of ImageDataGenerator to increase the number of training and validation images.
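A minimal sketch putting several of these suggestions together (layer sizes are illustrative, not tuned; it assumes the 32x32 RGB inputs and two classes from the question):
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense

model = Sequential([
    Conv2D(32, 3, activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D(2),
    Conv2D(64, 3, activation='relu'),
    MaxPooling2D(2),
    Conv2D(128, 3, activation='relu'),
    GlobalAveragePooling2D(),        # replaces Flatten plus the large Dense stack
    Dense(1, activation='sigmoid'),  # binary output, paired with binary_crossentropy
])
# Default glorot_uniform initializer; no Dropout until overfitting actually shows up.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])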
Moving the comments to the answer section, as suggested by @desertnaut:
Question - Thanks! Yes, I figured that too little data is the problem. One additional question: why does adding more Dense layers than Conv layers negatively affect the model, and is there any rule to follow when deciding how many Conv and Dense layers to use? – Arun_Ramji_Shanmugam 2 days ago
Answer - To answer the first part of your question: a Conv2D layer maintains the spatial information of the image, and the weights to be learnt depend on the kernel size and stride specified in the layer, whereas a Dense layer needs the output of Conv2D to be flattened before it can be used, hence losing the spatial information. Also, Dense layers add a large number of weights; for example, 2 Dense layers of 512 add (512*512) = 262144 params or weights to the model (which have to be learnt by the model). That means you have to train for more epochs and with good hyperparameter settings for these weights to be learnt. – Tensorflow Warriors 2 days ago
Answer - To answer the second part of your question: use systematic experiments to discover what works best for your specific dataset. It also depends on the processing power you have. Remember, deeper networks are generally better, at the cost of more data and increased learning complexity. A conventional approach is to look for similar problems and deep learning architectures which have already been shown to work. We also have the flexibility to use pretrained models like ResNet, VGG, etc., by freezing part of the layers and training the remaining ones. – Tensorflow Warriors 2 days ago
Question - Thank you for the detailed answer! If you don't mind one more question: when we use an already trained model (maybe some of its layers), doesn't it need to have been trained on the same kind of input data as the data we are going to work with? – Arun_Ramji_Shanmugam yesterday
Answer - The intuition behind transfer learning for image classification is that if a model is trained on a large and general enough dataset, it will effectively serve as a generic model of the visual world. You can find a transfer learning example with an explanation here: tensorflow.org/tutorials/images/transfer_learning – Tensorflow Warriors yesterday
Remove all kernel_initializer='uniform' arguments from your layers; don't specify anything here. The default initializer, glorot_uniform, is the highly recommended one (and uniform is a particularly bad one).
As a general rule, keep in mind that the default values for such rather advanced settings are there for your convenience; they are implicitly recommended, and you had better not mess with them unless you have specific reasons to do so and you know exactly what you are doing.
For the kernel_initializer argument in particular, I have started believing that it has caused a lot of unnecessary pain to people (just see here for the most recent example).
Also, dropout should not be used by default, especially in cases like this one where the model seems to struggle to learn anything; start without any dropout (comment out the respective layers), and only add it back if you see signs of overfitting.

What do I do to improve my Keras CNN VGG16 model

I'm working on a project that has 700 images per class (1400 total) for 2 classes. I'm using VGG16, but I'm new to this model and I don't know what I could do to improve it.
This is my model:
vgg16_model = VGG16(weights="imagenet", include_top=True)

# (1) visualize layers
print("VGG16 model layers")
for i, layer in enumerate(vgg16_model.layers):
    print(i, layer.name, layer.output_shape)

# (2) remove the top layer
base_model = Model(input=vgg16_model.input,
                   output=vgg16_model.get_layer("block5_pool").output)

# (3) attach a new top layer
base_out = base_model.output
base_out = Reshape((25088,))(base_out)
top_fc1 = Dense(256, activation="relu")(base_out)
top_fc1 = Dropout(0.5)(top_fc1)
# output layer: (None, 5)
top_preds = Dense(1, activation="sigmoid")(top_fc1)

# (4) freeze weights until the last but one convolution layer (block4_pool)
for layer in base_model.layers[0:14]:
    layer.trainable = False

# (5) create new hybrid model
model = Model(input=base_model.input, output=top_preds)

# (6) compile and train the model
sgd = SGD(lr=1e-4, momentum=0.9)
model.compile(optimizer=sgd, loss="binary_crossentropy", metrics=["accuracy"])
history = model.fit([data], [labels], nb_epoch=NUM_EPOCHS,
                    batch_size=BATCH_SIZE, validation_split=0.1)

# evaluate final model
vlabels = model.predict(np.array(valid))
model.save('model.h5')
...which gives me the following output:
Train on 1260 samples, validate on 140 samples
Epoch 1/5
1260/1260 [==============================] - 437s 347ms/step - loss: 0.2200 - acc: 0.9746 - val_loss: 2.4432e-05 - val_acc: 1.0000
Epoch 2/5
1260/1260 [==============================] - 456s 362ms/step - loss: 0.0090 - acc: 0.9984 - val_loss: 1.5452e-04 - val_acc: 1.0000
Epoch 3/5
1260/1260 [==============================] - 438s 347ms/step - loss: 1.3702e-07 - acc: 1.0000 - val_loss: 8.4489e-05 - val_acc: 1.0000
Epoch 4/5
1260/1260 [==============================] - 446s 354ms/step - loss: 4.2592e-06 - acc: 1.0000 - val_loss: 7.6768e-05 - val_acc: 1.0000
Epoch 5/5
1260/1260 [==============================] - 457s 363ms/step - loss: 0.0017 - acc: 0.9992 - val_loss: 1.1921e-07 - val_acc: 1.0000
It seems to be overfitting a bit.
My predict.py:
def fix_layer0(filename, batch_input_shape, dtype):
with h5py.File(filename, 'r+') as f:
model_config = json.loads(f.attrs['model_config'].decode('utf-8'))
layer0 = model_config['config']['layers'][0]['config']
layer0['batch_input_shape'] = batch_input_shape
layer0['dtype'] = dtype
f.attrs['model_config'] = json.dumps(model_config).encode('utf-8')
fix_layer0('model.h5', [None, 224, 224, 3], 'float32')
model = load_model('model.h5')
for filename in os.listdir(r'v/'):
if filename.endswith(".jpg") or filename.endswith(".ppm") or filename.endswith(".jpeg") or filename.endswith(".png"):
ImageCV = cv2.resize(cv2.imread(os.path.join(TEST_DIR) + filename), (224,224))
ImageCV = cv2.addWeighted(ImageCV,4, cv2.GaussianBlur(ImageCV,(0,0), 224/25), -4, 120) #The same process made when I get data in train
ImageCV = ImageCV.reshape(-1,224,224,3)
print(model.predict(ImageCV))
And the results are strange, because only the first 2 images are of 'class 0'; the others are 'class 1':
[[0.99905235]]
[[0.]]
[[1.]]
[[0.012198]]
[[0.]]
[[1.]]
[[1.6363418e-07]]
[[0.99997246]]
[[0.00433112]]
[[0.9996668]]
[[1.]]
[[6.183685e-08]]
What can I do to improve it? I'm a little confused.
ImageCV = cv2.addWeighted(ImageCV, 4, cv2.GaussianBlur(ImageCV, (0,0), 224/25), -4, 120)
Not sure why you do this for the test data. For validation/test data, usually only normalization is done. During training you also need to apply the same normalization as a final step before feeding the data to the network.
Refer to these examples of fine-tuning VGG16 on a two-class problem (dogs vs. cats):
https://gist.github.com/fchollet/7eb39b44eb9e16e59632d25fb3119975
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
To reduce overfitting, you can apply data augmentation to the training data, i.e. feed the original data plus augmented data (apply operations like flips, zooms, etc.). Keras ImageDataGenerators make the augmentation easy; it is explored in the tutorials above as well, and a small sketch follows this link.
https://keras.io/preprocessing/image/
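A minimal augmentation sketch along those lines (the directory path and parameter values are illustrative assumptions, not taken from the question):
from keras.preprocessing.image import ImageDataGenerator

# Illustrative augmentation settings; tune the ranges for your data.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True)

train_generator = train_datagen.flow_from_directory(
    'data/train',            # hypothetical directory layout
    target_size=(224, 224),
    batch_size=32,
    class_mode='binary')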
First of all, Keras predict will return the scores of the regression (probabilities for each class), and predict_classes will return the most likely class of your prediction. For example, if you classify between cats and dogs, predict could output 0.2 for cat and 0.8 for dog.
So, if you use predict, there should be two values per picture, one for each class.
The reason why you only have one value is that your network only has one output neuron. It should have two, as there are two classes:
top_preds = Dense(2, activation="sigmoid")(top_fc1)
If you now want to see the most likely class rather than the probabilities, you should use predict_classes.
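A toy illustration of the difference (a sketch with a hypothetical, untrained two-class model and random inputs):
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Hypothetical two-class model, just to show the output shapes.
model = Sequential([Dense(2, activation="softmax", input_shape=(4,))])
model.compile(optimizer="adam", loss="categorical_crossentropy")

x = np.random.random((3, 4))
probs = model.predict(x)          # shape (3, 2): one probability per class
print(probs)
print(np.argmax(probs, axis=-1))  # most likely class index, what predict_classes returns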

Keras - Single pixel classification without ImageDataGenerator

This is a newbie question, but I just can't get the simplest Keras experiment to work. I went through a course and its samples work well, so my computer is set up correctly.
I have a few thousand 16x12 images, called "GGYRBGBBW.png", "BBYWBRBBB.png" and so on. The images have a single color with minimal shade differences and are reduced to a single pixel when loading. The first character of each filename serves as the training label (e.g. green images' names start with 'G'). I need to build and train a simple model that indicates one of the 6 possible colors in the image. (This is a first learning step towards a much more complicated project.)
I don't want to use ImageDataGenerator because this project will grow beyond what simple directory-structure categorization can do, and I'll do the image randomization with my own external image generator.
I created a Keras model that looks like this:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_1 (Flatten) (None, 3) 0
_________________________________________________________________
dense_1 (Dense) (None, 6) 24
=================================================================
Total params: 24
Trainable params: 24
Non-trainable params: 0
So, pretty simple - input shape is only three values (RGB), flattened, then 6 output neurons for each color category.
When running the experiments, beginning with random initial values, accuracy stays super low, sometimes 0.
210/210 [==============================] - 0s 2ms/step - loss: 7.2430 - acc: 0.2095 - val_loss: 6.7980 - val_acc: 0.2000
Epoch 2/50
210/210 [==============================] - 0s 10us/step - loss: 7.2411 - acc: 0.2095 - val_loss: 9.6617 - val_acc: 0.2000
Epoch 3/50
210/210 [==============================] - 0s 5us/step - loss: 9.9256 - acc: 0.2095 - val_loss: 9.6598 - val_acc: 0.2000
Epoch 4/50
210/210 [==============================] - 0s 5us/step - loss: 9.9236 - acc: 0.2095 - val_loss: 9.6579 - val_acc: 0.2000
Epoch 5/50
210/210 [==============================] - 0s 10us/step - loss: 9.9217 - acc: 0.2095 - val_loss: 9.6560 - val_acc: 0.2000
Epoch 6/50
210/210 [==============================] - 0s 10us/step - loss: 9.9197 - acc: 0.2095 - val_loss: 9.6541 - val_acc: 0.2000
I must be missing something trivial, but since I'm a noob in this, I can't figure out what. Here's the complete source code:
from __future__ import print_function
import random
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
os.system('cls')

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras import backend as K
from keras.preprocessing.image import img_to_array
import numpy as np

num_classes = 6
batch_size = 300
epochs = 50
Image_width, Image_height = 1, 1
train_dir = './OneColorTrainingData'
colors = "WYBGRO"

def load(numOfPics):
    X = []
    y = []
    print("Reading training images")
    allFiles = os.listdir(train_dir)
    randomFiles = random.choices(allFiles, k=numOfPics)
    for f in randomFiles:
        path = os.path.join(train_dir, f)
        img = keras.preprocessing.image.load_img(path, grayscale=False, target_size=(Image_width, Image_width))
        img = img_to_array(img)
        img /= 255
        X.append(img)
        y.append(colors.index(f[0]))
    y = keras.utils.to_categorical(y, num_classes=num_classes)
    return X, y

Data, labels = load(batch_size)
print(str(len(Data)) + " training files loaded")
print(labels)

model = Sequential()
model.add(Flatten(input_shape=(Image_height, Image_width, 3)))
model.add(Dense(num_classes, activation=K.tanh))
model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.Adadelta(), metrics=['accuracy'])

print(model.input_shape)
print(model.summary())

hist = model.fit(np.array(Data), np.array(labels), batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.3)
score = model.evaluate(np.array(Data), np.array(labels), verbose=0)
print('Test loss: ', score[0])
print('Test accuracy: ', score[1])
Any help would be appreciated.
You have K.tanh as your final Dense layer activation. When doing 1-of-many classification we often use softmax instead, which will produce a probability distribution over the colour classes:
model.add(Dense(num_classes, activation='softmax'))
Now your target labels will be one-hot vectors [0,0,1,0,0,0] indicating which class it is. You can also use the sparse_categorical_crossentropy loss and give the labels as class integers, e.g. 2, as your target. It means the same thing in this context.
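A minimal sketch of that sparse-label variant (assuming the same 1x1 RGB inputs and six classes as in the question):
import numpy as np
from keras.models import Sequential
from keras.layers import Flatten, Dense

num_classes = 6
model = Sequential()
model.add(Flatten(input_shape=(1, 1, 3)))
model.add(Dense(num_classes, activation='softmax'))
# With this loss the targets are plain integer class indices; no to_categorical needed.
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

X = np.random.random((8, 1, 1, 3))        # hypothetical pixel data
y = np.random.randint(0, num_classes, 8)  # e.g. array([2, 0, 5, ...])
model.fit(X, y, epochs=1, verbose=0)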
I'm not positive, but I think you need a hidden layer. You currently have a single affine transformation on your input: h_0 = W_0*x + b_0. This is passed through K.tanh, so your logits are simply y = tanh(h_0). I have a hunch that for a small enough problem like this you may be able to prove that this cannot converge, but I am not certain.
I would just add a second dense layer, and use softmax for your final output as @nuric suggests:
model = Sequential()
model.add(Flatten(input_shape=(Image_height, Image_width, 3)))
model.add(Dense(10, activation=K.tanh))
model.add(Dense(num_classes, activation='softmax'))
OK, I figured it out. I needed to switch to softmax (thanks to all who suggested it), and to use many more epochs, because convergence was random and slow. I didn't think I needed more epochs because nothing changed after the first few, but it turned out that once I used 500 instead of 50, the network managed to learn all the colors one by one and hit 100% accuracy (with a single output layer) almost every time. Thank you all for the help!

MobileNet transfer learning in Keras for object localization extraction - loss computed as NaN

I'm trying to use Keras and its MobileNet implementation to do object localization (output the x/y coordinates of a few features, instead of classes) and I'm running into some likely very basic issue that I can't figure out.
My code looks like this:
# =============================
# Load MobileNet and change the top layers.
model = applications.MobileNet(weights="imagenet",
                               include_top=False,
                               input_shape=(224, 224, 3))

# Freeze all the layers except the very last 5.
for layer in model.layers[:-5]:
    layer.trainable = False

# Adding custom layers at the end, after the last Conv2D layer.
x = model.output
x = GlobalAveragePooling2D()(x)
x = Reshape((1, 1, 1024))(x)
x = Dropout(0.5)(x)
x = Conv2D(1024, (1, 1), activation='relu', padding='same', name='conv_preds')(x)
x = Dense(1024, activation="relu")(x)
# I'd like this to output 4 variables, two pairs of x/y coordinates
x = Dense(PREDICT_SIZE, activation="sigmoid")(x)
predictions = Reshape((PREDICT_SIZE,))(x)

# =============================
# Create the new final model.
model_final = Model(input = model.input, output = predictions)

def custom_loss(y_true, y_pred):
    '''Trying to compute the Euclidean distance as a loss function'''
    return K.sqrt(K.sum(K.square(y_true - y_pred), axis=-1))

model_final.compile(loss = custom_loss,
                    optimizer = optimizers.adam(lr=0.0001),
                    metrics=["accuracy"])
With this model, I then load the data and try to train it.
x_train, y_train, x_val, y_val = load_data(DATASET_DIR)
# This load_data is my own implementation. It returns the images
# as tensors.
# ==> x_train[0].shape = (224, 224, 3)
#
# y_train and y_val look like this:
# ==> y_train[0] = [ 0.182 -0.0933 0.072 -0.0453]
#
# holding values in the [0, 1] interval for where the pixel
# is relative to the width/height of the image.

model_final.fit(x_train, y_train,
                batch_size=batch_size, epochs=5, shuffle=False,
                validation_data=(x_val, y_val))
Unfortunately, when I run this model to train, I get something like this:
Train on 45 samples, validate on 5 samples
Epoch 1/5
16/45 [=========>....................] - ETA: 2s - loss: nan - acc: 0.0625
32/45 [====================>.........] - ETA: 1s - loss: nan - acc: 0.0312
45/45 [==============================] - 4s - loss: nan - acc: 0.0222 - val_loss: nan - val_acc: 0.0000e+00
Epoch 2/5
16/45 [=========>....................] - ETA: 2s - loss: nan - acc: 0.0625
32/45 [====================>.........] - ETA: 1s - loss: nan - acc: 0.0312
45/45 [==============================] - 4s - loss: nan - acc: 0.0222 - val_loss: nan - val_acc: 0.0000e+00
Epoch 3/5
I'm at a loss about why my loss value is "nan". I must be doing something wrong, and I've tried to change everything - the loss function, the shape of the output... but I can't figure out what I'm doing wrong.
Any help would be appreciated!
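Aside: one generic thing worth checking with a custom loss like this, though not necessarily the cause here, is that K.sqrt has an infinite gradient at exactly zero, which can itself produce NaNs during training. A common safeguard is a small epsilon inside the root, as in this sketch:
def custom_loss(y_true, y_pred):
    # Euclidean distance with an epsilon so the sqrt gradient stays finite at 0.
    return K.sqrt(K.sum(K.square(y_true - y_pred), axis=-1) + K.epsilon())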
UPDATE: it seems like the issue is in the way I load_data.
If I create the image data like this, it fails and results in loss: nan:
i = pil_image.open(img_filename)
img = image.load_img(img_filename, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = keras.applications.mobilenet.preprocess_input(x)
x_train = np.append(x_train, x, axis=0)
but if I do something trivial like this, fit works just fine and computes real values for the loss:
x_train = np.random.random((100, 224, 224, 3))
sigh I wonder what's happening...
UPDATE #2: I figured out what the issue was
Documenting this here in case it helps anybody.
The proper way to generate the input tensors for MobileNet is this:
test_img = []
for i in range(len(test)):
    temp_img = image.load_img(test_path + test['filename'][i], target_size=(224, 224))
    temp_img = image.img_to_array(temp_img)
    test_img.append(temp_img)
test_img = np.array(test_img)
test_img = preprocess_input(test_img)
Notice how converting to a numpy.array and running preprocess_input happen on the whole batch of images. Doing it image by image did not seem to work (which is what I was doing before).
Hope this helps somebody someday.

Keras accuracy does not change

I have a few thousand audio files and I want to classify them using Keras and Theano. So far, I have generated a 28x28 spectrogram (bigger is probably better, but I am just trying to get the algorithm to work at this point) of each audio file and read the images into a matrix. In the end I get this big image matrix, which I feed into the network for image classification.
In a tutorial I found this mnist classification code:
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense
from keras.utils import np_utils
batch_size = 128
nb_classes = 10
nb_epochs = 2
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255
print(X_train.shape[0], "train samples")
print(X_test.shape[0], "test samples")
y_train = np_utils.to_categorical(y_train, nb_classes)
y_test = np_utils.to_categorical(y_test, nb_classes)
model = Sequential()
model.add(Dense(output_dim = 100, input_dim = 784, activation= "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = nb_classes, activation = "softmax"))
model.compile(optimizer = "adam", loss = "categorical_crossentropy")
model.fit(X_train, y_train, batch_size = batch_size, nb_epoch = nb_epochs, show_accuracy = True, verbose = 2, validation_data = (X_test, y_test))
score = model.evaluate(X_test, y_test, show_accuracy = True, verbose = 0)
print("Test score: ", score[0])
print("Test accuracy: ", score[1])
This code runs, and I get the result as expected:
(60000L, 'train samples')
(10000L, 'test samples')
Train on 60000 samples, validate on 10000 samples
Epoch 1/2
2s - loss: 0.2988 - acc: 0.9131 - val_loss: 0.1314 - val_acc: 0.9607
Epoch 2/2
2s - loss: 0.1144 - acc: 0.9651 - val_loss: 0.0995 - val_acc: 0.9673
('Test score: ', 0.099454972004890438)
('Test accuracy: ', 0.96730000000000005)
Up to this point everything runs perfectly, however when I apply the above algorithm to my dataset, accuracy gets stuck.
My code is as follows:
import os
import pandas as pd
from sklearn.cross_validation import train_test_split
from keras.models import Sequential
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers.core import Dense, Activation, Dropout, Flatten
from keras.utils import np_utils
import AudioProcessing as ap
import ImageTools as it

batch_size = 128
nb_classes = 2
nb_epoch = 10

for i in range(20):
    print "\n"

# Generate spectrograms if necessary
if(len(os.listdir("./AudioNormalPathalogicClassification/Image")) > 0):
    print "Audio files are already processed. Skipping..."
else:
    print "Generating spectrograms for the audio files..."
    ap.audio_2_image("./AudioNormalPathalogicClassification/Audio/", "./AudioNormalPathalogicClassification/Image/", ".wav", ".png", (28,28))

# Read the result csv
df = pd.read_csv('./AudioNormalPathalogicClassification/Result/result.csv', header = None)
df.columns = ["RegionName","IsNormal"]
bool_mapping = {True : 1, False : 0}
nb_classes = 2

for col in df:
    if(col == "RegionName"):
        a = 3
    else:
        df[col] = df[col].map(bool_mapping)

y = df.iloc[:,1:].values
y = np_utils.to_categorical(y, nb_classes)

# Load images into memory
print "Loading images into memory..."
X = it.load_images("./AudioNormalPathalogicClassification/Image/", ".png")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 0)
X_train = X_train.reshape(X_train.shape[0], 784)
X_test = X_test.reshape(X_test.shape[0], 784)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")
X_train /= 255
X_test /= 255

print("X_train shape: " + str(X_train.shape))
print(str(X_train.shape[0]) + " train samples")
print(str(X_test.shape[0]) + " test samples")

model = Sequential()
model.add(Dense(output_dim = 100, input_dim = 784, activation = "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = 200, activation = "relu"))
model.add(Dense(output_dim = nb_classes, activation = "softmax"))
model.compile(loss = "categorical_crossentropy", optimizer = "adam")
print model.summary()
model.fit(X_train, y_train, batch_size = batch_size, nb_epoch = nb_epoch, show_accuracy = True, verbose = 1, validation_data = (X_test, y_test))
score = model.evaluate(X_test, y_test, show_accuracy = True, verbose = 1)
print("Test score: ", score[0])
print("Test accuracy: ", score[1])
AudioProcessing.py
import os
import scipy as sp
import scipy.io.wavfile as wav
import matplotlib.pylab as pylab
import Image

def save_spectrogram_scipy(source_filename, destination_filename, size):
    dt = 0.0005
    NFFT = 1024
    Fs = int(1.0/dt)
    fs, audio = wav.read(source_filename)
    if(len(audio.shape) >= 2):
        audio = sp.mean(audio, axis = 1)
    fig = pylab.figure()
    ax = pylab.Axes(fig, [0,0,1,1])
    ax.set_axis_off()
    fig.add_axes(ax)
    pylab.specgram(audio, NFFT = NFFT, Fs = Fs, noverlap = 900, cmap="gray")
    pylab.savefig(destination_filename)
    img = Image.open(destination_filename).convert("L")
    img = img.resize(size)
    img.save(destination_filename)
    pylab.clf()
    del img

def audio_2_image(source_directory, destination_directory, audio_extension, image_extension, size):
    nb_files = len(os.listdir(source_directory))
    count = 0
    for file in os.listdir(source_directory):
        if file.endswith(audio_extension):
            destinationName = file[:-4]
            save_spectrogram_scipy(source_directory + file, destination_directory + destinationName + image_extension, size)
            count += 1
            print ("Generating spectrogram for files " + str(count) + " / " + str(nb_files) + ".")
ImageTools.py
import os
import numpy as np
import matplotlib.image as mpimg

def load_images(source_directory, image_extension):
    image_matrix = []
    nb_files = len(os.listdir(source_directory))
    count = 0
    for file in os.listdir(source_directory):
        if file.endswith(image_extension):
            with open(source_directory + file, "r+b") as f:
                img = mpimg.imread(f)
                img = img.flatten()
                image_matrix.append(img)
                del img
            count += 1
            #print ("File " + str(count) + " / " + str(nb_files) + " loaded.")
    return np.asarray(image_matrix)
So I run the above code and receive:
Audio files are already processed. Skipping...
Loading images into memory...
X_train shape: (2394L, 784L)
2394 train samples
1027 test samples
--------------------------------------------------------------------------------
Initial input shape: (None, 784)
--------------------------------------------------------------------------------
Layer (name) Output Shape Param #
--------------------------------------------------------------------------------
Dense (dense) (None, 100) 78500
Dense (dense) (None, 200) 20200
Dense (dense) (None, 200) 40200
Dense (dense) (None, 2) 402
--------------------------------------------------------------------------------
Total params: 139302
--------------------------------------------------------------------------------
None
Train on 2394 samples, validate on 1027 samples
Epoch 1/10
2394/2394 [==============================] - 0s - loss: 0.6898 - acc: 0.5455 - val_loss: 0.6835 - val_acc: 0.5716
Epoch 2/10
2394/2394 [==============================] - 0s - loss: 0.6879 - acc: 0.5522 - val_loss: 0.6901 - val_acc: 0.5716
Epoch 3/10
2394/2394 [==============================] - 0s - loss: 0.6880 - acc: 0.5522 - val_loss: 0.6842 - val_acc: 0.5716
Epoch 4/10
2394/2394 [==============================] - 0s - loss: 0.6883 - acc: 0.5522 - val_loss: 0.6829 - val_acc: 0.5716
Epoch 5/10
2394/2394 [==============================] - 0s - loss: 0.6885 - acc: 0.5522 - val_loss: 0.6836 - val_acc: 0.5716
Epoch 6/10
2394/2394 [==============================] - 0s - loss: 0.6887 - acc: 0.5522 - val_loss: 0.6832 - val_acc: 0.5716
Epoch 7/10
2394/2394 [==============================] - 0s - loss: 0.6882 - acc: 0.5522 - val_loss: 0.6859 - val_acc: 0.5716
Epoch 8/10
2394/2394 [==============================] - 0s - loss: 0.6882 - acc: 0.5522 - val_loss: 0.6849 - val_acc: 0.5716
Epoch 9/10
2394/2394 [==============================] - 0s - loss: 0.6885 - acc: 0.5522 - val_loss: 0.6836 - val_acc: 0.5716
Epoch 10/10
2394/2394 [==============================] - 0s - loss: 0.6877 - acc: 0.5522 - val_loss: 0.6849 - val_acc: 0.5716
1027/1027 [==============================] - 0s
('Test score: ', 0.68490593621422047)
('Test accuracy: ', 0.57156767283349563)
I tried changing the network, adding more epochs, but I always get the same result no matter what. I don't understand why I am getting the same result.
Any help would be appreciated. Thank you.
Edit:
I found a mistake where pixel values were not read correctly. I fixed ImageTools.py as below:
import os
import numpy as np
from scipy.misc import imread

def load_images(source_directory, image_extension):
    image_matrix = []
    nb_files = len(os.listdir(source_directory))
    count = 0
    for file in os.listdir(source_directory):
        if file.endswith(image_extension):
            with open(source_directory + file, "r+b") as f:
                img = imread(f)
                img = img.flatten()
                image_matrix.append(img)
                del img
            count += 1
            #print ("File " + str(count) + " / " + str(nb_files) + " loaded.")
    return np.asarray(image_matrix)
Now I actually get grayscale pixel values from 0 to 255, so dividing by 255 now makes sense. However, I still get the same result.
The most likely reason is that the optimizer is not suited to your dataset. Here is a list of Keras optimizers from the documentation.
I recommend you first try SGD with default parameter values. If it still doesn't work, divide the learning rate by 10. Do that a few times if necessary. If your learning rate reaches 1e-6 and it still doesn't work, then you have another problem.
In summary, replace this line:
model.compile(loss = "categorical_crossentropy", optimizer = "adam")
with this:
from keras.optimizers import SGD
opt = SGD(lr=0.01)
model.compile(loss = "categorical_crossentropy", optimizer = opt)
and change the learning rate a few times if it doesn't work.
If it was the problem, you should see the loss getting lower after just a few epochs.
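If you want to automate that sweep, here is a small sketch (build_model is a hypothetical helper that rebuilds your uncompiled network from scratch on each call):
from keras.optimizers import SGD

for lr in [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]:
    model = build_model()  # hypothetical: returns a fresh, uncompiled model
    model.compile(loss="categorical_crossentropy", optimizer=SGD(lr=lr), metrics=["accuracy"])
    # `epochs` is called `nb_epoch` in very old Keras versions
    model.fit(X_train, y_train, epochs=5, verbose=0, validation_data=(X_test, y_test))
    print(lr, model.evaluate(X_test, y_test, verbose=0))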
Another solution that I do not see mentioned here, but which caused a similar problem for me, was the activation function of the last neuron, especially if it is relu rather than something bounded like sigmoid.
In other words, it might help to use a sigmoid-like activation function in the last layer.
Last layer:
model.add(keras.layers.Dense(1, activation='relu'))
Output:
7996/7996 [==============================] - 1s 76us/sample - loss: 6.3474 - accuracy: 0.5860
Epoch 2/30
7996/7996 [==============================] - 0s 58us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 3/30
7996/7996 [==============================] - 0s 58us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 4/30
7996/7996 [==============================] - 0s 57us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 5/30
7996/7996 [==============================] - 0s 58us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 6/30
7996/7996 [==============================] - 0s 60us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 7/30
7996/7996 [==============================] - 0s 57us/sample - loss: 6.3473 - accuracy: 0.5860
Epoch 8/30
7996/7996 [==============================] - 0s 57us/sample - loss: 6.3473 - accuracy: 0.5860
Now I used a non linear activation function:
model.add(keras.layers.Dense(1, activation='sigmoid'))
Output:
7996/7996 [==============================] - 1s 74us/sample - loss: 0.7663 - accuracy: 0.5899
Epoch 2/30
7996/7996 [==============================] - 0s 59us/sample - loss: 0.6243 - accuracy: 0.5860
Epoch 3/30
7996/7996 [==============================] - 0s 56us/sample - loss: 0.5399 - accuracy: 0.7580
Epoch 4/30
7996/7996 [==============================] - 0s 56us/sample - loss: 0.4694 - accuracy: 0.7905
Epoch 5/30
7996/7996 [==============================] - 0s 57us/sample - loss: 0.4363 - accuracy: 0.8040
Epoch 6/30
7996/7996 [==============================] - 0s 60us/sample - loss: 0.4139 - accuracy: 0.8099
Epoch 7/30
7996/7996 [==============================] - 0s 58us/sample - loss: 0.3967 - accuracy: 0.8228
Epoch 8/30
7996/7996 [==============================] - 0s 61us/sample - loss: 0.3826 - accuracy: 0.8260
This is not directly a solution to the original question, but as this page is the first Google result when searching for this problem, it might benefit someone.
If the accuracy is not changing, it means the optimizer has found a local minimum for the loss. This may be an undesirable minimum. One common local minimum is to always predict the class with the most data points. You should use weighting on the classes to avoid this minimum.
from sklearn.utils import compute_class_weight
classWeight = compute_class_weight('balanced', outputLabels, outputs)
classWeight = dict(enumerate(classWeight))
model.fit(X_train, y_train, batch_size = batch_size, nb_epoch = nb_epochs, show_accuracy = True, verbose = 2, validation_data = (X_test, y_test), class_weight=classWeight)
After some examination, I found that the issue was the data itself. It was very dirty: the same input had 2 different outputs, creating confusion. After cleaning up the data, my accuracy goes up to 69%. Still not good enough, but at least I can now work my way up from here now that the data is clean.
I used the below code to test:
import os
import sys
import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.layers.core import Dense, Activation, Dropout, Flatten
from keras.utils import np_utils

sys.path.append("./")
import AudioProcessing as ap
import ImageTools as it

# input image dimensions
img_rows, img_cols = 28, 28
dim = 1
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3
batch_size = 128
nb_classes = 2
nb_epoch = 200

for i in range(20):
    print "\n"

## Generate spectrograms if necessary
if(len(os.listdir("./AudioNormalPathalogicClassification/Image")) > 0):
    print "Audio files are already processed. Skipping..."
else:
    # Read the result csv
    df = pd.read_csv('./AudioNormalPathalogicClassification/Result/AudioNormalPathalogicClassification_result.csv', header = None, encoding = "utf-8")
    df.columns = ["RegionName","Filepath","IsNormal"]
    bool_mapping = {True : 1, False : 0}
    for col in df:
        if(col == "RegionName" or col == "Filepath"):
            a = 3
        else:
            df[col] = df[col].map(bool_mapping)
    region_names = df.iloc[:,0].values
    filepaths = df.iloc[:,1].values
    y = df.iloc[:,2].values
    # Generate spectrograms and make a new CSV file
    print "Generating spectrograms for the audio files..."
    result = ap.audio_2_image(filepaths, region_names, y, "./AudioNormalPathalogicClassification/Image/", ".png", (img_rows,img_cols))
    df = pd.DataFrame(data = result)
    df.to_csv("NormalVsPathalogic.csv", header = False, index = False, encoding = "utf-8")

# Load images into memory
print "Loading images into memory..."
df = pd.read_csv('NormalVsPathalogic.csv', header = None, encoding = "utf-8")
y = df.iloc[:,0].values
y = np_utils.to_categorical(y, nb_classes)
y = np.asarray(y)
X = df.iloc[:,1:].values
X = np.asarray(X)
X = X.reshape(X.shape[0], dim, img_rows, img_cols)
X = X.astype("float32")
X /= 255
print X.shape

model = Sequential()
model.add(Convolution2D(64, nb_conv, nb_conv,
                        border_mode='valid',
                        input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(32, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adadelta')
print model.summary()
model.fit(X, y, batch_size = batch_size, nb_epoch = nb_epoch, show_accuracy = True, verbose = 1)
Check out this one:
sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss = "categorical_crossentropy",
              optimizer = sgd,
              metrics = ['accuracy'])
Check out the documentation
I had better results with MNIST
By mistake I had added a softmax at the end instead of sigmoid. Try the latter. It worked as expected when I did this. For a single output neuron, softmax always gives a value of 1, and this is what had happened.
I faced a similar issue. One-hot encoding the target variable, using np_utils in Keras, solved the issue of accuracy and validation loss being stuck. Using weights for balancing the target classes further improved performance.
Solution:
from keras.utils.np_utils import to_categorical
y_train = to_categorical(y_train)
y_val = to_categorical(y_val)
I had the same problem as you. My solution was a loop instead of epochs:
for i in range(10):
    history = model.fit_generator(generator=training_generator,
                                  validation_data=validation_generator,
                                  use_multiprocessing=True,
                                  workers=6,
                                  epochs=1)
You can also save the model each epoch, so you can pause the training after any epoch you want:
for i in range(10):
    history = model.fit_generator(generator=training_generator,
                                  validation_data=validation_generator,
                                  use_multiprocessing=True,
                                  workers=6,
                                  epochs=1)
    # save the model after each epoch
    model.save('drive/My Drive/vggnet10epochs.h5')
    model = load_model('drive/My Drive/vggnet10epochs.h5')
I got a 13% accuracy increase using the 'sigmoid' activation:
model = Sequential()
model.add(Dense(3072, input_shape=(3072,), activation="sigmoid"))
model.add(Dense(512, activation="sigmoid"))
model.add(Dense(1, activation="sigmoid"))
Or you can also test the following, with 'relu' in the first and hidden layers:
model = Sequential()
model.add(Dense(3072, input_shape=(3072,), activation="relu"))
model.add(Dense(512, activation="sigmoid"))
model.add(Dense(1, activation="sigmoid"))
As mentioned above, the problem mainly arises from the type of optimizer chosen. However, it can also be caused by stacking 2 Dense layers with the same activation function (softmax, for example).
In this case, the NN finds a local minimum and is not able to descend further from that point, hovering around the same acc (val_acc) values.
Hope it helps out.
I had a similar problem. I had binary classes labeled 1 and 2. After testing different kinds of optimizers and activation functions, I found that the root of the problem was my labeling of the classes. In other words, I changed the labels to 0 and 1 instead of 1 and 2, and the problem was solved!
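The remapping itself is a one-liner (a sketch assuming the labels sit in a NumPy array y):
import numpy as np

y = np.array([1, 2, 2, 1, 2])  # hypothetical labels in {1, 2}
y = y - 1                      # now in {0, 1}, which binary targets expect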
I faced the same problem for multi-class classification. Try changing the optimizer; by default it is Adam, change it to SGD:
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
You can also try different activation functions, e.g. relu, sigmoid, softmax, softplus, etc.
Some important links:
Optimizers
Activations
As pointed out by others, the optimizer probably doesn't suit your data/model, which is stuck in a local minimum. A neural network should at least be able to overfit the data (training_acc close to 1).
I once had a similar problem. I solved it by trying different optimizers (in my case, switching from SGD to RMSprop).
In my case, the problem was binary and I was using the 'softmax' activation function, which doesn't work; I changed to 'sigmoid' and it works properly for me.
I had exactly the same problem: validation loss and accuracy remained the same through the epochs. I increased the batch size 10x, reduced the learning rate by 100x, etc. It did not work.
My last try, inspired by monolingual's and Ranjab's answers, worked.
My solution was to add BatchNormalization AND arrange the order as below:
Conv - DropOut - BatchNorm - Activation - Pool,
as recommended in "Ordering of batch normalization and dropout?" (a sketch of one such block follows).
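A one-block sketch of that ordering (layer sizes and input shape are illustrative):
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, 3, input_shape=(28, 28, 1)),  # Conv (no activation yet)
    layers.Dropout(0.25),                           # DropOut
    layers.BatchNormalization(),                    # BatchNorm
    layers.Activation('relu'),                      # Activation
    layers.MaxPooling2D(2),                         # Pool
])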
I know this is an old question, but as of today (14/06/2021) the comment from @theTechGuy works well on tf 2.3. The code is:
from tensorflow.keras.optimizers import SGD
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss = "categorical_crossentropy",
              optimizer = sgd,
              metrics = ['accuracy'])
I tried playing a lot with the optimizers and activation functions, but the only thing that worked was BatchNormalization. And I guess it is good practice too.
You can import it as:
from tensorflow.keras.layers import BatchNormalization
and simply add it before each hidden layer:
model.add(BatchNormalization())
I had the same problem, but in my case it was caused by a non-normalized column in my data. This column had huge values. Fixing that solved it for me: I just scaled it to values around 0 and 1.
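For instance, min-max scaling such a column to [0, 1] (a sketch assuming a NumPy feature matrix X with the problematic column at index j; both names are hypothetical):
import numpy as np

X = np.array([[1.0, 5000.0], [2.0, 12000.0], [3.0, 800.0]])  # hypothetical data
j = 1                                                        # hypothetical column index
col = X[:, j]
X[:, j] = (col - col.min()) / (col.max() - col.min())        # rescaled to [0, 1]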
I had the same problem. My solution was to change the last-layer activation function from 'softmax' to 'sigmoid', since I was dealing with a binary classification problem:
model.add(layers.Dense(1, activation="sigmoid"))
