Tensorflow CNN transfer learning has constant low validation accuracy

Tensorflow CNN transfer learning has constant low validation accuracy - python

Hello I have been trying to create two tensorflow models to experiment with transfer learning. I have a trainned a cnn model for lung xray images for pneumonia(2 classes) by using the kaggle chest x-ray dataset .
Here is my code
import tensorflow as tf
import numpy as np
from tensorflow import keras
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing import image
import matplotlib.pyplot as plt
gen = ImageDataGenerator(rescale=1./255)
train_data = gen.flow_from_directory("/Users/saibalaji/Downloads/chest_xray/train",target_size=(500,500),batch_size=32,class_mode='binary')
test_data = gen.flow_from_directory("/Users/saibalaji/Downloads/chest_xray/test",target_size=(500,500),batch_size=32,class_mode='binary')
model = keras.Sequential()
# Convolutional layer and maxpool layer 1
model.add(keras.layers.Conv2D(32,(3,3),activation='relu',input_shape=(500,500,3)))
model.add(keras.layers.MaxPool2D(2,2))
# Convolutional layer and maxpool layer 2
model.add(keras.layers.Conv2D(64,(3,3),activation='relu'))
model.add(keras.layers.MaxPool2D(2,2))
# Convolutional layer and maxpool layer 3
model.add(keras.layers.Conv2D(128,(3,3),activation='relu'))
model.add(keras.layers.MaxPool2D(2,2))
# Convolutional layer and maxpool layer 4
model.add(keras.layers.Conv2D(128,(3,3),activation='relu'))
model.add(keras.layers.MaxPool2D(2,2))
# This layer flattens the resulting image array to 1D array
model.add(keras.layers.Flatten())
# Hidden layer with 512 neurons and Rectified Linear Unit activation function
model.add(keras.layers.Dense(512,activation='relu'))
# Output layer with single neuron which gives 0 for Cat or 1 for Dog
#Here we use sigmoid activation function which makes our model output to lie between 0 and 1
model.add(keras.layers.Dense(1,activation='sigmoid'))
hist = model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
model.fit_generator(train_data,
steps_per_epoch = 163,
epochs = 4,
validation_data = test_data
)
I have saved the model in .h5 format.
Then I created a new notebook loaded data from kaggle for alzheimer's disease and loaded my saved pneumonia model. Copied its layer to a new model except last layer then Freezed all the layers in the new model as non trainable. Then added a output dense layer with 4 neurons for 4 classes. Then trainned only the last layer for 5 epochs. But the problem is val accuaracy remains at 35% constant. How can I improve it.
Here is my code for alzeihmers model
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing import image
import matplotlib.pyplot as plt
import numpy as np
gen = ImageDataGenerator(rescale=1./255)
traindata = datagen.flow_from_directory('/Users/saibalaji/Documents/TensorFlowProjects/ad/train',target_size=(500,500),batch_size=32)
testdata = datagen.flow_from_directory('/Users/saibalaji/Documents/TensorFlowProjects/ad/test',target_size=(500,500),batch_size=32)
model = keras.models.load_model('pn.h5')
nmodel = keras.models.Sequential()
#add all layers except last one
for layer in model.layers[0:-1]:
nmodel.add(layer)
for layer in nmodel.layers:
layer.trainable = False
nmodel.add(keras.layers.Dense(units=4,name='dense_last'))
hist = nmodel.compile(optimizer = tf.keras.optimizers.Adam(learning_rate=0.002),loss = 'categorical_crossentropy', metrics = ['accuracy'])
nmodel.fit(x=traindata,validation_data=testdata,epochs=5,steps_per_epoch=160)
Here is my prediction code.
class_labels = []
for class_label,class_mode in traindata.class_indices.items():
print(class_label)
class_labels.append(class_label)
def predictimage(filepath):
test_image = image.load_img(path=filepath,target_size=(500,500))
image_array = image.img_to_array(test_image)
image_array = image_array / 255
print(image_array.shape)
image_array_exp = np.expand_dims(image_array,axis=0)
result = nmodel.predict(image_array_exp)
print(result)
plt.imshow(test_image)
plt.xlabel(class_labels[np.argmax(result)])
I also noticed that it is predicting only two classes even though I have changed the last layer to 4 neurons, changed the loss function.

It looks like you didn't add an activation function in your last layer.
Maybe it would be helpful to use the softmax activation function.
nmodel.add(keras.layers.Dense(units=4, activation= "softmax", name='dense_last'))

Related

Why the accuracy is high but the result for confusion matrix is bad?

I have trained a vgg16 model with a total of 1000 images for 5 classes (200 images for each class). I have used data augmentation, stratified K-fold, and dropout to train the model. The train accuracy and val accuracy is good. However, when i do prediction on the trained model with test dataset, the result of confusion matrix is not compatible with the train accuracy.
[Train & Val accuracy[Classification reportConfusion Matrix](https://i.stack.imgur.com/OIX3O.png)](https://i.stack.imgur.com/MAPXC.png)
VGG model:
def create_model():
# (CNN) is a multilayered neural network with a special architecture to detect complex features in data.
# VGG16 = Visual Geometry Group
# 16 = 16 refers to it has 16 layers that have weights
# VGG16 have 128 million parameters
# 3x3 filter with stride 1
# maxpool layer of 2x2 filter of stride 2
# Conv-1 Layer has 64 number of filters
# Conv-2 has 128 filters
# Conv-3 has 256 filters
# Conv 4 and Conv 5 has 512 filters
# import library
from tensorflow.keras.models import Model
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Conv2D
# number of species
NO_CLASSES = 5
# load the VGG16 model as the base model for training
# exclude the fully connected layer
base_model = VGG16(include_top=False, input_shape=(224, 224, 3))
# add layers
x = base_model.output
x = Conv2D(64, (3,3), activation = 'relu')(x) # output layer will be 64x64, (3x3) kernel
x = GlobalAveragePooling2D()(x) # use average pooling as we dont have min pooling which selects the darkest pixels from image (our dataset here is white background)
# add dense layers so that the model can learn more complex functions and classify for netter results
# Dense layer = a layer that is deeply connected with its preceding layer
x = Flatten()(x) # for feeding into fully connected layer as fully connected layer only accept 1D
x = Dense(1024,activation='relu')(x)
x = Dense(1024,activation='relu')(x) # dense layer 2
x = Dense(512,activation='relu')(x) # dense layer 3
x = Dropout(0.2)(x) # reduce dependency between neurons
# final layer with softmax activation for multiclass classification
preds = Dense(NO_CLASSES, activation='softmax')(x)
# layers of the VGG16 model are frozen, bcuz we dont want their weights to changes during model training
# create a new model with the base model's original input and the new model's output
model = Model(inputs = base_model.input, outputs = preds)
# don't train the first 19 layers - 0..18
for layer in model.layers[:19]:
layer.trainable=False
# train the rest of the layers - 19 onwards
for layer in model.layers[19:]:
layer.trainable=True
# compile the model
model.compile(optimizer='Adam', # Adam optimizer -- training cost (low) and performance (high)
loss='categorical_crossentropy', # for multi-class(classes are mutually exclusive) problem
metrics=['accuracy']) # calculate accuracy
return model
Stratified K fold & model fit
from sklearn.model_selection import StratifiedKFold
from statistics import mean, stdev
EPOCHS = 6
histories = []
kfold = StratifiedKFold(n_splits = 5, shuffle=True, random_state=123)
for f, (trn_ind, val_ind) in enumerate(kfold.split(train_dataset.Image_path, train_dataset.labels)):
print(); print("#"*50)
print("Fold: ",f+1)
print("#"*50)
train_ds = datagen.flow_from_dataframe(train_dataset.loc[trn_ind,:],
x_col='Image_path', y_col='labels',
target_size=(width,height),
class_mode = 'categorical', color_mode = 'rgb',
batch_size = 16, shuffle = True)
val_ds = datagen.flow_from_dataframe(train_dataset.loc[val_ind,:],
x_col='Image_path', y_col='labels',
target_size=(width,height),
class_mode = 'categorical', color_mode = 'rgb',
batch_size = 16, shuffle = True)
# Define start and end epoch for each folds
fold_start_epoch = f * EPOCHS
fold_end_epoch = EPOCHS * (f+1)
step_size_train = train_ds.n // train_ds.batch_size
# fit
history=model.fit(train_ds,
initial_epoch=fold_start_epoch ,
epochs=fold_end_epoch,
validation_data=val_ds,
shuffle=True,
steps_per_epoch=step_size_train,
verbose=1)
# store history for each folds
histories.append(history)
Does this happened is because of the dataset itself or the coding problem? I hope to find the mistake.

Cannot convert a symbolic input/output to a numpy array

I am trying to run a deep learning code that I found in a tutorial in order to familiarise myself with resnet50, keras and tensorflow with python 3.7. When I run my code, I get the following error:
TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
I tried to use the following fix as mentioned on stack overflow:
from tensorflow.python.framework.ops import disable_eager_execution
disable_eager_execution()
Without any success. My full code can be seen below:
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator
import numpy as np
from keras.preprocessing import image
from sklearn.linear_model import LogisticRegression
from tensorflow.python.framework.ops import disable_eager_execution
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# Download the architecture of ResNet50 with ImageNet weights
base_model = ResNet50(include_top=False, weights='imagenet')
# Taking the output of the last convolution block in ResNet50
x = base_model.output
# Adding a Global Average Pooling layer
x = GlobalAveragePooling2D()(x)
# Adding a fully connected layer having 1024 neurons
x = Dense(1024, activation='relu')(x)
# Adding a fully connected layer having 2 neurons which will
# give probability of image having either dog or cat
predictions = Dense(2, activation='softmax')(x)
# Model to be trained
model = Model(inputs=base_model.input, outputs=predictions)
# Training only top layers i.e. the layers which we have added in the end
for layer in base_model.layers:
layer.trainable = False
# Compiling the model
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy', metrics = ['accuracy'],
experimental_run_tf_function=False)
# Creating objects for image augmentations
train_datagen = ImageDataGenerator(rescale = 1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
# Proving the path of training and test dataset
# Setting the image input size as (224, 224)
# We are using class mode as binary because there are only two classes in our data
training_set = train_datagen.flow_from_directory('training_set',
target_size = (224, 224),
batch_size = 32,
class_mode = 'categorical')
test_set = test_datagen.flow_from_directory('test_set',
target_size = (224, 224),
batch_size = 32,
class_mode = 'categorical')
# Training the model for 5 epochs
model.fit_generator(training_set,
steps_per_epoch = 8000,
epochs = 5,
validation_data = test_set,
validation_steps = 2000)
# We will try to train the last stage of ResNet50
for layer in base_model.layers[0:143]:
layer.trainable = False
for layer in base_model.layers[143:]:
layer.trainable = True
# Training the model for 10 epochs
model.fit_generator(training_set,
steps_per_epoch = 8000,
epochs = 10,
validation_data = test_set,
validation_steps = 2000)
# Saving the weights in the current directory
model.save_weights("resnet50_weights.h5")
# Predicting the final result of image
test_image = image.load_img('cat_or_dog_test.jpg', target_size = (224, 224))
test_image = image.img_to_array(test_image)\
# Expanding the 3-d image to 4-d image.
# The dimensions will be Batch, Height, Width, Channel
test_image = np.expand_dims(test_image, axis = 0)
# Predicting the final class
classifier = LogisticRegression()
result = classifier.predict(test_image)
# Fetching the class labels
labels = training_set.class_indices
labels = list(labels.items())
# Printing the final label
for label, i in labels:
if i == result:
print("The test image has: ", label)
break

I had the same problem when using: from keras import Input;
But, when I change to: from tensorflow.keras import Input, it works!

I assume that the following line is where the error occurs:
test_image = np.expand_dims(test_image, axis = 0)
The reason is probably that you try to apply a numpy function to a tensor. Don't do that. Either convert your tensor to numpy or use a function that work on tensors. Normally, I'd say prefer the second option over the first one (it will prevent unnecessary conversions and make your code more efficient). In your case you will need to convert your tensor to numpy because you are using sklearn afterward:
test_image = np.expand_dims(test_image.numpy(), axis=0)

I am new to DL and I received a similar error a nd the following has helped me.
Try:
del base_model
Before:
base_model = ResNet50(include_top=False, weights='imagenet')
and also simultaneously:
Try:
del model
Before:
model = Model(inputs=base_model.input, outputs=predictions)
Please let me know if this has helped you or hasn't :) .

Try using tensorflow.keras.something instead of keras.something.
It worked for me.
Ofcourse you have to also import tensorlfow

'Concatenate' object has no attribute 'Flatten'

I'm trying to extract features from a pretrained model and use on my own model. I can successfully instantiate the Inveption V3 Model and save the outputs tu use as inputs for my model, but as i try to use it i get error. I tried to delete the Flatten layer but looks like the problem isnt this. I think the problem is about the last_output but have no clue on how to solve it.
The code:
#%% Imports.
import tensorflow as tf
from tensorflow.keras.layers import Dense, Conv2D, Flatten, Dropout, MaxPooling2D
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras import layers, Model
from tensorflow.keras.applications.inception_v3 import InceptionV3
import os, signal
import numpy as np
#%% Instatiate an Inception V3 model
url = "https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5" # Get the weights from the pretrained model
local_weights_file = tf.keras.utils.get_file("inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5", origin = url, extract = True)
pre_trained_model = InceptionV3(input_shape=(150, 150, 3), include_top=False, weights=None) # include_top=False argument, we load a network that doesn't include
pre_trained_model.load_weights(local_weights_file) # the classification layers at the top—ideal for feature extraction.
# Make the model non-trainable, since we will only use it for feature extraction; we won't update the weights of the pretrained model during training.
for layers in pre_trained_model.layers:
layers.trainable = False
# The layer we will use for feature extraction in Inception v3 is called mixed7. It is not the bottleneck of the network, but we are using it to keep a
# sufficiently large feature map (7x7 in this case). (Using the bottleneck layer would have resulting in a 3x3 feature map, which is a bit small.)
last_layer = pre_trained_model.get_layer('mixed7')
print('last layer output shape:', last_layer.output_shape)
last_output = last_layer.output
print(last_output)
# %% Stick a fully connected classifier on top of last_output
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024, activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)
# Configure and compile the model
model = Model(pre_trained_model.input, x)
model.compile(loss='binary_crossentropy',
optimizer=RMSprop(lr=0.0001),
metrics=['acc'])
the error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
c:\Users\jpaul\Code\Google_ML_Crash_Course\02_Practica\02_Image_Classification\image_classification_part3.py in
39 # Flatten the output layer to 1 dimension
----> 40 x = layers.Flatten()(last_output)
41
42 # Add a fully connected layer with 1,024 hidden units and ReLU activation
43 x = layers.Dense(1024, activation='relu')(x)
AttributeError: 'Concatenate' object has no attribute 'Flatten'

In your for loop, you overwrote the layers identifier from the import statement of
from tensorflow.keras import layers
So when you try to create a new Flatten() layer, the identifier layers contains a Concatenate object rather than the Keras layers module you were expecting.
Change the variable name in your for loop and you should be good.

interpreting get_weight in LSTM model in keras

This is my simple reproducible code:
from keras.callbacks import ModelCheckpoint
from keras.models import Model
from keras.models import load_model
import keras
import numpy as np
SEQUENCE_LEN = 45
LATENT_SIZE = 20
VOCAB_SIZE = 100
inputs = keras.layers.Input(shape=(SEQUENCE_LEN, VOCAB_SIZE), name="input")
encoded = keras.layers.Bidirectional(keras.layers.LSTM(LATENT_SIZE), merge_mode="sum", name="encoder_lstm")(inputs)
decoded = keras.layers.RepeatVector(SEQUENCE_LEN, name="repeater")(encoded)
decoded = keras.layers.Bidirectional(keras.layers.LSTM(VOCAB_SIZE, return_sequences=True), merge_mode="sum", name="decoder_lstm")(decoded)
autoencoder = keras.models.Model(inputs, decoded)
autoencoder.compile(optimizer="sgd", loss='mse')
autoencoder.summary()
x = np.random.randint(0, 90, size=(10, SEQUENCE_LEN,VOCAB_SIZE))
y = np.random.normal(size=(10, SEQUENCE_LEN, VOCAB_SIZE))
NUM_EPOCHS = 1
checkpoint = ModelCheckpoint(filepath='checkpoint/{epoch}.hdf5')
history = autoencoder.fit(x, y, epochs=NUM_EPOCHS,callbacks=[checkpoint])
and here is my code to have a look at the weights in the encoder layer:
for epoch in range(1, NUM_EPOCHS + 1):
file_name = "checkpoint/" + str(epoch) + ".hdf5"
lstm_autoencoder = load_model(file_name)
encoder = Model(lstm_autoencoder.input, lstm_autoencoder.get_layer('encoder_lstm').output)
print(encoder.output_shape[1])
weights = encoder.get_weights()[0]
print(weights.shape)
for idx in range(encoder.output_shape[1]):
token_idx = np.argsort(weights[:, idx])[::-1]
here print(encoder.output_shape) is (None,20) and print(weights.shape) is (100, 80).
I understand that get_weight will print the weight transition after the layer.
The part I did not get based on this architecture is 80. what is it?
And, are the weights here the weight that connect the encoder layer to the decoder? I meant the connection between encoder and the decoder.
I had a look at this question here. as it is only simple dense layers I could not connect the concept to the seq2seq model.
Update1
What is the difference between:
encoder.get_weights()[0] and encoder.get_weights()[1]?
the first one is (100,80) and the second one is (20,80) like conceptually?
any help is appreciated:)

The encoder as you have defined it is a model, and it consists of two layers: an input layer and the 'encoder_lstm' layer which is the bidirectional LSTM layer in the autoencoder. So its output shape would be the output shape of 'encoder_lstm' layer which is (None, 20) (because you have set LATENT_SIZE = 20 and merge_mode="sum"). So the output shape is correct and clear.
However, since encoder is a model, when you run encoder.get_weights() it would return the weights of all the layers in the model as a list. The bidirectional LSTM consists of two separate LSTM layers. Each of those LSTM layers has 3 weights: the kernel, the recurrent kernel and the biases. So encoder.get_weights() would return a list of 6 arrays, 3 for each of the LSTM layers. The first element of this list, as you have stored in weights and is subject of your question, is the kernel of one of the LSTM layers. The kernel of an LSTM layer has a shape of (input_dim, 4 * lstm_units). The input dimension of 'encoder_lstm' layer is VOCAB_SIZE and its number of units is LATENT_SIZE. Therefore, we have (VOCAB_SIZE, 4 * LATENT_SIZE) = (100, 80) as the shape of kernel.

How do I use my own data in this CNN finetune code

I am trying to finetune inception-v3, so that it is able to make a decision between an image with signal present and signal absent. How do I edit the code so that it can train on my data? Here is the code to finetune inception-v3:
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
# create the base pre-trained model
base_model = InceptionV3(weights='imagenet',
include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# first: train only the top layers (which were randomly
# initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
layer.trainable = False
# compile the model (should be done *after* setting layers
# to non-trainable)
model.compile(optimizer='rmsprop',loss='categorical_
crossentropy')
# train the model on the new data for a few epochs
model.fit_generator(...)
# at this point, the top layers are well trained and we can
# start fine-tuning
# convolutional layers from inception V3. We will freeze the
# bottom N layers
# and train the remaining top layers.
# let's visualize layer names and layer indices to see how
# many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
print(i, layer.name)
# we chose to train the top 2 inception blocks, i.e. we will
#freeze
# the first 249 layers and unfreeze the rest:
for layer in model.layers[:249]:
layer.trainable = False
for layer in model.layers[249:]:
layer.trainable = True
# we need to recompile the model for these modifications to
# take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9),
loss='categorical_crossentropy')
# we train our model again (this time fine-tuning the top 2
#inception blocks
# alongside the top Dense layers
model.fit_generator(...)
I would greatly appreciate any help you might give.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Tensorflow CNN transfer learning has constant low validation accuracy - python

It looks like you didn't add an activation function in your last layer. Maybe it would be helpful to use the softmax activation function. nmodel.add(keras.layers.Dense(units=4, activation= "softmax", name='dense_last'))

Related

Why the accuracy is high but the result for confusion matrix is bad?

Cannot convert a symbolic input/output to a numpy array

'Concatenate' object has no attribute 'Flatten'

interpreting get_weight in LSTM model in keras

How do I use my own data in this CNN finetune code

Categories

Resources