How do I access data after the model.add(Flatten()) layer? - python

I am trying to use a CNN for feature extraction and XGBoost for classification of image data. I researched and found that this can be done by extracting the data after the convolution layers. I found some source code for a similar problem and tried it on my own.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), strides=(1,1), padding='same', activation="relu", input_shape = data.shape[1:]))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
model.add(Conv2D(64, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu") )
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2))) #max pool window 2x2
model.add(Conv2D(128, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2))) #max pool window 2x2
model.add(Conv2D(256, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2))) #max pool window 2x2
model.add(Flatten())
model.add(Dense(128, activation="relu", name='firstDenseLayer'))
model.add(Dense(1, activation="sigmoid"))
# model.summary()
# print(model)
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(data, label, batch_size=16, epochs=10, validation_data=(val_data, val_label))
Below, I accessed the dense layer named "firstDenseLayer".
import xgboost as xgb
from keras.models import Model
layerName = 'firstDenseLayer'
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layerName).output)
intermediate_output = intermediate_layer_model.predict(data)
from xgboost import XGBClassifier
xgbmodel = XGBClassifier(objective='multi:softmax', num_class= 2)
xgbmodel.fit(intermediate_output, label)
xgbmodel.score(intermediate_output, label)
As I am new to this, I have several points of confusion.
How does the data flow? After I extract the features of the pictures via the convolution layers, how do I actually access the data from there?
What is this line of code doing? What data is it extracting?
intermediate_output = intermediate_layer_model.predict(data)
When I omit (keep commented out) the line below,
model.fit(data, label, batch_size=16, epochs=10, validation_data=(val_data, val_label))
from the first snippet and run the XGBoost model directly, XGBoost gives low accuracy, and when I don't omit it, it gives higher accuracy. Why is that?
Kindly help me out; I have been stuck on this for quite a while. I am just trying to access the extracted feature data from the last convolution layer and use that data for classification with XGBoost. As I followed a method I found online, I am not sure if it is the only way of doing it. If there is another way, kindly let me know.

The model.fit(...) line does what you would expect: it trains the convnet defined by model on some data and labels. Your classifier yielding lower accuracy when you use randomly initialized weights (i.e., without running fit) is not surprising.
intermediate_layer_model is constructed as a Keras model whose output is the dense layer just before the output of model; note the name parameter given to that dense layer when model is constructed. Calling intermediate_layer_model.predict(data) runs a forward pass on data and returns the activations of that 128-unit layer, i.e., the extracted features.
You could just as easily give a name to one of the Conv2D layers and access it the same way. Alternatively, you could store the layer in a Python variable, i.e. instead of
model.add(Conv2D(256, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu"))
in the model construction it could say
last_conv_layer = Conv2D(256, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu")
model.add(last_conv_layer)
Then for the intermediate_layer_model you put
intermediate_layer_model = Model(inputs=model.input, outputs=last_conv_layer.output)
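Putting the pieces together, here is a minimal end-to-end sketch (assuming data, label, val_data, and val_label are the NumPy arrays from the question): train the CNN first, then reuse the named dense layer as a fixed feature extractor for XGBoost.
# Sketch: train the CNN, then feed its intermediate activations to XGBoost.
# Assumes model, data, label, val_data, val_label exist as in the question.
from keras.models import Model
from xgboost import XGBClassifier
model.fit(data, label, batch_size=16, epochs=10, validation_data=(val_data, val_label))
feature_extractor = Model(inputs=model.input,
                          outputs=model.get_layer('firstDenseLayer').output)
train_features = feature_extractor.predict(data)  # shape (n_samples, 128)
val_features = feature_extractor.predict(val_data)
xgb_clf = XGBClassifier()  # plain binary classifier; the defaults suffice here
xgb_clf.fit(train_features, label)
print(xgb_clf.score(val_features, val_label))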

Keras fit_generator() doesn't train properly

I am trying to create an image classifier using Keras with a TensorFlow 2.0.0 backend.
I'm training this model on my local machine on a custom dataset containing roughly 17,000 images. The images vary in size and are located in three different folders (training, validation, and test), each containing two subfolders (one for each class).
I tried an architecture similar to VGG16, which yielded more than decent results on this dataset in the past. Note that there is a minor class imbalance in the data (52:48).
When I call fit_generator(), the model doesn't train well; although the training loss lowers slightly throughout the first epoch, it does not change much afterward. Using this architecture with more regularization, I achieved 85% accuracy after roughly 55 epochs in the past.
Imports and hyperparameters
import tensorflow as tf
from tensorflow import keras
from keras import backend as k
from keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, Flatten, Input, UpSampling2D
from keras.models import Sequential, Model, load_model
from keras.utils import to_categorical
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ModelCheckpoint
TRAIN_PATH = 'data/train/'
VALID_PATH = 'data/validation/'
TEST_PATH = 'data/test/'
TARGET_SIZE = (256, 256)
RESCALE = 1.0 / 255
COLOR_MODE = 'grayscale'
EPOCHS = 2
BATCH_SIZE = 16
CLASSES = ['Damselflies', 'Dragonflies']
CLASS_MODE = 'categorical'
CHECKPOINT = "checkpoints/weights.hdf5"
Model
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                 input_shape=(256, 256, 1), padding='same'))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(516, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='Adam', metrics=['accuracy'])
In the past, I created a custom pipeline to reshape, grayscale, flip, and normalize the images; then, I trained the model using my CPU on batches of processed images.
I tried repeating the process using ImageDataGenerator, flow_from_directory, and GPU support.
# randomly flip images and scale pixel values
trainGenerator = ImageDataGenerator(rescale=RESCALE,
                                    horizontal_flip=True,
                                    vertical_flip=True)
# only scale the pixel values of validation images
validationGenerator = ImageDataGenerator(rescale=RESCALE)
# only scale the pixel values of test images
testGenerator = ImageDataGenerator(rescale=RESCALE)
# instantiate train flow
trainFlow = trainGenerator.flow_from_directory(
    TRAIN_PATH,
    target_size=TARGET_SIZE,
    batch_size=BATCH_SIZE,
    classes=CLASSES,
    color_mode=COLOR_MODE,
    class_mode=CLASS_MODE,
    shuffle=True
)
# instantiate validation flow
validationFlow = validationGenerator.flow_from_directory(
    VALID_PATH,
    target_size=TARGET_SIZE,
    batch_size=BATCH_SIZE,
    classes=CLASSES,
    color_mode=COLOR_MODE,
    class_mode=CLASS_MODE,
    shuffle=True
)
Then, I fit the model using fit_generator:
checkpoints = ModelCheckpoint(CHECKPOINT, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
with tf.device('/GPU:0'):
    model.fit_generator(
        trainFlow,
        validation_data=validationFlow,
        callbacks=[checkpoints],
        epochs=EPOCHS
    )
I tried training it for 40 epochs.
The classifier achieves 52% after the first epoch and does not improve as time goes by.
Testing the classifier
testFlow = testGenerator.flow_from_directory(
    TEST_PATH,
    target_size=TARGET_SIZE,
    batch_size=BATCH_SIZE,
    classes=CLASSES,
    color_mode=COLOR_MODE,
    class_mode=CLASS_MODE,
)
ans = model.predict_generator(testFlow)
When I look at the predictions, the model predicts all the test images as the majority class with the same confidence [0.48498476, 0.51501524].
Have I made sure the data is correct?
Yes. I tested whether the generators yield processed images and their corresponding labels correctly.
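For example, I pulled a single batch and checked it, roughly like this (a sketch using the trainFlow defined above):
# sanity check: one batch from the training flow
images, labels = next(trainFlow)
print(images.shape, images.min(), images.max())  # (16, 256, 256, 1), values in [0, 1]
print(labels[:4])                                # one-hot rows such as [1. 0.]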
Have I tried changing the loss function, activation function, and optimizer?
Yes. I tried changing the class mode to binary, the loss to binary_crossentropy, and changing the last layer to produce a single output with sigmoid activation. No, I did not change the optimizer, but I did try increasing the learning rate.
Have I tried changing the model's architecture?
Yes. I tried increasing and decreasing model complexity.
Both more layers with less regularization and fewer layers with more regularization produced similar results.
Are the layers trainable?
Yes.
Is the GPU support implemented correctly?
I hope so.
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
Num GPUs Available: 1
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
config = tf.compat.v1.ConfigProto(log_device_placement=True)
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)
print(sess)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: NVIDIA GeForce GTX 1050 with Max-Q Design, pci bus id: 0000:03:00.0, compute capability: 6.1
<tensorflow.python.client.session.Session object at 0x000001F9443E2CC0>
Have I tried transfer learning?
Not yet.
I found a similar unanswered question from 2017: keras-doesnt-train-using-fit-generator.
Thoughts?
The problem is with your model. I copied your code and ran it on a dataset I have used before (which gets high accuracy) and got results similar to yours. I then substituted the simple model below:
model = tf.keras.Sequential([
    Conv2D(16, 3, padding='same', activation='relu', input_shape=(256, 256, 1)),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(128, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(256, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(.3),
    Dense(64, activation='relu'),
    Dropout(.3),
    Dense(2, activation='softmax')
])
model.compile(loss='categorical_crossentropy',
              optimizer='Adam', metrics=['accuracy'])
The model trained properly. By the way, model.fit_generator is deprecated; you can now just use model.fit, which handles generators (a sketch of the equivalent call follows the code below). I then took your model and removed all the dropout layers except for the last one, and your model trained properly. The code is:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                 input_shape=(256, 256, 1), padding='same'))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Dropout(0.1))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Dropout(0.1))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(516, activation='relu'))
#model.add(Dropout(0.1))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='Adam', metrics=['accuracy'])
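For reference, a minimal sketch of the equivalent model.fit call (assuming the same trainFlow, validationFlow, and checkpoints defined in the question):
# model.fit accepts generators directly in recent TF/Keras,
# replacing the deprecated model.fit_generator
model.fit(
    trainFlow,
    validation_data=validationFlow,
    callbacks=[checkpoints],
    epochs=EPOCHS
)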
@Gerry P,
By accident, I found what was causing the error.
Removing the import "from keras import backend as k" resolved the model's inability to learn.
That's not all. I also found that the model you defined, not calling ModelCheckpoint, and not customizing class names affected the fitting process.
model = Sequential([
    Conv2D(16, 3, padding='same', activation='relu', input_shape=(256, 256, 1)),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(128, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(256, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(.3),
    Dense(64, activation='relu'),
    Dropout(.3),
    Dense(2, activation='softmax')
])
I commented out that import to try to resolve an error that occurred when I copy-pasted your sequential model. Then, I forgot to uncomment it when I tested the model, and I achieved over 80% accuracy after the third epoch. Then, I reverted the changes (restored the import) and tried it on my dataset again, and it failed.
As a bonus, not importing Keras's backend decreased the time it takes to train the model!
Lately, I had to re-install Keras and TensorFlow because they could no longer detect my GPU. I probably made a mistake and installed an incompatible version of Keras.
CUDA==10.0
tensorflow-gpu==2.0.0
keras==2.3.1
Note that it's still not a 100% solution; the problems arise every so often.
EDIT:
Whenever it doesn't work, simplify the model.
Changed batch size and stopped learning? Simplify the model.
Augmented the images further and stopped learning? Simplify the model.

Configuration of CNN model for recognition of sequential data - Architecture of the top of the CNN - Parallel Layers

I am trying to configure a network for character recognition of sequential data like license plates.
Now I would like to use the architecture noted in Table 3 of Deep Automatic Licence Plate Recognition system (link: http://www.ee.iisc.ac.in/people/faculty/soma.biswas/Papers/jain_icgvip2016_alpr.pdf).
The architecture the authors presented is this one:
The first layers are very common, but where I stumbled was the top (the part in the red frame) of the architecture. They mention 11 parallel layers, and I am really unsure how to implement this in Python. I coded this architecture, but it does not seem right to me.
model = Sequential()
model.add(Conv2D(64, kernel_size=(5, 5), input_shape = (32, 96, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(256, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1024, activation = "relu"))
model.add(Dense(11*37, activation="Softmax"))
model.add(keras.layers.Reshape((11, 37)))
Could someone help? How do I have to code the top part to get the same architecture as the authors?
The code below can build the architecture described in the image.
import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, Flatten, MaxPooling2D, Dense, Input, Reshape, Concatenate, Dropout
def create_model(input_shape=(32, 96, 1)):
    input_img = Input(shape=input_shape)
    '''
    Add the ST Layer here.
    '''
    model = Conv2D(64, kernel_size=(5, 5), activation="relu")(input_img)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    model = Dropout(0.25)(model)
    model = Conv2D(128, kernel_size=(3, 3), activation="relu")(model)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    model = Dropout(0.25)(model)
    model = Conv2D(256, kernel_size=(3, 3), activation="relu")(model)
    model = MaxPooling2D(pool_size=(2, 2))(model)
    model = Dropout(0.25)(model)
    model = Flatten()(model)
    backbone = Dense(1024, activation="relu")(model)
    branches = []
    for i in range(11):
        branches.append(backbone)
        branches[i] = Dense(37, activation="softmax", name="branch_" + str(i))(branches[i])
    output = Concatenate(axis=1)(branches)
    output = Reshape((11, 37))(output)
    model = Model(input_img, output)
    return model
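A hypothetical usage sketch (the default 32x96x1 input and the 11x37 output come from the code above; the compile settings are placeholders):
# build and inspect the branched model; the final output shape
# should be (None, 11, 37): 11 character positions, 37 classes each
model = create_model()
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()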
From my understanding, your implementation is almost correct. The authors train 11 individual classifiers, each taking as input the output of the fully connected layer. Here, you can think of "parallel" as "independent".
However, you cannot apply the softmax activation right after the fully connected layer. Since all the classifiers are independent, we want each of them to output a probability for each possible character. Putting it differently, we want the sum of the outputs of each classifier to be 1. Hence, a correct implementation would be:
...
model.add(Dense(1024, activation="relu"))
# Feeding every neuron with the previous layer's output
model.add(Dense(11*37))
model.add(keras.layers.Reshape((11, 37)))
# softmax over the 37 character classes of each of the 11 positions
model.add(keras.layers.Softmax(axis=-1))
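A quick sanity check that each of the 11 rows now sums to 1 (a sketch, assuming the full Sequential model from the question with the corrected top, taking 32x96 RGB inputs):
import numpy as np
dummy = np.random.rand(1, 32, 96, 3).astype("float32")  # one fake plate image
probs = model.predict(dummy)                             # shape (1, 11, 37)
print(probs.sum(axis=-1))                                # all 11 entries ~ 1.0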

How can I extract prediction results after multiclass training?

I am trying to do some multiclass image classification with a VGG-style network, but unfortunately I cannot reach my goal, and I think I have a stupid mistake in my code.
I have almost 16K images in 4 categories, let's say 1, 2, 3, and 4. With the following code, I import all the images. To keep the post short, I won't repeat the loading code for each category; there are four blocks of lines like this:
path_101 = ("/media/data/working_dir/categories/101/")
train_101 = []
for png in os.listdir(path_101):
imageread = img.imread(path_101+png)
imageread = cv2.resize(imageread, (320,240)) #resizing
train_101.append(imageread)
Then I concatenate them into one x_data variable:
x_data = np.concatenate((train_101, train_102, train_104, train_105), axis=0)
Following this, I create my categorical data and do one-hot encoding:
# We create our label data.
one = np.ones(len(train_101))
four = np.ones(len(train_104)) +3
two = np.ones(len(train_102)) + 1
five = np.ones(len(train_105)) + 4
y = np.concatenate((one, two, four, five), axis= 0).reshape(-1,1)
from sklearn.preprocessing import OneHotEncoder
labels=OneHotEncoder(categories = "auto", handle_unknown = "ignore")
y=labels.fit_transform(y).toarray()
And after all of this, I do the training part, as you can see below:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x_data, y, test_size = 0.2)
# import Keras and layers libraries
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPool2D , Flatten
model = Sequential()
model.add(Conv2D(input_shape=(240,320,3),filters=64,kernel_size=(3,3),padding="same", activation="relu"))
model.add(Conv2D(filters=64,kernel_size=(3,3),padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
model.add(Flatten())
model.add(Dense(units=512,activation="relu"))
model.add(Dense(units=64, activation="relu"))
model.add(Dense(units=4, activation="softmax"))
model.add(Dense(units=4, activation="softmax"))
from keras.optimizers import Adam
opt = Adam(lr=0.001)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, verbose=1, epochs=5, validation_split=0.2)
results = model.evaluate(x_test, y_test, batch_size=64)
Afterward, I was getting terrible results, around 25% accuracy, and it did not change from epoch to epoch. I thought that maybe my training was not going well, so I tried to predict every 100th image of my initial dataset with this code:
prs = []
for k in np.arange(1, 15000, 100):
    imgg = x_data[k]
    imgg = imgg[np.newaxis, ...]
    pr = model.predict_classes(imgg)
    prs.append(pr[0])
print(prs)
And I was getting only 0s or 2s, depending on which class had more input images. So something was wrong.
I am new to neural networks and maybe I did some amateur stuff.
How can I have class 0 if my y data was one-hot encoded from the labels 1, 2, 4, and 5?
I was wondering about the y variable: since I trained with one-hot encoding, maybe I should decode the predictions somehow?
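Something like this is what I mean by decoding (a sketch, using the fitted OneHotEncoder from above):
# predict_classes / argmax return column indices into the one-hot matrix
# (0..3), not the original labels; the encoder's categories_ map them back
pred_indices = model.predict(x_test).argmax(axis=1)
original_labels = labels.categories_[0][pred_indices]  # back to 1, 2, 4, 5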
Thanks in advance! Do not hesitate to ask for details

"The first layer in a Sequential model must get an `inputShape` or `batchInputShape` argument." when loading Keras model with TensorFlow.js

I trained the following model using Keras (Version 2.2.4):
# imports ...
model = Sequential()
model.add(Conv2D(filters=64, kernel_size=5, data_format="channels_last", activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(data_format="channels_last"))
model.add(Conv2D(filters=32, kernel_size=3, data_format="channels_last", activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(data_format="channels_last"))
model.add(Flatten(data_format="channels_last"))
model.add(Dense(units=256, activation="relu"))
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=32, activation="relu"))
model.add(Dense(units=8, activation="softmax"))
# training ...
model.save("model.h5")
The inputs are 28 x 28 grayscale images of shape (28, 28, 1).
I converted the model with tensorflowjs_converter and now I want to load it in my website using TensorFlow.js (Version 1.1.0):
tf.loadLayersModel('./model/model.json')
This produces the following error:
The first layer in a Sequential model must get an `inputShape` or `batchInputShape` argument.
at new e (errors.ts:48)
at e.add (models.ts:440)
at e.fromConfig (models.ts:1020)
at vp (generic_utils.ts:277)
at nd (serialization.ts:31)
at models.ts:299
at common.ts:14
at Object.next (common.ts:14)
at o (common.ts:14)
How can I fix this error without having to retrain the model?
Try adjusting your neural net to this format:
input_img = Input(batch_shape=(None, 28,28,1))
layer1=Conv2D(filters=64, kernel_size=5, data_format="channels_last", activation="relu")(input_img)
layer2=BatchNormalization()(layer1)
.
.
.
final_layer=Dense(units=8, activation="softmax")(previous_layer)
... and so forth. At the end:
model = Model(inputs = input_img, outputs = final_layer)
You have to specify the input shape in the first Conv2D layer of your Keras model.
# imports ...
model = Sequential()
model.add(Conv2D(input_shape=(28, 28, 1), filters=64, kernel_size=5, data_format="channels_last", activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(data_format="channels_last"))
model.add(Conv2D(filters=32, kernel_size=3, data_format="channels_last", activation="relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(data_format="channels_last"))
model.add(Flatten(data_format="channels_last"))
model.add(Dense(units=256, activation="relu"))
model.add(Dense(units=128, activation="relu"))
model.add(Dense(units=32, activation="relu"))
model.add(Dense(units=8, activation="softmax"))
# training ...
model.save("model.h5")
The best approach is to change your Keras model and retrain.
Anyway, if you cannot retrain your network, you can manually edit your model.json file.
You need to find the input layer in your model.json file and add:
"config": {
  ...
  "batch_input_shape": [
    null,
    28,
    28,
    1
  ]
  ...
}

Extracting last layers of keras model as a submodel

Say we have a convolutional neural network M. I can extract features from images by using
extractor = Model(M.inputs, M.get_layer('last_conv').output)
features = extractor.predict(X)
How can I get the model that will predict classes using features?
I can't use the following lines because they require the input of the model to be a placeholder.
predictor = Model([M.get_layer('next_layer').input], M.outputs)
pred = predictor.predict(features)
I also can't use K.function because later I want to use it as part of another model, so I will be applying predictor to a tf.Tensor, not an np.array.
This is not the nicest solution, but it works:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Dense, Dropout, Flatten
def cnn():
    model = Sequential()
    model.add(Conv2D(32, kernel_size=(3, 3),
                     activation='relu',
                     input_shape=(28, 28, 1), name='l_01'))
    model.add(Conv2D(64, (3, 3), activation='relu', name='l_02'))
    model.add(MaxPooling2D(pool_size=(2, 2), name='l_03'))
    model.add(Dropout(0.25, name='l_04'))
    model.add(Flatten(name='l_05'))
    model.add(Dense(128, activation='relu', name='l_06'))
    model.add(Dropout(0.5, name='l_07'))
    model.add(Dense(10, activation='softmax', name='l_08'))
    return model

def predictor(input_shape):
    # input_shape is unused here; the Flatten layer hard-codes the
    # (12, 12, 64) output shape of the conv stack above
    model = Sequential()
    model.add(Flatten(name='l_05', input_shape=(12, 12, 64)))
    model.add(Dense(128, activation='relu', name='l_06'))
    model.add(Dropout(0.5, name='l_07'))
    model.add(Dense(10, activation='softmax', name='l_08'))
    return model

cnn_model = cnn()
cnn_model.save('/tmp/cnn_model.h5')
predictor_model = predictor(cnn_model.output.shape)
# by_name=True loads weights into the layers with matching names (l_05..l_08)
predictor_model.load_weights('/tmp/cnn_model.h5', by_name=True)
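A sketch of how the two halves could then be used together (the Model import and dummy batch X are my additions; l_04 is the last layer of the conv stack, whose 12x12x64 output matches the predictor's input):
from keras.models import Model
import numpy as np
# front half: everything up to the end of the conv stack
extractor = Model(cnn_model.input, cnn_model.get_layer('l_04').output)
X = np.random.rand(2, 28, 28, 1)              # dummy batch of 28x28x1 images
features = extractor.predict(X)               # shape (2, 12, 12, 64)
preds = predictor_model.predict(features)     # class probabilities, shape (2, 10)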
Every layer in the model is indexed, so if you know which layers you need, you can loop through them, copying them into a new model. This operation should copy the weights inside each layer as well.
Here's a model (from Oli Blum's answer):
model = Sequential()
# add some layers
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=(28, 28, 1), name='l_01'))
model.add(Conv2D(64, (3, 3), activation='relu', name='l_02'))
model.add(MaxPooling2D(pool_size=(2, 2), name='l_03'))
model.add(Dropout(0.25, name='l_04'))
model.add(Flatten(name='l_05'))
model.add(Dense(128, activation='relu', name='l_06'))
model.add(Dropout(0.5, name='l_07'))
model.add(Dense(10, activation='softmax', name='l_08'))
Say you wanted the last three layers:
def extract_layers(main_model, starting_layer_ix, ending_layer_ix):
    # create an empty model
    new_model = Sequential()
    for ix in range(starting_layer_ix, ending_layer_ix + 1):
        curr_layer = main_model.get_layer(index=ix)
        # copy this layer over to the new model
        new_model.add(curr_layer)
    return new_model
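Hypothetical usage on the model above (the indices are my reading of the layer order shown, where Dense 'l_06' is index 5):
# take the last three layers (Dense 'l_06' through the softmax 'l_08');
# the first extracted layer may need a known input shape to rebuild
# cleanly, as in the predictor() workaround above
tail = extract_layers(model, 5, 7)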
It depends on what you want to do:
If you are going to throw away the feature extractor afterwards
If you plan on training the feature extractor later
If you are going to use the extracted features but don't intend to train the model used to generate them, you can use the predict method to get the features as you did:
features = extractor.predict(X)
then save its output to a file (np.save or cPickle or whatever).
After that you could use that new dataset as the input to a new model.
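A sketch of that save/load round trip (the file name is a placeholder):
import numpy as np
np.save('features.npy', features)    # persist the extracted features
features = np.load('features.npy')   # reload later as input to a new model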
If you plan on training the feature extractor later, you'll need to stack the two networks, as shown here with VGG as the feature extractor: https://github.com/fchollet/keras/issues/4576
img_width, img_height = 150, 150
vgg16_model = VGG16(include_top=False, weights='imagenet')
input = Input(batch_shape=vgg16_model.output_shape)
x = GlobalAveragePooling2D()(input)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
predict = Dense(1, activation='sigmoid')(x)
top_model = Model(input, predict)
top_model.load_weights(os.path.join(data_path, 'VGG16Classifier.hdf5'))
input = Input(shape=(3, img_width, img_height))
x = vgg16_model(input)
predict = top_model(x)
model = Model(input, predict)
PS: This example uses channels-first ordering. If you are using TensorFlow, you should change the shape to shape=(img_width, img_height, 3).
