How to localize an object with a CNN? - python

model = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu',
                        input_shape=(image_width, image_height, image_channels)),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Dropout(0.25),
    keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Dropout(0.25),
    keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Dropout(0.25),
    keras.layers.Flatten(),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dropout(0.25),
    keras.layers.Dense(5, activation='softmax')
])
This model has been trained and gets above 85% accuracy classifying 5 classes of flowers. For classification it works well and successfully classifies most of the images. But for object localization, which layers should I add in order to localize and detect the flower object within the test images?
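One common way to extend a classifier like this to localization (a sketch, not the only approach) is to keep the same convolutional base and attach a second output head that regresses a bounding box. This assumes you also have box annotations (x, y, w, h, normalized to [0, 1]) for every training image; the head names 'class' and 'box' below are only illustrative:

inputs = keras.Input(shape=(image_width, image_height, image_channels))
x = keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu')(inputs)
x = keras.layers.MaxPooling2D((2, 2))(x)
x = keras.layers.Dropout(0.25)(x)
x = keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu')(x)
x = keras.layers.MaxPooling2D((2, 2))(x)
x = keras.layers.Dropout(0.25)(x)
x = keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu')(x)
x = keras.layers.MaxPooling2D((2, 2))(x)
x = keras.layers.Dropout(0.25)(x)
x = keras.layers.Flatten()(x)
x = keras.layers.Dense(256, activation='relu')(x)
# Head 1: the existing 5-class flower classifier
class_out = keras.layers.Dense(5, activation='softmax', name='class')(x)
# Head 2: bounding-box regression (normalized x, y, w, h) for localization
box_out = keras.layers.Dense(4, activation='sigmoid', name='box')(x)
model = keras.Model(inputs, [class_out, box_out])
model.compile(optimizer='adam',
              loss={'class': 'categorical_crossentropy', 'box': 'mse'},
              metrics={'class': 'accuracy'})

Training then needs both targets, e.g. model.fit(images, {'class': labels, 'box': boxes}, ...). If you only have class labels and no box annotations, you would either need to annotate boxes or fall back on a weakly supervised technique such as class activation maps.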

Related

Cascaded Neural Network architecture and input using TensorFlow

I am currently replicating a machine learning architecture found in this paper:
https://downloads.hindawi.com/journals/acisc/2018/1439312.pdf
Specifically, the cascaded architecture shown on page 4 of the paper.
Any ideas on how to implement this in TensorFlow? My current model looks like this:
BATCH_SIZE = 32
INPUT_SIZE1 = (15, 30, 3)
INPUT_SIZE2 = (30, 60, 3)
INPUT_SIZE3 = (40 ,80, 3)
LEARNING_RATE = 0.001
EPOCHS = 10
CNN_CR1_INPUT = keras.Input(shape = INPUT_SIZE1)
CNN_CR1 = Conv2D(64, (5, 5), strides=2, padding='same', activation='relu')(CNN_CR1_INPUT)
CNN_CR1 = MaxPooling2D(3,3)(CNN_CR1)
CNN_CR1 = Conv2D(64, (3,3), strides=1, padding='same', activation='relu')(CNN_CR1)
CNN_CR1 = Conv2D(64, (3,3), strides=2, padding='same', activation='relu')(CNN_CR1)
CNN_CR1 = Flatten()(CNN_CR1)
CNN_CR1_OUTPUT = Dense(1)(CNN_CR1)
CNN_CR2_INPUT = keras.Input(shape = INPUT_SIZE2)
CNN_CR2 = Conv2D(64, (5,5), strides=2, padding='same', activation='relu')(CNN_CR2_INPUT)
CNN_CR2 = MaxPooling2D(3, 3)(CNN_CR2)
CNN_CR2 = Conv2D(64, (3,3), strides=1, padding='same', activation='relu')(CNN_CR2)
CNN_CR2 = Conv2D(64, (3,3), strides=2, padding='same', activation='relu')(CNN_CR2)
CNN_CR2 = Flatten()(CNN_CR2)
CNN_CR2_OUTPUT = Dense(1)(CNN_CR2)
CNN_CR3_INPUT = keras.Input(shape = INPUT_SIZE3)
CNN_CR3 = Conv2D(64, (5,5), strides=2, padding='same', activation='relu')(CNN_CR3_INPUT)
CNN_CR3 = MaxPooling2D(3, 3)(CNN_CR3)
CNN_CR3 = Conv2D(64, (3,3), strides=1, padding='same', activation='relu')(CNN_CR3)
CNN_CR3 = Conv2D(64, (3,3), strides=2, padding='same', activation='relu')(CNN_CR3)
CNN_CR3 = Flatten()(CNN_CR3)
CNN_CR3_OUTPUT = Dense(1)(CNN_CR3)
# SUGGESTION: This is kinda weird. If this works, do we only need 1 validation datagen? Not sure how it all connects together.
CNN_MAX = Maximum()([CNN_CR1_OUTPUT, CNN_CR2_OUTPUT, CNN_CR3_OUTPUT])
CNN_MODEL = keras.Model(inputs=[CNN_CR1_INPUT, CNN_CR2_INPUT, CNN_CR3_INPUT], outputs=[CNN_MAX])
I am not sure whether the model I made is correct, and I need some assistance. Also, how do you create the input pipeline for a cascaded neural network like this one? I have already tried this:
TRAIN_DATAGEN1 = ImageDataGenerator(
# SUGGESTION: Not sure if this is needed??
rescale = 1/255.0
)
TRAIN_GENERATOR1 = TRAIN_DATAGEN1.flow_from_directory(
os.path.join(WORKING_DATASETS['GI4E']['train']['images'], '0'),
target_size = (15, 30),
class_mode ='binary',
batch_size = BATCH_SIZE
)
TEST_DATAGEN1 = ImageDataGenerator(
rescale = 1/255.0,
)
TEST_GENERATOR1 = TEST_DATAGEN1.flow_from_directory(
os.path.join(WORKING_DATASETS['GI4E']['test']['images'], '0'),
target_size = (15, 30),
class_mode ='binary',
batch_size = BATCH_SIZE
)
# CNN 2
TRAIN_DATAGEN2 = ImageDataGenerator(
rescale = 1/255.0
)
TRAIN_GENERATOR2 = TRAIN_DATAGEN2.flow_from_directory(
os.path.join(WORKING_DATASETS['GI4E']['train']['images'], '1'),
target_size = (30, 60),
class_mode ='binary',
batch_size = BATCH_SIZE
)
TEST_DATAGEN2 = ImageDataGenerator(
rescale = 1/255.0,
)
TEST_GENERATOR2 = TEST_DATAGEN2.flow_from_directory(
os.path.join(WORKING_DATASETS['GI4E']['test']['images'], '1'),
target_size = (30, 60),
class_mode ='binary',
batch_size = BATCH_SIZE
)
# CNN 3
TRAIN_DATAGEN3 = ImageDataGenerator(
rescale = 1/255.0
)
TRAIN_GENERATOR3 = TRAIN_DATAGEN3.flow_from_directory(
os.path.join(WORKING_DATASETS['GI4E']['train']['images'], '2'),
target_size = (40 ,80),
class_mode = 'binary',
batch_size = BATCH_SIZE
)
TEST_DATAGEN3 = ImageDataGenerator(
rescale = 1/255.0,
)
TEST_GENERATOR3 = TEST_DATAGEN3.flow_from_directory(
os.path.join(WORKING_DATASETS['GI4E']['test']['images'], '2'),
target_size = (40 ,80),
class_mode = 'binary',
batch_size = BATCH_SIZE
)
But it spits out an error when I try to fit it.
ValueError: Failed to find data adapter that can handle input: (<class 'list'> containing values of types {"<class 'keras.preprocessing.image.DirectoryIterator'>"}), <class 'NoneType'>
After digging through the documentation, I am aware that TensorFlow cannot handle a list of ImageDataGenerators like this. But I'm not sure how to feed images to the model without ImageDataGenerator.
So, to summarize:
How do I recreate the model from the paper in TensorFlow? Is the model I created correct?
How do I create the input pipeline for the model? Is there any alternative besides ImageDataGenerator?
You could try building the input pipeline with tf.data: load the data with keras.utils, and for data augmentation add preprocessing layers to the model architecture (tf.keras.layers.Rescaling, tf.keras.layers.RandomRotation, etc.). The augmentation methods given in the docs below are more efficient at utilizing the GPU than ImageDataGenerator:
https://www.tensorflow.org/tutorials/images/data_augmentation#data_augmentation_2
https://www.tensorflow.org/api_docs/python/tf/keras/utils
https://www.tensorflow.org/guide/keras/preprocessing_layers
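As a minimal sketch (with hypothetical paths, the crop sizes from the question, and the assumption that the three folders contain the same examples in the same order), you could load the three parallel image folders with keras.utils and zip them into one tf.data pipeline for the multi-input model:

import tensorflow as tf

def make_ds(path, size):
    # shuffle=False keeps the three streams aligned example-by-example
    return tf.keras.utils.image_dataset_from_directory(
        path, label_mode='binary', image_size=size,
        batch_size=32, shuffle=False)

ds1 = make_ds('GI4E/train/images/0', (15, 30))
ds2 = make_ds('GI4E/train/images/1', (30, 60))
ds3 = make_ds('GI4E/train/images/2', (40, 80))

# Zip the three datasets and repack each element as ((x1, x2, x3), y);
# alternatively, add a tf.keras.layers.Rescaling(1./255) layer to each input branch
# instead of dividing here.
train_ds = tf.data.Dataset.zip((ds1, ds2, ds3)).map(
    lambda a, b, c: ((a[0] / 255.0, b[0] / 255.0, c[0] / 255.0), a[1]))

# The Dense(1) heads in the question have no sigmoid, so treat the output as logits.
CNN_MODEL.compile(optimizer='adam',
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  metrics=['accuracy'])
CNN_MODEL.fit(train_ds, epochs=10)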

Keras how to write parallel model, for multiclass prediction

I have the following model, where keep_features=900 or so and y is a one-hot encoding of the classes. I am looking for the architecture below (is that possible with Keras, and what would the notation look like, especially the parallel part and the concatenation?)
model = Sequential()
model.add(Dense(keep_features, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(64, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(3, activation='softmax'))
model.compile(loss=losses.categorical_crossentropy,optimizer='adam',metrics=['mae', 'acc'])
Following the chapter "Multi-input and multi-output models" here, you can make something like this for your desired model:
K = tf.keras
input1 = K.layers.Input(keep_features_shape)
denseA1 = K.layers.Dense(256, activation='relu')(input1)
denseB1 = K.layers.Dense(256, activation='relu')(input1)
denseC1 = K.layers.Dense(256, activation='relu')(input1)
batchA1 = K.layers.BatchNormalization()(denseA1)
batchB1 = K.layers.BatchNormalization()(denseB1)
batchC1 = K.layers.BatchNormalization()(denseC1)
denseA2 = K.layers.Dense(64, activation='relu')(batchA1)
denseB2 = K.layers.Dense(64, activation='relu')(batchB1)
denseC2 = K.layers.Dense(64, activation='relu')(batchC1)
batchA2 = K.layers.BatchNormalization()(denseA2)
batchB2 = K.layers.BatchNormalization()(denseB2)
batchC2 = K.layers.BatchNormalization()(denseC2)
denseA3 = K.layers.Dense(32, activation='softmax')(batchA2) # individual layer
denseB3 = K.layers.Dense(16, activation='softmax')(batchB2) # individual layer
denseC3 = K.layers.Dense(8, activation='softmax')(batchC2) # individual layer
concat1 = K.layers.Concatenate(axis=-1)([denseA3, denseB3, denseC3])
model = K.Model(inputs=[input1], outputs=[concat1])
model.compile(loss = K.losses.categorical_crossentropy, optimizer='adam', metrics=['mae', 'acc'])
This results in a model with three parallel branches off the shared input, concatenated into a single output (model plot omitted).

How and why we use CNN layer wrapped with time distributed layer?

I need to know how this code works. It takes an Embedding and then sends it into this model. model1 is a CNN and model2 is a TimeDistributed layer. Why is the wrapping done in this code? I didn't find an article on this.
model1 = Sequential()
model1.add(Embedding(nb_words + 1,
embedding_dim,
weights = [word_embedding_matrix],
input_length = max_sentence_len,
trainable = False))
model1.add(Convolution1D(filters = nb_filter,
kernel_size = filter_length,
padding = 'same'))
model1.add(BatchNormalization())
model1.add(Activation('relu'))
model1.add(Dropout(dropout))
model1.add(Convolution1D(filters = nb_filter,
kernel_size = filter_length,
padding = 'same'))
model1.add(BatchNormalization())
model1.add(Activation('relu'))
model1.add(Dropout(dropout))
model1.add(Flatten())
model2 = Sequential()
model2.add(Embedding(nb_words + 1,
embedding_dim,
weights = [word_embedding_matrix],
input_length = max_sentence_len,
trainable = False))
model2.add(Convolution1D(filters = nb_filter,
kernel_size = filter_length,
padding = 'same'))
model2.add(BatchNormalization())
model2.add(Activation('relu'))
model2.add(Dropout(dropout))
model2.add(Convolution1D(filters = nb_filter,
kernel_size = filter_length,
padding = 'same'))
model2.add(BatchNormalization())
model2.add(Activation('relu'))
model2.add(Dropout(dropout))
model2.add(Flatten())
Then the two branches are merged and the output is computed. I don't understand the computation behind this.
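As posted, neither branch actually shows a TimeDistributed wrapper, so here is a minimal, self-contained sketch (with made-up shapes) of what wrapping a CNN in TimeDistributed does: the same sentence-level CNN is applied, with shared weights, to every sentence in a document, turning a (sentences, words, embedding) tensor into one vector per sentence that a recurrent layer can then read in order.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical shapes: each document has 10 sentences of 50 tokens,
# already mapped to 100-dimensional embeddings.
num_sentences, sentence_len, embed_dim = 10, 50, 100

# A small CNN that encodes ONE sentence (a sequence of word vectors) into a single vector.
sentence_encoder = keras.Sequential([
    layers.Conv1D(64, 3, padding='same', activation='relu',
                  input_shape=(sentence_len, embed_dim)),
    layers.GlobalMaxPooling1D(),
])

# TimeDistributed applies the same sentence encoder to every sentence in the document,
# sharing its weights across the time (sentence) axis.
doc_input = keras.Input(shape=(num_sentences, sentence_len, embed_dim))
sentence_vectors = layers.TimeDistributed(sentence_encoder)(doc_input)  # (batch, 10, 64)
doc_vector = layers.LSTM(32)(sentence_vectors)
output = layers.Dense(1, activation='sigmoid')(doc_vector)

model = keras.Model(doc_input, output)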

What is the meaning of "validation_data will override validation_split." in keras model.fit documentation

I am new to Python and machine learning. I am confused by the sentence in the Keras model.fit documentation that says "validation_data will override validation_split." Does that mean that if I give validation data like this,
history = model.fit(X_train, [train_labels_hotEncode, train_labels_hotEncode, train_labels_hotEncode], validation_data=(y_train, [test_labels_hotEncode, test_labels_hotEncode, test_labels_hotEncode]), validation_split=0.3, epochs=epochs, batch_size=64, callbacks=[lr_sc])
will the validation split not be accepted, and will the function only use validation_data instead of the split?
Also, I am trying to validate on 30% of the training data.
But if I use model.fit with only validation_split=0.3, the validation accuracy gets really ugly. I am using the Inception (GoogLeNet) architecture for this.
loss: 1.8204 - output_loss: 1.1435 - auxilliary_output_1_loss: 1.1292 - auxilliary_output_2_loss: 1.1272 - output_acc: 0.3845 - auxilliary_output_1_acc: 0.3797 - auxilliary_output_2_acc: 0.3824 - val_loss: 9.7972 - val_output_loss: 6.6655 - val_auxilliary_output_1_loss: 5.0973 - val_auxilliary_output_2_loss: 5.3417 - val_output_acc: 0.0000e+00 - val_auxilliary_output_1_acc: 0.0000e+00 - val_auxilliary_output_2_acc: 0.0000e+00
CODE GOOGLENET
input_layer = Input(shape=(224,224,3))
image = Conv2D(64,(7,7),padding='same', strides=(2,2), activation='relu', name='conv_1_7x7/2', kernel_initializer=kernel_init, bias_initializer=bias_init)(input_layer)
image = MaxPool2D((3,3), padding='same', strides=(2,2), name='max_pool_1_3x3/2')(image)
image = Conv2D(64, (1,1), padding='same', strides=(1,1), activation='relu', name='conv_2a_3x3/1' )(image)
image = Conv2D(192, (3,3), padding='same', strides=(1,1), activation='relu', name='conv_2b_3x3/1')(image)
image = MaxPool2D((3,3), padding='same', strides=(2,2), name='max_pool_2_3x3/2')(image)
image = inception_module(image,
filters_1x1= 64,
filters_3x3_reduce= 96,
filter_3x3 = 128,
filters_5x5_reduce=16,
filters_5x5= 32,
filters_pool_proj=32,
name='inception_3a')
image = inception_module(image,
filters_1x1=128,
filters_3x3_reduce=128,
filter_3x3=192,
filters_5x5_reduce=32,
filters_5x5=96,
filters_pool_proj=64,
name='inception_3b')
image = MaxPool2D((3,3), padding='same', strides=(2,2), name='max_pool_3_3x3/2')(image)
image = inception_module(image,
filters_1x1=192,
filters_3x3_reduce=96,
filter_3x3=208,
filters_5x5_reduce=16,
filters_5x5=48,
filters_pool_proj=64,
name='inception_4a')
image1 = AveragePooling2D((5,5), strides=3)(image)
image1 = Conv2D(128, (1,1), padding='same', activation='relu')(image1)
image1 = Flatten()(image1)
image1 = Dense(1024, activation='relu')(image1)
image1 = Dropout(0.4)(image1)
image1 = Dense(5, activation='softmax', name='auxilliary_output_1')(image1)
image = inception_module(image,
filters_1x1 = 160,
filters_3x3_reduce= 112,
filter_3x3= 224,
filters_5x5_reduce= 24,
filters_5x5= 64,
filters_pool_proj=64,
name='inception_4b')
image = inception_module(image,
filters_1x1= 128,
filters_3x3_reduce = 128,
filter_3x3= 256,
filters_5x5_reduce= 24,
filters_5x5=64,
filters_pool_proj=64,
name='inception_4c')
image = inception_module(image,
filters_1x1=112,
filters_3x3_reduce=144,
filter_3x3= 288,
filters_5x5_reduce= 32,
filters_5x5=64,
filters_pool_proj=64,
name='inception_4d')
image2 = AveragePooling2D((5,5), strides=3)(image)
image2 = Conv2D(128, (1,1), padding='same', activation='relu')(image2)
image2 = Flatten()(image2)
image2 = Dense(1024, activation='relu')(image2)
image2 = Dropout(0.4)(image2) #Changed from 0.7
image2 = Dense(5, activation='softmax', name='auxilliary_output_2')(image2)
image = inception_module(image,
filters_1x1=256,
filters_3x3_reduce=160,
filter_3x3=320,
filters_5x5_reduce=32,
filters_5x5=128,
filters_pool_proj=128,
name= 'inception_4e')
image = MaxPool2D((3,3), padding='same', strides=(2,2), name='max_pool_4_3x3/2')(image)
image = inception_module(image,
filters_1x1=256,
filters_3x3_reduce=160,
filter_3x3= 320,
filters_5x5_reduce=32,
filters_5x5= 128,
filters_pool_proj=128,
name='inception_5a')
image = inception_module(image,
filters_1x1=384,
filters_3x3_reduce=192,
filter_3x3=384,
filters_5x5_reduce=48,
filters_5x5=128,
filters_pool_proj=128,
name='inception_5b')
image = GlobalAveragePooling2D(name='avg_pool_5_3x3/1')(image)
image = Dropout(0.4)(image)
image = Dense(5, activation='softmax', name='output')(image)
model = Model(input_layer, [image,image1,image2], name='inception_v1')
model.summary()
epochs = 2
initial_lrate = 0.01 # Changed From 0.01
def decay(epoch, steps=100):
initial_lrate = 0.01
drop = 0.96
epochs_drop = 8
lrate = initial_lrate * math.pow(drop,math.floor((1+epoch)/epochs_drop))#
return lrate
sgd = keras.optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
# nadam = keras.optimizers.Nadam(lr= 0.002, beta_1=0.9, beta_2=0.999, epsilon=None)
# keras
lr_sc = LearningRateScheduler(decay)
# rms = keras.optimizers.RMSprop(lr = initial_lrate, rho=0.9, epsilon=1e-08, decay=0.0)
# ad = keras.optimizers.adam(lr=initial_lrate)
model.compile(loss=['categorical_crossentropy', 'categorical_crossentropy','categorical_crossentropy'],loss_weights=[1,0.3,0.3], optimizer='sgd', metrics=['accuracy'])
# loss = 'categorical_crossentropy', 'categorical_crossentropy','categorical_crossentropy'
history = model.fit(X_train, [train_labels_hotEncode,train_labels_hotEncode,train_labels_hotEncode], validation_split=0.3 ,epochs=epochs, batch_size= 32, callbacks=[lr_sc])
Thanks,
validation_split is a parameter that gets passed into fit. It is a number that determines how your data should be partitioned into training and validation sets. For example, if validation_split=0.1, then 10% of your data will be used as the validation set and 90% will be used as the training set.
validation_data is a parameter where you explicitly pass in the validation set. If you pass in validation_data, Keras uses your explicitly passed-in data instead of computing the validation set with validation_split. This is what "override" means: passing an argument for validation_data overrides whatever you pass for validation_split.
In your situation, since you want to use 30% of your data as validation data, simply pass validation_split=0.3 and don't pass an argument for validation_data.
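Concretely, with the arrays from the question (X_val and val_labels_hotEncode in the second variant are hypothetical held-out arrays):

# Option A: no explicit validation set; Keras holds out the last 30% of the
# training arrays (taken from the end, before shuffling) for validation.
history = model.fit(X_train, [train_labels_hotEncode] * 3,
                    validation_split=0.3, epochs=epochs,
                    batch_size=32, callbacks=[lr_sc])

# Option B: an explicit validation set; if both arguments are given,
# validation_data wins and validation_split is ignored.
history = model.fit(X_train, [train_labels_hotEncode] * 3,
                    validation_data=(X_val, [val_labels_hotEncode] * 3),
                    epochs=epochs, batch_size=32, callbacks=[lr_sc])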

How to add svm on top of cnn as final classifier?

I am working on a sentiment analysis task and I want to add an SVM on top of the CNN as the final classifier. How can I do that without using hinge loss?
tweet_input = Input(shape=(seq_len,), dtype='int32')
tweet_encoder = Embedding(vocabulary_size, EMBEDDING_DIM,
input_length=seq_len, trainable=True)(tweet_input)
bigram_branch = Conv1D(filters=64, kernel_size=2, padding='same',
activation='relu', strides=1)(tweet_encoder)
bigram_branch = GlobalMaxPooling1D()(bigram_branch)
trigram_branch = Conv1D(filters=32, kernel_size=3, padding='same',
activation='relu', strides=1)(tweet_encoder)
trigram_branch = GlobalMaxPooling1D()(trigram_branch)
fourgram_branch = Conv1D(filters=16, kernel_size=4, padding='same',
activation='relu', strides=1)(tweet_encoder)
fourgram_branch = GlobalMaxPooling1D()(fourgram_branch)
merged = concatenate([bigram_branch, trigram_branch, fourgram_branch], axis=1)
merged = Dense(512, activation='softmax')(merged)
merged = Dropout(0.8)(merged)
merged = Dense(2)(merged)
output = Activation('sigmoid')(merged)
model = Model(inputs=[tweet_input], outputs=[output])
adam=keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(loss='hinge',
optimizer= adam,
metrics=['accuracy'])
model.summary()
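One common workaround (a sketch, not the only option) is to keep the Keras CNN purely as a feature extractor and train a scikit-learn SVM on its penultimate activations, so no hinge loss appears in the Keras graph at all. This assumes the Dense(512) layer above was built with name='features', and that X_train/X_test are the padded token sequences with integer class labels y_train/y_test (all hypothetical names):

from sklearn.svm import SVC
from tensorflow import keras

# Reuse the trained CNN up to the Dense(512) layer as a feature extractor.
feature_extractor = keras.Model(inputs=model.input,
                                outputs=model.get_layer('features').output)

train_feats = feature_extractor.predict(X_train)
test_feats = feature_extractor.predict(X_test)

# Fit a separate SVM on the extracted features (labels as integers, not one-hot).
svm = SVC(kernel='rbf', C=1.0)
svm.fit(train_feats, y_train)
print('SVM test accuracy:', svm.score(test_feats, y_test))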
