how to know which node is dropped after using keras dropout layer - python

From nick blog it is clear that in dropout layer of CNN model we drop some nodes on the basis of bernoulli. But how to verify it, i.e. how to check which node is not selected. In DropConnect we leave some weights so I think with the help of model.get_weights() we can verify, but how in the case of dropout layer.
model = Sequential()
model.add(Conv2D(2, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(4, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(8, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.binary_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
Another question is that it is mention in keras that dropout rate should float b/w 0 to 1. But for above model when I take dropout rate = 1.25, then also my model is working, how this happens?

Concerning your second question, if you see Keras code, in the call method form Dropout class:
def call(self, inputs, training=None):
if 0. < self.rate < 1.:
noise_shape = self._get_noise_shape(inputs)
def dropped_inputs():
return K.dropout(inputs, self.rate, noise_shape,
seed=self.seed)
return K.in_train_phase(dropped_inputs, inputs,
training=training)
return inputs
This means that if the rate is not between 0 and 1, it will do nothing.

Related

How many layers should I stack in a sequential model?

I am trying to train a sequential model using the LSTM layer.
The size of sequence data for learning is as follows:
x = np.array(sequences)
y = to_categorical(labels).astype(int)
x.shape => (1800, 34, 48)
y.shape => (1800, 20)
After that, I make a sequential model and try to stack the LSTM layer and the dense layer, but I don't know how much to do that.
First, I did something like this:
model = Sequential()
model.add(LSTM(64, return_sequences=True, activation='relu', input_shape=x_train.shape[1:3]))
model.add(LSTM(128, return_sequences=True, activation='relu'))
model.add(LSTM(64, return_sequences=False, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(actions.shape[0], activation='softmax'))
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['acc'])
model.summary()
However, this doesn't seem to fit my case as I followed someone else's code.
How many layers should I stack in a sequential model?

TypeError: __init__() missing 1 required positional argument: 'units' issue [duplicate]

I am working in python and tensor flow but I miss 'units' argument and I do not know how to solve it, It looks like your post is mostly code; please add some more details.It looks like your post is mostly code; please add some more details.
here the code
def createModel():
model = Sequential()
# first set of CONV => RELU => MAX POOL layers
model.add(Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=inputShape))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(output_dim=NUM_CLASSES, activation='softmax'))
# returns our fully constructed deep learning + Keras image classifier
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
# use binary_crossentropy if there are two classes
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
return model
print("Reshaping trainX at..."+ str(datetime.now()))
#print(trainX.sample())
print(type(trainX)) # <class 'pandas.core.series.Series'>
print(trainX.shape) # (750,)
from numpy import zeros
Xtrain = np.zeros([trainX.shape[0],HEIGHT, WIDTH, DEPTH])
for i in range(trainX.shape[0]): # 0 to traindf Size -1
Xtrain[i] = trainX[i]
print(Xtrain.shape) # (750,128,128,3)
print("Reshaped trainX at..."+ str(datetime.now()))
print("Reshaping valX at..."+ str(datetime.now()))
print(type(valX)) # <class 'pandas.core.series.Series'>
print(valX.shape) # (250,)
from numpy import zeros
Xval = np.zeros([valX.shape[0],HEIGHT, WIDTH, DEPTH])
for i in range(valX.shape[0]): # 0 to traindf Size -1
Xval[i] = valX[i]
print(Xval.shape) # (250,128,128,3)
print("Reshaped valX at..."+ str(datetime.now()))
# initialize the model
print("compiling model...")
sys.stdout.flush()
model = createModel()
# print the summary of model
from keras.utils import print_summary
print_summary(model, line_length=None, positions=None, print_fn=None)
# add some visualization
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
SVG(model_to_dot(model).create(prog='dot', format='svg'))
Try changing this line:
model.add(Dense(output_dim=NUM_CLASSES, activation='softmax'))
to
model.add(Dense(NUM_CLASSES, activation='softmax'))
I'm not experience in keras but I could not find a parameter called output_dim on the documentation page for Dense. I think you meant to provide units but labelled it as output_dim
The Keras Dense layer documentation is as follows:
keras.layers.Dense(units, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)
Using the following :
classifier.add(Dense(6, activation='relu', kernel_initializer='glorot_uniform',input_dim=11))
Will work as here the units means the output_dim saying that we need 6 neurons in the hidden layer. The weights are initialized with the uniform function and the input layer has 11 independent variables of the dataset (input_dim) to feed the above-hidden layer.
I think it's a version issue. In updated version of keras for Dense there is no "output_dim" argument.
You can see this documentation link for Dense arguments.
https://keras.io/api/layers/core_layers/dense/
tf.keras.layers.Dense(
units,
activation=None,
use_bias=True,
kernel_initializer="glorot_uniform",
bias_initializer="zeros",
kernel_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
bias_constraint=None,
**kwargs
)
So the first argument is "units", Which is mandatory.
instead of this line:
model.add(Dense(output_dim=NUM_CLASSES, activation='softmax'))
use this:
model.add(Dense(units=NUM_CLASSES, activation='softmax'))
or
model.add(Dense(NUM_CLASSES, activation='softmax'))

How do i access data after model.add(Flatten()) layer?

I am trying to use CNN for feature extraction and XGboost for classification of a image data. I researched and found that it could be done by extracting the data after the convolution layers. I found some source code for similar problem and tried on my own.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), strides=(1,1), padding='same', activation="relu", input_shape = data.shape[1:]))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
model.add(Conv2D(64, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu") )
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2))) #max pool window 2x2
model.add(Conv2D(128, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2))) #max pool window 2x2
model.add(Conv2D(256, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2))) #max pool window 2x2
model.add(Flatten())
model.add(Dense(128, activation="relu", name='firstDenseLayer'))
model.add(Dense(1, activation="sigmoid"))
# model.summary()
# print(model)
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(data, label, batch_size=16, epochs=10, validation_data=(val_data, val_label))
Below i accessed the dense layer named "firstDenseLayer".
import xgboost as xgb
from keras.models import Model
layerName = 'firstDenseLayer'
intermediate_layer_model = Model(inputs=model.input,
outputs=model.get_layer(layerName).output)
intermediate_output = intermediate_layer_model.predict(data)
from xgboost import XGBClassifier
xgbmodel = XGBClassifier(objective='multi:softmax', num_class= 2)
xgbmodel.fit(intermediate_output, label)
xgbmodel.score(intermediate_output, label)
As i am new in this, i have several confusions.
How the data is being flowed. After i extract the features of the pictures via convolution layers, how do i actually access the data from there?
What is this line of code doing? What data is it extracting?
intermediate_output = intermediate_layer_model.predict(data)
When i omit(keep commented out) the below line,
model.fit(data, label, batch_size=16, epochs=10, validation_data=(val_data, val_label))
from the first snippet and run the XGboost model directly the XGboost gives low accuracy and when i don't it gives higher accuracy. Why is it being like that?
Kindly help me out. I am stuck with this for quite a while. I am just trying to access the extracted features data from the last convolution layer and use that data to do classification using XGboost. As i tried to follow the method that i found from online, i am not sure if it is the the only way of doing it. If there is another way kindly let me know.
The model.fit(...) line does what you would expect, it trains the convnet defined by model on some data and labels. Your classifier yielding lower accuracy when you're using randomly initialized weights (i.e. without running fit) is not surprising.
intermediate_layer_model is constructed as a keras model whose output is the dense layer just before the output of model. Note the name parameter given to the dense layer in the construction of model.
You could just as easily give a name to one of the Conv2D layers and access it the same way. Alternatively, you could store the layer in a python variable, i.e. instead of
model.add(Conv2D(256, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu"))
in the model construction it could say
last_conv_layer = Conv2D(256, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu")
model.add(last_conv_layer)
Then for the intermediate_layer_model you put
intermediate_layer_model = Model(inputs=model.input, outputs=last_conv_layer.output)

Validation data performing worse than training data in keras

I am training a CNN on some text data. The sentences are padded and embedded and fed to a CNN. The model architecture is:
model = Sequential()
model.add(Embedding(max_features, embedding_dims, input_length=maxlen))
model.add(Conv1D(128, 5, activation='relu'))
model.add(GlobalMaxPooling1D())
model.add(Dense(50, activation = 'relu'))
model.add(BatchNormalization())
model.add(Dense(50, activation = 'relu'))
model.add(BatchNormalization())
model.add(Dense(25, activation = 'relu'))
#model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
Any help would be appreciated.
You model is over-fitting so the best practice is:
add layers and preferably that goes in the power of 2
instead of
model.add(Dense(50, activation = 'relu'))
use
model.add(Dense(64, activation = 'relu'))
and go with 512 128 64 32 16
add some dropout layers preferably after two layers.
train on bigger data.
You can try removing BatchNormalization and adding more Convolutional and Pooling Layer that may increase your accuracy.
You can also check out this -:
https://forums.fast.ai/t/batch-normalization-with-a-large-batch-size-breaks-validation-accuracy/7940

Keras ConvLSTM2D: why use the averagepooling3d and how to to regression

i have been studying Keras ConvLSTM2D: ValueError on output layer
i want to use the same code but i want to do regression ( single value ).
I dont know how to do this. And i also dont understand the use of last layers of this post code. Why is averagepolling3d used?
the code from link is
model = Sequential()
model.add(ConvLSTM2D(
filters=40,
kernel_size=(3, 3),
input_shape=(None, 135, 240, 1),
padding='same',
return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(
filters=40,
kernel_size=(3, 3),
padding='same',
return_sequences=True))
model.add(BatchNormalization())
model.add(ConvLSTM2D(
filters=40,
kernel_size=(3, 3),
padding='same',
return_sequences=True))
model.add(BatchNormalization())
model.add(AveragePooling3D((1, 135, 240)))
model.add(Reshape((-1, 40)))
model.add(Dense(
units=9,
activation='sigmoid'))
model.compile(
loss='categorical_crossentropy',
optimizer='adadelta'
)
AveragePooling3D is used to reduce each frame in a sequence to a single value + to reduce the #parameters in the Dense Layer. So, the dimension becomes (None, 40 , 1 , 1 ,1 ). Then, using Reshape allows it to use for fully-connected part.
Also, as in Keras ConvLSTM2D: ValueError on output layer, AveragePooling3D is used instead of GlobalMaxPooling2D since data is 5D and Global operations leaves only (batch_size, channels) which is not desirable in your case.

Categories

Resources