I have a model that I load this way:
def YOLOv3_pretrained(n_classes=12, n_bbox=3):
    yolo3 = tf.keras.models.load_model("yolov3/yolo3.h5")
    yolo3.trainable = False
    l3 = yolo3.get_layer('leaky_re_lu_71').output
    l3_flat = tf.keras.layers.Flatten()(l3)
    out3 = tf.keras.layers.Dense(100*(4+1+n_classes))(l3_flat)
    out3 = Reshape((100, (4+1+n_classes)), input_shape=(12,))(out3)
    yolo3 = Model(inputs=yolo3.input, outputs=[out3])
    return yolo3
I want to add a Dense layer at the end of it, but since the model takes an input with shape (None, 416, 416, 3) it doesn't let me, and it returns an error:
ValueError: The last dimension of the inputs to a Dense layer should be defined. Found None. Full input shape received: (None, None)
I also tried this way with a Sequential (I want to use just the last output of yolo):
def YOLOv3_Dense(n_classes=12):
    yolo3 = tf.keras.models.load_model("yolov3/yolo3.h5")
    model = Sequential()
    model.add(yolo3)
    model.add(Flatten())
    model.add(Dense(100*(4+1+n_classes)))
    model.add(Reshape((100, (4+1+n_classes)), input_shape=(413,413,3)))
    return model
But it returns another error:
ValueError: All layers in a Sequential model should have a single output tensor. For multi-output layers, use the functional API.
Is there a way to add the final Dense layer?
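The problem is that you are trying to flatten an output with multiple None dimensions. Flatten then produces a tensor of shape (None, None), and a Dense layer cannot be built on it because its last dimension must be defined. You can try using GlobalAveragePooling2D or GlobalMaxPooling2D instead, which reduce the unknown spatial dimensions and leave only the (known) channel dimension: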
import tensorflow as tf
yolo3 = tf.keras.models.load_model("yolo3.h5")
yolo3.trainable = False
l3 = yolo3.get_layer('leaky_re_lu_71').output
l3_flat = tf.keras.layers.GlobalMaxPooling2D()(l3)
out3 = tf.keras.layers.Dense(100*(4+1+12))(l3_flat)
out3 = tf.keras.layers.Reshape((100, (4+1+12)), input_shape=(12,))(out3)
yolo3 = tf.keras.Model(inputs=yolo3.input, outputs=[out3])
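The same idea can be checked without the yolo3.h5 file. The sketch below is only an illustration: the small convolutional `backbone` is a hypothetical stand-in for the pretrained network, chosen because its height and width are also None.

```python
import tensorflow as tf

# Hypothetical stand-in for the pretrained network: fully convolutional,
# so the spatial dimensions of its output are None.
inputs = tf.keras.Input(shape=(None, None, 3))
features = tf.keras.layers.Conv2D(8, 3, padding="same")(inputs)
backbone = tf.keras.Model(inputs, features)
backbone.trainable = False

n_classes = 12
# Global pooling removes the unknown spatial dimensions: (None, None, None, 8) -> (None, 8)
pooled = tf.keras.layers.GlobalMaxPooling2D()(backbone.output)
dense = tf.keras.layers.Dense(100 * (4 + 1 + n_classes))(pooled)
out = tf.keras.layers.Reshape((100, 4 + 1 + n_classes))(dense)

model = tf.keras.Model(backbone.input, out)
out_shape = tuple(model.output_shape)
print(out_shape)
```

Note that global pooling discards spatial position, so whether it is acceptable for a detection head is a separate design question.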
I am trying to concatenate two sequential models. I have a model which is a concatenation of two sub-models, each of which is a concatenation of two sequential models. I have the following code but it doesn't work with Keras 2.3.0
model = Sequential()
sub_model1 = Sequential()
sub_model_channel1 = Sequential()
sub_model_channel2 = Sequential()
sub_model_channel1.add(Dropout(dropout_prob[0], input_shape=(channels, sequence_length,sequence_length)))
sub_model_channel2.add(Dropout(dropout_prob[0], input_shape=(channels, sequence_length,sequence_length)))
in1 = Input(shape=(channels, sequence_length,sequence_length))
in2 = Input(shape=(channels, sequence_length,sequence_length))
convs1 = model_unichannel(in1)
convs2 = model_unichannel(in2)
out1 = Concatenate()(convs1)
out2 = Concatenate()(convs2)
m1 = Model(inputs=in1, outputs=out1)
m2 = Model(inputs=in2, outputs=out2)
sub_model_channel1.add(m1)
sub_model_channel2.add(m2)
m = Concatenate()([sub_model_channel1, sub_model_channel2])
sub_model1.add(m)
model.add(sub_model1)
I am getting the following error
ValueError: Layer concatenate_3 was called with an input that isn't a symbolic tensor. Received type: <class 'keras.engine.sequential.Sequential'>.
in the line m = Concatenate()([sub_model_channel1, sub_model_channel2]).
I have already looked at following solutions but nothing really solves my problem.
1) ValueError with Concatenate Layer (Keras functional API)
2) Merge 2 sequential models in Keras
I modified my code following the approach in the second link.
model = Sequential()
sub_model_channel1 = Sequential()
sub_model_channel2 = Sequential()
sub_model_channel1.add(Dropout(dropout_prob[0], input_shape=(channels, sequence_length,sequence_length)))
sub_model_channel2.add(Dropout(dropout_prob[0], input_shape=(channels, sequence_length,sequence_length)))
in1 = Input(shape=(channels, sequence_length,sequence_length))
in2 = Input(shape=(channels, sequence_length,sequence_length))
convs1 = model_unichannel(in1) #adds Conv, MaxPooling and Flatten layer
convs2 = model_unichannel(in2)
out1 = Concatenate()(convs1)
out2 = Concatenate()(convs2)
m1 = Model(inputs=in1, outputs=out1)
m2 = Model(inputs=in2, outputs=out2)
sub_model_channel1.add(m1)
sub_model_channel2.add(m2)
m = Concatenate()([sub_model_channel1.output, sub_model_channel2.output])
sub_model1 = Model([sub_model_channel1.input,sub_model_channel2.input], m)
model.add(sub_model1)
In this case I am getting an error: ValueError: Layer model_3 expects 2 inputs, but it received 1 input tensors. Input received: [<tf.Tensor 'model_3_input:0' shape=(?, 7, 145, 145) dtype=float32>]. I understand this is because my outer model is also Sequential, but how do I define the inputs? Also, is there any alternative way (apart from approach two) of doing this?
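For reference, a Sequential model accepts only one input tensor, so a two-input model cannot be added to it. A minimal sketch of the same topology built entirely with the functional API follows. The `branch` function is a hypothetical stand-in for model_unichannel, and the sizes are taken from the shape in the error message.

```python
import tensorflow as tf

channels, sequence_length = 7, 145  # from the shape shown in the error message

def branch(x):
    # Hypothetical stand-in for model_unichannel: dropout followed by flatten.
    x = tf.keras.layers.Dropout(0.5)(x)
    return tf.keras.layers.Flatten()(x)

in1 = tf.keras.Input(shape=(channels, sequence_length, sequence_length))
in2 = tf.keras.Input(shape=(channels, sequence_length, sequence_length))

# Concatenate works on tensors (branch outputs), not on model objects.
merged = tf.keras.layers.Concatenate()([branch(in1), branch(in2)])

# The outer model is a functional Model with two inputs, not a Sequential.
model = tf.keras.Model(inputs=[in1, in2], outputs=merged)
n_inputs = len(model.inputs)
merged_shape = tuple(model.output_shape)
```

Further layers can be applied to `merged` before creating the Model, which removes the need for an outer Sequential wrapper.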
I would like to extract and store the dropout mask [array of 1/0s] from a dropout layer in a Sequential Keras model at each batch while training. I was wondering if there is a straightforward way to do this within Keras or if I would need to switch over to TensorFlow (How to get the dropout mask in Tensorflow).
Would appreciate any help! I'm quite new to TensorFlow and Keras.
There are a couple of functions (dropout_layer.get_output_mask(), dropout_layer.get_input_mask()) for the dropout layer that I tried using but got None after calling on the previous layer.
model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(name="flat", input_shape=(28, 28, 1)))
model.add(tf.keras.layers.Dense(
    512,
    activation='relu',
    name='dense_1',
    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
    bias_initializer='zeros'))
dropout = tf.keras.layers.Dropout(0.2, name='dropout')  # want this layer's mask
model.add(dropout)
x = dropout.output_mask
y = dropout.input_mask
model.add(tf.keras.layers.Dense(
    10,
    activation='softmax',
    name='dense_2',
    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
    bias_initializer='zeros'))
model.compile(...)
model.fit(...)
It's not easily exposed in Keras. The call goes several levels deep before it reaches the TensorFlow dropout op.
So, although you're using Keras, the mask will also be a tensor in the graph that can be retrieved by name (to find its name, see: In Tensorflow, get the names of all the Tensors in a graph).
This option will, of course, lack some Keras information, so you would probably have to do it inside a Lambda layer so that Keras adds that information to the tensor. You must also take extra care because the tensor will exist even when not training (when the mask is skipped).
You can also use a less hacky way that costs a little extra processing: recover the mask from the dropout output, where the dropped units are exactly zero.
def getMask(x):
    boolMask = tf.not_equal(x, 0)
    floatMask = tf.cast(boolMask, tf.float32)  # or tf.float64
    return floatMask
Use a Lambda(getMask)(output_of_dropout_layer).
But instead of using a Sequential model, you will need a functional API Model.
inputs = tf.keras.layers.Input((28, 28, 1))
outputs = tf.keras.layers.Flatten(name="flat")(inputs)
outputs = tf.keras.layers.Dense(
    512,
    # activation='relu',  # relu will be a problem here: it produces zeros before dropout
    name='dense_1',
    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
    bias_initializer='zeros')(outputs)
outputs = tf.keras.layers.Dropout(0.2, name='dropout')(outputs)
mask = tf.keras.layers.Lambda(getMask)(outputs)
# there isn't an "input_mask"
# add the missing relu:
outputs = tf.keras.layers.Activation('relu')(outputs)
outputs = tf.keras.layers.Dense(
    10,
    activation='softmax',
    name='dense_2',
    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
    bias_initializer='zeros')(outputs)
model = tf.keras.Model(inputs, outputs)
model.compile(...)
model.fit(...)
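To see what the zero-comparison trick recovers, here is a minimal eager-mode sketch. The snake_case name `get_mask`, the all-ones input, and the rate of 0.5 are assumptions chosen for illustration: an all-ones input has no zeros before dropout, so every zero in the output comes from the mask.

```python
import numpy as np
import tensorflow as tf

def get_mask(x):
    # 1.0 where the unit survived dropout, 0.0 where it was dropped.
    return tf.cast(tf.not_equal(x, 0), tf.float32)

x = tf.ones((4, 10))
# training=True forces dropout to be applied outside of fit().
dropped = tf.keras.layers.Dropout(0.5)(x, training=True)
mask = get_mask(dropped)

mask_np = np.asarray(mask)
dropped_np = np.asarray(dropped)
# Surviving units are rescaled by 1 / (1 - rate), which is 2.0 here.
print(mask_np)
```

If the input to dropout can itself contain zeros (for example after a relu), those positions are indistinguishable from dropped units, which is why the relu is moved after the dropout in the model above.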
Training and predicting
Since you can't train the masks (it doesn't make any sense), the mask should not be an output of the model for training.
Now, we could try this:
trainingModel = Model(inputs, outputs)
predictingModel = Model(inputs, [outputs, mask])
But masks don't exist in prediction, because dropout is only applied in training. So this doesn't bring us anything good in the end.
The only way to get the mask during training is then to use a dummy loss and dummy targets:
def dummyLoss(y_true, y_pred):
    return y_true  # but this might raise a "None" gradient problem since it's not trainable, there is no connection to any weights, etc.
model.compile(loss=[loss_for_main_output, dummyLoss], ....)
model.fit(x_train, [y_train, np.zeros((len(y_train),) + mask_shape)], ...)
It's not guaranteed that these will work.
I found a very hacky way to do this by trivially extending the provided dropout layer. (Almost all code from TF.)
import tensorflow as tf
# Private TensorFlow modules used by the code copied from TF's dropout implementation
from tensorflow.python.framework import tensor_shape
from tensorflow.python.ops import array_ops, math_ops

class MyDR(tf.keras.layers.Layer):
    def __init__(self, rate, **kwargs):
        super(MyDR, self).__init__(**kwargs)
        self.noise_shape = None
        self.rate = rate

    def _get_noise_shape(self, x, noise_shape=None):
        # If noise_shape is none return immediately.
        if noise_shape is None:
            return array_ops.shape(x)
        try:
            # Best effort to figure out the intended shape.
            # If not possible, let the op to handle it.
            # In eager mode exception will show up.
            noise_shape_ = tensor_shape.as_shape(noise_shape)
        except (TypeError, ValueError):
            return noise_shape
        if x.shape.dims is not None and len(x.shape.dims) == len(noise_shape_.dims):
            new_dims = []
            for i, dim in enumerate(x.shape.dims):
                if noise_shape_.dims[i].value is None and dim.value is not None:
                    new_dims.append(dim.value)
                else:
                    new_dims.append(noise_shape_.dims[i].value)
            return tensor_shape.TensorShape(new_dims)
        return noise_shape

    def build(self, input_shape):
        self.noise_shape = input_shape
        print(self.noise_shape)
        super(MyDR, self).build(input_shape)

    #tf.function
    def call(self, input):
        self.noise_shape = self._get_noise_shape(input)
        random_tensor = tf.random.uniform(self.noise_shape, seed=1235, dtype=input.dtype)
        keep_prob = 1 - self.rate
        scale = 1 / keep_prob
        # NOTE: if (1.0 + rate) - 1 is equal to rate, then we want to consider that
        # float to be selected, hence we use a >= comparison.
        self.keep_mask = random_tensor >= self.rate
        # NOTE: here is where I save the binary masks.
        # the file grows quite big!
        tf.print(self.keep_mask, output_stream="file://temp/droput_mask.txt")
        ret = input * scale * math_ops.cast(self.keep_mask, input.dtype)
        return ret
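A shorter variant of the same idea that relies only on public TensorFlow calls might look like the sketch below. The class name `MaskedDropout` and the attribute `last_mask` are hypothetical, and keeping the mask on the layer like this is only meaningful in eager execution; inside a compiled training step the attribute would hold a symbolic tensor.

```python
import numpy as np
import tensorflow as tf

class MaskedDropout(tf.keras.layers.Layer):
    """Dropout that keeps the most recent keep-mask (hypothetical helper)."""

    def __init__(self, rate, **kwargs):
        super().__init__(**kwargs)
        self.rate = rate
        self.last_mask = None

    def call(self, inputs, training=False):
        if not training:
            # Dropout is the identity at inference time, so there is no mask.
            return inputs
        keep = tf.random.uniform(tf.shape(inputs), dtype=inputs.dtype) >= self.rate
        self.last_mask = keep
        # Rescale the surviving units by 1 / (1 - rate), as standard dropout does.
        return inputs * tf.cast(keep, inputs.dtype) / (1.0 - self.rate)

layer = MaskedDropout(0.2)
x = tf.ones((2, 5))
out = layer(x, training=True)
mask_np = np.asarray(layer.last_mask).astype("float32")
out_np = np.asarray(out)
identity_np = np.asarray(layer(x, training=False))
```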
I am trying to use the implementation of DeepTriage, which is a deep learning approach for bug triaging. This website includes the dataset, source code and paper. I know that this is a very specific area, but I'll try to make it simple.
In the source code they define their approach "DBRNN-A: Deep Bidirectional Recurrent Neural Network with Attention mechanism and with Long Short-Term Memory units (LSTM)" with this code part:
input = Input(shape=(max_sentence_len,), dtype='int32')
sequence_embed = Embedding(vocab_size, embed_size_word2vec, input_length=max_sentence_len)(input)
forwards_1 = LSTM(1024, return_sequences=True, dropout_U=0.2)(sequence_embed)
attention_1 = SoftAttentionConcat()(forwards_1)
after_dp_forward_5 = BatchNormalization()(attention_1)
backwards_1 = LSTM(1024, return_sequences=True, dropout_U=0.2, go_backwards=True)(sequence_embed)
attention_2 = SoftAttentionConcat()(backwards_1)
after_dp_backward_5 = BatchNormalization()(attention_2)
merged = merge([after_dp_forward_5, after_dp_backward_5], mode='concat', concat_axis=-1)
after_merge = Dense(1000, activation='relu')(merged)
after_dp = Dropout(0.4)(after_merge)
output = Dense(len(train_label), activation='softmax')(after_dp)
model = Model(input=input, output=output)
model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=1e-4), metrics=['accuracy'])
SoftAttentionConcat implementation is from here. Rest of the functions are from keras. Also, in the paper they share the structure as:
In the first batch normalization line, it throws this error:
ValueError: Input 0 is incompatible with layer batch_normalization_1: expected ndim=3, found ndim=2
When I use max_sentence_len=50 and embed_size_word2vec=200 and look at the dimensions up to the point of the error, I see these shapes:
Input -> (None, 50)
Embedding -> (None, 50, 200)
LSTM -> (None, None, 1024)
SoftAttentionConcat -> (None, 2048)
So, is there anybody seeing the problem here?
I guess the problem is using TensorFlow code in a Keras structure or some version issues.
By using the question and the answers here, I implemented the attention mechanism in Keras as follows:
attention_1 = Dense(1, activation="tanh")(forwards_1)
attention_1 = Flatten()(attention_1) # squeeze (None,50,1)->(None,50)
attention_1 = Activation("softmax")(attention_1)
attention_1 = RepeatVector(num_rnn_unit)(attention_1)
attention_1 = Permute([2, 1])(attention_1)
attention_1 = multiply([forwards_1, attention_1])
attention_1 = Lambda(lambda xin: K.sum(xin, axis=1), output_shape=(num_rnn_unit,))(attention_1)
last_out_1 = Lambda(lambda xin: xin[:, -1, :])(forwards_1)
sent_representation_1 = concatenate([last_out_1, attention_1])
This works quite well. All the source code that I used for the implementation is available on GitHub.
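To make the shape at each attention step explicit, the snippet can be wrapped in a small self-contained model. This is a sketch under assumptions: the sizes are illustrative (an LSTM with 64 units instead of 1024), the input is already embedded, and `tf.reduce_sum` is used in place of `K.sum`.

```python
import tensorflow as tf
from tensorflow.keras import layers

max_sentence_len, embed_size, num_rnn_unit = 50, 200, 64  # illustrative sizes

inp = layers.Input(shape=(max_sentence_len, embed_size))
forwards_1 = layers.LSTM(num_rnn_unit, return_sequences=True)(inp)  # (None, 50, 64)

scores = layers.Dense(1, activation="tanh")(forwards_1)   # (None, 50, 1)
scores = layers.Flatten()(scores)                         # (None, 50)
weights = layers.Activation("softmax")(scores)            # (None, 50), sums to 1 over time
weights = layers.RepeatVector(num_rnn_unit)(weights)      # (None, 64, 50)
weights = layers.Permute([2, 1])(weights)                 # (None, 50, 64)
weighted = layers.Multiply()([forwards_1, weights])       # (None, 50, 64)

# Weighted sum over the time axis gives the attention context vector.
context = layers.Lambda(lambda t: tf.reduce_sum(t, axis=1))(weighted)  # (None, 64)
last_out = layers.Lambda(lambda t: t[:, -1, :])(forwards_1)            # (None, 64)
sent_representation = layers.Concatenate()([last_out, context])        # (None, 128)

model = tf.keras.Model(inp, sent_representation)
rep_shape = tuple(model.output_shape)
```

The result is two-dimensional, (None, 2 * num_rnn_unit), which matches the (None, 2048) shape reported for SoftAttentionConcat with 1024 units.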
I'm trying to prepend a preprocessing layer to a pre-trained network. This is the code I'm working on:
orig_model = applications.vgg16.VGG16(include_top=True, weights=None, input_tensor=None, input_shape=None, pooling=None, classes=1000)
orig_model.load_weights(weights_path)
preproc_layer = Lambda(preprocess, input_shape=(3,224,224), output_shape=(3,224,224))
model = Sequential()
model.add(preproc_layer)
all_layers = orig_model.layers
for l in all_layers:
    config = l.get_config()
    copy = layers.deserialize({'class_name':l.__class__.__name__, 'config': config})
    weights = l.get_weights()
    copy.set_weights(weights)
    model.add(copy)
Where preprocess is:
def preprocess(x):
    x = x[::-1, ...]
    x = K.bias_add(x, vgg_mean, data_format='channels_first')
    return x
It works for the first InputLayer but throws me an error at copy.set_weights(weights) for the second (Conv2D) layer:
You called `set_weights(weights)` on layer "block1_conv1" with a weight list of length 2, but the layer was expecting 0 weights.
I found something similar on Google: https://github.com/keras-team/keras/issues/4812. Here they suggest setting trainable = True for the layer, but this doesn't work in my case.
Do you have any suggestions? Keras version is 2.1.5, Tensorflow 1.6.0
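One likely cause, offered as an assumption rather than a confirmed diagnosis for Keras 2.1.5: a layer recreated from its config has no weight variables until it is built, so set_weights finds 0 expected weights. Adding the copy to the model first builds it. The sketch below shows this with a small Dense layer as a stand-in for the VGG16 layers, and uses `from_config` instead of `layers.deserialize`.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Source layer with real weights (stand-in for a VGG16 layer).
src = layers.Dense(4)
src.build((None, 3))

# A copy created from the config is not built yet, so it holds no weights.
copy = type(src).from_config(src.get_config())
n_weights_before = len(copy.weights)

model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(3,)))
model.add(copy)                       # adding to the model builds the layer
n_weights_after = len(copy.weights)

copy.set_weights(src.get_weights())   # now the shapes match
kernels_equal = np.allclose(copy.get_weights()[0], src.get_weights()[0])
```

In the loop from the question, this would correspond to calling model.add(copy) before copy.set_weights(weights).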
Let's say I want to train a GRU, and because I need stateful=True the batch size has to be known beforehand.
Using the functional API I would have an Input as follows:
input_1 = Input(batch_shape=(batch_size, None, features))
But when I evaluate the model I don't want to pass my test data in batches with fixed timesteps (batch_size = 1; predictions for one observation). My solution at the moment is to load the saved model and rebuild it with:
input_1 = Input(shape=(None, num_input_dim))
To do that, though, I need a method that goes through every layer of the model and then sets the weights afterwards.
input_1 = Input(shape=(None, num_input_dim))
x1 = input_1
weights = []
for l in range(0, len(layers)):
    if isinstance(layers[l], keras.layers.GRU):
        x1 = GRU(layers[l].output_shape[-1], return_sequences=True)(x1)
        weights.append(layers[l].get_weights())
    elif isinstance(layers[l], keras.layers.Dense):
        x1 = Dense(layers[l].output_shape[-1], activation='tanh')(x1)
        weights.append(layers[l].get_weights())
    else:
        continue
(This is just an example and I find this solution very inelegant.)
There must be a better way to redefine the input shape. Can somebody help me out here, please?
Since you're not using a stateful=True model for evaluating, then you do need to redefine the model.
You can make a function to create the model taking the options as input:
def createModel(stateful, weights=None):
    # input
    if stateful:
        batch = batch_size
    else:
        batch = None
    # You don't need fixed timesteps, even if the model is stateful
    input_1 = Input(batch_shape=(batch, None, num_input_dim))
    # layer creation as you did with your first model
    ...
    out = LSTM(...., stateful=stateful)(someInput)
    ...
    model = Model(input_1, out)
    if weights is not None:
        model.set_weights(weights)
    return model
Work sequence:
#create the training model
trainModel = createModel(True,None)
#train
...
#create the other model
newModel = createModel(False,trainModel.get_weights())
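A runnable version of this work sequence, as a minimal sketch: the GRU and Dense layers, the sizes, and the snake_case name `create_model` are assumptions filled in for the parts elided above, and the training step is skipped.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def create_model(stateful, batch_size=4, num_input_dim=3, units=8, weights=None):
    # A stateful model needs a fixed batch size; otherwise leave it undefined.
    batch = batch_size if stateful else None
    inp = layers.Input(batch_shape=(batch, None, num_input_dim))
    x = layers.GRU(units, return_sequences=True, stateful=stateful)(inp)
    out = layers.Dense(1, activation="tanh")(x)
    model = tf.keras.Model(inp, out)
    if weights is not None:
        # Same architecture, so the weight lists line up one to one.
        model.set_weights(weights)
    return model

train_model = create_model(stateful=True)
# ... train train_model here ...
predict_model = create_model(stateful=False, weights=train_model.get_weights())

# One observation with 7 timesteps, independent of the training batch size.
pred = predict_model.predict(np.zeros((1, 7, 3), dtype="float32"), verbose=0)
```

Building both models from one function keeps the architectures identical, so no per-layer weight copying loop is needed.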