So, I have a multi-input model with 6 identical inputs of the same shape. Right now, to use this model, I have to replicate my input data once for each input layer, i.e. 6 times. I was wondering if I can add another layer on top of this and pass a single input that connects to all 6 inputs. I'm not sure how to accomplish this! Any thoughts?
The issue was something like this: I have a "BASE" multi-input model where all the inputs are identical, because the "BASE" model is just a combination of multiple models that happen to share the same input type. When using this "BASE" model for classification, I had to provide [input_data x total_inputs] for each input layer, which is something I didn't want to do when classifying, say, millions of sentences!
So the ideal solution was to have a single input connected to all of the "BASE" model's inputs.
Alrighty, so here is how it's done:
Create a new top_model that takes a single input and generates multiple identical outputs. This can be done with a Lambda layer.
from tensorflow.keras import layers, Model

single_input = layers.Input(input_shape)
multi_output = layers.Lambda(lambda x: [x] * total_numbers_of_base_inputs)(single_input)
top_model = Model(inputs=single_input, outputs=multi_output)
Use the top_model input and your multi_input_base_model as shown below to create the new single-input model.
new_model = Model(inputs=top_model.input, outputs=multi_input_base_model(top_model.output))
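For context, here is a minimal end-to-end sketch of the pattern (the toy "BASE" model, its shapes, and any names beyond those used above are illustrative assumptions, not part of the original setup):

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

total_numbers_of_base_inputs = 6
input_shape = (10,)  # assumed shape for the toy example

# Toy "BASE" model: six identical inputs, concatenated and classified.
base_inputs = [layers.Input(input_shape) for _ in range(total_numbers_of_base_inputs)]
base_output = layers.Dense(1, activation='sigmoid')(layers.Concatenate()(base_inputs))
multi_input_base_model = Model(inputs=base_inputs, outputs=base_output)

# Single-input wrapper built with the Lambda trick above.
single_input = layers.Input(input_shape)
multi_output = layers.Lambda(lambda t: [t] * total_numbers_of_base_inputs)(single_input)
top_model = Model(inputs=single_input, outputs=multi_output)
new_model = Model(inputs=top_model.input, outputs=multi_input_base_model(top_model.output))

# A single array is now enough; no need to pass the same data six times.
predictions = new_model.predict(np.random.rand(4, *input_shape))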
In the TensorFlow ML Basics with Keras tutorial on basic text classification, when preparing the trained model for export, the tutorial suggests including the TextVectorization layer in the Model so it can "process raw strings". I understand why this is done.
But then the code snippet is:
export_model = tf.keras.Sequential([
vectorize_layer,
model,
layers.Activation('sigmoid')
])
Why, when preparing the model for export, does the tutorial also include a new activation layer, layers.Activation('sigmoid')? Why not incorporate this layer into the original model?
Before the TextVectorization layer was introduced, you had to clean up your raw strings manually. This usually meant removing punctuation, lower-casing, tokenizing, and so forth:
#Raw String
"Furthermore, he asked himself why it happened to Billy?"
#Remove punctuation
"Furthermore he asked himself why it happened to Billy"
#Lower-case
"furthermore he asked himself why it happened to billy"
#Tokenize
['furthermore', 'he', 'asked', 'himself', 'why', 'it', 'happened', 'to', 'billy']
If you include the TextVectorization layer in your model when you export, you can essentially feed raw strings into your model for prediction without having to clean them up first.
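As a rough sketch of how that works in practice (the parameter values and sample text here are illustrative, not taken from the tutorial):

import tensorflow as tf

raw_texts = tf.constant(["Furthermore, he asked himself why it happened to Billy?"])

vectorize_layer = tf.keras.layers.TextVectorization(
    max_tokens=10000,            # vocabulary size (illustrative)
    output_mode='int',
    output_sequence_length=250)  # pad/truncate every example to a fixed length
vectorize_layer.adapt(raw_texts)  # build the vocabulary from the raw strings

# The layer now lower-cases, strips punctuation and tokenizes in one step,
# so it can sit directly in front of the trained model at export time.
print(vectorize_layer(raw_texts))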
Regarding your second question: I also find it rather odd that the sigmoid activation function was not used. I imagine that the last layer has a linear activation function because of the dataset and its samples: the samples can be split into two classes, i.e. a linearly separable problem.
The problem with a linear activation function during inference is that it can output negative values:
# With linear activation function
examples = [
"The movie was great!",
"The movie was okay.",
"The movie was terrible..."
]
export_model.predict(examples)
'''
array([[ 0.4543204 ],
[-0.26730654],
[-0.61234593]], dtype=float32)
'''
For example, the value -0.26730654 could indicate that the review "The movie was okay." is negative, but this is not necessarily the case. What one actually wants to predict is the probability that a particular sample belongs to a particular class. Therefore, a sigmoid function is applied during inference to squeeze the output values between 0 and 1. The output can then be interpreted as the probability that sample x belongs to class n:
# With sigmoid activation function
examples = [
"The movie was great!",
"The movie was okay.",
"The movie was terrible..."
]
export_model.predict(examples)
'''
array([[0.6116659 ],
[0.43356845],
[0.35152423]], dtype=float32)
'''
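As a quick sanity check (not part of the original answer), applying the sigmoid by hand to the raw linear outputs above reproduces exactly these probabilities:

import numpy as np

raw_scores = np.array([0.4543204, -0.26730654, -0.61234593])
probabilities = 1.0 / (1.0 + np.exp(-raw_scores))  # sigmoid(x) = 1 / (1 + e^(-x))
print(probabilities)  # [0.6116659  0.43356845 0.35152423]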
Sometimes you want to know the model's output before the sigmoid, as it may contain useful information, for example about the shape of the output distribution and how it evolves. In such a scenario it is convenient to keep the final scaling as a separate entity; otherwise one would have to remove/add the sigmoid layer by hand, which means more lines of code and more possible errors. So it may be good practice to apply the sigmoid at the very end, just before saving/exporting. Or it may simply be a convention.
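One way to keep both views available (a sketch reusing the tutorial's vectorize_layer and model names): let the trained model keep producing raw scores, and only wrap it with the sigmoid for the exported version.

import tensorflow as tf

# Raw scores (logits), useful for inspecting their distribution over time.
logits_model = tf.keras.Sequential([vectorize_layer, model])

# The exported model adds the sigmoid so callers get probabilities directly.
export_model = tf.keras.Sequential([logits_model,
                                    tf.keras.layers.Activation('sigmoid')])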
I have two models (model_a, model_b) with a similar structure (a VGG16 architecture with a replaced top block). I need to concatenate the outputs of the last layers of both models in order to send them as input to an attention mechanism.
I run the following line of code for concatenation:
merged = Concatenate()([model_a.layers[-1].layers[-1], model_b.layers[-1].layers[-1]])
(model_a.layers[-1] is the top block which is a Sequential object, model_a.layers[-1].layers[-1] is a Dense layer.)
However, I receive the following error when I try to do so:
Layer concatenate_8 was called with an input that isn't a symbolic
tensor. Received type: <class 'keras.layers.core.Dense'>. Full input:
[<keras.layers.core.Dense object at 0x>,
<keras.layers.core.Dense object at 0x>]. All inputs to the
layer should be tensors.
I noticed that similar issues are fixed by redefining the last layer by specifying the input layer for it, but I'm not sure how that solution would help here since I'm using predefined and pre-trained models.
Use the .output attribute to access the symbolic tensor instead of passing the layer object itself.
merged = Concatenate()(
[model_a.layers[-1].layers[-1].output, model_b.layers[-1].layers[-1].output])
The tensorflow.keras.layers.Layer documentation states the following about Layer.output
output: Retrieves the output tensor(s) of a layer.
Only applicable if the layer has exactly one output, i.e. if it is connected to one incoming layer.
Regarding the comment
Actually, when I finally tried to combine the inputs and outputs using the Model() method, I got an error: Graph disconnected: Cannot get value of tensor Tensor Flatten... However, when I changed model_a.layers[-1].layers[-1].output to model_a.output (and similarly for model_b), the issue was resolved. Any idea why?
It's difficult to say without seeing the model code, but you can compare the values of model_a.layers[-1].layers[-1] to model_a.output.
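For reference, a sketch of the pattern that resolved the error in the comment (assuming model_a and model_b are ordinary Keras Model objects; the attention mechanism is omitted): concatenating the models' own .output tensors keeps the graph connected all the way back to their inputs.

from tensorflow.keras.layers import Concatenate
from tensorflow.keras.models import Model

# .output is the symbolic tensor at the end of each model's graph, so the
# merged model can trace a path back to model_a.input and model_b.input.
merged = Concatenate()([model_a.output, model_b.output])
combined = Model(inputs=[model_a.input, model_b.input], outputs=merged)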
I am trying to train a model for emotion recognition, which uses one of VGG's layer's output as an input.
I could achieve what I want by running the prediction as a first step, saving the extracted features, and then using them as input to my network, but I am looking for a way to do the whole process at once.
The second model uses a concatenated array of feature maps as input (I am working with video data), so I am not able to simply wire it to the output of VGG.
I tried to use a map operation as described in the tf.data.Dataset API documentation, this way:
def trimmed_vgg16():
    # VGG16 up to its first fully connected layer (fc1).
    vgg16 = tf.keras.applications.vgg16.VGG16(input_shape=(224, 224, 3))
    trimmed = tf.keras.models.Model(inputs=vgg16.get_input_at(0),
                                    outputs=vgg16.layers[-3].get_output_at(0))
    return trimmed

vgg16_model = trimmed_vgg16()

def _extract_vgg_features(images, labels):
    pred = vgg16_model.predict(images, batch_size=batch_size, steps=1)
    return pred, labels

dataset = ...  # load the dataset of (image, label) pairs as usual
dataset = dataset.map(_extract_vgg_features)
But I'm getting this error: Tensor Tensor("fc1/Relu:0", shape=(?, 4096), dtype=float32) is not an element of this graph, which is pretty explicit. I'm stuck here, as I don't see a good way of inserting the trained model into the same graph and getting predictions "on the fly".
Is there a clean way of doing this or something similar ?
Edit: missed a line.
Edit2: added details
You should be able to connect the layers by first creating the VGG16 model and then retrieving its output as shown below; afterwards you can use that tensor as the input to your own network.
vgg16 = tf.keras.applications.vgg16.VGG16(input_shape=(224,224,3))
network_input = vgg16.get_input_at(0)
vgg16_out = vgg16.layers[-3].get_output_at(0) # use this tensor as input to your own network
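A sketch of what the rest of the wiring could look like (the head below is purely illustrative, not the asker's emotion-recognition network): build your own layers on top of the retrieved tensor and wrap everything in a single Model, so VGG16 runs in the same graph as the rest of the network.

import tensorflow as tf

vgg16 = tf.keras.applications.vgg16.VGG16(input_shape=(224, 224, 3))
network_input = vgg16.get_input_at(0)
vgg16_out = vgg16.layers[-3].get_output_at(0)

# Hypothetical classification head; replace with your own network.
x = tf.keras.layers.Dense(256, activation='relu')(vgg16_out)
predictions = tf.keras.layers.Dense(7, activation='softmax')(x)  # e.g. 7 emotion classes

full_model = tf.keras.models.Model(inputs=network_input, outputs=predictions)

# Optionally freeze the VGG16 weights so only the new head is trained.
for layer in vgg16.layers:
    layer.trainable = False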
I currently have a keras model which uses an Embedding layer. Something like this:
input = tf.keras.layers.Input(shape=(20,), dtype='int32')
x = tf.keras.layers.Embedding(input_dim=1000,
                              output_dim=50,
                              input_length=20,
                              trainable=True,
                              embeddings_initializer='glorot_uniform',
                              mask_zero=False)(input)
This is great and works as expected. However, I want to be able to send text to my model, have it preprocess the text into integers, and continue normally.
Two issues:
1) The Keras docs say that Embedding layers can only be used as the first layer in a model: https://keras.io/layers/embeddings/
2) Even if I could add a Lambda layer before the Embedding, I'd need it to keep track of certain state (like a dictionary mapping specific words to integers). How might I go about this stateful preprocessing?
In short, I need to modify the underlying TensorFlow DAG so that when I save my model and upload it to ML Engine, it will be able to handle being sent raw text.
Thanks!
Here are the first few layers of a model which uses a string input:
import tensorflow as tf
import keras
from keras.layers import Lambda, Embedding

input = keras.layers.Input(shape=(1,), dtype="string", name='input_1')

# Lookup table mapping each word in vocab_list to an integer id.
lookup_table_op = tf.contrib.lookup.index_table_from_tensor(
    mapping=vocab_list,
    num_oov_buckets=num_oov_buckets,
    default_value=-1,
)
lambda_output = Lambda(lookup_table_op.lookup)(input)
emb_layer = Embedding(int(number_of_categories),
                      int(number_of_categories ** 0.25))(lambda_output)
Then you can continue the model as you normally would after an embedding layer. This is working for me and the model trains fine from string inputs.
It is recommended that you do the string -> int conversion in a preprocessing step to speed up training. Then, after the model is trained, you create a second Keras model that just converts string -> int, and combine the two models to get the full string -> target model.
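A sketch of that combination (trained_int_model is an assumed name for the model trained on integer sequences; input and lambda_output are the string Input and lookup Lambda from the snippet above):

from keras.models import Model

# string -> int model: the string Input plus the lookup Lambda layer.
string_to_int_model = Model(inputs=input, outputs=lambda_output)

# Chain it with the trained model to get the full string -> target model.
full_model = Model(inputs=string_to_int_model.input,
                   outputs=trained_int_model(string_to_int_model.output))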
I want to create a multi-layered bidirectional LSTM in Tensorflow. Currently my single-layered model looks like:
cell_fw = tf.contrib.rnn.LSTMCell(hidden_size)
cell_bw = tf.contrib.rnn.LSTMCell(hidden_size)
(self.out_fw, self.out_bw), _ = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, input, ...)
In order to turn this into a multi-layered network, I suspect I cannot simply wrap a few LSTMCells with MultiRNNCell like so:
multi_cell_fw = tf.contrib.rnn.MultiRNNCell([cell_fw] * num_layers, ...)
and feed them into the bidirectional_dynamic_rnn since both forward and backward LSTMs in each layer need the output of both the forward and backward directions of the preceding layer. Currently my solution is to create my bidirectional_dynamic_rnns in a loop, feeding in the concatenated output of LSTMs of the preceding layers.
However, it's not very clean and frankly I'm not sure if it's correct, though it does work on a toy dataset. Is there a better way that's comparably elegant to using something like MultiRNNCell?
I'm using Tensorflow API r1.0.
Just do:
multi_cell_fw = tf.contrib.rnn.MultiRNNCell(
    [tf.contrib.rnn.LSTMCell(hidden_size) for _ in range(num_layers)], ...)
Creating a fresh LSTMCell inside the comprehension matters: reusing one cell object for every layer makes the layers share weights and can trigger a variable-reuse error. That should work.
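For completeness, a sketch of the full wiring under the same TF r1.0-era API (placeholder shapes and sizes are illustrative): one MultiRNNCell per direction, each built from freshly created cells, passed straight to bidirectional_dynamic_rnn.

import tensorflow as tf

hidden_size = 128  # illustrative
num_layers = 3     # illustrative
inputs = tf.placeholder(tf.float32, [None, None, 300])  # [batch, time, features]

def make_multi_cell():
    # A fresh LSTMCell per layer so the layers do not share weights.
    return tf.contrib.rnn.MultiRNNCell(
        [tf.contrib.rnn.LSTMCell(hidden_size) for _ in range(num_layers)])

(out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
    make_multi_cell(), make_multi_cell(), inputs, dtype=tf.float32)

outputs = tf.concat([out_fw, out_bw], axis=-1)  # [batch, time, 2 * hidden_size]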