I want to create a multi-layered bidirectional LSTM in Tensorflow. Currently my single-layered model looks like:
cell_fw = tf.contrib.rnn.LSTMCell(hidden_size)
cell_bw = tf.contrib.rnn.LSTMCell(hidden_size)
(self.out_fw, self.out_bw), _ = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, input, ...)
To turn this into a multi-layered model, I suspect I cannot simply wrap a few LSTMCells in a MultiRNNCell like so:
multi_cell_fw = tf.contrib.rnn.MultiRNNCell([cell_fw] * num_layers, ...)
and feed them into the bidirectional_dynamic_rnn since both forward and backward LSTMs in each layer need the output of both the forward and backward directions of the preceding layer. Currently my solution is to create my bidirectional_dynamic_rnns in a loop, feeding in the concatenated output of LSTMs of the preceding layers.
However, it's not very clean and frankly I'm not sure if it's correct, though it does work on a toy dataset. Is there a better way that's comparably elegant to using something like MultiRNNCell?
I'm using Tensorflow API r1.0.
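Just build each layer from its own cell instance instead of repeating the same object:
multi_cell_fw = tf.contrib.rnn.MultiRNNCell([tf.contrib.rnn.LSTMCell(hidden_size) for _ in range(num_layers)], ...)
That should work; do the same for the backward cells. Note that this stacks each direction separately, so the two directions are only combined at the output of the top layer.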
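For reference, the loop approach described in the question can be sketched without any framework. The following is a minimal NumPy illustration, assuming a plain tanh RNN instead of an LSTM and randomly initialised weights; the names simple_rnn and stacked_bidirectional are made up for this example. It only shows how the forward and backward outputs of one layer are concatenated and fed to both directions of the next layer.

```python
import numpy as np


def simple_rnn(inputs, w_in, w_rec):
    """Run a plain tanh RNN over inputs of shape (time, features)."""
    state = np.zeros(w_rec.shape[0])
    outputs = []
    for x_t in inputs:
        state = np.tanh(x_t @ w_in + state @ w_rec)
        outputs.append(state)
    return np.stack(outputs)


def stacked_bidirectional(inputs, hidden_size, num_layers, seed=0):
    """Stack bidirectional layers; each layer sees both directions of the previous one."""
    rng = np.random.default_rng(seed)
    layer_input = inputs
    for _ in range(num_layers):
        in_size = layer_input.shape[1]
        # Separate weights per direction and per layer (nothing is shared).
        w_in_fw = rng.normal(scale=0.1, size=(in_size, hidden_size))
        w_rec_fw = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
        w_in_bw = rng.normal(scale=0.1, size=(in_size, hidden_size))
        w_rec_bw = rng.normal(scale=0.1, size=(hidden_size, hidden_size))

        out_fw = simple_rnn(layer_input, w_in_fw, w_rec_fw)
        # The backward pass reads the reversed sequence and is flipped back
        # so that both outputs are aligned in time.
        out_bw = simple_rnn(layer_input[::-1], w_in_bw, w_rec_bw)[::-1]

        # Concatenate on the feature axis: the next layer gets 2 * hidden_size features.
        layer_input = np.concatenate([out_fw, out_bw], axis=1)
    return layer_input


sequence = np.ones((5, 3))  # 5 time steps, 3 features
result = stacked_bidirectional(sequence, hidden_size=4, num_layers=2)
print(result.shape)  # (5, 8)
```

In TensorFlow the same pattern would presumably mean one bidirectional_dynamic_rnn call per layer, each in its own variable scope, with the two outputs joined by tf.concat before the next layer.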
I am trying to train a model for emotion recognition, which uses the output of one of VGG's layers as its input.
I could manage what I want by running the prediction in a first step, saving the extracted features and then using them as input to my network, but I am looking for a way to do the whole process at once.
The second model uses a concatenated array of feature maps as input (I am working with video data), so I am not able to simply wire it to the output of VGG.
I tried to use a map operation, as described in the tf.data.Dataset API documentation, this way:
def trimmed_vgg16():
    vgg16 = tf.keras.applications.vgg16.VGG16(input_shape=(224, 224, 3))
    trimmed = tf.keras.models.Model(inputs=vgg16.get_input_at(0),
                                    outputs=vgg16.layers[-3].get_output_at(0))
    return trimmed
vgg16 = trimmed_vgg16()
def _extract_vgg_features(images, labels):
    pred = vgg16.predict(images, batch_size=batch_size, steps=1)
    return pred, labels
dataset = ...  # load the dataset (image, label) as usual
dataset = dataset.map(_extract_vgg_features)
But I'm getting this error: Tensor Tensor("fc1/Relu:0", shape=(?, 4096), dtype=float32) is not an element of this graph, which is pretty explicit. I'm stuck here, as I don't see a good way of inserting the trained model into the same graph and getting predictions "on the fly".
Is there a clean way of doing this or something similar?
Edit: missed a line.
Edit2: added details
You should be able to connect the two networks by first creating the VGG16 model and then retrieving its intermediate output tensor as shown here. Afterward you can use that tensor as the input to your own network.
vgg16 = tf.keras.applications.vgg16.VGG16(input_shape=(224,224,3))
network_input = vgg16.get_input_at(0)
vgg16_out = vgg16.layers[-3].get_output_at(0) # use this tensor as input to your own network
I currently have a keras model which uses an Embedding layer. Something like this:
input = tf.keras.layers.Input(shape=(20,), dtype='int32')
x = tf.keras.layers.Embedding(input_dim=1000,
                              output_dim=50,
                              input_length=20,
                              trainable=True,
                              embeddings_initializer='glorot_uniform',
                              mask_zero=False)(input)
This is great and works as expected. However, I want to be able to send text to my model, have it preprocess the text into integers, and continue normally.
Two issues:
1) The Keras docs say that Embedding layers can only be used as the first layer in a model: https://keras.io/layers/embeddings/
2) Even if I could add a Lambda layer before the Embedding, I'd need it to keep track of certain state (like a dictionary mapping specific words to integers). How might I go about this stateful preprocessing?
In short, I need to modify the underlying Tensorflow DAG, so when I save my model and upload to ML Engine, it'll be able to handle my sending it raw text.
Thanks!
Here are the first few layers of a model which uses a string input:
input = keras.layers.Input(shape=(1,), dtype="string", name='input_1')
lookup_table_op = tf.contrib.lookup.index_table_from_tensor(
    mapping=vocab_list,
    num_oov_buckets=num_oov_buckets,
    default_value=-1,
)
lambda_output = Lambda(lookup_table_op.lookup)(input)
emb_layer = Embedding(int(number_of_categories), int(number_of_categories**0.25))(lambda_output)
Then you can continue the model as you normally would after an embedding layer. This is working for me and the model trains fine from string inputs.
It is recommended that you do the string -> int conversion in some preprocessing step to speed up the training process. Then after the model is trained you create a second keras model that just converts string -> int and then combine the two models to get the full string -> target model.
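To make the string -> int step concrete, here is a small pure-Python sketch of the mapping that such a lookup table performs: in-vocabulary tokens get their position in the vocabulary, and unknown tokens are hashed into one of the extra OOV buckets placed after the vocabulary. The helper build_lookup is hypothetical, and zlib.crc32 merely stands in for the library's own hash, so the OOV ids will not match the ones TensorFlow produces.

```python
import zlib


def build_lookup(vocab_list, num_oov_buckets):
    """Return a function that maps a token to an integer id."""
    index = {token: i for i, token in enumerate(vocab_list)}
    vocab_size = len(vocab_list)

    def lookup(token):
        if token in index:
            return index[token]
        if num_oov_buckets > 0:
            # crc32 is an assumption for illustration; TensorFlow uses its own hash.
            bucket = zlib.crc32(token.encode("utf-8")) % num_oov_buckets
            return vocab_size + bucket
        return -1  # mirrors default_value=-1 when there are no OOV buckets

    return lookup


lookup = build_lookup(["the", "cat", "sat"], num_oov_buckets=2)
ids = [lookup(token) for token in "the cat sat down".split()]
print(ids[:3])  # [0, 1, 2]
```

The last id (for "down") lands in one of the two OOV buckets, i.e. it is either 3 or 4. The state the question asks about is just the index dictionary, which is why the vocabulary has to be saved together with the model.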
So, I have this multi-input model with 6 identical inputs of the same shape. Right now, to use this model, I have to replicate my input data once per input layer, i.e. 6 times. I was wondering whether I can add another layer on top of it so that I can pass a single input that connects to all 6 inputs. I'm not sure how to accomplish this! Any thoughts?
The issue was something like this: I have a "BASE" multi-input model in which all inputs are identical, because this "BASE" model is just a combination of multiple models that happen to share the same type of input. When using this "BASE" model for classification, I had to provide the input data once for each input layer, which is something I did not want to do when, say, classifying millions of sentences!
So the ideal solution was to have a single input that is connected to all of the "BASE" model's inputs.
Alrighty, so here is how it's done:
Create a new top_model which takes a single input and generates multiple identical outputs. This can be done with a Lambda layer.
single_input = layers.Input(input_shape)
multi_output = layers.Lambda(lambda x: [x] * total_numbers_of_base_inputs)(single_input)
top_model = Model(inputs=single_input, outputs=multi_output)
Use the top_model input and your multi_input_base_model as shown below to create the new single-input model.
new_model = Model(inputs=top_model.input, outputs=multi_input_base_model(top_model.output))
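The fan-out idea itself does not depend on Keras. As a rough framework-free sketch, assuming a made-up base model that takes a list of 3 arrays (multi_input_base_model and make_single_input_model here are hypothetical stand-ins, not Keras objects), the wrapper looks like this:

```python
import numpy as np


def multi_input_base_model(inputs):
    """Hypothetical stand-in for the base model: expects a list of 3 arrays."""
    assert len(inputs) == 3
    # Pretend each sub-model scales its input differently, then average the results.
    return sum((i + 1) * x for i, x in enumerate(inputs)) / len(inputs)


def make_single_input_model(base_model, total_inputs):
    """Wrap base_model so callers pass one array instead of total_inputs copies."""
    def single_input_model(x):
        # Same replication trick as the Lambda layer: [x] * n holds n
        # references to one array, so the data itself is not copied.
        return base_model([x] * total_inputs)
    return single_input_model


new_model = make_single_input_model(multi_input_base_model, total_inputs=3)
batch = np.array([[1.0, 2.0], [3.0, 4.0]])
print(new_model(batch))
```

With this toy base model the weights 1, 2 and 3 average to 2, so the printed result is simply twice the input batch.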
I am using the Python API of CNTK to train a CNN that I save using the save_model function.
Now I want to run some analysis on my network afterwards. Specifically I want to take a look at the activations of each layer. Obviously I can run my network on some data called img like this:
model.eval(img)
But that will only give me the output of the last layer in my network. Is there some easy way to also get the output from the previous layers?
Actually, there is even an example provided for that task: https://github.com/Microsoft/CNTK/tree/master/Examples/Image/FeatureExtraction
Let me give you a short overview of the essential steps:
The important part is the name of the node whose output you want to get.
# get the node in the graph of which you desire the output
node_in_graph = loaded_model.find_by_name(node_name)
output_nodes = combine([node_in_graph.owner])
# evaluate the node e.g. using a minibatch_source
mb = minibatch_source.next_minibatch(1)
output = output_nodes.eval(mb[features_si])
# access the values as a one dimensional vector
out_values = output[0].flatten()
desired_output = out_values[np.newaxis]
Basically you do the same as you would anyway, with the difference that you evaluate an intermediate node instead of the final one.
With CNTK I have created a network with 2 input neurons and 1 output neuron.
A line in the training file looks like
|features 1.567518 2.609619 |labels 1.000000
Then the network was trained with BrainScript. Now I want to use the network for predicting values. For example: the input data is [1.82, 3.57]. What is the output of the net?
I have tried the following Python code, but I am new to this and the code does not work. So my question is: how do I pass the input data [1.82, 3.57] to the eval function?
On Stack Overflow there are some hints, here and here, but they are too abstract for me.
Thank you.
import cntk as ct
import numpy as np
z = ct.load_model("LR_reg.dnn", ct.device.cpu())
input_data= np.array([1.82, 3.57], dtype=np.float32)
pred = z.eval({ z.arguments[0] : input_data })
print(pred)
Here's the most defensive way of doing it. CNTK can be forgiving if you omit some of this when the network is specified with V2 constructs. Not sure about a network that was created with V1 code.
Basically you need a pair of braces for each axis. Which axes exist in Brainscript? There's a batch axis, a sequence axis and then the static axes of your network. You have one dimensional data so that means the following should work:
input_data= np.array([[[1.82, 3.57]]], dtype=np.float32)
This specifies a batch of one sequence, of length one, containing one 1d vector of two elements. You can also try omitting the outermost braces and see if you are getting the same result.
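The "one pair of braces per axis" rule can be checked with NumPy alone, without loading the model. This small sketch assumes the axis order batch, sequence, static as described above; the second sample in the batch is invented for illustration.

```python
import numpy as np

sample = [1.82, 3.57]  # one static axis with two elements

# Wrap once for the sequence axis and once more for the batch axis.
input_data = np.array([[sample]], dtype=np.float32)
print(input_data.shape)  # (1, 1, 2) -> (batch, sequence, features)

# Further samples are added along the batch axis.
batch = np.array([[[1.82, 3.57]], [[0.5, 1.0]]], dtype=np.float32)
print(batch.shape)  # (2, 1, 2)
```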
Update, based on more information from the comment below: we should not forget that the V1 code also saved the part of the network that computes things like loss and accuracy. If we provide only the features, CNTK will complain that the labels have not been provided. There are two ways to deal with this issue. One possibility is to provide some fake labels, so that the network can evaluate these auxiliary operations. Another possibility is to identify the prediction node and use only that. If the prediction was called 'p' in V1, this Python code
p = z.find_by_name('p')
should create a CNTK function that only needs the features in order to compute the prediction.