How to implement stacked RNNs in TensorFlow? - python

I want to implement an RNN using TensorFlow 1.13 on GPU. Following the official recommendation, I wrote the following code to get a stack of RNN cells:
import tensorflow.keras as tk
lstm = [tk.layers.CuDNNLSTM(128) for _ in range(2)]
cells = tk.layers.StackedRNNCells(lstm)
However, I receive an error message:
ValueError: ('All cells must have a state_size attribute. received cells:', [<tensorflow.python.keras.layers.cudnn_recurrent.CuDNNLSTM object at 0x13aa1c940>])
How can I correct it?

This may be a TensorFlow bug and I would suggest creating an issue on GitHub. However, if you want to bypass the bug, you can use:
import tensorflow as tf
import tensorflow.keras as tk
lstm = [tk.layers.CuDNNLSTM(128) for _ in range(2)]
stacked_cells = tf.nn.rnn_cell.MultiRNNCell(lstm)
This will work but it will give a deprecation warning that you can suppress.
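For example, one way to silence it (a sketch, assuming TF 1.x's logging module):
import tensorflow as tf

# Raising the log threshold hides WARN-level deprecation messages in TF 1.x.
tf.logging.set_verbosity(tf.logging.ERROR)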

Thanks @qlzh727. Here, I quote the response:
Either StackedRNNCells or MultiRNNCell only works with a Cell, not a layer. The difference between a cell and a layer in an RNN is that a cell only processes one time step within the whole sequence, whereas a layer processes the whole sequence. You can treat an RNN layer as:
for t in whole_time_steps:
    output_t, state_t = cell(input_t, state_t-1)
If you want to stack 2 LSTM layers together with cuDNN in 1.x, you can do:
l1 = tf.keras.layers.CuDNNLSTM(128, return_sequences=True)
l2 = tf.keras.layers.CuDNNLSTM(128)
l1_output = l1(input)
l2_output = l2(l1_output)
In TF 2.x, we unify the cuDNN and normal implementations, so you can just change the example above to tf.keras.layers.LSTM(128, return_sequences=True), which will use the cuDNN implementation if available.
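For completeness, here is a sketch of the cell-based stacking that StackedRNNCells actually expects in 1.13: wrap cells (not layers) in an RNN layer. Note that LSTMCell is the generic cell, so this path does not use the cuDNN kernels:
import tensorflow.keras as tk

# StackedRNNCells composes cells; the RNN layer then iterates them over time.
# LSTMCell is the generic (non-cuDNN) cell implementation.
cells = tk.layers.StackedRNNCells([tk.layers.LSTMCell(128) for _ in range(2)])
stacked_lstm = tk.layers.RNN(cells)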

Related

Keras Model Multi Input - TypeError: ('Keyword argument not understood:', 'input')

I am trying to build a CNN that receives multiple inputs, and I attempted the following:
input = keras.Input()
classifier = keras.Model(inputs=input,output=classifier)
When running the code, however, I receive the following error for line 6:
TypeError: ('Keyword argument not understood:', 'input').
A hint would be much appreciated, thank you!
Some parameters of your code are not specified. I have copied your example and filled in some numbers that you can change back.
import keras
input_dim_1 = 10
input1 = keras.layers.Input(shape=(input_dim_1,1))
cnn_classifier_1 = keras.layers.Conv1D(64, 5, activation='sigmoid')(input1)
cnn_classifier_1 = keras.layers.Dropout(0.5)(cnn_classifier_1)
cnn_classifier_1 = keras.layers.Conv1D(48, 5, activation='sigmoid')(cnn_classifier_1)
cnn_classifier_1 = keras.models.Model(inputs=input1, outputs=cnn_classifier_1)
Some things to note
The imports of your layers were not right. You need to import the layers/models you want from the right places. You can check my code against yours to see this.
With the functional API of keras you do not need to specify the input shape as you have done in the first Conv1D layer. This is handled automatically.
You need to correctly specify the keywords in Model, specifically inputs and outputs. Different versions of Keras use input/output or inputs/outputs as the keywords when calling the Model class.
Hey, it's simple, use the following code:
classifier = keras.Model(input, classifier)
instead of calling
classifier = keras.Model(inputs = input, output = classifier)
The issue seems to come from recent versions of the Keras implementation.
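Since the original goal was a model with multiple inputs, here is a hedged sketch of a genuinely multi-input version (all shapes and layer sizes are illustrative):
import keras

# Two separate input branches, merged before the final classifier head.
input_a = keras.layers.Input(shape=(10, 1))
input_b = keras.layers.Input(shape=(20, 1))
branch_a = keras.layers.Conv1D(32, 3, activation='relu')(input_a)
branch_b = keras.layers.Conv1D(32, 3, activation='relu')(input_b)
merged = keras.layers.concatenate(
    [keras.layers.Flatten()(branch_a), keras.layers.Flatten()(branch_b)])
output = keras.layers.Dense(1, activation='sigmoid')(merged)
# Note the plural keywords, and that inputs is a list here.
model = keras.models.Model(inputs=[input_a, input_b], outputs=output)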

tflite quantization: how to change the input dtype

see possible solution at the end of the post
I am trying to fully quantize the keras-vggface model from rcmalli to run on an NPU. The model is a Keras model (not tf.keras).
When using TF 1.15 for quantization with:
import cv2
import tensorflow as tf

print(tf.version.VERSION)
num_calibration_steps = 5
converter = tf.lite.TFLiteConverter.from_keras_model_file('path_to_model.h5')
# converter.post_training_quantize = True  # This only makes the weights int8 but does not quantize the rest of the model

def representative_dataset_gen():
    for _ in range(num_calibration_steps):
        pfad = 'path_to_image(s)'
        img = cv2.imread(pfad)
        # Get sample input data as a numpy array in a method of your choosing.
        yield [img]

converter.representative_dataset = representative_dataset_gen
tflite_quant_model = converter.convert()
open("quantized_model", "wb").write(tflite_quant_model)
The model is converted but as I need full int8 quantization, I add:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8 # or tf.uint8
converter.inference_output_type = tf.int8 # or tf.uint8
This error message appears:
ValueError: Cannot set tensor: Got value of type UINT8 but expected type FLOAT32 for input 0, name: input_1
Clearly, the input of the model still requires float32.
Questions:
Do I have to adapt the quantization method so that the input dtype is changed? Or
Do I have to change the input layer of the model to dtype int8 beforehand?
Or is that actually reporting that the model is not actually quantized?
If 1 or 2 is the answer, would you also have a best practice tip for me?
Addition:
Using:
h5_path = 'my_model.h5'
model = keras.models.load_model(h5_path)
model.save(os.getcwd() +'/modelTF2')
to save the h5 as pb with TF 2.2 and then using converter=tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
as TF 2.x TFLite takes floats and converts them to uint8 internally, I thought that could be a solution. Unfortunately, this error message appears:
tf.lite.TFLiteConverter.from_keras_model giving 'str' object has no attribute 'call'
Apparently, TF 2.x cannot handle pure Keras models.
Using tf.compat.v1.lite.TFLiteConverter.from_keras_model_file() to solve this error just repeats the error from above, as we are back again at the "TF 1.15" level.
Addition 2
Another solution is to transfer the keras model to tf.keras manually. I will look into that if there is no other solution.
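A minimal sketch of what that transfer-plus-conversion might look like, assuming the keras-vggface layers deserialize under tf.keras (untested here):
import tensorflow as tf  # TF 2.x

# Load the HDF5 with tf.keras instead of standalone keras, then convert
# from the in-memory model. Whether the keras-vggface custom objects
# deserialize cleanly under tf.keras is an assumption.
model = tf.keras.models.load_model('my_model.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()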
Regarding the comment of Meghna Natraj
To recreate the model (using TF 1.13.x) just:
pip install git+https://github.com/rcmalli/keras-vggface.git
and
from keras_vggface.vggface import VGGFace
pretrained_model = VGGFace(model='resnet50', include_top=False, input_shape=(224, 224, 3), pooling='avg') # pooling: None, avg or max
pretrained_model.summary()
pretrained_model.save("my_model.h5") #using h5 extension
The input layer is connected. Too bad, that looked like a good/easy fix.
Possible Solution
It seems to work using TF 1.15.3; I used 1.15.0 beforehand. I will check whether I accidentally did something else differently.
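For reference, a consolidated sketch of the full-integer conversion that appears to work under TF 1.15.3; the input shape and calibration data below are placeholder assumptions:
import numpy as np
import tensorflow as tf

num_calibration_steps = 5
converter = tf.lite.TFLiteConverter.from_keras_model_file('my_model.h5')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8

def representative_dataset_gen():
    for _ in range(num_calibration_steps):
        # Placeholder: substitute real preprocessed images matching the
        # model's input shape (assumed here to be 1x224x224x3, float32).
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter.representative_dataset = representative_dataset_gen
tflite_quant_model = converter.convert()
open("quantized_model.tflite", "wb").write(tflite_quant_model)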
A possible reason why this fails is that the model has input tensors that are not connected to the output tensor, i.e., they are probably unused.
Here is a colab notebook where I've reproduced this error. Modify the io_type at the beginning of the notebook to tf.uint8 to see an error similar to the one you got.
SOLUTION
You need to manually inspect the model and see whether there are any inputs that are dangling/lost/not connected to the output, and remove them.
Post a link to the model and I can try to debug it as well.

Running LSTM with multiple GPUs gets "Input and hidden tensors are not at the same device"

I am trying to train an LSTM layer in PyTorch. I am using 4 GPUs. When initializing, I added the .cuda() call to move the hidden state to the GPU. But when I run the code with multiple GPUs I get this runtime error:
RuntimeError: Input and hidden tensors are not at the same device
I have tried to solve the problem by using .cuda() function in the forward function like below :
self.hidden = (self.hidden[0].type(torch.FloatTensor).cuda(), self.hidden[1].type(torch.FloatTensor).cuda())
This line seems to solve the problem, but it raises my concern about whether the updated hidden state is visible across the different GPUs. Should I move the tensor back to the CPU at the end of the forward function for each batch, or is there another way to solve the problem?
When you call .cuda() on a tensor, PyTorch moves it to the current default GPU device (GPU-0). With data parallelism, your data can live on a different GPU than a given model replica, and this mismatch results in the runtime error you are facing.
The correct way to implement data parallelism for recurrent neural networks is as follows:
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class MyModule(nn.Module):
    # ... __init__ (which creates self.my_lstm), other methods, etc.

    # padded_input is of shape [B x T x *] (batch_first mode) and contains
    # the sequences sorted by decreasing length
    # B is the batch size
    # T is the max sequence length
    def forward(self, padded_input, input_lengths):
        total_length = padded_input.size(1)  # get the max sequence length
        packed_input = pack_padded_sequence(padded_input, input_lengths,
                                            batch_first=True)
        packed_output, _ = self.my_lstm(packed_input)
        output, _ = pad_packed_sequence(packed_output, batch_first=True,
                                        total_length=total_length)
        return output

m = MyModule().cuda()
dp_m = nn.DataParallel(m)
You also need to set the CUDA_VISIBLE_DEVICES environment variable accordingly for a multi-GPU setup.
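For example (the device IDs here are illustrative):
import os

# Must be set before CUDA initializes; exposes four GPUs to this process.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"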
References:
Data Parallelism
Fast.ai Forums
RNNs and Data Parallelism

Keras + Tensorflow: 'ConvLSTM2D' object has no attribute 'outbound_nodes'

I'm trying to have a ConvLSTM as part of my working TensorFlow network. Because I had some issues with the TensorFlow ConvLSTM implementation, I settled on using the Keras ConvLSTM2D layer instead.
To make Keras available in my TensorFlow session I used this blog post's suggestion (I'm using the TensorFlow backend):
https://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html
import tensorflow as tf
sess = tf.Session()
from keras import backend as K
K.set_session(sess)
A snippet of my code (the part that causes the issues):
# state has a shape of [1, 75, 32, 32] with batchsize=1
state = tf.concat([screen, screen2, non_spatial], axis=1)
# Reshaping state to get time=1 to have the right shape for the ConvLSTM
state_reshaped = tf.reshape(state, [1, 1, 75, 32, 32])
# Keras ConvLSTM2D Layer
# I tried leaving out the batch_size for the input_shape but it didn't make a difference for the error and it seems to be fine
lstm_layer = ConvLSTM2D(filters=5, kernel_size=(3, 3), input_shape=(1, 1, 75, 32, 32), data_format='channels_first', stateful=True)(state_reshaped)
fc1 = layers.fully_connected(inputs=layers.flatten(lstm_layer), num_outputs=256, activation_fn=tf.nn.relu)
This gives me the following error:
AttributeError: 'ConvLSTM2D' object has no attribute 'outbound_nodes'
I have no idea what this means. I thought it might have to do with mixing the Keras ConvLSTM and TensorFlow's flatten. So I tried using the Keras Flatten() instead, like this:
# lstm_layer shape is (5, 5, 30, 30)
lstm_layer = Flatten(data_format='channels_first')(lstm_layer)
fc1 = layers.fully_connected(inputs=lstm_layer, num_outputs=256, activation_fn=tf.nn.relu)
and got the following error: ValueError: The last dimension of the inputs to 'Dense' should be defined. Found 'None'.
This error is caused by Flatten(), for whatever reason, having an output shape of (?, ?), while the fully connected layer needs a defined shape for the last dimension. I don't understand why it would be undefined; it was defined before.
Using Reshape((4500,))(lstm_layer) instead gives me the same no attribute 'outbound_nodes' error.
I googled the issue and I seem to not be the only one but I couldn't find a solution.
How can I solve this issue?
Is the unknown output shape of Flatten() a bug or wanted behavior, if so why?
I encountered the same problem and had a bit of a dig into the TensorFlow code. The problem is that some refactoring was done for Keras 2.2.0 and tf.keras hasn't yet been updated to this new API.
The 'outbound_nodes' attribute was renamed to '_outbound_nodes' in Keras 2.2.0. It's pretty easy to fix: there are two references in base.py you need to update:
/site-packages/tensorflow/python/layers/base.py
After updating it works fine for me.
In my case I was getting the error on a custom subclass, but the following solution can be applied nonetheless, if you subclass ConvLSTM2D and add this to your new class:
@property
def outbound_nodes(self):
    if hasattr(self, '_outbound_nodes'):
        print("outbound_nodes called but _outbound_nodes found")
    return getattr(self, '_outbound_nodes', [])
I found the solution, even though I don't know why it works.
Currently I'm using TensorFlow 1.8 and Keras 2.2. If you downgrade Keras to ~2.1.1 it works without any problems and you can easily use Keras layers together with TensorFlow. This fixed AttributeError: 'ConvLSTM2D' object has no attribute 'outbound_nodes', and then I just used layers.flatten(lstm_layer) and everything worked.
As others have pointed out, this is because of a mismatch between your installed tensorflow and keras libraries.
Their solutions work, but in my opinion, the cleanest and easiest way to solve this is by using the keras layers contained within the tensorflow package itself rather than by using the keras library directly.
i.e., replace
from keras.layers import ConvLSTM2D
by
from tensorflow.python.keras.layers import ConvLSTM2D
This will ensure that your tensorflow and keras function calls / objects are always compatible, and solved this issue for me.

Computing gradients in extracted Tensorflow subgraph which contains a 'while_loop'

In some deep learning workflows, it is useful to train a model, extract it out of its graph using tf.graph_util.convert_variables_to_constants or tf.graph_util.extract_sub_graph so training-related tensors are left out, and then connect the extracted subgraph to other model(s) via tf.import_graph_def. In this way, the trained model can serve as a building block in a larger setup.
Often, we'd like to backpropagate through the new, composite model, in order to fine-tune it, optimize the inputs and so on.
However, it appears that one cannot define a gradient through a while_loop TensorFlow operation in an imported graph, since it relies on 'outer context', an object added to the metagraph's collections (see TF issue #7404). Slightly adapting the example in this GitHub issue, here's an example of what I am trying to do:
import tensorflow as tf

g1 = tf.Graph()
sess1 = tf.Session(graph=g1)
with g1.as_default():
    with sess1.as_default():
        i = tf.constant(0, name="input")
        out = tf.while_loop(lambda i: tf.less(i, 5), lambda i: [tf.add(i, 1)], [i], name="output")
        loss = tf.square(out, name='loss')
        graph_def = tf.graph_util.convert_variables_to_constants(sess1, g1.as_graph_def(), ['output/Exit'])

g2 = tf.Graph()
with g2.as_default():
    tf.import_graph_def(graph_def, name='')
    i_imported = g2.get_tensor_by_name("input:0")
    out_imported = g2.get_tensor_by_name("output/Exit:0")
    tf.gradients(out_imported, i_imported)
The last line raises AttributeError: 'NoneType' object has no attribute 'outer_context'.
TensorFlow's solution to this issue is to use tf.train.export_meta_graph and tf.train.import_meta_graph so the outer context is copied, but this copies the entire graph, without editing. In this minimal case, the 'loss' tensor won't be removed.
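A sketch of that suggested workaround, continuing the example above (note that the whole graph, 'loss' included, comes along):
# Export the full metagraph from g1 so the while-loop's outer context
# travels with it, then re-import it into a fresh graph.
with g1.as_default():
    meta_graph_def = tf.train.export_meta_graph()

g3 = tf.Graph()
with g3.as_default():
    tf.train.import_meta_graph(meta_graph_def)
    i_imported = g3.get_tensor_by_name("input:0")
    out_imported = g3.get_tensor_by_name("output/Exit:0")
    grads = tf.gradients(out_imported, i_imported)  # outer context is present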
I tried copying the missing context to the new graph:
g2.add_to_collection('while_context',g1.get_collection('while_context'))
But it doesn't solve the issue.
Is there a way to overcome this limitation, or is it an irreparable TensorFlow design flaw?
