Strategies to debug and pinpoint the problem in Keras models - python

Recently, I had an error in one custom layer of a larger model. The error can be reproduced with the code below. However, after fixing it in the smaller example, the larger model still presented the same error. It is hard to tell which custom layer or part of the code in the larger model is the culprit. How can I debug and find exactly which line of code is creating this error? For instance, in the smaller code below it seems to be the tf.reduce_mean() call, but how can I determine this, since the error only occurs when loading the model? Ultimately, I need to employ this debugging technique for the larger model.
Code:
import tensorflow as tf
import numpy as np
from tensorflow.keras import Input, Model

tf.compat.v1.disable_eager_execution()
#tf.compat.v1.enable_eager_execution()

inputs = Input(shape=(2,))
output_loss = tf.keras.backend.mean(inputs)
outputs = [inputs, output_loss]
model = Model(inputs, outputs)

loss = tf.reduce_mean(output_loss)  # Error on load
#loss = tf.math.rsqrt(output_loss)  # No error
model.add_loss(loss)

model.compile(optimizer="adam", loss=[None] * len(model.outputs))
model.fit(np.random.random((5, 2)), epochs=2)
model.save("my_model_.h5")

# Loading fails when the loss is tf.reduce_mean:
# ValueError: Inconsistent values for attr 'Tidx' DT_FLOAT vs. DT_INT32
# while building NodeDef 'tf_op_layer_Mean_1/Mean_1'
model_ = tf.keras.models.load_model("my_model_.h5", compile=False)
model_.summary()
Error:
ValueError: Inconsistent values for attr 'Tidx' DT_FLOAT vs. DT_INT32 while building NodeDef
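
One way to map the NodeDef named in the traceback back to a line of code: when raw TF ops (like tf.reduce_mean) are applied to Keras tensors in graph mode, Keras wraps each one in an auto-generated op layer, and the NodeDef name in the error ('tf_op_layer_Mean_1/Mean_1') matches that layer's name. A minimal sketch, assuming the model above, that lists the layers so each auto-generated name can be tied back to the call that created it:

# Auto-generated TensorFlowOpLayer wrappers (e.g. "tf_op_layer_Mean_1")
# are created in the same order as the raw TF calls in the code above,
# so their names map back to individual lines.
for layer in model.layers:
    print(layer.name, type(layer).__name__)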

Related

Keras_tuner nonsense ValueError: Unknown initializer: relu

I am running a hyperparameter search with a new version of Keras Tuner for a NN, and I get an error that 1) didn't exist in the old version and 2) doesn't make sense.
ValueError: Unknown initializer: relu. Please ensure this object is passed to the custom_objects argument.
I don't get why 'relu' is passed as an initializer; the code breaks at the definition of the learning rate (as per the Keras Tuner documentation):
hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate),
              loss=self.loss_function,
              metrics=['sparse_categorical_accuracy', 'accuracy'])
However, it works with
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss=self.loss_function,
              metrics=['sparse_categorical_accuracy', 'accuracy'])
I am not specifying any weight initializers, so they should all be the default ones.
Any idea why this could be happening?
Thank you very much in advance!

How to get TensorFlow operations contained in a Keras model

I have a TensorFlow Keras model (TensorFlow 2.6.0); here's a basic example:
import tensorflow as tf
x = inp = tf.keras.Input((5,))
x = tf.keras.layers.Dense(7, activation="relu")(x)
x = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inp, x)
I would like to get all the tf.Operation objects in the graph for the model, select specific operations, then create a new tf.function or tf.keras.Model to output the values of those tensors on arbitrary inputs.
For example, in my simple model above, I might want to get the outputs of all relu operators. I know that in that case I could redefine the model to include the output of that layer as another model output, but the point here is that I already have the model (it's much more complicated than the one above), and there are specific operators that I want to find in order to get their outputs.
Have you tried this:
all_ops = tf.get_default_graph().get_operations()
If that call is unavailable or returns an empty list because you use TensorFlow 2.x, you may try this:
import tensorflow as tf
print(tf.__version__)
tf.compat.v1.disable_eager_execution() # disable eager execution
a = tf.constant([1],name='aa')
print(tf.compat.v1.get_default_graph().get_operations())
print(tf.compat.v1.get_default_graph().get_tensor_by_name('aa:0'))
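
If you'd rather not disable eager execution, a minimal sketch of another option (not part of the answer above) is to trace the Keras model into a concrete function and inspect that graph:

import tensorflow as tf

x = inp = tf.keras.Input((5,))
x = tf.keras.layers.Dense(7, activation="relu")(x)
x = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inp, x)

# Trace the model into a ConcreteFunction to obtain a tf.Graph.
concrete = tf.function(model).get_concrete_function(
    tf.TensorSpec(model.input_shape, tf.float32))

# List all operations, then filter for the ones of interest (e.g. Relu).
ops = concrete.graph.get_operations()
relu_ops = [op for op in ops if op.type == "Relu"]
print([op.name for op in relu_ops])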

How to update mirrored variable in tensorflow 2.0?

I am building a model in TensorFlow version 2.0 (upgrading is not an option due to compatibility with my version of CUDA, which I do not have permission to change). I am using tf.distribute.MirroredStrategy() to train my model on 2 GPUs. However, I am trying to instantiate a custom dense layer whose weights are the transpose of the weights of a different dense layer. My code involves this line to build the custom layer:
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

class DenseTied(Layer):
    # Really long class; full code can be found at the link below
    def build(self, input_shape):
        self.kernel = K.transpose(self.tied_to.kernel)
I am then using this in a model as follows:
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def build_model(input_shape):
    model_input = Input(shape=input_shape)
    dense1 = Dense(6144, activation='relu')
    dense_tied1 = DenseTied(49152, tied_to=dense1)
    x = dense1(model_input)
    model_output = dense_tied1(x)
    model = Model(model_input, model_output)
    model.compile(optimizer='adam', loss='mse')
    return model
When trying to build this model I get an error: AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute '_distribute_strategy'.
I have been tracking this down for a while now, and I pinpointed the issue to the line
self.kernel = K.transpose(self.tied_to.kernel)
It seems that self.tied_to.kernel is of type <class 'tensorflow.python.distribute.values.MirroredVariable'>, but after calling K.transpose() on it the resulting output is of type <class 'tensorflow.python.framework.ops.EagerTensor'>. I tried following the instructions here, but it did not work: I get AttributeError: 'MirroredStrategy' object has no attribute 'run', even though the docs say it exists. So I think maybe my version of TensorFlow is too old for that method.
How can I update a mirrored variable in TensorFlow 2.0?
Also, if you want to see the full custom layer code, I am trying to implement the dense tied layer described here.
As of now, the documentation is for TensorFlow 2.3. If you are using 2.0, it should be strategy.experimental_run_v2 instead of strategy.run.
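A minimal sketch of the renamed call under TF 2.0 (the variable and update function here are hypothetical examples, not from the question):

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    v = tf.Variable(1.0)  # created as a MirroredVariable inside the scope

def update_fn():
    # Update through the variable's own assign methods so the change
    # is applied on every replica.
    v.assign_add(1.0)

# TF 2.0 spelling; renamed to strategy.run in later releases.
strategy.experimental_run_v2(update_fn)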

What is the difference between these two ways of building a model in keras?

I am new to Keras, and after going through a few tutorials I started building a model and found these two styles of implementation. However, I am getting an error in the first one while the second one works fine. Can someone explain the difference between the two?
First Method:
visible = Embedding(QsVocabSize, 1024, input_length=max_length_inp, mask_zero=True)
encoder = LSTM(100,activation='relu')(visible)
Second Method:
model = Sequential()
model.add(Embedding(QsVocabSize, 1024, input_length=max_length_inp, mask_zero=True))
model.add(LSTM(100, activation='relu'))
This is the error I get:
ValueError: Layer lstm_59 was called with an input that isn't a symbolic tensor. Received type: <class 'keras.layers.embeddings.Embedding'>. Full input: [<keras.layers.embeddings.Embedding object at 0x00000207BC7DBCC0>]. All inputs to the layer should be tensors.
They're two ways of creating deep learning models in Keras. The first code snippet follows the functional style, which is used for building complex models with multiple inputs/outputs, shared layers, etc.
https://keras.io/getting-started/functional-api-guide/
The second code snippet uses the Sequential style, which suits simple models built by just stacking layers.
https://keras.io/getting-started/sequential-model-guide/
If you read the functional API guide, you'll notice the following point:
'A layer instance is callable (on a tensor), and it returns a tensor'
Now the error you're seeing makes sense: this line only creates the layer and doesn't invoke it by passing a tensor.
visible = Embedding(QsVocabSize, 1024, input_length=max_length_inp, mask_zero=True)
Subsequently, passing this Embedding object to the LSTM layer throws an error, because the layer expects a tensor.
This is an example from the functional API guide. Notice the output tensors getting passed from one layer to another.
main_input = Input(shape=(100,), dtype='int32', name='main_input')
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
lstm_out = LSTM(32)(x)
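For comparison, a sketch of the first method rewritten correctly in the functional style (reusing the question's QsVocabSize and max_length_inp, with an assumed input shape):

from keras.layers import Input, Embedding, LSTM

inputs = Input(shape=(max_length_inp,), dtype='int32')
# Calling the Embedding layer on a tensor returns a tensor,
# which the LSTM layer can then accept.
visible = Embedding(QsVocabSize, 1024, input_length=max_length_inp,
                    mask_zero=True)(inputs)
encoder = LSTM(100, activation='relu')(visible)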

Using the output of an internal layer to fit Keras model?

I have a model M that has two inputs: x_train1, x_train2. After passing through heavy transformations, these inputs are concatenated into one single array x1_x2. Later it is plugged into an autoencoder whose output should be x1_x2. But when I try to fit the model I get the following error:
ValueError: When feeding symbolic tensors to a model, we expect the tensors to have a static batch size. Got tensor with shape: (None, 2080)
I know that the problem lies in how I am specifying my expected output. I was able to run the code using a dummy array such as np.zeros((96, 2080)), but not by setting the output of an internal layer.
I do the following to fit the model:
autoencoder.fit([x_train1, x_train2],
                autoencoder.layers[-7].output,
                epochs=50,
                batch_size=8,
                shuffle=True,
                validation_split=0.2)
How can I make Keras understand that the expected output should be the output of an internal layer with shape (number_of_input_images, 2080)?
I'd do the following: Import the Model class from Keras and create an additional model.
from tensorflow.keras.models import Model

# model = your existing model
new_model = Model(
    inputs=model.input,
    outputs=model.get_layer(name_of_desired_output_layer).output
)
That's it, now you can use your new model and train it instead.
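
A sketch of one way to use it, assuming the goal is to turn the internal activations into a concrete target array (variable names taken from the question):

# predict() returns a plain NumPy array of shape
# (number_of_input_images, 2080) instead of a symbolic tensor,
# which fit() accepts as a target.
internal_targets = new_model.predict([x_train1, x_train2])
autoencoder.fit([x_train1, x_train2], internal_targets,
                epochs=50, batch_size=8, shuffle=True,
                validation_split=0.2)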
