Is there a way to get the values of a Keras Tensor as a numpy array?
A normal K.eval() does not work and results in:
AttributeError: 'KerasTensor' object has no attribute 'numpy'
I am using eager execution.
I need this to access and check the outputs of some layers of my sequential model. Example: (out2 = K.eval(cnn.layers[2].output))
Minimal example:
import numpy
import tensorflow.keras.models as models
import tensorflow.keras.layers as layers
from keras import backend as K
cnn = models.Sequential([
    layers.Conv2D(filters=64, kernel_size=3, activation='relu',
                  kernel_initializer='he_uniform', padding='same',
                  input_shape=(32,32,3))],
    name='cnn')
res = cnn(numpy.random.random((1,32,32,3)))
print(K.eval(cnn.layers[0].output))
Do not mix keras with tensorflow.keras, i.e.
from keras import backend as K
# wrong: do not import anything from keras and mix it with tensorflow.keras
To get intermediate outputs from a model:
import numpy as np
import tensorflow as tf
# for example, to get the intermediate outputs from model.layers[2]
temp_model = tf.keras.Model(model.input, model.layers[2].output)
print(temp_model(np.random.rand(1, 32, 32, 3)).numpy())
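As a self-contained illustration of the sub-model approach, the sketch below rebuilds a small network along the lines of the question's cnn using only tensorflow.keras imports. The extra pooling layer and the name probe are assumptions made for the example, not part of the original code.

```python
import numpy as np
import tensorflow as tf

# Small stand-in for the question's model (the pooling layer is an assumption).
cnn = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(),
], name="cnn")

# Sub-model mapping the original input to the first layer's output.
probe = tf.keras.Model(inputs=cnn.inputs, outputs=cnn.layers[0].output)

x = np.random.rand(1, 32, 32, 3).astype("float32")
out = probe(x).numpy()  # calling the sub-model on real data gives an eager tensor
print(type(out), out.shape)
```

The point is that layer.output is only a symbolic placeholder; values exist only once a model built from it is called on actual data.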
Related
Suppose a model is given as an h5 file, i.e., I cannot change the code that builds the model's architecture:
from tensorflow.keras.layers import Input, BatchNormalization
from tensorflow.keras.models import Model
inputs = Input(shape=(4,))
outputs = BatchNormalization()(inputs, training=True)
model = Model(inputs=inputs, outputs=outputs)
model.save('model.h5', include_optimizer=False)
Now I'd like to remove the training=True part, i.e., I want the BatchNormalization layer to behave as if it had been attached to the model without this flag.
My current attempt looks as follows:
import numpy as np
from tensorflow.keras.models import load_model
model = load_model('model.h5')
for layer in model.layers:
    for node in layer.inbound_nodes:
        if "training" in node.call_kwargs:
            del node.call_kwargs["training"]
model.predict(np.asarray([[1, 2, 3, 4]]))
But the model.predict call fails with the following error (I'm using TensorFlow 2.5.0):
ValueError: Could not pack sequence. Structure had 1 elements, but flat_sequence had 2 elements. Structure: ((<KerasTensor: shape=(None, 4) dtype=float32 (created by layer 'input_1')>,), {}), flat_sequence: [<tf.Tensor 'model/Cast:0' shape=(None, 4) dtype=float32>, True].
How can this be fixed/worked around?
(When using node.call_kwargs["training"] = False instead of del node.call_kwargs["training"] then model.predict does not crash, but it simply behaves as if nothing was changed, i.e., the modified flag is ignored.)
I found that simply saving and re-loading the model after modifying the call_kwargs helps.
import numpy as np
from tensorflow.keras.models import load_model
model = load_model('model.h5')
# Removing training=True
for layer in model.layers:
    for node in layer.inbound_nodes:
        if "training" in node.call_kwargs:
            del node.call_kwargs["training"]
# The two following lines are the solution.
model.save('model_modified.h5', include_optimizer=False)
model = load_model('model_modified.h5')
model.predict(np.asarray([[1, 2, 3, 4]]))
And all is fine. :)
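For context on why the flag matters, here is a minimal sketch of what training=True changes for a BatchNormalization layer. It calls a standalone layer directly rather than a loaded model, and the input values are made up for the illustration.

```python
import numpy as np
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()
x = np.array([[1.0, 2.0, 3.0, 4.0],
              [3.0, 4.0, 5.0, 6.0]], dtype="float32")

# Inference mode first, while the moving statistics still hold their
# initial values (mean 0, variance 1), so the output is close to x.
y_infer = bn(x, training=False).numpy()

# Training mode normalizes with the statistics of the current batch,
# so every column ends up with (approximately) zero mean.
y_train = bn(x, training=True).numpy()

print(y_infer)
print(y_train)
```

A model saved with training=True baked into the call therefore keeps normalizing with batch statistics at prediction time, which is the behavior the question wants to get rid of.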
Have you tried this?
for layer in model.layers:
    layer.trainable = False
I am implementing a simple multitask model in Keras. I used the code given in the documentation under the heading "Shared layers".
I know that in multitask learning we share some of the initial layers of the model, and the final layers are specific to the individual tasks, as per the link.
I have the following two cases in the Keras API: in the first I use keras.layers.concatenate, while in the second I do not.
I am posting the code as well as the model for each case below.
Case-1 code
import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.models import Sequential
from keras.layers import Dense
from keras.utils.vis_utils import plot_model
tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
# And add a logistic regression on top
predictions1 = Dense(1, activation='sigmoid')(merged_vector)
predictions2 = Dense(1, activation='sigmoid')(merged_vector)
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
Case-1 Model
Case-2 code
import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.models import Sequential
from keras.layers import Dense
from keras.utils.vis_utils import plot_model
tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# And add a logistic regression on top
predictions1 = Dense(1, activation='sigmoid')(encoded_a )
predictions2 = Dense(1, activation='sigmoid')(encoded_b)
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
Case-2 Model
In both cases, only the LSTM layer is shared. Case-1 has keras.layers.concatenate, but case-2 does not.
My question is: which one is multitasking, case-1 or case-2? Moreover, what is the function of keras.layers.concatenate in case-1?
Both are multi-task models, since that only depends on whether there are multiple outputs, with one task associated with each output.
The difference is that your first model explicitly concatenates the features produced by the shared layer, so both output tasks can use information from both inputs. The second model only connects each input to its own output, without considering the other input. The only link between the two branches is that they share the LSTM weights.
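One way to check this empirically is to change only the second input and see whether the first output moves. The sketch below does that with scaled-down versions of both cases; the small shapes, the build helper and the random test data are assumptions for the illustration, and it uses tensorflow.keras rather than standalone keras.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

def build(concat):
    """Hypothetical helper: case-1 if concat is True, case-2 otherwise."""
    a = layers.Input(shape=(5, 8))
    b = layers.Input(shape=(5, 8))
    shared_lstm = layers.LSTM(4)          # same instance, so shared weights
    enc_a, enc_b = shared_lstm(a), shared_lstm(b)
    if concat:
        merged = layers.concatenate([enc_a, enc_b], axis=-1)
        out1 = layers.Dense(1, activation="sigmoid")(merged)
        out2 = layers.Dense(1, activation="sigmoid")(merged)
    else:
        out1 = layers.Dense(1, activation="sigmoid")(enc_a)
        out2 = layers.Dense(1, activation="sigmoid")(enc_b)
    return Model([a, b], [out1, out2])

rng = np.random.default_rng(0)
xa = rng.normal(size=(1, 5, 8)).astype("float32")
xb_first = rng.normal(size=(1, 5, 8)).astype("float32")
xb_second = rng.normal(size=(1, 5, 8)).astype("float32")

out1_depends_on_b = {}
for concat in (True, False):
    model = build(concat)
    p_first = model([xa, xb_first])[0].numpy()
    p_second = model([xa, xb_second])[0].numpy()
    # Only input b changed; did output 1 change?
    out1_depends_on_b[concat] = not np.allclose(p_first, p_second)

print(out1_depends_on_b)
```

With concatenation, the first prediction reacts to the second input; without it, the first prediction is unaffected, even though both branches use the same LSTM weights.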
I've been following Towards Data Science's tutorial about word2vec and skip-gram models, but I stumbled upon a problem that I cannot solve, despite searching about it a lot and trying multiple unsuccessful solutions.
https://towardsdatascience.com/understanding-feature-engineering-part-4-deep-learning-methods-for-text-data-96c44370bbfa
The step that shows how to build the skip-gram model architecture seems deprecated because it uses the Merge layer from keras.layers.
What I tried to do was translate his piece of code - which is implemented in the Sequential API of Keras - to the Functional API to solve the deprecation of the Merge layer, by replacing it with the keras.layers.Dot layer. However, I'm still stuck in this step of merging the two models (word and context) into the final model, whose architecture must be like this:
Here's the code that the author used:
from keras.layers import Merge
from keras.layers.core import Dense, Reshape
from keras.layers.embeddings import Embedding
from keras.models import Sequential
# build skip-gram architecture
word_model = Sequential()
word_model.add(Embedding(vocab_size, embed_size,
embeddings_initializer="glorot_uniform",
input_length=1))
word_model.add(Reshape((embed_size, )))
context_model = Sequential()
context_model.add(Embedding(vocab_size, embed_size,
embeddings_initializer="glorot_uniform",
input_length=1))
context_model.add(Reshape((embed_size,)))
model = Sequential()
model.add(Merge([word_model, context_model], mode="dot"))
model.add(Dense(1, kernel_initializer="glorot_uniform", activation="sigmoid"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
And here is my attempt to translate the Sequential code implementation into the Functional one:
from keras import models
from keras import layers
from keras import Input, Model
word_input = Input(shape=(1,))
word_x = layers.Embedding(vocab_size, embed_size, embeddings_initializer='glorot_uniform')(word_input)
word_reshape = layers.Reshape((embed_size,))(word_x)
word_model = Model(word_input, word_reshape)
context_input = Input(shape=(1,))
context_x = layers.Embedding(vocab_size, embed_size, embeddings_initializer='glorot_uniform')(context_input)
context_reshape = layers.Reshape((embed_size,))(context_x)
context_model = Model(context_input, context_reshape)
model_input = layers.dot([word_model, context_model], axes=1, normalize=False)
model_output = layers.Dense(1, kernel_initializer='glorot_uniform', activation='sigmoid')
model = Model(model_input, model_output)
However, when executed, the following error is returned:
ValueError: Layer dot_5 was called with an input that isn't a symbolic
tensor. Received type: . Full
input: [,
]. All inputs to
the layer should be tensors.
I'm a total beginner with the Functional API of Keras. I would be grateful for some guidance on how to feed the context and word models into the dot layer to achieve the architecture in the image.
You are passing Model instances to the layer; however, as the error suggests, you need to pass Keras tensors (i.e. outputs of layers or models) to layers in Keras. You have two options here. One is to use the .output attribute of the Model instance like this:
dot_output = layers.dot([word_model.output, context_model.output], axes=1, normalize=False)
or equivalently, you can use the output tensors directly:
dot_output = layers.dot([word_reshape, context_reshape], axes=1, normalize=False)
Further, you need to apply the Dense layer that follows to dot_output, and pass the Input instances as the inputs of Model. Therefore:
model_output = layers.Dense(1, kernel_initializer='glorot_uniform',
activation='sigmoid')(dot_output)
model = Model([word_input, context_input], model_output)
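Putting the pieces together, a complete version might look like the sketch below. The values of vocab_size and embed_size, the int32 input dtype and the sample index arrays are placeholders chosen for the example, and the imports come from tensorflow.keras instead of standalone keras.

```python
import numpy as np
from tensorflow.keras import layers, Input, Model

vocab_size, embed_size = 50, 8  # placeholder values, not from the tutorial

# Word branch: index -> embedding -> flat vector
word_input = Input(shape=(1,), dtype="int32")
word_x = layers.Embedding(vocab_size, embed_size,
                          embeddings_initializer="glorot_uniform")(word_input)
word_reshape = layers.Reshape((embed_size,))(word_x)

# Context branch with its own embedding matrix
context_input = Input(shape=(1,), dtype="int32")
context_x = layers.Embedding(vocab_size, embed_size,
                             embeddings_initializer="glorot_uniform")(context_input)
context_reshape = layers.Reshape((embed_size,))(context_x)

# The dot layer receives tensors, not Model instances
dot_output = layers.dot([word_reshape, context_reshape], axes=1, normalize=False)
model_output = layers.Dense(1, kernel_initializer="glorot_uniform",
                            activation="sigmoid")(dot_output)

model = Model([word_input, context_input], model_output)
model.compile(loss="mean_squared_error", optimizer="rmsprop")

# Made-up (word, context) index pairs just to exercise the model
words = np.array([[1], [2], [3]], dtype="int32")
contexts = np.array([[4], [5], [6]], dtype="int32")
preds = model.predict([words, contexts], verbose=0)
print(preds.shape)
```

The intermediate word_model and context_model objects from the question are not needed here, because the final Model is defined directly from the two Input tensors to the output tensor.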
I've been following Towards Data Science's tutorial about word2vec and skip-gram models, but I stumbled upon a problem that I cannot solve, despite searching about it for hours and trying a lot of unsuccessful solutions.
https://towardsdatascience.com/understanding-feature-engineering-part-4-deep-learning-methods-for-text-data-96c44370bbfa
The step that shows how to build the skip-gram model architecture seems deprecated because it uses the Merge layer from keras.layers.
I've seen many discussions about it, and most answers said that you now need to use the Functional API of Keras to merge layers. But the problem is that I'm a total beginner in Keras and have no idea how to translate my code from Sequential to Functional. Here's the code that the author used (and I copied):
from keras.layers import Merge
from keras.layers.core import Dense, Reshape
from keras.layers.embeddings import Embedding
from keras.models import Sequential
# build skip-gram architecture
word_model = Sequential()
word_model.add(Embedding(vocab_size, embed_size,
embeddings_initializer="glorot_uniform",
input_length=1))
word_model.add(Reshape((embed_size, )))
context_model = Sequential()
context_model.add(Embedding(vocab_size, embed_size,
embeddings_initializer="glorot_uniform",
input_length=1))
context_model.add(Reshape((embed_size,)))
model = Sequential()
model.add(Merge([word_model, context_model], mode="dot"))
model.add(Dense(1, kernel_initializer="glorot_uniform", activation="sigmoid"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")
# view model summary
print(model.summary())
# visualize model structure
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
SVG(model_to_dot(model, show_shapes=True, show_layer_names=False,
rankdir='TB').create(prog='dot', format='svg'))
And when I run the block, the following error is shown:
ImportError Traceback (most recent call last)
<ipython-input-79-80d604373468> in <module>()
----> 1 from keras.layers import Merge
2 from keras.layers.core import Dense, Reshape
3 from keras.layers.embeddings import Embedding
4 from keras.models import Sequential
5
ImportError: cannot import name 'Merge'
What I'm asking here is some guidance on how to transform this Sequential into a Functional API structure.
This did indeed change. For a dot product, you can now use the dot layer:
from keras.layers import dot
...
dot_product = dot([target, context], axes=1, normalize=False)
...
You have to set the axes parameter according to your data, of course. If you set normalize=True, this gives the cosine proximity. For more information, see the documentation.
To learn about the functional API, there is a good guide in the Keras documentation. It's not difficult to switch if you already understand the sequential API.
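To make the effect of normalize concrete, here is a NumPy sketch of the arithmetic for two-dimensional (batch, dim) inputs. The helper dot_rows is a hypothetical stand-in written for this illustration, not Keras code.

```python
import numpy as np

def dot_rows(a, b, normalize=False):
    """Row-wise dot product of two (batch, dim) arrays (illustrative stand-in)."""
    if normalize:
        # L2-normalize each row first, which turns the dot product into a cosine
        a = a / np.linalg.norm(a, axis=1, keepdims=True)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1, keepdims=True)

target = np.array([[3.0, 4.0]])
context = np.array([[4.0, 3.0]])

raw = dot_rows(target, context)                  # 3*4 + 4*3 = 24
cos = dot_rows(target, context, normalize=True)  # 24 / (5 * 5) = 0.96
print(raw, cos)
```

The normalized result always lies between -1 and 1 regardless of the vectors' lengths, whereas the raw dot product grows with their norms.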
Merge seems to be deprecated, so instead of Merge use dot directly on the embeddings (and not on models). Use the code below.
from keras.layers import Input
from keras.models import Model
from keras.layers.embeddings import Embedding
from keras.layers.core import Dense, Reshape
from keras.layers import dot
input_target = Input((1,))
input_context = Input((1,))
embedding = Embedding(vocab_size, embed_size, input_length=1, name='embedding')
word_embedding = embedding(input_target)
word_embedding = Reshape((embed_size, 1))(word_embedding)
context_embedding = embedding(input_context)
context_embedding = Reshape((embed_size, 1))(context_embedding)
# now perform the dot product operation
dot_product = dot([word_embedding, context_embedding], axes=1)
dot_product = Reshape((1,))(dot_product)
# add the sigmoid output layer
output = Dense(1, activation='sigmoid')(dot_product)
model = Model(inputs=[input_target, input_context], outputs=output)
model.compile(loss='mean_squared_error', optimizer='rmsprop')
# view model summary
print(model.summary())
# visualize model structure
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
SVG(model_to_dot(model, show_shapes=True, show_layer_names=False,
rankdir='TB').create(prog='dot', format='svg'))
I am struggling with converting my Keras model into a TensorFlow estimator. I got the following error:
AttributeError: type object 'Dense' has no attribute 'from_config'
And here is my code:
from tensorflow import keras
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
classifier = keras.models.Sequential()
classifier.add(tf.layers.Dense(units = 6, kernel_initializer = keras.initializers.he_uniform(), activation = tf.nn.relu, input_shape =(11,)))
classifier.add(tf.layers.Dense(units = 6, kernel_initializer = keras.initializers.he_uniform(), activation = tf.nn.relu))
classifier.add(tf.layers.Dense(units = 1, kernel_initializer = tf.keras.initializers.he_uniform(), activation = tf.nn.softmax))
classifier.compile(optimizer=tf.keras.optimizers.Adam(lr=0.0001),
loss=tf.keras.losses.binary_crossentropy,
metric=tf.keras.metrics.categorical_accuracy)
my_estimator = tf.keras.estimator.model_to_estimator(keras_model=classifier)
The error comes from the last line of code.
I guess this is because the Dense layer I am using does not have the right attribute, but how can I find an equivalent that does have from_config?
Keras==2.1.6
tensorflow==1.7.0
Looks like you're using the Dense layer from the wrong package: it should be tf.keras.layers.Dense rather than tf.layers.Dense.
Note that though they have the same class name and lots of similar parameters, in fact they have nothing in common: tf.layers.Dense is a high-level tensorflow API, not related to keras. That's why you can't add them to classifier.
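A sketch of the same classifier built only from tf.keras.layers.Dense is shown below, written against the TensorFlow 2 API rather than the 1.7 release from the question. The estimator conversion is left out because its availability depends on the TensorFlow version, and sigmoid is substituted for softmax on the single output unit as an assumption of this example (softmax over one unit always yields 1).

```python
import tensorflow as tf

classifier = tf.keras.Sequential([
    tf.keras.Input(shape=(11,)),
    tf.keras.layers.Dense(6, kernel_initializer="he_uniform", activation="relu"),
    tf.keras.layers.Dense(6, kernel_initializer="he_uniform", activation="relu"),
    # sigmoid instead of the question's softmax (assumption, see lead-in)
    tf.keras.layers.Dense(1, kernel_initializer="he_uniform", activation="sigmoid"),
])
classifier.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
                   loss="binary_crossentropy",
                   metrics=["accuracy"])

# Keras layers can be serialized to a config and rebuilt from it,
# which is the from_config capability the failing call was looking for.
config = classifier.layers[0].get_config()
rebuilt = tf.keras.layers.Dense.from_config(config)

probs = classifier(tf.zeros((2, 11)))
print(rebuilt.units, tuple(probs.shape))
```

Note that compile takes metrics (plural); the question's metric keyword is a separate issue from the from_config error.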