How do I define a multi input layer using Keras Functional API? Below is an example of the neural network I want to build. There are three input nodes. I want each node to be a 1 dimensional numpy array of different lengths.
Here's what I have so far. Basically I want to define an input layer with multiple input tensors.
from keras.layers import Input, Dense, Dropout, concatenate
from keras.models import Model
x1 = Input(shape =(10,))
x2 = Input(shape =(12,))
x3 = Input(shape =(15,))
input_layer = concatenate([x1,x2,x3])
hidden_layer = Dense(units=4, activation='relu')(input_layer)
prediction = Dense(1, activation='linear')(hidden_layer)
model = Model(inputs=input_layer,outputs=prediction)
model.summary()
The code gives the error.
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("x1_1:0", shape=(?, 10), dtype=float32) at layer "x1". The following previous layers were accessed without issue: []
Later when I fit the model I will pass in a list of 1D numpy arrays with the corresponding lengths.
The inputs must be your Input() layers:
model = Model(inputs=[x1, x2, x3],outputs=prediction)
Change
model = Model(inputs=input_layer,outputs=prediction)
to
model = Model(inputs=[x1, x2, x3],outputs=prediction)
Related
I'm trying to extract features from a pretrained model and use on my own model. I can successfully instantiate the Inveption V3 Model and save the outputs tu use as inputs for my model, but as i try to use it i get error. I tried to delete the Flatten layer but looks like the problem isnt this. I think the problem is about the last_output but have no clue on how to solve it.
The code:
#%% Imports.
import tensorflow as tf
from tensorflow.keras.layers import Dense, Conv2D, Flatten, Dropout, MaxPooling2D
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras import layers, Model
from tensorflow.keras.applications.inception_v3 import InceptionV3
import os, signal
import numpy as np
#%% Instatiate an Inception V3 model
url = "https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5" # Get the weights from the pretrained model
local_weights_file = tf.keras.utils.get_file("inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5", origin = url, extract = True)
pre_trained_model = InceptionV3(input_shape=(150, 150, 3), include_top=False, weights=None) # include_top=False argument, we load a network that doesn't include
pre_trained_model.load_weights(local_weights_file) # the classification layers at the top—ideal for feature extraction.
# Make the model non-trainable, since we will only use it for feature extraction; we won't update the weights of the pretrained model during training.
for layers in pre_trained_model.layers:
layers.trainable = False
# The layer we will use for feature extraction in Inception v3 is called mixed7. It is not the bottleneck of the network, but we are using it to keep a
# sufficiently large feature map (7x7 in this case). (Using the bottleneck layer would have resulting in a 3x3 feature map, which is a bit small.)
last_layer = pre_trained_model.get_layer('mixed7')
print('last layer output shape:', last_layer.output_shape)
last_output = last_layer.output
print(last_output)
# %% Stick a fully connected classifier on top of last_output
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024, activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)
# Configure and compile the model
model = Model(pre_trained_model.input, x)
model.compile(loss='binary_crossentropy',
optimizer=RMSprop(lr=0.0001),
metrics=['acc'])
the error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
c:\Users\jpaul\Code\Google_ML_Crash_Course\02_Practica\02_Image_Classification\image_classification_part3.py in
39 # Flatten the output layer to 1 dimension
----> 40 x = layers.Flatten()(last_output)
41
42 # Add a fully connected layer with 1,024 hidden units and ReLU activation
43 x = layers.Dense(1024, activation='relu')(x)
AttributeError: 'Concatenate' object has no attribute 'Flatten'
In your for loop, you overwrote the layers identifier from the import statement of
from tensorflow.keras import layers
So when you try to create a new Flatten() layer, the identifier layers contains a Concatenate object rather than the Keras layers module you were expecting.
Change the variable name in your for loop and you should be good.
I was trying to implement a sequence tagging model with the LSTM. Just for example I have 200 sentences in which, each token has a 1024-dim embedding. I also padded all the sentences to 80-dim vectors. So, I have an input matrix with shape (200,80,1024).
I also padded the targets. I have for each token of the sentences a tag. So the shape of my y is (200,80).
I tried with the LSTM in this way
from keras.models import Model, Input
from keras.layers.merge import add
from keras.layers import LSTM, Embedding, Dense, TimeDistributed, Dropout, Bidirectional, Lambda
max_len = 80
input_text = Input(shape=(max_len,1024), dtype=tf.float32)
x = Bidirectional(LSTM(units=512, return_sequences=True,
recurrent_dropout=0.2, dropout=0.2))(input_text)
x_rnn = Bidirectional(LSTM(units=512, return_sequences=True,
recurrent_dropout=0.2, dropout=0.2))(x)
x = add([x, x_rnn]) # residual connection to the first biLSTM
out = TimeDistributed(Dense(n_tags, activation="softmax"))(x)
model = Model(input_text, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
history = model.fit(np.array(full_embeddings), y,batch_size=32, epochs=10, verbose=1)
but I get this error:
ValueError: Error when checking target: expected time_distributed_6 to have 3 dimensions, but got array with shape (200, 80)
Could anyone explain me the problem? I'm quite new to Keras and Neural Nets and I'm not able to understand the reason.
Thanks
Since your final output layer is time distributed, it has 3 dimensions. Your targets, y should also have 3 dimensions. Reshape your y
y = np.expand_dims(y, -1)
to have a shape of (200, 80, 1)
I have a built a LSTM architecture using Keras. My goal is to map length 29 time series input sequences of floats to length 29 output sequences of floats. I am trying to implement a "many-to-many" approach. I followed this post for implementing such a model.
I start by reshaping each data point into an np.array of shape `(1, 29, 1). I have multiple data points and train the model on each one separately. The following code is how I build my model:
def build_model():
# define model
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(29, return_sequences=True, input_shape=(29, 1)))
model.add(tf.keras.layers.LeakyReLU(alpha=0.3))
model.compile(optimizer='sgd', loss='mse', metrics = ['mae'])
#cast data
for point in train_dict:
train_data = train_dict[point]
train_dataset = tf.data.Dataset.from_tensor_slices((
tf.cast(train_data[0], features_type),
tf.cast(train_data[1], target_type))
).repeat() #cast into X, Y
# fit model
model.fit(train_dataset, epochs=100,steps_per_epoch = 1,verbose=0)
print(model.summary())
return model
I am confused because when I call model.predict(test_point, steps = 1, verbose = 1) the model returns 29 length 29 sequences! I don't understand why this is happening, based on my understanding from the linked post. When I try return_state=True instead of return_sequences=True then my code raises this error: ValueError: All layers in a Sequential model should have a single output tensor. For multi-output layers, use the functional API.
How do I solve the problem?
Your model has few flaws.
The last layer of your model is an LSTM. Assuming you're doing either classification / regression. This should be followed by a Dense layer (SoftMax/sigmoid - classification, linear - regression). But since this is a time-series problem, dense layer should be wrapped in a TimeDistributed wrapper.
It's odd to apply a LeakyReLU on top of the LSTM.
I've fixed the code with fixes for above issues. See if that helps.
from tensorflow.keras.layers import Embedding, Input, Bidirectional, LSTM, Dense, Concatenate, LeakyReLU, TimeDistributed
from tensorflow.keras.initializers import Constant
from tensorflow.keras.models import Model
from tensorflow.keras.models import Sequential
def build_model():
# define model
model = Sequential()
model.add(LSTM(29, return_sequences=True, input_shape=(29, 1)))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='sgd', loss='mse', metrics = ['mae'])
print(model.summary())
return model
model = build_model()
This is my simple reproducible code:
from keras.callbacks import ModelCheckpoint
from keras.models import Model
from keras.models import load_model
import keras
import numpy as np
SEQUENCE_LEN = 45
LATENT_SIZE = 20
VOCAB_SIZE = 100
inputs = keras.layers.Input(shape=(SEQUENCE_LEN, VOCAB_SIZE), name="input")
encoded = keras.layers.Bidirectional(keras.layers.LSTM(LATENT_SIZE), merge_mode="sum", name="encoder_lstm")(inputs)
decoded = keras.layers.RepeatVector(SEQUENCE_LEN, name="repeater")(encoded)
decoded = keras.layers.Bidirectional(keras.layers.LSTM(VOCAB_SIZE, return_sequences=True), merge_mode="sum", name="decoder_lstm")(decoded)
autoencoder = keras.models.Model(inputs, decoded)
autoencoder.compile(optimizer="sgd", loss='mse')
autoencoder.summary()
x = np.random.randint(0, 90, size=(10, SEQUENCE_LEN,VOCAB_SIZE))
y = np.random.normal(size=(10, SEQUENCE_LEN, VOCAB_SIZE))
NUM_EPOCHS = 1
checkpoint = ModelCheckpoint(filepath='checkpoint/{epoch}.hdf5')
history = autoencoder.fit(x, y, epochs=NUM_EPOCHS,callbacks=[checkpoint])
and here is my code to have a look at the weights in the encoder layer:
for epoch in range(1, NUM_EPOCHS + 1):
file_name = "checkpoint/" + str(epoch) + ".hdf5"
lstm_autoencoder = load_model(file_name)
encoder = Model(lstm_autoencoder.input, lstm_autoencoder.get_layer('encoder_lstm').output)
print(encoder.output_shape[1])
weights = encoder.get_weights()[0]
print(weights.shape)
for idx in range(encoder.output_shape[1]):
token_idx = np.argsort(weights[:, idx])[::-1]
here print(encoder.output_shape) is (None,20) and print(weights.shape) is (100, 80).
I understand that get_weight will print the weight transition after the layer.
The part I did not get based on this architecture is 80. what is it?
And, are the weights here the weight that connect the encoder layer to the decoder? I meant the connection between encoder and the decoder.
I had a look at this question here. as it is only simple dense layers I could not connect the concept to the seq2seq model.
Update1
What is the difference between:
encoder.get_weights()[0] and encoder.get_weights()[1]?
the first one is (100,80) and the second one is (20,80) like conceptually?
any help is appreciated:)
The encoder as you have defined it is a model, and it consists of two layers: an input layer and the 'encoder_lstm' layer which is the bidirectional LSTM layer in the autoencoder. So its output shape would be the output shape of 'encoder_lstm' layer which is (None, 20) (because you have set LATENT_SIZE = 20 and merge_mode="sum"). So the output shape is correct and clear.
However, since encoder is a model, when you run encoder.get_weights() it would return the weights of all the layers in the model as a list. The bidirectional LSTM consists of two separate LSTM layers. Each of those LSTM layers has 3 weights: the kernel, the recurrent kernel and the biases. So encoder.get_weights() would return a list of 6 arrays, 3 for each of the LSTM layers. The first element of this list, as you have stored in weights and is subject of your question, is the kernel of one of the LSTM layers. The kernel of an LSTM layer has a shape of (input_dim, 4 * lstm_units). The input dimension of 'encoder_lstm' layer is VOCAB_SIZE and its number of units is LATENT_SIZE. Therefore, we have (VOCAB_SIZE, 4 * LATENT_SIZE) = (100, 80) as the shape of kernel.
I have a multi input Keras neural network where I want to calculate the gradient of the output with respect to one of the intermediate layers. The layer I chose is inside one of my sequential models that handles one of the two inputs to the model.
But when I try to calculate the gradient with respect to this layer I get [None] returned as the output. I am able to take the gradient for other layers such as the concatenate layer, but not any layers within my 2 branches. Is it possible for me to take the gradient of this layer given that it is within a sequential wrapper in my model?
Here is some code that shows what I'm trying to do.
from keras.models import Sequential, Model
from keras.layers import Dense, Input, concatenate
import keras.backend as K
# Init branch 1
branch1 = Sequential()
branch1.add(Dense(64, activation='relu', input_shape=(1000,)))
branch1.add(Dense(32, activation='relu'))
# Init branch 2
branch2 = Sequential()
branch2.add(Dense(16, activation='relu', input_shape=(500,)))
branch2.add(Dense(8, activation='relu'))
branch1_input = Input(shape=(1000,))
branch1_out = branch1(branch1_input)
branch2_input = Input(shape=(500,))
branch2_out = branch2(branch2_input)
# Combine
x = concatenate([branch1_out, branch2_out])
out = Dense(1)(x)
# Create model
model = Model(inputs=[branch1_input, branch2_input], outputs=out)
layer = model.get_layer('sequential_1').get_layer('dense_2')
grads = K.gradients(model.output, layer.output)
print(grads) # prints out [None]