I'm looking for a way to create a Keras model with optional inputs. In raw TensorFlow, you can create placeholders with optional inputs as follows:
import numpy as np
import tensorflow as tf


def main():
    required_input = tf.placeholder(
        tf.float32,
        shape=(None, 2),
        name='required_input')

    default_optional_input = tf.random_uniform(
        shape=(tf.shape(required_input)[0], 3))
    optional_input = tf.placeholder_with_default(
        default_optional_input,
        shape=(None, 3),
        name='optional_input')

    output = tf.concat((required_input, optional_input), axis=-1)

    with tf.Session() as session:
        with_optional_input_output_np = session.run(output, feed_dict={
            required_input: np.random.uniform(size=(4, 2)),
            optional_input: np.random.uniform(size=(4, 3)),
        })
        print(f"with optional input: {with_optional_input_output_np}")

        without_optional_input_output_np = session.run(output, feed_dict={
            required_input: np.random.uniform(size=(4, 2)),
        })
        print(f"without optional input: {without_optional_input_output_np}")


if __name__ == '__main__':
    main()
In a similar fashion, I would like to have optional inputs for my Keras model. The tensor argument of keras.layers.Input.__init__ looks like what I'm after, but it doesn't work the way I expected (i.e. the same way as tf.placeholder_with_default shown above). Here's an example that breaks:
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp


def create_model(output_size):
    required_input = tf.keras.layers.Input(
        shape=(13, ), dtype='float32', name='required_input')

    batch_size = tf.shape(required_input)[0]

    def sample_optional_input(inputs, batch_size=None):
        base_distribution = tfp.distributions.MultivariateNormalDiag(
            loc=tf.zeros(output_size),
            scale_diag=tf.ones(output_size),
            name='sample_optional_input')
        return base_distribution.sample(batch_size)

    default_optional_input = tf.keras.layers.Lambda(
        sample_optional_input,
        arguments={'batch_size': batch_size}
    )(None)

    optional_input = tf.keras.layers.Input(
        shape=(output_size, ),
        dtype='float32',
        name='optional_input',
        tensor=default_optional_input)

    concat = tf.keras.layers.Concatenate(axis=-1)(
        [required_input, optional_input])
    dense = tf.keras.layers.Dense(
        output_size, activation='relu')(concat)

    model = tf.keras.Model(
        inputs=[required_input, optional_input],
        outputs=[dense])

    return model


def main():
    model = create_model(output_size=3)

    required_input_np = np.random.normal(size=(4, 13))
    outputs_np = model.predict({'required_input': required_input_np})
    print(f"outputs_np: {outputs_np}")

    required_input = tf.random_normal(shape=(4, 13))
    outputs = model({'required_input': required_input})
    print(f"outputs: {outputs}")


if __name__ == '__main__':
    main()
The first call to the model.predict seems to give correct output, but for some reason, the direct call to model fails with the following error:
ValueError: Layer model expects 2 inputs, but it received 1 input tensors. Inputs received: []
Can the tensor argument of Input.__init__ be used to implement optional inputs for a Keras model, as in my example above? If yes, what should I change in my example to make it run correctly? If not, what is the expected way of creating optional inputs in Keras?
I really don't think it's possible without workarounds. Keras was not meant for that.
But, noticing that you are using two different session.run commands for each case, it seems that it should be easy to do it with two models. One model uses the optional input, the other doesn't. You choose which one to use the same way you choose which session.run() to call.
That said, you can use Input(tensor=...) or simply create the optional input inside a Lambda layer. Both are fine. But don't use Input(shape=..., tensor=...): the two arguments are redundant, and Keras sometimes does not deal well with redundancies like this.
Ideally, keep all operations inside Lambda layers, even the tf.shape operation.
With those changes:
required_input = tf.keras.layers.Input(
    shape=(13, ), dtype='float32', name='required_input')

# needs the input for the case you want to pass it:
optional_input_when_used = tf.keras.layers.Input(shape=(output_size,))

# operations should be inside Lambda layers
batch_size = tf.keras.layers.Lambda(lambda x: tf.shape(x)[0])(required_input)

# updated for using the batch size coming from lambda
# you didn't use "inputs" anywhere in this function
def sample_optional_input(batch_size):
    base_distribution = tfp.distributions.MultivariateNormalDiag(
        loc=tf.zeros(output_size),
        scale_diag=tf.ones(output_size),
        name='sample_optional_input')
    return base_distribution.sample(batch_size)

# updated for using the batch size as input
default_optional_input = tf.keras.layers.Lambda(sample_optional_input)(batch_size)

# let's skip the concat for now - notice I'm not "using" this layer yet
dense_layer = tf.keras.layers.Dense(output_size, activation='relu')

# you could create the rest of the model here if it's big, so you don't create it twice
# (check the final section of this answer)
Model using passed input:
concat_when_used = tf.keras.layers.Concatenate(axis=-1)(
    [required_input, optional_input_when_used]
)
dense_when_used = dense_layer(concat_when_used)
# or final_part_of_the_model(concat_when_used)
model_when_used = tf.keras.Model([required_input, optional_input_when_used], dense_when_used)
Model not using the optional input:
concat_not_used = tf.keras.layers.Concatenate(axis=-1)(
    [required_input, default_optional_input]
)
dense_not_used = dense_layer(concat_not_used)
# or final_part_of_the_model(concat_not_used)
model_not_used = tf.keras.Model(required_input, dense_not_used)
It's OK to create two models like this and choose one to use. Both models share the final layers, so training one also updates the other.
At the point where you previously chose which session.run call to make, you now choose which model to use:
model_when_used.predict([x1, x2])
model_when_used.fit([x1,x2], y)
model_not_used.predict(x)
model_not_used.fit(x, y)
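The choice between the two models can also be hidden behind one helper, so calling code does not need to know which model runs. This is a minimal sketch, assuming both models expose a predict method as above. The names predict_with_optional and _FakeModel are hypothetical and exist only so the snippet runs without Keras; with real models you would pass model_when_used and model_not_used directly.

```python
class _FakeModel:
    """Hypothetical stand-in for a Keras model; it only echoes its inputs."""

    def __init__(self, name):
        self.name = name

    def predict(self, inputs):
        return (self.name, inputs)


def predict_with_optional(model_when_used, model_not_used, required, optional=None):
    """Route to the two-input model only when the optional input is supplied."""
    if optional is None:
        return model_not_used.predict(required)
    return model_when_used.predict([required, optional])


model_when_used = _FakeModel("when_used")
model_not_used = _FakeModel("not_used")

# Optional input supplied: the two-input model is called with a list.
print(predict_with_optional(model_when_used, model_not_used, "x1", "x2"))
# Optional input omitted: the one-input model is called.
print(predict_with_optional(model_when_used, model_not_used, "x"))
```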
How to create a shared final part?
If your final part is big, you will not want to call everything twice to create two models. In this case, create a final model first:
input_for_final = Input(shape_after_concat)
out = Dense(....)(input_for_final)
out = Dense(....)(out)
out = Dense(....)(out)
.......
final_part_of_the_model = Model(input_for_final, out)
Then use this final part in the two models defined above:
dense_when_used = final_part_of_the_model(concat_when_used)
dense_not_used = final_part_of_the_model(concat_not_used)
I am trying to use this notebook, which defines a 3-head model based on DenseNet201. The AlexNet-based version works correctly, but the DenseNet201 version throws an error. I am a PyTorch user and have not been able to figure out this error: ValueError: Missing data for input "input_5". You passed a data dictionary with keys ['img_input']. Expected the following keys: ['input_5'].
I know that somewhere in the following code snippet I should set the name 'img_input', but I cannot figure out where.
class base_model():
    def __init__(self, side_dim, n_bb, n_classes, name_model):
        self.side_dim = side_dim
        self.name_model = name_model

        # base model DenseNet
        if name_model == 'DenseNet201':
            self.base_model = keras.applications.DenseNet201(
                include_top=False,
                input_shape=(self.side_dim, self.side_dim, 3),
            )
            self.image_input = self.base_model.input
            self.flatten = keras.layers.Flatten()(self.base_model.layers[-2].output)
            self.BatcNorm = keras.layers.BatchNormalization()(self.flatten)
            print('Base model: DenseNet121 (7.2M params x 201 layers')

        # ----------------------------------------------------------------------
        # Add head with three different outputs to last layer of the basic model
        # ----------------------------------------------------------------------
        # class output
        self.class_categorical = keras.layers.Dense((n_bb * n_classes),
                                                    activation='softmax')(self.BatcNorm)
        self.class_output = keras.layers.Reshape((n_bb, n_classes),
                                                 name='class_output')(self.class_categorical)
        # confidence output
        self.score_confidence = keras.layers.Dense((n_bb),
                                                   name='score_confidence',
                                                   activation='tanh')(self.BatcNorm)
        # bounding boxes coordinate output
        self.score_coords = keras.layers.Dense((n_bb * 4),
                                               name='score_coords')(self.BatcNorm)
The error is thrown when I run the following:
# let's start our training
train_history = myModel.fit({'img_input': X_train},
                            {'class_output': class_target,
                             'score_confidence': target_confidence,
                             'score_coords': target_coords},
                            epochs=N_ep,
                            validation_data=({'img_input': X_val},
                                             {'class_output': Val_class,
                                              'score_confidence': Val_confidence,
                                              'score_coords': Val_coords}),
                            batch_size=Batchs,
                            initial_epoch=init_ep,
                            verbose=1,
                            callbacks=[callbacks,
                                       tensorboard_callback])
In the AlexNet based network, the input name is changed directly but I do not know how to do it for the DenseNet201.
Can you please help me?
The issue is that your input node does not have the same name as the dictionary key holding your input.
You can create your input layer beforehand with the right name, and pass it to the DenseNet201 function as the input tensor.
self.image_input = keras.Input((self.side_dim, self.side_dim, 3), name="img_input")
self.base_model = keras.applications.DenseNet201(
include_top=False,
input_tensor=self.image_input,
)
Another option is to get the name of the input right in your dictionary by using the name of the input node:
myModel.fit({myModel.input.name: X_train},
{'class_output': class_target,
'score_confidence': target_confidence,
'score_coords': target_coords})
A final option is to skip the dictionary altogether, given that you have a single input:
myModel.fit(X_train,
{'class_output': class_target,
'score_confidence': target_confidence,
'score_coords': target_coords})
I have a data generator that produces batches of input data (X) and targets (Y), and also a mask (batch_mask) to be applied to the model output. The same mask applies to all the data points in a batch; different batches have different masks, and the data generator takes care of this.
As a result, the first dimension of batch_mask could have size 1 or batch_size (by repeating the same mask batch_size times along the first dimension). I was expecting Keras to let me use either, and I wanted to simply create masks with size 1 on the first dimension.
However, when I tried this, I got the error:
ValueError: Data cardinality is ambiguous:
x sizes: 128, 1
y sizes: 128
Make sure all arrays contain the same number of samples.
Why won't Keras broadcast along the first dimension? It seems like this should not be complicated.
Here's some minimal example code to observe this behavior
import tensorflow.keras as tfk
import numpy as np
#######################
# 1. model definition #
#######################
# model parameters
nfeatures_in = 6
target_size = 8
# model inputs
input = tfk.layers.Input(nfeatures_in)
input_mask = tfk.layers.Input(target_size)
# model graph
out = tfk.layers.Dense(target_size)(input)
out_masked = tfk.layers.Multiply()((out,input_mask)) # multiply all model outputs in the batch by the same mask
model = tfk.Model(inputs=(input, input_mask), outputs=out_masked)
##########################
# 2. dummy data creation #
##########################
batch_size = 32
# create the mask for the batch
zeros_vector = np.zeros((1,target_size)) # "batch_size"==1
zeros_vector[0,:6] = 1
batch_mask = zeros_vector
# dummy data creation
X = np.random.randn(batch_size, 6)
Y = np.random.randn(batch_size, target_size)*batch_mask # the target is masked by design in each batch
############################
# 3. compile model and fit #
############################
model.compile(optimizer="Adam", loss="mse")
model.fit((X, batch_mask),Y, batch_size=batch_size)
I know I could make this work by either:
repeating the mask to make the first dimension of batch_mask be the size of the first dimension of X (instead of 1).
using pure tensorflow (but I feel like broadcasting along the batch dimension should not be a problem for Keras).
How can I make this work with Keras?
Thank you!
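The first workaround listed above (repeating the mask) can be sketched with NumPy alone. The shapes follow the question's example. Nothing here touches Keras; the snippet only shows that the repeated mask has one row per sample and gives the same product as the broadcast one.

```python
import numpy as np

batch_size, target_size = 32, 8

# Mask with a leading dimension of 1, as in the question.
batch_mask = np.zeros((1, target_size))
batch_mask[0, :6] = 1

# Repeat along axis 0 so the mask has one row per sample.
repeated_mask = np.repeat(batch_mask, batch_size, axis=0)

# Stand-in for the Dense layer output of one batch.
out = np.random.randn(batch_size, target_size)

print(repeated_mask.shape)
# NumPy broadcasting and the explicit repeat give the same masked output.
print(np.array_equal(out * batch_mask, out * repeated_mask))
```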
You can create an IdentityLayer which receives as an external input parameter the batch_mask and returns it as a tensor.
import tensorflow as tf


class IdentityLayer(tfk.layers.Layer):
    def __init__(self, my_mask, **kwargs):
        # forward kwargs (e.g. name) to the base Layer
        super(IdentityLayer, self).__init__(**kwargs)
        self.my_mask = my_mask

    def call(self, _):
        # ignore the incoming tensor and return the stored mask
        my_mask = tf.convert_to_tensor(self.my_mask, dtype=tf.float32)
        return my_mask

    def get_config(self):
        config = super().get_config()
        config.update({
            "my_mask": self.my_mask,
        })
        return config
The usage of IdentityLayer in a model is straightforward:
# model inputs
input = tfk.layers.Input(nfeatures_in)
input_mask = IdentityLayer(batch_mask)(input)
# model graph
out = tfk.layers.Dense(target_size)(input)
out_masked = tfk.layers.Multiply()((out,input_mask))
model = tfk.Model(inputs=input, outputs=out_masked)
Where batch_mask is a numpy array created as you reported:
zeros_vector = np.zeros((1,target_size)) # "batch_size"==1
zeros_vector[0,:6] = 1
batch_mask = zeros_vector
The solution is to (properly) use a DataGenerator.
See the gist with the working code: https://gist.github.com/iranroman/2aaecf5b5621051df6b1b6b5394e5ef3
Thank you @Marco Cerliani for the discussion that led to figuring out the solution.
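The gist is not reproduced here. As a rough sketch of the idea, assuming the generator is a plain Python generator that yields ((X, mask), Y) tuples, the mask can be expanded to the batch size inside the generator so that every array in a batch has the same number of rows. The function name masked_batches and the random data are illustrative only and are not taken from the gist.

```python
import numpy as np


def masked_batches(n_batches, batch_size, nfeatures_in, target_size, seed=0):
    """Yield ((X, mask), Y) where mask has the same first dimension as X."""
    rng = np.random.default_rng(seed)
    for _ in range(n_batches):
        # One mask per batch: keep a random-length prefix of the targets.
        n_keep = rng.integers(1, target_size + 1)
        mask_row = np.zeros((1, target_size))
        mask_row[0, :n_keep] = 1
        # Repeat the single mask row once per sample in the batch.
        mask = np.repeat(mask_row, batch_size, axis=0)

        X = rng.standard_normal((batch_size, nfeatures_in))
        Y = rng.standard_normal((batch_size, target_size)) * mask
        yield (X, mask), Y


for (X, mask), Y in masked_batches(n_batches=2, batch_size=32,
                                   nfeatures_in=6, target_size=8):
    print(X.shape, mask.shape, Y.shape)
```

A generator of this shape should be usable as the data source for a two-input model like the one in the question, since each yielded batch has matching sample counts across X, mask, and Y.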
I am new to TensorFlow. My task is to predict some values (in this case, speed). If I use a single value as the model input (l0), everything is fine: I can train the model and make predictions:
dataset, meta = arff.loadarff('data.arff')
# meta: 'XYZ'
# TIMESTAMP_ms's type is numeric
# SPEED_KMH's type is numeric
# POWER_W's type is numeric
# CURRENT_A's type is numeric
# VOLTAGE_V's type is numeric
# TORQUE_Nm's type is numeric
# CADENCE_RPM's type is numeric
speed = np.array(dataset[:]['SPEED_KMH'], dtype=float)
cadence = np.array(dataset[:]['CADENCE_RPM'], dtype=float)
power = np.array(dataset[:]['POWER_W'], dtype=float)
torque = np.array(dataset[:]['TORQUE_Nm'], dtype=float)
# Create model
l0 = tf.keras.layers.Dense(units=4, input_shape=[1]) #with one input all ok. BUT HOW TO USE n-Input?
l1 = tf.keras.layers.Dense(units=4)
l2 = tf.keras.layers.Dense(units=1)
model = tf.keras.Sequential([l0, l1, l2])
model.compile(loss='mean_squared_error', optimizer=tf.keras.optimizers.Adam(0.01))
model.fit(cadence, speed, epochs=500, verbose=True)
...
model.predict([<some_val>])
But when I tried to feed several values to the input layer to improve the accuracy of the model, I ran into a problem:
...
train_data = []
for i in range(len(dataset)):
    train_data.append([cadence[i], power[i], torque[i]])
...
l0 = tf.keras.layers.Dense(units=4, input_shape=[3])
...
model.fit(train_data, speed, epochs=1, verbose=True)
ValueError: Failed to find data adapter that can handle input: ( containing values of types {'(
Please help me pass multiple values to the input layer l0 of the model.
One way of using multiple inputs for a model is the Keras functional API. It lets you define multiple inputs, which you can then concatenate later on in your model.
input1 = tf.keras.layers.Input(shape=(1, ))
input2 = tf.keras.layers.Input(shape=(1,))
input3 = tf.keras.layers.Input(shape=(1,))
mergeLayer = tf.keras.layers.Concatenate(axis=1)([input1, input2, input3])
dense1 = tf.keras.layers.Dense(4)(mergeLayer)
dense2 = tf.keras.layers.Dense(4)(dense1)
output = tf.keras.layers.Dense(1)(dense2)
model = tf.keras.models.Model([input1, input2, input3], output)
Now you can pass your data as a list with one array per input and call the fit() method on the new model.
For more information on the functional API, see the Keras Functional API guide in the docs.
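Whichever model you use, it helps to hand fit() NumPy arrays rather than nested Python lists; the error in the question appears to come from passing a list of lists. A small sketch of both layouts, with made-up values standing in for the cadence, power, and torque columns from the question:

```python
import numpy as np

# Made-up stand-ins for the columns loaded from data.arff.
cadence = np.array([60.0, 75.0, 90.0])
power = np.array([120.0, 180.0, 240.0])
torque = np.array([19.0, 23.0, 25.5])

# Layout 1: one (n_samples, 3) array, for a single input with input_shape=[3].
train_data = np.column_stack([cadence, power, torque])
print(train_data.shape)

# Layout 2: a list of three (n_samples, 1) arrays, one per Input layer.
train_inputs = [col.reshape(-1, 1) for col in (cadence, power, torque)]
print([a.shape for a in train_inputs])
```

Layout 1 matches the Sequential model with input_shape=[3] from the question; layout 2 matches the three-input functional model above.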
I'm using TF 1.15 and trying to solve a regression task on a signal.
First I load my signals into the pipeline. I have several files; here I simulate the loading with np.zeros so that you can run the code.
Every file has the shape (?, 75000, 3), where ? is an arbitrary number of elements, 75000 is the number of samples in each element, and 3 is the number of signals.
Using tf.data I unpack them and get a dataset that outputs signals of shape (75000,), which I feed to my Keras model.
Everything seems fine until I create the Keras model. I copied my input pipeline because during my tests I got different errors depending on whether I used a generic tf.data.Dataset or the dataset built this way.
import numpy as np
import tensorflow as tf
# called in the dataset pipeline
def my_func(x):
    p = np.zeros([86, 75000, 3])
    x = p[:, :, 0]
    y = p[:, :, 1]
    z = p[:, :, 2]
    return x, y, z

# called in the dataset pipeline
def load_sign(path):
    func = tf.compat.v1.numpy_function(my_func, [path], [tf.float64, tf.float64, tf.float64])
    return func
# Dataset pipeline
s = [1, 2] # here i have the file paths, i simulate it with numbers
AUTOTUNE = tf.data.experimental.AUTOTUNE
ds = tf.data.Dataset.from_tensor_slices(s)
# ds = ds.map(load_sign, num_parallel_calls=AUTOTUNE)
ds = ds.map(load_sign, num_parallel_calls=AUTOTUNE).unbatch()
itera = tf.data.make_one_shot_iterator(ds)
ABP, ECG, PLETH = itera.get_next()
# Until there everything should be fine
# Here i create my convolutional network
signal = tf.keras.layers.Input(shape=(None,75000), dtype='float32')
x = tf.compat.v1.keras.layers.Conv1D(64, (1), strides=1, padding='same')(signal)
x = tf.keras.layers.Dense(75000)(x)
model = tf.keras.Model(inputs=signal, outputs=x, name='resnet18')
# And finally i try to insert my signal into model
logits = model(PLETH)
I get this error:
ValueError: Input 0 of layer conv1d is incompatible with the layer: its rank is undefined, but the layer requires a defined rank.
Why? And how can I make it work?
Also, according to the documentation, the input of my network should have this shape:
3D tensor with shape: (batch_size, steps, input_dim)
What are the steps? In my case I assume the shape should be (batch_size, 1, 75000), right?
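For reference, the (batch_size, steps, input_dim) layout can be illustrated with NumPy alone. This sketch assumes each 75000-sample signal is treated as one sequence with a single channel, so steps is the number of samples and input_dim is 1. The alternative guessed in the question, (batch_size, 1, 75000), would instead describe a single step with 75000 features. Which one is appropriate depends on what the convolution is meant to slide over.

```python
import numpy as np

# One unpacked signal, shape (75000,), as produced by the pipeline.
signal = np.zeros(75000, dtype=np.float32)

# Assumption: one sequence of 75000 steps with 1 channel.
# Add a batch axis in front and a channel axis at the end.
as_sequence = signal[np.newaxis, :, np.newaxis]
print(as_sequence.shape)

# The layout assumed in the question: one step holding all 75000 values.
as_single_step = signal[np.newaxis, np.newaxis, :]
print(as_single_step.shape)
```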
In the implementation I am using, the LSTM is initialized in the following way:
l_lstm = Bidirectional(LSTM(64, return_sequences=True))(embedded_sequences)
What I don't really understand, possibly because of my limited experience with Python in general, is the notation l_lstm = Bidirectional(LSTM(...))(embedded_sequences).
I don't get what I am passing embedded_sequences to. It is not a parameter of LSTM(), but it also does not seem to be an argument of Bidirectional(), as it stands separately.
Here is the source of Bidirectional.__init__:
def __init__(self, layer, merge_mode='concat', weights=None, **kwargs):
    if merge_mode not in ['sum', 'mul', 'ave', 'concat', None]:
        raise ValueError('Invalid merge mode. '
                         'Merge mode should be one of '
                         '{"sum", "mul", "ave", "concat", None}')
    self.forward_layer = copy.copy(layer)
    config = layer.get_config()
    config['go_backwards'] = not config['go_backwards']
    self.backward_layer = layer.__class__.from_config(config)
    self.forward_layer.name = 'forward_' + self.forward_layer.name
    self.backward_layer.name = 'backward_' + self.backward_layer.name
    self.merge_mode = merge_mode
    if weights:
        nw = len(weights)
        self.forward_layer.initial_weights = weights[:nw // 2]
        self.backward_layer.initial_weights = weights[nw // 2:]
    self.stateful = layer.stateful
    self.return_sequences = layer.return_sequences
    self.return_state = layer.return_state
    self.supports_masking = True
    self._trainable = True
    super(Bidirectional, self).__init__(layer, **kwargs)
    self.input_spec = layer.input_spec
    self._num_constants = None
Let's try to break down what is going on:
You start with LSTM(...), which creates an LSTM layer. Layers in Keras are callable, which means you can use them like functions. For example, lstm = LSTM(...) followed by lstm(some_input) calls the LSTM on the given input tensor.
Bidirectional(...) wraps any RNN layer and returns another layer that, when called, applies the wrapped layer in both directions. So l_lstm = Bidirectional(LSTM(...)) is a layer that, when called with some input, applies the LSTM in both directions. Note: Bidirectional creates a copy of the LSTM layer passed to it, so the backward and forward passes use different LSTMs.
Finally, when you call Bidirectional(LSTM(...))(embedded_sequences), the bidirectional layer takes the input sequences, passes them to the wrapped LSTMs in both directions, collects their outputs, and (with the default merge_mode='concat') concatenates them.
To understand more about layers and their callable nature, you can look at the functional API guide of the documentation.
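The same notation can be reproduced in plain Python. The sketch below is not Keras code: ToyLayer and ToyBidirectional are made-up classes that only mimic the pattern, where constructing an object configures it and calling the object (via __call__) applies it to an input.

```python
class ToyLayer:
    """Toy 'layer': configured at construction, applied when called."""

    def __init__(self, go_backwards=False):
        self.go_backwards = go_backwards

    def __call__(self, sequence):
        # Return the sequence as a list, reversed if go_backwards is set.
        return list(sequence[::-1]) if self.go_backwards else list(sequence)


class ToyBidirectional:
    """Toy wrapper: holds a forward and a backward copy of a layer."""

    def __init__(self, layer):
        self.forward_layer = type(layer)(go_backwards=layer.go_backwards)
        self.backward_layer = type(layer)(go_backwards=not layer.go_backwards)

    def __call__(self, sequence):
        # Apply both copies and concatenate their outputs.
        return self.forward_layer(sequence) + self.backward_layer(sequence)


embedded_sequences = [1, 2, 3]

# Two steps: build the wrapped layer, then call it like a function.
l_layer = ToyBidirectional(ToyLayer())
two_step = l_layer(embedded_sequences)

# One step, as in the question: the second pair of parentheses is the call.
one_step = ToyBidirectional(ToyLayer())(embedded_sequences)

print(two_step)
print(one_step)
```

Both forms produce the same result; the one-line version simply calls the freshly constructed object without first binding it to a name.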