BERT embeddings in LSTM model error in fit function - python

I am a novice in TensorFlow.
I am trying to use BERT embeddings in an LSTM model.
This is my model function:
def bert_tweets_model():
    Bertmodel = TFAutoModel.from_pretrained(model_name, output_hidden_states=True)
    input_word_ids = tf.keras.Input(shape=(max_length,), dtype=tf.int32, name="input_ids")
    input_masks_in = tf.keras.Input(shape=(max_length,), name='masked_token', dtype='int32')
    with torch.no_grad():
        last_hidden_states = Bertmodel(input_word_ids, attention_mask=input_masks_in)[0]
    x = tf.keras.layers.LSTM(100, dropout=0.1, activation='relu', recurrent_dropout=0.3, return_sequences=True)(last_hidden_states)
    x = tf.keras.layers.LSTM(50, dropout=0.1, activation='relu', recurrent_dropout=0.3, return_sequences=True)(x)
    output = tf.keras.layers.Dense(units=2, activation='sigmoid')(x)
    model = tf.keras.Model(inputs=[input_word_ids, input_masks_in], outputs=output)
    return model
with strategy.scope():
    model = bert_tweets_model()
    adam_optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)
validation_data=[dev_encoded, y_val]
train2=[input_id, attention_mask]
history = model.fit(
    x=train2, y=y_train, batch_size=batch_size,
I received this error in the fit function when I tried to pass in the data:
"ValueError: Layer "model_1" expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 512) dtype=int32>]"
Also, I received these warning messages and I do not know what they mean:
WARNING:tensorflow:Layer lstm_2 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.
WARNING:tensorflow:Layer lstm_3 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.
can someone help me, thanks in advance.

Reproducing your error:
_input1 = tf.random.uniform((1,100), 0 , 10)
_input2 = tf.random.uniform((1,100), 0 , 10)
model(_input1, _input2)
After running this code I am getting the same error...
Layer "model" expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor: shape=(1, 100), ...
Now, the problem is that you have to enclose the inputs in a tuple or list and pass that single object to the model, like this:
model((_input1, _input2))
<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[0.5324366, 0.3743334]], dtype=float32)>
Remember: if you feed the model from a dataset, enclose the inputs in a tuple when you build the dataset as well, so that the input ids and words_mask arrive together as one element.
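Second problem, as you asked:
The warning appears because your LSTM layers do not meet the requirements for the fast cuDNN kernel, so TensorFlow falls back to a generic GPU kernel, exactly as the message says. The layers still run on the GPU, only more slowly. In your model the cause is activation='relu' and recurrent_dropout=0.3: the cuDNN implementation needs activation='tanh', recurrent_activation='sigmoid', recurrent_dropout=0, unroll=False and use_bias=True (and inputs that are either unmasked or strictly right-padded).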
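To check a layer configuration against the cuDNN argument requirements, a small plain-Python sketch might look like the following. Note that meets_cudnn_criteria is a hypothetical helper written for this answer, not a TensorFlow function; it only restates the argument conditions from the tf.keras.layers.LSTM documentation and does not check masking or eager execution.

```python
def meets_cudnn_criteria(activation="tanh", recurrent_activation="sigmoid",
                         recurrent_dropout=0.0, unroll=False, use_bias=True):
    """Return True if these LSTM arguments allow the cuDNN kernel.

    Hypothetical helper, not part of TensorFlow. It only covers the
    constructor arguments; padding/masking requirements are not checked.
    """
    return (activation == "tanh"
            and recurrent_activation == "sigmoid"
            and recurrent_dropout == 0
            and not unroll
            and use_bias)

# Settings from the question: relu activation plus recurrent dropout.
question_layer = meets_cudnn_criteria(activation="relu", recurrent_dropout=0.3)
# The same layer with the default activation and no recurrent dropout.
default_layer = meets_cudnn_criteria()
print(question_layer, default_layer)  # False True
```

If speed matters more than those two settings, dropping activation='relu' and recurrent_dropout from the LSTM layers should make the warning go away; the regular dropout argument can stay.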


how to save and load custom siamese bert model

I am following this tutorial on how to train a siamese bert network:
All good, but I am not sure what the best way is to save the model after training it.
Any suggestions?
I was trying with model.save('models/bert_siamese_v1'),
which creates a folder with saved_model.pb, keras_metadata.pb and two subfolders (variables and assets).
then I try to load it with:
and it gives me this error:
2022-03-08 14:11:52.567762: W tensorflow/core/util/] Could not open models/bert_siamese_v1/: Failed precondition: models/bert_siamese_v1; Is a directory: perhaps your file is in a different file format and you need to use a different restore operator?
what is the best way to proceed?
Try using tf.saved_model.save to save your model:
tf.saved_model.save(model, 'models/bert_siamese_v1')
model = tf.saved_model.load('models/bert_siamese_v1')
The warning you get during saving can apparently be ignored. After loading your model, you can use it for inference f(test_data):
f = model.signatures["serving_default"]
x1 = tf.random.uniform((1, 128), maxval=100, dtype=tf.int32)
x2 = tf.random.uniform((1, 128), maxval=100, dtype=tf.int32)
x3 = tf.random.uniform((1, 128), maxval=100, dtype=tf.int32)
print(f(attention_masks = x1, input_ids = x2, token_type_ids = x3))
ConcreteFunction signature_wrapper(*, token_type_ids, attention_masks, input_ids)
attention_masks: int32 Tensor, shape=(None, 128)
input_ids: int32 Tensor, shape=(None, 128)
token_type_ids: int32 Tensor, shape=(None, 128)
{'dense': <1>}
<1>: float32 Tensor, shape=(None, 3)
{'dense': <tf.Tensor: shape=(1, 3), dtype=float32, numpy=array([[0.40711606, 0.13456087, 0.45832306]], dtype=float32)>}
It seems you have two options
manually save weights
model = create_model()
save the entire model
Call model.save to save a model's architecture, weights, and training configuration in a single file/folder. This allows you to export a model so it can be used without access to the original Python code. Since the optimizer state is recovered, you can resume training from exactly where you left off.
Save model
# Create and train a new model instance.
model = create_model()
model.fit(..., train_labels, epochs=5)
# Save the entire model as a SavedModel.
!mkdir -p saved_model
model.save('saved_model/my_model')
load model
new_model = tf.keras.models.load_model('saved_model/my_model')
It seems that you are mixing both approaches, saving model and loading weights.

Keras: Trying to model.predict() gives "ValueError: Tensor's shape is not compatible with supplied shape"

I'm following the TensorFlow Keras tutorial for text generation. The training part works perfectly, but when I try to predict the next token, I get an error.
Here's all the important code:
Making the vocabulary and dataset.
vocab = sorted(set(text))
char2index = { c:i for i, c in enumerate(vocab) }
index2char = np.array(vocab)
chars_to_int = np.array([char2index[c] for c in text])
char_dataset = tf.data.Dataset.from_tensor_slices(chars_to_int)
sequences = char_dataset.batch(seq_length + 1, drop_remainder=True)
def split_input_and_target(sequence):
    input_ = sequence[:-1]
    target_ = sequence[1:]
    return input_, target_
dataset = sequences.map(split_input_and_target)
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
Building the model
(important part here is that BATCH_SIZE = 64):
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(len(vocab), EMBEDDING_DIM,
                                    batch_input_shape=[BATCH_SIZE, None]))
# here are a few more layers
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(dataset, epochs=EPOCHS)
Actually trying to generate text (this one was copied almost directly from the tutorial after I started getting desperate):
num_tokens = 100
seed = "some text"
input_eval = [char2index[c] for c in seed]
input_eval = tf.expand_dims(input_eval, 0)
text_generated = []
for i in range(num_tokens):
    predictions = model(input_eval)
    predictions = tf.squeeze(predictions, 0)
    # more stuff
Then, I first get a warning:
WARNING:tensorflow:Model was constructed with shape (64, None) for input Tensor("embedding_14_input:0", shape=(64, None), dtype=float32), but it was called on an input with incompatible shape (1, 9).
Then it gives me an error:
---->3 predictions = model(input_eval)
ValueError: Tensor's shape (9, 64, 256) is not compatible with supplied shape [9, 1, 256]
The second number, 64, is my batch size. If I change BATCH_SIZE to 1, everything works and all is fine, but this is obviously not the solution I am hoping for.
(I somehow managed to miss a step in the tutorial despite reading it several times over the past few hours.)
Here's the relevant passage:
To keep this prediction step simple, use a batch size of 1.
Because of the way the RNN state is passed from timestep to timestep, the model only accepts a fixed batch size once built.
To run the model with a different batch_size, we need to rebuild the model and restore the weights from the checkpoint.
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))
I hope my silly mistake will help somebody to remember to reload the model in the future!

Error on prediction running keras multi_gpu_model

I've an issue running a Keras model on a Google Cloud Platform instance.
The model is the following:
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
verbose, epochs, batch_size = 1, 1, 64 # low number of epochs just for testing purpose
with tf.device('/cpu:0'):
    m = Sequential()
    m.add(CuDNNLSTM(20, input_shape=(n_timesteps, n_features)))
    m.add(CuDNNLSTM(20, return_sequences=True))
self.model = multi_gpu_model(m, gpus=8)
self.model.compile(loss='mse', optimizer='adam')
self.model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
As you can see from the code above, I run the model on a machine with 8 GPUs (Nvidia Tesla K80).
Training works well, without any errors. However, the prediction fails and returns the following error:
W tensorflow/core/framework/] OP_REQUIRES failed at : Unknown: CUDNN_STATUS_BAD_PARAM
in tensorflow/stream_executor/cuda/ 'cudnnSetTensorNdDescriptor( tensor_desc.get(), data_type, sizeof(dims) / sizeof(dims[0]), dims, strides)'
Here is the code to run the prediction:
What I've noticed is that if I remove the code for multi-GPU data parallelism, the code works well using a single GPU.
To be more precise, if I comment this line, the code works without error
self.model = multi_gpu_model(m, gpus=8)
What am I missing?
virtualenv information
cudatoolkit - 10.0.130
cudnn - 7.6.4
keras - 2.2.4
keras-applications - 1.0.8
keras-base - 2.2.4
keras-gpu - 2.2.4
python - 3.6
train_x.shape = (1441, 288, 1)
train_y.shape = (1441, 288, 1)
input_x.shape = (1, 288, 1)
After Olivier Dehaene's reply I tried his suggestion and it worked.
I tried to modify the input_x shape in order to obtain (8, 288, 1).
In order to do that I also modified train_x and train_y shapes.
Here a recap:
train_x.shape = (8065, 288, 1)
train_y.shape = (8065, 288, 1)
input_x.shape = (8, 288, 1)
But now I get the same error in the training phase, on this line:
self.model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
From the tf.keras.utils.multi_gpu_model documentation we can see that it works in the following way:
Divide the model's input(s) into multiple sub-batches.
Apply a model copy on each sub-batch. Every model copy is executed on a dedicated GPU.
Concatenate the results (on CPU) into one big batch.
You are triggering an error because the input of the CuDNNLSTM layer is empty for at least one of the model copies. This is because the split requires that batch_size // n_gpus > 0. With input_x of shape (1, 288, 1) and 8 GPUs you get 1 // 8 = 0.
Try this code out:
input_x = np.random.randn(8, n_timesteps, n_features)
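To see why a small batch breaks this, here is a rough plain-Python sketch of the splitting step. It assumes each replica receives batch_size // n_gpus samples and the last replica also takes the remainder; sub_batch_sizes is a hypothetical helper for illustration, not the Keras implementation.

```python
def sub_batch_sizes(batch_size, n_gpus):
    """Sizes of the slices handed to each model copy (hypothetical helper)."""
    step = batch_size // n_gpus
    # Assumption: every copy gets `step` samples, the last one also gets the remainder.
    return [step] * (n_gpus - 1) + [batch_size - step * (n_gpus - 1)]

print(sub_batch_sizes(1, 8))   # [0, 0, 0, 0, 0, 0, 0, 1]  -> seven empty inputs
print(sub_batch_sizes(8, 8))   # [1, 1, 1, 1, 1, 1, 1, 1]
print(sub_batch_sizes(64, 8))  # [8, 8, 8, 8, 8, 8, 8, 8]

# Size of the last, incomplete batch of an epoch with batch_size=64.
print(8065 % 64)  # 1
print(1441 % 64)  # 33
```

If that assumption holds, it may also explain the new failure during training: with 8065 samples and batch_size=64 the final batch of each epoch contains a single sample, whereas with 1441 samples it contained 33. Choosing a sample count (or batch size) that leaves at least 8 samples in the last batch would be worth trying.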

tf.keras & tf.estimator & tf.dataset

I am trying to update my code to work with TF 2.0. As a start, I have used a pre-made Keras model:
def train_input_fn(batch_size=1):
    """An input function for training"""
    print("train_input_fn: start function")
    train_dataset = tf.data.experimental.make_csv_dataset(..., batch_size=batch_size, label_name='label', ...)
    print('train_input_fn: finished make_csv_dataset')
    train_dataset = train_dataset.map(pars_features_vector)
    print("train_input_fn: finished the map with pars_features_vector")
    train_dataset = train_dataset.repeat().batch(batch_size)
    print("train_input_fn: finished batch size. train_dataset is %s ", train_dataset)
    return train_dataset
IMG_SHAPE = (160,160,3)
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False
estimator = tf.keras.estimator.model_to_estimator(keras_model = model, model_dir = './date')
# train_input_fn read a CSV of images, resize them and returns dataset batch
train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=20)
# eval_input_fn read a CSV of images, resize them and returns dataset batch of one sample
eval_spec = tf.estimator.EvalSpec(eval_input_fn)
tf.estimator.train_and_evaluate(estimator, train_spec=train_spec, eval_spec=eval_spec)
LOGS are:
train_input_fn: finished batch size. train_dataset is %s <BatchDataset shapes: ({mobilenetv2_1.00_160_input: (None, 1, 160, 160, 3)}, (None, 1)), types: ({mobilenetv2_1.00_160_input: tf.float32}, tf.int32)>
ValueError: Input 0 of layer Conv1_pad is incompatible with the layer: expected ndim=4, found ndim=5. Full shape received: [None, 1, 160, 160, 3]
What is the right way to combine tf.keras with the dataset API? Is this the issue, or is it something else?
You don't need this line
train_dataset = train_dataset.repeat().batch(batch_size)
The function you're using to create the dataset has already batched it, so batching again adds a second batch dimension (hence ndim=5 instead of 4). You can still use repeat(), though.
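The effect of batching twice can be illustrated without TensorFlow. The sketch below uses plain Python lists as stand-ins for images; batch and depth are hypothetical helpers written for this illustration, not part of the dataset API.

```python
from itertools import islice

def batch(items, size):
    """Group an iterable into lists of `size` elements (last one may be shorter)."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def depth(x):
    """Count how many list levels wrap a value."""
    return 1 + depth(x[0]) if isinstance(x, list) else 0

samples = ["img0", "img1", "img2", "img3"]  # stand-ins for images
once = list(batch(samples, 1))              # what the CSV reader already returns
twice = list(batch(once, 1))                # after calling .batch() again

print(depth(once[0]), depth(twice[0]))  # 1 2
```

The second call wraps every existing batch in another one, which corresponds to the extra leading dimension in the (None, 1, 160, 160, 3) shape from the logs.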

fine tune a model using Keras Functional API

I am using VGG16 to finetune it on my dataset.
Here's the model:
def finetune(self, aux_input):
    model = applications.VGG16(weights='imagenet', include_top=False)
    # return model
    drop_5 = Input(shape=(7, 7, 512))
    flatten = Flatten()(drop_5)
    # aux_input = Input(shape=(1,))
    concat = Concatenate(axis=1)([flatten, aux_input])
    fc1 = Dense(512, kernel_regularizer=regularizers.l2(self.weight_decay))(concat)
    fc1 = Activation('relu')(fc1)
    fc1 = BatchNormalization()(fc1)
    fc1_drop = Dropout(0.5)(fc1)
    fc2 = Dense(self.num_classes)(fc1_drop)
    top_model_out = Activation('softmax')(fc2)
    top_model = Model(inputs=drop_5, outputs=top_model_out)
    output = top_model(model.output)
    complete_model = Model(inputs=[model.input, aux_input], outputs=output)
    return complete_model
I have two inputs to the model. In the above function, I'm using Concatenate for the flattened array and my aux_input.
I'm not sure if this would work with imagenet weights.
When I run this, I get an error:
ValueError: Graph disconnected: cannot obtain value for tensor
Tensor("aux_input:0", shape=(?, 1), dtype=float32) at layer
"aux_input". The following previous layers were accessed without
issue: ['input_2', 'flatten_1']
I'm not sure where I am going wrong.
If it matters, this is the fit call:
model.fit({'input_1': x_train, 'aux_input': y_aux_train}, y=y_train, batch_size=batch_size,
          epochs=maxepoches, validation_data=([x_test, y_aux_test], y_test),
          callbacks=[reduce_lr, tensorboard], verbose=2)
But, I get an error before this fit function when I call model.summary().
The problem is that you are using aux_input in your top_model but you don't specify it as an input in your definition of top_model. Try replacing your definition of top_model and output with the following:
top_model = Model(inputs=[drop_5, aux_input], outputs=top_model_out)
output = top_model([model.output, aux_input])

