Fine-tuning the MoViNet-A2-Stream TensorFlow Hub model - python

import tensorflow as tf
import tensorflow_hub as hub

hub_url = "https://tfhub.dev/tensorflow/movinet/a2/stream/kinetics-600/classification/3"

encoder = hub.KerasLayer(hub_url, trainable=True)

# Define the image (video) input.
image_input = tf.keras.layers.Input(
    shape=[None, None, None, 3],
    dtype=tf.float32,
    name='image')

# Define the state inputs, which is a dict that maps state names to tensors.
init_states_fn = encoder.resolved_object.signatures['init_states']
state_shapes = {
    name: ([s if s > 0 else None for s in state.shape], state.dtype)
    for name, state in init_states_fn(tf.constant([0, 0, 0, 0, 3])).items()
}
states_input = {
    name: tf.keras.Input(shape[1:], dtype=dtype, name=name)
    for name, (shape, dtype) in state_shapes.items()
}

# The inputs to the model are the states and the video.
inputs = {**states_input, 'image': image_input}

outputs = encoder(inputs)

model = tf.keras.Model(inputs, outputs, name='movinet')

# Create your example input here.
# Refer to the description or paper for recommended input shapes.
example_input = tf.ones([1, 8, 172, 172, 3])

# Split the video into individual frames.
# Note: we can also split into larger clips as well (e.g., 8-frame clips).
# Running on larger clips will slightly reduce latency overhead, but
# will consume more memory.
frames = tf.split(example_input, example_input.shape[1], axis=1)

# Initialize the dict of states. All state tensors are initially zeros.
init_states = init_states_fn(tf.shape(example_input))

# Run the model prediction by looping over each frame.
states = init_states
predictions = []
for frame in frames:
    output, states = model({**states, 'image': frame})
    predictions.append(output)

# The video classification will simply be the last output of the model.
final_prediction = tf.argmax(predictions[-1], -1)
Here is the official TF Hub code from the model page.
I want to train this model on a custom dataset. The problem is that the last layer of this model returns both an output and the updated states: the predictions come from the output, while the new states have to be fed back into the model as input along with the next frame.
My question is: how do I wrap the encoder layer and a final prediction layer in a tf.keras.Model and train it?
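A minimal sketch of one possible approach (an assumption on my part, not the official MoViNet fine-tuning recipe): reuse the encoder and the inputs dict built above, route the encoder's logits into a new trainable head for your own classes, and expose only the class predictions as the training output. num_classes and the training tensors below are placeholders.

# A sketch, not the official recipe: the stream encoder returns
# (logits, new_states); use the logits as features for a new head.
num_classes = 10  # hypothetical: number of classes in your dataset

logits, new_states = encoder(inputs)
head = tf.keras.layers.Dense(num_classes, activation='softmax', name='head')
train_model = tf.keras.Model(inputs, head(logits), name='movinet_classifier')

train_model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])

# During training, feed whole clips plus the zero-initialized states,
# so there is no per-frame state loop to manage:
clips = tf.ones([1, 8, 172, 172, 3])   # stand-in for real video clips
labels = tf.constant([0])              # stand-in for real labels
train_states = init_states_fn(tf.shape(clips))
train_model.fit({**train_states, 'image': clips}, labels, epochs=1)

For streaming inference after training, the per-frame loop from the question still applies; only the head output replaces the raw Kinetics-600 logits.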

Related

TensorFlow Keras make ThresholdedReLU Theta optimizable

I've built a seq-to-seq model to take N periods of continuous input data and predict M periods of continuous response data (M < N). Because my response is sparse (usually 0), I want to squish predictions that are "small" to 0, so I have added a Thresholded ReLU as my final layer. The model, with a single encoder and a single decoder, is:
# encoder
encoder_inputs_11 = tf.keras.layers.Input(shape=(n_past, n_features))
encoder_layr1_11 = tf.keras.layers.LSTM(encoderLayers[0], return_state=True)
encoder_outputs1_11 = encoder_layr1_11(encoder_inputs_11)
encoder_states1_11 = encoder_outputs1_11[1:]
# decoder
decoder_inputs_11 = tf.keras.layers.RepeatVector(n_future)(encoder_outputs1_11[0])
decoder_layr1_11 = tf.keras.layers.LSTM(decoderLayers[0], return_sequences=True)(decoder_inputs_11, initial_state = encoder_states1_11)
decoder_outputs1_11 = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(n_response))(decoder_layr1_11)
# threshold layer
thresh_11 = tf.keras.layers.ThresholdedReLU(theta=reluTheta)(decoder_outputs1_11)
# entire model
result = tf.keras.models.Model(encoder_inputs_11, thresh_11)
How can I make theta in tf.keras.layers.ThresholdedReLU(theta=reluTheta) an optimizable parameter?
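One way to do this (a sketch, not a built-in Keras feature): replace ThresholdedReLU with a custom layer that owns theta as a trainable weight. Note that the hard threshold x * 1[x > theta] has zero gradient with respect to theta almost everywhere, so the sketch uses a smooth sigmoid gate as a surrogate so gradients can actually reach theta; the temperature of 0.01 is an assumed hyperparameter.

import tensorflow as tf

class TrainableThresholdedReLU(tf.keras.layers.Layer):
    """Thresholded ReLU whose theta is learned by gradient descent (a sketch)."""

    def __init__(self, initial_theta=1.0, temperature=0.01, **kwargs):
        super().__init__(**kwargs)
        self.initial_theta = initial_theta
        self.temperature = temperature

    def build(self, input_shape):
        # theta is a scalar weight, updated like any other parameter
        self.theta = self.add_weight(
            name="theta", shape=(),
            initializer=tf.keras.initializers.Constant(self.initial_theta),
            trainable=True)

    def call(self, inputs):
        # smooth approximation of x * 1[x > theta]; the sigmoid gate
        # lets gradients flow back into theta
        gate = tf.sigmoid((inputs - self.theta) / self.temperature)
        return inputs * gate

The final layer of the model above would then become, for example:
thresh_11 = TrainableThresholdedReLU(initial_theta=reluTheta)(decoder_outputs1_11)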

How do you specify multiple values for the inputTensorName key in INPUTMETADATA spec in Vertex Explainable AI for a Functional API model?

I want to add explanation to my model running in Vertex AI using the Vertex AI SDK. I get a silent error when running the batch prediction using ModelBatchPredictOp, where the ModelBatchPredictOp node runs indefinitely. Here is my ModelBatchPredictOp definition:
ModelBatchPredictOp(
    project=project_id,
    job_display_name="tensorflow-ex-batch-prediction-job",
    location=project_location,
    model=champion_model.outputs["model"],
    instances_format="csv",
    predictions_format="jsonl",
    gcs_source_uris=gcs_source_uris,
    gcs_destination_output_uri_prefix=gcs_destination_output_uri_prefix,
    machine_type=batch_prediction_machine_type,
    starting_replica_count=batch_prediction_min_replicas,
    max_replica_count=batch_prediction_max_replicas,
    generate_explanation=True,
)
I have narrowed down the issue to the inputTensorName key in the INPUTMETADATA spec. The inputTensorName key takes a string as its value (see the INPUTMETADATA spec). In my case I have a TensorFlow model defined using the functional API, meaning it has multiple inputs, as shown below:
# numeric/categorical features in Chicago trips dataset to be preprocessed
NUM_COLS = ["dayofweek", "hourofday", "trip_distance", "trip_miles", "trip_seconds"]
ORD_COLS = ["company"]
OHE_COLS = ["payment_type"]

def build_and_compile_model(dataset: Dataset, model_params: dict) -> Model:
    """Build and compile model.

    Args:
        dataset (Dataset): training dataset
        model_params (dict): model parameters

    Returns:
        model (Model): built and compiled model
    """
    # create inputs (scalars with shape `()`)
    num_ins = {
        name: Input(shape=(), name=name, dtype=tf.float32) for name in NUM_COLS
    }
    ord_ins = {
        name: Input(shape=(), name=name, dtype=tf.string) for name in ORD_COLS
    }
    cat_ins = {
        name: Input(shape=(), name=name, dtype=tf.string) for name in OHE_COLS
    }

    # join all inputs and expand by 1 dimension. NOTE: this is useful when passing
    # in scalar inputs to a model in Vertex AI batch predictions or endpoints e.g.
    # `{"instances": {"input1": 1.0, "input2": "str"}}` instead of
    # `{"instances": {"input1": [1.0], "input2": ["str"]}}`
    all_ins = {**num_ins, **ord_ins, **cat_ins}
    exp_ins = {n: tf.expand_dims(i, axis=-1) for n, i in all_ins.items()}

    # preprocess expanded inputs
    num_encoded = [normalization(n, dataset)(exp_ins[n]) for n in NUM_COLS]
    ord_encoded = [str_lookup(n, dataset, "int")(exp_ins[n]) for n in ORD_COLS]
    ohe_encoded = [str_lookup(n, dataset, "one_hot")(exp_ins[n]) for n in OHE_COLS]

    # ensure ordinal encoded layers are of type float32 (like the other layers)
    ord_encoded = [tf.cast(x, tf.float32) for x in ord_encoded]

    # concat encoded inputs and add dense layers including output layer
    x = num_encoded + ord_encoded + ohe_encoded
    x = Concatenate()(x)
    for units, activation in model_params["hidden_units"]:
        x = Dense(units, activation=activation)(x)
    x = Dense(1, name="output", activation="linear")(x)

    model = Model(inputs=all_ins, outputs=x, name="nn_model")
    model.summary()

    logging.info(f"Use optimizer {model_params['optimizer']}")
    optimizer = optimizers.get(model_params["optimizer"])
    optimizer.learning_rate = model_params["learning_rate"]
    model.compile(
        loss=model_params["loss_fn"],
        optimizer=optimizer,
        metrics=model_params["metrics"],
    )
    return model
As a consequence, when getting the input layer using:

serving_input = list(
    loaded_model.signatures["serving_default"].structured_input_signature[1].keys()
)[0]

INPUT_METADATA = {
    "input_tensor_name": serving_input,
    "encoding": "BAG_OF_FEATURES",
    "modality": "numeric",
    "index_feature_mapping": cols,
}
I only get one input layer, corresponding to one of the input tensors (cols) in NUM_COLS, ORD_COLS or OHE_COLS. This causes ModelBatchPredictOp to run indefinitely in the prediction pipeline, because the name of only one input tensor is passed as the value of inputTensorName. Running
list(model.signatures["serving_default"].structured_input_signature[1].keys())
returns a list of all the input layers corresponding to the input tensor names (cols) defined in NUM_COLS, ORD_COLS and OHE_COLS.
How do I specify the value of inputTensorName in order to capture all the input layers? Or is there a workaround to assign multiple input tensor names to inputTensorName?
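One direction worth sketching (hedged: this assumes the google-cloud-aiplatform SDK's ExplanationMetadata types and is not a confirmed fix): instead of forcing every feature into a single inputTensorName, give each serving input its own InputMetadata entry keyed by feature name.

from google.cloud import aiplatform

serving_inputs = list(
    loaded_model.signatures["serving_default"].structured_input_signature[1].keys()
)

# one InputMetadata per input tensor, keyed by feature name
explanation_metadata = aiplatform.explain.ExplanationMetadata(
    inputs={
        name: aiplatform.explain.ExplanationMetadata.InputMetadata(
            input_tensor_name=name
        )
        for name in serving_inputs
    },
    outputs={"output": {"output_tensor_name": "output"}},
)

Whether this metadata is attached when the model is uploaded or passed through the pipeline alongside ModelBatchPredictOp depends on your setup.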

Keras won't broadcast-multiply the model output with a mask designed for the entire mini batch

I have a data generator that produces batches of input data (X) and targets (Y), and also a mask (batch_mask) to be applied to the model output. The same mask applies to all the datapoints in a batch; there are different masks for different batches, and the data generator takes care of this.
As a result, the first dimension of batch_mask could have size 1 or batch_size (by repeating the same mask along the first dimension batch_size times). I was expecting Keras to accept either, and I wanted to simply create masks with size 1 on the first dimension.
However, when I tried this, I got the error:
ValueError: Data cardinality is ambiguous:
x sizes: 128, 1
y sizes: 128
Make sure all arrays contain the same number of samples.
Why won't Keras broadcast along the first dimension? It seems like this should not be complicated.
Here's some minimal example code to observe this behavior:
import tensorflow.keras as tfk
import numpy as np
#######################
# 1. model definition #
#######################
# model parameters
nfeatures_in = 6
target_size = 8
# model inputs
input = tfk.layers.Input(nfeatures_in)
input_mask = tfk.layers.Input(target_size)
# model graph
out = tfk.layers.Dense(target_size)(input)
out_masked = tfk.layers.Multiply()((out,input_mask)) # multiply all model outputs in the batch by the same mask
model = tfk.Model(inputs=(input, input_mask), outputs=out_masked)
##########################
# 2. dummy data creation #
##########################
batch_size = 32
# create the mask for the batch
zeros_vector = np.zeros((1,target_size)) # "batch_size"==1
zeros_vector[0,:6] = 1
batch_mask = zeros_vector
# dummy data creation
X = np.random.randn(batch_size, 6)
Y = np.random.randn(batch_size, target_size)*batch_mask # the target is masked by design in each batch
############################
# 3. compile model and fit #
############################
model.compile(optimizer="Adam", loss="mse")
model.fit((X, batch_mask), Y, batch_size=batch_size)
I know I could make this work by either:
repeating the mask so that the first dimension of batch_mask matches the first dimension of X (instead of 1), as in the sketch right after this list;
using pure TensorFlow (but I feel like broadcasting along the batch dimension should not be a problem for Keras).
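For reference, here is a minimal sketch of that first workaround, tiling the mask along the batch dimension with NumPy so the cardinalities match:

# tile the (1, target_size) mask up to (batch_size, target_size)
batch_mask_tiled = np.repeat(batch_mask, batch_size, axis=0)
model.fit((X, batch_mask_tiled), Y, batch_size=batch_size)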
How can I make this work with Keras?
Thank you!
You can create an IdentityLayer which receives the batch_mask as an external input parameter and returns it as a tensor.
import tensorflow as tf

class IdentityLayer(tfk.layers.Layer):
    def __init__(self, my_mask, **kwargs):
        super(IdentityLayer, self).__init__()
        self.my_mask = my_mask

    def call(self, _):
        # ignore the incoming tensor and return the stored mask
        my_mask = tf.convert_to_tensor(self.my_mask, dtype=tf.float32)
        return my_mask

    def get_config(self):
        config = super().get_config()
        config.update({
            "my_mask": self.my_mask,
        })
        return config
The usage of IdentityLayer in a model is straightforward:
# model inputs
input = tfk.layers.Input(nfeatures_in)
input_mask = IdentityLayer(batch_mask)(input)
# model graph
out = tfk.layers.Dense(target_size)(input)
out_masked = tfk.layers.Multiply()((out,input_mask))
model = tfk.Model(inputs=input, outputs=out_masked)
Where batch_mask is a numpy array created as you reported:
zeros_vector = np.zeros((1,target_size)) # "batch_size"==1
zeros_vector[0,:6] = 1
batch_mask = zeros_vector
The solution is to (properly) use a DataGenerator.
See the gist with the working code: https://gist.github.com/iranroman/2aaecf5b5621051df6b1b6b5394e5ef3
Thanks to @Marco Cerliani for the discussion that led to figuring out the solution.
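As a rough illustration of that idea (a sketch under assumptions: in-memory NumPy arrays and one mask per batch; this is not the code from the gist), a tf.keras.utils.Sequence can tile each batch's mask on the fly for the original two-input model:

import numpy as np
import tensorflow.keras as tfk

class MaskedBatchGenerator(tfk.utils.Sequence):
    """Yields ((X_batch, tiled_mask), Y_batch) with one mask per batch."""

    def __init__(self, X, Y, masks, batch_size):
        self.X, self.Y, self.masks, self.batch_size = X, Y, masks, batch_size

    def __len__(self):
        return len(self.X) // self.batch_size

    def __getitem__(self, i):
        sl = slice(i * self.batch_size, (i + 1) * self.batch_size)
        # tile this batch's (1, target_size) mask up to the batch size
        mask = np.repeat(self.masks[i % len(self.masks)], self.batch_size, axis=0)
        return (self.X[sl], mask), self.Y[sl]

model.fit(MaskedBatchGenerator(X, Y, [batch_mask], batch_size))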

save and load custom attention model lstm in keras

I want to run a seq2seq model using LSTM for a customer journey analysis. I am able to run the model, but I am unable to load the saved model in a different notebook.
Code for the attention model is here:
# RNN "Cell" classes in Keras perform the actual data transformations at each timestep. Therefore, in order to add attention to LSTM, we need to make a custom subclass of LSTMCell.
class AttentionLSTMCell(LSTMCell):
def __init__(self, **kwargs):
self.attentionMode = False
super(AttentionLSTMCell, self).__init__(**kwargs)
# Build is called to initialize the variables that our cell will use. We will let other Keras
# classes (e.g. "Dense") actually initialize these variables.
#tf_utils.shape_type_conversion
def build(self, input_shape):
# Converts the input sequence into a sequence which can be matched up to the internal
# hidden state.
self.dense_constant = TimeDistributed(Dense(self.units, name="AttLstmInternal_DenseConstant"))
# Transforms the internal hidden state into something that can be used by the attention
# mechanism.
self.dense_state = Dense(self.units, name="AttLstmInternal_DenseState")
# Transforms the combined hidden state and converted input sequence into a vector of
# probabilities for attention.
self.dense_transform = Dense(1, name="AttLstmInternal_DenseTransform")
# We will augment the input into LSTMCell by concatenating the context vector. Modify
# input_shape to reflect this.
batch, input_dim = input_shape[0]
batch, timesteps, context_size = input_shape[-1]
lstm_input = (batch, input_dim + context_size)
# The LSTMCell superclass expects no constant input, so strip that out.
return super(AttentionLSTMCell, self).build(lstm_input)
# This must be called before call(). The "input sequence" is the output from the
# encoder. This function will do some pre-processing on that sequence which will
# then be used in subsequent calls.
def setInputSequence(self, input_seq):
self.input_seq = input_seq
self.input_seq_shaped = self.dense_constant(input_seq)
self.timesteps = tf.shape(self.input_seq)[-2]
# This is a utility method to adjust the output of this cell. When attention mode is
# turned on, the cell outputs attention probability vectors across the input sequence.
def setAttentionMode(self, mode_on=False):
self.attentionMode = mode_on
# This method sets up the computational graph for the cell. It implements the actual logic
# that the model follows.
def call(self, inputs, states, constants):
# Separate the state list into the two discrete state vectors.
# ytm is the "memory state", stm is the "carry state".
ytm, stm = states
# We will use the "carry state" to guide the attention mechanism. Repeat it across all
# input timesteps to perform some calculations on it.
stm_repeated = K.repeat(self.dense_state(stm), self.timesteps)
# Now apply our "dense_transform" operation on the sum of our transformed "carry state"
# and all encoder states. This will squash the resultant sum down to a vector of size
# [batch,timesteps,1]
# Note: Most sources I encounter use tanh for the activation here. I have found with this dataset
# and this model, relu seems to perform better. It makes the attention mechanism far more crisp
# and produces better translation performance, especially with respect to proper sentence termination.
combined_stm_input = self.dense_transform(
keras.activations.relu(stm_repeated + self.input_seq_shaped))
# Performing a softmax generates a log probability for each encoder output to receive attention.
score_vector = keras.activations.softmax(combined_stm_input, 1)
# In this implementation, we grant "partial attention" to each encoder output based on
# it's log probability accumulated above. Other options would be to only give attention
# to the highest probability encoder output or some similar set.
context_vector = K.sum(score_vector * self.input_seq, 1)
# Finally, mutate the input vector. It will now contain the traditional inputs (like the seq2seq
# we trained above) in addition to the attention context vector we calculated earlier in this method.
inputs = K.concatenate([inputs, context_vector])
# Call into the super-class to invoke the LSTM math.
res = super(AttentionLSTMCell, self).call(inputs=inputs, states=states)
# This if statement switches the return value of this method if "attentionMode" is turned on.
if(self.attentionMode):
return (K.reshape(score_vector, (-1, self.timesteps)), res[1])
else:
return res
# Custom implementation of the Keras LSTM that adds an attention mechanism.
# This is implemented by taking an additional input (using the "constants" of the RNN class into the LSTM: The encoder output vectors across the entire input sequence.
class LSTMWithAttention(RNN):
def __init__(self, units, **kwargs):
cell = AttentionLSTMCell(units=units)
self.units = units
super(LSTMWithAttention, self).__init__(cell, **kwargs)
#tf_utils.shape_type_conversion
def build(self, input_shape):
self.input_dim = input_shape[0][-1]
self.timesteps = input_shape[0][-2]
return super(LSTMWithAttention, self).build(input_shape)
# This call is invoked with the entire time sequence. The RNN sub-class is responsible
# for breaking this up into calls into the cell for each step.
# The "constants" variable is the key to our implementation. It was specifically added
# to Keras to accomodate the "attention" mechanism we are implementing.
def call(self, x, constants, **kwargs):
if isinstance(x, list):
self.x_initial = x[0]
else:
self.x_initial = x
# The only difference in the LSTM computational graph really comes from the custom
# LSTM Cell that we utilize.
self.cell._dropout_mask = None
self.cell._recurrent_dropout_mask = None
self.cell.setInputSequence(constants[0])
return super(LSTMWithAttention, self).call(inputs=x, constants=constants, **kwargs)
Code defining the encoder and decoder models:
# Encoder Layers
encoder_inputs = Input(shape=(None, len_input), name="attenc_inputs")
encoder = LSTM(units=units, return_sequences=True, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]

# define inference encoder
encoder_model = Model(encoder_inputs, encoder_states)
#encoder_model.save('atten_enc_model.h5')

# define training decoder
decoder_inputs = Input(shape=(None, n_output))
Attention_dec_lstm = LSTMWithAttention(units=units, return_sequences=True, return_state=True)
# Note that the only real difference here is that we are feeding encoder_outputs
# to the decoder now.
attdec_lstm_out, _, _ = Attention_dec_lstm(inputs=decoder_inputs,
                                           constants=encoder_outputs,
                                           initial_state=encoder_states)
decoder_dense1 = Dense(units, activation="relu")
decoder_dense2 = Dense(n_output, activation='softmax')
decoder_outputs = decoder_dense2(Dropout(rate=.10)(decoder_dense1(Dropout(rate=.10)(attdec_lstm_out))))

atten_model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
atten_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
atten_model.summary()

# Defining inference decoder
state_input_h = Input(shape=(units,), name="state_input_h")
state_input_c = Input(shape=(units,), name="state_input_c")
decoder_states_inputs = [state_input_h, state_input_c]
attenc_seq_out = Input(shape=encoder_outputs.get_shape()[1:], name="attenc_seq_out")
inf_attdec_inputs = Input(shape=(None, n_output), name="inf_attdec_inputs")
attdec_res, attdec_h, attdec_c = Attention_dec_lstm(inputs=inf_attdec_inputs,
                                                    constants=attenc_seq_out,
                                                    initial_state=decoder_states_inputs)
decoder_states = [attdec_h, attdec_c]
decoder_model = Model(inputs=[inf_attdec_inputs, state_input_h, state_input_c, attenc_seq_out],
                      outputs=[attdec_res, attdec_h, attdec_c])
Code for model fit and save:
history = atten_model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
                          batch_size=batch_size_num,
                          epochs=epochs,
                          validation_split=0.2, verbose=1)
atten_model.save('atten_model_lstm.h5')
Code to load the encoder-decoder model with the custom attention layers:
with open('atten_model_lstm.json') as mdl:
    json_string = mdl.read()
model = model_from_json(json_string,
                        custom_objects={'AttentionLSTMCell': AttentionLSTMCell,
                                        'LSTMWithAttention': LSTMWithAttention})
This loading code gives the error:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'AttentionLSTMCell'
Here's a solution inspired by the link in my comment:
# serialize model to JSON
atten_model_json = atten_model.to_json()
with open("atten_model.json", "w") as json_file:
    json_file.write(atten_model_json)
# serialize weights to HDF5
atten_model.save_weights("atten_model.h5")
print("Saved model to disk")

# Different part of your code or different file
# load json and create model
json_file = open('atten_model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
# the custom classes must still be passed in via custom_objects,
# otherwise deserialization fails with the TypeError above
loaded_model = model_from_json(loaded_model_json,
                               custom_objects={'AttentionLSTMCell': AttentionLSTMCell,
                                               'LSTMWithAttention': LSTMWithAttention})
# load weights into new model
loaded_model.load_weights("atten_model.h5")
print("Loaded model from disk")

TensorFlow: Remember LSTM state for next batch (stateful LSTM)

Given a trained LSTM model I want to perform inference for single timesteps, i.e. seq_length = 1 in the example below. After each timestep the internal LSTM (memory and hidden) states need to be remembered for the next 'batch'. For the very beginning of the inference the internal LSTM states init_c, init_h are computed given the input. These are then stored in an LSTMStateTuple object which is passed to the LSTM. During training this state is updated every timestep. However for inference I want the state to be saved between batches, i.e. the initial states only need to be computed at the very beginning, and after that the LSTM states should be saved after each 'batch' (n=1).
I found this related StackOverflow question: Tensorflow, best way to save state in RNNs?. However this only works if state_is_tuple=False, but this behavior is soon to be deprecated by TensorFlow (see rnn_cell.py). Keras seems to have a nice wrapper to make stateful LSTMs possible but I don't know the best way to achieve this in TensorFlow. This issue on the TensorFlow GitHub is also related to my question: https://github.com/tensorflow/tensorflow/issues/2838
Does anyone have good suggestions for building a stateful LSTM model?
inputs = tf.placeholder(tf.float32, shape=[None, seq_length, 84, 84], name="inputs")
targets = tf.placeholder(tf.float32, shape=[None, seq_length], name="targets")
num_lstm_layers = 2

with tf.variable_scope("LSTM") as scope:
    lstm_cell = tf.nn.rnn_cell.LSTMCell(512, initializer=initializer, state_is_tuple=True)
    self.lstm = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_lstm_layers, state_is_tuple=True)

    init_c = # compute initial LSTM memory state using contents in placeholder 'inputs'
    init_h = # compute initial LSTM hidden state using contents in placeholder 'inputs'
    self.state = [tf.nn.rnn_cell.LSTMStateTuple(init_c, init_h)] * num_lstm_layers

    outputs = []
    for step in range(seq_length):
        if step != 0:
            scope.reuse_variables()
        # CNN features, as input for LSTM
        x_t = # ...
        # LSTM step through time
        output, self.state = self.lstm(x_t, self.state)
        outputs.append(output)
I found out it was easiest to save the whole state for all layers in a placeholder.
init_state = np.zeros((num_layers, 2, batch_size, state_size))
...
state_placeholder = tf.placeholder(tf.float32, [num_layers, 2, batch_size, state_size])
Then unpack it and create a tuple of LSTMStateTuples before using the native TensorFlow RNN API.
l = tf.unpack(state_placeholder, axis=0)
rnn_tuple_state = tuple(
    [tf.nn.rnn_cell.LSTMStateTuple(l[idx][0], l[idx][1])
     for idx in range(num_layers)]
)
This tuple of states is then passed into the RNN API as usual:
cell = tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
cell = tf.nn.rnn_cell.MultiRNNCell([cell]*num_layers, state_is_tuple=True)
outputs, state = tf.nn.dynamic_rnn(cell, x_input_batch, initial_state=rnn_tuple_state)
The state variable is then fed to the next batch through the placeholder.
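To make the feed loop concrete, here is a small sketch (assumptions: a TF1 session sess, a hypothetical batches iterator over inputs, and the outputs/state tensors from above):

import numpy as np

current_state = np.zeros((num_layers, 2, batch_size, state_size))
for batch_x in batches:  # hypothetical iterator over input batches
    batch_outputs, new_state = sess.run(
        [outputs, state],
        feed_dict={x_input_batch: batch_x,
                   state_placeholder: current_state})
    # state is a tuple of LSTMStateTuples; stack it back into the
    # (num_layers, 2, batch_size, state_size) layout of the placeholder
    current_state = np.asarray(new_state)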
Tensorflow, best way to save state in RNNs? was actually my original question. The code below is how I use the state tuples.
with tf.variable_scope('decoder') as scope:
    rnn_cell = tf.nn.rnn_cell.MultiRNNCell([
        tf.nn.rnn_cell.LSTMCell(512, num_proj=256, state_is_tuple=True),
        tf.nn.rnn_cell.LSTMCell(512, num_proj=WORD_VEC_SIZE, state_is_tuple=True)
    ], state_is_tuple=True)

    state = [[tf.zeros((BATCH_SIZE, sz)) for sz in sz_outer] for sz_outer in rnn_cell.state_size]

    for t in range(TIME_STEPS):
        if t:
            last = y_[t - 1] if TRAINING else y[t - 1]
        else:
            last = tf.zeros((BATCH_SIZE, WORD_VEC_SIZE))
        y[t] = tf.concat(1, (y[t], last))
        y[t], state = rnn_cell(y[t], state)
        scope.reuse_variables()
Rather than using tf.nn.rnn_cell.LSTMStateTuple I just create a list of lists, which works fine. In this example I am not saving the state. However, you could easily make the state out of variables and just use assign to save the values.
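A rough sketch of that variable-based idea (TF1-style; new_c and new_h stand for the state tensors produced by the final RNN step, and the sizes follow the first cell above with num_proj=256):

# keep the state in non-trainable variables so it survives session runs
saved_c = tf.Variable(tf.zeros((BATCH_SIZE, 512)), trainable=False)
saved_h = tf.Variable(tf.zeros((BATCH_SIZE, 256)), trainable=False)

# build the RNN with (saved_c, saved_h) as the initial state, then
# persist the final state inside the graph:
save_state = tf.group(saved_c.assign(new_c), saved_h.assign(new_h))

# running save_state together with the train/inference op carries the
# state into the next batch without any feed_dict plumbing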
