Difference between model(x) and model.predict(x) - python

Here is a simple TensorFlow functional API model.
input1 = tf.keras.layers.Input(shape=(2,), dtype='float32')
output1 = tf.keras.layers.Dense(2)(input1)
model = tf.keras.Model(inputs=input1, outputs=output1)
In some examples of the functional API, output is obtained using model(), yet there is also model.predict().
With my example above, predict works:
model.predict([[[1.1, 2.2]]])
>> array([[1.8761028 , 0.20520687]], dtype=float32)
If I run just the model though, I get an error:
model([[[1.1, 2.2]]])
>> ... InvalidArgumentError: In[0] is not a matrix [Op:MatMul]
What is the difference, and why is the error occurring?
Thanks,
Julian

The error says that model() expects a tensor (a matrix) as input, but you passed a nested Python list.
To fix this, convert the list to a tensor or a NumPy array:
model(tf.Variable([[[1.1, 2.2]]]))
or
model(np.array([[[1.1, 2.2]]]))
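It can also help to check the shape of what is being passed. The sketch below uses a hypothetical helper, nested_shape (not part of TensorFlow or NumPy), to show that the literal [[[1.1, 2.2]]] from the question is three levels deep, whereas a single sample for Input(shape=(2,)) is naturally written as a two-level batch of shape (1, 2).

```python
def nested_shape(value):
    """Return the shape of a regularly nested list as a tuple (hypothetical helper)."""
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)

x_question = [[[1.1, 2.2]]]  # the literal used in the question
x_batch = [[1.1, 2.2]]       # one sample with two features

print(nested_shape(x_question))
print(nested_shape(x_batch))
```

Only the second form lines up with (batch, 2) without an extra dimension.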
On the difference between model() and model.predict()
The code you are referring to where "output is obtained using model()":
left_proba = model(obs[np.newaxis]) # <--- HERE
action = (tf.random.uniform([1, 1]) > left_proba)
y_target = tf.constant([[1.]]) - tf.cast(action, tf.float32)
loss = tf.reduce_mean(loss_fn(y_target, left_proba))
This is similar to your 2nd line of code:
output1 = tf.keras.layers.Dense(2)(input1)
How is this similar, you ask?
In your code, you create a new node in the graph of layers by calling a Dense layer on this input1 object.
The "layer call" action is like drawing an arrow from input1 to this layer you created.
You're "passing" the inputs to the dense layer, and out you get output1.
In the reference code, they treat model like a layer, and do a "layer call".
See the similarity?:
output = Dense(input)
left_proba = model(obs[...])
In turn this creates new nodes that perform other operations (in the 3 lines that follow).
This is useful when you want to take an existing model and use it as a component (or "layer") to build another, new model.
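As for model inference, y = model.predict(x) is the usual choice: it processes the input in batches and returns NumPy arrays. Calling model(x) directly also runs a forward pass, but it returns a tensor, which suits small inputs or code that needs gradients.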
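As a minimal sketch of the two call styles, assuming TensorFlow 2.x and the same two-unit Dense model as in the question (the variable names y_call and y_pred are illustrative), both calls run the same forward pass but return different types:

```python
import numpy as np
import tensorflow as tf

# Same architecture as in the question: 2 inputs -> Dense(2)
inputs = tf.keras.layers.Input(shape=(2,), dtype="float32")
outputs = tf.keras.layers.Dense(2)(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

x = np.array([[1.1, 2.2]], dtype=np.float32)  # shape (1, 2): one sample

y_call = model(x)                     # direct call: returns a tf.Tensor
y_pred = model.predict(x, verbose=0)  # batched inference: returns a NumPy array

print(type(y_call).__name__, tuple(y_call.shape))
print(type(y_pred).__name__, y_pred.shape)
print(np.allclose(y_call.numpy(), y_pred, atol=1e-5))
```

The tensor returned by model(x) can take part in further TensorFlow operations, for example inside a tf.GradientTape, while the array from model.predict(x) cannot.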

Related

Keras: create submodel (from layer "m" to layer "n") of a "full" model without using loops

I have a Keras-model (let's call it full model), which was already trained and now I would like to create a new submodel using layers m to n of the full model.
E.g. full model has 10 layers and my submodel shall comprise layers 3 to 8
For the case that m=0, the task is trivial as one can use: (assume we want to go to layer 5)
full_model = ... # anything we load from a h5-file
submodel=tf.keras.Model(inputs=full_model.inputs, outputs=full_model.layers[5].output)
# =>
submodel.summary()
tf.keras.utils.plot_model(submodel, to_file = ...)
So, we can use the submodel, get its summary and also get the png-plot of the submodel-architecture.
The concrete problem is that I don't know how to do this when the submodel does not start at the first layer, for example when taking the last layers of the model. I always get a GraphDisconnected error then.
The only workaround I found was to loop over the layers manually (as the function below, "create_submodel", does). In my case I cannot use this, because the model is quite complex: the layers are not simply stacked one after another but are nested, so the architecture plot shows many different branches in the "tree" of layers rather than a straight series of layers.
So: is there a way to create a submodel (from layer "m" to layer "n") of a "full" model without naively looping through the layers (as demonstrated in the function below)?
Thanks very much!
def create_submodel(full_model, start_layer_number=None, end_layer_number=None):
    layers = tf.keras.layers
    if start_layer_number is None:
        start_layer_number = 0
    if end_layer_number is None:
        end_layer_number = len(full_model.layers)
    inp_shape = full_model.layers[start_layer_number].input.shape[1:]
    inp = layers.Input(shape=(inp_shape))
    x = inp
    for i in range(start_layer_number, end_layer_number):
        print(i, full_model.layers[i].name)
        x = full_model.layers[i](x)
    out = x
    sub_model = tf.keras.Model(inputs=inp, outputs=out)
    sub_model.summary()
    return sub_model

Using Gradient Tape for Jacobian of LSTM model - Python

I am building a sequence to one model prediction using LSTM. My data has 4 input variables and 1 output variable which needs to be predicted. The data is a time series data. The total length of the data is 38265 (total number of timesteps). The total data is in a Data Frame of size 38265 *5
I want to use the previous 20 timesteps data of the 4 input variables to make prediction of my output variable. I am using the below code for this purpose.
model = Sequential()
model.add(LSTM(units=120, activation='relu', return_sequences=False, input_shape=(train_in.shape[1], 5)))
model.add(Dense(100,activation='relu'))
model.add(Dense(50,activation='relu'))
model.add(Dense(1))
I want to calculate the Jacobian of the output variable with respect to the inputs of the LSTM model using tf.GradientTape. Can anyone help me with this?
The Jacobian of the output with respect to the LSTM input can be computed as follows:
Using tf.GradientTape(), we can compute the Jacobian from the recorded gradient flow.
To get the Jacobian, however, the tensors involved need to be tf.EagerTensor objects, which is what you get when you call the model directly (y = model(x)). The following snippet illustrates the idea:
# Get the Jacobian for each persistent gradient evaluation
import tensorflow as tf
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(2, activation='relu'))
model.add(tf.keras.layers.Dense(2, activation='relu'))
x = tf.constant([[5., 6., 3.]])
with tf.GradientTape(persistent=True, watch_accessed_variables=True) as tape:
    # Forward pass; x is a constant, so it has to be watched explicitly
    tape.watch(x)
    y = model(x)
    loss = tf.reduce_mean(y**2)
print('Gradients\n')
jacobian_wrt_loss = tape.jacobian(loss, x)
print(f'{jacobian_wrt_loss}\n')
jacobian_wrt_y = tape.jacobian(y, x)
print(f'{jacobian_wrt_y}\n')
Intermediate outputs are a different matter. Many examples obtain them through Keras, but the tensors returned by model.layers[i].output are of type KerasTensor rather than EagerTensor.
Computing the Jacobian requires an EagerTensor. (Wrapping the code in @tf.function did not help after many attempts, since eager execution is already the default in TF 2.x.)
As an alternative, an auxiliary model can be created with only the required layers (in this case, just the Input and LSTM layers). The output of this model is a tf.EagerTensor, which can be used to compute the Jacobian, as shown in this snippet:
# General syntax for getting Jacobians for each layer output
import numpy as np
import tensorflow as tf
tf.executing_eagerly()  # returns True when eager execution is enabled
x = tf.constant([[15., 60., 32.]])
x_inp = tf.keras.layers.Input(tensor=tf.constant([[15., 60., 32.]]))
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(2, activation='relu', name='dense_1'))
model.add(tf.keras.layers.Dense(2, activation='relu', name='dense_2'))
aux_model = tf.keras.Sequential()
# Note: this Dense layer has its own weights, separate from dense_1 in `model`
aux_model.add(tf.keras.layers.Dense(2, activation='relu', name='dense_1'))
# model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
with tf.GradientTape(persistent=True, watch_accessed_variables=True) as tape:
    # Forward pass
    tape.watch(x)
    x_y = model(x)
    act_y = aux_model(x)
print(x_y, type(x_y))
ops = [layer.output for layer in model.layers]
# inps = [layer.input for layer in model.layers]
print('Jacobian of Full FFNN\n')
jacobian = tape.jacobian(x_y, x)
print(f'{jacobian[0]}\n')
print('Jacobian of FFNN with Just first Dense\n')
jacobian = tape.jacobian(act_y, x)
print(f'{jacobian[0]}\n')
Here I have used a simple FFNN consisting of 2 Dense layers, but I want to evaluate w.r.t the output of the first Dense layer. Hence I created an auxiliary model having just 1 Dense layer and determined the output of the Jacobian from it.
The details can be found here.
With help from @Abhilash Majumder, I did it this way. I am posting it here in case it helps someone in the future.
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import load_model
tf.compat.v1.enable_eager_execution()  # Enable eager execution, which is required here.
tf.executing_eagerly()  # Check whether eager execution is enabled. Should give "True"
data = pd.read_excel("FileName or Location ")
# My data is in the form of a dataframe with 127549 rows and 5 columns (127549*5)
a = data[:20]  # shape is (20,5)
b = data[50:70]  # shape is (20,5)
A = [a, b]  # making a list
A = np.array(A)  # convert into an array of size (2,20,5)
At = tf.convert_to_tensor(A, np.float32)  # convert into a tensor
At.shape  # TensorShape([Dimension(2), Dimension(20), Dimension(5)])
model = load_model('EKF-LSTM-1.h5')  # Load the trained model
# I have a trained model which is shown in the question above.
# Output of this model is a single value
with tf.GradientTape(persistent=True, watch_accessed_variables=True) as tape:
    tape.watch(At)
    y1 = model(At)  # defining your output as a function of input variables
print(y1, type(y1))
# Output
tf.Tensor([[0.04251503],[0.04634088]], shape=(2, 1), dtype=float32) <class 'tensorflow.python.framework.ops.EagerTensor'>
jacobian = tape.jacobian(y1, At)  # Jacobian of output w.r.t. both inputs
jacobian.shape
Output
TensorShape([Dimension(2), Dimension(1), Dimension(2), Dimension(20), Dimension(5)])
Here I calculated the Jacobian w.r.t. 2 inputs, each of size (20,5). If you want to calculate it w.r.t. only one input of size (20,5), then use this:
jacobian=tape.jacobian(y1,At[0]) #jacobian of output w.r.t only 1st input in 'At'
jacobian.shape
Output
TensorShape([Dimension(1), Dimension(1), Dimension(1), Dimension(20), Dimension(5)])
For those looking to compute the Jacobian over a series of inputs and outputs that are independent of each other for input[i], output[j], i != j, consider the batch_jacobian method.
This will reduce the number of dimensions in your computed Jacobian tensor by one and could be the difference between running out of memory and not.
See: batch_jacobian in the TensorFlow GradientTape docs.
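To see why the batch form loses nothing when samples are independent, here is a small pure-Python sketch. It assumes a toy stand-in, model_fn, in place of a real network, and uses central finite differences instead of TensorFlow's automatic differentiation; all names are illustrative. The full Jacobian has one block per (output sample, input sample) pair, and the blocks for different samples are all zero, so only the matching-sample blocks need to be kept.

```python
def model_fn(batch):
    """Toy stand-in for a model: each output depends only on its own row."""
    weights = [0.5, -1.0, 2.0]
    return [sum(w * v * v for w, v in zip(weights, row)) for row in batch]

def full_jacobian(fn, batch, eps=1e-6):
    """Central-difference Jacobian with jac[i][j][k] = d out[i] / d batch[j][k]."""
    n_out = len(fn(batch))
    jac = [[[0.0] * len(row) for row in batch] for _ in range(n_out)]
    for j, row in enumerate(batch):
        for k in range(len(row)):
            plus = [list(r) for r in batch]
            minus = [list(r) for r in batch]
            plus[j][k] += eps
            minus[j][k] -= eps
            out_plus, out_minus = fn(plus), fn(minus)
            for i in range(n_out):
                jac[i][j][k] = (out_plus[i] - out_minus[i]) / (2 * eps)
    return jac

batch = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
jac = full_jacobian(model_fn, batch)

# Blocks that pair different samples (i != j) vanish for independent samples.
off_diagonal = [jac[i][j][k]
                for i in range(2) for j in range(2) if i != j
                for k in range(3)]
print(all(v == 0.0 for v in off_diagonal))

# A batch Jacobian keeps only the i == j blocks: one fewer dimension.
batch_jac = [jac[i][i] for i in range(len(batch))]
print(batch_jac)
```

Here the full result holds 2 x 2 x 3 values while the batch form holds 2 x 3, and the saving grows with the batch size.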

How to print out the tensor values of a specific layer

I want to examine the values of a tensor after a mask is applied to it.
Here is a truncated part of the model. I set temp = x so that I can later print temp and check the exact values.
The model is a 4-class classifier using acoustic features. Assume the data has shape (1000,50,136), i.e. (batch, timesteps, features).
The objective is to check whether the model processes the features one timestep at a time. In other words, we want to confirm that the model learns from slices like the red rectangle in the picture. Logically, that is how a Keras LSTM layer works, but the confusion matrix changes considerably when a parameter changes (e.g. the number of Dense units). The validation accuracy stays at 45%, so we would like to visualize the model.
The proposed idea is to print the first timestep of the first batch and compare it with the input as seen inside the model. If they are the same, then the model is learning the right way (a (136,1) slice of features at once) rather than from a (50,1) slice of timesteps of a single feature at once.
input_feature = Input(shape=(X_train.shape[1],X_train.shape[2]))
x = Masking(mask_value=0)(input_feature)
temp = x
x = Dense(Dense_unit,kernel_regularizer=l2(dense_reg), activation='relu')(x)
I tried tf.print(), which raised AttributeError: 'Tensor' object has no attribute '_datatype_enum'
I also tried the approach from "Get output from a non final keras model layer", as suggested by Lescurel:
model2 = Model(inputs=[input_attention, input_feature], outputs=model.get_layer('masking')).output
print(model2.predict(X_test))
AttributeError: 'Masking' object has no attribute 'op'
You want the output after the mask.
Lescurel's link in the comments shows how to do that, and so does this link to GitHub.
You need to build a new model that takes your model's input as its input and the output of the masking layer as its output.
I tested it with some made-up code derived from your snippets.
import numpy as np
from keras import Input
from keras.layers import Masking, Dense
from keras.regularizers import l2
from keras.models import Sequential, Model
X_train = np.random.rand(4,3,2)
Dense_unit = 1
dense_reg = 0.01
mdl = Sequential()
mdl.add(Input(shape=(X_train.shape[1],X_train.shape[2]),name='input_feature'))
mdl.add(Masking(mask_value=0,name='masking'))
mdl.add(Dense(Dense_unit,kernel_regularizer=l2(dense_reg),activation='relu',name='output_feature'))
mdl.summary()
mdl2mask = Model(inputs=mdl.input,outputs=mdl.get_layer("masking").output)
maskoutput = mdl2mask.predict(X_train)
mdloutput = mdl.predict(X_train)
maskoutput # print output after/of masking
mdloutput # print output of mdl
maskoutput.shape #(4, 3, 2): masking has the shape of the layer before (input here)
mdloutput.shape #(4, 3, 1): shape of the output of dense
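When reading the printed values, it helps to know which timesteps the Masking layer treats as masked: according to the Keras documentation, a timestep is masked only when all of its features equal mask_value. The pure-Python sketch below mirrors that rule with a hypothetical helper, timestep_mask, which is not a Keras function.

```python
def timestep_mask(sample, mask_value=0.0):
    """Return True for timesteps that are kept and False for masked ones."""
    return [any(v != mask_value for v in step) for step in sample]

# One sample with 3 timesteps and 2 features (made-up values)
sample = [
    [0.3, 0.0],  # kept: one feature differs from mask_value
    [0.0, 0.0],  # masked: every feature equals mask_value
    [0.7, 0.2],  # kept
]
mask = timestep_mask(sample)
print(mask)
```

With random data such as np.random.rand(4,3,2), an all-zero timestep is very unlikely, so to see the mask take effect you would need to zero out a timestep in X_train by hand.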

Keras : Gradients of output w.r.t. input as input to classifier

I am doing research, and for an experiment I want to use the gradients of a specific layer in the network with respect to the network's input (similar to guided backprop) as input to another network (a classifier). The goal is to 'force' the network to change its 'attention' according to the classifier, so the two networks should be trained simultaneously.
I implemented it this way:
input_tensor = model.input
output_tensor = model.layers[-2].output
grad_calc = keras.layers.Lambda(lambda x:K.gradients(x,input_tensor)[0],output_shape=(256,256,3),trainable=False)(output_tensor)
pred = classifier(grad_calc)
out_model = Model(input_tensor,pred)
out_model.compile(loss='mse',optimizer=keras.optimizers.Adam(0.0001),metrics=['accuracy'])
Then, when I try to train the model
out_model.train_on_batch(imgs,np.zeros((imgs.shape[0],2)))
it does not work. It seems to get stuck: nothing happens (no error or other message).
I am not sure this is the right way to implement it, so I would be very thankful if someone with more experience could take a look and give me advice.
If I were trying to achieve that, I would switch to plain TensorFlow and do something along these lines:
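# Build the model (TensorFlow 1.x graph-mode sketch)
input = tf.placeholder(tf.float32, shape=[None, 256, 256, 3])  # shape assumed from the question's output_shape
net = tf.layers.conv2d(input, 12, kernel_size=3)  # kernel_size is an arbitrary choice for illustration
loss = tf.nn.l2_loss(net)
step = tf.train.AdamOptimizer().minimize(loss)
# Now inspect your graph and select the gradient tensor you are looking for
for op in tf.get_default_graph().get_operations():
    print(op.name)
grad = tf.get_default_graph().get_operation_by_name("enqueue")  # "enqueue" is a stand-in; use a name printed above
with tf.Session() as sess:
    _, grad, input = sess.run([step, grad, input], ...)
# Feed your grad and input into another network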

TF Graph does not correspond to the code

I am trying to create a very simple neural network reading in information with the shape 1x2048 and to create a classification for two categories (object or not object). The graph structure however, deviates from what I believe to have coded. The dense layers should be included in the scope of "inner_layer" and should be receiving their input from the "input" placeholder. Instead, TF seems to be treating them as independent layers which do not receive any information from "input".
Also, when trying to use TensorBoard summaries, I get an error telling me that I have not fed inputs to the apparent placeholders of the dense layers. When I omit TensorBoard, everything works as I expect based on the code.
I have spent a lot of time trying to find the problem, but I think I must be overlooking something very basic.
The graph I get in tensorboard is on this image.
Which corresponds to the following code:
tf.reset_default_graph()
keep_prob = 0.5
# Graph Structure
## Placeholders for input
with tf.name_scope('input'):
    x_ = tf.placeholder(tf.float32, shape=[None, transfer_values_train.shape[1]], name="input1")
    y_ = tf.placeholder(tf.float32, shape=[None, num_classes], name="labels")
## Dense Layer one with 2048 nodes
with tf.name_scope('inner_layers'):
    first_layer = tf.layers.dense(x_, units=2048, activation=tf.nn.relu, name="first_dense")
    dropout_layer = tf.nn.dropout(first_layer, keep_prob, name="dropout_layer")
    # readout layer, without softmax
    y_conv = tf.layers.dense(dropout_layer, units=2, activation=tf.nn.relu, name="second_dense")
# Evaluation and training
with tf.name_scope('cross_entropy'):
    cross_entropy = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=y_conv),
        name="cross_entropy_layer")
with tf.name_scope('trainer'):
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
with tf.name_scope('accuracy'):
    prediction = tf.argmax(y_conv, axis=1)
    correct_prediction = tf.equal(prediction, tf.argmax(y_, axis=1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
Does anyone have an idea why the graph is so different from what you would expect based on the code?
The graph rendering in tensorboard may be a bit confusing (initially), but it's correct. Take a look at this picture where I've left only the inner_layers part of your graph:
You may notice that:
The first_dense and second_dense are actually the name scopes themselves (generated by tf.layers.dense function; see also this question).
Their input/output tensors are inside the inner_layers scope and wire correctly to the dropout_layer. Here, in each of dense layers, live the corresponding linear ops: MatMul, BiasAdd, Relu.
Both scopes also include the variables (a kernel and a bias each), which are shown separately from inner_layers. They encapsulate the ops related specifically to the variables, such as read, assign, initialize, etc. The linear ops in first_dense depend on the variable ops of first_dense, and likewise for second_dense.
The reason for this separation is that in distributed settings the variables are managed by a different task called a parameter server. It usually runs on a different device (CPU as opposed to GPU), sometimes even on a different machine. In other words, in TensorFlow variable management is, by design, separate from matrix computation.
Having said that, I'd love to see a mode in tensorflow that would not split the scope into variables and ops and keep them coupled.
Other than this the graph perfectly matches the code.
