Subset model outputs in custom loss function in tensorflow/keras

I am interested in using a neural network to estimate the parameters of a logistic regression. To do this, I am creating a network that makes a two-parameter prediction, and I am trying to write a custom loss function that measures how well the two parameters perform as the slope and intercept of a logistic regression model, using a third dataset as the predictor in that regression.
I have a matrix of predictors X, with dimensions 10,000 by 20, and a binary outcome variable y. Additionally, I have a 10,000-observation linear_predictor that I want to use in the custom loss function to evaluate the two outputs of the model.
import numpy as np
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Dense
import tensorflow as tf
# create some dummy data
X = np.random.rand(10_000, 20)
y = (np.random.rand(10_000) > 0.8).astype(int)
linear_predictor = np.random.rand(10_000)
# define custom loss function
def CustomLoss(y_true, y_pred, input_):
    # attempt 1: numpy-like indexing
    y_estim = y_pred[:,0]*input_ + y_pred[:,1]
    # attempt 2: tf.gather
    y_estim = tf.gather(y_pred, 0, axis=1)*input_ + tf.gather(y_pred, 1, axis=1)
    return tf.keras.losses.BinaryCrossentropy(from_logits=True)(y_true, y_estim)
# create inputs to model
lp_input = Input(shape=linear_predictor.shape)
X_input = Input(shape=X.shape)
y_input = Input(shape=y.shape)
# create network
hidden1 = Dense(32, activation='relu')(X_input)
hidden2 = Dense(8, activation='relu')(hidden1)
output = Dense(2, activation='linear')(hidden2)
model = Model([y_input, X_input, lp_input], output)
# add loss function
model.add_loss(CustomLoss(y_input, output, lp_input))
# fit model
model.fit(x=X_input, y=y_input, epochs=3)
However, I am unable to get the CustomLoss function to work. Something goes wrong when I subset the model's two-parameter output to use one parameter as the slope and the other as the intercept.
The error I am getting is:
ValueError: Exception encountered when calling layer "tf.math.multiply_1" (type TFOpLambda).
Dimensions must be equal, but are 2 and 10000 for '{{node tf.math.multiply_1/Mul}} = Mul[T=DT_FLOAT](Placeholder, Placeholder_1)' with input shapes: [?,2], [?,10000].
Call arguments received by layer "tf.math.multiply_1" (type TFOpLambda):
• x=tf.Tensor(shape=(None, 2), dtype=float32)
• y=tf.Tensor(shape=(None, 10000), dtype=float32)
• name=None
This suggests that the variable y_pred is not being subset, even though I have tried the numpy-like indexing recommended here (y_pred[:, 1]) as well as the gather_nd method here, among others.
I think this should be possible; any help is appreciated.
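One likely cause, offered as a sketch rather than a definitive fix: Input(shape=...) describes a single sample without the batch dimension, so Input(shape=linear_predictor.shape) declares 10,000 features per sample, which matches the [?,10000] in the error. Separately, model.fit expects the NumPy arrays, not the Input placeholders. The version below sidesteps the extra inputs by packing y and linear_predictor into a two-column target and unpacking them inside an ordinary Keras loss. The names targets and slope_intercept_loss are made up for this illustration, the data is shrunk to 1,000 rows, and tf.nn.sigmoid_cross_entropy_with_logits stands in for the BinaryCrossentropy(from_logits=True) object used in the question.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Dense

# Dummy data, smaller than in the question so the sketch runs quickly
rng = np.random.default_rng(0)
X = rng.random((1000, 20)).astype("float32")
y = (rng.random(1000) > 0.8).astype("float32")
linear_predictor = rng.random(1000).astype("float32")

# Assumption: pack label and predictor into one target array of shape (n, 2)
targets = np.stack([y, linear_predictor], axis=1)

def slope_intercept_loss(y_true, y_pred):
    """Per-sample logistic loss with y_pred[:, 0] as slope and y_pred[:, 1] as intercept."""
    y_true = tf.cast(y_true, y_pred.dtype)
    labels = y_true[:, 0]
    predictor = y_true[:, 1]
    logits = y_pred[:, 0] * predictor + y_pred[:, 1]
    return tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)

# shape=(20,) is the shape of ONE row of X; the batch dimension is implicit
X_input = Input(shape=(20,))
hidden1 = Dense(32, activation="relu")(X_input)
hidden2 = Dense(8, activation="relu")(hidden1)
output = Dense(2, activation="linear")(hidden2)

model = Model(X_input, output)
model.compile(optimizer="adam", loss=slope_intercept_loss)

# fit receives the arrays themselves
history = model.fit(X, targets, epochs=1, batch_size=100, verbose=0)
print(history.history["loss"])
```

Inside the loss, every tensor shares the same leading batch dimension, so y_pred[:, 0] and the predictor column multiply element-wise without a shape mismatch.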

Related

Using Gradient Tape for Jacobian of LSTM model - Python

I am building a sequence-to-one prediction model using an LSTM. My data has 4 input variables and 1 output variable that needs to be predicted. It is time series data with 38265 timesteps in total, stored in a DataFrame of size 38265 x 5.
I want to use the previous 20 timesteps of the 4 input variables to predict my output variable. I am using the code below for this purpose.
model = Sequential()
model.add(LSTM(units = 120, activation ='relu', return_sequences = False, input_shape = (train_in.shape[1],5)))
model.add(Dense(100,activation='relu'))
model.add(Dense(50,activation='relu'))
model.add(Dense(1))
I want to calculate the Jacobian of the output variable w.r.t. the LSTM model function using tf.GradientTape. Can anyone help me out with this?

The Jacobian of the output with respect to the LSTM input can be obtained as follows:
Using tf.GradientTape(), we can compute the Jacobian from the recorded gradient flow.
However, to get the Jacobian, the input needs to be a tf.EagerTensor, which is what we have when the output is computed by calling the model directly (y = model(x)). The following code snippet shows the idea:
# Get the Jacobian for each persistent gradient evaluation
import tensorflow as tf
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(2, activation='relu'))
model.add(tf.keras.layers.Dense(2, activation='relu'))
x = tf.constant([[5., 6., 3.]])
with tf.GradientTape(persistent=True, watch_accessed_variables=True) as tape:
    # Forward pass
    tape.watch(x)
    y = model(x)
    loss = tf.reduce_mean(y**2)
print('Gradients\n')
jacobian_wrt_loss = tape.jacobian(loss, x)
print(f'{jacobian_wrt_loss}\n')
jacobian_wrt_y = tape.jacobian(y, x)
print(f'{jacobian_wrt_y}\n')
But for intermediate outputs, as in this case, many samples rely on Keras symbolic tensors. When we take the outputs from model.layers (layer.output), the type is a KerasTensor instead of an EagerTensor.
However, to create the Jacobian we need an EagerTensor. (This is after many failed attempts with @tf.function wrapping, since eager execution is already enabled by default in TF 2.x.)
So, alternatively, an auxiliary model can be created with only the layers required (in this case, just the Input and LSTM layers). The output of this model will be a tf.EagerTensor, which can be used to compute the Jacobian. This is shown in the following snippet:
# General syntax for getting Jacobians for each layer output
import numpy as np
import tensorflow as tf
tf.executing_eagerly()
x = tf.constant([[15., 60., 32.]])
x_inp = tf.keras.layers.Input(tensor=tf.constant([[15., 60., 32.]]))
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(2, activation='relu', name='dense_1'))
model.add(tf.keras.layers.Dense(2, activation='relu', name='dense_2'))
# Auxiliary model with only the first Dense layer
# (note: this layer has its own, separately initialized weights)
aux_model = tf.keras.Sequential()
aux_model.add(tf.keras.layers.Dense(2, activation='relu', name='dense_1'))
# model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
with tf.GradientTape(persistent=True, watch_accessed_variables=True) as tape:
    # Forward pass
    tape.watch(x)
    x_y = model(x)
    act_y = aux_model(x)
print(x_y, type(x_y))
ops = [layer.output for layer in model.layers]
# inps = [layer.input for layer in model.layers]
print('Jacobian of Full FFNN\n')
jacobian = tape.jacobian(x_y, x)
print(f'{jacobian[0]}\n')
print('Jacobian of FFNN with Just first Dense\n')
jacobian = tape.jacobian(act_y, x)
print(f'{jacobian[0]}\n')
Here I have used a simple FFNN consisting of 2 Dense layers, but I want the Jacobian of the output of the first Dense layer. Hence I created an auxiliary model with just 1 Dense layer and computed the Jacobian from its output.
The details can be found here.
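One caveat with a separately built auxiliary model is that its Dense layer gets its own weights, so its Jacobian is not that of the first layer of the full model. A minimal sketch of one way around this, assuming the functional API is acceptable: build both models from the same layer objects so that they share dense_1. The name sub_model is made up for this illustration.

```python
import tensorflow as tf

# Build the full model with the functional API so intermediate tensors are accessible
inputs = tf.keras.Input(shape=(3,))
hidden = tf.keras.layers.Dense(2, activation="relu", name="dense_1")(inputs)
outputs = tf.keras.layers.Dense(2, activation="relu", name="dense_2")(hidden)
model = tf.keras.Model(inputs, outputs)

# Sub-model ending at the first Dense layer; it reuses the SAME dense_1 layer object
sub_model = tf.keras.Model(inputs, hidden)

x = tf.constant([[15., 60., 32.]])
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    y_full = model(x)        # eager tensor, shape (1, 2)
    y_hidden = sub_model(x)  # eager tensor, shape (1, 2)

jac_full = tape.jacobian(y_full, x)      # shape: y shape + x shape = (1, 2, 1, 3)
jac_hidden = tape.jacobian(y_hidden, x)  # same shape, for the first layer only
print(jac_full.shape, jac_hidden.shape)
```

Because sub_model is assembled from tensors of the full model, training model also updates the weights that sub_model uses.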

With help from @Abhilash Majumder, I have done it this way. I am posting it here so that it might help someone in the future.
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import load_model
tf.compat.v1.enable_eager_execution()  # enable eager execution, which is a must
tf.executing_eagerly()  # check whether eager execution is enabled; should give True
data = pd.read_excel("FileName or Location ")
# My data is in the form of a DataFrame with 127549 rows and 5 columns (127549 x 5)
a = data[:20]  # shape is (20,5)
b = data[50:70]  # shape is (20,5)
A = [a, b]  # making a list
A = np.array(A)  # convert into an array of size (2,20,5)
At = tf.convert_to_tensor(A, np.float32)  # convert into a tensor
At.shape  # TensorShape([Dimension(2), Dimension(20), Dimension(5)])
model = load_model('EKF-LSTM-1.h5')  # load the trained model
# I have a trained model, which is shown in the question above.
# The output of this model is a single value.
with tf.GradientTape(persistent=True, watch_accessed_variables=True) as tape:
    tape.watch(At)
    y1 = model(At)  # defining your output as a function of the input variables
print(y1, type(y1))
# Output:
# tf.Tensor([[0.04251503],[0.04634088]], shape=(2, 1), dtype=float32) <class 'tensorflow.python.framework.ops.EagerTensor'>
jacobian = tape.jacobian(y1, At)  # Jacobian of the output w.r.t. both inputs
jacobian.shape
Output
TensorShape([Dimension(2), Dimension(1), Dimension(2), Dimension(20), Dimension(5)])
Here I calculated the Jacobian w.r.t. 2 inputs, each of size (20,5). If you want to calculate it w.r.t. only one input of size (20,5), then use this:
jacobian = tape.jacobian(y1, At[0])  # Jacobian of the output w.r.t. only the 1st input in 'At'
jacobian.shape
Output
TensorShape([Dimension(1), Dimension(1), Dimension(1), Dimension(20), Dimension(5)])
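A hedged note on the single-input case: as far as I can tell, in TensorFlow 2.x a slice such as At[0] taken after the forward pass is a new tensor that y1 was not computed from, so tape.jacobian(y1, At[0]) may come back as None. A minimal sketch of an alternative is to pass a one-sample batch through the tape, which gives the five-dimensional shape quoted above. The Flatten plus Dense model here is only a hypothetical stand-in for the trained model loaded from the file, and the data is random.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for the trained model (the real one is loaded from a file)
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),
])

# Two windows of 20 timesteps x 5 columns, random values for illustration
At = tf.constant(np.random.rand(2, 20, 5), dtype=tf.float32)

# Keep the batch dimension when selecting the first window: shape (1, 20, 5)
x0 = At[0:1]

with tf.GradientTape() as tape:
    tape.watch(x0)
    y0 = model(x0)  # shape (1, 1)

# Jacobian shape is the output shape followed by the input shape
jacobian = tape.jacobian(y0, x0)
print(jacobian.shape)  # (1, 1, 1, 20, 5)
```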

For those looking to compute the Jacobian over a series of inputs and outputs that are independent of each other for input[i], output[j], i != j, consider the batch_jacobian method.
This will reduce the number of dimensions in your computed Jacobian tensor by one and could be the difference between running out of memory and not.
See: batch_jacobian in the TensorFlow GradientTape docs.
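The difference can be sketched as follows, assuming a small Flatten plus Dense model as a stand-in for the LSTM and random inputs shaped like the two windows used earlier. tape.jacobian returns cross-sample blocks that are all zero when samples are independent, while tape.batch_jacobian keeps only the per-sample blocks.

```python
import numpy as np
import tensorflow as tf

# Stand-in model (an assumption for illustration, not the trained LSTM)
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),
])

x = tf.random.normal((2, 20, 5))  # batch of 2 independent samples

# persistent=True because the tape is queried twice below
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    y = model(x)  # shape (2, 1)

full = tape.jacobian(y, x)              # shape (2, 1, 2, 20, 5)
per_sample = tape.batch_jacobian(y, x)  # shape (2, 1, 20, 5)
print(full.shape, per_sample.shape)

# The diagonal blocks of the full Jacobian match the batch Jacobian,
# and the cross-sample blocks are zero
same = np.allclose(full[0, :, 0].numpy(), per_sample[0].numpy())
cross_zero = np.allclose(full[0, :, 1].numpy(), 0.0)
print(same, cross_zero)
```

With a batch of B samples the full Jacobian stores B times as many entries as the batch Jacobian, which is where the memory saving comes from.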

Calculate Jacobian Matrix of LSTM Model - Python

I have a trained LSTM model with 1 LSTM layer and 3 Dense layers. I am using it for sequence-to-one prediction. I have 4 input variables and 1 output variable. I am using the values of the last 20 timesteps to predict the next value of my output variable. The architecture of the model is shown below:
model = Sequential()
model.add(LSTM(units = 120, activation ='relu', return_sequences = False, input_shape = (train_in.shape[1],5)))
model.add(Dense(100,activation='relu'))
model.add(Dense(50,activation='relu'))
model.add(Dense(1))
The shapes of training input and training output are as shown below
train_in.shape , train_out.shape
((89264, 20, 5), (89264,))
I want to calculate the jacobian matrix for this model.
Say, Y = f(x1,x2,x3,x4) is the representation of the above neural network where:
Y -- Output variable of the trained model, f -- Is the function representing the Model; x1,x2,x3,x4 --input parameters.
How can I calculate the Jacobian matrix? Please share your thoughts on this, along with any valuable references you know of.
Thank you :)

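You might want to take a look at tf.GradientTape in TensorFlow. GradientTape is a very simple way to auto-differentiate your computation, and the link has some basic examples.
However, your model is already quite big. The Jacobian of m outputs with respect to n parameters has m*n values (an n*n count applies to second derivatives, i.e. the Hessian), and I believe your model probably already has more than 10000 parameters. If you differentiate with respect to the weights, you might need to make the model smaller; differentiating with respect to the inputs is much cheaper.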
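To put rough numbers on the size question, here is a small sketch that rebuilds the architecture from the question (assuming 20 timesteps and 5 input columns, as in train_in.shape) and counts entries. For m outputs, the Jacobian with respect to n quantities has m*n entries, so the variable names jac_wrt_input and jac_wrt_params below are just illustrative counts, not library calls.

```python
from tensorflow.keras import Sequential, Input
from tensorflow.keras.layers import LSTM, Dense

timesteps, features = 20, 5  # assumed from train_in.shape == (89264, 20, 5)

model = Sequential([
    Input(shape=(timesteps, features)),
    LSTM(units=120, activation="relu", return_sequences=False),
    Dense(100, activation="relu"),
    Dense(50, activation="relu"),
    Dense(1),
])

n_params = model.count_params()     # trainable + non-trainable weights
n_outputs = model.output_shape[-1]  # 1 output value per sample

# Entries in the Jacobian for one sample
jac_wrt_input = n_outputs * timesteps * features  # output w.r.t. the input window
jac_wrt_params = n_outputs * n_params             # output w.r.t. all weights

print(n_params, jac_wrt_input, jac_wrt_params)  # 77681 100 77681
```

So the Jacobian of the single output with respect to one input window is small, while the one with respect to the weights scales with the parameter count.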

I found a way to get the Jacobian matrix of the LSTM model output with respect to the input. I am posting it here so that it might help someone in the future. Please share if there is a better or simpler way to do the same.
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import load_model
tf.compat.v1.enable_eager_execution()  # enable eager execution, which is a must
tf.executing_eagerly()  # check whether eager execution is enabled; should give True
data = pd.read_excel("FileName or Location ")
# My data is in the form of a DataFrame with 127549 rows and 5 columns (127549 x 5)
a = data[:20]  # shape is (20,5)
b = data[50:70]  # shape is (20,5)
A = [a, b]  # making a list
A = np.array(A)  # convert into an array of size (2,20,5)
At = tf.convert_to_tensor(A, np.float32)  # convert into a tensor
At.shape  # TensorShape([Dimension(2), Dimension(20), Dimension(5)])
model = load_model('EKF-LSTM-1.h5')  # load the trained model
# I have a trained model, which is shown in the question above.
# The output of this model is a single value.
with tf.GradientTape(persistent=True, watch_accessed_variables=True) as tape:
    tape.watch(At)
    y1 = model(At)  # defining your output as a function of the input variables
print(y1, type(y1))
# Output:
# tf.Tensor([[0.04251503],[0.04634088]], shape=(2, 1), dtype=float32) <class 'tensorflow.python.framework.ops.EagerTensor'>
jacobian = tape.jacobian(y1, At)  # Jacobian of the output w.r.t. both inputs
jacobian.shape
Output
TensorShape([Dimension(2), Dimension(1), Dimension(2), Dimension(20), Dimension(5)])
Here I calculated the Jacobian w.r.t. 2 inputs, each of size (20,5). If you want to calculate it w.r.t. only one input of size (20,5), then use this:
jacobian = tape.jacobian(y1, At[0])  # Jacobian of the output w.r.t. only the 1st input in 'At'
jacobian.shape
Output
TensorShape([Dimension(1), Dimension(1), Dimension(1), Dimension(20), Dimension(5)])

Why is this ML model giving me zero accuracy?

I am trying to train a network on the Swiss Roll dataset with three features X = [x1, x2, x3] for the classification task. There are four classes with labels 1, 2, 3, 4, and the vector y contains the labels for all the data.
A row in the X matrix looks like this:
-5.2146470e+00 7.0879738e+00 6.7292474e+00
The shape of X is (100, 3), and the shape of y is (100,).
I want to use Radial Basis Functions to train this model. I have used the custom RBFLayer from this StackOverflow answer (also see this explanation) to build the RBFLayer. I want to use a couple of Keras Dense layers to build the network for classification.
What I have tried so far
I have used a Dense layer for the first layer, followed by the custom RBFLayer, and two other Dense layers. Here's the code:
model = Sequential()
model.add(Dense(100, input_dim=3))
# number of units = 10, gamma = 0.05
model.add(RBFLayer(10,0.05))
model.add(Dense(15, activation='relu'))
model.add(Dense(1, activation='softmax'))
This model gives me zero accuracy. I think there is something wrong with the model architecture, but I can't figure out what the issue is.
Also, I thought the number of units in the last Dense layer should match the number of classes, which is 4 in this case. But when I set the number of units to 4 in the last layer, I get the following error:
ValueError: Shapes (None, 1) and (None, 4) are incompatible
Can you help me with this model architecture?

I faced the same issue while practicing multi-class classification, where I had 7 features and the model classified into 7 classes. Encoding the labels fixed the issue.
First, import the LabelEncoder class from sklearn and to_categorical from tensorflow:
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical
Then create a LabelEncoder object and transform your labels before fitting and training the model.
encoder = LabelEncoder()
encoder.fit(y)
y = encoder.transform(y)
y = to_categorical(y)
Note that you have to use np.argmax to get the actual predicted class. In my case, the prediction is stored in a variable called res:
res = np.argmax(res, axis=1)  # index of the most probable class in each row
After this line you have the predicted class index for each sample. Hope this solves your problem.
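Putting the pieces together, here is a minimal sketch under a few assumptions: random data in place of the Swiss Roll set, a plain Dense hidden layer in place of the custom RBFLayer, and np.unique used as a NumPy-only substitute for LabelEncoder. It also shows why the original output layer could not work: a softmax over a single unit always outputs 1.0, so Dense(1, activation='softmax') can never separate four classes.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.utils import to_categorical

# Stand-in data: 100 samples, 3 features, labels 1..4 (25 of each)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)).astype("float32")
y = np.tile([1, 2, 3, 4], 25)

# Map labels 1..4 to indices 0..3, then one-hot encode: shape (100, 4)
classes, y_idx = np.unique(y, return_inverse=True)
y_onehot = to_categorical(y_idx, num_classes=len(classes))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(15, activation="relu"),
    # one output unit per class
    tf.keras.layers.Dense(len(classes), activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y_onehot, epochs=2, verbose=0)

probs = model.predict(X, verbose=0)              # shape (100, 4), rows sum to 1
pred_labels = classes[np.argmax(probs, axis=1)]  # back to the original labels 1..4
print(probs.shape, pred_labels[:5])
```

With one-hot targets of shape (None, 4) and a 4-unit output, the "Shapes (None, 1) and (None, 4) are incompatible" error no longer applies.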

There are four classes with labels 1, 2, 3, 4, and the vector y contains the labels for all the data.
The simplest way to check that inputs and outputs match is to print the shapes of the inputs and outputs for a single batch and compare them with what the model expects.
The RBF layer should not be the problem, because the output is taken from the last Dense layer rather than from the RBF layer.
For a classification problem, the number of units in the last layer must equal the number of classes; for regression, the last layer usually has a single unit.
You should print (pseudocode):
print(input.shape)
and compare it with
print(model.input_shape)
then, at the output,
print(output.shape)
and compare it with
print(model.predict(input).shape)
You can find the exact syntax in the Keras docs; the lines above are approximate.

keras model.evaluate() does not show loss

I've created a neural network of the following form in keras:
from keras.layers import Dense, Activation, Input
from keras import Model
input_dim_v = 3
hidden_dims=[100, 100, 100]
inputs = Input(shape=(input_dim_v,))
net = inputs
for h_dim in hidden_dims:
    net = Dense(h_dim)(net)
    net = Activation("elu")(net)
outputs = Dense(self.output_dim_v)(net)
model_v = Model(inputs=inputs, outputs=outputs)
model_v.compile(optimizer='adam', loss='mean_squared_error', metrics=['mse'])
Later, I train it on single examples using model_v.train_on_batch(X[i],y[i]).
To test whether the neural network is becoming a better function approximator, I wanted to periodically evaluate the model on the accumulated X and y (in my case, X and y grow over time). However, when I call model_v.evaluate(X, y), only the characteristic progress bars appear in the console; neither the loss value nor the mse metric (which are the same in this case) is printed.
How can I change that?

The loss and metric values are not shown in the progress bar of the evaluate() method. Instead, they are returned as the output of evaluate(), so you can print them:
for i in range(n_iter):
    # ... get the i-th batch or sample
    # ... train the model using the `train_on_batch` method
    # evaluate the model on the whole or part of the test data
    loss_metric = model.evaluate(test_data, test_labels)
    print(loss_metric)
According to the documentation, if your model has multiple outputs and/or metrics, you can use the model.metrics_names attribute to find out what the values in loss_metric correspond to.
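A small self-contained sketch of that pattern, assuming random data and a much smaller network than in the question. It trains on single examples with train_on_batch and prints the values returned by evaluate; the return_dict=True argument (available in recent TensorFlow 2.x releases) labels the values by name.

```python
import numpy as np
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Dense

# Random stand-in data: 16 samples, 3 inputs, 1 target
rng = np.random.default_rng(0)
X = rng.random((16, 3)).astype("float32")
y = rng.random((16, 1)).astype("float32")

inputs = Input(shape=(3,))
net = Dense(8, activation="elu")(inputs)
outputs = Dense(1)(net)
model_v = Model(inputs=inputs, outputs=outputs)
model_v.compile(optimizer="adam", loss="mean_squared_error", metrics=["mse"])

for i in range(len(X)):
    # train on one example, keeping the batch dimension
    model_v.train_on_batch(X[i:i + 1], y[i:i + 1])
    if (i + 1) % 8 == 0:
        # evaluate on everything seen so far; the values are returned, not printed
        loss, mse = model_v.evaluate(X[:i + 1], y[:i + 1], verbose=0)
        print(f"after {i + 1} samples: loss={loss:.4f}, mse={mse:.4f}")

# Same evaluation, with the values keyed by name
results = model_v.evaluate(X, y, verbose=0, return_dict=True)
print(results)
```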

"Invalid shape for y" for Keras LSTM w/ return_sequences=True (and sklearn API)

I have a sequence I am trying to classify, using a Keras LSTM with return_sequences=True. I have 'data' and 'labels' datasets both of which are the same shape - 2D matrices with rows by location and columns by time interval (cell values are my 'signal' feature). So an RNN w/ return_sequences=True seems like an intuitive approach.
After reshaping my data (X) and labels (Y) to 3D tensors of shape (rows, cols, 1), I call model.fit(X, Y) but get the following error:
ValueError('Invalid shape for y')
It points me to the code for class KerasClassifier()'s fit method which checks that len(y.shape)==2.
Ok so maybe I was supposed to reshape my 2D 'X' to a 3D Tensor of shape (rows, cols, 1) but leave my labels as 2D for sklearn interface? But then when I try that I get another Keras error:
ValueError: Error when checking model target: expected lstm_17 to have
3 dimensions, but got array with shape (500, 2880)
...So how does one fit a Sklearn-style Keras RNN to return sequences? Different parts of Keras seem to demand that my target be both 2D and 3D. Or (more likely) I'm misunderstanding something.
...
Here's a reproducible code example:
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM
from keras.wrappers.scikit_learn import KerasClassifier
# Raw Data/Targets
X = np.array([1,2,3,4,5,6,7,8,9,10,11,12]).reshape(3,4)
Y = np.array([1,0,1,1,0,1,0,1,0,1,0,1]).reshape(3,4)
# Convert X to 3D tensor per Keras doc for recurrent layers
X = X.reshape(X.shape[0], X.shape[1], 1)
# .fit() at bottom will throw an error whether or not this line is used to reshape Y
Y = Y.reshape(Y.shape[0], Y.shape[1], 1)
# Define function to return compiled Keras Model (to pass to Sklearn API)
def keras_rnn(timesteps, num_features):
    '''Function to return compiled Keras Classifier to pass to sklearn wrapper'''
    model = Sequential()
    model.add(LSTM(8, return_sequences=True, input_shape=(timesteps, num_features)))
    model.add(LSTM(1, return_sequences=True, activation = 'sigmoid'))
    model.compile(optimizer = 'RMSprop', loss = 'categorical_crossentropy')
    return model
# Convert compiled Keras model to Scikit-learn-style classifier (compatible w/ sklearn model-tuning methods)
rnn_sklearn = KerasClassifier(build_fn=keras_rnn,
                              timesteps=4,
                              num_features=1)
# Fit RNN Model to Data, Target
rnn_sklearn.fit(X, Y)
ValueError: Invalid shape for y

I think this is a feature of the KerasClassifier class. I ran into the same problem when I was using the class on a multi-step, multi-feature LSTM. If I built the model through Keras and ran the fit() method after compile(), the model trained normally with no errors. However, when I had the model created in a function and called that function through KerasClassifier, then I ran into the error you have. Looking at the KerasClassifier class in the keras module (search for wrappers/scikit_learn.py), I found that 'y' had to be a specific shape or the function would raise an exception. This shape was a 2D 'y' tensor (n_samples, n_outputs) or a 1D 'y' tensor (n_samples), which was incompatible with what I needed. So I'm just going to use the model's fit() method instead of using the wrapper. Hope this helps.
BTW, my Keras version is 2.2.4 and TensorFlow is 1.15.0. This may not be applicable in newer versions.
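Fitting the model directly, without the scikit-learn wrapper, can be sketched as below. This uses the toy data from the question with a 3D target, assumes the tensorflow.keras namespace rather than standalone keras, and uses binary_crossentropy because each timestep has a single 0/1 label.

```python
import numpy as np
from tensorflow.keras import Sequential, Input
from tensorflow.keras.layers import LSTM

# Toy data from the question, reshaped to (samples, timesteps, features)
X = np.arange(1, 13, dtype="float32").reshape(3, 4, 1)
Y = np.array([1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1], dtype="float32").reshape(3, 4, 1)

model = Sequential([
    Input(shape=(4, 1)),
    LSTM(8, return_sequences=True),
    # one sigmoid output per timestep
    LSTM(1, return_sequences=True, activation="sigmoid"),
])
model.compile(optimizer="rmsprop", loss="binary_crossentropy")

# The plain Keras fit() accepts the 3D target that matches the 3D output
model.fit(X, Y, epochs=1, verbose=0)
preds = model.predict(X, verbose=0)
print(preds.shape)  # (3, 4, 1)
```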

This code works with Keras 2.0.2:
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Flatten
from keras.wrappers.scikit_learn import KerasClassifier
# Raw Data/Targets
X = np.array([1,2,3,4,5,6,7,8,9,10,11,12]).reshape(3,4)
Y = np.array([1,0,1,1,0,1,0,1,0,1,0,1]).reshape(3,4)
# Convert X to 3D tensor per Keras doc for recurrent layers
X = X.reshape(X.shape[0], X.shape[1], 1)
# Reshape Y to a 3D tensor as well
Y = Y.reshape(Y.shape[0], Y.shape[1], 1)
# Define function to return compiled Keras Model (to pass to Sklearn API)
def keras_rnn(timesteps, num_features):
    '''Function to return compiled Keras Classifier to pass to sklearn wrapper'''
    model = Sequential()
    model.add(LSTM(8, return_sequences=True, input_shape=(timesteps, num_features)))
    model.add(LSTM(1, return_sequences=True, activation = 'sigmoid'))
    model.compile(optimizer = 'RMSprop', loss = 'binary_crossentropy')
    return model
# Convert compiled Keras model to Scikit-learn-style classifier (compatible w/ sklearn model-tuning methods)
rnn_sklearn = KerasClassifier(build_fn=keras_rnn,
                              timesteps=4,
                              num_features=1)
# Fit RNN Model to Data, Target
rnn_sklearn.fit(X, Y)
