After going through some Stack questions and the Keras documentation, I managed to write some code that tries to evaluate the gradient of the output of a neural network w.r.t. its inputs. The purpose is a simple exercise: approximating a bivariate function (f(x,y) = x^2+y^2), using as loss the difference between analytical and automatic differentiation.
Combining answers from two questions ("Keras custom loss function: Accessing current input pattern" and "Getting gradient of model output w.r.t weights using Keras"), I came up with this:
import tensorflow as tf
from keras import backend as K
from keras.models import Model
from keras.layers import Dense, Activation, Input
def custom_loss(input_tensor):
    outputTensor = model.output
    listOfVariableTensors = model.input
    gradients = K.gradients(outputTensor, listOfVariableTensors)
    sess = tf.InteractiveSession()
    sess.run(tf.initialize_all_variables())
    evaluated_gradients = sess.run(gradients,feed_dict={model.input:input_tensor})
    grad_pred = K.add(evaluated_gradients[0], evaluated_gradients[1])
    grad_true = K.add(K.scalar_mul(2, model.input[0][0]), K.scalar_mul(2, model.input[0][1]))
    return K.square(K.subtract(grad_pred, grad_true))
input_tensor = Input(shape=(2,))
hidden = Dense(10, activation='relu')(input_tensor)
out = Dense(1, activation='sigmoid')(hidden)
model = Model(input_tensor, out)
model.compile(loss=custom_loss_wrapper(input_tensor), optimizer='adam')
Which yields the error: TypeError: The value of a feed cannot be a tf.Tensor object. because of feed_dict={model.input:input_tensor}. I understand the error, I just don't know how to fix it.
From what I gathered, I can't simply pass input data into the loss function, it must be a tensor. I realized Keras would 'understand' it when I call input_tensor. This all just leads me to think I'm doing things the wrong way, trying to evaluate the gradient like that. Would really appreciate some enlightenment.
I don't really understand why you want this loss function, but I will provide an answer anyway. Also, there is no need to evaluate the gradient within the function (in fact, you would be "disconnecting" the computational graph). The loss function could be implemented as follows:
from keras import backend as K
from keras.models import Model
from keras.layers import Dense, Input
def custom_loss(input_tensor, output_tensor):
    def loss(y_true, y_pred):
        gradients = K.gradients(output_tensor, input_tensor)
        grad_pred = K.sum(gradients, axis=-1)
        grad_true = K.sum(2*input_tensor, axis=-1)
        return K.square(grad_pred - grad_true)
    return loss
input_tensor = Input(shape=(2,))
hidden = Dense(10, activation='relu')(input_tensor)
output_tensor = Dense(1, activation='sigmoid')(hidden)
model = Model(input_tensor, output_tensor)
model.compile(loss=custom_loss(input_tensor, output_tensor), optimizer='adam')
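For reference, the target that grad_true encodes can be checked by hand. For f(x, y) = x^2 + y^2 the partial derivatives are 2x and 2y, so their sum is 2x + 2y, which is what K.sum(2*input_tensor, axis=-1) computes per sample. Below is a minimal pure-Python sketch that compares this expression against a central finite difference. The helper names f, numeric_grad_sum and analytic_grad_sum are made up for illustration and are not part of Keras.

```python
def f(x, y):
    """The bivariate function from the question."""
    return x ** 2 + y ** 2


def numeric_grad_sum(x, y, h=1e-5):
    """Sum of df/dx and df/dy, estimated with central finite differences."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dfdx + dfdy


def analytic_grad_sum(x, y):
    """Same quantity in closed form: 2x + 2y (what grad_true encodes)."""
    return 2 * x + 2 * y


for point in [(0.0, 0.0), (1.0, 2.0), (-0.5, 3.0)]:
    diff = abs(numeric_grad_sum(*point) - analytic_grad_sum(*point))
    print(point, "difference:", diff)
```

The printed differences should be tiny (floating-point rounding only), since the central difference is exact for a quadratic up to rounding error.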
A Keras loss must have y_true and y_pred as inputs. You can try adding your input object as both x and y during the fit:
def custom_loss(y_true,y_pred):
    ...
    return K.square(K.subtract(grad_true, grad_pred))
...
model.compile(loss=custom_loss, optimizer='adam')
model.fit(X, X, ...)
This way, y_true will be the batch being processed at each iteration from the input X, while y_pred will be the output of the model for that particular batch.
I am trying to understand how Keras/TensorFlow work.
The example I'm working with is an LSTM network with a custom loss function.
I want to print the values in y_pred and loss variables in this example, however a standard print() function will not print the actual numeric values.
When I try print() function, I get the following output: Tensor("loss_13/dense_14_loss/strided_slice:0", shape=(), dtype=float32)
import tensorflow as tf
from tensorflow.keras import Sequential, backend as K
from tensorflow.keras.layers import Dense, LSTM, Dropout
from tensorflow.keras.losses import categorical_crossentropy
regressor = Sequential()
regressor.add(LSTM(units = 10, dropout=0.10, return_sequences = True, input_shape = (X.shape[1], X.shape[2])))
regressor.add(Dense(units = 4, activation='softmax'))
regressor.compile(optimizer = optimizer, loss = weight_fx(np.array([0.005,0.20,0.79,0.005])), metrics = ['categorical_accuracy'])
def weight_fx(weights):
    weights = K.variable(weights)
    def loss(y_true, y_pred):
        y_pred /= K.sum(y_pred, axis=-1, keepdims=True)
        print(y_pred)
        loss = y_true * K.log(y_pred) * weights
        return loss
    return loss
Try doing it like this:
import tensorflow as tf
from tensorflow.keras import Sequential, backend as K
from tensorflow.keras.layers import Dense, LSTM, Dropout
from tensorflow.keras.losses import categorical_crossentropy
import numpy as np
X = tf.ones((10,10,10))
y = tf.ones((10,1))
def weight_fx(weights):
    weights = K.variable(weights)
    def loss(y_true, y_pred):
        y_pred /= K.sum(y_pred, axis=-1, keepdims=True)
        tf.print(y_pred)
        loss = y_true * K.log(y_pred) * weights
        return loss
    return loss
regressor = Sequential()
regressor.add(LSTM(units = 10, dropout=0.10, return_sequences = True,
                   input_shape = (X.shape[1], X.shape[2])))
regressor.add(Dense(units = 4, activation='softmax'))
regressor.compile(optimizer = 'adam', loss = weight_fx(np.array([0.005,0.20,0.79,0.005])), metrics = ['categorical_accuracy'])
regressor.fit(X,y)
Q: Why do you see Tensor("loss_13/dense_14_loss/strided_slice:0", shape=(), dtype=float32)?
A: TensorFlow expects the loss function to be called very often, so it optimizes it as much as possible. It does this through 'tracing': it passes in a 'detector' variable that 'experiences' all the operations in the function and remembers them. Based on this record, TensorFlow builds a separate 'graph' function that is much faster but cannot run many common Python functions with side effects, such as print(). What you see printed is that detector, or 'tracer'. The Python print() only runs once, while the function is being traced.
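The idea of a 'detector' can be illustrated with a toy in plain Python. This is only a sketch of the concept, not how TensorFlow implements tracing; the Tracer class and the trace helper are invented for this example. The object passed into the function holds no numbers; it only records which operations were applied to it, so a plain print() inside the function shows the tracer, not a value.

```python
class Tracer:
    """Toy stand-in for a symbolic tensor: records operations, holds no value."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __mul__(self, other):
        result = Tracer(f"mul({self.name}, {other})", self.log)
        self.log.append(result.name)
        return result

    def __repr__(self):
        return f"Tracer({self.name!r})"


def trace(fn):
    """Run fn once on a Tracer and return the list of recorded operations."""
    log = []
    fn(Tracer("y_pred", log))
    return log


def loss(y_pred):
    scaled = y_pred * 2
    print("inside loss:", scaled)  # runs during tracing, sees only the tracer
    return scaled


recorded = trace(loss)
print("recorded ops:", recorded)
```

In this toy, the recorded list plays the role of the graph: it could be replayed later on real numbers without ever running the Python print() again.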
Then how do I debug?
There are several ways to do this. If you want to print-debug, use tf.print. In my experience this sometimes works and sometimes doesn't. When it doesn't, and you still only see the detector variable, set model.run_eagerly = True or pass run_eagerly=True to model.compile. With run_eagerly set, Python's built-in print will work as well, even without tf.print (try this).
Last but not least, you can wrap all your side-effect functions in a tf.py_function. This requires a bit more code and a sample copy-and-paste code can be seen here.
Also, make sure to define the function first and only then use it in model.compile, especially if you are using a Jupyter notebook. A buggy old declaration might still persist in memory and will probably ruin your day.
Did this help?
I haven't tried this yet but you should always use:
tf.print(value)
Instead of just normal
print(value)
Tensorflow has implemented this function specifically for this. Hopefully, this helps!
I'm trying to write a pixel-wise weighted loss function for my model written in Keras, but in TensorFlow 2.0 it seems that this is no longer possible, i.e. it is not possible to have a loss function with inputs other than y_true and y_pred.
I used to write it as follows:
from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import backend as K
def my_keras_model():
    input = Input((256,256,1), name='input')
    weight = Input((256,256,1), name='weights')
    c1 = Conv2D(16, (3, 3), activation='relu', kernel_initializer='glorot_uniform', padding='same')(input)
    outputs = Conv2D(1, (1, 1), activation='sigmoid')(c1)
    model=Model(input=[input,weight], output=outputs)
    model.compile(optimizer=Adam(learning_rate=0.001, name='adam'), loss=my_weighted_loss(weight))
    return model
def my_weighted_loss(weight):
    def loss(y_true, y_pred):
        return K.mean(weight * K.binary_crossentropy(y_true, y_pred), axis=-1)
    return loss
Any idea of how to do it in TF 2?
One "hacky" way of implementing this would be adding the original input to the output, and writing your own loss function. This way you can do
weight = y_true[...,0]
y_true = y_true[...,1:]
I would also love to hear a better answer :)
Actually, it is possible to implement weight maps and do the computation inside the model.
Since the loss is binary cross-entropy, pass the image, the weight map and the mask as inputs:
model=Model(inputs=[image,weight,mask], outputs=outputs)
If you are using a tf.data dataset, define it so that its element structure matches these three inputs and the single target:
output_types=((tf.float32,tf.float32,tf.float32),tf.float32)
output_shapes=(([1024,1024,1],[1024,1024,1],[1024,1024,1]),[1024,1024,1])
Now compute the loss inside the model:
bce = -(y_true * K.log(y_pred+epsilon) + (1-y_true) * K.log(1-y_pred+epsilon)) # you have the values of y_true as well
The model output would then be this computed loss.
If you need to do the computation outside of your model, just use a Lambda layer for the weights:
weights_out=layers.Lambda(lambda x:x)(weights)
and then output this layer from your model as well. The model then has two outputs, available to the loss computation as a tuple, and a pixel-wise weighted loss can be calculated this way too.
model=Model(inputs=[image,weights,mask], outputs=[outputs,weights_out])
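To make the arithmetic concrete, here is a small pure-Python sketch of a pixel-wise weighted binary cross-entropy on a flat list of four "pixels". It is not Keras code: the function weighted_bce and the sample values are made up for illustration. It uses the conventional leading minus sign so that the loss is non-negative.

```python
import math


def weighted_bce(y_true, y_pred, weight, epsilon=1e-7):
    """Mean of weight * binary cross-entropy over flat lists of pixels."""
    total = 0.0
    for t, p, w in zip(y_true, y_pred, weight):
        bce = -(t * math.log(p + epsilon) + (1 - t) * math.log(1 - p + epsilon))
        total += w * bce
    return total / len(y_true)


y_true = [1.0, 0.0, 1.0, 0.0]
y_pred = [0.9, 0.2, 0.4, 0.7]

uniform = weighted_bce(y_true, y_pred, [1.0, 1.0, 1.0, 1.0])
# Up-weight the third pixel, which is badly predicted (0.4 for a true 1).
emphasised = weighted_bce(y_true, y_pred, [1.0, 1.0, 5.0, 1.0])

print("uniform weights:", uniform)
print("third pixel x5: ", emphasised)
```

Raising the weight of a poorly predicted pixel increases the loss, which is the effect a weight map is meant to have during training.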
I am using Python 3 with Anaconda, and trying to use a tf.contrib loss function with a Keras model.
The code is the following:
from keras.layers import Dense, Flatten
from keras.optimizers import Adam
from keras.models import Sequential
from tensorflow.contrib.losses import metric_learning
model = Sequential()
model.add(Flatten(input_shape=input_shape))
model.add(Dense(50, activation="relu"))
model.compile(loss=metric_learning.triplet_semihard_loss, optimizer=Adam())
I get the following error:
File "/home/user/.local/lib/python3.6/site-packages/keras/engine/training_utils.py", line 404, in weighted
    score_array = fn(y_true, y_pred)
File "/home/user/anaconda3/envs/siamese/lib/python3.6/site-packages/tensorflow/contrib/losses/python/metric_learning/metric_loss_ops.py", line 179, in triplet_semihard_loss
    assert lshape.shape == 1
AssertionError
When I use the same network with a Keras loss function it works fine. I tried wrapping the TF loss function in a function like so:
def func(y_true, y_pred):
    import tensorflow as tf
    return tf.contrib.losses.metric_learning.triplet_semihard_loss(y_true, y_pred)
but I still get the same error.
What am I doing wrong here?
update:
When changing the func to return the following
return K.categorical_crossentropy(y_true, y_pred)
everything works fine!
But I can't get it to work with this specific TF loss function.
When I go into tf.contrib.losses.metric_learning.triplet_semihard_loss and remove the line assert lshape.shape == 1, it runs fine.
Thanks
The problem is that you pass the wrong input to the loss function.
According to the triplet_semihard_loss docstring, you need to pass labels and embeddings.
So your code has to be:
def func(y, embeddings):
    return tf.contrib.losses.metric_learning.triplet_semihard_loss(labels=y, embeddings=embeddings)
And two more notes about the network that produces the embeddings:
The last dense layer has to be without activation.
Don't forget to normalise the output vector: model.add(Lambda(lambda x: K.l2_normalize(x, axis=1)))
It seems that your problem comes from an incorrect input in the loss function. In fact, the triplet loss wants the parameters:
Args:
  labels: 1-D tf.int32 `Tensor` with shape [batch_size] of
    multiclass integer labels.
  embeddings: 2-D float `Tensor` of embedding vectors. Embeddings should
    be l2 normalized.
Are you sure that y_true has the correct shape? Can you give us more details about the tensors you are using?
I use TensorFlow to implement handwritten digit recognition. I want the logits passed to softmax_cross_entropy_with_logits to be a placeholder first, and to feed the computed values into that placeholder at run time, but TensorFlow reports the error ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients. I know that it works if I pass outputs directly as the logits, but what if I need the logits to go through a placeholder first? How should I solve it?
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/home/as/downloads/resnet-152_mnist-master/mnist_dataset", one_hot=True)
from tensorflow.contrib.layers import fully_connected
x = tf.placeholder(dtype=tf.float32,shape=[None,784])
y = tf.placeholder(dtype=tf.float32,shape=[None,10])
hidden1 = fully_connected(x,100,activation_fn=tf.nn.elu,
                          weights_initializer=tf.random_normal_initializer())
hidden2 = fully_connected(hidden1,200,activation_fn=tf.nn.elu,
                          weights_initializer=tf.random_normal_initializer())
hidden3 = fully_connected(hidden2,200,activation_fn=tf.nn.elu,
                          weights_initializer=tf.random_normal_initializer())
outputs = fully_connected(hidden3,10,activation_fn=None,
                          weights_initializer=tf.random_normal_initializer())
a = tf.placeholder(tf.float32,[None,10])
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=a)
reduce_mean_loss = tf.reduce_mean(loss)
equal_result = tf.equal(tf.argmax(outputs,1),tf.argmax(y,1))
cast_result = tf.cast(equal_result,dtype=tf.float32)
accuracy = tf.reduce_mean(cast_result)
train_op = tf.train.AdamOptimizer(0.001).minimize(reduce_mean_loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(30000):
        xs,ys = mnist.train.next_batch(128)
        result = outputs.eval(feed_dict={x:xs})
        sess.run(train_op,feed_dict={a:result,y:ys})
        print(i)
To be brief, the logits in your loss can't be a placeholder, but need to be a tensorflow Operation. Otherwise, your optimizer can't calculate the gradient w.r.t any variables (see error message).
Operations are "a graph node that performs computation on tensors", whereas a placeholder is a tensor that needs to be fed, when evaluating the graph.
I don't really understand why you don't pass the outputs operation directly as the logits, like so:
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=outputs)
I could try to help further if you describe your specific use case.
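Why a fed value cuts the gradient can be seen in a toy example that does not use TensorFlow at all. The sketch below (all names are made up for illustration) compares a loss whose prediction is computed from a weight w with a loss whose prediction was evaluated beforehand and passed in as a plain number. The derivatives are estimated with finite differences.

```python
def loss_connected(w, x, target):
    """Loss where the prediction is computed from the weight w."""
    logits = w * x
    return (logits - target) ** 2


def loss_fed(w, fed_logits, target):
    """Loss where the prediction is a fed-in number; w is never used."""
    return (fed_logits - target) ** 2


def numeric_grad(fn, w, h=1e-6):
    """Central finite-difference derivative of fn with respect to w."""
    return (fn(w + h) - fn(w - h)) / (2 * h)


w, x, target = 0.5, 2.0, 3.0
fed_logits = w * x  # evaluated once, then passed in as a plain number

grad_connected = numeric_grad(lambda v: loss_connected(v, x, target), w)
grad_fed = numeric_grad(lambda v: loss_fed(v, fed_logits, target), w)

print("d loss / d w, connected:", grad_connected)
print("d loss / d w, fed value:", grad_fed)
```

In the fed case the loss does not depend on w at all, so there is nothing to optimize. In the TensorFlow 1.x code above the situation is analogous: the graph contains no path from the loss to any variable, which is what the error message reports.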
I am coming from TensorFlow and learning more about Keras, and I came across this notation. I looked in the documentation but couldn't find any examples. The syntax is a function call followed by a variable in parentheses.
model_input = Input(shape=input_shape)
z = model_input
z = Dropout(dropout_prob[0])(z) # Not sure what this means
The only idea I had is that this may be a layer multiplication, but I am not sure. Thank you for your help.
It's part of the functional API in Keras; as stated in the docs here:
A layer instance is callable (on a tensor), and it returns a tensor
Input tensor(s) and output tensor(s) can then be used to define a Model
Such a model can be trained just like Keras Sequential models.
So following up your code (that is only a portion), first probably you imported
from keras.layers import Input, Dropout
Then Input returns a tensor, which you store in the variable model_input:
model_input = Input(shape=input_shape)
And then a layer instance is callable on a tensor, and returns a tensor
z = model_input
z = Dropout(dropout_prob[0])(z) # This returns another tensor
After that, for example, you can follow with a model like this:
from keras.models import Model
model = Model(inputs=model_input, outputs=z)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels) # starts training
So now, it is easy to reuse trained models: you can treat any model as if it were a layer, by calling it on a tensor, like this:
x = Input(shape=(784,))
y = model(x)
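The Dropout(dropout_prob[0])(z) notation is ordinary Python: the first pair of parentheses constructs an object, and the second pair calls that object. A minimal sketch with a made-up Scale class (not a Keras layer) shows the same pattern:

```python
class Scale:
    """Toy 'layer': configured once, then called on data."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, values):
        return [self.factor * v for v in values]


z = [1.0, 2.0, 3.0]

# Two steps: build the object, then call it.
layer = Scale(0.5)
two_step = layer(z)

# Same thing in one expression, as in Dropout(dropout_prob[0])(z).
one_step = Scale(0.5)(z)

print(two_step, one_step)
```

Keras layers and models define __call__ in the same spirit, which is why both Dropout(...)(z) and model(x) work on tensors.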