Get gradient of layer before activation with respect to input picture - python

Currently, I am working on Universal Perturbation-style research, where I use the gradient of the layer before the activation function to retrace the gradient step taken in the last iteration.
However, when I try to extract the gradient using K.gradients, I can't seem to extract the right values.
Either I get a symbolic tensor, which I don't want, or I get [zero]. What I want are the exact gradients of that second-to-last layer, given the input image. This is what I currently have:
f_image = np.array(model.predict(image)).flatten()
I = (np.array(f_image)).flatten().argsort()[::-1]
I = I[0:num_classes]
pert_image = image
gradients = np.asarray(grads(pert_image,I))
Here grads should be the gradient function to get the exact gradients. When I use the following code, I get a tensor:
gradients = K.gradients(model.layers[-2].output, model.layers[0].input)[0]
Here the output corresponds to I, the indices with the largest influence before the activation used to classify, and the input is the perturbed image, which starts off as the original image.
Could someone tell me what is wrong with my K.gradients implementation?

K.gradients computes the gradient symbolically; you need to evaluate the gradient with actual inputs in order to get numerical values. You can do this by building a callable with K.function:
import keras.backend as K
gradients = K.gradients(model.layers[-2].output, model.layers[0].input)[0]
grad_fn = K.function([model.input], [gradients])
You can now call grad_fn with an appropriate input (including the batch dimension), and it will return the numerical values of the gradient:
actual_gradients = grad_fn([image])
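For reference, here is a minimal sketch of how the two pieces could fit together, assuming model, image (with a batch dimension), pert_image and I exist as in the question, and that model.layers[-2] produces the pre-activation values:
import numpy as np
import keras.backend as K

# One gradient function per class index in I: gradient of that class's
# pre-activation value with respect to the input image
logits = model.layers[-2].output

grad_fns = [
    K.function([model.input],
               [K.gradients(K.sum(logits[:, int(k)]), model.input)[0]])
    for k in I
]

# Numerical gradients for the current (possibly perturbed) image
gradients = np.asarray([fn([pert_image])[0] for fn in grad_fns])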

Related

I want to calculate and adapt the gradient by myself in tensorflow

I want to calculate and adapt the gradient by myself in tensorflow.
I want to calculate the gradient for one of the multiple outputs by myself, and apply a gradient of 0 for the rest of the outputs.
I have tried building the model with the following simple program and adapting it by setting one gradient to a constant value and the other to 0, but I am not getting the desired output.
What should I give to apply_gradients?
I don't know much about TensorFlow, so I would appreciate it if you could kindly explain what is going on.
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Activation
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Nadam

def cmodel():
    in_ = Input(shape=(1,))
    x = in_
    x = Dense(16)(x)
    x = Activation('relu')(x)
    x = Dense(2)(x)
    x = Activation('linear')(x)
    model = Model(inputs=in_, outputs=[x])
    return model

model = cmodel()
model.compile(optimizer=Nadam(learning_rate=0.005))

mylist2 = []
for i in range(100):
    g = tf.constant([-0.3, 0])
    model.optimizer.apply_gradients(zip(g, model.trainable_variables))
    print('num:', model.predict(np.array([1])))
    mylist2.append(model.predict(np.array([1]))[0])
I'm not sure I understand what you want to do.
In fact, what your code is doing is applying the value -0.3 to the first trainable variable of the model, i.e. to the kernel of the first dense layer, and 0 to its bias.
For example, if you want to modify the first layer only and pass explicit zero gradients for the second layer, you can define g = tf.constant([-0.3, 3., 0., 0.]). You could also get the trainable variables of the first layer with model.layers[1].trainable_variables and apply the gradient only to those, as sketched below.
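As a hedged sketch (just one way to pass explicit per-variable gradients, not necessarily the asker's intent), you could build one gradient tensor per trainable variable and zero out the ones you do not want to change:
import tensorflow as tf

# Only the kernel of the first Dense layer (model.layers[1]) gets a constant
# "gradient"; every other variable receives zeros and therefore stays put
first_kernel = model.layers[1].trainable_variables[0]

grads = [
    -0.3 * tf.ones_like(v) if v is first_kernel else tf.zeros_like(v)
    for v in model.trainable_variables
]
model.optimizer.apply_gradients(zip(grads, model.trainable_variables))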

Error in batch size with custom loss function in Keras

I'm working on a detector with Keras, where the output y_true consists of a vector "y" with 500 values, which contains a pulse that indicates the time of the detected event within 500 samples of a signal.
Ex: y=[0, 0, 0,....,0,1,1,1,1,1,1,1,1,1,1,1,0,....0,0,0]
I've worked with 'mse' as the loss before, and it works, but I want to use a loss function that considers the distance between the middle value of the pulse in y_true and the max value in y_pred. Later I use the max value in y_pred to normalize it and define the pulse around it.
Since I can't work with just the distance and make it differentiable, I defined this custom loss function, which weights the mean square error with the estimated distance.
import tensorflow as tf
import keras.backend as kb

def custom_loss_function(y_true, y_pred):
    t_label = []
    t_picking = 0
    t_label = tf.where(y_true == 1)[:, 0]
    mayor = tf.reduce_max(y_pred)
    t_picking = tf.where(y_pred == mayor)[:, 0]
    d = tf.cast(abs(t_label[5] - t_picking) / 50, tf.float32)
    loss = (kb.mean(kb.square(y_true - y_pred))) * d
    return loss
Here t_label[5] and t_picking are the middle of the pulse in y_true and the position of the max value in y_pred, respectively, and d is the distance between them.
I compiled the model with this loss function, using Adam optimizer and a batch size of 64.
Everything works, and the model can be compiled, but I get this error in the middle of the training:
InvalidArgumentError: Incompatible shapes: [64] vs. [2]
[[node Adam/gradients/gradients/loss/dense_1_loss/custom_loss_function/weighted_loss/mul_grad/BroadcastGradientArgs (defined at C:\Users\Maca\anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_2220]
I've tried other custom loss functions before and didn't have this problem, but I can't see where the error is coming from.
Do you know why am I getting this error and how can I fix it?
There are two equal max values in a particular batch, so your t_picking sometimes (rarely) contains two (or even more) indices instead of one.
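As a sketch of one possible fix (assuming the rest of the loss stays as in the question), you can reduce the position of the maximum to a single index with tf.argmax, so the distance term is always a scalar even when the maximum value occurs more than once:
import tensorflow as tf
import keras.backend as kb

def custom_loss_function(y_true, y_pred):
    t_label = tf.where(y_true == 1)[:, 0]
    # tf.argmax returns a single index even when the maximum appears twice
    t_picking = tf.argmax(tf.reshape(y_pred, [-1]))
    d = tf.cast(tf.abs(t_label[5] - t_picking), tf.float32) / 50.0
    return kb.mean(kb.square(y_true - y_pred)) * d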

How do I mutate the input using gradient descent in PyTorch?

I'm new to PyTorch. I learned it uses autograd to automatically calculate the gradients for the gradient descent function.
Instead of adjusting the weights, I would like to mutate the input to achieve a desired output, using gradient descent. So, instead of the weights of neurons changing, I want to keep all of the weights the same and just change the input to minimize the loss.
For example. The network is a trained image classifier with the numbers 0-9. I input random noise, and I want to morph it so that the network considers it a 3 with 60% confidence. I would like to utilize gradient descent to adjust the values of the input (originally noise) until the network considers the input to be a 3, with 60% confidence.
Is there a way to do this?
I assume you know how to do regular training with gradient descent. You only need to change the parameters to be optimized by the optimizer. Something like
# ... Setup your network, load the input
# ...
# Set proper requires_grad -> we train the input, not the parameters
input.requires_grad = True
for p in net.parameters():
    p.requires_grad = False

# Setup the optimizer
# Previously we would have had SomeOptimizer(net.parameters())
optim = SomeOptimizer([input])

output_that_you_want = ...
actual_output = net(input)
some_loss = SomeLossFunction(output_that_you_want, actual_output)
# ...
# Back-prop and optim.step() as usual
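As a concrete, self-contained sketch (the trained classifier net, the input shape and the 60% threshold are assumptions matching the digit-classifier example above):
import torch
import torch.nn.functional as F

net.eval()
for p in net.parameters():
    p.requires_grad = False          # freeze the trained weights

x = torch.randn(1, 1, 28, 28, requires_grad=True)    # random noise input (assumed shape)
optimizer = torch.optim.Adam([x], lr=0.05)            # optimize the input only
target = torch.tensor([3])                            # we want the network to see a "3"

for step in range(200):
    optimizer.zero_grad()
    logits = net(x)
    loss = F.cross_entropy(logits, target)
    loss.backward()
    optimizer.step()
    if F.softmax(logits, dim=1)[0, 3].item() >= 0.6:   # stop at 60% confidence
        break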

Tensorflow - Access weights while doing backprop

I want to implement C-MWP as described here: https://arxiv.org/pdf/1608.00507.pdf in keras/tensorflow.
This involves modifying the way backprop is performed. The new gradient is a function of the bottom activation responses, the weight parameters, and the gradients of the layer above.
As a start, I was looking at the way keras-vis is doing modified backprop:
def _register_guided_gradient(name):
    if name not in ops._gradient_registry._registry:
        @tf.RegisterGradient(name)
        def _guided_backprop(op, grad):
            dtype = op.outputs[0].dtype
            gate_g = tf.cast(grad > 0., dtype)
            gate_y = tf.cast(op.outputs[0] > 0, dtype)
            return gate_y * gate_g * grad
However, to implement C-MWP I need access to the weights of the layer on which the backprop is performed. Is it possible to access the weights within the @tf.RegisterGradient(name) function? Or am I on the wrong path?
The gradient computation in TF is fundamentally per-operation. If the operation whose gradient you want to change is performed on the weights, or at least the weights are not far from it in the operation graph, you can try finding the weights tensor by walking the graph inside your custom gradient. For example, say you have something like
x = tf.get_variable(...)
y = 5.0 * x
tf.gradients(y, x)
You can get to the variable tensor (more precisely, the tensor produced by the variable reading operation) with something like
@tf.RegisterGradient(name)
def my_grad(op, grad):
    weights = op.inputs[1]
    ...
If the weights are not immediate inputs, but you know how to get to them, you can walk the graph a bit using something like:
@tf.RegisterGradient(name)
def my_grad(op, grad):
    weights = op.inputs[1].op.inputs[0].op.inputs[2]
    ...
You should understand that this solution is very hacky. If you control the forward pass, you might want to define a custom gradient just for the subgraph you care about. You can see how to do that in How to register a custom gradient for an operation composed of tf operations, How Can I Define Only the Gradient for a Tensorflow Subgraph?, and https://www.tensorflow.org/api_docs/python/tf/Graph#gradient_override_map
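For completeness, here is a hedged TF1-style sketch of the gradient_override_map route, with the custom gradient simply reproducing the standard MatMul gradients (assuming no transposes); the C-MWP rule from the paper would replace that return value:
import tensorflow as tf

@tf.RegisterGradient("CustomMatMulGrad")
def _custom_matmul_grad(op, grad):
    x = op.inputs[0]   # bottom activations
    w = op.inputs[1]   # the weights we wanted access to
    return (tf.matmul(grad, w, transpose_b=True),
            tf.matmul(x, grad, transpose_a=True))

g = tf.Graph()
with g.as_default(), g.gradient_override_map({"MatMul": "CustomMatMulGrad"}):
    x = tf.placeholder(tf.float32, [None, 4])
    w = tf.Variable(tf.ones([4, 2]))
    y = tf.matmul(x, w)
    grads = tf.gradients(tf.reduce_sum(y), [x, w])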

Using TensorFlow ``grad_loss / grad_ys`` parameter to add gradients

I'm trying to use the grad_loss parameter in optimizer.minimize(loss, grad_loss=) to modify the network gradients with existing gradients.
I followed the comments here:
Use of grads_ys parameter in tf.gradients - TensorFlow
and I would like to run a toy example, in which I recreate the default 1 values for grad_ys, as specified in the documentation.
Here's the relevant code segment:
grads_and_vars = optimizer.compute_gradients(loss_op)
vars_with_grad = [v for g, v in grads_and_vars if g is not None]

grad_loss = []
for grad, var in grads_and_vars:
    grad_loss.append(tf.ones_like(grad))

train_op = optimizer.minimize(loss_op, grad_loss=grad_loss)
The first part extracts gradients using compute_gradients. The last line computes gradients of the loss function loss_op but attempts to use 1-filled vectors for the grads. As far as I understand, this should behave similarly to running minimize without the grad_loss parameter.
Unfortunately, this fails because it expects grad_loss to be a Tensor (with a dtype) and not a list. Looking into gradients_impl.py I see that the function expects grad_loss to have the same dimensions as loss (which in this case is a scalar).
I would appreciate any assistance in this simple example - how do I add elements to the gradients this way?
EDIT: I guess the question boils down to the definition of grad_loss: "A Tensor holding the gradient computed for loss." How do I generate such a tensor from a set of gradients obtained by compute_gradients?
Thanks.
You can make use of the tf.convert_to_tensor method to convert your list of gradients to a tensor, and then use tf.reduce_sum:
train_op = optimizer.minimize(loss_op, grad_loss=tf.reduce_sum(tf.convert_to_tensor(grad_loss)))
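Alternatively, since loss_op is a scalar, a hedged minimal way to reproduce the default grad_ys value of 1 is to pass a scalar ones-like tensor directly:
import tensorflow as tf

# grad_loss must have the same shape as loss_op (a scalar here);
# tf.ones_like(loss_op) reproduces the default grad_ys of 1
train_op = optimizer.minimize(loss_op, grad_loss=tf.ones_like(loss_op))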
