I'm trying to create an attack on a CNN following the example [here](https://www.anishathalye.com/2017/07/25/synthesizing-adversarial-examples/).
Their notebook runs on my system without any issues.
Instead of loading Inception, I want to attack my own network. For simplicity I'm training the network in the same notebook first:
tf.reset_default_graph()
x = tf.placeholder(tf.float32, shape=(None, *img_shape), name='input')
y = tf.placeholder(tf.float32, shape=(None, total_labels), name='output')
keep_prob = tf.placeholder(tf.float32, name='keep_prob')
logits = conv_net(x, keep_prob, total_labels) # some standard cnn architecture is used
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=y)
cost = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
So far so good. The network is trained with
sess.run(optimizer, feed_dict={x: feature_batch, y: label_batch, keep_prob: keep_probability})
Now I want to create an attack by training an overlay on the image:
overlay = tf.Variable(tf.zeros((1, *img_shape)), name='Overlay')
assign_op = tf.assign(overlay, x)
loss = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=y)
optim = tf.train.GradientDescentOptimizer(0.0001).minimize(loss, var_list=[overlay])
My understanding is that the code above ties the trainable variable overlay to my input placeholder x so that it acts as the input to my network; the loss is then calculated on the network's predictions, and the optimizer minimizes that loss. By passing var_list I tell the optimizer: "Hey, you are training overlay, even though you cannot see it explicitly."
However, this doesn't work and throws this error:
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'Overlay:0' shape=(1, 55, 65, 1) dtype=float32_ref>"] and loss Tensor("softmax_cross_entropy_with_logits_3/Reshape_2:0", shape=(?,), dtype=float32).
Clearly I'm missing some step/understanding here. I don't see the source notebook doing anything else.
A placeholder is not trainable and gradients will not propagate through placeholders. If I were you, I would try the following workflow: feed the data to the network through a variable during training as well.
x = tf.placeholder(tf.float32, shape)
x_var = tf.Variable(tf.zeros(shape))  # variable with the same shape as x
assign = tf.assign(x_var, x)
logits = conv_net(x_var)
...
sess.run(assign, feed_dict=data)
sess.run(minimize)
# not sure if it would work in a single sess.run([assign, minimize], data)
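To make that concrete for the overlay attack, here is a minimal sketch of the attack graph only, assuming the trained conv_net weights get restored into it (e.g. with a tf.train.Saver) and that image and target_label stand in for your own single-example arrays:
# Build the attack graph with a variable between the placeholder and the network.
x = tf.placeholder(tf.float32, shape=(1, *img_shape), name='input')
y = tf.placeholder(tf.float32, shape=(1, total_labels), name='output')
keep_prob = tf.placeholder(tf.float32, name='keep_prob')

overlay = tf.Variable(tf.zeros((1, *img_shape)), name='Overlay')
assign_op = tf.assign(overlay, x)                      # copy the current image into the variable

logits = conv_net(overlay, keep_prob, total_labels)    # the network now sees the variable
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=y))

# Only the overlay is listed, so the trained weights stay fixed during the attack.
attack_step = tf.train.GradientDescentOptimizer(0.0001).minimize(loss, var_list=[overlay])

sess = tf.Session()
sess.run(tf.global_variables_initializer())
# ... restore the trained conv_net weights here, e.g. with a tf.train.Saver ...

sess.run(assign_op, feed_dict={x: image})                           # load the image
sess.run(attack_step, feed_dict={y: target_label, keep_prob: 1.0})  # update the overlay
adversarial_image = sess.run(overlay)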
I have a CNN model for image classification which I have trained on my dataset. The model goes something like this:
Convolution
ReLU
Pooling
Convolution
ReLU
Convolution
ReLU
Pooling
Flatten
Fully connected (FC1)
ReLU
Fully connected (FC2)
Softmax
After training, I want to get the feature vector for an image that I feed into the pre-trained model, i.e. the output of the FC1 layer. Is there any way to get it? I browsed the web but couldn't find anything useful; any suggestions would be a great help.
Training script
# input
x = tf.placeholder(tf.float32, shape=[None, img_size_h, img_size_w, num_channels], name='x')
# lables
y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_true')
y_true_cls = tf.argmax(y_true, axis=1)
y_pred = build_model(x) # Builds model architecture
y_pred_cls = tf.argmax(y_pred, axis=1)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=y_pred, labels=y_true)
cost = tf.reduce_mean(cross_entropy)
optimizer = tf.train.MomentumOptimizer(learn_rate, 0.9, use_locking=False, use_nesterov=True).minimize(cost)
accuracy = tf.reduce_mean(tf.cast(tf.equal(y_pred_cls, y_true_cls), tf.float32))
sess = tf.Session()
sess.run(tf.global_variables_initializer())
tf_saver = tf.train.Saver()
train(num_iteration) # Trains the network and saves the model
sess.close()
Testing script
sess = tf.Session()
tf_saver = tf.train.import_meta_graph('model/model.meta')
tf_saver.restore(sess, tf.train.latest_checkpoint('model'))
x = tf.get_default_graph().get_tensor_by_name('x:0')
y_true = tf.get_default_graph().get_tensor_by_name('y_true:0')
y_true_cls = tf.argmax(y_true, axis=1)
y_pred = tf.get_default_graph().get_tensor_by_name('y_pred:0') # refers to FC2 in the model
y_pred_cls = tf.argmax(y_pred, axis=1)
correct_prediction = tf.equal(y_pred_cls, y_true_cls)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
images, labels = read_data() # read data for testing
feed_dict_test = {x: images, y_true: labels}
test_acc = sess.run(accuracy, feed_dict=feed_dict_test)
sess.close()
You can just perform sess.run on the right tensor to get the values. First you need the tensor. You can give it a name inside build_model by adding a name argument (which you can do for any tensor), e.g.:
FC1 = tf.add(tf.matmul(Flat, W1), b1, name="FullyConnected1")
Later, you can get the tensor for the fully connected layer and evaluate it:
with tf.Session() as sess:
    FC1 = tf.get_default_graph().get_tensor_by_name('FullyConnected1:0')
    FC1_values = sess.run(FC1, feed_dict={x: input_img_arr})
(This is assuming there is no other layer called FullyConnected1 in the graph)
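If you are working from the saved checkpoint (as in your testing script) rather than the live training session, the same lookup works after restoring; a minimal sketch, assuming the layer was given that name before the model was saved and input_img_arr holds your preprocessed images:
sess = tf.Session()
tf_saver = tf.train.import_meta_graph('model/model.meta')
tf_saver.restore(sess, tf.train.latest_checkpoint('model'))

graph = tf.get_default_graph()
x = graph.get_tensor_by_name('x:0')
FC1 = graph.get_tensor_by_name('FullyConnected1:0')

# One feature vector (the FC1 activations) per input image.
fc1_features = sess.run(FC1, feed_dict={x: input_img_arr})
sess.close()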
I have trained a model and saved its checkpoint using
saver.save(sess, "checkpoints/sleeve.ckpt")
The following is the architecture of the CNN model
inputs_ = tf.placeholder(tf.float32, shape=[None, codes.shape[1]], name ='inputs_')
labels_ = tf.placeholder(tf.int64, shape=[None, label_vecs.shape[1]], name='labels_')
layer = tf.contrib.layers.fully_connected(inputs_, 128)
logits = tf.contrib.layers.fully_connected(layer, label_vecs.shape[1], activation_fn=None)  # output layer logits
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=labels_, logits=logits)  # cross entropy loss
cost = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer().minimize(cost)  # training optimizer
# Operations for validation/test accuracy
predicted = tf.nn.softmax(logits)
correct_pred = tf.equal(tf.argmax(predicted, 1), tf.argmax(labels_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
I am restoring the above model in a different notebook with the following code:
imported_meta = tf.train.import_meta_graph("checkpoints/neckline.ckpt.meta")
with tf.Session() as sess:
    imported_meta.restore(sess, tf.train.latest_checkpoint('./checkpoints'))
    graph = tf.get_default_graph()
    inputs = graph.get_tensor_by_name("inputs_:0")
    labels = graph.get_tensor_by_name("labels_:0")
    pred = graph.get_tensor_by_name("prediction:0")
    print(sess.run(pred, feed_dict={inputs:code, layer:label_vecs}))
It is throwing an error Cannot interpret feed_dict key as Tensor: Tensor Tensor("fully_connected/Relu_1:0", shape=(?, 128), dtype=float32) is not an element of this graph.
Your feed_dict key is layer, not labels. layer was created in a different graph. Change to:
print(sess.run(pred, feed_dict={inputs: code, labels: label_vecs}))
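Put in context, the restore block would look like this (a minimal sketch with only that key changed; it also assumes the softmax op was actually given the name 'prediction' when the graph was built, otherwise the get_tensor_by_name lookup will fail):
imported_meta = tf.train.import_meta_graph("checkpoints/neckline.ckpt.meta")
with tf.Session() as sess:
    imported_meta.restore(sess, tf.train.latest_checkpoint('./checkpoints'))
    graph = tf.get_default_graph()
    inputs = graph.get_tensor_by_name("inputs_:0")
    labels = graph.get_tensor_by_name("labels_:0")
    pred = graph.get_tensor_by_name("prediction:0")
    # Feed only tensors that belong to the restored graph.
    print(sess.run(pred, feed_dict={inputs: code, labels: label_vecs}))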
I would like to know if it is possible to compute the gradients of the output of a model with respect to the model parameters. In other words I would like to compute dy / d theta.
Here is a short example of what I mean:
import keras
import numpy as np
import tensorflow as tf
# Dummy input
test = np.random.rand(1, 32, 32, 1)
x = tf.placeholder(tf.float32, shape=(None, 32, 32, 1))
model = keras.layers.Conv2D(16, 5, padding = 'same', activation='elu') (x)
model = keras.layers.Flatten() (model)
model = keras.layers.Dense(128, activation='relu') (model)
predictions = keras.layers.Dense(1) (model)
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    y = sess.run(predictions, feed_dict={x: test})
    # Get gradients of y w.r.t model parameters.
    gradients = sess.run(tf.gradients(y, model_parameters))
I have looked at the documentation of tf.gradients() and it states
ys and xs are each a Tensor or a list of tensors. grad_ys is a list of Tensor, holding the gradients received by the ys. The list must be the same length as ys.
So I understand that both arguments need to be tensors. However, when I try
model_parameters = tf.trainable_variables()
model_parameters is a list of elements of type tensorflow.python.ops.variables.Variable
Is there a way to get the parameters of the model as a tensor to use for differentiation?
Two things here.
Theta corresponds to the weights in the layers.
To get the weights in Keras, use get_weights(). Do something like the following:
m1 = keras.layers.Conv2D(16, 5, padding = 'same', activation='elu')
model = m1 (x)
W1 = m1.get_weights()
Now you can see that W1 contains the weights (as NumPy arrays).
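If you need the weights as graph objects rather than NumPy arrays (for example to pass them to tf.gradients), the layer also exposes them as tf.Variables through trainable_weights; a minimal sketch:
m1 = keras.layers.Conv2D(16, 5, padding='same', activation='elu')
model = m1(x)

W1_numpy = m1.get_weights()        # list of NumPy arrays: [kernel, bias]
W1_vars = m1.trainable_weights     # the same parameters as tf.Variable objects

# tf.gradients can differentiate the layer output against these variables.
grads = tf.gradients(model, W1_vars)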
Okay, so I figured it out. If I want to compute the gradients of the output with respect to the variables of the network, it goes like this:
import keras
import numpy as np
import tensorflow as tf
# Dummy input
test = np.random.rand(1, 32, 32, 1)
x = tf.placeholder(tf.float32, shape=(None, 32, 32, 1))
model = keras.layers.Conv2D(16, 5, padding = 'same', activation='elu') (x)
model = keras.layers.Flatten() (model)
model = keras.layers.Dense(128, activation='relu') (model)
predictions = keras.layers.Dense(1) (model)
# This was the part that I was missing.
# ============================================================
opt = tf.train.GradientDescentOptimizer(learning_rate=0.01)
gradient_step = opt.compute_gradients(predictions, tf.trainable_variables())
# ============================================================
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    # This part changed too.
    # ==========================================================
    gradients = sess.run(gradient_step, feed_dict={x: test})
    # ==========================================================
I had to define an optimizer (tf.train.GradientDescentOptimizer), build the gradient_step operation with compute_gradients on the predictions, and then evaluate that operation with the input fed in to get the gradients of my output. It was actually pretty simple!
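For reference, the optimizer is not strictly required here: tf.gradients accepts a list of variables directly, so the same gradients can be obtained with (a minimal sketch under the same setup):
# Gradients of the network output w.r.t. every trainable variable.
grad_tensors = tf.gradients(predictions, tf.trainable_variables())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    gradients = sess.run(grad_tensors, feed_dict={x: test})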
Thank you all for your help ^.^
I have just started studying TensorFlow and I want to create a DNN for MNIST. In the tutorial, there is a very simple neural network with 784 input nodes, 10 output nodes and no hidden nodes. I tried to modify that code to create a DNN. Here is my code. I think I just added a hidden layer with 500 nodes between the input and output layers, but the test accuracy is only 10%, which means the network is not being trained. Do you know what's wrong with my code?
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import os
os.chdir('../')
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x=tf.placeholder(tf.float32,[None,784])
W_h1=tf.Variable(tf.zeros([784,500]))
B_h1=tf.Variable(tf.zeros([500]))
h1=tf.nn.relu(tf.matmul(x,W_h1)+B_h1)
'''
W_h2=tf.Variable(tf.zeros([5,5]))
B_h2=tf.Variable(tf.zeros([5]))
h2=tf.nn.relu(tf.matmul(h1,W_h2)+B_h2)
'''
B_o=tf.Variable(tf.zeros([10]))
W_o=tf.Variable(tf.zeros([500,10]))
y=tf.nn.relu(tf.matmul(h1,W_o)+B_o)
y_=tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(cross_entropy)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
number_steps = 10000
batch_size = 100
for _ in range(number_steps):
    batch_xs, batch_ys = mnist.train.next_batch(batch_size)
    train=sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Print classifier's accuracy
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
OK, following @lejlot's suggestion, I changed my code as follows.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import os
os.chdir('../')
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x=tf.placeholder(tf.float32,[None,784])
W_h1=tf.Variable(tf.random_normal([784,500]))
B_h1=tf.Variable(tf.random_normal([500]))
h1=tf.nn.relu(tf.matmul(x,W_h1)+B_h1)
'''
W_h2=tf.Variable(tf.random_normal([500,500]))
B_h2=tf.Variable(tf.random_normal([500]))
h2=tf.nn.relu(tf.matmul(h1,W_h2)+B_h2)
'''
B_o=tf.Variable(tf.random_normal([10]))
W_o=tf.Variable(tf.random_normal([500,10]))
y= tf.matmul(h1,W_o)+B_o # notice no activation
y_=tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.nn.log_softmax(y), # notice log_softmax
reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
number_steps = 10000
batch_size = 100
for i in range(number_steps):
    batch_xs, batch_ys = mnist.train.next_batch(batch_size)
    train=sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    if i % 1000==0:
        acc=sess.run(accuracy,feed_dict={x: mnist.test.images, y_: mnist.test.labels})
        print('Current loop %d, Accuracy: %g'%(i,acc))
# Print classifier's accuracy
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
There are two modifications:
change the initial values of W_h1 and B_h1 to tf.random_normal
change the definitions of y and cross_entropy
The modifications do work. But I still don't know what was wrong with my original code. I call tf.global_variables_initializer().run(), and I thought this function would randomize the values of W_h1 and B_h1. Besides, if I define y and cross_entropy as follows, it doesn't work:
y= tf.nn.softmax(tf.matmul(h1,W_o)+B_o)
y_=tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y),reduction_indices=[1]))
First of all, this is not a valid classifier model.
y=tf.nn.relu(tf.matmul(h1,W_o)+B_o)
y_=tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
You are using the explicit equation for cross entropy, which requires y to be a (row-wise) probability distribution, yet you produce y by applying relu, meaning that you are simply outputting some non-negative numbers. In fact, if you ever output zeros, your code will produce NaNs and fail (as the log of 0 is minus infinity).
You should use
y = tf.nn.softmax(tf.matmul(h1,W_o)+B_o)
instead. Or even better (for better numerical stability):
y= tf.matmul(h1,W_o)+B_o # notice no activation
y_=tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(
-tf.reduce_sum(y_ * tf.nn.log_softmax(y), # notice log_softmax
reduction_indices=[1]))
Update
Second issue is initialisation - you cannot initialise neural network weights to zeros, they have to be random numbers, typically sampled from low-variance zero-mean Gaussians. Global initialiser does not randomise weights, it simply runs all the initialisation ops - if the initialisation ops are constant ones (like zeros), it simply makes sure these zeros are assigned to variables, nothing else (thus it can be used to reset the network etc.). Zero initialisation works only for convex problems, such as logistic regression, but cannot work for complex model like neural network.
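For illustration (a common pattern, not part of the original code), a zero-mean, low-variance Gaussian initialisation of the hidden and output layers might look like this:
# Small random weights break the symmetry; zero biases are fine once the weights are random.
W_h1 = tf.Variable(tf.truncated_normal([784, 500], stddev=0.1))
B_h1 = tf.Variable(tf.zeros([500]))

W_o = tf.Variable(tf.truncated_normal([500, 10], stddev=0.1))
B_o = tf.Variable(tf.zeros([10]))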
I'm creating a neural network using TensorFlow.
I have some helper functions in the help.py file:
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
def read_mnist(folder_path="MNIST_data/"):
    return input_data.read_data_sets(folder_path, one_hot=True)

def build_training(y_labels, y_output, learning_rate=0.5):
    # Define loss function
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_labels, logits=y_output))
    #train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
    train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)
    # Calculate accuracy
    correct_prediction = tf.equal(tf.argmax(y_output,1), tf.argmax(y_labels,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    return train_step, accuracy

def train_test_model(mnist, x_input, y_labels, accuracy, train_step, steps=1000, batch=100):
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    for i in range(steps):
        input_batch, labels_batch = mnist.train.next_batch(batch)
        feed_dict = {x_input: input_batch, y_labels: labels_batch}
        if i%100 == 0:
            train_accuracy = accuracy.eval(feed_dict=feed_dict)
            print("Step %d, training batch accuracy %g"%(i, train_accuracy))
        train_step.run(feed_dict=feed_dict)
    print("The end of training!")
    print("Test accuracy: %g"%accuracy.eval(feed_dict={x_input: mnist.test.images, y_labels: mnist.test.labels}))
Then I try to use it when training the network.
First I use a very simple model with just one output layer:
import help
import tensorflow as tf
import numpy as np
# Read mnist data
mnist = help.read_mnist()
image_size = 28
labels_size = 10
# Input layer - flattened images
x_input = tf.placeholder(tf.float32, [None, image_size*image_size])
y_labels = tf.placeholder(tf.float32, [None, labels_size])
# Layers:
# - Input
# - Output (Dense)
# Output dense layer
y_output = tf.layers.dense(inputs=x_input, units=labels_size)
# Define training
train_step, accuracy = help.build_training(y_labels, y_output)
# Run the training & test
help.train_test_model(mnist, x_input, y_labels, accuracy, train_step)
Then I add another ReLU layer:
# Hidden Layer
hidden = tf.layers.dense(inputs=x_input, units=hidden_size, activation=tf.nn.relu)
# Output dense layer
y_output = tf.layers.dense(inputs=hidden, units=labels_size)
I get a Segmentation fault error both times.
I tried a few things I found online, such as reordering the numpy and tensorflow imports, putting the help.py code in the same file as the network architecture and training code, and increasing the memory for the Docker image. Nothing worked.
Can someone help?
I upgraded to the next version of TensorFlow, version 1.2. It fixed all the issues.