Restoring and using multiple TensorFlow models at the same time - python

I'm quite new to TensorFlow and I struggle to understand how to use it. I am currently trying to use it to identify numbers, so I used the code provided in the MNIST tutorial ( https://www.tensorflow.org/get_started/mnist/pros ) with a few modifications. I used my own sources rather than the ones given in MNIST, and I changed a few parts of the code so that I can create models from sources of different sizes (28x28 and 56x56).
Then I saved the model as follows:
def save_progression(sess, id_collec, x, y_conv, y_, accuracy, keep_prob, train_step, i, modelDir):
    saver = tf.train.Saver()
    print(modelDir)
    modelNamePrefix=os.path.join(modelDir, "step%s" % str(i))
    if (os.path.isdir(modelNamePrefix) == False):
        os.makedirs(modelNamePrefix)
    if (len(tf.get_collection(id_collec)) > 0):
        tf.get_collection_ref(id_collec)[0] = x
        tf.get_collection_ref(id_collec)[1] = y_conv
        tf.get_collection_ref(id_collec)[2] = y_
        tf.get_collection_ref(id_collec)[3] = accuracy
        tf.get_collection_ref(id_collec)[4] = keep_prob
        tf.get_collection_ref(id_collec)[5] = train_step
    else:
        tf.add_to_collection(id_collec, x)
        tf.add_to_collection(id_collec, y_conv)
        tf.add_to_collection(id_collec, y_)
        tf.add_to_collection(id_collec, accuracy)
        tf.add_to_collection(id_collec, keep_prob)
        tf.add_to_collection(id_collec, train_step)
    saver.save(sess, os.path.join(modelNamePrefix, "myModel"))
where:
sess is the tf.InteractiveSession()
id_collec is '28x28' or '56x56'
x is the placeholder for the input images
y_conv is the result of a tf.matmul
accuracy is the result of tf.reduce_mean
y_ is the placeholder that defines the number of classes
keep_prob is a placeholder for a float
train_step is the result of tf.train.AdamOptimizer
i is just a number used to change the output directory for the model
modelDir is where the model directory will be created
Then, in another program, I restore the model as follows:
self._sess = tf.Session()
print("import meta graph %s.meta" % (os.path.join(modelDir, modelName)))
saver = tf.train.import_meta_graph("%s.meta" % (os.path.join(modelDir, modelName)))
print("restoring %s" % (os.path.join(modelDir, modelName)))
saver.restore(self._sess, "%s" % (os.path.join(modelDir, modelName)));
self._placeHolder_x, self._predictNode, _, _, self._placeHolder_keep_prob, _ = tf.get_collection('%dx%d' % (dim, dim))
When I load one model it's OK, but when I load two different models (one based on 28x28 images and one based on 56x56 images) I get an error on the second saver.restore call.
[...]
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [3136,1024] rhs shape= [5,5,64,128] [[Node: save/Assign_14 = Assign[T=DT_FLOAT,
_class=["loc:#Variable_4"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](Variable_4/Adam_1, save/RestoreV2_14)]]
Caused by op u'save/Assign_14'
[...]
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [3136,1024] rhs shape= [5,5,64,128] [[Node: save/Assign_14 = Assign[T=DT_FLOAT, _class=["loc:#Variable_4"], use_locking=true, validate_shape=true,
_device="/job:localhost/replica:0/task:0/cpu:0"](Variable_4/Adam_1, save/RestoreV2_14)]]
What am I doing wrong? Clearly, the two models share some variables or something.
I first thought it was because I used the same id for the collection, so I made the ids different. But the error is in the restore itself, not even in the get_collection call.
I heard there is a way to use a scope of some kind to keep the two models from sharing things, but I don't understand how that works.
When I searched the web for answers I found a lot of information, but being new to TensorFlow I failed to apply it to my code.
Any ideas?
PS: I put those values in the collection because I need them, either to continue training at a later date if I want to, or to launch sess.run.

OK, I found a solution. I added
dim = int(sys.argv[5])
with tf.variable_scope('%dx%d' % (dim, dim)):
before the call to my function that defines the entire graph, and I added the same lines before restoring the graph as well. It now runs without crashing.

The problem might be that you're restoring both models into the same Graph. You might want to initialize separate graphs for each of your models:
graph1 = tf.Graph()
graph2 = tf.Graph()
with tf.Session(graph = graph1) as sess1:
    saver.restore(.....)
with tf.Session(graph = graph2) as sess2:
    saver.restore(...)
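As a side note on reading the error in the question: the lhs shape [3136, 1024] can be reproduced with simple arithmetic, which helps identify which model the failing variable belongs to. The sketch below is plain Python, not TensorFlow. It assumes the architecture of the linked MNIST tutorial (64 feature maps before the first dense layer, and two 2x2 max-pooling layers with SAME padding), and flattened_size with its parameters is a hypothetical helper made up for illustration.

```python
import math


def flattened_size(image_side, num_pools=2, channels=64):
    """Size of the flattened feature map fed to the first dense layer.

    Assumes square inputs and 2x2 max pooling with stride 2 and SAME
    padding, so each pooling layer halves the side (rounding up).
    """
    side = image_side
    for _ in range(num_pools):
        side = math.ceil(side / 2)
    return side * side * channels


print(flattened_size(28))  # 7 * 7 * 64 = 3136
print(flattened_size(56))  # 14 * 14 * 64 = 12544
```

Under these assumptions, 3136 matches the lhs of the failing Assign, so the variable being written probably belongs to the 28x28 model, while the value read from the checkpoint ([5, 5, 64, 128]) looks like a convolution kernel from the other model. That is consistent with two models that use the same auto-generated variable names (Variable_4 and so on) ending up in one graph, which both separate graphs and distinct variable scopes avoid.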

Related

Tensorflow - Unable to obtain gradients between loss and input image

I'm trying to produce adversarial examples for a semantic segmentation classifier, which involves optimising an image using the gradients of the loss with respect to the input image variable (where the loss is between the current and goal network outputs).
However, no matter what I've tried, I can't seem to create the graph in a way that allows these gradients to be calculated. I need to ensure that the calculated network output for each iteration of the image is not disconnected from the loss.
Here is the code. I've not included absolutely everything as it would be nightmarishly long. The model builder is a method from the code suite I'm trying to adapt. I'm sure that this must be some kind of trivial misunderstanding on my part.
#From elsewhere - x is the processed input image and yg is calculated using argmin on the output
#of a previous run through the network.
x = self.xclean
self.get_ygoal()
yg = self.ygoal
yg = tf.convert_to_tensor(yg)
tf.reset_default_graph()
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
    #sess.run(tf.global_variables_initializer())
    net_input = tf.placeholder(tf.float32,shape=[None,None,None,3])
    net_output = tf.placeholder(tf.float32,shape=[None,None,None,self.num_classes])
    network, _ = model_builder.build_model(self.model, net_input=net_input,
                                           num_classes=self.num_classes,
                                           crop_width=self.dims[0],
                                           crop_height=self.dims[1],
                                           is_training=True)
    print('Loading model checkpoint weights')
    checkpoint_path = 'checkpoints/latest_model_'+self.model+'_'+self.dataset+'.ckpt'
    saver=tf.train.Saver(max_to_keep=1000)
    saver.restore(sess, checkpoint_path)
    img = tf.Variable(tf.zeros(shape=(1,self.dims[0], self.dims[1], 3)),name='img')
    assign = tf.assign(img,net_input)
    learning_rate = tf.constant(lr,dtype=tf.float32)
    loss = tf.nn.softmax_cross_entropy_with_logits_v2(logits=network, labels=net_output)
    optim_step = tf.compat.v1.train.GradientDescentOptimizer(learning_rate).minimize(loss, var_list=[img])
    epsilon_ph = tf.placeholder(tf.float32, ())
    below = net_input - epsilon_ph
    above = net_input + epsilon_ph
    projected = tf.clip_by_value(tf.clip_by_value(img, below, above), 0, 1)
    with tf.control_dependencies([projected]):
        project_step = tf.assign(img, projected)
    sess.run(assign, feed_dict={net_input: x})
    for i in range(steps):
        print('Starting...')
        # gradient descent step
        _, loss_value = sess.run([optim_step], feed_dict={net_input:x,net_output:yg})
        # project step
        sess.run(project_step, feed_dict={net_input: x, epsilon_ph: epsilon})
        if (i+1) % 10 == 0:
            print('step %d, loss=%g' % (i+1, loss_value))
    adv = img.eval() # retrieve the adversarial example
Here's the error message I get:
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'img:0' shape=(1, 512, 512, 3) dtype=float32_ref>"] and loss Tensor("softmax_cross_entropy_with_logits/Reshape_2:0", shape=(?, ?, ?), dtype=float32).
I should mention that this is using Tensorflow 1.14 - as the code suite is built around it.
Thanks in advance.

How to save and restore a lstm trained model in Tensorflow using Saver?

I have saved a trained LSTM model and I want to restore the prediction to use it in testing. I was trying to follow this post. But I am getting errors. Here is what I tried:
x = tf.placeholder('tf.float32', [None, input_vec_size, 1])
y = tf.placeholder('tf.float32')

def recurrent_neural_network(x):
    layer = {'weights': tf.Variable(tf.random_normal([n_hidden, n_classes])),
             'biases': tf.Variable(tf.random_normal([n_classes]))}
    x = tf.transpose(x, [1, 0, 2])
    x = tf.reshape(x, [-1, 1])
    x = tf.split(x, input_vec_size, 0)
    lstm_cell = rnn.BasicLSTMCell(n_hidden, state_is_tuple=True)
    outputs, states = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)
    output = tf.add(tf.matmul(outputs[-1], layer['weights']), layer['biases'])
    return output

def train_neural_network(x):
    prediction = recurrent_neural_network(x)
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Training ...
        saver.save(sess, os.path.join(os.getcwd(), 'my_test_model'))
After that, in the testing phase, I am trying:
def test_neural_network(input_data):
    with tf.Session() as sess:
        #sess.run(tf.global_variables_initializer())
        new_saver = tf.train.import_meta_graph('my_test_model.meta')
        new_saver.restore(sess, tf.train.latest_checkpoint('./'))
        prediction = tf.get_default_graph().get_tensor_by_name("prediction:0")
        # Calculate features from input_data ...
        result = sess.run(tf.argmax(prediction.eval(feed_dict={x: features}), 1))
But this throws the following error:
KeyError: "The name 'prediction:0' refers to a Tensor which does not exist. The operation, 'prediction', does not exist in the graph."
Then I tried adding tf.add_to_collection('prediction', prediction) before saving, and replacing the lookup with prediction = tf.get_collection('prediction')[0] after restoring. But this gives me the following error:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_2' with dtype float and shape [?,34,1]
[[Node: Placeholder_2 = Placeholder[dtype=DT_FLOAT, shape=[?,34,1], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
I know that for the first error I am supposed to assign a name in order to restore, but prediction is not a TensorFlow variable. I went through a few previous posts and articles but was unable to come up with a working solution. So, my questions are:
Am I doing something conceptually wrong? If so, what?
If not, is there an implementation error? And how do I solve it?
Thanks.
I was finally able to save my trained model, so I am posting an answer in case anyone comes across this question. I did not find a solution to the exact problem, but I could build and save my model using tflearn. To train and store:
model = tflearn.DNN(lstm_model(n_classes, input_vec_size))
model.fit(train_x, train_y, validation_set=(test_x, test_y), n_epoch=20,
          show_metric=True, snapshot_epoch=True, run_id='lstm_model')
model.save("../Models/lstm_model")
And later, to restore:
model.load(filepath+"lstm_model")
This seemed to be a far easier way to work with the model, and it provides a compact way to do the task I posted in the question.

Serving a TensorFlow classifier

I have been fighting with TensorFlow's builder to be able to serve my model. I am trying to feed data to my classifier after serving the model.
My question is: how would I feed the input to the model?
I have seen the code used by Google's Inception tutorial and have tried to implement it:
classify_inputs_tensor_info = utils.build_tensor_info(
    serialized_tf_example)
classes_output_tensor_info = utils.build_tensor_info(classes)
scores_output_tensor_info = utils.build_tensor_info(values)
classification_signature = signature_def_utils.build_signature_def(
    inputs={
        signature_constants.CLASSIFY_INPUTS: classify_inputs_tensor_info
    },
    outputs={
        signature_constants.CLASSIFY_OUTPUT_CLASSES:
            classes_output_tensor_info,
        signature_constants.CLASSIFY_OUTPUT_SCORES:
            scores_output_tensor_info
    },
    method_name=signature_constants.CLASSIFY_METHOD_NAME)
From what I understand, the input is passed to a tensor called serialized_tf_example which, as the name suggests, holds the input serialized as a string. They then use tf.FixedLenFeature, which I don't understand, parse serialized_tf_example with tf.parse_example, and assign the result to x, which is used within the model. I would like to pass it to a classifier that accepts arrays as inputs, but I don't know how to go about this.
While trying to implement this I wrote:
serialized_tf_example = tf.placeholder(tf.string, name='tf_example')
feature_configs = { 'audio/encoded': tf.FixedLenFeature( shape=[193], dtype=tf.float32, default_value=input_x),}
tf_example = tf.parse_example(serialized_tf_example, feature_configs)
x = tf_example['audio/encoded']
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
# Define the dimensions in the feature columns
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=5)]
classifier = tf.contrib.learn.DNNLinearCombinedClassifier(
    dnn_feature_columns=feature_columns, dnn_hidden_units=[200,300], n_classes=10,
    dnn_optimizer=tf.train.GradientDescentOptimizer(
        learning_rate=0.01
    )
)
#run training
classifier.fit(input_fn=get_train_inputs, steps=100)
#testing
accuracy_score = classifier.evaluate(input_fn=get_test_inputs, steps=10)["accuracy"]
print('Test accuracy : ', format(accuracy_score))
prediction = format(list(classifier.predict_classes(x, as_iterable=True)))
But x is a tensor and so cannot be read directly. When I try to use run or .eval(), it asks me to feed a value to serialized_tf_example:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'tf_example' with dtype string
[[Node: tf_example = Placeholder[dtype=DT_STRING, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
When I use prediction = format(list(classifier.predict_classes(np.array(x), as_iterable=True))) I get:
InvalidArgumentError (see above for traceback): Shape in
shape_and_slice spec [1,200] does not match the shape stored in
checkpoint: [193,200] [[Node: save/RestoreV2_1 =
RestoreV2[dtypes=[DT_FLOAT],
_device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_1/tensor_names, save/RestoreV2_1/shape_and_slices)]]
You can, and should, use classifier.predict without tf.Example. Your input_fn in train and eval returns x, y; you can write a predict_input_fn similar to the other input functions.
prediction = next(classifier.predict_classes(input_fn=predict_input_fn))
Please note that if you collect all predictions with list(), the input function should end by raising an exception; otherwise the iteration has nothing to stop it. You can check tf.estimator.inputs.numpy_input_fn.

How to change dimension of input during TensorFlow import_graph_def

My scenario:
1. Define an RNN model structure and train it using an input with fixed batch size and sequence length.
2. Freeze the model (i.e. converting all trainable variables into constants) producing a GraphDef containing everything one needs to use the model at test-time (via tf.graph_util.convert_variables_to_constants).
3. Import the GraphDef via tf.import_graph_def and replace the input using the input_map argument. The new input needs to have arbitrary batch size and sequence length.
The problem: All of the above works until I pass in an input to the test-time graph that uses a batch size or sequence length that differs from the original sizes used at training-time. At that point I get an error like this:
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,5] vs. shape[1] = [2,7]
[[Node: import/rnn/while/basic_rnn_cell/basic_rnn_cell_1/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](import/rnn/while/TensorArrayReadV3, import/rnn/while/Identity_2, import/rnn/while/basic_rnn_cell/basic_rnn_cell_1/concat/axis)]]
To illustrate and reproduce the problem, please consider the following minimal examples.
v1: a graph is created with arbitrary batch size and sequence length. This works fine but unfortunately I must use a fixed batch size and sequence length at training-time and must use an arbitrary batch size and sequence length at test-time so I can't use this simple approach.
v2a: we simulate creating the training-time graph with fixed batch size (2) and sequence length (3) and freeze the graph.
v2ba: we demonstrate that loading the frozen model in unchanged still produces the same results.
v2bb: we demonstrate that loading the frozen model in with a replaced input that still uses a fixed batch size and sequence length still produces the same results.
v2bc: we demonstrate that loading the frozen model in with a replaced input that uses arbitrary batch size and sequence length still produces the same results, as long as the input is shaped according to the original batch size and sequence length. It works with data but fails with data2 -- the only difference being the batch size of the former is 2 and the batch size of the latter is 1.
Is it possible to change an RNN graph via the input_map argument to tf.import_graph_def such that the input no longer has a fixed batch size and sequence length?
The following code works with TensorFlow 1.1 RC2 and may work with TensorFlow 1.0.
import numpy
import tensorflow as tf
from tensorflow import graph_util as tf_graph_util
from tensorflow.contrib import rnn as tfc_rnn

def v1(data):
    with tf.Graph().as_default():
        tf.set_random_seed(1)
        x = tf.placeholder(tf.float32, shape=(None, None, 5))
        _, s = tf.nn.dynamic_rnn(tfc_rnn.BasicRNNCell(7), x, dtype=tf.float32)
        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            print session.run(s, feed_dict={x: data})

def v2a():
    with tf.Graph().as_default():
        tf.set_random_seed(1)
        x = tf.placeholder(tf.float32, shape=(2, 3, 5), name="x")
        _, s = tf.nn.dynamic_rnn(tfc_rnn.BasicRNNCell(7), x, dtype=tf.float32)
        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            return tf_graph_util.convert_variables_to_constants(
                session, session.graph_def, [s.op.name]), s.name

def v2ba((graph_def, s_name), data):
    with tf.Graph().as_default():
        x, s = tf.import_graph_def(graph_def,
                                   return_elements=["x:0", s_name])
        with tf.Session() as session:
            print '2ba', session.run(s, feed_dict={x: data})

def v2bb((graph_def, s_name), data):
    with tf.Graph().as_default():
        x = tf.placeholder(tf.float32, shape=(2, 3, 5))
        [s] = tf.import_graph_def(graph_def, input_map={"x:0": x},
                                  return_elements=[s_name])
        with tf.Session() as session:
            print '2bb', session.run(s, feed_dict={x: data})

def v2bc((graph_def, s_name), data):
    with tf.Graph().as_default():
        x = tf.placeholder(tf.float32, shape=(None, None, 5))
        [s] = tf.import_graph_def(graph_def, input_map={"x:0": x},
                                  return_elements=[s_name])
        with tf.Session() as session:
            print '2bc', session.run(s, feed_dict={x: data})

def main():
    data1 = numpy.random.random_sample((2, 3, 5))
    data2 = numpy.random.random_sample((1, 3, 5))
    v1(data1)
    model = v2a()
    v2ba(model, data1)
    v2bb(model, data1)
    v2bc(model, data1)
    v2bc(model, data2)

if __name__ == "__main__":
    main()
This is a bug in tensorflow that has been going on for a while: you cannot reliably replace a placeholder with a defined shape with another one with (partially) undefined shape.
You will find a related issue filed here, which apparently did not get much attention.
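One way to read the ConcatOp error from the question: the two shapes it reports, [1,5] and [2,7], are consistent with a test-time input of batch size 1 (5 features) being joined with an RNN state still sized for the training batch of 2 (7 units). The sketch below is plain Python, not TensorFlow, and concat_shape is a hypothetical helper made up for illustration. It only reproduces the shape rule for concatenation, under the assumption that the cell joins input and state along the feature axis, as the node name ending in /concat suggests.

```python
def concat_shape(shapes, axis):
    """Return the shape produced by concatenating tensors of these shapes.

    Every dimension except `axis` must match; otherwise ValueError is raised.
    """
    first = shapes[0]
    total = 0
    for shape in shapes:
        if len(shape) != len(first):
            raise ValueError("rank mismatch: %r vs %r" % (first, shape))
        for dim, (a, b) in enumerate(zip(first, shape)):
            if dim != axis and a != b:
                raise ValueError(
                    "dimension %d differs: %r vs %r" % (dim, first, shape))
        total += shape[axis]
    result = list(first)
    result[axis] = total
    return tuple(result)


# Training-time sizes: batch 2, 5 input features, 7 hidden units.
print(concat_shape([(2, 5), (2, 7)], axis=1))

# Test-time batch of 1 against a state still shaped for batch 2.
try:
    concat_shape([(1, 5), (2, 7)], axis=1)
except ValueError as err:
    print("mismatch:", err)
```

If this reading is right, swapping the placeholder through input_map changes only the input side of the concatenation, while whatever was built from the fixed batch size at training time keeps its original shape in the frozen graph.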

Feed a trainable variable into a placeholder in TensorFlow

Suppose I want to train a model over a number of samples (and also variables) that are known only at run time.
Case study: PCA (X W = Y)
(this is only a simplification of a much more complex model)
Take for example this simple PCA model, where only the feature dimensions (D_in and D_out) are known and fixed.
W = tf.Variable(tf.zeros([D_in, D_out]), name='weights', trainable=True)
X = tf.placeholder(tf.float32, [None, D_in], name='placeholder_latent')
Y_est = tf.matmul(X, W)
loss = tf.reduce_sum((Y_tf-Y_est)**2)
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)
Suppose now we generate some data
W_true = np.random.randn(D_in, D_out)
X_true = np.random.randn(N, D_in)
Y_true = np.dot(X_true, W_true)
Y_tf = tf.constant(Y_true.astype(np.float32))
As soon as I know the dimension of my training data, I can declare the latent variable that will be fed to the placeholder X to be optimised.
latent = tf.Variable(tf.zeros([N, D_in]), name='latent', trainable=True)
init_op = tf.global_variables_initializer()
After that, what I would like to do is to feed this latent variable to the placeholder X and run the optimisation.
with tf.Session() as sess:
    sess.run(init_op)
    for n in range(10000):
        sess.run(train_step, feed_dict={X : sess.run(latent)})
        if (n+1) % 1000 == 0:
            print('iter %i, %f' % (n+1, sess.run(loss, feed_dict={X : sess.run(latent)})))
The problem is that the optimiser does not optimise W and latent, but only W. I have tried also to feed the variable directly without evaluation but I get this error:
ValueError: setting an array element with a sequence.
Have you ever encountered this kind of issue? Do you know how to overcome this problem? Is there any possible workaround to optimise on a placeholder?
By the way, I am using TensorFlow 1.1.0rc0 with Python 2.7.13
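For reference, the loss in the question is differentiable with respect to both W and X. Writing Y for the target (Y_tf in the code) and using the same symbols as the model above, the gradients work out as follows. This is a worked derivation, not TensorFlow code.

```latex
\begin{aligned}
L(X, W) &= \lVert Y - XW \rVert_F^2 \\
\frac{\partial L}{\partial W} &= -2\, X^{\top} (Y - XW) \\
\frac{\partial L}{\partial X} &= -2\, (Y - XW)\, W^{\top}
\end{aligned}
```

So an update for X exists mathematically. The optimiser can only compute it when the latent variable is itself part of the graph that Y_est is built from. A value fetched with sess.run(latent) and passed through feed_dict is a plain NumPy array, so no gradient path leads from the loss back to latent.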
