Feed Iterator to Tensorflow Graph - python

I have a tf.data.Iterator created with make_one_shot_iterator() and want to use it to train my (existing) model.
Currently my training looks like this:
input_node = tf.placeholder(tf.float32, shape=(None, height, width, channels))
net = models.ResNet50UpProj({'data': input_node}, batch_size, keep_prob=True, is_training=True)
labels = tf.placeholder(tf.float32, shape=(None, width, height, 1))
huberloss = tf.losses.huber_loss(predictions=net.get_output(), labels=labels)
And then calling:
sess.run(train_op, feed_dict={labels: output_img, input_node: input_img})
After training I can get a prediction like this:
pred = sess.run(net.get_output(), feed_dict={input_node: img})
Now, with an iterator, I tried something like this:
next_element = iterator.get_next()
Passing the input data like this:
net = models.ResNet50UpProj({'data': next_element[0]}, batch_size, keep_prob=True, is_training=True)
Defining the loss function like this:
huberloss = tf.losses.huber_loss(predictions=net.get_output(), labels=next_element[1])
And running the training step like this, with the iterator advancing automatically on every call:
sess.run(train_op)
My problem is that after training I can't make any predictions. Or rather, I don't know the proper way to use the iterator in my case.

Solution 1: create a separate sub-graph just for inference. This matters especially when you have layers that behave differently at test time, like batch normalization and dropout (is_training=False).
# The following code assumes that you create variables with `tf.get_variable`.
# If you create variables manually, you have to reuse them manually.
with tf.variable_scope('somename'):
    net = models.ResNet50UpProj({'data': next_element[0]}, batch_size, keep_prob=True, is_training=True)
with tf.variable_scope('somename', reuse=True):
    net_for_eval = models.ResNet50UpProj({'data': some_placeholder_or_inference_data_iterator}, batch_size, keep_prob=True, is_training=False)
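After training, getting a prediction from the inference sub-graph could then look like this (a sketch, assuming some_placeholder_or_inference_data_iterator is a tf.placeholder for input images):
pred = sess.run(net_for_eval.get_output(), feed_dict={some_placeholder_or_inference_data_iterator: img})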
Solution 2: use a feed_dict. You can replace almost any tf.Tensor, not just tf.placeholder, with a feed dict:
sess.run(huberloss, {next_element[0]: inference_image, next_element[1]: inference_labels})
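In the same way, a prediction after training could be obtained by overriding the iterator's output with your own data (a sketch based on the graph above):
pred = sess.run(net.get_output(), feed_dict={next_element[0]: img})
Note that is_training=True is baked into this graph, so batch normalization and dropout still run in training mode here; that is why Solution 1 is usually preferable.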

Related

Tensorflow: How to get the correct output of a pretrained Keras model with inputs from another Tensorflow model

I have a pre-trained Keras model "model_keras" and I want to use it in a loss function. The input of "model_keras" is the output of another Tensorflow model "model_tf" (a generative model). I'm trying to update the weights of "model_tf" by minimizing the loss. During the optimization, "model_keras" is only used for inference and will not get updated. My problem is that I'm not able to get the correct inference result from "model_keras", and because of this I'm not able to update "model_tf" correctly. The code is shown below:
def loss_func(input, target, model_keras): # the input is an output of another Tensorflow model
    inference_res = model_keras(input)
    loss = tf.reduce_mean(inference_res - target)
    return loss
train_phase = tf.placeholder(tf.bool)
z = tf.placeholder(tf.float32, [None, 128])
y = tf.placeholder(tf.int32, [None])
t = tf.placeholder(tf.float32, [None, 10])
model_tf = Generator("generator") # Building the Tensorflow model "model_tf"
fake_img = model_tf(z, train_phase, y, NUMS_CLASS) # fake_img is the output of "model_tf" and will serve as the input of "model_keras"
model_keras = MyKerasModel("Vgg19") # Loading the pretrained Keras model
G_loss = loss_func(fake_img, t, model_keras)
G_opt = tf.train.AdamOptimizer(4e-4, beta1=0., beta2=0.9).minimize(G_loss, var_list=model_tf.var_list())
sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(G_opt, feed_dict={z: Z, train_phase: True, y: Y, t: target}) # Z, Y and target are numpy arrays.
I also tried to use model.predict(input) but got the ValueError: "When feeding symbolic tensors to a model, we expect the tensors to have a static batch size". The reason is that model.predict() expects the input to be real data instead of a symbolic tensor. However, since I want to update the weights of "model_tf", I need to keep the loss function differentiable and compute the gradients. Therefore, I cannot just pass a numpy array to "model_keras".
How can I get the correct output (inference_res) of "model_keras" in this case? The Tensorflow and Keras versions I'm using are 1.15 and 2.2.5, respectively.
If I understood your question correctly, here is an idea: pass your input to model_keras and let's name the output keras_y. Then freeze model_keras and append it to the end of model_tf, so that you have one big model that is the sequence of model_tf followed by model_keras (where the second part is frozen). Next, feed your inputs to this combined model and name its output model_y. Now you can compute the loss as loss_func(keras_y, model_y).
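A rough sketch of that idea, keeping the names from the question; real_input is an illustrative placeholder for the data whose reference output keras_y should come from, and model_keras is assumed to behave like a standard Keras Model:
# Freeze the pretrained Keras model so only model_tf gets updated.
for layer in model_keras.layers:
    layer.trainable = False
keras_y = model_keras(real_input)                   # reference output of the frozen model
fake_img = model_tf(z, train_phase, y, NUMS_CLASS)  # generator output
model_y = model_keras(fake_img)                     # output of the chained model
G_loss = tf.reduce_mean(model_y - keras_y)          # same loss form as in the question
# var_list already restricts the update to model_tf's weights, as in the question.
G_opt = tf.train.AdamOptimizer(4e-4, beta1=0., beta2=0.9).minimize(G_loss, var_list=model_tf.var_list())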

Tensorflow: how to use RNN initial state in an estimator with different batch size for training and testing?

I am working on a Tensorflow estimator using RNN (GRUCell).
I use zero_state to initialize the first state; it requires a fixed size.
My problem is that I want to be able to use the estimator to predict with a single sample (batchsize=1).
When I load the serialized estimator, it complains that the size of the batch I use for prediction does not match the training batch size.
If I reconstruct the estimator with a different batch size, I cannot load what has been serialized.
Is there an elegant way to use zero_state in an estimator?
I saw some solutions that use a variable to store the batch size, but they rely on the feed_dict mechanism. I can't find how to make that work in the context of an estimator.
Here is the core of my simple test RNN in the estimator:
cells = [tf.nn.rnn_cell.GRUCell(self.getNSize()) for _ in range(self.getNLayers())]
multicell = tf.nn.rnn_cell.MultiRNNCell(cells, state_is_tuple=False)
H_init = tf.Variable(multicell.zero_state(batchsize, dtype=tf.float32), trainable=False)
H = tf.Variable(H_init)
Yr, state = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=H)
Would someone have a clue on that?
EDIT:
OK, I have tried various things on this problem.
I am now trying to filter the variables loaded from the checkpoint to remove 'H', which is used as the internal state of the recurrent cells. For prediction, I can leave it with all-zero values.
So far, I did that:
First I define a hook:
class RestoreHook(tf.train.SessionRunHook):
    def __init__(self, init_fn):
        self.init_fn = init_fn

    def after_create_session(self, session, coord=None):
        print("--------------->After create session.")
        self.init_fn(session)
Then in my model_fn:
if mode == tf.estimator.ModeKeys.PREDICT:
    logits = tf.nn.softmax(logits)
    # Do not restore H as its batch size might be different.
    vlist = tf.contrib.framework.get_variables_to_restore()
    vlist = [x for x in vlist if x.name.split(':')[0] != 'architecture/H']
    init_fn = tf.contrib.framework.assign_from_checkpoint_fn(tf.train.latest_checkpoint(self.modelDir), vlist, ignore_missing_vars=True)
    spec = tf.estimator.EstimatorSpec(
        mode=mode,
        predictions={'logits': logits},
        export_outputs={'prediction': tf.estimator.export.PredictOutput(logits)},
        prediction_hooks=[RestoreHook(init_fn)])
I took this piece of code from https://github.com/tensorflow/tensorflow/issues/14713
But it does not work yet. It seems that it is still trying to load H from the file... I checked that H is not in vlist.
I am still looking for a solution.
You can get the batch size from another tensor. For example:
decoder_initial_state = cell.zero_state(array_ops.shape(attention_states)[0],
                                        dtypes.float32).clone(cell_state=encoder_state)
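Applied to the GRUCell code from the question, this could look like the following sketch; it derives the batch size from the input tensor Xo and drops the trainable-variable wrapper around the state, so the state is no longer stored in the checkpoint:
# Dynamic batch size, known only at run time.
batch_size = tf.shape(Xo)[0]
H = multicell.zero_state(batch_size, dtype=tf.float32)
Yr, state = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=H)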
I found a solution:
I create the variables for the initial state for both batchsize=64 and batchsize=1.
At training I use the first one to initialize the RNN.
At predict time, I use the second one.
It works because both variables are serialized and restored by the estimator code, so it does not complain.
The drawback is that the query batch size (in my case, 1) must be known at training time (when both variables are created).
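A minimal sketch of that workaround, assuming mode is the model_fn's mode argument and the two batch sizes are 64 and 1 as described:
# Both state variables are created (and therefore checkpointed) regardless of mode.
H_train = tf.Variable(multicell.zero_state(64, dtype=tf.float32), trainable=False, name='H_train')
H_pred = tf.Variable(multicell.zero_state(1, dtype=tf.float32), trainable=False, name='H_pred')
H = H_train if mode == tf.estimator.ModeKeys.TRAIN else H_pred
Yr, state = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=H)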

Tensorflow error concerning the shape of placeholders

I'm very new to TensorFlow and I'm trying to understand the concept of placeholders.
Let's say I have a feature set with shape 100x4, so 100 rows of 4 different features. The target then has shape 100x1. If I want to use both matrices as a training set, this is what I do:
X = tf.placeholder(tf.float64, shape=X_train.shape)
Y = tf.placeholder(tf.float64, shape=y_train.shape)
W = tf.Variable(tf.random_normal([4, 1]), name="weight",dtype=tf.float32)
b = tf.Variable(rng.randn(), name="bias",dtype=tf.float32)
pred = tf.add(tf.multiply(X, W), b)
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)
    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(X_train, y_train):
            sess.run(optimizer, feed_dict={X: x, Y: y})
    ... # some plotting and printing of results
Which then results in a "ValueError: Cannot feed value of shape (...,) for Tensor 'Placeholder:0', which has shape '(..., ...)'". More specifically, the dimensions are not equal for 'sub' in the cost function.
Could someone explain how to proceed and why?
Thanks in advance
You should use placeholders if you want to train your data in batches.
Why?
This is done when you have a large dataset. For example, if you want to train your classifier on an image classification problem but can't load all of your training images into memory, you instead train your model through batch gradient descent. With this technique only a single batch of images is loaded each time and backpropagation is performed only on that batch. This requires more epochs to converge to a minimum, but each epoch is faster to train.
How?
You first define two placeholders, one for the training examples X and one for their labels Y, with respective shapes (batch_size, 4) and (batch_size, 1) in your case.
Then when you want to train your model you should feed your data into the placeholders through a feed dictionary:
with tf.Session() as sess:
    sess.run(train_op, feed_dict={X: x_batch, Y: y_batch}) # train_op is the operation that minimizes your cost function
where x_batch and y_batch should be random batches from your X_train and Y_train arrays, but instead of 100 examples they should have batch_size examples (so that their dimensions match the placeholders' dimensions).
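For example, a minimal way to draw such a random batch with NumPy (assuming X_train and Y_train are numpy arrays of 100 examples and batch_size is the size you chose):
import numpy as np

# Sample batch_size distinct row indices, then gather the corresponding examples.
indices = np.random.choice(len(X_train), size=batch_size, replace=False)
x_batch = X_train[indices]  # shape (batch_size, 4)
y_batch = Y_train[indices]  # shape (batch_size, 1)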
Why shouldn't you do this in your case?
Since you have a small dataset that is already loaded in memory, you can use regular gradient descent.
How?
Just use variables (tf.Variable()) instead of placeholders.
X = tf.Variable(X_train)
Y = tf.Variable(Y_train)
This will create two Variable-type tensors which, when initialized, will take the shape and values of X_train and Y_train respectively.
Just don't forget to initialize them in your session:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer()) # initialize variables
    sess.run(train_op) # no need for a feed_dict
understand the concept of Placeholders
Placeholders are needed to hold a place for real data that you will feed in the future:
x = tf.placeholder(tf.float32, shape=X_train.shape)
logits = nn(x) # making some operations with x in order to calculate logits
s = tf.Session()
logits = s.run(logits, feed_dict={x: X_train})
Because we used a placeholder to build logits, we need to supply real data in place of the placeholder in order to compute logits.
"ValueError: Cannot feed value of shape (...,) for Tensor 'Placeholder:0', which has shape '(..., ...)'"
It looks like in feed_dict={x: X_train} your placeholder x has rank 2 but X_train has rank 1. Better to double-check your data.
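One way to make the shapes in the question line up is to give the placeholders a flexible batch dimension and reshape each fed sample accordingly. A sketch (it also switches to tf.matmul, which fits the (None, 4) x (4, 1) shapes, and to tf.float32 to match W and b):
X = tf.placeholder(tf.float32, shape=(None, 4))  # batch dimension left flexible
Y = tf.placeholder(tf.float32, shape=(None, 1))
pred = tf.add(tf.matmul(X, W), b)                # (None, 4) x (4, 1) -> (None, 1)
...
sess.run(optimizer, feed_dict={X: x.reshape(1, 4), Y: y.reshape(1, 1)})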

TF: how to create a dataset from user input data

I've started recently to play with tensorflow and, more specifically, with the new dataset API.
I've successfully used a dataset to feed training data to my simple model by plugging the dataset's iterators into the nodes of my graph representing the input and label. Something like:
input = input_dataset.make_one_shot_iterator().get_next()
label = label_dataset.make_one_shot_iterator().get_next()
Now I'm wondering what to do when I have to run inference on user input, that is, the user gives me one single input value and I have to make my prediction. If I had a placeholder I would just put the user input in a feed_dict, but with the dataset API I have very little idea how to do something similar. Shall I have a separate graph only for inference, in which my input variable is a placeholder?
I've already tried to make a feedable iterator as described here, but that only works with a placeholder for strings, while my inputs are int32.
Thanks for any advice.
For that specific purpose, tensorflow provides the tf.placeholder_with_default API:
# Create a Dataset
dataset = tf.data.Dataset.zip((input_dataset, label_dataset)).batch(32).repeat(...)
# Create an Iterator
input, label = dataset.make_one_shot_iterator().get_next()
# Create Placeholders
x = tf.placeholder_with_default(input, shape=[...], name='input')
y = tf.placeholder_with_default(label, shape=[...], name='label')

def nn_model(features, labels):
    logits = ...
    loss = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)
    return optimizer, loss, logits

# Create Model
train_op, loss_op, logits_op = nn_model(x, y)
# Training
sess.run(train_op)
# Inference
sess.run(logits_op, feed_dict={x: ..., y: ...})
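Note that nothing needs to be fed during training: x and y default to the next element of the dataset. Only at inference time do you pass a feed_dict, and the fed value overrides the dataset input. Giving the shape argument a flexible batch dimension (e.g. [None, ...]) makes both the dataset's batches and a single user sample compatible with the same placeholder.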

Tensorflow: Feeding placeholder from variable

I'm feeding a tensorflow computation (train) graph using an input queue and the tf.train.batch function that prepares a huge tensor with data.
I have another queue with test data I would like to feed to the graph every 50th step.
Question
Given the form of the input (tensors), do I have to define a separate test graph for the test data computation, or can I somehow reuse the train graph?
# Prepare data
batch = tf.train.batch([train_image, train_label], batch_size=200)
batchT = tf.train.batch([test_image, test_label], batch_size=200)
x = tf.reshape(batch[0], [-1, IMG_SIZE, IMG_SIZE, 3])
y_ = batch[1]
xT = tf.reshape(batchT[0], [-1, IMG_SIZE, IMG_SIZE, 3])
y_T = batchT[1]
# Graph definition
train_step = ... # train_step = g(x)
# Session
sess = tf.Session()
sess.run(tf.initialize_all_variables())
for i in range(1000):
    if i % 50 == 0:
        # here I would like to reuse the train graph but with tensor x replaced by xT
        # train_accuracy = ?
        # print("step %d, training accuracy %g" % (i, train_accuracy))
        pass
    train_step.run(session=sess)
I would use placeholders, but I can't feed a tf.placeholder with tf.Tensors, and tensors are what I'm getting from the queues.
How is it supposed to be done?
I'm really just starting.
Take a look at how this is done in the MNIST example: you need to use a placeholder with an initializer of the non-tensor form of your data (like filenames, or a CSV), and then inside the graph use slice_input_producer -> decode_jpeg (or whatever...) -> tf.train.batch() to create batches and feed those to the computation graph.
So your graph looks something like:
Placeholder initialized with big filenames list/CSV/range
tf.slice_input_producer
tf.image.decode_jpeg or tf.py_func - loading of the actual data
tf.train.batch - create mini batches for training
feed to your model
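A minimal sketch of such a pipeline, assuming filenames is a Python list of JPEG paths (the names here are illustrative, not from the question):
# Queue that yields one filename at a time from the list.
filename = tf.train.slice_input_producer([filenames])[0]
# Load and decode the actual image data.
image = tf.image.decode_jpeg(tf.read_file(filename), channels=3)
image = tf.image.resize_images(image, [IMG_SIZE, IMG_SIZE])
# Create mini-batches for training; shapes must be static at this point.
image_batch = tf.train.batch([image], batch_size=200)
# Remember to start the queue runners (tf.train.start_queue_runners)
# before pulling batches in a session.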
