I'm currently saving and restoring neural network models using TensorFlow's Saver class, as shown below:
saver.save(sess, checkpoint_prefix, global_step=step)
saver.restore(sess, checkpoint_file)
This saves .ckpt files of the model to a specified path. Because I am running multiple experiments, I have limited space to save these models.
I would like to know if there is a way to save these models without saving content in specified directories.
E.g., can I just pass some object at the last checkpoint to some evaluate() function and restore the model from that object?
So far as I see, the save_path parameter in tf.train.Saver.restore() is not optional.
Any insight would be much appreciated.
Thanks
You can use the loaded graph and weights to evaluate in the same way that you train. You just need to change the input to be from your evaluation set. Here is some pseudo code of a train loop with an evaluation loop every 1000 iterations (assumes you have created a tf.Session called sess):
x = tf.placeholder(...)
loss, train_step = model(x)

for i in range(num_step):
    input_x = get_train_data(i)
    sess.run(train_step, feed_dict={x: input_x})

    if i % 1000 == 0 and i != 0:
        eval_loss = 0
        for j in range(num_eval):
            input_x = get_eval_data(j)
            eval_loss += sess.run(loss, feed_dict={x: input_x})
        print(eval_loss / num_eval)
If you're using tf.data for your input then you can just create a tf.cond to select which input to use:
is_training = tf.placeholder(tf.bool)
next_element = tf.cond(is_training,
                       lambda: get_next_train(),
                       lambda: get_next_eval())
get_next_train and get_next_eval would have to create all ops that are used for reading the dataset, or else there will be side effects of running the above code.
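For illustration, here is a minimal, untested sketch of what get_next_train and get_next_eval could look like with tf.data (the dummy numpy arrays here are placeholders for your real input pipelines):

import numpy as np
import tensorflow as tf

# Dummy data purely for illustration.
train_arrays = np.random.rand(100, 4).astype(np.float32)
eval_arrays = np.random.rand(20, 4).astype(np.float32)

def get_next_train():
    # All input-reading ops are created inside the function, as noted above.
    ds = tf.data.Dataset.from_tensor_slices(train_arrays).shuffle(100).batch(8)
    return ds.make_one_shot_iterator().get_next()

def get_next_eval():
    ds = tf.data.Dataset.from_tensor_slices(eval_arrays).batch(8)
    return ds.make_one_shot_iterator().get_next()

is_training = tf.placeholder(tf.bool)
next_element = tf.cond(is_training, get_next_train, get_next_eval)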
This way you don't have to save anything to disk if you don't want to.
Related
I am currently trying to add a feature to interrupt and resume training on a GAN created from this example code: https://machinelearningmastery.com/how-to-develop-an-auxiliary-classifier-gan-ac-gan-from-scratch-with-keras/
I managed to get it working by saving the weights of the entire composite GAN in the summarize_performance function, which gets triggered every 10 epochs, like this:
# save all weights
filename3 = 'weights_%08d.h5' % (step+1)
gan_model.save_weights(filename3)
print('>Saved: %s and %s and %s' % (filename1, filename2, filename3))
which is loaded in a function I added at the start of the program called load_model; it takes the GAN architecture built as normal but updates its weights to the most recent values, like this:
# load model from file and return start_batch number
def load_model(gan_model):
    start_batch = 0
    files = glob.glob("./weights_0*.h5")
    if len(files) > 0:
        most_recent_file = files[len(files) - 1]
        gan_model.load_weights(most_recent_file)
        # TODO: breaks if using more than 8 digits for batches
        start_batch = int(most_recent_file[10:18])
    if start_batch != 0:
        print("> found existing weights; starting at batch %d" % start_batch)
    return start_batch
where the start_batch gets passed to the train function in order to skip the already completed epochs.
While this weight-saving approach does "work", I still think my approach is wrong, since I've discovered that the weight data obviously does not include the optimizer state of the GAN, hence the training does not continue as it would have if it hadn't been interrupted.
The way I've found to save progress while also saving the optimizer state is apparently to save the entire model instead of just the weights.
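In Keras that would be, for example, model.save() plus load_model(), which round-trip the architecture, the weights, and the optimizer state (the file name here is just an example):

# Saves architecture + weights + optimizer state in one file
gan_model.save('gan_full.h5')

# Later: restores the model, including the optimizer state
from keras.models import load_model
gan_model = load_model('gan_full.h5')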
Here I run into a problem, since in a GAN I don't have just one model that I train; I have 3 models:
The generator model g_model
The discriminator model d_model
and the composite GAN model gan_model
which are all connected and dependent on each other. If I took the naive approach and saved and restored each of these part models individually, I'd end up having 3 separate, disjointed models instead of a GAN.
Is there a way to save and restore the entire GAN in a way that would let me resume training as if no interruption had occurred?
Maybe consider using tf.train.Checkpoint if you would like to restore your entire GAN:
### In your training loop
checkpoint_dir = '/checkpoints'
checkpoint = tf.train.Checkpoint(gan_optimizer=gan_optimizer,
                                 discriminator_optimizer=discriminator_optimizer,
                                 generator=generator,
                                 discriminator=discriminator,
                                 gan_model=gan_model)
ckpt_manager = tf.train.CheckpointManager(checkpoint, checkpoint_dir, max_to_keep=3)

if ckpt_manager.latest_checkpoint:
    checkpoint.restore(ckpt_manager.latest_checkpoint)
    print('Latest checkpoint restored!!')

....
....

if (epoch + 1) % 40 == 0:
    ckpt_save_path = ckpt_manager.save()
    print('Saving checkpoint for epoch {} at {}'.format(epoch + 1, ckpt_save_path))

### After x number of epochs, just save your generator model for inference.
generator.save('your_model.h5')
You can also consider getting rid of the composite model completely. Here is an example of what I mean.
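For instance, here is a minimal sketch of a train step without a composite model. It assumes TF 2.x eager execution, an unconditional GAN for brevity, and hypothetical builder functions make_generator() and make_discriminator() standing in for your own model-building code:

import tensorflow as tf

generator = make_generator()          # assumed builder function
discriminator = make_discriminator()  # assumed builder function
gen_opt = tf.keras.optimizers.Adam(1e-4)
disc_opt = tf.keras.optimizers.Adam(1e-4)
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

@tf.function
def train_step(real_images, latent_dim=100):
    noise = tf.random.normal([tf.shape(real_images)[0], latent_dim])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(noise, training=True)
        real_logits = discriminator(real_images, training=True)
        fake_logits = discriminator(fake_images, training=True)
        # Discriminator: real -> 1, fake -> 0; generator tries to fool it.
        d_loss = (bce(tf.ones_like(real_logits), real_logits)
                  + bce(tf.zeros_like(fake_logits), fake_logits))
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
    gen_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                                generator.trainable_variables))
    disc_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                                 discriminator.trainable_variables))

With this structure, tf.train.Checkpoint only needs to track the two models and the two optimizers, and there is no composite model to keep in sync.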
I am new to Python and have been performing text classification with TensorFlow. I would like to know whether this text classification model could be updated with new data that I might acquire in the future, so that I would not have to train the model from scratch. Also, over time the number of classes might grow, since I am mostly dealing with customer data. Is it possible to update this existing text classification model with data containing more classes by using the existing checkpoints?
Given that you are asking 2 different questions, I'll answer each separately:
1) Yes, you can continue the training with the new data you have acquired. This is very simple: you just need to restore your model as you do now to use it, but instead of running some output op like outputs or prediction, you run the optimizer operation.
This translates into the following code:
model = build_model()  # this is the function that builds the model graph
saver = tf.train.Saver()
with tf.Session() as session:
    saver.restore(session, "/path/to/model.ckpt")

    ########### keep training #########
    data_x, data_y = load_new_data(new_data_path)
    for epoch in range(1, epochs + 1):
        all_losses = list()
        num_batches = 0
        for b_x, b_y in batchify(data_x, data_y):
            _, loss = session.run([model.opt, model.loss],
                                  feed_dict={model.input: b_x, model.input_y: b_y})
            all_losses.append(loss * len(b_x))
            num_batches += 1
        print("epoch %d - loss: %.2f" % (epoch, sum(all_losses) / num_batches))
Note that you need to know the names of the operations defined by the model in order to run the optimizer (model.opt) and the loss op (model.loss), so you can train and monitor the loss during training.
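If you restored the graph from a .meta file rather than rebuilding it in code, you can fetch those ops back by name. A sketch, where the names 'opt', 'loss', 'input' and 'input_y' are assumptions about how the graph was built:

graph = tf.get_default_graph()
opt = graph.get_operation_by_name('opt')         # assumed op name
loss = graph.get_tensor_by_name('loss:0')        # assumed tensor name
input_x = graph.get_tensor_by_name('input:0')    # assumed placeholder name
input_y = graph.get_tensor_by_name('input_y:0')  # assumed placeholder name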
2) If you want to change the number of labels, then it is a bit more complicated. If your network is a 1-layer feed-forward network, there is not much to reuse: since the matrix dimensionality changes, you need to retrain everything from scratch. On the other hand, if you have, for example, a multi-layer network (e.g. an LSTM plus a dense layer that does the classification), then you can restore the weights of the old model and train just the last layer from scratch. To do that, I recommend you read this answer: https://stackoverflow.com/a/41642426/4186749
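As a rough sketch of that last-layer idea (build_model and the scope name 'output_layer' are assumptions; adapt them to your graph): build the new model with the larger output layer, restore every variable that is not part of that layer, and initialize the rest fresh:

model = build_model(num_classes=new_num_classes)  # hypothetical builder

# Restore everything except the (resized) output layer.
vars_to_restore = [v for v in tf.global_variables()
                   if not v.name.startswith('output_layer')]
saver = tf.train.Saver(var_list=vars_to_restore)

with tf.Session() as session:
    session.run(tf.global_variables_initializer())  # initializes the new layer
    saver.restore(session, "/path/to/model.ckpt")   # overwrites the restored vars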
I am doing an example of boosting (a 4-layer DNN to a 5-layer DNN) via TensorFlow. I am implementing it with session saving and restoring in TF, because there is a brief paragraph in the TF tutorial:
'For example, you may have trained a neural net with 4 layers, and you now want to train a new model with 5 layers, restoring the parameters from the 4 layers of the previously trained model into the first 4 layers of the new model.' (from the TensorFlow tutorial at https://www.tensorflow.org/how_tos/variables/).
However, I found that nobody has asked how to use 'restore' when the checkpoint saves the parameters of 4 layers but we need to load them into 5 layers, which raises a red flag.
Implementing this in real code, I wrote:
with tf.name_scope('fcl1'):
    hidden_1 = fully_connected_layer(inputs, train_data.inputs.shape[1], num_hidden)
with tf.name_scope('fcl2'):
    hidden_2 = fully_connected_layer(hidden_1, num_hidden, num_hidden)
with tf.name_scope('fclf'):
    hidden_final = fully_connected_layer(hidden_2, num_hidden, num_hidden)
with tf.name_scope('outputl'):
    outputs = fully_connected_layer(hidden_final, num_hidden, train_data.num_classes, tf.identity)
    outputs = tf.nn.softmax(outputs)
with tf.name_scope('boosting'):
    boosts = fully_connected_layer(outputs, train_data.num_classes, train_data.num_classes, tf.identity)
where the variables inside (or called from) 'fcl1' - so I have 'fcl1/Variable' and 'fcl1/Variable_1' for the weights and bias - 'fcl2', 'fclf', and 'outputl' were stored by saver.save() in a script without the 'boosting' layer. However, as we now have the 'boosting' layer, saver.restore(sess, "saved_models/model_list.ckpt") does not work, failing with:
NotFoundError: Key boosting/Variable_1 not found in checkpoint
I really hope to hear about this problem. Thank you.
Below is the main part of the code I am having trouble with.
def fully_connected_layer(inputs, input_dim, output_dim, nonlinearity=tf.nn.relu):
    weights = tf.Variable(
        tf.truncated_normal(
            [input_dim, output_dim], stddev=2. / (input_dim + output_dim)**0.5),
        'weights')
    biases = tf.Variable(tf.zeros([output_dim]), 'biases')
    outputs = nonlinearity(tf.matmul(inputs, weights) + biases)
    return outputs

inputs = tf.placeholder(tf.float32, [None, train_data.inputs.shape[1]], 'inputs')
targets = tf.placeholder(tf.float32, [None, train_data.num_classes], 'targets')

with tf.name_scope('fcl1'):
    hidden_1 = fully_connected_layer(inputs, train_data.inputs.shape[1], num_hidden)
with tf.name_scope('fcl2'):
    hidden_2 = fully_connected_layer(hidden_1, num_hidden, num_hidden)
with tf.name_scope('fclf'):
    hidden_final = fully_connected_layer(hidden_2, num_hidden, num_hidden)
with tf.name_scope('outputl'):
    outputs = fully_connected_layer(hidden_final, num_hidden, train_data.num_classes, tf.identity)

with tf.name_scope('error'):
    error = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(outputs, targets))
with tf.name_scope('accuracy'):
    accuracy = tf.reduce_mean(tf.cast(
        tf.equal(tf.argmax(outputs, 1), tf.argmax(targets, 1)),
        tf.float32))
with tf.name_scope('train'):
    train_step = tf.train.AdamOptimizer().minimize(error)

init = tf.global_variables_initializer()
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init)
    saver.restore(sess, "saved_models/model.ckpt")
    print("Model restored")
    print("Optimization Starts!")
    for e in range(training_epochs):
        ...

    # Save model - save session
    save_path = saver.save(sess, "saved_models/model.ckpt")
    ### I once saved the variables using var_list, but that didn't work either...
    print("Model saved in file: %s" % save_path)
For clarity, the checkpoint file contains:
fcl1/Variable:0
fcl1/Variable_1:0
fcl2/Variable:0
fcl2/Variable_1:0
fclf/Variable:0
fclf/Variable_1:0
outputl/Variable:0
outputl/Variable_1:0
since the original 4-layer model does not have a 'boosting' layer.
It doesn't look right to read values for boosting from the checkpoint in this case, and I think that's not what you want to do. You're obviously getting the error because, while restoring the variables, the Saver first collects the list of all of the variables in your model and then looks for the corresponding variables in your checkpoint, which doesn't have them.
You can restore only part of your model by defining a subset of your model's variables, for example using the tf.slim library. Getting the list of variables in your model:
variables = slim.get_variables_to_restore()
Now variables is a list of tensors, but for each element you can access its name attribute. Using that, you can specify that you only want to restore layers other than boosting, e.g.:
variables_to_restore = [v for v in variables if v.name.split('/')[0] != 'boosting']
model_path = 'your/model/path'

saver = tf.train.Saver(variables_to_restore)
with tf.Session() as sess:
    saver.restore(sess, model_path)
This way you will have your 4 layers restored. Theoretically you could try to fetch the values of one of your variables from the checkpoint by creating another Saver that has only boosting in its variables list and renaming the chosen variable from the checkpoint, but I really don't think that's what you need here.
Since this is a custom layer for your model and you don't have this variable anywhere, just initialize it within your workflow instead of trying to import it. You can do that, for example, by passing this argument when calling the function fully_connected:
weights_initializer = slim.variance_scaling_initializer()
You need to check the details yourself though, since I'm not sure what your imports are and which function you are using here.
Generally I'd advise you to take a look at the slim library, which will make it easier for you to define a model and scopes for layers (instead of defining scopes with a with-block, you can pass a scope argument when calling a function). It would look something like this with slim:
boost = slim.fully_connected(input, number_of_outputs, activation_fn=None, scope='boosting', weights_initializer=slim.variance_scaling_initializer())
I am looking at the TF-Slim introductory document, and from what I understand, it only takes in one batch of image data per run (32 images). Obviously, one wants to loop over this and train on many different batches. The intro does not cover this. How can this be done properly? I imagine there should be some way to specify a batch-loading function that is called automatically when starting a batch training event, but I can't seem to find a simple example of this in the intro.
# Note that this may take several minutes.
import os

import tensorflow as tf

from datasets import flowers
from nets import inception
from preprocessing import inception_preprocessing

slim = tf.contrib.slim
image_size = inception.inception_v1.default_image_size

def get_init_fn():
    """Returns a function run by the chief worker to warm-start the training."""
    checkpoint_exclude_scopes = ["InceptionV1/Logits", "InceptionV1/AuxLogits"]

    exclusions = [scope.strip() for scope in checkpoint_exclude_scopes]

    variables_to_restore = []
    for var in slim.get_model_variables():
        excluded = False
        for exclusion in exclusions:
            if var.op.name.startswith(exclusion):
                excluded = True
                break
        if not excluded:
            variables_to_restore.append(var)

    return slim.assign_from_checkpoint_fn(
        os.path.join(checkpoints_dir, 'inception_v1.ckpt'),
        variables_to_restore)


train_dir = '/tmp/inception_finetuned/'

with tf.Graph().as_default():
    tf.logging.set_verbosity(tf.logging.INFO)

    dataset = flowers.get_split('train', flowers_data_dir)
    images, _, labels = load_batch(dataset, height=image_size, width=image_size)

    # Create the model, use the default arg scope to configure the batch norm parameters.
    with slim.arg_scope(inception.inception_v1_arg_scope()):
        logits, _ = inception.inception_v1(images, num_classes=dataset.num_classes, is_training=True)

    # Specify the loss function:
    one_hot_labels = slim.one_hot_encoding(labels, dataset.num_classes)
    slim.losses.softmax_cross_entropy(logits, one_hot_labels)
    total_loss = slim.losses.get_total_loss()

    # Create some summaries to visualize the training process:
    tf.scalar_summary('losses/Total Loss', total_loss)

    # Specify the optimizer and create the train op:
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
    train_op = slim.learning.create_train_op(total_loss, optimizer)

    # Run the training:
    final_loss = slim.learning.train(
        train_op,
        logdir=train_dir,
        init_fn=get_init_fn(),
        number_of_steps=2)

    print('Finished training. Last batch loss %f' % final_loss)
The slim.learning.train function contains a training loop, so the code you've given does in fact train on multiple batches of images.
See here in the source code, where train_step_fn is called within a while loop. train_step (the default value of train_step_fn) contains the line sess.run([train_op, global_step]...), which actually runs the training operation on a single batch of images.
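So to train on more batches, you only need to raise number_of_steps; each step consumes one batch from load_batch. For example, changing the last call in the question's script:

# Train for 10000 batches instead of 2.
final_loss = slim.learning.train(
    train_op,
    logdir=train_dir,
    init_fn=get_init_fn(),
    number_of_steps=10000)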
I think it would be immensely helpful to the TensorFlow community if there were a well-documented solution to the crucial task of testing a single new image against the model created by the convnet in the CIFAR-10 tutorial.
I may be wrong, but this critical step that makes the trained model usable in practice seems to be lacking. There is a "missing link" in that tutorial: a script that would directly load a single image (as an array or binary), compare it against the trained model, and return a classification.
Prior answers give partial solutions that explain the overall approach, but none of which I've been able to implement successfully. Other bits and pieces can be found here and there, but unfortunately they haven't added up to a working solution. Kindly consider the research I've done before tagging this as a duplicate or already answered.
Tensorflow: how to save/restore a model?
Restoring TensorFlow model
Unable to restore models in tensorflow v0.8
https://gist.github.com/nikitakit/6ef3b72be67b86cb7868
The most popular answer is the first, in which @RyanSepassi and @YaroslavBulatov describe the problem and an approach: one needs to "manually construct a graph with identical node names, and use Saver to load the weights into it". Although both answers are helpful, it is not apparent how one would go about plugging this into the CIFAR-10 project.
A fully functional solution would be highly desirable so we could port it to other single image classification problems. There are several questions on SO in this regard that ask for this, but still no full answer (for example Load checkpoint and evaluate single image with tensorflow DNN).
I hope we can converge on a working script that everyone could use.
The below script is not yet functional, and I'd be happy to hear from you on how this can be improved to provide a solution for single-image classification using the CIFAR-10 TF tutorial trained model.
Assume all variables, file names etc. are untouched from the original tutorial.
New file: cifar10_eval_single.py
import cv2
import tensorflow as tf

FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string('eval_dir', './input/eval',
                           """Directory where to write event logs.""")
tf.app.flags.DEFINE_string('checkpoint_dir', './input/train',
                           """Directory where to read model checkpoints.""")

def get_single_img():
    file_path = './input/data/single/test_image.tif'
    pixels = cv2.imread(file_path, 0)
    return pixels

def eval_single_img():
    # below code adapted from @RyanSepassi, however not functional
    # among other errors, saver throws an error that there are no
    # variables to save
    with tf.Graph().as_default():
        # Get image.
        image = get_single_img()

        # Build a Graph.
        # TODO

        # Create dummy variables.
        x = tf.placeholder(tf.float32)
        w = tf.Variable(tf.zeros([1, 1], dtype=tf.float32))
        b = tf.Variable(tf.ones([1, 1], dtype=tf.float32))
        y_hat = tf.add(b, tf.matmul(x, w))

        saver = tf.train.Saver()

        with tf.Session() as sess:
            sess.run(tf.initialize_all_variables())
            ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)

            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess, ckpt.model_checkpoint_path)
                print('Checkpoint found')
            else:
                print('No checkpoint found')

            # Run the model to get predictions
            predictions = sess.run(y_hat, feed_dict={x: image})
            print(predictions)

def main(argv=None):
    if tf.gfile.Exists(FLAGS.eval_dir):
        tf.gfile.DeleteRecursively(FLAGS.eval_dir)
    tf.gfile.MakeDirs(FLAGS.eval_dir)
    eval_single_img()

if __name__ == '__main__':
    tf.app.run()
There are two methods of feeding a single new image to the cifar10 model. The first method is a cleaner approach, but requires modification of the main file and hence retraining. The second method is applicable when a user does not want to modify the model files and instead wants to use the existing checkpoint/meta-graph files.
The code for the first approach is as follows:
import cv2
import numpy as np
import tensorflow as tf

sess = tf.Session('', tf.Graph())
with sess.graph.as_default():
    # Read meta graph and checkpoint to restore tf session
    saver = tf.train.import_meta_graph("/tmp/cifar10_train/model.ckpt-200.meta")
    saver.restore(sess, "/tmp/cifar10_train/model.ckpt-200")

    # Read a single image from a file.
    img = cv2.imread('tmp.png')
    img = np.expand_dims(img, axis=0)

    # Start the queue runners. If they are not started the program will hang,
    # see e.g. https://www.tensorflow.org/programmers_guide/reading_data
    coord = tf.train.Coordinator()
    threads = []
    for qr in sess.graph.get_collection(tf.GraphKeys.QUEUE_RUNNERS):
        threads.extend(qr.create_threads(sess, coord=coord, daemon=True,
                                         start=True))

    # In the graph created above, feed the "is_training" and "imgs" placeholders.
    # Feeding them will disconnect the path from the queue runners to the graph
    # and enable a path from the placeholders instead. The "imgs" placeholder
    # will be fed with the image that was read above.
    logits = sess.run('softmax_linear/softmax_linear:0',
                      feed_dict={'is_training:0': False, 'imgs:0': img})

    # Print classification results.
    print(logits)
The script requires that a user creates two placeholders and a conditional execution statement for it to work.
The placeholders and conditional execution statement are added in cifar10_train.py as shown below:
def train():
    """Train CIFAR-10 for a number of steps."""
    with tf.Graph().as_default():
        global_step = tf.contrib.framework.get_or_create_global_step()

        with tf.device('/cpu:0'):
            images, labels = cifar10.distorted_inputs()

        is_training = tf.placeholder(dtype=bool, shape=(), name='is_training')
        imgs = tf.placeholder(tf.float32, (1, 32, 32, 3), name='imgs')
        images = tf.cond(is_training, lambda: images, lambda: imgs)
        logits = cifar10.inference(images)
The inputs in the cifar10 model are connected to a queue runner object, which is a multistage queue that can prefetch data from files in parallel. See a nice animation of queue runners here
While queue runners are efficient at prefetching a large dataset for training, they are overkill for inference/testing, where only a single file needs to be classified; they are also more involved to modify/maintain.
For that reason, I have added a placeholder "is_training", which is fed True during training and False during inference, as shown below:
import numpy as np

tmp_img = np.ndarray(shape=(1, 32, 32, 3), dtype=float)

with tf.train.MonitoredTrainingSession(
        checkpoint_dir=FLAGS.train_dir,
        hooks=[tf.train.StopAtStepHook(last_step=FLAGS.max_steps),
               tf.train.NanTensorHook(loss),
               _LoggerHook()],
        config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement)) as mon_sess:
    while not mon_sess.should_stop():
        mon_sess.run(train_op, feed_dict={is_training: True, imgs: tmp_img})
Another placeholder, "imgs", holds a tensor of shape (1, 32, 32, 3) for the image that will be fed during inference; the first dimension is the batch size, which is one in this case. I have modified the cifar model to accept 32x32 images instead of 24x24, as the original cifar10 images are 32x32.
Finally, the conditional statement feeds either the placeholder or the queue-runner output to the graph. The "is_training" placeholder is set to False during inference, and the "imgs" placeholder is fed a numpy array; the array is reshaped from a 3- to a 4-dimensional tensor to conform to the input tensor of the inference function in the model.
That is all there is to it. Any model can be run on single/user-defined test data as shown in the script above: essentially, read the graph, feed data to the graph nodes, and run the graph to get the final output.
Now the second method: hack cifar10.py and cifar10_eval.py to change the batch size to one and replace the data coming from the queue runner with data read from a file.
Set batch size to 1:
tf.app.flags.DEFINE_integer('batch_size', 1,
                            """Number of images to process in a batch.""")
Call inference with an image read from a file:
def evaluate():
    with tf.Graph().as_default() as g:
        # Get images and labels for CIFAR-10.
        eval_data = FLAGS.eval_data == 'test'
        images, labels = cifar10.inputs(eval_data=eval_data)

        import cv2
        img = cv2.imread('tmp.png')
        img = np.expand_dims(img, axis=0)
        img = tf.cast(img, tf.float32)

        logits = cifar10.inference(img)
Then pass logits to eval_once and modify eval_once to evaluate logits:
def eval_once(saver, summary_writer, top_k_op, logits, summary_op):
    ...
    while step < num_iter and not coord.should_stop():
        predictions = sess.run([top_k_op])
        print(sess.run(logits))
There is no separate script to run this method of inference; just run cifar10_eval.py, which will now read a file from the user-defined location with a batch size of one.
Here's how I ran a single image at a time. I'll admit it seems a bit hacky with how the variable scope is reused.
This is a helper function
import os

import tensorflow as tf

def restore_vars(saver, sess, chkpt_dir):
    """ Restore saved net, global score and step, and epsilons OR
    create checkpoint directory for later storage. """
    sess.run(tf.initialize_all_variables())

    checkpoint_dir = chkpt_dir
    if not os.path.exists(checkpoint_dir):
        try:
            os.makedirs(checkpoint_dir)
        except OSError:
            pass

    path = tf.train.get_checkpoint_state(checkpoint_dir)
    #print("path1 = ", path)
    #path = tf.train.latest_checkpoint(checkpoint_dir)
    print(checkpoint_dir, "path = ", path)
    if path is None:
        return False
    else:
        saver.restore(sess, path.model_checkpoint_path)
        return True
Here is the main part of the code that runs a single image at a time within the for loop.
to_restore = True
with tf.Session() as sess:
    for i in test_img_idx_set:
        # Gets the image
        images = get_image(i)
        images = np.asarray(images, dtype=np.float32)
        images = tf.convert_to_tensor(images / 255.0)
        # resize image to whatever your model takes in
        images = tf.image.resize_images(images, 256, 256)
        images = tf.reshape(images, (1, 256, 256, 3))
        images = tf.cast(images, tf.float32)

        saver = tf.train.Saver(max_to_keep=5, keep_checkpoint_every_n_hours=1)
        #print("infer")
        with tf.variable_scope(tf.get_variable_scope()) as scope:
            if to_restore:
                logits = inference(images)
            else:
                scope.reuse_variables()
                logits = inference(images)

        if to_restore:
            restored = restore_vars(saver, sess, FLAGS.train_dir)
            print("restored ", restored)
            to_restore = False

        logit_val = sess.run(logits)
        print(logit_val)
Here is an alternative implementation of the above using placeholders; it's a bit cleaner in my opinion, but I'll leave the example above for historical reasons.
imgs_place = tf.placeholder(tf.float32, shape=[my_img_shape_put_here])
images = tf.reshape(imgs_place, (1, 256, 256, 3))

saver = tf.train.Saver(max_to_keep=5, keep_checkpoint_every_n_hours=1)

#print("infer")
logits = inference(images)

with tf.Session() as sess:
    restored = restore_vars(saver, sess, FLAGS.train_dir)
    print("restored ", restored)
    for i in test_img_idx_set:
        logit_val = sess.run(logits, feed_dict={imgs_place: i})
        print(logit_val)
Got it working with this:
softmax = gn.inference(image)
saver = tf.train.Saver()
ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
with tf.Session() as sess:
    saver.restore(sess, ckpt.model_checkpoint_path)
    softmaxval = sess.run(softmax)
    print(softmaxval)
Output:
[[ 6.73550041e-03 4.44930716e-04 9.92570221e-01 1.00681427e-06
3.05406687e-08 2.38927707e-04 1.89839399e-12 9.36238484e-06
1.51646684e-09 3.38977535e-09]]
I don't have working code for you I'm afraid, but here's how we often tackle this problem in production:
Save out the GraphDef to disk, using something like write_graph.
Use freeze_graph to load the GraphDef and checkpoints, and save out a GraphDef with the Variables converted into Constants.
Load the GraphDef in something like label_image or classify_image.
For your example this is overkill, but I would at least suggest serializing the graph in the original example as a GraphDef, and then loading it in your script (so you don't have to duplicate the code generating the graph). With the same graph created, you should be able to populate it from a SaverDef, and the freeze_graph script may help as an example.
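Here is a sketch of the first two steps (the paths and the output node name 'softmax_linear/softmax_linear' are assumptions based on the CIFAR-10 tutorial; adjust them for your graph):

import tensorflow as tf
from tensorflow.python.tools import freeze_graph

# 1) Save the GraphDef, assuming sess holds your trained graph.
tf.train.write_graph(sess.graph_def, '/tmp/model', 'graph.pbtxt')

# 2) Convert the Variables into Constants, producing one frozen GraphDef.
freeze_graph.freeze_graph(input_graph='/tmp/model/graph.pbtxt',
                          input_saver='',
                          input_binary=False,
                          input_checkpoint='/tmp/model/model.ckpt',
                          output_node_names='softmax_linear/softmax_linear',
                          restore_op_name='save/restore_all',
                          filename_tensor_name='save/Const:0',
                          output_graph='/tmp/model/frozen.pb',
                          clear_devices=True,
                          initializer_nodes='')

# 3) Load the frozen graph later for inference.
with tf.gfile.GFile('/tmp/model/frozen.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name='')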