Is the tensor2tensor avg_checkpoints script wrong? - python

I am training two sub-models in parallel and want to average them after both have finished.
The model is implemented outside of tensor2tensor (but still uses TensorFlow). Part of the definition looks like this:
def build_model(self):
    layer_sizes = [n, n, n]
    kernel_sizes = [n, n, n]
    tf.reset_default_graph()
    self.graph = tf.Graph()
    with self.graph.as_default():
        self.num_classes = num_classes
        # Placeholders for hyperparameters
        self.learning_rate = tf.placeholder(tf.float32, name="learning_rate")
        self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")
        self.phase = tf.placeholder(tf.bool, name="phase")
        # Placeholders for regular data
        self.input_x = tf.placeholder(tf.float32, [None, None, input_feature_dim], name="input_x")
        self.input_y = tf.placeholder(tf.float32, [None, num_classes], name="input_y")
        h = self.input_x
        # ... [remaining code]
I save it as follows:
def save_model(sess, output):
    saver = tf.train.Saver()
    save_path = saver.save(sess, os.path.join(output, 'model'))
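For context, a minimal sketch of how save_model would be called; the instance name, training loop, and output directory are placeholders of mine, not from the original code:
m = MyModel()  # hypothetical instance of the class that defines build_model
m.build_model()
with tf.Session(graph=m.graph) as sess:
    sess.run(tf.global_variables_initializer())
    # ... training steps ...
    save_model(sess, 'path/to/checkpoint1')  # writes model.meta, model.index, model.data-*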
When I load it, I use:
def load_model(self, sess, input_dir, logger):
    if logger is not None:
        logger.info("Start loading graph ...")
    saver = tf.train.import_meta_graph(os.path.join(input_dir, 'model.meta'))
    saver.restore(sess, os.path.join(input_dir, 'model'))
    self.graph = sess.graph
    self.input_x = self.graph.get_tensor_by_name("input_x:0")
    self.input_y = self.graph.get_tensor_by_name("input_y:0")
    self.num_classes = self.input_y.shape[1]
    self.learning_rate = self.graph.get_tensor_by_name("learning_rate:0")
    self.dropout_keep_prob = self.graph.get_tensor_by_name("dropout_keep_prob:0")
    self.phase = self.graph.get_tensor_by_name("phase:0")
    self.loss = self.graph.get_tensor_by_name("loss:0")
    self.optimizer = self.graph.get_operation_by_name("optimizer")
    self.accuracy = self.graph.get_tensor_by_name("accuracy/accuracy:0")
I use avg_checkpoints to average the two sub-models:
python utils/avg_checkpoints.py \
  --checkpoints path/to/checkpoint1,path/to/checkpoint2 \
  --num_last_checkpoints 2 \
  --output_path where/to/save/the/output
But I found a problem when I looked at the avg_checkpoints code more closely:
for checkpoint in checkpoints:
    reader = tf.train.load_checkpoint(checkpoint)
    for name in var_values:
        tensor = reader.get_tensor(name)
        var_dtypes[name] = tensor.dtype
        var_values[name] += tensor
    tf.logging.info("Read from checkpoint %s", checkpoint)
for name in var_values:  # Average.
    var_values[name] /= len(checkpoints)

with tf.variable_scope(tf.get_variable_scope(), reuse=tf.AUTO_REUSE):
    tf_vars = [
        tf.get_variable(v, shape=var_values[v].shape, dtype=var_dtypes[v])
        for v in var_values
    ]
placeholders = [tf.placeholder(v.dtype, shape=v.shape) for v in tf_vars]
assign_ops = [tf.assign(v, p) for (v, p) in zip(tf_vars, placeholders)]
global_step = tf.Variable(
    0, name="global_step", trainable=False, dtype=tf.int64)
saver = tf.train.Saver(tf.all_variables())

# Build a model consisting only of variables, set them to the average values.
with tf.Session() as sess:
    for p, assign_op, (name, value) in zip(placeholders, assign_ops,
                                           six.iteritems(var_values)):
        sess.run(assign_op, {p: value})
    # Use the built saver to save the averaged checkpoint.
    saver.save(sess, FLAGS.output_path, global_step=global_step)
It only saves the variables, not the whole graph with its named tensors. E.g., when I load the averaged checkpoint using the load_model function above, there is no "input_x:0" tensor. Is this script wrong, or should I modify it for my use case?
I am using TF r1.13. Thanks!
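For what it's worth, one possible workaround, as a sketch only, under the assumption that both sub-models share the same graph definition and that avg_values holds the averaged numpy arrays keyed by variable name: import the meta graph from one of the sub-models before saving, so the averaged checkpoint keeps all the named tensors.
# Sketch: restore the full graph from one sub-model, overwrite its variables
# with the averaged values, then save graph + variables together.
import tensorflow as tf

tf.reset_default_graph()
saver = tf.train.import_meta_graph('path/to/checkpoint1/model.meta')
with tf.Session() as sess:
    saver.restore(sess, 'path/to/checkpoint1/model')
    for var in tf.global_variables():
        if var.op.name in avg_values:  # avg_values: name -> averaged numpy array (assumed)
            var.load(avg_values[var.op.name], sess)
    saver.save(sess, 'where/to/save/the/output/model')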

Related

Why doesn't this test function output an image? pokeGAN Tutorial

I am new to Python and TensorFlow, so this may be a dumb question. I was following the pokeGAN tutorial by Siraj, and he doesn't really comment on the test function. I have trained the model, but when I uncomment the test function, it just exits with code 0 and gives me no generated image. I know exit code 0 means there were no errors, but I am curious why it's not generating an image. Is the function just not telling it to generate one? Is there something else that needs to be uncommented (or commented out) to make it work correctly? Any help would be great.
Here is a link to the github for the code in its entirety: pokeGAN
Here is the actual test function:
def test():
    random_dim = 100
    with tf.variable_scope('input'):
        real_image = tf.placeholder(tf.float32, shape=[None, HEIGHT, WIDTH, CHANNEL], name='real_image')
        random_input = tf.placeholder(tf.float32, shape=[None, random_dim], name='rand_input')
        is_train = tf.placeholder(tf.bool, name='is_train')
    # wgan
    fake_image = generator(random_input, random_dim, is_train)
    real_result = discriminator(real_image, is_train)
    fake_result = discriminator(fake_image, is_train, reuse=True)
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    variables_to_restore = slim.get_variables_to_restore(include=['gen'])
    print(variables_to_restore)
    saver = tf.train.Saver(variables_to_restore)
    ckpt = tf.train.latest_checkpoint('./model/' + version)
    saver.restore(sess, ckpt)
Your code is missing the part that creates an image and saves it:
def test():
    random_dim = 100
    with tf.variable_scope('input'):
        real_image = tf.placeholder(tf.float32, shape=[None, HEIGHT, WIDTH, CHANNEL], name='real_image')
        random_input = tf.placeholder(tf.float32, shape=[None, random_dim], name='rand_input')
        is_train = tf.placeholder(tf.bool, name='is_train')
    # wgan
    fake_image = generator(random_input, random_dim, is_train)
    real_result = discriminator(real_image, is_train)
    fake_result = discriminator(fake_image, is_train, reuse=True)
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    variables_to_restore = slim.get_variables_to_restore(include=['gen'])
    print(variables_to_restore)
    saver = tf.train.Saver(variables_to_restore)
    ckpt = tf.train.latest_checkpoint('./model/' + version)
    saver.restore(sess, ckpt)
    # image creation
    sample_noise = np.random.uniform(-1.0, 1.0, size=[64, random_dim]).astype(np.float32)
    imgtest = sess.run(fake_image, feed_dict={random_input: sample_noise, is_train: False})
    save_images(imgtest, [8, 8], newPoke_path + '/epoch' + 'image.jpg')
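If the tutorial's save_images helper is unavailable, a rough stand-in (my sketch, assuming the generator's tanh output lies in [-1, 1]) can tile the batch into a grid with matplotlib:
import numpy as np
import matplotlib.pyplot as plt

def save_images_fallback(images, grid, path):
    # images: (N, H, W, C) floats in [-1, 1]; grid: (rows, cols)
    rows, cols = grid
    n, h, w, c = images.shape
    canvas = np.zeros((rows * h, cols * w, c), dtype=np.float32)
    for idx in range(min(n, rows * cols)):
        r, col = divmod(idx, cols)
        canvas[r * h:(r + 1) * h, col * w:(col + 1) * w, :] = (images[idx] + 1.0) / 2.0
    plt.imsave(path, np.clip(np.squeeze(canvas), 0.0, 1.0))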

Using multiple Tensorflow Restored Graphs and Sessions

I am attempting to reload a session and graph in a function to process more data. I keep getting errors about using uninitialized variables.
I have tried to reuse my GRU cell, to no avail.
I am currently resorting to loading each model into a daemon thread that polls a list for data to run, writing the predictions back to the list, instead of returning them from a function.
# Imports assumed (not shown in the original post):
import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn  # rnn.static_rnn
rnn_cell = tf.nn.rnn_cell           # rnn_cell.GRUCell

model = {
    'chunk_size': 9,
    'num_chunk': 31,
    'rnn_size': 18,
    'rnn_classes': 2
}

graphx = tf.Graph()
sess = tf.Session(graph=graphx)
save_path = 'models/rnn_1d/rnn_1d.ckpt'

def loadModel(model, graphx, sess, save_path):
    with graphx.as_default():
        chunk_size = model['chunk_size']
        num_chunks = model['num_chunk']  # need to update to num_chunks in model creator
        rnn_size = model['rnn_size']
        rnn_classes = model['rnn_classes']
        X = tf.placeholder(dtype=tf.float32, shape=[None, num_chunks, chunk_size])
        Y = tf.placeholder(dtype=tf.float32)

        def rnn_model(x):
            weight_initializer = tf.variance_scaling_initializer(mode="fan_avg", distribution="uniform", scale=1)
            bias_initializer = tf.zeros_initializer()
            layer = {'weights': tf.Variable(weight_initializer([rnn_size, rnn_classes])),
                     'biases': tf.Variable(bias_initializer([rnn_classes]))}
            x = tf.transpose(x, [1, 0, 2])
            x = tf.reshape(x, [-1, chunk_size])
            x = tf.split(x, num_chunks, 0)
            lstm_cell = rnn_cell.GRUCell(rnn_size)
            outputs, states = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)
            output = tf.add(tf.matmul(outputs[-1], layer['weights']), layer['biases'])
            return output

        prediction = rnn_model(X)
        saver = tf.train.Saver()
        saver.restore(sess, save_path)
        print(' loaded')
        return sess, graphx

def feedModel(model, sess, graphx, Set):
    with graphx.as_default():
        chunk_size = model['chunk_size']
        num_chunks = model['num_chunk']  # need to update to num_chunks in model creator
        rnn_size = model['rnn_size']
        rnn_classes = model['rnn_classes']
        X = tf.placeholder(dtype=tf.float32, shape=[None, num_chunks, chunk_size])
        Y = tf.placeholder(dtype=tf.float32)

        def rnn_model(x):
            weight_initializer = tf.variance_scaling_initializer(mode="fan_avg", distribution="uniform", scale=1)
            bias_initializer = tf.zeros_initializer()
            layer = {'weights': tf.Variable(weight_initializer([rnn_size, rnn_classes])),
                     'biases': tf.Variable(bias_initializer([rnn_classes]))}
            x = tf.transpose(x, [1, 0, 2])
            x = tf.reshape(x, [-1, chunk_size])
            x = tf.split(x, num_chunks, 0)
            lstm_cell = rnn_cell.GRUCell(rnn_size, reuse=tf.AUTO_REUSE)
            outputs, states = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)
            output = tf.add(tf.matmul(outputs[-1], layer['weights']), layer['biases'])
            return output

        prediction = rnn_model(X)
        prediction = sess.run(prediction, feed_dict={X: Set})
        return prediction

sess, graphx = loadModel(model, graphx, sess, save_path)
print(feedModel(model, sess, graphx, np.ones((1, 31, 9))))
It looks like you're totally (and I think unnecessarily) recreating your model in feedModel. All of those ops are already defined in graphx, which you pass in. You presumably initialize those in some other code (the one that saves the session) using something like tf.global_variables_initializer().run(). The new variables that you define here won't be initialized; I suspect that's why you're getting the uninitialized variable error.
It looks to me like feedModel should simply reuse the tensors loadModel already built (have loadModel also return prediction and X), along the lines of:
def feedModel(model, sess, graphx, prediction, X, Set):
    # prediction and X are the tensors built and restored in loadModel
    with graphx.as_default():
        return sess.run(prediction, feed_dict={X: Set})
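A matching call site, under the assumption that loadModel is changed to also return prediction and X:
sess, graphx, prediction, X = loadModel(model, graphx, sess, save_path)
print(feedModel(model, sess, graphx, prediction, X, np.ones((1, 31, 9))))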

tensorflow calculations without retraining

I want to train a model once in TensorFlow and then use the trained model for predicting some functions. Before we get into specifics, let's define a couple of functions ...
def runTF(func, inpDict):
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    result = sess.run(func, feed_dict=inpDict)
    sess.close()
    return result
and
def optTF(opt, res, others, inpDict, nSteps, printSteps=50):
    os, re = [], []
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    for i in range(nSteps):
        # First run the optimizer ...
        sess.run(opt, feed_dict=inpDict)
        # Save all the data you want to save
        temp = sess.run([res] + others, feed_dict=inpDict)
        re.append(temp[0])
        os.append(temp[1:])
        if (i % printSteps) == 0:
            print('{:5d}'.format(i))
    sess.close()
    return re, os
Here are a couple of steps for what I am doing ...
A. Generating some data
N = 500
features = 2
nSteps = 1000
X = np.array([np.random.random(N), np.random.random(N)])
data = [X.T, X[0].reshape(-1, 1)]
B. Create a simple linear model
d = tf.placeholder(shape = np.shape(data[0]), dtype = tf.float32, name='d') # input layer
dOut = tf.placeholder(shape = np.shape(data[1]), dtype = tf.float32, name='dOut') # output layer
W = tf.Variable( np.random.randn(features, 1), dtype = tf.float32, name='W')
b = tf.Variable( np.zeros((1,1)), dtype = tf.float32, name='b')
result = tf.matmul(d, W)+b
cost = tf.reduce_mean((dOut - result)**2)
optimizer = tf.train.AdamOptimizer(learning_rate = 0.1).minimize(cost)
C. Let's run the optimizer
inpDict = {d: data[0], dOut:data[1]}
ot = optTF(optimizer, result, [W, cost], inpDict, 200, 50)
Here, I have checked the result and see that it is what I want. So the optimizer is working fine. The model has been optimized. Now, I want to make a prediction with some other data. So I do ...
r = runTF(result, inpDict)
This new result is not what I would expect from the same trained model.
Now, as long as I stay within the same tf.Session(), we are OK. However, I do want to be able to make predictions even when I am done with a session. So my question is: how do I use a model in a different session once we have trained it in one session?
(Note that the prediction above is effectively already happening in a different session.)
Edited:
I edited the two functions to incorporate saving ...
def runTF(func, inpDict, modelFile=None):
    if modelFile is not None:
        saver = tf.train.Saver()
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    if modelFile is not None:
        ckpt = tf.train.get_checkpoint_state(modelFile)
        print(modelFile, ckpt)
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess, ckpt.model_checkpoint_path)
            print('Session restored')
    result = sess.run(func, feed_dict=inpDict)
    sess.close()
    return result
and
def optTF(opt, res, others, inpDict, nSteps, printSteps=50, modelFile='models/temp.ckpt'):
    os, re = [], []
    saver = tf.train.Saver()
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    for i in range(nSteps):
        # First run the optimizer ...
        sess.run(opt, feed_dict=inpDict)
        # Save all the data you want to save
        temp = sess.run([res] + others, feed_dict=inpDict)
        re.append(temp[0])
        os.append(temp[1:])
        if (i % printSteps) == 0:
            print('{:5d}'.format(i))
    path = saver.save(sess, modelFile)
    print('Model saved in: {}'.format(path))
    sess.close()
    return re, os
And running the model as:
ot = optTF(optimizer, result, [cost]+weights+biases, inpDict, 200, 50)
r = runTF([result], inpDict, 'models/temp.ckpt')
Still nothing! I checked that:
The value of ckpt is None
The models folder has the following files:
checkpoint
temp.ckpt.index
temp.ckpt.data-00000-of-00001
temp.ckpt.meta
You need to save and restore the session you are creating and training. As in
init = tf.initialize_all_variables()
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(init)
    if restore:
        ckpt = tf.train.get_checkpoint_state(save_path)
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess, ckpt.model_checkpoint_path)
    else:
        # ... training code omitted ...
        saver.save(sess, save_path)
Check out also https://www.tensorflow.org/programmers_guide/saved_model if you want to create a full model with an Estimator instead of just one session.
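One likely reason ckpt came back as None in the edited runTF above: tf.train.get_checkpoint_state expects the checkpoint directory (the one containing the checkpoint file), not the models/temp.ckpt prefix. A minimal sketch of the fix, assuming the files listed above live in models/:
ckpt = tf.train.get_checkpoint_state('models')  # directory, not 'models/temp.ckpt'
if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess, ckpt.model_checkpoint_path)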

Tensorflow restored weights is not set

I'm trying to restore a model in Tensorflow which I've trained. The problem is that it does not seem like the weights are properly restored.
For the training I've got the weights and biases defined as:
W = {
    'h1': tf.Variable(tf.random_normal([n_inputs, n_hidden_1]), name='wh1'),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]), name='wh2'),
    'o': tf.Variable(tf.random_normal([n_hidden_2, n_classes]), name='wo')
}
b = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1]), name='bh1'),
    'b2': tf.Variable(tf.random_normal([n_hidden_2]), name='bh2'),
    'o': tf.Variable(tf.random_normal([n_classes]), name='bo')
}
Then I do some training on my own custom 2D image dataset and save the model with a tf.train.Saver:
saver = tf.train.Saver()
saver.save(sess, 'tf.model')
Later I want to restore that model with the exact same weights, so I build the model as before (also with the random_normal initialization) and call the tf.saver.restore
saver = tf.train.import_meta_graph('tf.model.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))
Now, if I call:
temp = sess.run(W['h1'][0][0])
print temp
I get random values, and not the restored value of the weight.
I've drawn a blank on this one, can somebody point me in the right direction?
FYI, I've tried without luck to simply declare the tf.Variables, but I keep getting:
ValueError: initial_value must be specified.
even though TensorFlow themselves state that it should be possible to simply declare them with no initial value (https://www.tensorflow.org/programmers_guide/variables, part: Restoring Values).
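For what it's worth, my reading of that guide: the variable still needs a shape or initializer at construction time; what you can skip is running the initializer, because saver.restore supplies the values. A minimal sketch, with shapes assumed to match the checkpoint:
w1 = tf.get_variable('wh1', shape=[n_inputs, n_hidden_1])  # shape assumed from training
saver = tf.train.Saver({'wh1': w1})
with tf.Session() as sess:
    saver.restore(sess, 'tf.model')  # no tf.global_variables_initializer() needed for w1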
Update 1
When I, as suggested, run
all_vars = tf.global_variables()
for v in all_vars:
    print v.name
I get the following output:
wh1:0
wh2:0
wo:0
bh1:0
bh2:0
bo:0
wh1:0
wh2:0
wo:0
bh1:0
bh2:0
bo:0
beta1_power:0
beta2_power:0
wh1/Adam:0
wh1/Adam_1:0
wh2/Adam:0
wh2/Adam_1:0
wo/Adam:0
wo/Adam_1:0
bh1/Adam:0
bh1/Adam_1:0
bh2/Adam:0
bh2/Adam_1:0
bo/Adam:0
bo/Adam_1:0
This shows that the variables are indeed read. However, invoking
print sess.run("wh1:0")
results in the error: Attempting to use uninitialized value wh1
So with the help of you guys, I ended up dividing the saving and restoring parts of my program into two files, to ensure that no unwanted variables were initialized.
Train and Save routines fnn.py
def build(self, topology):
    """
    Builds the topology of the model
    """
    # Sanity check
    assert len(topology) == 4
    n_inputs = topology[0]
    n_hidden_1 = topology[1]
    n_hidden_2 = topology[2]
    n_classes = topology[3]
    # Sanity check
    assert self.img_h * self.img_w == n_inputs
    # Instantiate TF Placeholders
    self.x = tf.placeholder(tf.float32, [None, n_inputs], name='x')
    self.y = tf.placeholder(tf.float32, [None, n_classes], name='y')
    self.W = {
        'h1': tf.Variable(tf.random_normal([n_inputs, n_hidden_1]), name='wh1'),
        'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]), name='wh2'),
        'o': tf.Variable(tf.random_normal([n_hidden_2, n_classes]), name='wo')
    }
    self.b = {
        'b1': tf.Variable(tf.random_normal([n_hidden_1]), name='bh1'),
        'b2': tf.Variable(tf.random_normal([n_hidden_2]), name='bh2'),
        'o': tf.Variable(tf.random_normal([n_classes]), name='bo')
    }
    # Create model
    self.l1 = tf.nn.sigmoid(tf.add(tf.matmul(self.x, self.W['h1']), self.b['b1']))
    self.l2 = tf.nn.sigmoid(tf.add(tf.matmul(self.l1, self.W['h2']), self.b['b2']))
    logits = tf.add(tf.matmul(self.l2, self.W['o']), self.b['o'])
    # Define predict operation
    self.predict_op = tf.argmax(logits, 1)
    probs = tf.nn.softmax(logits, name='probs')
    # Define cost function
    self.cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=self.y))
    # Adding these to collection so we can restore them again
    tf.add_to_collection('inputs', self.x)
    tf.add_to_collection('inputs', self.y)
    tf.add_to_collection('outputs', logits)
    tf.add_to_collection('outputs', probs)
    tf.add_to_collection('outputs', self.predict_op)

def train(self, X, Y, n_epochs=10, learning_rate=0.001, logs_path=None):
    """
    Trains the Model
    """
    self.optimizer = tf.train.AdamOptimizer(learning_rate).minimize(self.cost)
    costs = []
    # Instantiate TF Saver
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(tf.local_variables_initializer())
        # start the threads used for reading files
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        # Compute total number of batches
        total_batch = int(self.get_num_examples() / self.batch_size)
        # start training
        for epoch in range(n_epochs):
            for i in range(total_batch):
                batch_xs, batch_ys = sess.run([X, Y])
                # run the training step with feed of images
                _, cost = sess.run([self.optimizer, self.cost],
                                   feed_dict={self.x: batch_xs, self.y: batch_ys})
                costs.append(cost)
                print "step %d" % (epoch * total_batch + i)
                #costs.append(cost)
            print "Epoch %d" % epoch
        saver.save(sess, self.model_file)
        temp = sess.run(self.W['h1'][0][0])
        print temp
        if self.visu:
            plt.plot(costs)
            plt.show()
        # finalize
        coord.request_stop()
        coord.join(threads)
Predict routine fnn_eval.py:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    g = tf.get_default_graph()
    # restore the model
    self.saver = tf.train.import_meta_graph(self.model_file)
    self.saver.restore(sess, tf.train.latest_checkpoint('./tfmodels/fnn/'))
    wh1 = g.get_tensor_by_name("wh1:0")
    print sess.run(wh1[0][0])
    x, y = tf.get_collection('inputs')
    logits, probs, predict_op = tf.get_collection('outputs')
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    predictions = []
    print Y.eval()
    for i in range(1):  # range(self.get_num_examples()):
        batch_xs = sess.run(X)
        # Reshape batch_xs if only a single image is given
        # (numpy is 4D: batch_size * height * width * channels)
        batch_xs = np.reshape(batch_xs, (-1, self.img_w * self.img_h))
        prediction, probabilities, logit = sess.run([predict_op, probs, logits],
                                                    feed_dict={x: batch_xs})
        predictions.append(prediction[0])
    # finalize
    coord.request_stop()
    coord.join(threads)
I guess the problem might be caused by creating a new variable when you restore the model, instead of getting the already existing variable. I tried this code
saver = tf.train.import_meta_graph('./model.ckpt-10.meta')
w1 = None
for v in tf.global_variables():
    print v.name
w1 = tf.get_variable('wh1', [])
init = tf.global_variables_initializer()
sess.run(init)
saver.restore(sess, './model.ckpt-10')
for v in tf.global_variables():
    print v.name
and you can clearly see from the output that it creates a new variable called wh1_1:0.
If you try this
w1 = None
for v in tf.global_variables():
    print v.name
    if v.name == 'wh1:0':
        w1 = v
init = [tf.global_variables_initializer(), tf.local_variables_initializer()]
sess.run(init)
saver.restore(sess, './model.ckpt-10')
for v in tf.global_variables():
    print v.name
temp = sess.run(w1)
print temp[0][0]
There will be no problem.
Tensorflow suggests that it is better to use tf.variable_scope() (link) like this
with tf.variable_scope("foo"):
v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
v1 = tf.get_variable("v", [1])
assert v1 == v
I have met the same problem when saving a model in the saved_model format. For anyone using the function add_meta_graph_and_variables to save the model for serving: be careful about the parameter "legacy_init_op: Legacy support for op or group of ops to execute after the restore op upon a load."
You want to pass in a var_list to the Saver.
In your case, the variable list would come from your W and b dictionaries: var_list = list(W.values())+list(b.values()). Then, to restore the model, pass in var_list to the Saver: saver = tf.train.Saver(var_list=var_list).
Next, you need to get your checkpoint state: model = tf.train.get_checkpoint_state(<your saved model directory>). After that you can restore the trained weights.
var_list = list(W.values()) + list(b.values())
saver = tf.train.Saver(var_list=var_list)
model = tf.train.get_checkpoint_state('./model/')
with tf.Session() as sess:
    saver.restore(sess, model.model_checkpoint_path)
    # Now use the pretrained weights
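A side note on why the explicit var_list helps here (my inference, not from the original answer): the training checkpoint also contains the Adam slot variables (wh1/Adam:0, ...) shown in the question's listing, and an explicit var_list ties the Saver to exactly the weights you rebuilt and nothing else. You can verify what the Saver will look for:
# Sanity check: only the weight/bias variables, no */Adam slots.
print [v.name for v in var_list]  # expect wh1:0, wh2:0, wo:0, bh1:0, bh2:0, bo:0 (order may vary)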

Transform map to mapPartition using pyspark

I am trying to load a TensorFlow model from disk and predict values with it.
Code
def get_value(row):
    print("**********************************************")
    graph = tf.Graph()
    rowkey = row[0]
    checkpoint_file = "/home/sahil/Desktop/Relation_Extraction/data/1485336002/checkpoints/model-300"
    print("Loading model................................")
    with graph.as_default():
        session_conf = tf.ConfigProto(
            allow_soft_placement=allow_soft_placement,
            log_device_placement=log_device_placement)
        sess = tf.Session(config=session_conf)
        with sess.as_default():
            # Load the saved meta graph and restore variables
            saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
            saver.restore(sess, checkpoint_file)
            input_x = graph.get_operation_by_name("X_train").outputs[0]
            dropout_keep_prob = graph.get_operation_by_name("dropout_keep_prob").outputs[0]
            predictions = graph.get_operation_by_name("output/predictions").outputs[0]
            batch_predictions = sess.run(predictions, {input_x: [row[1]], dropout_keep_prob: 1.0})
            print(batch_predictions)
            return (rowkey, batch_predictions)
I have an RDD consisting of tuples (rowkey, input_vector). I want to use the loaded model to predict the score/class of the input.
Code to call get_value()
result = data_rdd.map(lambda iter: get_value(iter))
result.foreach(print)
The problem is that every time I call map, the model is loaded for each tuple, and this takes a lot of time.
I am thinking of loading the model using mapPartitions and then using map to call the get_value function.
I have no clue how to convert the code to a mapPartitions version where I load the TensorFlow model only once per partition and so reduce the running time.
Thanks in advance.
I am not sure if I get your question correctly, but we can optimise your code a bit here.
import cPickle  # Python 2 (import assumed, not shown in the original)

graph = tf.Graph()
checkpoint_file = "/home/sahil/Desktop/Relation_Extraction/data/1485336002/checkpoints/model-300"
with graph.as_default():
    session_conf = tf.ConfigProto(
        allow_soft_placement=allow_soft_placement,
        log_device_placement=log_device_placement)
    sess = tf.Session(config=session_conf)
    s = sess.as_default()
    saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
    saver.restore(sess, checkpoint_file)
    input_x = graph.get_operation_by_name("X_train").outputs[0]
    dropout_keep_prob = graph.get_operation_by_name("dropout_keep_prob").outputs[0]
    predictions = graph.get_operation_by_name("output/predictions").outputs[0]
    session_pickle = cPickle.dumps(sess)

def get_value(key, vector, session_pickle):
    sess = cPickle.loads(session_pickle)
    rowkey = key
    batch_predictions = sess.run(predictions, {input_x: [vector], dropout_keep_prob: 1.0})
    print(batch_predictions)
    return (rowkey, batch_predictions)

result = data_rdd.map(lambda (key, row): get_value(key=key, vector=row, session_pickle=session_pickle))
result.foreach(print)
So you can serialize your tensorflow session. Though I haven't tested your code here. Run this and leave a comment.
I guess that the below code is a huge improvement as it uses mapPartitions.
Code
def predict(rows):
    graph = tf.Graph()
    checkpoint_file = "/home/sahil/Desktop/Relation_Extraction/data/1485336002/checkpoints/model-300"
    print("Loading model................................")
    with graph.as_default():
        session_conf = tf.ConfigProto(
            allow_soft_placement=allow_soft_placement,
            log_device_placement=log_device_placement)
        sess = tf.Session(config=session_conf)
        with sess.as_default():
            # Load the saved meta graph and restore variables
            saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
            saver.restore(sess, checkpoint_file)
            print("**********************************************")
            # Get the placeholders from the graph by name
            input_x = graph.get_operation_by_name("X_train").outputs[0]
            dropout_keep_prob = graph.get_operation_by_name("dropout_keep_prob").outputs[0]
            # Tensors we want to evaluate
            predictions = graph.get_operation_by_name("output/predictions").outputs[0]
            # Generate batches for one epoch
            for row in rows:
                X_test = [row[1]]
                batch_predictions = sess.run(predictions, {input_x: X_test, dropout_keep_prob: 1.0})
                yield (row[0], batch_predictions)

result = data_rdd.mapPartitions(lambda iter: predict(iter))
result.foreach(print)
