It seems that in TensorFlow, there are at least three methods to print out the value of a tensor. I've been reading here and there, yet I'm still very confused.
These authors seem to summarize the different usage as:
Python print: can only print out certain attributes of a tensor, e.g. its shape, because outside the computation graph the tensor is merely an operation.
tensor.eval(): not sure how it differs from tf.print().
tf.print(): can output the actual value of a tensor, but it must be inserted somewhere in the graph and be used by some other operation, otherwise it will be dangling and still not printed.
My confusion probably also lies in not knowing how much Python functionality we can access inside a TensorFlow computation graph, or where the computation graph "ends" and the Python code "begins". For example:
If I insert a Python print between two lines where I construct a computation graph, when I call sess.run() later, will this line be called?
If I want to print out multiple tensor values in a computation graph, where should I put these statements?
What's the difference between tensor.eval() and tf.print()? How should I use them differently?
The native Python print() statement will be called when the graph is built for the first time. Check this out:
import tensorflow as tf

a = tf.placeholder(shape=None, dtype=tf.int32)
b = tf.placeholder(shape=None, dtype=tf.int32)
print("a is ", a, " while b is ", b)
c = tf.add(a, b)

with tf.Session() as sess:
    print(sess.run(c, feed_dict={a: 1, b: 2}))
    print(sess.run(c, feed_dict={a: 3, b: 1}))
By executing this code block, the output is:
# a is Tensor("Placeholder:0", dtype=int32) while b is Tensor("Placeholder_1:0", dtype=int32)
# 3
# 4
On the other hand, let us see tf.print():
a = tf.placeholder(shape=None, dtype=tf.int32)
b = tf.placeholder(shape=None, dtype=tf.int32)
print_op = tf.print("a is ", a, " while b is ", b)
with tf.control_dependencies([print_op]):
    c = tf.add(a, b)

with tf.Session() as sess:
    print(sess.run(c, feed_dict={a: 1, b: 2}))
    print(sess.run(c, feed_dict={a: 3, b: 1}))
So, according to the output below, we can see that if we add a control dependency so that the tf.print op must run whenever c is run, we get the output we want:
# a is 1 while b is 2
# 3
# a is 3 while b is 1
# 4
Finally, tensor.eval() is identical to sess.run(tensor). However, the limitation of tensor.eval() is that it evaluates a single tensor, while sess.run() can evaluate multiple tensors at once, as in sess.run([tensor1, tensor2]). If you ask me, I would always use sess.run(list_of_tensors) to evaluate as many tensors as I want and print out their values.
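For example, a minimal sketch (TF 1.x; the constants are made up for illustration):

t1 = tf.constant(1.0)
t2 = tf.constant(2.0)

with tf.Session() as sess:
    print(t1.eval())           # evaluates a single tensor via the default session
    print(sess.run([t1, t2]))  # evaluates several tensors in one call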
No. The Python print is not called when you call sess.run() later.
If you want to print when you call sess.run() then you can use tf.print.
To print out multiple tensor values in a graph, you should use sess.run() after opening tf.Session(). Sample code is below.
t = tf.constant(42.0)
u = tf.constant(37.0)
pt = tf.print(t)
pu = tf.print(u)

sess = tf.Session()
with sess.as_default():
    sess.run([pt, pu])
42
37
This answer and this in another question will be helpful.
tensor.eval() evaluates a tensor by running the necessary operations; it is not itself an operator.
tf.print() is just an operator that prints out the given tensor, so after invoking tf.Session(), tf.print() becomes one of the graph's nodes.
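A small sketch to illustrate the difference (TF 1.x; the constant is made up for illustration):

t = tf.constant([1, 2, 3])

with tf.Session() as sess:
    print(t.eval())        # pulls the value back into Python, same as sess.run(t)
    sess.run(tf.print(t))  # runs a print op inside the graph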
Related
I have been using Keras for a while now, but usually I don't have to use customized layers or perform more complex flow control, so I'm struggling to understand some things.
I am modeling a neural network with a customized layer on top. This customized layer calls another function (search_sigma), and inside this function I execute tf.while_loop, and inside tf.while_loop I execute tf.cond.
I cannot understand why the conditions are not working.
tf.while_loop stops even though the condition (l1) is still true
tf.cond executes both f1 and f2 (the callables true_fn and false_fn)
Could someone help me understand what I am missing?
I already tried to change both the tf.cond and tf.while_loop conditions to true tensors, just to see what would happen. The behavior (exactly the same errors) remained the same.
I also tried to write this code without implementing a class (using just functions). Nothing changed.
I tried to find solutions looking at the tensorflow documentation, other Stack Overflow questions, and websites talking about tf.while_loop and tf.cond.
I left some print()s in the body of the code to try to track what was happening.
import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K


class find_sigma:

    def __init__(self, t_inputs, inputs, expected_perp=10.):
        self.sigma, self.cluster = t_inputs
        self.inputs = inputs
        self.expected_perp = expected_perp
        self.min_sigma = tf.constant([0.01], tf.float32)
        self.max_sigma = tf.constant([50.], tf.float32)

    def search_sigma(self):

        def cond(s, sigma_not_found): return sigma_not_found

        def body(s, sigma_not_found):
            print('loop')
            pi = K.exp(- K.sum((K.expand_dims(self.inputs, axis=1) - self.cluster)**2, axis=2) / (2*s**2))
            pi = pi / K.sum(pi)
            MACHINE_EPSILON = np.finfo(np.double).eps
            pi = K.maximum(pi, MACHINE_EPSILON)
            H = - K.sum(pi*(K.log(pi)/K.log(2.)), axis=0)
            perp = 2**H
            print('0')
            l1 = tf.logical_and(tf.less(perp, self.expected_perp), tf.less(0.01, self.max_sigma - s))
            l2 = tf.logical_and(tf.less(self.expected_perp, perp), tf.less(0.01, s - self.min_sigma))

            def f1():
                print('f1')
                self.min_sigma = s
                s2 = (s + self.max_sigma)/2
                return [s2, tf.constant([True])]

            def f2(l2):
                tf.cond(l2, true_fn=f3, false_fn=f4)

            def f3():
                print('f3')
                self.max_sigma = s
                s2 = (s + self.min_sigma)/2
                return [s2, tf.constant([True])]

            def f4():
                print('f4')
                return [s, tf.constant([False])]

            output = tf.cond(l1, f1, f4)  # put f2 in place of f4
            s, sigma_not_found = output
            print('sigma_not_found = ', sigma_not_found)
            return [s, sigma_not_found]

        print('01')
        sigma_not_found = tf.constant([True])
        new_sigma, sigma_not_found = tf.while_loop(
            cond, body, loop_vars=[self.sigma, sigma_not_found]
        )
        print('saiu')
        print(new_sigma)
        return new_sigma
The piece of code that calls the above code is:
self.sigma = tf.map_fn(fn=lambda t: find_sigma(t, inputs).search_sigma() , elems=(self.sigma,self.clusters), dtype=tf.float32)
'inputs' is a (None, 10) size tensor
'self.sigma' is a (10,) size tensor
'self.clusters' is a (N, 10) size tensor
First of all, your first question was outstanding! Lots of info!
tf.while_loop is very confusing, and that is one of the reasons why TF moved to eager execution. You don't need to do that anymore.
Anyway, back to your 2 questions. The answer is the same for both: you are never executing your graph, you are just building it. While building the execution graph, TensorFlow needs to trace your Python code, and that is why you think tf.cond is running f1 and f2. It is "sort of running" them because it needs to go inside each branch to figure out what tensors/operations will be added to the graph.
The same applies to your question regarding tf.while_loop. It is never actually executing the loop at that point either.
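Here is a minimal TF 2.x sketch, separate from your code, that shows tracing versus execution (the function names are made up for illustration):

import tensorflow as tf

def f_true():
    print('tracing true branch')   # Python print: fires while the graph is traced
    return tf.constant(1)

def f_false():
    print('tracing false branch')  # also fires during tracing, even if never taken at run time
    return tf.constant(0)

@tf.function
def choose(pred):
    return tf.cond(pred, f_true, f_false)

print(choose(tf.constant(True)))   # both "tracing ..." messages appear once, then tf.Tensor(1, ...)
print(choose(tf.constant(False)))  # no new tracing messages; returns tf.Tensor(0, ...)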
I recommend a small change which might help you understand what I'm saying and also fix your problem. Remove that tf.while_loop from inside the body method. Create another method, say run(), and move the loop there, sort of like this:
def run(self):
    out = tf.while_loop(cond, body, loop_vars)
Then, invoke run(). It will force the graph to get executed.
How do I clear the gradient before calling .backward() for the second time?
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.
import torch

a = torch.tensor([2.0], requires_grad=True)
b = torch.tensor([2.0], requires_grad=True)
d = torch.tensor([2.0], requires_grad=True)
c = a * b
c.backward()
e = d * c
e.backward(retain_graph=True)
I tried doing this: c.zero_grad(), but I got the error that c has no method zero_grad().
As the error message reads, you need to specify the retain_graph=True option on the first .backward call, not the second:
c.backward(retain_graph=True)
e = d*c
e.backward()
If you do not retain the graph, the second backward pass will not be able to reach the nodes c, a, and b, because the intermediate results will have been freed by the first backward pass.
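Putting it together, a complete runnable sketch (this assumes the intended second product is d*c, as in the snippet above):

import torch

a = torch.tensor([2.0], requires_grad=True)
b = torch.tensor([2.0], requires_grad=True)
d = torch.tensor([2.0], requires_grad=True)

c = a * b
c.backward(retain_graph=True)  # keep the graph so c can be backpropagated through again

e = d * c
e.backward()                   # gradients now accumulate into a, b, and d
print(a.grad, b.grad, d.grad)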
I'm trying to execute these functions
def evaluate(sentence):
    sentence = preprocess_sentence(sentence)
    sentence = tf.expand_dims(
        START_TOKEN + tokenizer.encode(sentence) + END_TOKEN, axis=0)
    output = tf.expand_dims(START_TOKEN, 0)
    for i in range(MAX_LENGTH):
        predictions = model(inputs=[sentence, output], training=False)
        # select the last word from the seq_len dimension
        predictions = predictions[:, -1:, :]
        predicted_id = tf.cast(tf.argmax(predictions, axis=-1), tf.int32)
        # return the result if the predicted_id is equal to the end token
        if tf.equal(predicted_id, END_TOKEN[0]):
            break
        # check()
        # tf.cond(tf.equal(predicted_id, END_TOKEN[0]), true_fn=break, false_fn=lambda: tf.no_op())
        # concatenate the predicted_id to the output, which is given to the decoder
        # as its input.
        output = tf.concat([output, predicted_id], axis=-1)
    return tf.squeeze(output, axis=0)


def predict(sentence):
    prediction = evaluate(sentence)
    predicted_sentence = tokenizer.decode(
        [i for i in prediction if i < tokenizer.vocab_size])
    print('Input: {}'.format(sentence))
    print('Output: {}'.format(predicted_sentence))
    return predicted_sentence
however, I'm having the following error:
OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
I do understand that I have to rewrite the if condition in the form of tf.cond(). However, I don't know how to write break in TensorFlow, and I'm also not sure which if condition is causing the problem, since the exact same function in this notebook works properly:
https://colab.research.google.com/github/tensorflow/examples/blob/master/community/en/transformer_chatbot.ipynb#scrollTo=_NURhwYz5AXa
Any help?
The code in the notebook works because it uses TF 2.0, which has eager execution enabled by default. You can turn it on in older versions with tf.enable_eager_execution.
Alternatively, you can use break in graph mode without writing tf.cond if you use tf.function or tf.autograph, but they have some restrictions on the code you can run.
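For instance, a rough TF 2.x sketch (untested, with made-up names) of a data-dependent break handled by AutoGraph inside tf.function:

import tensorflow as tf

@tf.function
def first_index_of(values, target):
    index = tf.constant(-1)
    for i in tf.range(tf.shape(values)[0]):
        if tf.equal(values[i], target):  # AutoGraph converts this tensor-dependent if/break
            index = i
            break
    return index

print(first_index_of(tf.constant([3, 5, 7, 5]), tf.constant(7)))  # tf.Tensor(2, ...)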
There is nothing wrong with the break statement. The problem is elsewhere.
if tf.equal(predicted_id, END_TOKEN[0]):
    break
will give an error about using a Python bool with tensor ops. Since you have already used a tf.equal condition, this can be confusing. The solution is simple: the error is being thrown for the
if (boolean): Python syntax.
You have to take care of this Python bool syntax and convert it to tensor style, based on what you are planning to achieve. Remember, the condition returns a tensor of boolean values. Read this tensor and then proceed to do what you want. So, for example, the following would break unconditionally, irrespective of the value of the condition:
if tf.equal(predicted_id, END_TOKEN[0]) is not None:
    break
I'm doing some experimentation with TensorFlow, and have run into a snag. I'm trying to use TF to evaluate a change in a model, then either retain or revert the model, based on the resultant change in the loss function. I've got the hard part (conditional control) figured out, but I'm stuck on something that should be fairly straightforward: I can't seem to store tf.trainable_variables for an iteration, then restore it if needed.
Let's say I build an Op:
...
store_trainable_vars = []
for v in tf.trainable_variables():
    store_trainable_vars.append(v)
...
Then later, I want to restore tf.trainable_variables to the value it had when this Op was last run. I'd want to do something like:
def reject_move():
    revert_state = []
    for (v, s) in zip(tf.trainable_variables(), store_trainable_vars):
        revert_state.append(tf.assign(v, s, name="revert_state"))
    return(revert_state)
Obviously, this will re-evaluate store_trainable_vars, which in turn links to the present value of tf.trainable_variables(), obviating the revert_state Op. I need some way to store and retrieve the value of Tensors without calling back to the present value of those Tensors. Something like
...
store_trainable_vars = []
for v in tf.trainable_variables():
    store_trainable_vars.append(v.value_right_now())
...
where v.value_right_now() returns a constant that won't change until overwritten.
I know I could use Saver, but that solution writes to the disk, which is not acceptable for this application as it will run inside a training loop.
I'm probably missing something obvious - any guidance would be appreciated.
To restore a graph state manually, you need to use the tf.tuple or tf.group operation, which will modify the flow for a bulk change:
This creates a tuple of tensors with the same values as the tensors
argument, except that the value of each tensor is only returned after
the values of all tensors have been computed.
[Update] Here's how I would do it:
import numpy as np
import tensorflow as tf

x = tf.placeholder(shape=[None, 5], dtype=tf.float32, name='x')
W = tf.Variable(np.zeros([5, 5]), dtype=tf.float32, name='W')
b = tf.Variable(np.zeros([5]), dtype=tf.float32, name='b')
y = tf.add(tf.matmul(x, W), b)

with tf.Session() as session:
    batch = np.ones([2, 5])
    session.run(tf.global_variables_initializer())
    print(session.run(y, feed_dict={x: batch}))  # prints [2, 5] zeros

    # store the current value
    store = {v.name: v.eval(session) for v in tf.trainable_variables()}
    print(store)  # prints [5, 5] and [5] zeros

    # update
    new = {'W:0': np.ones([5, 5]), 'b:0': np.ones([5])}
    session.run(tf.tuple([tf.assign(var, new[var.name]) for var in tf.trainable_variables()]))
    print(session.run(y, feed_dict={x: batch}))  # prints [2, 5] sixes

    # restore
    session.run(tf.tuple([tf.assign(var, store[var.name]) for var in tf.trainable_variables()]))
    print(session.run(y, feed_dict={x: batch}))  # prints [2, 5] zeros again
But I really think you should reconsider your decision about Saver, because it was designed to be used inside a training loop as well. Internally, Saver does all the tricky work for you (in particular, its restore op calls tf.group and tf.control_dependencies if needed), which may otherwise become the source of pretty nasty bugs. Besides, the disk is (almost) always bigger than your GPU and main memory, so if you can afford to store the model in memory, you should be able to store it on disk as well.
Here are some parameters that help to control the proliferation of checkpoint files on disk:
max_to_keep indicates the maximum number of recent checkpoint files to
keep. As new files are created, older files are deleted. If None or 0, all checkpoint files are kept. Defaults to 5 (that is, the 5 most recent
checkpoint files are kept).
keep_checkpoint_every_n_hours: In addition to keeping the most recent
max_to_keep checkpoint files, you might want to keep one checkpoint file
for every N hours of training. This can be useful if you want to later
analyze how a model progressed during a long training session. For
example, passing keep_checkpoint_every_n_hours=2 ensures that you keep one checkpoint file for every 2 hours of training. The default value of 10,000 hours effectively disables the feature.
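For reference, a rough sketch (not from the original answer) of using Saver with these parameters inside a training loop; train_op, sess, and num_steps are assumed to exist already:

saver = tf.train.Saver(max_to_keep=5, keep_checkpoint_every_n_hours=2)

for step in range(num_steps):
    sess.run(train_op)
    if step % 100 == 0:
        saver.save(sess, '/tmp/model.ckpt', global_step=step)

# Revert to the most recent checkpoint if a move is rejected:
saver.restore(sess, tf.train.latest_checkpoint('/tmp'))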
[Update] As clarified in the comments, the main concern is disk latency, which may slow down training if the disk is accessed too often. If you're using Linux, it caches frequently used disk pages; Windows does as well. But if you want to be absolutely sure, consider using tmpfs.
It wasn't my original intent to answer this question myself, but I've come up with a method that works fairly well. So, I thought I'd share it. The key insight came from this very clever answer. The approach is to reuse the assignment nodes created for initial variable assignment. A complete class implementing that approach is given below.
import tensorflow as tf


class TensorFlowState(object):

    def __init__(self):
        # Get the graph.
        graph = tf.get_default_graph()
        # Extract the global variables from the graph.
        self.gvars = graph.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
        # Extract the Assign operations for later use.
        self.assign_ops = [graph.get_operation_by_name(v.op.name + "/Assign")
                           for v in self.gvars]
        # Extract the initial value ops from each Assign op for later use.
        self.init_values = [op.inputs[1] for op in self.assign_ops]

    def start(self, sess):
        self.sess = sess

    def store(self):
        # Record the current state of the TF global variables.
        self.state = self.sess.run(self.gvars)

    def restore(self):
        # Create a dictionary of the initializers and the stored state of the globals.
        feed_dict = {init_value: val
                     for init_value, val in zip(self.init_values, self.state)}
        # Use the initializer ops for each variable to load the stored values.
        return(self.sess.run(self.assign_ops, feed_dict=feed_dict))
To use, simply instantiate the class, call the start method to pass a tf.Session, and call the store and restore methods as needed inside your imperative training loop. I've used this implementation to build an optimizer, which runs about as fast as the gradient descent optimizers included with TensorFlow.
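For example, a hypothetical usage sketch (train_op, loss, and best_loss are made-up placeholders standing in for your own graph and bookkeeping):

state = TensorFlowState()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    state.start(sess)

    state.store()                   # snapshot the current variable values
    sess.run(train_op)              # try a candidate update
    if sess.run(loss) > best_loss:  # reject the move if it made things worse
        state.restore()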
I am creating neural nets with Tensorflow and skflow; for some reason I want to get the values of some inner tensors for a given input, so I am using myClassifier.get_layer_value(input, "tensorName"), myClassifier being a skflow.estimators.TensorFlowEstimator.
However, I find it difficult to work out the correct syntax of the tensor name, even knowing its name (and I'm getting confused between operations and tensors), so I'm using TensorBoard to plot the graph and look for the name.
Is there a way to enumerate all the tensors in a graph without using tensorboard?
You can do
[n.name for n in tf.get_default_graph().as_graph_def().node]
Also, if you are prototyping in an IPython notebook, you can show the graph directly in the notebook; see the show_graph function in Alexander's Deep Dream notebook.
I'll try to summarize the answers:
To get all nodes in the graph: (type tensorflow.core.framework.node_def_pb2.NodeDef)
all_nodes = [n for n in tf.get_default_graph().as_graph_def().node]
To get all ops in the graph: (type tensorflow.python.framework.ops.Operation)
all_ops = tf.get_default_graph().get_operations()
To get all variables in the graph: (type tensorflow.python.ops.resource_variable_ops.ResourceVariable)
all_vars = tf.global_variables()
To get all tensors in the graph: (type tensorflow.python.framework.ops.Tensor)
all_tensors = [tensor for op in tf.get_default_graph().get_operations() for tensor in op.values()]
To get all placeholders in the graph: (type tensorflow.python.framework.ops.Tensor)
all_placeholders = [placeholder for op in tf.get_default_graph().get_operations() if op.type=='Placeholder' for placeholder in op.values()]
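To see what these snippets return, here is a tiny made-up graph (TF 1.x) you can try them on:

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None], name='x')
w = tf.Variable(2.0, name='w')
y = tf.multiply(x, w, name='y')

graph = tf.get_default_graph()
print([op.name for op in graph.get_operations()])                      # all node names
print([t.name for op in graph.get_operations() for t in op.values()])  # all tensor names
print([v.name for v in tf.global_variables()])                         # ['w:0']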
Tensorflow 2
To get the graph in Tensorflow 2, instead of tf.get_default_graph() you need to instantiate a tf.function first and access the graph attribute, for example:
graph = func.get_concrete_function().graph
where func is a tf.function
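A small sketch of that (the function name my_func is made up for illustration):

import tensorflow as tf  # TF 2.x

@tf.function
def my_func(x):
    return tf.square(x) + 1.0

graph = my_func.get_concrete_function(tf.constant(2.0)).graph
print([op.name for op in graph.get_operations()])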
There is a way to do it slightly faster than in Yaroslav's answer by using get_operations. Here is a quick example:
import tensorflow as tf
a = tf.constant(1.3, name='const_a')
b = tf.Variable(3.1, name='variable_b')
c = tf.add(a, b, name='addition')
d = tf.multiply(c, a, name='multiply')
for op in tf.get_default_graph().get_operations():
    print(str(op.name))
tf.all_variables() can get you the information you want.
Also, this commit made today in TensorFlow Learn provides a function get_variable_names in the estimator that you can use to retrieve all variable names easily.
I think this will do too:
print(tf.contrib.graph_editor.get_tensors(tf.get_default_graph()))
But compared with Salvado and Yaroslav's answers, I don't know which one is better.
The accepted answer only gives you a list of strings with the names. I prefer a different approach, which gives you (almost) direct access to the tensors:
graph = tf.get_default_graph()
list_of_tuples = [op.values() for op in graph.get_operations()]
list_of_tuples now contains every tensor, each within a tuple. You could also adapt it to get the tensors directly:
graph = tf.get_default_graph()
list_of_tuples = [op.values()[0] for op in graph.get_operations()]
Since the OP asked for the list of the tensors instead of the list of operations/nodes, the code should be slightly different:
graph = tf.get_default_graph()
tensors_per_node = [node.values() for node in graph.get_operations()]
tensor_names = [tensor.name for tensors in tensors_per_node for tensor in tensors]
Previous answers are good, I'd just like to share a utility function I wrote to select Tensors from a graph:
def get_graph_op(graph, and_conds=None, op='and', or_conds=None):
    """Selects nodes' names in the graph if:
    - The name contains all items in and_conds
    - OR/AND depending on op
    - The name contains any item in or_conds

    Conditions starting with a "!" are negated.
    Returns all ops if no optional arguments are given.

    Args:
        graph (tf.Graph): The graph containing sought tensors
        and_conds (list(str), optional): Defaults to None.
            "and" conditions
        op (str, optional): Defaults to 'and'.
            How to link the and_conds and or_conds:
            with an 'and' or an 'or'
        or_conds (list(str), optional): Defaults to None.
            "or" conditions

    Returns:
        list(str): list of relevant tensor names
    """
    assert op in {'and', 'or'}

    if and_conds is None:
        and_conds = ['']
    if or_conds is None:
        or_conds = ['']

    node_names = [n.name for n in graph.as_graph_def().node]

    ands = {
        n for n in node_names
        if all(
            cond in n if '!' not in cond
            else cond[1:] not in n
            for cond in and_conds
        )}

    ors = {
        n for n in node_names
        if any(
            cond in n if '!' not in cond
            else cond[1:] not in n
            for cond in or_conds
        )}

    if op == 'and':
        return [
            n for n in node_names
            if n in ands.intersection(ors)
        ]
    elif op == 'or':
        return [
            n for n in node_names
            if n in ands.union(ors)
        ]
So if you have a graph with ops:
['model/classifier/dense/kernel',
'model/classifier/dense/kernel/Assign',
'model/classifier/dense/kernel/read',
'model/classifier/dense/bias',
'model/classifier/dense/bias/Assign',
'model/classifier/dense/bias/read',
'model/classifier/dense/MatMul',
'model/classifier/dense/BiasAdd',
'model/classifier/ArgMax/dimension',
'model/classifier/ArgMax']
Then running
get_graph_op(tf.get_default_graph(), ['dense', '!kernel'], 'or', ['Assign'])
returns:
['model/classifier/dense/kernel/Assign',
'model/classifier/dense/bias',
'model/classifier/dense/bias/Assign',
'model/classifier/dense/bias/read',
'model/classifier/dense/MatMul',
'model/classifier/dense/BiasAdd']
The following solution works for me in TensorFlow 2.3:
import tensorflow as tf

def load_pb(path_to_pb):
    with tf.io.gfile.GFile(path_to_pb, 'rb') as f:
        graph_def = tf.compat.v1.GraphDef()
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
        return graph

tf_graph = load_pb(MODEL_FILE)
sess = tf.compat.v1.Session(graph=tf_graph)

# Show tensor names in graph
for op in tf_graph.get_operations():
    print(op.values())
where MODEL_FILE is the path to your frozen graph.
Taken from here.
This worked for me:
for n in tf.get_default_graph().as_graph_def().node:
    print('\n', n)