I want to print a tensor in my program to see its internal values once it gets evaluated. The problem, however, is that the tensor is declared inside a function. Here is some example code to illustrate what I want to do:
a = tf.Variable([[2,3,4], [5,6,7]])
b = tf.Variable([[1,2,2], [3,3,3]])

def divide(a,b):
    with tf.variable_scope('tfdiv', reuse=True):
        c = tf.divide(a,b, name='c')
        # Cannot print(c) here, as this will only yield tf info on c
    return c

d = divide(a,b)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(d)
    sess.run(tf.get_variable('tfdiv/c:0').eval(session=sess))
Previously, I was able to do print(c.eval(session=sess)), but since c is now a local variable inside a function, that does not work. As can be seen in the code above, I have tried to use TensorFlow's variable scope to access the tensor and then evaluate it. Unfortunately, this results in the error message:
ValueError: Shape of a new variable (tfdiv/c:0) must be fully defined, but
instead was <unknown>.
I tried to use the reuse=True flag, but I still get the same error. Any thoughts on how I can solve this problem? Ideally there would be a print(c) equivalent that could be put inside the divide function, as written in the code above.
This will achieve what you want to do:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(d))
Alternatively, you could replace the last line with:
print(sess.run(tf.get_default_graph().get_tensor_by_name('tfdiv/c:0')))
It is important to understand the difference between Python-side code and TensorFlow-side code. In Python, you only set up the graph: d = divide(a, b) creates something like this:
You set up a node (the square) that will divide the data in the nodes a and b. It doesn't divide them right away! Note that in black you have the Python variable names, and in gray you have the TensorFlow node names [1]. a and b also have default names, since you didn't specify any. The gray "c" is the name you specified with name='c'. The local variable c and the global variable d (Python) both refer to that same operation (node).
This is why print(d) merely prints the info about that node. Once you have set up the graph, sess.run(d) runs, on the TensorFlow side, all the nodes that the node in d requires. It then retrieves the result and makes it available on the Python side as a numpy array.
You can use tf.Print(input, data) to print tensors on TF side. Note this is an operation (a node in the graph) that does nothing to the input tensor, it merely passes it through, while also printing everything in data.
In your case, you can use tf.Print on the TensorFlow side like this:
def divide(a,b):
    with tf.variable_scope('tfdiv', reuse=True):
        c = tf.divide(a,b, name='c')
        cp = tf.Print(c, [c], message='Value of c: ', name='P')
    return cp
This effectively adds another node in the graph (named P on TF side):
Now the value of the operation c will be printed every time it is evaluated. Note it will also be printed every time an op that depends on it is evaluated; for example, if you later do e = d + 1, then evaluating e needs d, which refers to the printing node (returned from the function divide).
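For example, continuing the code above (just to make that dependency concrete):

e = d + 1
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(e)  # also triggers the 'Value of c: ' print, because e depends on d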
Finally, note that if you do this in a Jupyter notebook, the print will appear in the terminal of the notebook server. The details of this are not important for now :).
[1] The :0 is added by default so that you can retrieve any tensor by using name_of_op:0. Note the distinction between the name of an operation (tfdiv/c) and the name of a tensor (tfdiv/c:0).
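For example, assuming the tfdiv graph from the question has been built in the default graph, the two names retrieve the operation and its output tensor respectively:

g = tf.get_default_graph()
op = g.get_operation_by_name('tfdiv/c')  # the operation
t = g.get_tensor_by_name('tfdiv/c:0')    # its first (and only) output tensor
print(op.outputs[0] is t)                # True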
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I am new to Python and I am trying to implement a SABR model. I have defined a function with the following parameters:
def haganimpliedvol(a,f,k,B,v,t,p):
    if k != f:
        z = v/a*math.pow(f*k,(1-B)/2)*math.log(f/k)
        xz = math.log((math.sqrt(1-2*p*z+math.pow(z,2))+z-p)/(1-p))
        sigma = a/math.pow(f*k,(1-B)/2)*(1 + math.pow(1-B,2)/24* math.pow(math.log(f/k),2)+\
                math.pow(1-B,4)/1920* math.pow(math.log(f/k),4))*\
                xz*\
                (1+(math.pow(1-B,2)/24*math.pow(a,2)/math.pow(f/k,1-B)+1/4*(p*B*v*a)/math.pow(f/k,(1-B)/2)+\
                (2-3*math.pow(p,2))/24*math.pow(v,2)))*t
    else:
        sigma = a/math.pow(f,1-B)*\
                (1+(math.pow(1-B,2)/24*math.pow(a,2)/math.pow(f,(2-2*B))+\
                1/4*(p*B*a*v)/math.pow(f,1-B)+(2-3*math.pow(p,2))/24*math.pow(v,2)))*t
    return(sigma)
Now I define another function that calls the haganimpliedvol() function:
params = [0.4,0.6,0.1,-0.4]

def objective(params):
    global k,sigma_iv,t,f
    a = params[0]
    B = params[1]
    v = params[2]
    p = params[1]
    for (i,j,k) in zip(k,t,f):
        calc_vols = np.array([haganimpliedvol(a,f,k,B,v,t,p)])
    return(calc_vols)
As can be seen, a few parameters in the functions are lists. I want to get an array as an output. However, I keep getting the error message in the subject line.
Pay attention to the variables in this call:
for (i,j,k) in zip(k,t,f):
    calc_vols = np.array([haganimpliedvol(a,f,k,B,v,t,p)])
For the zip to work, k, t, and f have to be lists or arrays of matching size.
Don't use k as an iteration variable; it is already used in the zip. I think you are just being careless here, or confused.
Also look at the arguments to the hagan... function. Are the f, k, t supposed to be the variables used in the zip? It would make more sense to use the iteration variables (i, j, ...). Again, this just looks like you are being careless, or not paying attention to what happens.
As for the ambiguity error, it most likely arises in
if k != f:
If either k or f is an array (or both), then k != f will be a boolean array. That can't be used in an if, which requires a single True or False value. The if does not iterate over the condition; it is a basic Python if, a simple branch.
This ambiguity error comes up frequently, in various contexts, but always with the same basic issue: using an array in a context that requires a scalar True/False. A simple web search should provide lots of examples.
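For instance, a small sketch of the same error with toy arrays (these values are made up purely for illustration):

import numpy as np

k = np.array([0.1, 0.2, 0.3])
f = np.array([0.1, 0.25, 0.3])

mask = k != f       # array([False, True, False]) -- an array, not a single bool
# if mask:          # raises: ValueError: The truth value of an array ... is ambiguous
if mask.any():      # True if at least one element differs
    print("some strikes differ from the forwards")
if mask.all():      # True only if every element differs
    print("all strikes differ from the forwards")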
@hpaulj thank you for leading me on the right path. I vectorized my function, made some edits, and now it is working fine.
haganimpliedvol = np.vectorize(haganimpliedvol, excluded=['a','B','v','p'])

params = [0.2,0.7,0.01,-0.4]

def objective(params):
    global k,sigma_iv,t,f
    a = params[0]
    B = params[1]
    v = params[2]
    p = params[1]
    calc_vols = haganimpliedvol(a,f,k,B,v,t,p)
    return(calc_vols)
Are you sure you want to pass arrays into the haganimpliedvol() function?
The general convention is to write functions which take a single input type.
Maybe call it once per item in the array?
Or write the function in a way that, if it sees the input is a list, it iterates over it, and if the inputs aren't lists, it just calculates once (a rough sketch of that pattern is shown after the link below).
See this thread for ideas
How to make a function that can handle single inputs or lists of inputs
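For illustration, a minimal sketch of that pattern; the function names and the toy per-strike calculation are placeholders, not the actual SABR formula:

import numpy as np

def implied_vol(k):
    # toy stand-in for the real single-value calculation
    return 0.2 + 0.01 * k

def implied_vol_any(k):
    # if the input is a list or array, apply the scalar function element-wise;
    # otherwise compute it once for the single value
    if isinstance(k, (list, np.ndarray)):
        return np.array([implied_vol(x) for x in k])
    return implied_vol(k)

print(implied_vol_any(0.5))         # single value
print(implied_vol_any([0.5, 1.0]))  # array of results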
I just tried to get an understanding of the TensorFlow naming behavior and I still need some clarification.
I worked on a project where I ran into trouble with the naming of tensors, because they were predefined in a function that was called later.
So I have the following example here:
import tensorflow as tf

def foo():
    with tf.variable_scope("foo", reuse=True):
        a = tf.placeholder(tf.float32, name="a")
        b = tf.placeholder(tf.float32, name="b")
        return a, b

##
a, b = foo()
print(a)
print(b)
I get the output:
Tensor("foo/a:0", dtype=float32)
Tensor("foo/b:0", dtype=float32)
When I call it again though, I get the output:
Tensor("foo_1/a:0", dtype=float32)
Tensor("foo_1/b:0", dtype=float32)
Why is this the case? I set reuse to True, so I expected the tensors to be in the same variable_scope "foo" again, or that the program would throw an error like "tensors already defined".
So, I tried a workaround with tf.get_variable:
def foo():
    with tf.variable_scope("foo", reuse=True):
        a = tf.get_variable("v", [1])
        return a

##
a1 = foo()
print(a1)
graph = tf.get_default_graph()
# call tensors by name in tensorflow to avoid confusion with the naming
graph.get_tensor_by_name("foo/v:0")
Here, I always get the same output:
<tf.Variable 'foo/v:0' shape=(1,) dtype=float32_ref>
Unfortunately, I cannot work with variables, because you cannot define a dynamic shape for them; you need placeholders for shapes that can vary.
Can someone explain to me why the program keeps creating new variable scopes for placeholders, but not when I call tf.get_variable()?
Thank you!
You can force the reuse of a scope by adding a '/' after the name, i.e. tf.variable_scope("foo/", reuse=True).
However, that won't solve your problem.
In the case of variables, calling tf.Variable will always create a new variable, whereas calling tf.get_variable will reuse it if it already exists.
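For example, a quick sketch of that difference (the names here are only illustrative):

with tf.variable_scope("foo"):
    v1 = tf.get_variable("v", [1])        # creates foo/v
with tf.variable_scope("foo", reuse=True):
    v2 = tf.get_variable("v", [1])        # reuses foo/v
print(v1 is v2)                           # True

w1 = tf.Variable(0, name="w")             # creates w
w2 = tf.Variable(0, name="w")             # creates w_1, a brand-new variable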
But with Placeholders there is no tf.get_placeholder.
What you can do is define your placeholders outside of foo, only once, and get them by name using tf.get_default_graph().get_tensor_by_name(name) or directly using the python variable whenever you need them.
example with get_tensor_by_name:
import tensorflow as tf

with tf.name_scope("scope"):
    tf.placeholder(tf.float32, name="a")
    tf.placeholder(tf.float32, name="b")

def foo():
    a = tf.get_default_graph().get_tensor_by_name("scope/a:0")
    b = tf.get_default_graph().get_tensor_by_name("scope/b:0")
    return a, b

a, b = foo()
print(a)
print(b)
Note that placeholders, unlike variables, do not maintain a state that can be reused or not. They are merely a "pointer" to a tensor which will be fed later. They should not be part of your model, but an input to it, so you should not be creating them several times anyway.
In case you ran the code in a Jupyter notebook twice, make sure that the notebook's kernel is not reusing the variables.
If you, instead, run the code multiple times, each within the scope of a fresh default graph, you always get the same result:
def foo():
    with tf.variable_scope("foo", reuse=True):
        a = tf.placeholder(tf.float32, name="a")
        b = tf.placeholder(tf.float32, name="b")
        return a, b

with tf.Graph().as_default():
    a, b = foo()
    print(a)  # gives tensor name 'foo/a'
    print(b)  # gives tensor name 'foo/b'
But that is different from the use case, where the function is called repeatedly to create more placeholders.
c,d = foo()
In this case, the answer above by f4 is the recommended solution.
I need to run a loop where the iterator is used only for the bound, and I want to access the result of the loop afterwards.
import tensorflow as tf

def add(h_tl):
    res = tf.add(h_tl, tf.constant(1, shape=[2,1]))
    return res

x = tf.constant(5)  # iterator for while loop
h_tl = tf.constant(0, shape=[2,1])
H_l = tf.constant(0, shape=[2,1])

def body(x):  # Body of loop
    global H_l
    global h_tl  # I need to use H_l and h_tl in the body of the loop
    h_tl = add(h_tl)  # Some simple modifications to h_tl which will be added to H_l
    H_l = tf.concat([H_l, h_tl], 1)
    # I want the result of H_l after the loop and use it for further calculations
    return (x+1)

def condition(x):  # Stop condition
    return x < tf.constant(10)

y = tf.while_loop(condition, body, [x])

sess = tf.Session()
print(sess.run(y))
print(sess.run(H_l))  # Result of the loop
sess.close()
So, I want to do calculations on H_l which is done for every x between 5 and 10. How do I access the end result of H_l after the loop is completed? When I run x I get the correct value of 10, but when I try to run H_l I get the following error: "Operation 'while/concat' has been marked as not fetchable."
TensorFlow while loops are designed to allow different iterations to be run in parallel. This causes them to behave rather differently from what you'd expect coming from other programming languages.
One such oddity is that if you create a tensor inside the loop, you cannot access it from outside. In your case, you are creating a new tensor with every tf.concat op. Yes, even though that tensor is stored under the same Python variable name as the previous version, it is still a new tensor: indeed, the original H_l is a constant, so its value cannot be changed; you can only create a new tensor in its place. The same is true for h_tl.
To solve this problem for H_l, what you want is to store an array of tensors, and add another entry to this array at each iteration. The data structure for doing this is tf.TensorArray, which you should initialize outside the loop and then write into at each iteration.
For h_tl, in which you only want to examine the final value, you don't need a tf.TensorArray; you could use a tf.Variable instead, and use tf.assign to update its value.
In both cases, take care to ensure that the mutating ops are executed, by including them in a tf.control_dependencies block.
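A minimal sketch of the tf.TensorArray approach, rewritten so that the loop state is carried as explicit loop variables instead of Python globals; the shapes and the simple +1 update mirror the question and are only illustrative:

import tensorflow as tf

def condition(x, h_tl, ta):
    return x < 10

def body(x, h_tl, ta):
    h_tl = tf.add(h_tl, tf.constant(1, shape=[2, 1]))  # per-iteration update of h_tl
    ta = ta.write(x - 5, h_tl)                         # record this iteration's value
    return x + 1, h_tl, ta

x = tf.constant(5)
h_tl = tf.constant(0, shape=[2, 1])
ta = tf.TensorArray(tf.int32, size=5)                  # created outside the loop, one slot per iteration

x_final, h_final, ta_final = tf.while_loop(condition, body, [x, h_tl, ta])
# stack() gives shape [5, 2, 1]; rearrange into the [2, 5] layout the question builds with concat
H_l = tf.reshape(tf.transpose(ta_final.stack(), [1, 0, 2]), [2, -1])

with tf.Session() as sess:
    print(sess.run([x_final, H_l]))

Because everything the final H_l depends on is produced as an output of the loop, it is fetchable after the loop completes.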
I have a function in a file neural_network.py that defines a loss function:
def loss(a, b):
    ...
    debug = tf.Print(a, [a], message = 'debug: ')
    debug.eval(session = ???)
    return tf.add(a, b)
To explain, somewhere in this function I want to print a tensor. However, I don't have any session declared in this function; my sessions are declared in another file called forecaster.py. Therefore, when I try to put tf.Print() in loss(), I can't because I don't know which session to eval with. Is there a way to solve this problem, either by using tf.Print() or other debug methods? Thanks!
tf.Print works as an identity function which returns the same tensor that you passed as the first parameter, having a side effect of printing the list of tensors specified as the second parameter.
So you should use as following:
def loss(a, b):
    ...
    a = tf.Print(a, [a], message = 'debug: ')
    return tf.add(a, b)
a will be printed each time the tensor tf.add(a, b) is evaluated.
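For instance, a hypothetical call site in forecaster.py (the placeholder names below are made up); no session is needed inside loss(), because the Print side effect fires whenever the returned tensor is evaluated:

a_ph = tf.placeholder(tf.float32)
b_ph = tf.placeholder(tf.float32)
total = loss(a_ph, b_ph)

with tf.Session() as sess:
    # evaluating `total` also runs the Print node, so the message
    # 'debug: ' and the value of a are written to standard error
    sess.run(total, feed_dict={a_ph: 1.0, b_ph: 2.0})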
I was trying to build an autoincrementing graph in TensorFlow. I thought that the assign op might be suitable for that, but found no documentation for it.
I assumed that this op returns its value, as assignment does in C-like languages, and wrote the following code:
import tensorflow as tf
counter = tf.Variable(0, name="counter")
one = tf.constant(1)
ten = tf.constant(10)
new_counter = tf.add(counter, one)
assign = tf.assign(counter, new_counter)
result = tf.add(assign, ten)
init_op = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init_op)
    for _ in range(3):
        print sess.run(result)
and this code works.
The question is: is this the expected behavior? Why is the assign op not documented here: https://www.tensorflow.org/versions/0.6.0/api_docs/index.html
Is it a non-recommended op?
The tf.assign() operator is the underlying mechanism that implements the Variable.assign() method. It takes a mutable tensor (with tf.*_ref type) and a new value, and returns a mutable tensor that has been updated with the new value. The return value is provided to make it easier to order an assignment before a subsequent read, but this feature is not well documented. An example will hopefully illustrate:
v = tf.Variable(0)
new_v = v.assign(10)
output = v + 5 # `v` is evaluated before or after the assignment.
sess.run(v.initializer)
result, _ = sess.run([output, new_v.op])
print result # ==> 10 or 15, depending on the order of execution.

v = tf.Variable(0)
new_v = v.assign(10)
output = new_v + 5 # `new_v` is evaluated after the assignment.
sess.run(v.initializer)
result = sess.run([output])
print result # ==> 15
In your code example the dataflow dependencies enforce the order of execution [read counter] -> new_counter = tf.add(...) -> tf.assign(...) -> [read output of assign] -> result = tf.add(...), which means that the semantics are unambiguous. However, the read-modify-write steps to update the counter are somewhat inefficient, and can have unexpected behavior when there are multiple steps running concurrently. For example, multiple threads accessing the same variable could observe the counter moving backwards (in the case that an older value was written back after a newer value).
I would recommend that you use Variable.assign_add() to update the counter, as follows:
counter = tf.Variable(0, name="counter")
one = tf.constant(1)
ten = tf.constant(10)
# assign_add ensures that the counter always moves forward.
updated_counter = counter.assign_add(one, use_locking=True)
result = tf.add(updated_counter, ten)
# ...
tf.assign() is nicely documented in the latest versions and it is used frequently in projects.
This operation outputs "ref" after the assignment is done. This makes
it easier to chain operations that need to use the reset value.
In simpler words, it takes your original variable and a new value, updates the variable's original value with the new one, and returns a reference to your original variable.
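A minimal sketch of that behavior (the variable name and values are just for illustration):

import tensorflow as tf

v = tf.Variable(3, name="v")
assigned = tf.assign(v, 7)     # returns a reference to v, holding 7 once the op has run

with tf.Session() as sess:
    sess.run(v.initializer)
    print(sess.run(assigned))  # 7 -- evaluating the assign op yields the updated value
    print(sess.run(v))         # 7 -- the variable itself was updated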