How can one specify a custom initializer as the third argument for tf.get_variable()? Specifically, I have a variable y which I want to initialize using another (already initialized) variable x.
This is easy to do with tf.Variable(): just write y = tf.Variable(x.initialized_value()). But I couldn't find an analog in the documentation for tf.get_variable().
You can use x.initialized_value() as well. For example:
import tensorflow as tf

x = tf.Variable(1.0)
y = tf.get_variable('y', initializer=x.initialized_value())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y))  # prints 1.0
I have a complicated use case which I've distilled down to just incrementing a variable in TensorFlow.
a = tf.Variable(1, trainable=False)
b = tf.constant(2)
a = tf.assign_add(a, b)
In [32]: type(a)
Out[32]: tensorflow.python.framework.ops.Tensor
My actual use case is generating a new random tensor under certain conditions each time my custom Keras layer is called, but it seems to boil down to a variable turning into a tensor as soon as I do anything to it. Is the correct approach to wrap each assignment as a = tf.Variable(tf.assign(a, b)) so that a changes every time my Keras layer is called?
You are overthinking it. tf.assign_add returns an op that adds to a variable. The fact that it also returns the resulting value is for convenience only; the variable itself is updated.
Example:
import tensorflow as tf
a = tf.Variable(1, trainable=False)
b = tf.constant(2)
c = tf.assign_add(a, b)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
print(sess.run(a))
# 1: the original value
print(sess.run(c))
# 3: the result of the addition
print(sess.run(a))
# 3: OK, the variable has indeed been added to
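To connect this back to the Keras use case, here is a rough sketch of a custom layer that updates a non-trainable variable on every call (assuming tf.keras on TF 1.x; CallCounter and the calls weight are made-up names). The control dependency ties the update to the layer's output, so the update runs whenever the layer does:

import tensorflow as tf

class CallCounter(tf.keras.layers.Layer):
    def build(self, input_shape):
        # Non-trainable state, analogous to tf.Variable(1, trainable=False).
        self.calls = self.add_weight(
            'calls', shape=(), dtype=tf.float32,
            initializer=tf.zeros_initializer(), trainable=False)
        super(CallCounter, self).build(input_shape)

    def call(self, inputs):
        increment = tf.assign_add(self.calls, 1.0)
        # Force the update to run whenever the output is computed.
        with tf.control_dependencies([increment]):
            return tf.identity(inputs)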
I want to optimize a cost function. This cost function contains variables and other parameters that are not variables. These non-variable parameters are obtained from the variables.
Here is a toy example that illustrates the point:
import numpy as np
import tensorflow as tf

r_init = np.array([5.0, 6.0])
x = tf.get_variable("x_var", initializer=r_init[0], trainable=True)
y = tf.get_variable("y_var", initializer=r_init[1], trainable=True)

def cost(x, y):
    a = x
    return a * ((x - 1.0) ** 2 + (y - 1.0) ** 2)

train_op = tf.train.AdamOptimizer(learning_rate=0.05).minimize(cost(x, y))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100):
        print(sess.run([cost(x, y), train_op]))
    print('x=', x.eval(session=sess))
    print('y=', y.eval(session=sess))
As you can see, the parameter a is defined from the variable x. On the other hand, a should not be a variable: I want the optimizer to see it as a constant. This constant should be updated as the variable x is updated during the optimization process.
How can I define a non-variable parameter a from the variable x? I am making this up, but intuitively, what comes to my mind is something like:
a = tf.to_constant(x)
Any ideas?
You are looking for tf.stop_gradient:
a = tf.stop_gradient(x)
Quoting the docs,
This is useful any time you want to compute a value with TensorFlow but need to pretend that the value was a constant.
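Applied to the toy example above, a minimal sketch (fresh variable names would be needed if you run it in the same graph as the code from the question):

import tensorflow as tf

x = tf.get_variable("x_var", initializer=5.0, trainable=True)
y = tf.get_variable("y_var", initializer=6.0, trainable=True)

# a always carries the current value of x, but no gradient flows
# through it, so the optimizer treats it as a constant.
a = tf.stop_gradient(x)
cost = a * ((x - 1.0) ** 2 + (y - 1.0) ** 2)
train_op = tf.train.AdamOptimizer(learning_rate=0.05).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100):
        sess.run(train_op)
    print('x=', x.eval(session=sess))
    print('y=', y.eval(session=sess))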
In the following code segment, even though the initial value and expected_shape do not match, the code runs and prints 1.0:
import tensorflow as tf
import numpy as np
X = tf.placeholder(dtype=tf.float32)
y = tf.placeholder(dtype=tf.float32)
W = tf.Variable(0.0, expected_shape=(3,1))
b = tf.Variable(1.0)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
print(sess.run(tf.add(W,b)))
Isn't it supposed to throw an error?
As of now, it is advised to make use of tf.get_variable and to avoid tf.Variable as much as possible.
Now, coming to your question on why expected_shape has no effect: if you look into the source code, the argument is documented as "Deprecated and ignored." And if you look further into _init_from_args, the expected_shape parameter is indeed ignored; its value is never used for any further processing.
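For comparison, a small sketch of the recommended pattern: with tf.get_variable the shape argument is actually honored, and the initializer is evaluated at that shape.

import tensorflow as tf

W = tf.get_variable('W', shape=(3, 1), initializer=tf.zeros_initializer())
b = tf.get_variable('b', shape=(), initializer=tf.ones_initializer())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(tf.add(W, b)))  # a (3, 1) matrix of ones: b broadcasts over W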
I am starting to use TensorFlow (with Python) and was wondering: when using a placeholder in a function, why not have my function take an argument that it feeds into a TensorFlow constant, rather than using a placeholder?
Here is an example (the difference is in x):
def sigmoid(z):
    x = tf.constant(z, dtype=tf.float32, name="x")
    sigmoid = tf.sigmoid(x)
    with tf.Session() as sess:
        result = sess.run(sigmoid)
    return result
instead of:
def sigmoid(z):
    x = tf.placeholder(tf.float32, name="...")
    sigmoid = tf.sigmoid(x)
    with tf.Session() as sess:
        result = sess.run(sigmoid, feed_dict={x: z})
    return result
The idea with TensorFlow is that you will repeat the same computation on lots of data. When you write the code, you are setting up a computational graph that you will later execute on the data. In your first example, you have hard-coded the data into a constant; this is not a typical TensorFlow use case. The second example is better because it allows you to reuse the same computational graph with different data.
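To make the reuse concrete, a small sketch: the graph is built once, and the same placeholder is fed different values on each run (printed values are rounded).

import tensorflow as tf

x = tf.placeholder(tf.float32, name="x")
sigmoid = tf.sigmoid(x)  # graph built once

with tf.Session() as sess:
    print(sess.run(sigmoid, feed_dict={x: 0.0}))         # 0.5
    print(sess.run(sigmoid, feed_dict={x: [1.0, 2.0]}))  # [0.731..., 0.880...]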
I've been trying to gather the gradients at each step of the GradientDescentOptimizer within TensorFlow; however, I keep running into a TypeError when I try to pass the result of compute_gradients() to sess.run(). The code I'm trying to run is:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
# note that up to this point, this example is identical to the tutorial on tensorflow.org
gradstep = tf.train.GradientDescentOptimizer(0.01).compute_gradients(cross_entropy)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
batch_x, batch_y = mnist.train.next_batch(100)
print(sess.run(gradstep, feed_dict={x: batch_x, y_: batch_y}))  # raises TypeError
Note that if I replace the last line with print(sess.run(train_step, feed_dict={x: batch_x, y_: batch_y})), where train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy), the error is not raised. My confusion arises from the fact that minimize calls compute_gradients with exactly the same arguments as its first step. Can someone explain why this behavior occurs?
The Optimizer.compute_gradients() method returns a list of (Tensor, Variable) pairs, where each tensor is the gradient with respect to the corresponding variable.
Session.run() expects a list of Tensor objects (or objects convertible to a Tensor) as its first argument. It does not understand how to handle a list of pairs, hence the TypeError when you run sess.run(gradstep, ...).
The correct solution depends on what you are trying to do. If you want to fetch all of the gradient values, you can do the following:
grad_vals = sess.run([grad for grad, _ in gradstep], feed_dict={x: batch_x, y_: batch_y})

# Then, e.g., build a variable name-to-gradient dictionary.
var_to_grad = {}
for grad_val, (_, var) in zip(grad_vals, gradstep):
    var_to_grad[var.name] = grad_val
If you also want to fetch the variables, you can execute the following statement separately:
sess.run([var for _, var in gradstep])
...though note that—without further modification to your program—this will just return the initial values for each variable.
You will have to run the optimizer's training step (or otherwise call Optimizer.apply_gradients()) to update the variables.
minimize calls compute_gradients followed by apply_gradients: it's possible you're missing the second step.
compute_gradients just returns the grads / variables, but doesn't apply the update rule to them.
Here is an example: https://github.com/tensorflow/tensorflow/blob/f2bd0fc399606d14b55f3f7d732d013f32b33dd5/tensorflow/python/training/optimizer.py#L69
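For concreteness, a minimal self-contained sketch of the two-step pattern (the scalar loss and variable here are made up):

import tensorflow as tf

w = tf.Variable(3.0)
loss = tf.square(w)

opt = tf.train.GradientDescentOptimizer(0.01)
grads_and_vars = opt.compute_gradients(loss)      # [(dloss/dw, w)]; no update yet
train_step = opt.apply_gradients(grads_and_vars)  # the op that actually updates w

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    grad_vals = sess.run([g for g, _ in grads_and_vars])  # fetch just the gradient tensors
    sess.run(train_step)                                  # then apply the update
    print(grad_vals, sess.run(w))  # [6.0] and 3.0 - 0.01 * 6.0 = 2.94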