I want to optimize a cost function. This cost function contains variables as well as other parameters that are not variables. These non-variable parameters are computed from the variables.
Here is a toy example that illustrates the point:
import numpy as np
import tensorflow as tf
r_init = np.array([5.0,6.0])
x = tf.get_variable("x_var", initializer = r_init[0], trainable = True)
y = tf.get_variable("y_var", initializer = r_init[1], trainable = True)
def cost(x,y):
    a = x
    return a*((x-1.0)**2+(y-1.0)**2)
loss = cost(x,y)  # build the cost op once instead of on every loop iteration
train_op = tf.train.AdamOptimizer(learning_rate=0.05).minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100):
        print(sess.run([loss, train_op]))
    print('x=', x.eval(session=sess))
    print('y=', y.eval(session=sess))
As you can see, the parameter a is defined from the variable x. However, a should not be a variable: I want the optimizer to treat it as a constant, and this constant should be updated as x is updated during the optimization process.
How can I define a non-variable parameter a from the variable x? I am making this up, but intuitively, what comes to my mind is something like:
a = tf.to_constant(x)
Any ideas?
You are looking for tf.stop_gradient:
a = tf.stop_gradient(x)
Quoting the docs,
This is useful any time you want to compute a value with TensorFlow but need to pretend that the value was a constant.
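To see what treating a as a constant means for the toy cost above, the gradients can be worked out by hand. The following is a minimal plain-Python sketch, not TensorFlow code: the function names grads_full, grads_with_stop_gradient and descend are made up for illustration, and plain gradient descent is used instead of Adam to keep the arithmetic simple. It compares the gradient obtained when a = x is differentiated through with the gradient obtained when a only takes the value of x, which is the effect tf.stop_gradient is meant to have.

```python
def grads_full(x, y):
    """Gradient of x * ((x-1)^2 + (y-1)^2), i.e. a = x is differentiated through."""
    s = (x - 1.0) ** 2 + (y - 1.0) ** 2
    return s + x * 2.0 * (x - 1.0), x * 2.0 * (y - 1.0)


def grads_with_stop_gradient(x, y):
    """Gradient when a takes the value of x but is treated as a constant."""
    a = x  # the value is copied, but no derivative flows through a
    return a * 2.0 * (x - 1.0), a * 2.0 * (y - 1.0)


def descend(x, y, lr=0.05, steps=200):
    """Plain gradient descent using the stop-gradient form of the gradient."""
    for _ in range(steps):
        gx, gy = grads_with_stop_gradient(x, y)  # a is re-read from x each step
        x, y = x - lr * gx, y - lr * gy
    return x, y


print(grads_full(5.0, 6.0))                # (81.0, 50.0)
print(grads_with_stop_gradient(5.0, 6.0))  # (40.0, 50.0)
x_final, y_final = descend(5.0, 6.0)
print(x_final, y_final)
```

The x-component differs because the extra term (x-1)^2 + (y-1)^2, which comes from differentiating a, is dropped. The constant still follows x, since it is re-read at every step.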
Related
In the following code segment, even though the initial values and expected_shape do not match, the code runs and gives output as 1.0. Here is the code.
import tensorflow as tf
import numpy as np
X = tf.placeholder(dtype=tf.float32)
y = tf.placeholder(dtype=tf.float32)
W = tf.Variable(0.0, expected_shape=(3,1))
b = tf.Variable(1.0)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
print(sess.run(tf.add(W,b)))
Isn't it supposed to throw an error?
As of now, it is advised to make use of tf.get_variable and avoid tf.Variable as much as possible.
As for why expected_shape has no effect: the source code documents the parameter as deprecated and ignored. If you look further into the _init_from_args function, you will see that expected_shape is never used in any further processing.
I am starting to use TensorFlow (with Python) and was wondering: when using a placeholder in a function, why not have an argument in my function which would feed a TensorFlow constant rather than the placeholder?
Here is an example (the difference is in x):
def sigmoid(z):
    x = tf.constant(z, dtype=tf.float32, name = "x")
    sigmoid = tf.sigmoid(x)
    with tf.Session() as sess:
        result = sess.run(sigmoid)
    return result
instead of:
def sigmoid(z):
    x = tf.placeholder(tf.float32, name = "...")
    sigmoid = tf.sigmoid(x)
    with tf.Session() as sess:
        result = sess.run(sigmoid, feed_dict={x:z})
    return result
The idea with TensorFlow is that you repeat the same calculation on lots of data. When you write the code, you are setting up a computational graph that you will later execute on the data. In your first example, you have hard-coded the data as a constant, which is not a typical TensorFlow use case. The second example is better because it allows you to reuse the same computational graph with different data.
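As a rough, TensorFlow-free analogy of the "build the graph once, feed it many times" idea, here is a tiny sketch of a deferred-evaluation graph in plain Python. The classes Placeholder, Constant and Sigmoid and the function run are hypothetical stand-ins invented for this illustration. They are not TensorFlow's API, and they ignore everything a real graph runtime does.

```python
import math


class Placeholder:
    """Graph node whose value is supplied at run time via feed_dict."""
    def __init__(self, name):
        self.name = name


class Constant:
    """Graph node whose value is fixed when the graph is built."""
    def __init__(self, value):
        self.value = value


class Sigmoid:
    """Graph node that applies the logistic function to its input node."""
    def __init__(self, inp):
        self.inp = inp


def run(node, feed_dict=None):
    """Evaluate a node, looking up placeholders in feed_dict."""
    feed_dict = feed_dict or {}
    if isinstance(node, Placeholder):
        return feed_dict[node]
    if isinstance(node, Constant):
        return node.value
    if isinstance(node, Sigmoid):
        return 1.0 / (1.0 + math.exp(-run(node.inp, feed_dict)))
    raise TypeError("unknown node type: %r" % (node,))


# Built once, evaluated many times with different data.
x = Placeholder("x")
reusable = Sigmoid(x)
results = [run(reusable, {x: z}) for z in (-1.0, 0.0, 1.0)]

# Data baked in: a new graph is needed for every new input value.
baked = Sigmoid(Constant(0.0))
print(results, run(baked))
```

The placeholder version separates graph construction from the data, so the same structure serves every input. The constant version has to be rebuilt whenever the input changes.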
How can one specify a custom initializer as the third argument for tf.get_variable()? Specifically, I have a variable y which I want to initialize using another (already initialized) variable x.
This is easy to do using tf.Variable(), just say, y = tf.Variable(x.initialized_value()). But I couldn't find an analog in the documentation for tf.get_variable().
You can use x.initialized_value() as well. For example:
import tensorflow as tf
x = tf.Variable(1.0)
y = tf.get_variable('y', initializer=x.initialized_value())
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    print(sess.run(y)) # prints 1.0
I created a function func that contains some variables. I want to use this function both standalone and through tf.map_fn, keeping the same set of variables in both cases. However, tf.map_fn apparently appends map to the current variable scope, so the variable scope of the standalone case no longer matches the scope under tf.map_fn. As a result, the following code throws an error, because the variable mul1/map/weights does not exist when it is requested with reuse=True.
import tensorflow as tf
D = 5
batch_size = 1
def func(x):
    W = tf.get_variable(initializer=tf.constant_initializer(1), shape=[D,1], dtype=tf.float32, trainable=True, name="weights")
    y = tf.matmul(x, W)
    return y
x = tf.placeholder(tf.float32, [batch_size, 5])
x_cat = tf.placeholder(tf.float32, [None, batch_size, 5])
with tf.variable_scope("mul1") as mul1_scope:
    y_sum = func(x)
with tf.variable_scope(mul1_scope, reuse=True):
    cost = tf.map_fn(lambda x: func(x), x_cat)
Here I want to run gradient updates only on the variables under the mul1/map scope. I could probably use tf.assign after every update to copy them into the variables under the mul1 scope (which is used only for the feedforward step), but that is a rather painful way to share variables. Is there a better way to solve this? Any help would be much appreciated!
I've been trying to gather the gradients for each step of the GradientDescentOptimizer in TensorFlow, but I keep running into a TypeError when I try to pass the result of compute_gradients() to sess.run(). The code I'm trying to run is:
import tensorflow as tf
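from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)  # as in the tutorial; defines mnist used below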
x = tf.placeholder(tf.float32,[None,784])
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x,W)+b)
y_ = tf.placeholder(tf.float32,[None,10])
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
# note that up to this point, this example is identical to the tutorial on tensorflow.org
gradstep = tf.train.GradientDescentOptimizer(0.01).compute_gradients(cross_entropy)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
batch_x,batch_y = mnist.train.next_batch(100)
print(sess.run(gradstep, feed_dict={x:batch_x,y_:batch_y}))
Note that if I replace the last line with print(sess.run(train_step, feed_dict={x:batch_x,y_:batch_y})), where train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy), the error is not raised. My confusion arises from the fact that minimize calls compute_gradients with exactly the same arguments as its first step. Can someone explain why this behavior occurs?
The Optimizer.compute_gradients() method returns a list of (Tensor, Variable) pairs, where each tensor is the gradient with respect to the corresponding variable.
Session.run() expects a list of Tensor objects (or objects convertible to a Tensor) as its first argument. It does not understand how to handle a list of pairs, and hence you get a TypeError when you try to run sess.run(gradstep, ...).
The correct solution depends on what you are trying to do. If you want to fetch all of the gradient values, you can do the following:
grad_vals = sess.run([grad for grad, _ in gradstep], feed_dict={x: batch_x, y_: batch_y})
# Then, e.g., build a variable name-to-gradient dictionary.
var_to_grad = {}
for grad_val, (_, var) in zip(grad_vals, gradstep):
    var_to_grad[var.name] = grad_val
If you also want to fetch the variables, you can execute the following statement separately:
sess.run([var for _, var in gradstep])
...though note that, without further modification to your program, this will just return the initial values for each variable.
You will have to run the optimizer's training step (or otherwise call Optimizer.apply_gradients()) to update the variables.
minimize calls compute_gradients followed by apply_gradients: it's possible you're missing the second step.
compute_gradients just returns the grads / variables, but doesn't apply the update rule to them.
Here is an example: https://github.com/tensorflow/tensorflow/blob/f2bd0fc399606d14b55f3f7d732d013f32b33dd5/tensorflow/python/training/optimizer.py#L69
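The two-step structure described above can be sketched with a toy optimizer in plain Python. This is only an illustration under stated assumptions: the class ToyGradientDescent, the function grad_fn and the dictionary-based variables are hypothetical, and the code is not how TensorFlow implements its optimizers. It only mirrors the idea that compute_gradients returns (gradient, variable) pairs without changing anything, while minimize also applies them.

```python
class ToyGradientDescent:
    """Hypothetical stand-in for an optimizer; not TensorFlow's implementation."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def compute_gradients(self, grad_fn, variables):
        """Return (gradient, variable_name) pairs; variables are left unchanged."""
        grads = grad_fn(variables)
        return [(grads[name], name) for name in sorted(variables)]

    def apply_gradients(self, grads_and_vars, variables):
        """Apply the update rule var <- var - lr * grad in place."""
        for grad, name in grads_and_vars:
            variables[name] -= self.learning_rate * grad

    def minimize(self, grad_fn, variables):
        """compute_gradients followed by apply_gradients."""
        grads_and_vars = self.compute_gradients(grad_fn, variables)
        self.apply_gradients(grads_and_vars, variables)
        return grads_and_vars


def grad_fn(v):
    """Gradient of the toy loss (w - 3)^2 + (b + 1)^2."""
    return {"w": 2.0 * (v["w"] - 3.0), "b": 2.0 * (v["b"] + 1.0)}


opt = ToyGradientDescent(learning_rate=0.1)
variables = {"w": 0.0, "b": 0.0}

pairs = opt.compute_gradients(grad_fn, variables)
print(pairs)      # [(2.0, 'b'), (-6.0, 'w')]
print(variables)  # unchanged: compute_gradients applied nothing

opt.minimize(grad_fn, variables)
print(variables)  # both values have now moved one step
```

Calling only compute_gradients leaves the variables at their initial values, which matches the behaviour described in the answers above.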