I have a complicated use case which I've distilled down to just incrementing a variable in Tensorflow.
a = tf.Variable(1, trainable=False)
b = tf.constant(2)
a = tf.assign_add(a, b)
In [32]: type(a)
Out[32]: tensorflow.python.framework.ops.Tensor
My actual use case is generating a new random tensor under certain conditions each time my custom Keras layer is called, but it seems to boil down to a variable turning into a tensor if I do anything to it. Is the correct approach to wrap each update as a = tf.Variable(tf.assign(a, b)) so that a changes every time my Keras layer is called?
You are overthinking it. tf.assign_add returns an op that adds to a variable. The fact that it also returns the resulting value is for convenience only; the variable itself is updated when the op runs.
Example:
import tensorflow as tf
a = tf.Variable(1, trainable=False)
b = tf.constant(2)
c = tf.assign_add(a, b)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
print(sess.run(a))
# 1: the original value
print(sess.run(c))
# 3: the result of the addition
print(sess.run(a))
# 3: OK, the variable has indeed been added to
Related
In Tensorflow 1.9, I want to create a network and then recursively feed the output (the prediction) of the network back into the input of the network. During this loop, I want to store the predictions made by the network in a list.
Here is my attempt:
import numpy as np
import tensorflow as tf
# Define the number of steps over which to loop the network
num_steps = 5
# Define the network weights
weights_1 = np.random.uniform(0, 1, [1, 10]).astype(np.float32)
weights_2 = np.random.uniform(0, 1, [10, 1]).astype(np.float32)
# Create a variable to store the predictions, one for each loop
predictions = tf.Variable(np.zeros([num_steps, 1]), dtype=np.float32)
# Define the initial prediction to feed into the loop
initial_prediction = np.array([[0.1]], dtype=np.float32)
x = initial_prediction
# Loop through the predictions
for step_num in range(num_steps):
    x = tf.matmul(x, weights_1)
    x = tf.matmul(x, weights_2)
    predictions[step_num-1].assign(x)
# Define the final prediction
final_prediction = x
# Start a session
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# Make the predictions
last_pred, all_preds = sess.run([final_prediction, predictions])
print(last_pred)
print(all_preds)
And this prints out:
[[48.8769]]
[[0.]
[0.]
[0.]
[0.]
[0.]]
So whilst the value of final_prediction appears correct, the value of predictions is not what I would expect. It seems that predictions is never actually assigned to, despite the line predictions[step_num-1].assign(x).
Please can somebody explain to me why this isn't working, and what I should be doing instead? Thanks!
This happens because assign is just a TF op like any other, and as such is only executed if something that is fetched depends on it. Since nothing on the path to final_prediction relies on the assign op, and predictions is just a variable, the assignment is never executed.
I think the most straightforward solution would be to replace the line
predictions[step_num-1].assign(x)
by
x = predictions[step_num-1].assign(x)
This works because assign also returns the value it is assigning. Now, to compute final_prediction TF actually needs to "go through" the assign op so the assignments should be carried out.
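As a rough mental model of why chaining through assign helps, here is a pure-Python toy of deferred execution. It is only a sketch, not how TensorFlow is implemented; Node, run and store are hypothetical names made up for this illustration. A fetch evaluates only the nodes it depends on, so a side-effecting node that nothing depends on never runs.

```python
class Node:
    """A deferred operation (toy stand-in for a graph op)."""
    def __init__(self, fn, inputs=()):
        self.fn = fn
        self.inputs = list(inputs)

def run(fetch, executed):
    """Evaluate `fetch`, visiting only the nodes it depends on."""
    args = [run(node, executed) for node in fetch.inputs]
    executed.append(fetch)
    return fetch.fn(*args)

store = {"predictions": 0.0}  # hypothetical stand-in for a variable

def assign(value):
    store["predictions"] = value
    return value  # like assign, hand the written value back

x = Node(lambda: 48.0)
side_assign = Node(assign, [x])            # built, but nothing depends on it
final_unchained = Node(lambda v: v, [x])   # does not go through the assign

executed = []
result_unchained = run(final_unchained, executed)
stored_after_unchained = store["predictions"]   # the assign never ran

final_chained = Node(lambda v: v, [side_assign])  # now goes through the assign
result_chained = run(final_chained, [])
stored_after_chained = store["predictions"]       # the assign ran on the way

print(stored_after_unchained, stored_after_chained)
```

Fetching final_unchained leaves the store untouched, while fetching final_chained has to pass through the assign node, which is the same reason x = predictions[step_num-1].assign(x) makes the assignments happen.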
Another option would be to use tf.control_dependencies which is a way to "force" TF to compute specific ops when it is computing other ones. However in this case it could be a bit icky because the op we want to force (assign) depends on values that are being computed within the loop and I'm not sure about the order in which TF does stuff in this case. The following should work:
for step_num in range(num_steps):
    x = tf.matmul(x, weights_1)
    x = tf.matmul(x, weights_2)
    with tf.control_dependencies([predictions[step_num-1].assign(x)]):
        x = tf.identity(x)
We use tf.identity as a noop just to have something to wrap with control_dependencies. I think this is the more flexible option between the two. However it comes with some caveats discussed in the docs.
I want to optimize a cost function. This cost function contains variables and other parameters that are not variables. These non-variable parameters are computed from the variables.
Here is a toy example that illustrates the point:
import numpy as np
import tensorflow as tf
r_init = np.array([5.0,6.0])
x = tf.get_variable("x_var", initializer = r_init[0], trainable = True)
y = tf.get_variable("y_var", initializer = r_init[1], trainable = True)
def cost(x,y):
    a = x
    return a*((x-1.0)**2+(y-1.0)**2)
train_op = tf.train.AdamOptimizer(learning_rate=0.05).minimize(cost(x,y))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100):
        print(sess.run([cost(x,y), train_op]))
    print('x=', x.eval(session=sess))
    print('y=', y.eval(session=sess))
As you can see, the parameter a is defined from the variable x. However, a should not be a variable: I want the optimizer to see it as a constant. This constant should still be updated as the variable x is updated during the optimization.
How can I define a non-variable parameter a from the variable x? I am making this up, but intuitively, what comes to my mind is something like:
a = tf.to_constant(x)
Any ideas?
You are looking for tf.stop_gradient:
a = tf.stop_gradient(x)
Quoting the docs,
This is useful any time you want to compute a value with TensorFlow but need to pretend that the value was a constant.
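To see the effect on the toy cost above, one can compare the gradient with respect to x with and without tf.stop_gradient. This is only a sketch: it assumes a TensorFlow build that ships the tensorflow.compat.v1 graph API, and the freeze_a flag is a hypothetical parameter added for the comparison.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # assumption: graph mode via the compat.v1 API

x = tf.Variable(5.0)
y = tf.Variable(6.0)

def cost(x, y, freeze_a):
    # freeze_a is a hypothetical switch, not part of the original question
    a = tf.stop_gradient(x) if freeze_a else x
    return a * ((x - 1.0) ** 2 + (y - 1.0) ** 2)

# Gradient of the cost with respect to x in both settings
grad_frozen = tf.gradients(cost(x, y, True), x)[0]
grad_plain = tf.gradients(cost(x, y, False), x)[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    frozen_value, plain_value = sess.run([grad_frozen, grad_plain])

print(frozen_value)  # a * 2 * (x - 1) = 5 * 8 = 40
print(plain_value)   # (16 + 25) + 5 * 8 = 81
```

With stop_gradient, a still takes the current value of x in the forward pass, but the term that comes from differentiating a itself disappears from the gradient.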
In the following code segment, even though the initial values and expected_shape do not match, the code runs and gives output as 1.0. Here is the code.
import tensorflow as tf
import numpy as np
X = tf.placeholder(dtype=tf.float32)
y = tf.placeholder(dtype=tf.float32)
W = tf.Variable(0.0, expected_shape=(3,1))
b = tf.Variable(1.0)
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
print(sess.run(tf.add(W,b)))
Isn't it supposed to throw an error?
As of now, it is advised to make use of tf.get_variable and avoid tf.Variable as much as possible.
As for why expected_shape has no effect: the source code documents it as "Deprecated and ignored". If you look further into _init_from_args, the expected_shape parameter is never used in any of the subsequent processing.
I am having trouble making a few elements of a variable non-trainable. That is, given a variable such as x,
x= tf.Variable(tf.zeros([2,2]))
I wish to train only x[0,0] and x[1,1] while keeping x[0,1] and x[1,0] fixed during training.
Currently TensorFlow lets you make a whole variable non-trainable by using trainable=False or tf.stop_gradient(). However, these methods make all elements of x non-trainable. My question is how to obtain this selectivity.
There is no built-in way to selectively skip updates for now; however, you can achieve this effect indirectly by explicitly specifying the variables that should be updated. Both .minimize and all the gradient functions accept the list of variables you want to optimize over. Just create a list omitting some of them, for example
v1 = tf.Variable( ... ) # we want to freeze it in one op
v2 = tf.Variable( ... ) # we want to freeze it in another op
v3 = tf.Variable( ... ) # we always want to train this one
loss = ...
optimizer = tf.train.GradientDescentOptimizer(0.1)
op1 = optimizer.minimize(loss,
    var_list=[v for v in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES) if v is not v1])
op2 = optimizer.minimize(loss,
    var_list=[v for v in tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES) if v is not v2])
and now you can call them whenever you want to train wrt. subset of variables. Note that this might require 2 separate optimizers if you are using Adam or some other method gathering statistics (and you will end up with separate statistics per optimizer!). However if there is just one set of frozen variables per training - everything will be straightforward with var_list.
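A minimal runnable version of the var_list idea might look like the sketch below. It assumes a TensorFlow build that ships the tensorflow.compat.v1 graph API; the variable names frozen and trained and the quadratic loss are made up for illustration.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # assumption: graph mode via the compat.v1 API

frozen = tf.Variable(1.0, name="frozen")    # hypothetical: left out of var_list
trained = tf.Variable(1.0, name="trained")  # hypothetical: the only one updated
loss = frozen ** 2 + trained ** 2

optimizer = tf.train.GradientDescentOptimizer(0.1)
# Only the variables listed in var_list receive updates
train_op = optimizer.minimize(loss, var_list=[trained])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op)  # one gradient step
    frozen_value, trained_value = sess.run([frozen, trained])

print(frozen_value)   # unchanged: 1.0
print(trained_value)  # 1.0 - 0.1 * 2 * 1.0 = 0.8
```

The frozen variable still takes part in the loss; it is simply not handed to the optimizer for this training op.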
However, there is no way to freeze only a subset of the elements of a single variable. TensorFlow always treats a variable as a single unit. You have to specify your computation in a different way to achieve this. One way is to:
create a binary mask M with 1's where you want to stop updates over X
create separate variable X', which is non-trainable, and tf.assign to it value of X
output X'*M + (1-M)*X
for example:
x = tf.Variable( ... )
xp = tf.Variable( ..., trainable=False)
m = tf.constant( ... ) # mask: 1 where updates should be stopped
cp = tf.assign(xp, x) # copy the current value of x into xp
with tf.control_dependencies([cp]):
    x_frozen = m*xp + (1-m)*x
and you just use x_frozen instead of x. Note that we need the control dependency because tf.assign can execute asynchronously, and here we want to make sure xp always has the most up-to-date value of x.
You can use the tf.stop_gradient trick to prevent masked tf.Variable elements from training. For example:
x = tf.Variable(tf.zeros([2, 2]))
mask = tf.constant([[1, 0], [0, 1]], dtype=x.dtype)
x = mask * x + tf.stop_gradient((1 - mask) * x)
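To spell out why this works, write m for the mask, x̃ for the tensor on the left-hand side of the last line, sg for tf.stop_gradient, and L for whatever scalar loss is built on top (sg, x̃ and L are just notation assumed for this sketch). The forward value is unchanged, but only the mask-weighted part passes gradients back:

```latex
\tilde{x} = m \odot x + \operatorname{sg}\big((1 - m) \odot x\big)
\quad\Longrightarrow\quad
\tilde{x} = x \ \text{(forward value)},
\qquad
\frac{\partial L}{\partial x} = m \odot \frac{\partial L}{\partial \tilde{x}} \ \text{(backward pass)}
```

So entries where the mask is 0 always receive a zero gradient, while entries where it is 1 are trained as usual.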
I created a function func that contains some variables. I want to use this function both standalone and through tf.map_fn, and I want both cases to share the same set of variables. However, tf.map_fn apparently appends map to the current variable scope, so the variable scope of the standalone case no longer matches the one used inside tf.map_fn. As a result, the following code throws an error, because the variable mul1/map/weights does not exist when it is requested with reuse=True.
import tensorflow as tf
D = 5
batch_size = 1
def func(x):
    W = tf.get_variable(initializer=tf.constant_initializer(1), shape=[D,1], dtype=tf.float32, trainable=True, name="weights")
    y = tf.matmul(x, W)
    return y
x = tf.placeholder(tf.float32, [batch_size, 5])
x_cat = tf.placeholder(tf.float32, [None, batch_size, 5])
with tf.variable_scope("mul1") as mul1_scope:
    y_sum = func(x)
with tf.variable_scope(mul1_scope, reuse=True):
    cost = tf.map_fn(lambda x: func(x), x_cat)
Here I want to run the gradient update only on the variables under the mul1/map scope. I could probably use tf.assign after every update to copy them into the variables under the mul1 scope (which are used only for the feedforward step), but that is a rather painful way to share variables. Is there a better way to solve this? Any help would be much appreciated!