I want to generate a constant tensor in TensorFlow that is initialized with a specified mechanism, e.g. random_uniform or random_normal.
I know that I can generate a random NumPy array with one of these mechanisms (np.random.uniform, np.random.normal, etc.) and then feed the resulting array as the value argument of tf.constant.
However, the NumPy versions of these random mechanisms require a concrete shape. I don't want to pre-specify the shape; I want it to stay flexible, just as when I write TensorFlow code like shape = tf.shape(some_previous_tensor).
Way 1 I tried: here I don't have to pre-specify a concrete shape during graph construction, but the generated tensor is re-sampled on every run rather than static. That is not what I expected.
var = tf.random.normal(
    [2, 2], mean=0.0, stddev=0.5, dtype=tf.float32,
)
with tf.Session() as sess:
    print('var:', sess.run(var))
    print('var:', sess.run(var))
Output:
var: [[ 0.21260215 0.13721827]
[ 0.7704196 -0.48304045]]
var: [[-0.63397115 -0.0956466 ]
[ 0.0761982 0.54037064]]
Way 2 I tried: I can get a static constant, but I have to give a size to np.random.normal, which is not what I want.
var_np = np.random.normal(0, 0.5, size=(2, 2))
var = tf.constant(value=var_np)
with tf.Session() as sess:
    print('var:', sess.run(var))
    print('var:', sess.run(var))
Output:
var: [[-0.73357953 -0.10277695]
[ 0.57406473 0.32157612]]
var: [[-0.73357953 -0.10277695]
[ 0.57406473 0.32157612]]
You can use tf.Variable / tf.get_variable with trainable=False and validate_shape=False, and use a value that depends on a placeholder for the shape as the initial value. Then, when you initialize the variable (either through its initializer attribute or through something more common like tf.global_variables_initializer), you just have to feed the shape at initialization time. After initialization, the value of the variable is kept the same for the whole session, as long as it is not initialized again or assigned a different value.
import tensorflow as tf

shape = tf.placeholder(tf.int32, [None])
var_init = tf.random.normal(
    shape, mean=0.0, stddev=0.5, dtype=tf.float32,
)
var = tf.Variable(var_init, validate_shape=False, trainable=False, name='Var')

with tf.Session() as sess:
    tf.random.set_random_seed(0)
    sess.run(var.initializer, feed_dict={shape: [2, 3]})
    print('var:', sess.run(var), sep='\n')
    print('var:', sess.run(var), sep='\n')
Output:
var:
[[-0.4055751 0.7597851 -0.04810145]
[ 0.92776746 -0.3747548 -0.03715562]]
var:
[[-0.4055751 0.7597851 -0.04810145]
[ 0.92776746 -0.3747548 -0.03715562]]
Just run tf.shape(t) for the tensor t whose shape you want your static random tensor to have, feed the resulting value as the size argument to np.random.normal, and you're all set.
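A minimal sketch of that idea, assuming some_previous_tensor is the tensor whose shape you want to copy (the placeholder and the zeros fed here are only to make the example self-contained):
import numpy as np
import tensorflow as tf

some_previous_tensor = tf.placeholder(tf.float32, shape=[None, 3])

with tf.Session() as sess:
    # Evaluate the dynamic shape of the reference tensor first...
    concrete_shape = sess.run(tf.shape(some_previous_tensor),
                              feed_dict={some_previous_tensor: np.zeros((4, 3))})
    # ...then sample a NumPy array of exactly that size and wrap it in a constant.
    var_np = np.random.normal(0.0, 0.5, size=concrete_shape)
    var = tf.constant(var_np, dtype=tf.float32)
    print('var:', sess.run(var))  # identical on every run
    print('var:', sess.run(var))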
Related
I know that there is a tf.assign function in TensorFlow, but it is mainly aimed at mutable tensors (tf.Variable). How can I modify the value of an ordinary tensor? For example, in the following code,
import numpy as np
import tensorflow as tf
X = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])
conv1 = tf.layers.conv2d(X, filters=64, kernel_size=(3, 3), padding='same',name='conv1')
relu1 = tf.nn.relu(conv1)
conv2 = tf.layers.conv2d(relu1, filters=64, kernel_size=(3, 3), padding='same',name='conv2')
relu2 = tf.nn.relu(conv2)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
tensor = sess.graph.get_tensor_by_name(u'conv2/Conv2D:0')
feature_map = tf.reduce_mean(tensor[:,:,:,24])
image = np.random.uniform(size=(1,32,32,3))
sess.run([feature_map], feed_dict={X: image})
How can I modify the value of feature_map without affecting its gradient computation?
More specifically, when I change the value of feature_map, the change should not affect how its gradient is derived.
For example, if y = a^2 then y' = 2a; I just need to change a = 1 to a = 2.
other_op = tf.gradients(feature_map, X)
A different feature_map would produce different values, but it should not destroy the graph structure of the operations.
In your example feature_map doesn't hold a value of its own, as it is the output of an operation. Therefore you can't change its value as such. What you can do is pass another value in as part of the feed_dict parameter of session.run.
So for example if your feature_map is followed by an operation like this:
other_op = tf.gradients(feature_map, X)
Then you can change the value passed in to that op (gradient in this case) via feed_dict like so:
session.run(other_op, feed_dict={feature_map: <new value>})
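A minimal, self-contained sketch of overriding an intermediate tensor through feed_dict (the toy graph here is only illustrative, not the conv net from the question):
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None])
feature_map = tf.reduce_mean(x)   # an intermediate tensor in the graph
downstream = feature_map * 2.0    # an op that consumes it

with tf.Session() as sess:
    data = np.array([1.0, 2.0, 3.0], dtype=np.float32)
    # Normal run: feature_map is computed from x.
    print(sess.run(downstream, feed_dict={x: data}))           # -> 4.0
    # Overridden run: the fed value replaces the computed feature_map.
    print(sess.run(downstream, feed_dict={feature_map: 5.0}))  # -> 10.0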
That's not possible. A tensor is the output of a tf.Operation. From the documentation:
A Tensor is a symbolic handle to one of the outputs of an Operation. It does not hold the values of that operation's output, but instead provides a means of computing those values in a TensorFlow tf.Session.
So you can't change its value independently.
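To make the Variable/Tensor distinction concrete, here is a minimal sketch (assuming nothing beyond standard TF 1.x): tf.assign works on a tf.Variable, while an ordinary tensor offers no such handle.
import tensorflow as tf

v = tf.Variable(1.0)             # a Variable holds state and can be assigned
t = v * 2.0                      # a Tensor is recomputed from the graph each run

assign_op = tf.assign(v, 3.0)    # fine for a Variable
# tf.assign(t, 6.0)              # would raise an error: t is not a mutable node

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(t))           # 2.0
    sess.run(assign_op)
    print(sess.run(t))           # 6.0, because v changed and t is recomputed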
I want to slice a tensor in "None" dimension.
For example,
tensor = tf.placeholder(tf.float32, shape=[None, None, 10], name="seq_holder")
sliced_tensor = tensor[:,1:,:] # it works well!
but
# Assume that tensor's shape will be [3,10, 10]
tensor = tf.placeholder(tf.float32, shape=[None, None, 10], name="seq_holder")
sliced_seq = tf.slice(tensor, [0,1,0], [3, 9, 10]) # it doesn't work!
I get the same error message when I use another placeholder to feed the size parameter of tf.slice().
The second method gave me the error message "Input size (depth of inputs) must be accessible via shape inference".
I'd like to know what is different between the two methods and which is the more TensorFlow-ish way.
[Edited]
Whole code is below
import tensorflow as tf
import numpy as np
print("Tensorflow for tests!")
vec_dim = 5
num_hidden = 10
# method 1
input_seq1 = np.random.random([3,7,vec_dim])
# method 2
input_seq2 = np.random.random([5,10,vec_dim])
shape_seq2 = [5,9,vec_dim]
# seq: [batch, seq_len]
seq = tf.placeholder(tf.float32, shape=[None, None, vec_dim], name="seq_holder")
# Method 1
sliced_seq = seq[:,1:,:]
# Method 2
seq_shape = tf.placeholder(tf.int32, shape=[3])
sliced_seq = tf.slice(seq,[0,0,0], seq_shape)
cell = tf.contrib.rnn.GRUCell(num_units=num_hidden)
init_state = cell.zero_state(tf.shape(seq)[0], tf.float32)
outputs, last_state = tf.nn.dynamic_rnn(cell, sliced_seq, initial_state=init_state)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # method 1
    # states = sess.run([sliced_seq], feed_dict={seq: input_seq1})
    # print(states[0].shape)

    # method 2
    states = sess.run([sliced_seq], feed_dict={seq: input_seq2, seq_shape: shape_seq2})
    print(states[0].shape)
Your problem is exactly described by issue #4590
The problem is that tf.nn.dynamic_rnn needs to know the size of the last dimension in the input (the "depth"). Unfortunately, as the issue points out, currently tf.slice cannot infer any output size if any of the slice ranges are not fully known at graph construction time; therefore, sliced_seq ends up having a shape (?, ?, ?).
In your case, the first issue is that you are using a placeholder of three elements to determine the size of the slice; this is not the best approach, since the last dimension should never change (even if you later pass vec_dim, it could cause errors). The easiest solution would be to turn seq_shape into a placeholder of size 2 (or even two separate placeholders), and then do the slicing like:
sliced_seq = seq[:seq_shape[0], :seq_shape[1], :]
For some reason, the NumPy-style indexing seems to have better shape inference capabilities, and this will preserve the size of the last dimension in sliced_seq.
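A minimal sketch of that fix applied to Method 2, assuming seq, input_seq2, and vec_dim are defined as in the question's code (the fed sizes [5, 9] are only illustrative):
# Only the two leading sizes are fed at run time; the last dimension stays vec_dim.
seq_shape = tf.placeholder(tf.int32, shape=[2])
sliced_seq = seq[:seq_shape[0], :seq_shape[1], :]   # static last dim is preserved

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    states = sess.run([sliced_seq],
                      feed_dict={seq: input_seq2, seq_shape: [5, 9]})
    print(states[0].shape)   # (5, 9, vec_dim)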
My scenario:
Define an RNN model structure and train it using an input with fixed batch size and sequence length.
Freeze the model (i.e. convert all trainable variables into constants), producing a GraphDef that contains everything one needs to use the model at test time (via tf.graph_util.convert_variables_to_constants).
Import the GraphDef via tf.import_graph_def and replace the input using the input_map argument. The new input needs to have arbitrary batch size and sequence length.
The problem: all of the above works until I pass an input to the test-time graph whose batch size or sequence length differs from the sizes used at training time. At that point I get an error like this:
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,5] vs. shape[1] = [2,7]
[[Node: import/rnn/while/basic_rnn_cell/basic_rnn_cell_1/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](import/rnn/while/TensorArrayReadV3, import/rnn/while/Identity_2, import/rnn/while/basic_rnn_cell/basic_rnn_cell_1/concat/axis)]]
To illustrate and reproduce the problem, please consider the following minimal examples.
v1: a graph is created with arbitrary batch size and sequence length. This works fine, but unfortunately I must use a fixed batch size and sequence length at training time and an arbitrary batch size and sequence length at test time, so I can't use this simple approach.
v2a: we simulate creating the training-time graph with fixed batch size (2) and sequence length (3) and freeze the graph.
v2ba: we demonstrate that loading the frozen model back in unchanged still produces the same results.
v2bb: we demonstrate that loading the frozen model with a replaced input that still uses a fixed batch size and sequence length produces the same results.
v2bc: we demonstrate that loading the frozen model with a replaced input that allows arbitrary batch size and sequence length still produces the same results, as long as the fed input matches the original batch size and sequence length. It works with data1 but fails with data2 -- the only difference being that the batch size of the former is 2 and the batch size of the latter is 1.
Is it possible to change an RNN graph via the input_map argument to tf.import_graph_def such that the input no longer has a fixed batch size and sequence length?
The following code works with TensorFlow 1.1 RC2 and may work with TensorFlow 1.0.
import numpy
import tensorflow as tf
from tensorflow import graph_util as tf_graph_util
from tensorflow.contrib import rnn as tfc_rnn


def v1(data):
    with tf.Graph().as_default():
        tf.set_random_seed(1)
        x = tf.placeholder(tf.float32, shape=(None, None, 5))
        _, s = tf.nn.dynamic_rnn(tfc_rnn.BasicRNNCell(7), x, dtype=tf.float32)
        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            print session.run(s, feed_dict={x: data})


def v2a():
    with tf.Graph().as_default():
        tf.set_random_seed(1)
        x = tf.placeholder(tf.float32, shape=(2, 3, 5), name="x")
        _, s = tf.nn.dynamic_rnn(tfc_rnn.BasicRNNCell(7), x, dtype=tf.float32)
        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            return tf_graph_util.convert_variables_to_constants(
                session, session.graph_def, [s.op.name]), s.name


def v2ba((graph_def, s_name), data):
    with tf.Graph().as_default():
        x, s = tf.import_graph_def(graph_def,
                                   return_elements=["x:0", s_name])
        with tf.Session() as session:
            print '2ba', session.run(s, feed_dict={x: data})


def v2bb((graph_def, s_name), data):
    with tf.Graph().as_default():
        x = tf.placeholder(tf.float32, shape=(2, 3, 5))
        [s] = tf.import_graph_def(graph_def, input_map={"x:0": x},
                                  return_elements=[s_name])
        with tf.Session() as session:
            print '2bb', session.run(s, feed_dict={x: data})


def v2bc((graph_def, s_name), data):
    with tf.Graph().as_default():
        x = tf.placeholder(tf.float32, shape=(None, None, 5))
        [s] = tf.import_graph_def(graph_def, input_map={"x:0": x},
                                  return_elements=[s_name])
        with tf.Session() as session:
            print '2bc', session.run(s, feed_dict={x: data})


def main():
    data1 = numpy.random.random_sample((2, 3, 5))
    data2 = numpy.random.random_sample((1, 3, 5))

    v1(data1)
    model = v2a()
    v2ba(model, data1)
    v2bb(model, data1)
    v2bc(model, data1)
    v2bc(model, data2)


if __name__ == "__main__":
    main()
This is a bug in TensorFlow that has been around for a while: you cannot reliably replace a placeholder that has a defined shape with another one that has a (partially) undefined shape.
You will find a related issue filed here, which apparently did not get much attention.
I am trying to pass x_data through feed_dict but I am getting the error below; I am not sure what is wrong in the code.
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'x_12' with dtype int32 and shape [1000]
[[Node: x_12 = Placeholder[dtype=DT_INT32, shape=[1000], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
My Code:
import tensorflow as tf
import numpy as np
model = tf.global_variables_initializer()
#define x and y
x = tf.placeholder(shape=[1000],dtype=tf.int32,name="x")
y = tf.Variable(5*x**2-3*x+15,name = "y")
x_data = tf.pack(np.random.randint(0,100,size=1000))
print(x_data)
print(x)
with tf.Session() as sess:
    sess.run(model)
    print(sess.run(y, feed_dict={x: x_data}))
I checked the shapes of x and x_data and they are the same:
Tensor("pack_8:0", shape=(1000,), dtype=int32)
Tensor("x_14:0", shape=(1000,), dtype=int32)
I am working with one dimensional data.
Any help is appreciated, Thanks!
To make it work I changed two things: first, I changed y to be a plain Tensor instead of a Variable; second, I did not convert x_data to a Tensor, as documented here:
The optional feed_dict argument allows the caller to override the value of tensors in the graph. Each key in feed_dict can be one of the following types:
If the key is a Tensor, the value may be a Python scalar, string, list, or numpy ndarray that can be converted to the same dtype as that tensor. Additionally, if the key is a placeholder, the shape of the value will be checked for compatibility with the placeholder.
The changed code which works for me:
import tensorflow as tf
import numpy as np

model = tf.global_variables_initializer()

# define x and y
x = tf.placeholder(shape=[1000], dtype=tf.int32, name="x")
y = 5*x**2 - 3*x + 15  # without tf.Variable, making it a tf.Tensor

x_data = np.random.randint(0, 100, size=1000)  # without tf.pack
print(x_data)
print(x)

with tf.Session() as sess:
    sess.run(model)
    print(sess.run(y, feed_dict={x: x_data}))
I've found that indexing is still an open issue in TensorFlow (#206), so I'm wondering what I could use as a workaround at the moment. I want to index/slice a row/column of a matrix based on a variable that changes for every training example.
What I've tried so far:
Slicing based on placeholder (doesn't work)
The following (working) code slices based on a fixed number.
import tensorflow as tf
import numpy as np
x = tf.placeholder("float")
y = tf.slice(x,[0],[1])
#initialize
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
#run
result = sess.run(y, feed_dict={x:[1,2,3,4,5]})
print(result)
However, it seems that I can't simply replace one of these fixed numbers with a tf.placeholder. The following code gives me the error "TypeError: List of Tensors when single Tensor expected."
import tensorflow as tf
import numpy as np
x = tf.placeholder("float")
i = tf.placeholder("int32")
y = tf.slice(x,[i],[1])
#initialize
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
#run
result = sess.run(y, feed_dict={x:[1,2,3,4,5],i:0})
print(result)
It sounds like the brackets around [i] are one level too many, but removing them doesn't help either. How can I use a placeholder/variable as an index?
Slicing based on python variable (doesn't backprop/update properly)
I've also tried using a normal Python variable as the index. This does not lead to an error, but the network doesn't learn anything while training. I suppose that because the changing variable is not registered in the graph, the graph is malformed and the updates don't work?
Slicing via one-hot vector + multiplication (works, but is slow)
One workaround I found is using a one-hot vector: build the one-hot vector in NumPy, pass it in through a placeholder, then do the slicing via matrix multiplication. This works, but it is quite slow.
Any ideas how to efficiently slice/index based on a variable?
Slicing based on a placeholder should work just fine. It looks like you are running into a type error, due to some subtle issues of shapes and types. Where you have the following:
x = tf.placeholder("float")
i = tf.placeholder("int32")
y = tf.slice(x,[i],[1])
...you should instead have:
x = tf.placeholder("float")
i = tf.placeholder("int32")
y = tf.slice(x,i,[1])
...and then you should feed i as [0] in the call to sess.run().
To make this a little clearer, I would recommend rewriting the code as follows:
import tensorflow as tf
import numpy as np
x = tf.placeholder(tf.float32, shape=[None]) # 1-D tensor
i = tf.placeholder(tf.int32, shape=[1])
y = tf.slice(x, i, [1])
#initialize
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
#run
result = sess.run(y, feed_dict={x: [1, 2, 3, 4, 5], i: [0]})
print(result)
The additional shape arguments to the tf.placeholder op help to ensure that the values you feed have the appropriate shapes, and also that TensorFlow will raise an error if the shapes are not correct.
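As an aside, indexing by a runtime value can also be done with tf.gather, which still lets gradients flow back to x; a minimal sketch for the 1-D case (names and shapes are only illustrative):
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None])
i = tf.placeholder(tf.int32, shape=[])   # a scalar index
y = tf.gather(x, i)                      # picks element i of x

with tf.Session() as sess:
    print(sess.run(y, feed_dict={x: [1, 2, 3, 4, 5], i: 3}))  # 4.0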
If you have an extra dimension, this works.
import tensorflow as tf
import numpy as np

def reorder0(e, i, length):
    '''
    e: a two dimensional tensor
    i: a one dimensional int32 tensor, of shape (e.shape[0])
    returns: a tensor of the same shape as e, where the jth entry is entry i[j] from e
    '''
    return tf.concat(
        [tf.expand_dims(e[i[j], :], axis=0) for j in range(length)],
        axis=0
    )

e = tf.placeholder(tf.float32, shape=(2, 3, 5), name='e')  # sentences, words, embedding
i = tf.placeholder(tf.int32, shape=(2, 3), name='i')       # for each word, index of parent

p = tf.concat(
    [tf.expand_dims(reorder0(e[k, :, :], i[k, :], 3), axis=0) for k in range(2)],
    axis=0,
    name='p'
)

#initialize
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

#run
result = sess.run(p, feed_dict={
    e: [
        ((1.0, 1.1, 1.2, 1.3, 1.4), (2.0, 2.1, 2.2, 2.3, 2.4), (3.0, 3.1, 3.2, 3.3, 3.4)),
        ((21.0, 21.1, 21.2, 21.3, 21.4), (22.0, 22.1, 22.2, 22.3, 22.4), (23.0, 23.1, 23.2, 23.3, 23.4)),
    ],
    i: [(1, 1, 1), (2, 0, 2)]
})
print(result)
If the sizes are not known when building the model, use TensorArray.
e = tf.placeholder(tf.float32, shape=(3, 5))  # words, embedding
i = tf.placeholder(tf.int32, shape=(3,))      # for each word, index of parent
#p = reorder0(e, i, 3)
a = tf.TensorArray(
    tf.float32,
    size=e.get_shape()[0],
    dynamic_size=True,
    infer_shape=True,
    element_shape=e.get_shape()[1],
    clear_after_read=False
)

#initialize
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

#run
result = sess.run(
    a.unstack(e).gather(i),
    feed_dict={
        e: ((1.0, 1.1, 1.2, 1.3, 1.4), (2.0, 2.1, 2.2, 2.3, 2.4), (3.0, 3.1, 3.2, 3.3, 3.4)),
        #( (21.0,21.1,21.2,21.3,21.4),(22.0,22.1,22.2,22.3,22.4),(23.0,23.1,23.2,23.3,23.4) ),
        i: (2, 0, 2)
    }
)
print(result)