I've found that indexing still is an open issue in tensorflow (#206), so I'm wondering what I could use as a workaround at the moment. I want to index/slice a row/column of a matrix based on a variable that changes for every training example.
What I've tried so far:
Slicing based on placeholder (doesn't work)
The following (working) code slices based on a fixed number.
import tensorflow as tf
import numpy as np
x = tf.placeholder("float")
y = tf.slice(x,[0],[1])
#initialize
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
#run
result = sess.run(y, feed_dict={x:[1,2,3,4,5]})
print(result)
However, it seems that I can't simply replace one of these fixed numbers with a tf.placeholder. The following code gives me the error "TypeError: List of Tensors when single Tensor expected."
import tensorflow as tf
import numpy as np
x = tf.placeholder("float")
i = tf.placeholder("int32")
y = tf.slice(x,[i],[1])
#initialize
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
#run
result = sess.run(y, feed_dict={x:[1,2,3,4,5],i:0})
print(result)
This sounds like the brackets around [i] are too much, but removing them doesn't help either. How to use a placeholder/variable as index?
Slicing based on python variable (doesn't backprop/update properly)
I've also tried using a normal python variable as index. This does not lead to an error, but the network doesn't learn anything while training. I suppose because the changing variable is not properly registered, the graph is malformed and updates don't work?
Slicing via one-hot vector + multiplication (works, but is slow)
One workaround I found is using a one-hot vector. Making a one-hot vector in numpy, passing this using a placeholder, then doing the slicing via matrix multiplication. This works, but is quite slow.
Any ideas how to efficiently slice/index based on a variable?
Slicing based on a placeholder should work just fine. It looks like you are running into a type error, due to some subtle issues of shapes and types. Where you have the following:
x = tf.placeholder("float")
i = tf.placeholder("int32")
y = tf.slice(x,[i],[1])
...you should instead have:
x = tf.placeholder("float")
i = tf.placeholder("int32")
y = tf.slice(x,i,[1])
...and then you should feed i as [0] in the call to sess.run().
To make this a little clearer, I would recommend rewriting the code as follows:
import tensorflow as tf
import numpy as np
x = tf.placeholder(tf.float32, shape=[None]) # 1-D tensor
i = tf.placeholder(tf.int32, shape=[1])
y = tf.slice(x, i, [1])
#initialize
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
#run
result = sess.run(y, feed_dict={x: [1, 2, 3, 4, 5], i: [0]})
print(result)
The additional shape arguments to the tf.placeholder op help to ensure that the values you feed have the appropriate shapes, and also that TensorFlow will raise an error if the shapes are not correct.
If you have an extra dimension, this works.
import tensorflow as tf
import numpy as np
def reorder0(e, i, length):
'''
e: a two dimensional tensor
i: a one dimensional int32 tensor, of shape (e.shape[0])
returns: a tensor of the same shape as e, where the jth entry is entry i[j] from e
'''
return tf.concat(
[ tf.expand_dims( e[i[j],:], axis=0) for j in range(length) ],
axis=0
)
e = tf.placeholder(tf.float32, shape=(2,3,5), name='e' ) # sentences, words, embedding
i = tf.placeholder(tf.int32, shape=(2,3), name='i' ) # for each word, index of parent
p = tf.concat(
[ tf.expand_dims(reorder0(e[k,:,:], i[k,:], 3), axis=0) for k in range(2) ],
axis=0,
name='p'
)
#initialize
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
#run
result = sess.run(p, feed_dict={
e: [
( (1.0,1.1,1.2,1.3,1.4),(2.0,2.1,2.2,2.3,2.4),(3.0,3.1,3.2,3.3,3.4) ),
( (21.0,21.1,21.2,21.3,21.4),(22.0,22.1,22.2,22.3,22.4),(23.0,23.1,23.2,23.3,23.4) ),
],
i: [ (1,1,1), (2,0,2)]
})
print(result)
If the sizes are not known when building the model, use TensorArray.
e = tf.placeholder(tf.float32, shape=(3,5) ) # words, embedding
i = tf.placeholder(tf.int32, shape=(3) ) # for each word, index of parent
#p = reorder0(e, i, 3)
a = tf.TensorArray(
tf.float32,
size=e.get_shape()[0],
dynamic_size=True,
infer_shape= True,
element_shape=e.get_shape()[1],
clear_after_read = False
)
#initialize
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
#run
result = sess.run(
a.unstack(e).gather(i),
feed_dict={
e: ( (1.0,1.1,1.2,1.3,1.4),(2.0,2.1,2.2,2.3,2.4),(3.0,3.1,3.2,3.3,3.4) ),
#( (21.0,21.1,21.2,21.3,21.4),(22.0,22.1,22.2,22.3,22.4),(23.0,23.1,23.2,23.3,23.4) ),
i: (2,0,2)
}
)
print(result)
Related
I've attempted converting a Python-side training loop to Tensorflow to (hypothetically) make the code run faster - not having to pass control over to cpu constantly. However, I can't manage using tf.while_loop.
Here's the code that works:
import numpy as np
import tensorflow as tf
from tqdm import tqdm
from sklearn.datasets import load_iris
from sklearn.preprocessing import RobustScaler
x, y = load_iris(True)
x = RobustScaler().fit_transform(x)
shape = (10, 10)
max_epochs = 1000
graph = tf.Graph()
sess = tf.Session(graph=graph)
x = x.astype(np.float64)
# Construct graph
with graph.as_default():
weights = tf.get_variable(
'weights', shape, initializer=tf.constant_initializer, dtype=tf.float64
)
curr_epoch = tf.placeholder(dtype=tf.int64, shape=())
with tf.name_scope('data'):
data = tf.data.Dataset.from_tensor_slices(x)
data = data.shuffle(buffer_size=10000)
data = data.repeat(max_epochs)
data = data.batch(1)
data = data.make_one_shot_iterator().get_next()
with tf.name_scope('update'):
update_op = make_update_op(weights)
init = tf.global_variables_initializer()
sess.run(init)
for i in tqdm(range(max_epochs)):
for _ in range(x.shape[0]):
sess.run(update_op, feed_dict={
curr_epoch: i
})
np_weights = sess.run(weights)
print(np_weights) # Correctly prints an array of 150's.
Now, if I create an update function to pass tf.while_loop, an error is thrown.
def make_update_op(w):
return w.assign(
w + 0.001
)
# In the code above:
update_op = tf.while_loop(lambda _: True, make_update_op, (weights,), maximum_iterations=x.shape[0])
# No inner loop:
for i in tqdm(range(max_epochs)):
sess.run(update_op, feed_dict={
curr_epoch: i
})
Line 22, in make_update_op
return w.assign(
AttributeError: 'Tensor' object has no attribute 'assign'
I don't quite understand what is happening even after reading the documentation. weights is a Variable after all. What could be done to correctly make the training loop?
The tensor that you're trying to assign a new value within a while loop is a result of a sequence of multiple operations-tensors (operation is node in the graph, while tensor is a directed edge). In particular, the while loop will produce:
Variable/Read-->while/Enter-->while/Merge-->while/Switch-->while/Identity
What you're trying to assign here is a tensor while/Identity.
tf.while_loop is usually used to iterate over the dimensions of a tensor (also over the None - the unknown dimension). You're trying to iterate over the variables that are fully defined. You don't need to create a tf.while_loop for that. Just create operations that update each variable and group these operations together:
update_ops = [w.assign(w + 0.001) for w in weights]
update_op = tf.group(update_ops)
Now, when you execute the update_op with tf.Session() interface it will update all variables.
Example:
import tensorflow as tf
v1 = tf.Variable(tf.ones((1, 2), dtype=tf.float32))
v2 = tf.Variable(2*tf.ones((1, 3), dtype=tf.float32))
update_ops = [w.assign(w + 0.001) for w in [v1, v2]]
update_op = tf.group(update_ops)
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
print('before update:')
print(v1.eval(), v2.eval())
print('after update:')
sess.run(update_op) # <-- update your variables
print(v1.eval(), v2.eval())
# before update:
# [[1. 1.]] [[2. 2. 2.]]
# after update:
# [[1.001 1.001]] [[2.001 2.001 2.001]]
Turns out, all that was missing was the fact that one cannot assign to a variable inside a loop as Vlad pointed out. Instead, one can return the new value of a variable.
def make_update_op(w):
return w + 0.001
new_w = tf.while_loop(lambda _: True, make_update_op, (weights,), maximum_iterations=x.shape[0])
update_op = weights.assign(new_w)
To use more variables one would need to return the same amount from the function and unpack them in Python, but the principle is the same.
def make_update_op(w, d):
return w + 0.001, d
new_w, _ = tf.while_loop(lambda *_: True, make_update_op, (weights, data), maximum_iterations=x.shape[0])
update_op = weights.assign(new_w)
Code
#!/usr/bin/env python3
import tensorflow as tf
import numpy as np
def customOps(n):
x = tf.placeholder(tf.float32)
v1 = tf.reduce_sum(x,1)
v2 = tf.reduce_sum(x,0)
v = tf.nn.softmax(tf.concat([v1, v2], 0))
index = np.argmax(v)
if index > n/3:
finalval = tf.norm(v1-v2, ord='euclidean')
else:
finalval = tf.norm(v1+v2, ord='euclidean')
return finalval
if __name__ == '__main__':
mat = np.asarray([[0, 1], [1, 0]], dtype = np.float32)
n = mat.shape[0]
finalVal = customOps(n)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
outVal = sess.run(finalVal, feed_dict={x:mat})
print(outVal)
sess.close()
Error Thrown
InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder_5' with dtype float [[{{node Placeholder_5}} = Placeholder[dtype=DT_FLOAT, shape=<unknown>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
The error is thrown at sess.run(init) line in the above snippet. I am feeding a float type array through feed_dict and I am not sure why the error is being thrown.
Where is the error and why?
Why the error:
Because you ran the same snippet multiple times in an unclean graph (i.e, your graph has multiple copies of the network).
The reason I can say this is the _5 at the end of the node name in the error message. TF assigns a default name to all tensors in the graph using incremental indices in case a name is already taken. Placeholder_5 means that in the same graph there is at least 5 Placeholder instances without a custom default name assigned which, given your code, should be impossible unless you called the function multiple times without cleaning up the graph.
How to fix it:
Run in a clean graph: Put tf.reset_default_graph() before finalVal = customOps(n).
Note: Your code has more issues than that (for example, you have x in the main branch, but x is a local variable of customOps), but the cause of the error you have is the one stated above.
Below you find a tested and working version of your code that addresses both issues.
import tensorflow as tf
import numpy as np
def customOps(n):
x = tf.placeholder(tf.float32)
v1 = tf.reduce_sum(x,1)
v2 = tf.reduce_sum(x,0)
v = tf.nn.softmax(tf.concat([v1, v2], 0))
index = np.argmax(v)
if index > n/3:
finalval = tf.norm(v1-v2, ord='euclidean')
else:
finalval = tf.norm(v1+v2, ord='euclidean')
return x, finalval
if __name__ == '__main__':
mat = np.asarray([[0, 1], [1, 0]], dtype = np.float32)
n = mat.shape[0]
tf.reset_default_graph()
x, finalVal = customOps(n)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
outVal = sess.run(finalVal, feed_dict={x:mat})
print(outVal)
sess.close()
I have read through previous strings. My data are in the form of an array fed to a placeholder. Trying to convert the data to a tensor before feeding produces a different (inverse) error message. Other solutions similarly do not seem to work in this situation. Here is minimal code.
from __future__ import print_function
import numpy as np
import tensorflow as tf
from tensorflow.contrib.factorization import KMeans
X = tf.placeholder(tf.float32, shape=[None, 10], name="X")
data = np.random.randn(2,10)
def lump(X):
# Build KMeans graph
kmeans = KMeans(inputs=X, num_clusters=k, distance_metric='cosine',
use_mini_batch=True)
(all_scores, cluster_idx, scores, cluster_centers_initialized, cluster_centers_var, init_op,
train_op) = kmeans.training_graph()
cluster_idx = cluster_idx[0] # fix for cluster_idx being a tuple
avg_distance = tf.reduce_mean(scores)
return cluster_idx, scores
# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
idx, d = sess.run(lump,feed_dict={X: data})
Correct, you can't evaluate just lump, because it's a function (returning tensors), not a tensor or an op. You probably meant to do something like this:
cluster_idx, scores = lump(X)
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
idx, d = sess.run([cluster_idx, scores], feed_dict={X: data})
Note that lump() is invoked before tf.global_variables_initializer(), because it defines new variables in the graph, so they must be initialized.
The code still fails, because lump is clearly not finished and has issues with dimensions, but it is the right way to evaluate something in a session.
This is the main part of my code.
I'm confused on function shuffle_batch and feed_dict.
In my code below, the features and labels I put into the function are "list".(I also tried "array" before.But it seems doesn't matter.)
What I want to do is make my testing data(6144,26) and training data(1024,13) into batch:(100,26) and (100,13),then set them as the feed_dict for the placeholders.
My questions are:
1.The outputs of the function tf.train.batch_shuffle are Tensors.But I can not put tensors in the feed_dict,right?
2.When I compiled the last two rows,error says,got shape [6144, 26], but wanted [6144] .I know it may be a dimension error,but how can I fix it.
Thanks a lot.
import tensorflow as tf
import scipy.io as sio
#import signal matfile
#[('label', (8192, 13), 'double'), ('clipped_DMT', (8192, 26), 'double')]
file = sio.loadmat('DMTsignal.mat')
#get array(clipped_DMT)
data_cDMT = file['clipped_DMT']
#get array(label)
data_label = file['label']
with tf.variable_scope('split_cDMT'):
cDMT_test_list = []
cDMT_training_list = []
for i in range(0,8192):
if i % 4 == 0:
cDMT_test_list.append(data_cDMT[i])
else:
cDMT_training_list.append(data_cDMT[i])
with tf.variable_scope('split_label'):
label_test_list = []
label_training_list = []
for i in range(0,8192):
if i % 4 == 0:
label_test_list.append(data_label[i])
else:
label_training_list.append(data_label[i])
#set parameters
n_features = cDMT_training.shape[1]
n_labels = label_training.shape[1]
learning_rate = 0.8
hidden_1 = 256
hidden_2 = 128
training_steps = 1000
BATCH_SIZE = 100
#set Graph input
with tf.variable_scope('cDMT_Inputs'):
X = tf.placeholder(tf.float32,[None, n_features],name = 'Input_Data')
with tf.variable_scope('labels_Inputs'):
Y = tf.placeholder(tf.float32,[None, n_labels],name = 'Label_Data')
#set variables
#Initialize both W and b as tensors full of zeros
with tf.variable_scope('layerWeights'):
h1 = tf.Variable(tf.random_normal([n_features,hidden_1]))
h2 = tf.Variable(tf.random_normal([hidden_1,hidden_2]))
w_out = tf.Variable(tf.random_normal([hidden_2,n_labels]))
with tf.variable_scope('layerBias'):
b1 = tf.Variable(tf.random_normal([hidden_1]))
b2 = tf.Variable(tf.random_normal([hidden_2]))
b_out = tf.Variable(tf.random_normal([n_labels]))
#create model
def neural_net(x):
layer_1 = tf.add(tf.matmul(x,h1),b1)
layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1,h2),b2))
out_layer = tf.add(tf.matmul(layer_2,w_out),b_out)
return out_layer
nn_out = neural_net(X)
#loss and optimizer
with tf.variable_scope('Loss'):
loss = tf.reduce_mean(tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(logits = nn_out,labels = Y)))
with tf.name_scope('Train'):
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss)
with tf.name_scope('Accuracy'):
correct_prediction = tf.equal(tf.argmax(nn_out,1),tf.argmax(Y,1))
#correct_prediction = tf.metrics.accuracy (labels = Y, predictions =nn_out)
acc = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
# Initialize
init = tf.global_variables_initializer()
# start computing & training
with tf.Session() as sess:
sess.run(init)
for step in range(training_steps):
#set batch
cmt_train_bat,label_train_bat = sess.run(tf.train.shuffle_batch([cDMT_training_list,label_training_list],batch_size = BATCH_SIZE,capacity=50000,min_after_dequeue=10000))
cmt_test_bat,label_test_bat = sess.run(tf.train.shuffle_batch([cDMT_test_list,label_test_list],batch_size = BATCH_SIZE,capacity=50000,min_after_dequeue=10000))
From the Session.run doc:
The optional feed_dict argument allows the caller to override the
value of tensors in the graph. Each key in feed_dict can be one of the
following types:
If the key is a tf.Tensor, the value may be a Python scalar, string,
list, or numpy ndarray that can be converted to the same dtype as that
tensor. Additionally, if the key is a tf.placeholder, the shape of the
value will be checked for compatibility with the placeholder.
...
So you are right: for X and Y (which are placeholders) you can't feed a tensor and tf.train.shuffle_batch is not designed to work with placeholders.
You can follow one of two ways:
get rid of placeholders and use tf.TFRecordReader in combination with tf.train.shuffle_batch, as suggested here. This way you'll have only tensors in your model and you won't need to "feed" anything additionally.
batch and shuffle the data yourself in numpy and feed into placeholders. This takes just several lines of code, so I find it easier, though both paths are valid.
Take also into account performance considerations.
I was hoping someone could explain the difference (if any) between the Input Layer in Keras and Placeholders within Tensorflow?
The more I investigate, the more the two appear similar, but I am not convinced 100% either way thus far.
Here is what I have observed in favor of the claim that Input Layers and tf Placeholders are the same:
1) The tensor returned from keras.Input() can be used like a placeholder in the feed_dict of tf.Session's run method. Here is part of a simple example using Keras, which adds two tensors (a and b) and concatenates the result with a third tensor (c):
model = create_graph()
con_cat = model.output[0]
ab_add = model.output[1]
# These values are used equivalently to tf.Placeholder() below
mdl_in_a = model.input[0]
mdl_in_b = model.input[1]
mdl_in_c = model.input[2]
sess = k.backend.get_session()
a_in = rand_array() # 2x2 numpy arrays
b_in = rand_array()
c_in = rand_array()
a_in = np.reshape( a_in, (1,2,2))
b_in = np.reshape( b_in, (1,2,2))
c_in = np.reshape( c_in, (1,2,2))
val_cat, val_add = sess.run([con_cat, ab_add],
feed_dict={ mdl_in_a: a_in, mdl_in_b: b_in, mdl_in_c: c_in})
2) The docs from the Tensorflow Contrib regarding the Keras Input Layer mention Placeholders in its argument description:
"sparse: A boolean specifying whether the placeholder
to be created is sparse"
Here is what I have observed in favor of the claim that Input Layers and tf Placeholders are NOT the same:
1) I have seen people utilize tf.Placeholder's instead of the Input Layer's returned Tensor. Something like:
a_holder = tf.placeholder(tf.float32, shape=(None, 2,2))
b_holder = tf.placeholder(tf.float32, shape=(None, 2,2))
c_holder = tf.placeholder(tf.float32, shape=(None, 2,2))
model = create_graph()
con_cat, ab_add = model( [a_holder, b_holder, c_holder])
sess = k.backend.get_session()
a_in = rand_array() # 2x2 numpy arrays
b_in = rand_array()
c_in = rand_array()
a_in = np.reshape( a_in, (1,2,2))
b_in = np.reshape( b_in, (1,2,2))
c_in = np.reshape( c_in, (1,2,2))
val_cat, val_add = sess.run([con_cat, ab_add],
feed_dict={ a_holder: a_in, b_holder: b_in, c_holder: c_in})
Input() returns a handle to created placeholder, and does not create other tf operators; Tensor stands for both output of operations and placeholders so there is no contradiction.
To analyse what exactly is created by Input() lets run following code:
with tf.name_scope("INPUT_LAYER"):
input_l = Input(shape = [n_features])
Then:
writer = tf.summary.FileWriter('./my_graph', tf.get_default_graph())
writer.close()
And launch Tensorboard from your console:
tensorboard --logdir="./my_graph"
Look at the results: