I want to resize 3D images with a dynamic shape, for instance going from shape (64,64,64,1) to (128,128,128,1). The idea is to unstack the image along one axis, resize the resulting 2D slices with tf.image.resize_images, and stack them again.
My issue is that tf.unstack cannot handle variable-sized inputs. Running my code raises "ValueError: Cannot infer num from shape (?, ?, ?, 1)".
I have considered using tf.split instead, but it expects an integer for the number of splits. Does anybody know a workaround?
Here is an example:
import tensorflow as tf
import numpy as np
def resize_by_axis(image, dim_1, dim_2, ax):
    resized_list = []
    # Unstack along the given axis to obtain 2D images
    unstack_img_depth_list = tf.unstack(image, axis=ax)
    # Resize the 2D images
    for i in unstack_img_depth_list:
        resized_list.append(tf.image.resize_images(i, [dim_1, dim_2], method=1, align_corners=True))
    # Stack them back into a 3D tensor
    stack_img = tf.stack(resized_list, axis=ax)
    return stack_img
#X = tf.placeholder(tf.float32, shape=[64,64,64,1])
X = tf.placeholder(tf.float32, shape=[None,None,None,1])
# Get new shape
shape = tf.cast(tf.shape(X), dtype=tf.float32) * tf.constant(2, dtype=tf.float32)
x_new = tf.cast(shape[0], dtype=tf.int32)
y_new = tf.cast(shape[1], dtype=tf.int32)
z_new = tf.cast(shape[2], dtype=tf.int32)
# Reshape
X_reshaped_along_xy = resize_by_axis(X, dim_1=x_new, dim_2=y_new, ax=2)
X_reshaped_along_xyz= resize_by_axis(X_reshaped_along_xy, dim_1=x_new, dim_2=z_new, ax=1)
init = tf.global_variables_initializer()
# Run
with tf.Session() as sess:
    sess.run(init)
    result = X_reshaped_along_xyz.eval(feed_dict={X: np.zeros((64, 64, 64, 1))})
    print(result.shape)
tf.image.resize_images can resize multiple images at once, but it does not let you pick which axis is the batch axis. However, you can rearrange the dimensions of the tensor so that the axis you want comes first and is used as the batch dimension, and then put it back after resizing:
import tensorflow as tf
def resize_by_axis(image, dim_1, dim_2, ax):
    # Make a permutation of dimensions that puts ax first
    dims = tf.range(tf.rank(image))
    perm1 = tf.concat([[ax], dims[:ax], dims[ax + 1:]], axis=0)
    # Transpose to put the ax dimension first
    image_tr = tf.transpose(image, perm1)
    # Resize
    resized_tr = tf.image.resize_images(image_tr, [dim_1, dim_2],
                                        method=1, align_corners=True)
    # Make the inverse permutation that puts ax back in its place
    perm2 = tf.concat([dims[:ax] + 1, [0], dims[ax + 1:]], axis=0)
    # Transpose to put ax back in its place
    resized = tf.transpose(resized_tr, perm2)
    return resized
In your example:
import tensorflow as tf
import numpy as np
X = tf.placeholder(tf.float32, shape=[None, None, None, 1])
# Get new shape
shape = tf.cast(tf.shape(X), dtype=tf.float32) * tf.constant(2, dtype=tf.float32)
x_new = tf.cast(shape[0], dtype=tf.int32)
y_new = tf.cast(shape[1], dtype=tf.int32)
z_new = tf.cast(shape[2], dtype=tf.int32)
# Reshape
X_reshaped_along_xy = resize_by_axis(X, dim_1=x_new, dim_2=y_new, ax=2)
X_reshaped_along_xyz = resize_by_axis(X_reshaped_along_xy, dim_1=x_new, dim_2=z_new, ax=1)
init = tf.global_variables_initializer()
# Run
with tf.Session() as sess:
    sess.run(init)
    result = X_reshaped_along_xyz.eval(feed_dict={X: np.zeros((64, 64, 64, 1))})
    print(result.shape)
    # (128, 128, 128, 1)
I have a requirement: I want to use the updated value of x as an input to an RNN. The code snippet below illustrates it in detail.
x = tf.placeholder("float", shape=[None,1])
RNNcell = tf.nn.rnn_cell.BasicRNNCell(....)
outputs, _ = tf.dynamic_rnn(RNNCell, tf.reshape(x, [1,-1,1]))
x = outputs[-1] * (tf.Varaibles(...) * tf.Constants(...))
Vlad's answer is correct, but since I am a new member I cannot vote. The snippet below is an updated version of Vlad's, with an RNN cell.
x = tf.placeholder("float", shape=[None,1])
model = tf.nn.rnn_cell.BasicRNNCell(num_units=1, activation=None)
outputs, state = tf.nn.dynamic_rnn(model, tf.reshape(x, [-1,1, 1]), dtype=tf.float32)
# output1 = model.output
# output1 = outputs[-1]
output1 = outputs[:,-1,:]
# output1 = outputs
some_value = tf.constant([9.0], # <-- Some tensor the output will be multiplied by
dtype=tf.float32)
output1 *= some_value # <-- The output had been multiplied by `some_value`
# (with broadcasting in case of
# more than one input samples)
with tf.control_dependencies([output1]): # <-- Not necessary, but explicit control
output2, state2 = model(output1,state)
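For completeness, a minimal way to actually run this snippet (my addition, assuming five one-dimensional input samples):

import numpy as np

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    res1, res2 = sess.run([output1, output2],
                          feed_dict={x: np.ones((5, 1), dtype=np.float32)})
    print(res1.shape, res2.shape)  # (5, 1) (5, 1)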
The example is more or less self-explanatory: we take the output of the model, multiply it by some tensor (a scalar, or a tensor of rank > 0 that can be broadcast), feed it to the model again, and get the result:
import tensorflow as tf
import numpy as np

x = tf.placeholder(tf.float32, shape=(None, 2))
w = tf.Variable(tf.random_normal([2, 2]))
bias = tf.Variable(tf.zeros((2, )))
output1 = tf.matmul(x, w) + bias

some_value = tf.constant([3, 3],  # <-- Some tensor the output will be multiplied by
                         dtype=tf.float32)
output1 *= some_value * x  # <-- The output is multiplied by `some_value`
                           #     (in this case with broadcasting in case of
                           #     more than one input sample)
with tf.control_dependencies([output1]):    # <-- Not necessary, but explicit control
    output2 = tf.matmul(output1, w) + bias  #     dependencies are always good practice.

data = np.ones((3, 2))  # 3 two-dimensional samples
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(output2, feed_dict={x: data}))
    # [[3.0432963 3.6584744]
    #  [3.0432963 3.6584744]
    #  [3.0432963 3.6584744]]
I have spent about two hours on this but could not find the solution. The closest thing to what I need is probably this boolean mask, but I am still missing the next step.
My neural network wasn't learning, so I started looking at every step it performs. Sure enough, I found a problem: due to the sparsity of my input layer, too many bias terms get propagated through. What is unique about my setup is that the last time matrices are zero matrices. Let me show you; I will first show a screenshot of my notebook and then present the code.
screenshot: [notebook screenshot not included]
I do not want bias terms added to where the whole time is a zeros matrix. I thought I could perhaps perform an op on the boolean mask filtered matrix?
Here is the code:
import tensorflow as tf
import numpy as np

dim = 4
# batch x time x events x dim
tensor = np.random.rand(1, 3, 4, dim)
zeros_last_time = np.zeros((4, dim))
tensor[0][2] = zeros_last_time

dtype = tf.float64
input_layer = tf.placeholder(tf.float64, shape=(None, None, 4, dim))
# These are supposed to perform operations on the non-zero times
Wn = tf.Variable(
    tf.truncated_normal(dtype=dtype, shape=(dim,), mean=0, stddev=0.01),
    name="Wn")
bn = tf.Variable(
    tf.truncated_normal(dtype=dtype, shape=(1,), mean=0, stddev=0.01),
    name="bn")
# this is the op I want to be performed only on non-zero times
op = tf.einsum('bted,d->bte', input_layer, Wn) + bn

s = tf.Session()
glob_vars = tf.global_variables_initializer()
s.run(glob_vars)
# first let's see what the bias term is
print(s.run(bn, feed_dict={input_layer: tensor}))
print(s.run(op, feed_dict={input_layer: tensor}))
EDIT: So I believe tf.where is what I need.
A good solution may be to use tf.where to create a mask that is zero where the input is zero (along the last dimension) and one otherwise.
Once we have this mask, we can simply multiply it by the bias to get the result.
Here's my solution:
import tensorflow as tf
import numpy as np

dim = 4
# batch x time x events x dim
tensor = np.random.rand(1, 3, 4, dim)
zeros_last_time = np.zeros((4, dim))
tensor[0][2] = zeros_last_time

dtype = tf.float64
input_layer = tf.placeholder(tf.float64, shape=(None, None, 4, dim))
# These are supposed to perform operations on the non-zero times
Wn = tf.Variable(
    tf.truncated_normal(dtype=dtype, shape=(dim,), mean=0, stddev=0.01),
    name="Wn")
bn = tf.Variable(
    tf.truncated_normal(dtype=dtype, shape=(1,), mean=0, stddev=0.01),
    name="bn")

# True where the whole vector along the last dimension is zero
all_zero = tf.reduce_all(tf.equal(input_layer, 0), axis=-1)  # shape (batch, time, events)
# mask of zeros for all-zero vectors, ones otherwise
mask = tf.where(all_zero,
                tf.zeros(tf.shape(all_zero)),
                tf.ones(tf.shape(all_zero)))
bias = bn * tf.cast(mask, dtype)
# this is the op I want to be performed only on non-zero times
op = tf.einsum('bted,d->bte', input_layer, Wn) + bias

s = tf.Session()
glob_vars = tf.global_variables_initializer()
s.run(glob_vars)
# first let's see what the bias term is
print(s.run(bn, feed_dict={input_layer: tensor}))
print(s.run(op, feed_dict={input_layer: tensor}))
I managed to get the right bias values, but then noticed that the dimensions were messed up, so this is only a partial answer:
import tensorflow as tf
import numpy as np

dim = 4
# batch x time x events x dim
tensor = np.random.rand(1, 3, 4, dim)
zeros_last_time = np.zeros((4, dim))
tensor[0][2] = zeros_last_time

dtype = tf.float64
input_layer = tf.placeholder(dtype, shape=(None, None, 4, dim))
# These are supposed to perform operations on the non-zero times
Wn = tf.Variable(
    tf.truncated_normal(dtype=dtype, shape=(dim,), mean=0, stddev=0.01),
    name="Wn")
bn = tf.Variable(
    tf.truncated_normal(dtype=dtype, shape=(1,), mean=0, stddev=0.01),
    name="bn")

# elementwise comparison against zero, broadcast over batch and time
zeros = tf.equal(input_layer, tf.cast(tf.zeros(tf.shape(input_layer)[2:]),
                                      tf.float64))
# bias mask: zero where the input is zero, one elsewhere
where_ = tf.where(zeros, tf.zeros(tf.shape(input_layer)),
                  tf.ones(tf.shape(input_layer)))
bias = bn * tf.cast(where_, tf.float64)
op = tf.einsum('bted,d->bte', input_layer, Wn) + bias  # will fail at run time: bias has an extra dim
print(bias)

s = tf.Session()
glob_vars = tf.global_variables_initializer()
s.run(glob_vars)
feed_dict = {input_layer: tensor}
s.run(bias, feed_dict)
and these two ops for the bias do the job:
bias = tf.slice(bias, [0, 0, 0, 0], [1, 3, 1, 4])
squeezed_bias = tf.squeeze(bias)
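For completeness, a sketch of how the sliced-and-squeezed bias could then be used (my addition; the slice sizes are specific to this 1 x 3 x 4 x 4 example):

# the (3, 4) squeezed bias broadcasts against the (1, 3, 4) einsum output
op = tf.einsum('bted,d->bte', input_layer, Wn) + squeezed_bias
print(s.run(op, feed_dict))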
I am generally struggling with indexing tensors in tensorflow.
I have image data and additional scalar data. I can only use a single placeholder to input all the data to a Neural Network.
The images (img) are numpy arrays with shape (84,84,3) and I have data a with shape (2) and b with shape (1).
Now I create a single sample
sample = np.reshape(np.array([img,a,b]),(3,1)) #shape (3,1)
The placeholder is
input = tf.placeholder(dtype=tf.float32,shape=[None] + list(sample.shape))
Now when TF reads a batch of samples I would like to retrieve the batch of images, the batch of a, and the batch of b, because they need to be input in different locations in the Neural Network.
Here is a minimal example:
import tensorflow as tf
from tensorflow.contrib import layers
import numpy as np

# Numpy
img = np.random.rand(84, 84, 3)
a = np.random.rand(2)
b = np.random.rand(1)
sample = np.reshape(np.array([img, a, b]), (3, 1))  # shape (3,1)
batch = np.repeat(np.expand_dims(sample, axis=0), 32, axis=0)  # shape (32,3,1)

# TF
input = tf.placeholder(dtype=tf.float32, shape=[None] + list(sample.shape))

# TODO:
tf_img = ...  # get the image batch from input
tf_a = ...    # get the a batch from input
tf_b = ...    # get the b batch from input

out = layers.convolution2d(tf_img, num_outputs=64, kernel_size=8, stride=2, activation_fn=tf.nn.relu)
out = layers.flatten(out)
out = tf.concat([out, tf_a, tf_b], axis=1)
out = layers.fully_connected(out, 10, activation_fn=tf.nn.relu)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    _ = sess.run(out, feed_dict={input: batch})
How can I extract the individual parts of the input from a tensor with shape (?, 3, 1), use the image data to create an embedding, and concatenate the other two parts to that output embedding?
Is there a better way to input the data? My only constraint is that it has to be a single placeholder.
Here's a complete example for my comment above:
import numpy as np
import tensorflow as tf

im_height = 84
im_width = 84
im_channels = 3
a_len = 2
b_len = 1

np_img = np.random.rand(im_height, im_width, im_channels)
np_a = np.random.rand(a_len)
np_b = np.random.rand(b_len)

# flatten the inputs and concatenate them into a single 1D numpy array
np_sample = np.concatenate((np_img.reshape(-1), np_a.reshape(-1), np_b.reshape(-1)), axis=0)
# construct a pseudo batch
np_batch = np.repeat(np_sample[np.newaxis, :], 32, axis=0)

tf_batch = tf.placeholder(shape=(None, im_height*im_width*im_channels + a_len + b_len), dtype=tf.float32)
img_stop = im_height*im_width*im_channels
a_stop = img_stop + a_len
# you could also use tf.slice(...) here
tf_img = tf.reshape(tf_batch[:, 0:img_stop], (-1, im_height, im_width, im_channels))
tf_a = tf.reshape(tf_batch[:, img_stop:a_stop], (-1, a_len))
tf_b = tf.reshape(tf_batch[:, a_stop:], (-1, b_len))

with tf.Session() as sess:
    fetch_dict = {'img': tf_img, 'a': tf_a, 'b': tf_b}
    feed_dict = {tf_batch: np_batch}
    res = sess.run(fetch_dict, feed_dict=feed_dict)

assert np.isclose(res['img'][0, ...], np_img).all()
assert np.isclose(res['a'][0, :], np_a).all()
assert np.isclose(res['b'][0, :], np_b).all()
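As the comment in the code notes, the same extraction could also be written with tf.slice; a possible equivalent sketch (my addition, not part of the original answer):

# equivalent extraction using tf.slice (-1 means "all remaining" along that axis)
tf_img2 = tf.reshape(tf.slice(tf_batch, [0, 0], [-1, img_stop]),
                     (-1, im_height, im_width, im_channels))
tf_a2 = tf.slice(tf_batch, [0, img_stop], [-1, a_len])
tf_b2 = tf.slice(tf_batch, [0, a_stop], [-1, b_len])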
However, this is at least as invasive as adding appropriate placeholders to the code. Additionally, it's much less readable, in my opinion.
Given a tensor input of undefined shape H x W, I would like to reverse every other row.
In numpy, I would simply do
input[1::2, :] = input[1::2, ::-1]
but this is apparently not possible in TensorFlow.
Note that the input shape is only partially-known, i.e., input.shape == (None, None).
Any ideas?
You can achieve this using a placeholder and tf.reverse:
input = tf.placeholder(shape=(None, None), dtype=tf.int32)
# define the axis to reverse
axis_to_reverse = 1
input_reversed = tf.reverse(input, [axis_to_reverse])

sess = tf.Session()
_input_reversed = sess.run(input_reversed, {input: your_array})  # your_array: the numpy array to feed
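Note that tf.reverse flips every row. If you only want to flip every other row, as in the numpy snippet, one possible sketch (my suggestion, not part of the original answer) builds a per-row condition and lets tf.where pick whole rows:

import numpy as np

row_ids = tf.range(tf.shape(input)[0])      # 0, 1, 2, ...
is_odd = tf.equal(row_ids % 2, 1)           # True for rows 1, 3, 5, ...
rows_reversed = tf.reverse(input, [1])      # every row reversed
# with a rank-1 condition, tf.where selects whole rows from either tensor
every_other_reversed = tf.where(is_odd, rows_reversed, input)

result = sess.run(every_other_reversed, {input: np.arange(12).reshape(3, 4)})
# rows 0 and 2 unchanged, row 1 reversed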
I'm trying to build a softmax regression model for CIFAR classification. At first, when I tried to pass my images and labels into the feed dictionary, I got an error saying that feed dictionaries do not accept Tensors. I then converted them into numpy arrays using .eval(), but the program hangs at the .eval() line and does not continue any further. How can I pass this data into the feed_dict?
CIFARIMAGELOADING.PY
import tensorflow as tf
import os
import tensorflow.models.image.cifar10 as cf
IMAGE_SIZE = 24
BATCH_SIZE = 128
def loadimagesandlabels(size):
    # Load the images from the CIFAR data directory
    FLAGS = tf.app.flags.FLAGS
    data_dir = os.path.join(FLAGS.data_dir, 'cifar-10-batches-bin')
    filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i) for i in xrange(1, 6)]
    filename_queue = tf.train.string_input_producer(filenames)
    read_input = cf.cifar10_input.read_cifar10(filename_queue)

    # Reshape and crop the image
    height = IMAGE_SIZE
    width = IMAGE_SIZE
    reshaped_image = tf.cast(read_input.uint8image, tf.float32)
    cropped_image = tf.random_crop(reshaped_image, [height, width, 3])

    # Generate a batch of images and labels by building up a queue of examples
    print('Filling queue with CIFAR images')
    num_preprocess_threads = 16
    min_fraction_of_examples_in_queue = 0.4
    min_queue_examples = int(BATCH_SIZE * min_fraction_of_examples_in_queue)
    images, label_batch = tf.train.batch([cropped_image, read_input.label],
                                         batch_size=BATCH_SIZE,
                                         num_threads=num_preprocess_threads,
                                         capacity=min_queue_examples + 3*BATCH_SIZE)
    print(images)
    print(label_batch)
    return images, tf.reshape(label_batch, [BATCH_SIZE])
CIFAR.PY
# Set up placeholder vectors for images and labels
x = tf.placeholder(tf.float32, shape=[None, 1728])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
W = tf.Variable(tf.zeros([1728, 10]))
b = tf.Variable(tf.zeros([10]))

# Implement the regression model: multiply input images x by the weight matrix W, add the bias b,
# and compute the softmax probabilities assigned to each class
y = tf.nn.softmax(tf.matmul(x, W) + b)

# Define cross entropy:
# tf.reduce_sum sums across all classes and tf.reduce_mean takes the average over these sums
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

# Train the model.
# Each training iteration we load 128 training examples. We then run the train_step operation,
# using feed_dict to replace the placeholder tensors x and y_ with the training examples
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

# Open up a Session
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
for i in range(1000):
    images, labels = CIFARImageLoading.loadimagesandlabels(size=BATCH_SIZE)
    unrolled_images = tf.reshape(images, (1728, BATCH_SIZE))
    # convert labels to their one-hot representations
    # should produce [[1,0,0,...],[0,1,0,...],...]
    one_hot_labels = tf.one_hot(indices=labels, depth=NUM_CLASSES, on_value=1.0, off_value=0.0, axis=-1)
    print(unrolled_images)
    print(one_hot_labels)
    images_numpy, labels_numpy = unrolled_images.eval(session=sess), one_hot_labels.eval(session=sess)
    sess.run(train_step, feed_dict={x: images_numpy, y_: labels_numpy})

# Evaluate the model.
# .equal returns a tensor of booleans; we cast these to floats and take their mean
# to get the percent correctness (accuracy)
print("evaluating")
test_images, test_labels = CIFARImageLoading.loadimagesandlabels(TEST_SIZE)
test_images_unrolled = tf.reshape(test_images, (1728, TEST_SIZE))
test_images_one_hot = tf.one_hot(indices=test_labels, depth=NUM_CLASSES, on_value=1.0, off_value=0.0, axis=-1)

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval(feed_dict={x: unrolled_images.eval(), y_: test_images_one_hot.eval()}))
There are a couple of things that you are not understanding well. Throughout your graph you work with Tensors. You define Tensors either with tf.placeholder, feeding them through session.run(..., feed_dict={...}), or with tf.Variable, initializing it with session.run(tf.initialize_all_variables()). You must feed your input this way, and it should be numpy arrays with the same shape as you expect in the placeholders. Here's a simple example:
images = tf.placeholder(type, [1728, BATCH_SIZE])
labels = tf.placeholder(type, [size])

'''
Build your network here so you have the variable: Output
'''

images_feed, labels_feed = CIFARImageLoading.loadimagesandlabels(size=BATCH_SIZE)
# here you can see your output
print(sess.run(Output, feed_dict={images: images_feed, labels: labels_feed}))
You do not feed tf functions with numpy arrays; inside the graph you always work with Tensors. The feed_dict, on the other hand, is always fed with numpy arrays. The point is: you never have to convert tensors to numpy arrays for the input; that does not make sense. Your input must be numpy arrays: if it's a list, you can use np.asarray(list); if it's a tensor, you are doing this wrong.
I do not know what CIFARImageLoading.loadimagesandlabels returns, but I imagine it's not a Tensor; it's probably a numpy array already, so just get rid of the .eval().
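One more thing worth checking (my addition, not part of the original answer): loadimagesandlabels builds its batch with tf.train.batch, which is queue-backed, and in TF 1.x those input queues must be started before the batch op can be evaluated; otherwise .eval() blocks forever, which would explain the hang. A minimal sketch, assuming the rest of the setup from the question:

images, labels = CIFARImageLoading.loadimagesandlabels(size=BATCH_SIZE)

sess = tf.Session()
sess.run(tf.initialize_all_variables())
# start the threads that fill the input queues
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
# now the batch tensors can be evaluated into numpy arrays without hanging
images_numpy, labels_numpy = sess.run([images, labels])
coord.request_stop()
coord.join(threads)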