Restoring queue state in Tensorflow from checkpoint - python

Context: I am training a model using an Estimator. Without extraneous details, I am using queues to read in a series of input images, which are batched and manipulated by an input function that I call "read_pics_batch":
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    keypoint_regression.fit(
        input_fn=lambda: inp_mod.read_pics_batch(
            names_train, joint_annopoints_train, num_in_batch,
            max_num_epochs, 'TRAIN'),
        steps=max_steps_per_epoch * max_num_epochs,  # max number of steps
        monitors=[logging_hook])
    coord.request_stop()
    coord.join(threads)
The input function has the following form, where I am also randomising the input file order:
def read_pics_batch(names_list, joint_list, batch_size, max_num_epochs, task):
    names_tensor = tf.convert_to_tensor(names_list, dtype=tf.string)
    joint_total_tensor = tf.convert_to_tensor(joint_list, dtype=tf.int32)
    min_after_dequeue = 100
    capacity = min_after_dequeue + 3 * batch_size
    file_pattern = [("...")]
    examples = graph_io.read_keyed_batch_examples(
        file_pattern, batch_size,
        reader=tf.WholeFileReader, randomize_input=True,
        parse_fn=example_to_standard_pic,
        num_epochs=max_num_epochs, queue_capacity=capacity)
My questions are as follows:
1) Is there any way to restore the queue state from a checkpoint, like any other variable? If "randomize_input" in read_keyed_batch_examples were set to False, then at each restart of the training_op I would read the same input files over and over, which is clearly not what I want.
2) If randomize_input = True, how exactly does the queue decide which files to enqueue? I see two possible options and I am unsure which is correct:
it selects a short-list of size "capacity" (from the full-list given by all the filenames defined by "file_pattern") and then randomises the order of the names in this short-list
it randomises the names in the full-list first, and then creates a short-list of size "capacity" out of this
If the second case applies, I don't believe that I would actually need to restore the queue state, since I would in principle read different files every time, but if the first case applies, I would still be reading the same few files over and over, just in a different order.
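To make the difference between the two hypotheses concrete, here is a small pure-Python simulation. It is only a sketch of the two options as worded in the question and makes no claim about which one read_keyed_batch_examples actually implements; the function names, the 1000 made-up filenames and the capacity of 10 are all assumptions chosen for illustration.

```python
import random


def shortlist_then_shuffle(filenames, capacity, rng):
    """Option 1: take the first `capacity` names, then shuffle only those."""
    shortlist = list(filenames[:capacity])
    rng.shuffle(shortlist)
    return shortlist


def shuffle_then_shortlist(filenames, capacity, rng):
    """Option 2: shuffle the full list, then take the first `capacity` names."""
    shuffled = list(filenames)
    rng.shuffle(shuffled)
    return shuffled[:capacity]


def files_seen(strategy, filenames, capacity, restarts, seed=0):
    """Union of all files read across several simulated training restarts."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(restarts):
        seen.update(strategy(filenames, capacity, rng))
    return seen


# Hypothetical file list; the real one comes from file_pattern.
filenames = ["img_%04d.png" % i for i in range(1000)]
seen_1 = files_seen(shortlist_then_shuffle, filenames, capacity=10, restarts=50)
seen_2 = files_seen(shuffle_then_shortlist, filenames, capacity=10, restarts=50)
print(len(seen_1), len(seen_2))
```

Under option 1 the same ten files are revisited on every restart, whereas under option 2 the set of files seen keeps growing, which is the distinction the question hinges on.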
Thank you for your time!

Related

Tensorflow and reading binary data properly

I am trying to properly read in my own binary data to Tensorflow based on Fixed length records section of this tutorial, and by looking at the read_cifar10 function here. Mind you I am new to tensorflow, so my understanding may be off.
My Data
My files are binary with float32 type. The first 32 bit sample is the label, and the remaining 256 samples are the data. I want to reshape the data at the end to a [2, 128] matrix.
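As a sanity check on the record layout described above, the byte arithmetic and the [2, 128] reshape can be reproduced without TensorFlow. This is a minimal sketch using only the standard library; it assumes little-endian float32 values (the question does not state the byte order), and the fake record and the parse_record name are made up for the example.

```python
import struct

LABEL_ITEMS = 1
DATA_ITEMS = 256
ITEM_BYTES = 4  # size of one float32
RECORD_BYTES = (LABEL_ITEMS + DATA_ITEMS) * ITEM_BYTES  # 257 * 4 = 1028


def parse_record(record):
    """Split one raw record into (label, 2 x 128 nested list of floats)."""
    if len(record) != RECORD_BYTES:
        raise ValueError("expected %d bytes, got %d" % (RECORD_BYTES, len(record)))
    # "<" assumes little-endian float32; use ">" for big-endian files.
    values = struct.unpack("<%df" % (LABEL_ITEMS + DATA_ITEMS), record)
    label = values[0]
    data = values[LABEL_ITEMS:]
    half = DATA_ITEMS // 2
    # Row-major split, matching what a reshape to [2, 128] would produce.
    return label, [list(data[:half]), list(data[half:])]


# Fake record: label 3.0 followed by the values 0.0, 1.0, ..., 255.0.
fake = struct.pack("<257f", 3.0, *[float(i) for i in range(256)])
label, data = parse_record(fake)
print(label, len(data), len(data[0]), data[1][0])
```

Each record is therefore 1028 bytes, which is the value FixedLengthRecordReader needs as record_bytes.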
My Code So far:
import tensorflow as tf
import os

def read_data(filename_queue):
    item_type = tf.float32
    label_items = 1
    data_items = 256
    label_bytes = label_items * item_type.size
    data_bytes = data_items * item_type.size
    record_bytes = label_bytes + data_bytes
    reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
    key, value = reader.read(filename_queue)
    record_data = tf.decode_raw(value, item_type)
    # labels = tf.cast(tf.strided_slice(record_data, [0], [label_items]), tf.int32)
    label = tf.strided_slice(record_data, [0], [label_items])
    data0 = tf.strided_slice(record_data, [label_items], [label_items + data_items])
    data = tf.reshape(data0, [2, data_items/2])
    return data, label

if __name__ == '__main__':
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # Set GPU device
    datafiles = ['train_0000.dat', 'train_0001.dat']
    num_epochs = 2
    filename_queue = tf.train.string_input_producer(datafiles, num_epochs=num_epochs, shuffle=True)
    data, label = read_data(filename_queue)
    with tf.Session() as sess:
        init = tf.global_variables_initializer()
        sess.run(init)
        (x, y) = read_data(filename_queue)
        print(y.eval())
This code hangs at the print(y.eval()), but I fear I have much bigger issues than that.
Question:
When I execute this, I get a data and label tensor returned. The problem is I don't quite understand how to actually read the data from the tensor. For example, I understand the autoencoder example here, however this has a mnist.train.next_batch(batch_size) function that is called to read the next batch. Do I need to write that for my function, or is it handled by something internal to my read_data() function. If I need to write that function, what does it look like?
Are there any other obvious things I'm missing? My goal in using this method is to reduce I/O overhead and to avoid storing all of the data in memory, since my files are quite large.
Thanks in advance.
Yes. You are pretty much done. At this point you need to:
1) Write your neural network model, which is supposed to take your data and return a label.
2) Write your cost function C, which takes the network prediction and the true label and gives you a cost.
3) Choose an optimizer.
4) Put everything together:
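opt = tf.train.AdamOptimizer(learning_rate=0.001)
datafiles = ['train_0000.dat', 'train_0001.dat']
num_epochs = 2
filename_queue = tf.train.string_input_producer(datafiles, num_epochs=num_epochs, shuffle=True)
data, label = read_data(filename_queue)
# capacity and min_after_dequeue are required arguments; the values here are only illustrative.
example_batch, label_batch = tf.train.shuffle_batch(
    [data, label], batch_size=128, capacity=1000, min_after_dequeue=500)
y_pred = model(example_batch)
loss = C(label_batch, y_pred)
train_op = opt.minimize(loss)
After which you start a session, initialize the variables, start the queue runners, and iterate, running the training op to minimize the loss:
with tf.Session() as sess:
    # num_epochs creates a local epoch counter, so local variables must be initialized too.
    sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        while not coord.should_stop():
            sess.run(train_op)
    except tf.errors.OutOfRangeError:
        pass  # epoch limit reached
    finally:
        coord.request_stop()
    coord.join(threads)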
See also tf.train.string_input_producer behavior in a loop for related information.

How to crop a huge batch of images in TensorFlow

I am trying to use the function below to crop a large number of images (hundreds of thousands). I am doing this operation serially, but it is taking a lot of time. What is an efficient way to do this?
tf.image.crop_to_bounding_box
Below is my code:
def crop_images(img_dir, list_images):
    outlist=[]
    with tf.Session() as session:
        for image1 in list_images[:5]:
            image = mpimg.imread(img_dir+image1)
            x = tf.Variable(image, name='x')
            data_t = tf.placeholder(tf.uint8)
            op = tf.image.encode_jpeg(data_t, format='rgb')
            model = tf.global_variables_initializer()
            img_name = "img/"+image1.split("_img_0")[0] + "/img_0"+image1.split("_img_0")[1]
            height = x.shape[1]
            [x1,y1,x2,y2] = img_bbox_dict[img_name]
            x = tf.image.crop_to_bounding_box(x, int(y1), int(x1), int(y2)-int(y1), int(x2)-int(x1))
            session.run(model)
            result = session.run(x)
            data_np = session.run(op, feed_dict={ data_t: result })
            with open(img_path+image1, 'w+') as fd:
                fd.write(data_np)
I'll give a simplified version of one of the examples from Tensorflow's Programmer's guide on reading data which can be found here. Basically, it uses Reader and Filename Queues to batch together image data using a specified number of threads. These threads are coordinated using what is called a thread Coordinator.
import tensorflow as tf
import glob
images_path = "./" #RELATIVE glob pathname of current directory
images_extension = "*.png"
# Save the list of files matching pattern, so it is only computed once.
filenames = tf.train.match_filenames_once(glob.glob(images_path+images_extension))
batch_size = len(glob.glob1(images_path,images_extension))
num_epochs=1
standard_size = [500, 500]
num_channels = 3
min_after_dequeue = 10
num_preprocess_threads = 3
seed = 14131
"""
IMPORTANT: Cropping params. These are arbitrary values used only for this example.
You will have to change them according to your requirements.
"""
crop_size=[200,200]
boxes = [1,1,460,460]
"""
'WholeFileReader' is a Reader whose 'read' method outputs the next
key-value pair of the filename and the contents of the file (the image) from
the Queue, both of which are string scalar Tensors.
Note that the QueueRunner works in a thread separate from the
Reader that pulls filenames from the queue, so the shuffling and enqueuing
process does not block the reader.
'resize_images' is used so that all images are resized to the same
size (Aspect ratios may change, so in that case use resize_image_with_crop_or_pad)
'set_shape' is used because the height and width dimensions of 'image' are
data dependent and cannot be computed without executing this operation. Without
this Op, the 'image' Tensor's shape will have None as Dimensions.
"""
def read_my_file_format(filename_queue, standard_size, num_channels):
    image_reader = tf.WholeFileReader()
    _, image_file = image_reader.read(filename_queue)
    if "jpg" in images_extension:
        image = tf.image.decode_jpeg(image_file)
    elif "png" in images_extension:
        image = tf.image.decode_png(image_file)
    image = tf.image.resize_images(image, standard_size)
    image.set_shape(standard_size+[num_channels])
    print("Successfully read file!")
    return image
"""
'string_input_producer' Enters matched filenames into a 'QueueRunner' FIFO Queue.
'shuffle_batch' creates batches by randomly shuffling tensors. The 'capacity'
argument controls how long the prefetching is allowed to grow the queues.
'min_after_dequeue' defines how big a buffer we will randomly
sample from -- bigger means better shuffling but slower startup & more memory used.
'capacity' must be larger than 'min_after_dequeue' and the amount larger
determines the maximum we will prefetch.
Recommendation: min_after_dequeue + (num_threads + a small safety margin) * batch_size
"""
def input_pipeline(filenames, batch_size, num_epochs, standard_size, num_channels, min_after_dequeue, num_preprocess_threads, seed):
    filename_queue = tf.train.string_input_producer(filenames, num_epochs=num_epochs, shuffle=True)
    example = read_my_file_format(filename_queue, standard_size, num_channels)
    capacity = min_after_dequeue + 3 * batch_size
    example_batch = tf.train.shuffle_batch([example], batch_size=batch_size, capacity=capacity, min_after_dequeue=min_after_dequeue, num_threads=num_preprocess_threads, seed=seed, enqueue_many=False)
    print("Batching Successful!")
    return example_batch
"""
Any transformation on the image batch goes here. Refer the documentation
for the details of how the cropping is done using this function.
"""
def crop_batch(image_batch, batch_size, b_boxes, crop_size):
    cropped_images = tf.image.crop_and_resize(image_batch, boxes=[b_boxes for _ in range(batch_size)], box_ind=[i for i in range(batch_size)], crop_size=crop_size)
    print("Cropping Successful!")
    return cropped_images
example_batch = input_pipeline(filenames, batch_size, num_epochs, standard_size, num_channels, min_after_dequeue, num_preprocess_threads, seed)
cropped_images = crop_batch(example_batch, batch_size, boxes, crop_size)
"""
if 'num_epochs' is not `None`, the 'string_input_producer' function creates local
counter `epochs`. Use `local_variables_initializer()` to initialize local variables.
'Coordinator' class implements a simple mechanism to coordinate the termination
of a set of threads. Any of the threads can call `coord.request_stop()` to ask for all
the threads to stop. To cooperate with the requests, each thread must check for
`coord.should_stop()` on a regular basis.
`coord.should_stop()` returns `True` as soon as `coord.request_stop()` has been called.
A thread can report an exception to the coordinator as part of the `should_stop()`
call. The exception will be re-raised from the `coord.join()` call.
After a thread has called `coord.request_stop()` the other threads have a
fixed time to stop, this is called the 'stop grace period' and defaults to 2 minutes.
If any of the threads is still alive after the grace period expires `coord.join()`
raises a RuntimeError reporting the laggards.
IMPORTANT: 'start_queue_runners' starts threads for all queue runners collected in
the graph, & returns the list of all threads. This must be executed BEFORE running
any other training/inference/operation steps, or it will hang forever.
"""
with tf.Session() as sess:
    _, _ = sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        while not coord.should_stop():
            # Run training steps or whatever
            cropped_images1 = sess.run(cropped_images)
            print(cropped_images1.shape)
    except tf.errors.OutOfRangeError:
        print('Load and Process done -- epoch limit reached')
    finally:
        # When done, ask the threads to stop.
        coord.request_stop()
    coord.join(threads)
    sess.close()
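The comment on 'min_after_dequeue' above says that a bigger buffer means better shuffling. The pure-Python simulation below illustrates that trade-off. It is a toy model written for this note, not TensorFlow's implementation: shuffle_buffer and max_displacement are hypothetical helpers, and the 'capacity' upper bound and multi-threaded enqueuing are not modelled.

```python
import random


def shuffle_buffer(items, min_after_dequeue, seed=0):
    """Yield items in a locally shuffled order, like a bounded shuffle queue.

    The buffer is filled until it holds more than `min_after_dequeue` items;
    only then is one item picked at random and emitted. Once the input is
    exhausted, the remaining items are drained in random order.
    """
    rng = random.Random(seed)
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) > min_after_dequeue:
            yield buffer.pop(rng.randrange(len(buffer)))
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))


def max_displacement(order):
    """Largest number of positions any item jumped ahead of its original slot."""
    return max(original - position for position, original in enumerate(order))


small = list(shuffle_buffer(range(1000), min_after_dequeue=10))
large = list(shuffle_buffer(range(1000), min_after_dequeue=500))
print(max_displacement(small), max_displacement(large))
```

In this model an item can never be emitted more than min_after_dequeue positions early, so a small buffer only mixes neighbouring elements, while a large one mixes across most of the input at the cost of memory and startup time.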

Matrix factorization based recommendation using Tensorflow

I am new to TensorFlow and am exploring recommendation systems built with it. I have looked at a few code samples on GitHub, and most of them look like the following:
https://github.com/songgc/TF-recomm/blob/master/svd_train_val.py
But the question is, how do I pick top recommendation for user U1 in the above code?
If there is any sample code or approach, please share. Thanks
It is a little difficult! Basically, when svd returns, it closes the session, and the tensors lose their values (you still keep the graph). There are a few options:
1) Save the model to a file and restore it later;
2) Don't put the session in a with tf.Session() as sess: .... block, and instead return the session;
3) Do the user processing inside the with ... block
The worst option is option 3: you should train your model separately from using it. The best approach is to save your model and weights somewhere, then restore the session. However, you are still left with the question of how you use this session object once you have recovered it. To demonstrate just that part, I am going to solve this problem using option 3, assuming that you know how to restore a session.
def svd(train, test):
    samples_per_batch = len(train) // BATCH_SIZE
    iter_train = dataio.ShuffleIterator([train["user"],
                                         train["item"],
                                         train["rate"]],
                                        batch_size=BATCH_SIZE)
    iter_test = dataio.OneEpochIterator([test["user"],
                                         test["item"],
                                         test["rate"]],
                                        batch_size=-1)
    user_batch = tf.placeholder(tf.int32, shape=[None], name="id_user")
    item_batch = tf.placeholder(tf.int32, shape=[None], name="id_item")
    rate_batch = tf.placeholder(tf.float32, shape=[None])
    infer, regularizer = ops.inference_svd(user_batch, item_batch, user_num=USER_NUM, item_num=ITEM_NUM, dim=DIM,
                                           device=DEVICE)
    global_step = tf.contrib.framework.get_or_create_global_step()
    _, train_op = ops.optimization(infer, regularizer, rate_batch, learning_rate=0.001, reg=0.05, device=DEVICE)
    init_op = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init_op)
        summary_writer = tf.summary.FileWriter(logdir="/tmp/svd/log", graph=sess.graph)
        print("{} {} {} {}".format("epoch", "train_error", "val_error", "elapsed_time"))
        errors = deque(maxlen=samples_per_batch)
        start = time.time()
        for i in range(EPOCH_MAX * samples_per_batch):
            users, items, rates = next(iter_train)
            _, pred_batch = sess.run([train_op, infer], feed_dict={user_batch: users, item_batch: items, rate_batch: rates})
            pred_batch = clip(pred_batch)
            errors.append(np.power(pred_batch - rates, 2))
            if i % samples_per_batch == 0:
                train_err = np.sqrt(np.mean(errors))
                test_err2 = np.array([])
                for users, items, rates in iter_test:
                    pred_batch = sess.run(infer, feed_dict={user_batch: users, item_batch: items})
                    pred_batch = clip(pred_batch)
                    test_err2 = np.append(test_err2, np.power(pred_batch - rates, 2))
                end = time.time()
                test_err = np.sqrt(np.mean(test_err2))
                print("{:3d} {:f} {:f} {:f}(s)".format(i // samples_per_batch, train_err, test_err, end - start))
                train_err_summary = make_scalar_summary("training_error", train_err)
                test_err_summary = make_scalar_summary("test_error", test_err)
                summary_writer.add_summary(train_err_summary, i)
                summary_writer.add_summary(test_err_summary, i)
                start = end
        # Predict a rating for user #1 for every item in the set, then pick the top rated ones
        userNumber = 1
        user_prediction = sess.run(infer, feed_dict={user_batch: np.array([userNumber]), item_batch: np.array(range(ITEM_NUM))})
        # The index number is the same as the item number. Orders from lowest (least recommended)
        # to largest
        index_rating_order = np.argsort(user_prediction)
        print("Top ten recommended items for user {} are".format(userNumber))
        print(index_rating_order[-10:][::-1])  # at the end, reverse the list
        # If you want to include the score:
        items_to_choose = index_rating_order[-10:][::-1]
        for item, score in zip(items_to_choose, user_prediction[items_to_choose]):
            print("{}: {}".format(item, score))
The only changes I made begin at the first commented line. To emphasize again, best practice would be to train in this function, but to actually make your predictions separately.
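The ranking step itself does not depend on TensorFlow: once you have one predicted score per item, picking the top N is a plain selection problem. Below is a minimal standard-library sketch of that step. The top_n_items helper, the made-up scores and the optional already_rated filter are assumptions for illustration and are not part of the linked repository.

```python
import heapq


def top_n_items(scores, n, already_rated=()):
    """Return [(item_id, score)] for the n highest-scoring items.

    `scores[i]` is the predicted rating of item i. Items listed in
    `already_rated` are skipped (an optional extra, not in the answer above).
    """
    excluded = set(already_rated)
    candidates = ((item, score) for item, score in enumerate(scores)
                  if item not in excluded)
    return heapq.nlargest(n, candidates, key=lambda pair: pair[1])


# Made-up predicted ratings for items 0..5 for a single user.
predicted = [3.1, 4.8, 2.0, 4.9, 1.5, 4.2]
print(top_n_items(predicted, 3))
print(top_n_items(predicted, 3, already_rated=[3]))
```

For a large catalogue, heapq.nlargest avoids sorting every item when only the top few are needed; np.argsort as used above gives the same ordering.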

TF slice_input_producer not keeping tensors in sync

I'm reading images into my TF network, but I also need the associated labels along with them.
So I tried to follow this answer, but the labels that are output don't actually match the images that I'm getting in every batch.
The names of my images are in the format dir/3.jpg, so I just extract the label from the image file name.
truth_filenames_np = ...
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)
# get the labels
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)
# *** This line should make sure both input tensors are synced (from my limited understanding)
# My list is also already shuffled, so I set shuffle=False
truth_image_name, truth_label = tf.train.slice_input_producer([truth_filenames_tf, labels_tf], shuffle=False)
truth_image_value = tf.read_file(truth_image_name)
truth_image = tf.image.decode_jpeg(truth_image_value)
truth_image.set_shape([IMAGE_DIM, IMAGE_DIM, 3])
truth_image = tf.cast(truth_image, tf.float32)
truth_image = truth_image/255.0
# Another key step, where I batch them together
truth_images_batch, truth_label_batch = tf.train.batch([truth_image, truth_label], batch_size=mb_size)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(epochs):
        print "Epoch ", i
        X_truth_batch = truth_images_batch.eval()
        X_label_batch = truth_label_batch.eval()
        # Here I display all the images in this batch, and then I check which file numbers they actually are.
        # BUT, the images that are displayed don't correspond with what is printed by X_label_batch!
        print X_label_batch
        plot_batch(X_truth_batch)
    coord.request_stop()
    coord.join(threads)
Am I doing something wrong, or does the slice_input_producer not actually ensure that its input tensors are synced?
Aside:
I also noticed that when I get a batch from tf.train.batch, the elements in the batch are adjacent to each other in the original list I gave it, but the batch order isn't in the original order.
Example: If my data is ["dir/1.jpg", "dir/2.jpg", "dir/3.jpg", "dir/4.jpg", "dir/5.jpg", "dir/6.jpg"], then I may get the batch (with batch_size=2) ["dir/3.jpg", "dir/4.jpg"], then batch ["dir/1.jpg", "dir/2.jpg"], and then the last one.
So this makes it hard to even just use a FIFO queue for the labels since the order won't match the batch order.
Here is a complete runnable example that reproduces the problem:
import tensorflow as tf
truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)]
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)
# get the labels
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)
# My list is also already shuffled, so I set shuffle=False
truth_image_name, truth_label = tf.train.slice_input_producer(
[truth_filenames_tf, labels_tf], shuffle=False)
# # Another key step, where I batch them together
# truth_images_batch, truth_label_batch = tf.train.batch(
# [truth_image_name, truth_label], batch_size=11)
epochs = 7
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch = truth_image_name.eval()
        X_label_batch = truth_label.eval()
        # Here I display all the images in this batch, and then I check
        # which file numbers they actually are.
        # BUT, the images that are displayed don't correspond with what is
        # printed by X_label_batch!
        print(X_truth_batch)
        print(X_label_batch)
    coord.request_stop()
    coord.join(threads)
What this prints is:
Epoch 0
b'dir/0.jpg'
b'1.jpg'
Epoch 1
b'dir/2.jpg'
b'3.jpg'
Epoch 2
b'dir/4.jpg'
b'5.jpg'
Epoch 3
b'dir/6.jpg'
b'7.jpg'
Epoch 4
b'dir/8.jpg'
b'9.jpg'
Epoch 5
b'dir/10.jpg'
b'11.jpg'
Epoch 6
b'dir/12.jpg'
b'13.jpg'
So basically each eval call runs the operation another time! Adding the batching does not make a difference to that; it just prints batches (the first 11 filenames followed by the next 11 labels, and so on).
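The effect can be reproduced with a toy queue in plain Python. This is only an analogy, assuming that every evaluation triggers its own dequeue as observed in the output above; PairQueue is a made-up stand-in, not a TensorFlow class.

```python
from collections import deque


class PairQueue:
    """Toy stand-in for a queue that holds (filename, label) pairs."""

    def __init__(self, pairs):
        self._pairs = deque(pairs)

    def dequeue(self):
        # Every call removes one whole pair, as each run of a dequeue op would.
        return self._pairs.popleft()


pairs = [("dir/%d.jpg" % j, "%d.jpg" % j) for j in range(6)]

# Two separate evaluations: each one dequeues its own pair, so the filename
# comes from the first pair and the label from the second.
q = PairQueue(pairs)
name = q.dequeue()[0]
label = q.dequeue()[1]
print(name, label)

# One evaluation that fetches both values from the same dequeued pair.
q = PairQueue(pairs)
name_ok, label_ok = q.dequeue()
print(name_ok, label_ok)
```

The first print shows the mismatched pair (dir/0.jpg with 1.jpg), the second the matching one, which is why both tensors have to be fetched in a single run.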
The workaround I see is:
for i in range(epochs):
    print("Epoch ", i)
    pair = tf.convert_to_tensor([truth_image_name, truth_label]).eval()
    print(pair[0])
    print(pair[1])
which correctly prints:
Epoch 0
b'dir/0.jpg'
b'0.jpg'
Epoch 1
b'dir/1.jpg'
b'1.jpg'
# ...
but does nothing for the violation of the principle of least surprise.
EDIT: yet another way of doing it:
import tensorflow as tf
truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)]
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)
truth_image_name, truth_label = tf.train.slice_input_producer(
[truth_filenames_tf, labels_tf], shuffle=False)
epochs = 7
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.train.start_queue_runners(sess=sess)
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch, X_label_batch = sess.run(
            [truth_image_name, truth_label])
        print(X_truth_batch)
        print(X_label_batch)
That's a much better way as tf.convert_to_tensor and co only accept tensors of same type/shape etc.
Note that I removed the coordinator for simplicity, which however results in a warning:
W c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\kernels\queue_base.cc:294] _0_input_producer/input_producer/fraction_of_32_full/fraction_of_32_full: Skipping cancelled enqueue attempt with queue not closed
See this

How to prefetch data using a custom python function in tensorflow

I am trying to prefetch training data to hide I/O latency. I would like to write custom Python code that loads data from disk and preprocesses the data (e.g. by adding a context window). In other words, one thread does data preprocessing and the other does training. Is this possible in TensorFlow?
Update: I have a working example based on @mrry's example.
import numpy as np
import tensorflow as tf
import threading
BATCH_SIZE = 5
TRAINING_ITERS = 4100
feature_input = tf.placeholder(tf.float32, shape=[128])
label_input = tf.placeholder(tf.float32, shape=[128])
q = tf.FIFOQueue(200, [tf.float32, tf.float32], shapes=[[128], [128]])
enqueue_op = q.enqueue([label_input, feature_input])
label_batch, feature_batch = q.dequeue_many(BATCH_SIZE)
c = tf.reshape(feature_batch, [BATCH_SIZE, 128]) + tf.reshape(label_batch, [BATCH_SIZE, 128])
sess = tf.Session()
def load_and_enqueue(sess, enqueue_op, coord):
    with open('dummy_data/features.bin') as feature_file, open('dummy_data/labels.bin') as label_file:
        while not coord.should_stop():
            feature_array = np.fromfile(feature_file, np.float32, 128)
            if feature_array.shape[0] == 0:
                print('reach end of file, reset using seek(0,0)')
                feature_file.seek(0,0)
                label_file.seek(0,0)
                continue
            label_value = np.fromfile(label_file, np.float32, 128)
            sess.run(enqueue_op, feed_dict={feature_input: feature_array,
                                            label_input: label_value})

coord = tf.train.Coordinator()
t = threading.Thread(target=load_and_enqueue, args=(sess,enqueue_op, coord))
t.start()
for i in range(TRAINING_ITERS):
    sum = sess.run(c)
    print('train_iter='+str(i))
    print(sum)
coord.request_stop()
coord.join([t])
This is a common use case, and most implementations use TensorFlow's queues to decouple the preprocessing code from the training code. There is a tutorial on how to use queues, but the main steps are as follows:
1) Define a queue, q, that will buffer the preprocessed data. TensorFlow supports the simple tf.FIFOQueue that produces elements in the order they were enqueued, and the more advanced tf.RandomShuffleQueue that produces elements in a random order. A queue element is a tuple of one or more tensors (which can have different types and shapes). All queues support single-element (enqueue, dequeue) and batch (enqueue_many, dequeue_many) operations, but to use the batch operations you must specify the shapes of each tensor in a queue element when constructing the queue.
2) Build a subgraph that enqueues preprocessed elements into the queue. One way to do this would be to define some tf.placeholder() ops for tensors corresponding to a single input example, then pass them to q.enqueue(). (If your preprocessing produces a batch at once, you should use q.enqueue_many() instead.) You might also include TensorFlow ops in this subgraph.
3) Build a subgraph that performs training. This will look like a regular TensorFlow graph, but will get its input by calling q.dequeue_many(BATCH_SIZE).
4) Start your session.
5) Create one or more threads that execute your preprocessing logic, then execute the enqueue op, feeding in the preprocessed data. You may find the tf.train.Coordinator and tf.train.QueueRunner utility classes useful for this.
6) Run your training graph (optimizer, etc.) as normal.
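The same producer/consumer structure can be sketched with nothing but the Python standard library, which may help in seeing what the queue and the extra thread buy you. This is an analogy under stated assumptions, not TensorFlow code: queue.Queue stands in for the TensorFlow queue, threading.Event for the coordinator's stop request, and the squaring and summing are placeholders for real preprocessing and training.

```python
import queue
import threading


def producer(q, stop_event, num_items):
    """Preprocess items and enqueue them until done or asked to stop."""
    for i in range(num_items):
        if stop_event.is_set():
            return
        preprocessed = i * i  # placeholder for real loading/preprocessing
        q.put(preprocessed)   # blocks while the queue is full
    q.put(None)  # sentinel: no more data


def train(q, batch_size):
    """Dequeue items in batches and 'train' on them (here: just sum them)."""
    results, batch = [], []
    while True:
        item = q.get()  # blocks while the queue is empty
        if item is None:
            break
        batch.append(item)
        if len(batch) == batch_size:
            results.append(sum(batch))
            batch = []
    return results


q = queue.Queue(maxsize=8)      # bounded buffer, like a queue capacity
stop_event = threading.Event()  # plays the role of a coordinator's stop request
t = threading.Thread(target=producer, args=(q, stop_event, 20))
t.start()
batch_sums = train(q, batch_size=5)
stop_event.set()
t.join()
print(batch_sums)
```

The bounded queue is what hides the I/O latency: the producer runs ahead by up to maxsize items while the consumer is busy, and blocks instead of exhausting memory when the consumer falls behind.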
EDIT: Here's a simple load_and_enqueue() function and code fragment to get you started:
# Features are length-100 vectors of floats
feature_input = tf.placeholder(tf.float32, shape=[100])
# Labels are scalar integers.
label_input = tf.placeholder(tf.int32, shape=[])
# Alternatively, could do:
# feature_batch_input = tf.placeholder(tf.float32, shape=[None, 100])
# label_batch_input = tf.placeholder(tf.int32, shape=[None])
q = tf.FIFOQueue(100, [tf.float32, tf.int32], shapes=[[100], []])
enqueue_op = q.enqueue([feature_input, label_input])
# For batch input, do:
# enqueue_op = q.enqueue_many([feature_batch_input, label_batch_input])
feature_batch, label_batch = q.dequeue_many(BATCH_SIZE)
# Build rest of model taking label_batch, feature_batch as input.
# [...]
train_op = ...
sess = tf.Session()
def load_and_enqueue():
    with open(...) as feature_file, open(...) as label_file:
        while True:
            feature_array = numpy.fromfile(feature_file, numpy.float32, 100)
            if feature_array.size == 0:  # end of file reached
                return
            # Read the label from the label file, not the feature file.
            label_value = numpy.fromfile(label_file, numpy.int32, 1)[0]
            sess.run(enqueue_op, feed_dict={feature_input: feature_array,
                                            label_input: label_value})

# Start a thread to enqueue data asynchronously, and hide I/O latency.
t = threading.Thread(target=load_and_enqueue)
t.start()
for _ in range(TRAINING_EPOCHS):
    sess.run(train_op)
In other words, one thread does data preprocessing and the other does training. Is this possible in TensorFlow?
Yes, it is. mrry's solution works, but a simpler one exists.
Fetching data
tf.py_func wraps a python function and uses it as a TensorFlow operator. So we can load the data at sess.run() each time. The problem with this approach is that data is loaded during sess.run() via the main thread.
A minimal example:
def get_numpy_tensor():
    return np.array([[1,2],[3,4]], dtype=np.float32)
tensorflow_tensor = tf.py_func(get_numpy_tensor, [], tf.float32)
A more complex example:
def get_numpy_tensors():
    # Load data from the disk into numpy arrays.
    input = np.array([[1,2],[3,4]], dtype=np.float32)
    target = np.int32(1)
    return input, target
tensorflow_input, tensorflow_target = tf.py_func(get_numpy_tensors, [], [tf.float32, tf.int32])
tensorflow_input, tensorflow_target = 2*tensorflow_input, 2*tensorflow_target
sess = tf.InteractiveSession()
numpy_input, numpy_target = sess.run([tensorflow_input, tensorflow_target])
assert np.all(numpy_input==np.array([[2,4],[6,8]])) and numpy_target==2
Prefetching data in another thread
To queue our data in another thread (so that sess.run() won't have to wait for the data), we can use tf.train.batch() on our operators from tf.py_func().
A minimal example:
tensor_shape = get_numpy_tensor().shape
tensorflow_tensors = tf.train.batch([tensorflow_tensor], batch_size=32, shapes=[tensor_shape])
# Run `tf.train.start_queue_runners()` once session is created.
We can omit the argument shapes if tensorflow_tensor has its shape specified:
tensor_shape = get_numpy_tensor().shape
tensorflow_tensor.set_shape(tensor_shape)
tensorflow_tensors = tf.train.batch([tensorflow_tensor], batch_size=32)
# Run `tf.train.start_queue_runners()` once session is created.
A more complex example:
input_shape, target_shape = (2, 2), ()
def get_numpy_tensors():
    input = np.random.rand(*input_shape).astype(np.float32)
    target = np.random.randint(10, dtype=np.int32)
    print('f', end='')
    return input, target
tensorflow_input, tensorflow_target = tf.py_func(get_numpy_tensors, [], [tf.float32, tf.int32])
batch_size = 2
tensorflow_inputs, tensorflow_targets = tf.train.batch([tensorflow_input, tensorflow_target], batch_size, shapes=[input_shape, target_shape], capacity=2)
# Internal queue will contain at most `capacity=2` times `batch_size=2` elements `[tensorflow_input, tensorflow_target]`.
tensorflow_inputs, tensorflow_targets = 2*tensorflow_inputs, 2*tensorflow_targets
sess = tf.InteractiveSession()
tf.train.start_queue_runners() # Internally, `tf.train.batch` uses a QueueRunner, so we need to ask tf to start it.
for _ in range(10):
    numpy_inputs, numpy_targets = sess.run([tensorflow_inputs, tensorflow_targets])
    assert numpy_inputs.shape==(batch_size, *input_shape) and numpy_targets.shape==(batch_size, *target_shape)
    print('r', end='')
# Prints `fffffrrffrfrffrffrffrffrffrffrf`.
In case get_numpy_tensor() returns a batch of tensors, then tf.train.batch(..., enqueue_many=True) will help.
