How to prefetch data using a custom python function in tensorflow - python

I am trying to prefetch training data to hide I/O latency. I would like to write custom Python code that loads data from disk and preprocesses the data (e.g. by adding a context window). In other words, one thread does data preprocessing and the other does training. Is this possible in TensorFlow?
Update: I have a working example based on @mrry's example.
import numpy as np
import tensorflow as tf
import threading

BATCH_SIZE = 5
TRAINING_ITERS = 4100

feature_input = tf.placeholder(tf.float32, shape=[128])
label_input = tf.placeholder(tf.float32, shape=[128])

q = tf.FIFOQueue(200, [tf.float32, tf.float32], shapes=[[128], [128]])
enqueue_op = q.enqueue([label_input, feature_input])

label_batch, feature_batch = q.dequeue_many(BATCH_SIZE)
c = tf.reshape(feature_batch, [BATCH_SIZE, 128]) + tf.reshape(label_batch, [BATCH_SIZE, 128])

sess = tf.Session()

def load_and_enqueue(sess, enqueue_op, coord):
    # Binary files must be opened in binary mode for np.fromfile.
    with open('dummy_data/features.bin', 'rb') as feature_file, \
         open('dummy_data/labels.bin', 'rb') as label_file:
        while not coord.should_stop():
            feature_array = np.fromfile(feature_file, np.float32, 128)
            if feature_array.shape[0] == 0:
                print('reached end of file, resetting with seek(0, 0)')
                feature_file.seek(0, 0)
                label_file.seek(0, 0)
                continue
            label_value = np.fromfile(label_file, np.float32, 128)
            sess.run(enqueue_op, feed_dict={feature_input: feature_array,
                                            label_input: label_value})

coord = tf.train.Coordinator()
t = threading.Thread(target=load_and_enqueue, args=(sess, enqueue_op, coord))
t.start()

for i in range(TRAINING_ITERS):
    batch_sum = sess.run(c)  # avoid shadowing the built-in `sum`
    print('train_iter=' + str(i))
    print(batch_sum)

coord.request_stop()
coord.join([t])

This is a common use case, and most implementations use TensorFlow's queues to decouple the preprocessing code from the training code. There is a tutorial on how to use queues, but the main steps are as follows:
1) Define a queue, q, that will buffer the preprocessed data. TensorFlow supports the simple tf.FIFOQueue that produces elements in the order they were enqueued, and the more advanced tf.RandomShuffleQueue that produces elements in a random order. A queue element is a tuple of one or more tensors (which can have different types and shapes). All queues support single-element (enqueue, dequeue) and batch (enqueue_many, dequeue_many) operations, but to use the batch operations you must specify the shapes of each tensor in a queue element when constructing the queue.
2) Build a subgraph that enqueues preprocessed elements into the queue. One way to do this would be to define some tf.placeholder() ops for tensors corresponding to a single input example, then pass them to q.enqueue(). (If your preprocessing produces a batch at once, you should use q.enqueue_many() instead.) You might also include TensorFlow ops in this subgraph.
3) Build a subgraph that performs training. This will look like a regular TensorFlow graph, but will get its input by calling q.dequeue_many(BATCH_SIZE).
4) Start your session.
5) Create one or more threads that execute your preprocessing logic, then execute the enqueue op, feeding in the preprocessed data. You may find the tf.train.Coordinator and tf.train.QueueRunner utility classes useful for this.
6) Run your training graph (optimizer, etc.) as normal.
EDIT: Here's a simple load_and_enqueue() function and code fragment to get you started:
# Features are length-100 vectors of floats
feature_input = tf.placeholder(tf.float32, shape=[100])
# Labels are scalar integers.
label_input = tf.placeholder(tf.int32, shape=[])

# Alternatively, could do:
# feature_batch_input = tf.placeholder(tf.float32, shape=[None, 100])
# label_batch_input = tf.placeholder(tf.int32, shape=[None])

q = tf.FIFOQueue(100, [tf.float32, tf.int32], shapes=[[100], []])
enqueue_op = q.enqueue([feature_input, label_input])
# For batch input, do:
# enqueue_op = q.enqueue_many([feature_batch_input, label_batch_input])

feature_batch, label_batch = q.dequeue_many(BATCH_SIZE)
# Build rest of model taking label_batch, feature_batch as input.
# [...]
train_op = ...

sess = tf.Session()

def load_and_enqueue():
    with open(...) as feature_file, open(...) as label_file:
        while True:
            feature_array = np.fromfile(feature_file, np.float32, 100)
            if feature_array.shape[0] == 0:
                return
            label_value = np.fromfile(label_file, np.int32, 1)[0]
            sess.run(enqueue_op, feed_dict={feature_input: feature_array,
                                            label_input: label_value})

# Start a thread to enqueue data asynchronously, and hide I/O latency.
t = threading.Thread(target=load_and_enqueue)
t.start()

for _ in range(TRAINING_EPOCHS):
    sess.run(train_op)
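If your enqueue op does not need a feed_dict (for example, when TensorFlow reader ops produce the data), the tf.train.QueueRunner utility mentioned in step 5 can replace the hand-rolled thread. A hedged sketch of that variant:

qr = tf.train.QueueRunner(q, [enqueue_op] * 2)  # two enqueue threads
tf.train.add_queue_runner(qr)
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
# ... run training as normal ...
coord.request_stop()
coord.join(threads)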

In other words, one thread does data preprocessing and the other does training. Is this possible in TensorFlow?
Yes, it is. mrry's solution works, but a simpler one exists.
Fetching data
tf.py_func wraps a Python function and uses it as a TensorFlow operator, so we can load the data on each sess.run(). The drawback of this approach is that the data is loaded during sess.run() on the main thread.
A minimal example:
def get_numpy_tensor():
    return np.array([[1, 2], [3, 4]], dtype=np.float32)

tensorflow_tensor = tf.py_func(get_numpy_tensor, [], tf.float32)
A more complex example:
def get_numpy_tensors():
    # Load data from the disk into numpy arrays.
    input = np.array([[1, 2], [3, 4]], dtype=np.float32)
    target = np.int32(1)
    return input, target

tensorflow_input, tensorflow_target = tf.py_func(get_numpy_tensors, [], [tf.float32, tf.int32])
tensorflow_input, tensorflow_target = 2*tensorflow_input, 2*tensorflow_target

sess = tf.InteractiveSession()
numpy_input, numpy_target = sess.run([tensorflow_input, tensorflow_target])
assert np.all(numpy_input == np.array([[2, 4], [6, 8]])) and numpy_target == 2
Prefetching data in another thread
To queue our data in another thread (so that sess.run() won't have to wait for the data), we can use tf.train.batch() on our operators from tf.py_func().
A minimal example:
tensor_shape = get_numpy_tensor().shape
tensorflow_tensors = tf.train.batch([tensorflow_tensor], batch_size=32, shapes=[tensor_shape])
# Run `tf.train.start_queue_runners()` once session is created.
We can omit the argument shapes if tensorflow_tensor has its shape specified:
tensor_shape = get_numpy_tensor().shape
tensorflow_tensor.set_shape(tensor_shape)
tensorflow_tensors = tf.train.batch([tensorflow_tensor], batch_size=32)
# Run `tf.train.start_queue_runners()` once session is created.
A more complex example:
input_shape, target_shape = (2, 2), ()

def get_numpy_tensors():
    input = np.random.rand(*input_shape).astype(np.float32)
    target = np.random.randint(10, dtype=np.int32)
    print('f', end='')
    return input, target

tensorflow_input, tensorflow_target = tf.py_func(get_numpy_tensors, [], [tf.float32, tf.int32])

batch_size = 2
tensorflow_inputs, tensorflow_targets = tf.train.batch(
    [tensorflow_input, tensorflow_target], batch_size,
    shapes=[input_shape, target_shape], capacity=2)
# Internal queue will contain at most `capacity=2` times `batch_size=2` elements `[tensorflow_input, tensorflow_target]`.
tensorflow_inputs, tensorflow_targets = 2*tensorflow_inputs, 2*tensorflow_targets

sess = tf.InteractiveSession()
tf.train.start_queue_runners()  # Internally, `tf.train.batch` uses a QueueRunner, so we need to ask tf to start it.
for _ in range(10):
    numpy_inputs, numpy_targets = sess.run([tensorflow_inputs, tensorflow_targets])
    assert numpy_inputs.shape == (batch_size, *input_shape) and numpy_targets.shape == (batch_size, *target_shape)
    print('r', end='')
# Prints `fffffrrffrfrffrffrffrffrffrffrf`.
In case get_numpy_tensors() returns a batch of tensors, tf.train.batch(..., enqueue_many=True) will help, as sketched below.
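For instance, a hedged sketch (get_numpy_batch is a hypothetical loader that returns several examples per call; the shape names reuse those from the example above):

def get_numpy_batch():
    # Hypothetical loader: returns 4 examples per call.
    inputs = np.random.rand(4, *input_shape).astype(np.float32)
    targets = np.random.randint(10, size=4).astype(np.int32)
    return inputs, targets

tf_input_batch, tf_target_batch = tf.py_func(get_numpy_batch, [], [tf.float32, tf.int32])
# With enqueue_many=True, tf.train.batch treats the leading dimension as the
# example dimension and re-batches the stream into batches of `batch_size`.
tensorflow_inputs, tensorflow_targets = tf.train.batch(
    [tf_input_batch, tf_target_batch], batch_size=2,
    shapes=[input_shape, target_shape], enqueue_many=True, capacity=2)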

Related

Efficient example implementation of GPU-training of a simple feed-forward NN in TensorFlow? Maybe with tf.data?

I just started using the GPU version of TensorFlow hoping that it would speed up the training of my feed-forward neural networks. I am able to train on my GPU (GTX1080ti), but unfortunately it is not notably faster than doing the same training on my CPU (i7-8700K) the current way I’ve implemented it. During training, the GPU appears to barely be utilized at all, which makes me suspect that the bottleneck in my implementation is how the data is copied from the host to the device using feed_dict.
I’ve heard that TensorFlow has something called the “tf.data” pipeline which is supposed to make it easier and faster to feed data to GPUs etc. However I have not been able to find any simple examples where this concept is implemented into multilayer perceptron training as a replacement for feed_dict.
Is anyone aware of such an example and can point me to it? Preferably as simple as possible since I’m new to TensorFlow in general. Or is there something else I should change in my current implementation to make it more efficient? I’m pasting the code I have here:
import tensorflow as tf
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
import time

tf.reset_default_graph()

# Function for iris dataset.
def get_iris_data():
    iris = datasets.load_iris()
    data = iris["data"]
    target = iris["target"]
    # Convert to one-hot vectors
    num_labels = len(np.unique(target))
    all_Y = np.eye(num_labels)[target]
    return train_test_split(data, all_Y, test_size=0.33, random_state=89)

# Function which initializes tensorflow weights & biases for feed-forward NN.
def InitWeights(LayerSizes):
    with tf.device('/gpu:0'):
        # Make tf placeholders for network inputs and outputs.
        X = tf.placeholder(shape=(None, LayerSizes[0]),
                           dtype=tf.float32,
                           name='InputData')
        y = tf.placeholder(shape=(None, LayerSizes[-1]),
                           dtype=tf.float32,
                           name='OutputData')
        # Initialize weights and biases.
        W = {}
        b = {}
        for ii in range(len(LayerSizes) - 1):
            layername = f'layer{ii}'
            with tf.variable_scope(layername):
                ny = LayerSizes[ii]
                nx = LayerSizes[ii + 1]
                # Weights (initialized with Xavier initialization).
                W['Weights_' + layername] = tf.get_variable(
                    name='Weights_' + layername,
                    shape=(ny, nx),
                    initializer=tf.contrib.layers.xavier_initializer(),
                    dtype=tf.float32
                )
                # Bias (initialized with Xavier initialization).
                b['Bias_' + layername] = tf.get_variable(
                    name='Bias_' + layername,
                    shape=(nx),
                    initializer=tf.contrib.layers.xavier_initializer(),
                    dtype=tf.float32
                )
    return W, b, X, y

# Function for forward propagation of NN.
def FeedForward(X, W, b):
    with tf.device('/gpu:0'):
        # Initialize 'a' of first layer to the placeholder of the network input.
        a = X
        # Loop over all layers of the network.
        for ii in range(len(W)):
            # Use name of each layer as index.
            layername = f'layer{ii}'
            # Weighted sum: z = input*W + b
            z = tf.add(tf.matmul(a, W['Weights_' + layername],
                                 name='WeightedSum_z_' + layername),
                       b['Bias_' + layername])
            # Pass through activation fcn: a = h(z); the output layer stays linear.
            if ii == len(W) - 1:
                a = z
            else:
                a = tf.nn.relu(z, name='activation_a_' + layername)
    return a

if __name__ == "__main__":
    # Import data
    train_X, test_X, train_y, test_y = get_iris_data()
    # Define network size [ninputs-by-256-by-noutputs]
    LayerSizes = [4, 256, 3]
    # Initialize weights and biases.
    W, b, X, y = InitWeights(LayerSizes)
    # Define loss function to optimize.
    yhat = FeedForward(X, W, b)
    loss = tf.reduce_sum(tf.square(y - yhat), reduction_indices=[0])
    # Define optimizer to use when minimizing loss function.
    all_variables = tf.trainable_variables()
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0001)
    train_op = optimizer.minimize(loss, var_list=all_variables)
    # Start tf session and initialize variables.
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())
    # Train 10000 minibatches and time how long it takes.
    t0 = time.time()
    for i in range(10000):
        ObservationsToUse = np.random.choice(len(train_X), 32)
        X_minibatch = train_X[ObservationsToUse, :]
        y_minibatch = train_y[ObservationsToUse, :]
        sess.run(train_op, feed_dict={X: X_minibatch, y: y_minibatch})
    t1 = time.time()
    print('Training took %0.2f seconds' % (t1 - t0))
    sess.close()
The speed might be low because:
You are creating placeholders, and with numpy you insert the data into them on every step, so each minibatch is copied into the graph through feed_dict before it can be used.
By using tf.data.Dataset, you can create a direct pipeline that makes the data flow into the graph without the need for placeholders. Datasets are fast, scalable, and have a number of functions to play around with.
# `np.load` returns a context-managed archive for `.npz` files
# (a plain `.npy` file would load directly as a single array).
with np.load("/var/data/training_data.npz") as data:
    features = data["features"]
    labels = data["labels"]

# Assume that each row of `features` corresponds to the same row as `labels`.
assert features.shape[0] == labels.shape[0]

dataset = tf.data.Dataset.from_tensor_slices((features, labels))
Some useful functions:
dataset = dataset.shuffle(buffer_size=10000)  # shuffle with a 10000-element buffer
dataset = dataset.batch(32)                   # create batches
dataset = dataset.repeat(num_epochs)          # repeat the dataset 'N' times
iterator = dataset.make_one_shot_iterator()   # create an iterator to retrieve batches of data
X, Y = iterator.get_next()
Here, 32 is the batch size.
In your case:
dataset = tf.data.Dataset.from_tensor_slices((data, targets))
Hence there is no need for placeholders. Directly run:
session.run(train_op)  # no feed_dict!
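A fuller sketch (hedged: it reuses InitWeights and FeedForward from the question and simply ignores the placeholders they return) of how the pipeline could replace the feed_dict loop:

dataset = tf.data.Dataset.from_tensor_slices(
    (train_X.astype(np.float32), train_y.astype(np.float32)))
dataset = dataset.shuffle(buffer_size=len(train_X)).batch(32).repeat()
X, y = dataset.make_one_shot_iterator().get_next()

W, b, _, _ = InitWeights(LayerSizes)  # the returned placeholders go unused
yhat = FeedForward(X, W, b)           # the pipeline feeds the model directly
loss = tf.reduce_sum(tf.square(y - yhat))
train_op = tf.train.GradientDescentOptimizer(0.0001).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(10000):
        sess.run(train_op)  # no feed_dict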

Tensorflow and reading binary data properly

I am trying to properly read in my own binary data to Tensorflow based on Fixed length records section of this tutorial, and by looking at the read_cifar10 function here. Mind you I am new to tensorflow, so my understanding may be off.
My Data
My files are binary with float32 type. The first 32 bit sample is the label, and the remaining 256 samples are the data. I want to reshape the data at the end to a [2, 128] matrix.
My Code So far:
import tensorflow as tf
import os

def read_data(filename_queue):
    item_type = tf.float32
    label_items = 1
    data_items = 256
    label_bytes = label_items * item_type.size
    data_bytes = data_items * item_type.size
    record_bytes = label_bytes + data_bytes
    reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
    key, value = reader.read(filename_queue)
    record_data = tf.decode_raw(value, item_type)
    # labels = tf.cast(tf.strided_slice(record_data, [0], [label_items]), tf.int32)
    label = tf.strided_slice(record_data, [0], [label_items])
    data0 = tf.strided_slice(record_data, [label_items], [label_items + data_items])
    # Use integer division so the shape element is an int, not a float.
    data = tf.reshape(data0, [2, data_items // 2])
    return data, label
if __name__ == '__main__':
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # Set GPU device
    datafiles = ['train_0000.dat', 'train_0001.dat']
    num_epochs = 2
    filename_queue = tf.train.string_input_producer(datafiles, num_epochs=num_epochs, shuffle=True)
    data, label = read_data(filename_queue)
    with tf.Session() as sess:
        init = tf.global_variables_initializer()
        sess.run(init)
        (x, y) = read_data(filename_queue)
        print(y.eval())
This code hangs at print(y.eval()), but I fear I have much bigger issues than that.
Question:
When I execute this, I get a data and label tensor returned. The problem is I don't quite understand how to actually read the data from the tensor. For example, I understand the autoencoder example here; however, it has a mnist.train.next_batch(batch_size) function that is called to read the next batch. Do I need to write that for my function, or is it handled by something internal to my read_data() function? If I need to write that function, what does it look like?
Are there any other obvious things I'm missing? My goal in using this method is to reduce I/O overhead and not store all of the data in memory, since my files are quite large.
Thanks in advance.
Yes. You are pretty much done. At this point you need to:
1) Write your neural network `model`, which is supposed to take your data and return a label.
2) Write your cost function C, which takes the network prediction and the true label and gives you a cost.
3) Choose an optimizer.
4) Put everything together:
opt = tf.train.AdamOptimizer(learning_rate=0.001)
datafiles = ['train_0000.dat', 'train_0001.dat']
num_epochs = 2

with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    filename_queue = tf.train.string_input_producer(datafiles, num_epochs=num_epochs, shuffle=True)
    data, label = read_data(filename_queue)
    example_batch, label_batch = tf.train.shuffle_batch(
        [data, label], batch_size=128, capacity=2000, min_after_dequeue=1000)
    # Feed the batched tensors, not the single-example ones, into the model.
    y_pred = model(example_batch)
    loss = C(label_batch, y_pred)
After which you create the training op and minimize the loss in a loop:
train_op = opt.minimize(loss)
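One caveat worth spelling out (an assumption based on the reader/queue pipeline used here, and it matches the hang described in the question): FixedLengthRecordReader and string_input_producer are driven by queue runners, so nothing flows until they are started, and num_epochs creates local variables that also need initializing. A hedged sketch of the training loop, continuing inside the with block above:

    sess.run(tf.local_variables_initializer())  # required because of num_epochs
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        while not coord.should_stop():
            sess.run(train_op)
    except tf.errors.OutOfRangeError:
        pass  # the input queue is exhausted after num_epochs
    finally:
        coord.request_stop()
        coord.join(threads)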
See also tf.train.string_input_producer behavior in a loop for related information.

Restoring queue state in Tensorflow from checkpoint

Context: I am training a model using an Estimator. Without extraneous details, I am using queues to read in a series of input images, which are batched and manipulated using an input function I call "read_pics_batch":
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    keypoint_regression.fit(
        input_fn = lambda: inp_mod.read_pics_batch(names_train,
            joint_annopoints_train, num_in_batch, max_num_epochs, 'TRAIN'),
        steps = max_steps_per_epoch * max_num_epochs,  # max number of steps
        monitors = [logging_hook])
    coord.request_stop()
    coord.join(threads)
The input function has the following form, where I am also randomising the input file order:
def read_pics_batch(names_list, joint_list, batch_size, max_num_epochs, task):
    names_tensor = tf.convert_to_tensor(names_list, dtype=tf.string)
    joint_total_tensor = tf.convert_to_tensor(joint_list, dtype=tf.int32)
    min_after_dequeue = 100
    capacity = min_after_dequeue + 3 * batch_size
    file_pattern = [("...")]
    examples = graph_io.read_keyed_batch_examples(file_pattern, batch_size,
        reader = tf.WholeFileReader, randomize_input = True,
        parse_fn = example_to_standard_pic,
        num_epochs = max_num_epochs, queue_capacity = capacity)
My questions are as follows:
1) Is there any way to restore the queue state from a checkpoint, like any other variable? If "randomize_input" from read_keyed_batch_examples were set to False, then at each restart of the training op I would read the same input files over and over, which is clearly not what I want.
2) If randomize_input = True, how exactly does the queue decide which files to enqueue? I see two possible options and I am unsure which is correct:
- it selects a short-list of size "capacity" (from the full list given by all the filenames matching "file_pattern") and then randomizes the order of the names in this short-list
- it randomizes the names in the full list first, and then creates a short-list of size "capacity" out of this
If the second case applies, I don't believe that I would actually need to restore the queue state, since I would in principle read different files every time, but if the first case applies, I would still be reading the same few files over and over, just in a different order.
Thank you for your time!

How to change dimension of input during TensorFlow import_graph_def

My scenario:
Define an RNN model structure and train it using an input with fixed batch size and sequence length.
Freeze the model (i.e. convert all trainable variables into constants), producing a GraphDef containing everything one needs to use the model at test time (via tf.graph_util.convert_variables_to_constants).
Import the GraphDef via tf.import_graph_def and replace the input using the input_map argument. The new input needs to have arbitrary batch size and sequence length.
The problem: All of the above works until I pass in an input to the test-time graph that uses a batch size or sequence length that differs from the original sizes used at training-time. At that point I get an error like this:
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,5] vs. shape[1] = [2,7]
[[Node: import/rnn/while/basic_rnn_cell/basic_rnn_cell_1/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](import/rnn/while/TensorArrayReadV3, import/rnn/while/Identity_2, import/rnn/while/basic_rnn_cell/basic_rnn_cell_1/concat/axis)]]
To illustrate and reproduce the problem, please consider the following minimal examples.
v1: a graph is created with arbitrary batch size and sequence length. This works fine but unfortunately I must use a fixed batch size and sequence length at training-time and must use an arbitrary batch size and sequence length at test-time so I can't use this simple approach.
v2a: we simulate creating the training-time graph with fixed batch size (2) and sequence length (3) and freeze the graph.
v2ba: we demonstrate that loading the frozen model back in unchanged still produces the same results.
v2bb: we demonstrate that loading the frozen model in with a replaced input that still uses a fixed batch size and sequence length still produces the same results.
v2bc: we demonstrate that loading the frozen model in with a replaced input that allows arbitrary batch size and sequence length still produces the same results, as long as the input is shaped according to the original batch size and sequence length. It works with data1 but fails with data2 -- the only difference being that the batch size of the former is 2 and the batch size of the latter is 1.
Is it possible to change an RNN graph via the input_map argument to tf.import_graph_def such that the input no longer has a fixed batch size and sequence length?
The following code works with TensorFlow 1.1 RC2 and may work with TensorFlow 1.0.
import numpy
import tensorflow as tf
from tensorflow import graph_util as tf_graph_util
from tensorflow.contrib import rnn as tfc_rnn

def v1(data):
    with tf.Graph().as_default():
        tf.set_random_seed(1)
        x = tf.placeholder(tf.float32, shape=(None, None, 5))
        _, s = tf.nn.dynamic_rnn(tfc_rnn.BasicRNNCell(7), x, dtype=tf.float32)
        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            print(session.run(s, feed_dict={x: data}))

def v2a():
    with tf.Graph().as_default():
        tf.set_random_seed(1)
        x = tf.placeholder(tf.float32, shape=(2, 3, 5), name="x")
        _, s = tf.nn.dynamic_rnn(tfc_rnn.BasicRNNCell(7), x, dtype=tf.float32)
        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            return tf_graph_util.convert_variables_to_constants(
                session, session.graph_def, [s.op.name]), s.name

def v2ba(model, data):
    graph_def, s_name = model
    with tf.Graph().as_default():
        x, s = tf.import_graph_def(graph_def,
                                   return_elements=["x:0", s_name])
        with tf.Session() as session:
            print('2ba', session.run(s, feed_dict={x: data}))

def v2bb(model, data):
    graph_def, s_name = model
    with tf.Graph().as_default():
        x = tf.placeholder(tf.float32, shape=(2, 3, 5))
        [s] = tf.import_graph_def(graph_def, input_map={"x:0": x},
                                  return_elements=[s_name])
        with tf.Session() as session:
            print('2bb', session.run(s, feed_dict={x: data}))

def v2bc(model, data):
    graph_def, s_name = model
    with tf.Graph().as_default():
        x = tf.placeholder(tf.float32, shape=(None, None, 5))
        [s] = tf.import_graph_def(graph_def, input_map={"x:0": x},
                                  return_elements=[s_name])
        with tf.Session() as session:
            print('2bc', session.run(s, feed_dict={x: data}))

def main():
    data1 = numpy.random.random_sample((2, 3, 5))
    data2 = numpy.random.random_sample((1, 3, 5))
    v1(data1)
    model = v2a()
    v2ba(model, data1)
    v2bb(model, data1)
    v2bc(model, data1)
    v2bc(model, data2)

if __name__ == "__main__":
    main()
This is a long-standing bug in TensorFlow: you cannot reliably replace a placeholder with a defined shape by another one with a (partially) undefined shape.
You will find a related issue filed here, which apparently has not received much attention.

Training huge amounts of data with tensorflow

I have about 60 thousand samples of size 200x870; they are all numpy arrays, and I want to build a four-dimensional tensor out of them (with one singleton dimension) and train a CNN on them in tensorflow. Up to this point, I was using data that I could just load into memory and create batches from as below:
with tf.Graph().as_default():
    data_train = tf.to_float(getInput.data_train)
    phase, lr = tf.placeholder(tf.bool), tf.placeholder(tf.float32)
    global_step = tf.Variable(0, trainable=False)
    image_train, label_train = tf.train.slice_input_producer([data_train, labels_train], num_epochs=args.num_epochs)
    images_train, batch_labels_train = tf.train.batch([image_train, label_train], batch_size=args.bsize)
Can someone suggest a way to work around this?
I wanted to split the dataset into subsets, train on one after the other within an epoch, and use a Queue for the paths of these files:
import scipy.io as sc
import numpy as np
import threading
import time
import tensorflow as tf
from tensorflow.python.client import timeline

def testQueues():
    paths = ['data1', 'data2', 'data3', 'data4', 'data5']
    queue_capacity = 6
    bsize = 10
    num_epochs = 2

    filename_queue = tf.FIFOQueue(
        #min_after_dequeue=0,
        capacity=queue_capacity,
        dtypes=tf.string,
        shapes=[[]]
    )
    filenames_placeholder = tf.placeholder(dtype='string', shape=(None))
    filenames_enqueue_op = filename_queue.enqueue_many(filenames_placeholder)
    data_train, phase = tf.placeholder(tf.float32), tf.placeholder(tf.bool)

    sess = tf.Session()
    sess.run(filenames_enqueue_op, feed_dict={filenames_placeholder: paths})
    for i in range(len(paths)):
        train_set_batch_name = sess.run(filename_queue.dequeue())
        train_set_batch_name = train_set_batch_name.decode('utf-8')
        train_set_batch = np.load(train_set_batch_name + '.npy')
        train_set_batch = tf.cast(train_set_batch, tf.float32)
        init_op = tf.group(tf.initialize_all_variables(), tf.initialize_local_variables())
        sess.run(init_op)
        run_one_epoch(train_set_batch, sess)
        size = sess.run(filename_queue.size())
        print(size)
        print(train_set_batch)

def run_one_epoch(train_set, sess):
    image_train = tf.train.slice_input_producer([train_set], num_epochs=1)
    images_train = tf.train.batch(image_train, batch_size=10)
    x = tf.nn.relu(images_train)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        while not coord.should_stop():
            sess.run(x)
    except tf.errors.OutOfRangeError:
        pass
    finally:
        # When done, ask the threads to stop.
        coord.request_stop()
        coord.join(threads)

testQueues()
However, I get an error:
FailedPreconditionError: Attempting to use uninitialized value input_producer/input_producer/fraction_of_32_full/limit_epochs/epochs
[[Node: input_producer/input_producer/fraction_of_32_full/limit_epochs/CountUpTo = CountUpTo[T=DT_INT64, _class=["loc:#input_producer/input_producer/fraction_of_32_full/limit_epochs/epochs"], limit=1, _device="/job:localhost/replica:0/task:0/cpu:0"](input_producer/input_producer/fraction_of_32_full/limit_epochs/epochs)]]
Also, it seems I can't feed the dictionary with a tf.Tensor, only with a numpy array, and casting it to a tf.Tensor later is also troublesome.
Have a look at the Dataset API.
"The tf.data API enables you to build complex input pipelines from simple, reusable pieces."
In this approach you model your graph so that it handles the data for you, pulling in a limited amount at a time for you to train your model on.
If the memory issue still persists, you might want to look into using a generator to create your tf.data.Dataset, as sketched below. Your next step could be to speed up the process further by preparing TFRecords to create your Dataset.
Follow all the links to learn more and feel free to comment if you don't understand something.
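A minimal sketch of the generator route (hedged: the file names are placeholders, and the 200x870 shape comes from the question):

def npy_shard_generator():
    # Yield one 200x870 sample at a time; only one shard is in memory at once.
    for path in ['data1.npy', 'data2.npy', 'data3.npy']:
        for sample in np.load(path):
            yield sample.astype(np.float32)

dataset = tf.data.Dataset.from_generator(
    npy_shard_generator,
    output_types=tf.float32,
    output_shapes=tf.TensorShape([200, 870]))
dataset = dataset.batch(10).prefetch(1)  # overlap loading with training
batch = dataset.make_one_shot_iterator().get_next()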
For data that doesn't fit into memory, the standard solution is to use queues. You can set up ops that read from files directly (CSV files, image files) and feed them into TensorFlow -- https://www.tensorflow.org/versions/r0.11/how_tos/reading_data/index.html
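A minimal sketch of that approach for CSV input (hedged: the filenames and the two-column layout are assumptions):

filename_queue = tf.train.string_input_producer(['data1.csv', 'data2.csv'])
reader = tf.TextLineReader()
_, line = reader.read(filename_queue)
# One float feature and one integer label per line.
feature, label = tf.decode_csv(line, record_defaults=[[0.0], [0]])
feature_batch, label_batch = tf.train.shuffle_batch(
    [feature, label], batch_size=32, capacity=2000, min_after_dequeue=1000)
# Remember to call tf.train.start_queue_runners() once the session is created.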
