Tensorflow and threading - python

Below is the simple MNIST tutorial (i.e. the single-layer softmax model) from the TensorFlow website, which I tried to extend with a multi-threaded training step:
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import threading
# Training loop executed in each thread
def training_func():
    while True:
        batch = mnist.train.next_batch(100)
        global_step_val, _ = sess.run([global_step, train_step], feed_dict={x: batch[0], y_: batch[1]})
        print("global step: %d" % global_step_val)
        if global_step_val >= 4000:
            break
# create session and graph
sess = tf.Session()
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
global_step = tf.Variable(0, name="global_step")
y = tf.matmul(x,W) + b
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y, y_))
inc = global_step.assign_add(1)
with tf.control_dependencies([inc]):
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# initialize graph and create mnist loader
sess.run(tf.global_variables_initializer())
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# create workers and execute threads
workers = []
for _ in range(8):
    t = threading.Thread(target=training_func)
    t.start()
    workers.append(t)
for t in workers:
    t.join()
# evaluate accuracy of the model
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels},
                    session=sess))
I must be missing something, because with 8 threads as above the results are inconsistent (accuracy approximately 0.1), whereas with a single thread I get the expected accuracy (approximately 0.92). Does anybody have a clue about my mistake(s)? Thanks!

Note that unfortunately, threading in Python doesn't create real parallelism because of the GIL. So what happens here is that you have multiple threads that all run on the same CPU core and, in reality, run sequentially. Therefore, I would suggest using the Coordinator in TensorFlow. More information about the Coordinator can be found here:
https://www.tensorflow.org/programmers_guide/threading_and_queues
https://www.tensorflow.org/programmers_guide/reading_data
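For reference, here is a minimal sketch of how tf.train.Coordinator is typically combined with Python threads; the names training_func, mnist, sess, global_step, train_step, x and y_ are taken from the question, and the 4000-step limit is kept only as an illustration:
coord = tf.train.Coordinator()
def training_func(coord):
    # Keep training until some thread requests a stop or we hit the step limit.
    while not coord.should_stop():
        batch = mnist.train.next_batch(100)
        step, _ = sess.run([global_step, train_step], feed_dict={x: batch[0], y_: batch[1]})
        if step >= 4000:
            coord.request_stop()
workers = [threading.Thread(target=training_func, args=(coord,)) for _ in range(8)]
for t in workers:
    t.start()
coord.join(workers)  # blocks until all threads have stopped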
Finally, I would suggest you write:
with tf.device('/cpu:0'):
    # your code for the first thread goes here...
Then use another CPU device for the other thread, and so on...
Hope this helps!

Related

Understanding next step after saving TensorFlow model

I have a simple MNIST model which I've successfully saved; the code is the following:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
import tensorflow as tf
sess = tf.InteractiveSession()
tf_save_file = './mnist-to-save-saved'
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
saver = tf.train.Saver()
sess.run(tf.global_variables_initializer())
y = tf.matmul(x, W) + b
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels = y_, logits = y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
saver.save(sess, tf_save_file)
for _ in range(1000):
    batch = mnist.train.next_batch(100)
    train_step.run(feed_dict={x: batch[0], y_: batch[1]})
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
saver.save(sess, tf_save_file, global_step=1000)
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
Then, the following files are generated:
checkpoint
mnist-to-save-saved-1000.data-00000-of-00001
mnist-to-save-saved-1000.index
mnist-to-save-saved-1000.meta
mnist-to-save-saved.data-00000-of-00001
mnist-to-save-saved.index
mnist-to-save-saved.meta
Now, in order to use it in production (for example, to pass it a digit image), I want to be able to execute the trained model by passing it any digit image and getting a prediction. I mean, not deploying a server yet, but making this prediction "locally", with that "fixed" digit image in the same directory, so that using the model is like running an executable.
But, considering the (mid-low?) API level of my code, I'm confused about what the easiest correct next step would be (restoring the checkpoint, using an Estimator, etc.), and how to do it.
Although I've read the official documentation, there seem to be many ways to do this, and some are a bit complex and "noisy" for a simple model like this.
Edit:
I've edited and re-run the MNIST file; the code is the same as above except for these lines:
...
x = tf.placeholder(tf.float32, shape=[None, 784], name='input')
...
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1), name='result')
...
Then I try to run this other .py script (in the same directory as the code above) in order to pass it a local handwritten digit image ("mnist-input-image.png") located in the same directory:
import tensorflow as tf
from PIL import Image
import numpy as np
image_test = Image.open("mnist-input-image.png")
image = np.array(image_test)
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('/Users/username/.meta')
    new = saver.restore(sess, tf.train.latest_checkpoint('/Users/username/'))
    graph = tf.get_default_graph()
    input_x = graph.get_tensor_by_name("input:0")
    result = graph.get_tensor_by_name("result:0")
    feed_dict = {input_x: image}
    predictions = result.eval(feed_dict=feed_dict)
    print(predictions)
Now, if I understand correctly, I have to pass the image as a NumPy array. My questions are:
1) Which exact files do these lines refer to (since I have no .meta file in my user folder)?
saver = tf.train.import_meta_graph('/Users/username/.meta')
new = saver.restore(sess, tf.train.latest_checkpoint('/Users/username/'))
I mean, which exact files (from the list of generated files above) do those lines refer to?
2) Translated to my case, is this line correct for passing my NumPy array into the feed dict?
feed_dict = {input_x: image}
A simple solution is to use your session object. When you have generated the checkpoint file, you can restore it with a Saver object.
By the way, do you know why most tutorials create their graph inside a function? One good reason is that you can then deserialize the graph quickly with your own inputs.
The correct way to start a session is the following:
# Use your placeholders, variables, etc to create the entire graph.
# Usually you return the input placeholder,
# prediction and the loss/accuracy here.
# You don't need the accuracy.
x, y, _ = make_your_graph(test_X, test_y)
# This object is the interface for serialization in tf
saver = tf.train.Saver()
with tf.Session() as sess:
    # Takes your current model's checkpoint. "./checkpoint" is your checkpoint file.
    saver.restore(sess, tf.train.latest_checkpoint("./checkpoint"))
    prediction = sess.run(y)
Want to run more than 1 data point for your already-booted up session?
Then replace the last line with a feed dict:
while waiting_for_new_y():
    another_y = get_new_y()
    feed_dict = {x: [another_y]}
    another_prediction = sess.run(y, feed_dict)
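Note that make_your_graph in the snippet above is a hypothetical helper, not a TensorFlow function; for the softmax model from the question, a minimal version of it might look roughly like this:
def make_your_graph(test_X=None, test_y=None):
    # Rebuild the same graph structure that was trained and saved.
    x = tf.placeholder(tf.float32, shape=[None, 784])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    y = tf.matmul(x, W) + b
    return x, y, None  # input placeholder, prediction, loss (omitted here)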
First of all, give a value to the name parameter of each op you want to use later, so that you can retrieve it by its name:
Change this:
x = tf.placeholder(tf.float32, shape=[None, 784])
to
x = tf.placeholder(tf.float32, shape=[None, 784],name='input')
and
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
to
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1),name='result')
Now run this small script to restore the model and use it:
import tensorflow as tf
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('/Users/dummy/.meta')
    new = saver.restore(sess, tf.train.latest_checkpoint('/Users/dummy/'))
    graph = tf.get_default_graph()
    input_x = graph.get_tensor_by_name("input:0")
    result = graph.get_tensor_by_name("result:0")
    feed_dict = {input_x: mnist.test.images}  # feed your new data here; as an example I am feeding the MNIST test images
    predictions = result.eval(feed_dict=feed_dict)
    print(predictions)
And you will get output.
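One extra detail for the question's own image: the restored input placeholder expects a batch of flattened 784-value float vectors, so a single PNG usually needs converting and reshaping first. A minimal sketch (the file name comes from the question; the greyscale conversion, 28x28 resize and /255.0 scaling are assumptions about the input image):
import numpy as np
from PIL import Image

# Load, convert to greyscale, and resize to the 28x28 format MNIST uses.
img = Image.open("mnist-input-image.png").convert("L").resize((28, 28))
# Flatten to shape [1, 784] and scale pixel values to [0, 1], matching mnist.train.images.
image = np.asarray(img, dtype=np.float32).reshape(1, 784) / 255.0
# `image` can then be fed as feed_dict = {input_x: image}.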

tensorflow mnist example with my own get_next_minibatch

I just started using TensorFlow and I followed the tutorial example on the MNIST dataset. It went well; I got around 90% accuracy.
But after I replaced next_batch with my own version, the result was much worse than before, usually around 50%.
Instead of using the data TensorFlow downloads and parses, I download the dataset from this website and use NumPy to get what I want.
df = pd.read_csv('mnist_train.csv', header=None)
X = df.drop(0,1)
Y = df[0]
temp = np.zeros((Y.size, Y.max()+1))
temp[np.arange(Y.size),Y] = 1
np.save('X',X)
np.save('Y',temp)
I do the same thing for the test data; then, following the tutorial, nothing is changed:
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
X = np.load('X.npy')
Y = np.load('Y.npy')
X_test = np.load('X_test.npy')
Y_test = np.load('Y_test.npy')
BATCHES = 1000
W = tf.Variable(tf.truncated_normal([784,10], stddev=0.1))
# W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
Right here is my own get_mini_batch: I shuffle the original data's indices, then each time take 100 examples out of it, which seems to be exactly what the example code does. The only difference is that I throw away some of the data at the tail.
pos = 0
idx = np.arange(X.shape[0])
np.random.shuffle(idx)
for _ in range(1000):
    batch_xs, batch_ys = X[idx[range(pos,pos+BATCHES)],:], Y[idx[range(pos,pos+BATCHES)],]
    if pos+BATCHES >= X.shape[0]:
        pos = 0
        idx = np.arange(X.shape[0])
        np.random.shuffle(idx)
    pos += BATCHES
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
print(sess.run(accuracy, feed_dict={x: X_test, y_: Y_test}))
It confuses me why my version is way worse than the tutorial one.
Like lejilot said, we should normalize the data before we push it into the neural network.
See this post
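A minimal sketch of what that normalization could look like for the CSV-derived arrays above (the /255.0 factor assumes the raw pixel values are in the 0-255 range, as in the original MNIST CSV):
X = np.load('X.npy').astype(np.float32) / 255.0         # scale pixels to [0, 1]
X_test = np.load('X_test.npy').astype(np.float32) / 255.0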

Tensorflow: Print value when Session running

First of all, I'm very new to both Python and TensorFlow.
I'm trying on demo of link: https://www.tensorflow.org/get_started/mnist/beginners
and it runs well.
However, I would like to debug (or log) the values of some placeholders and variables that change when I run Session.run().
Could you please show me how to "debug" or log them while the session is running in the loop?
Here is my code
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("mnist/", one_hot=True)
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y1 = tf.add(tf.matmul(x,W),b)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
cross_entropy1 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y1, y_))
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(cross_entropy1)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.argmax(y,1), feed_dict={x: mnist.test.images, y_: mnist.test.labels})
In this script, I would like to log the value of y and tf.argmax(y, 1) for each test image processed.
mrry answered it best in this Stack Overflow answer: https://stackoverflow.com/a/33633839/6487788
Exactly what you are asking (printing during sess.run) would be this part of his answer:
To print the value of a tensor without returning it to your Python program, you can use the tf.Print() op, as And suggests in another answer. Note that you still need to run part of the graph to see the output of this op, which is printed to standard output. If you're running distributed TensorFlow, the tf.Print() op will print its output to the standard output of the task where that op runs.
For argmax, this would be the following code (note that tf.Print takes the tensor to pass through plus a list of tensors to print):
argmaxy = tf.Print(tf.argmax(y,1), [tf.argmax(y,1)])
correct_prediction = tf.equal(argmaxy, tf.argmax(y_,1))
Good luck!
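Remember that tf.Print only emits output when the op it wraps is actually executed; a minimal sketch of triggering it during evaluation (names taken from the question's code):
# Running correct_prediction (which depends on argmaxy) executes the tf.Print op
# and writes the argmax values to standard output.
sess.run(correct_prediction, feed_dict={x: mnist.test.images, y_: mnist.test.labels})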
While #rmeerten's answer is correct, you can consider also using TensorBoard which can be a useful tool for debugging your models and seeing what's happening. For background, you can also check out the TensorBoard session from the TensorFlow Dev Summit.
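As a rough illustration of the TensorBoard route (the summary names and the ./logs directory are arbitrary choices, and x, y_, y, accuracy, sess and train_step come from the question's code), logging values could look something like this:
tf.summary.histogram("y", y)             # distribution of the predictions
tf.summary.scalar("accuracy", accuracy)  # scalar curve over training steps
merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs", sess.graph)
for step in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    summary, _ = sess.run([merged, train_step], feed_dict={x: batch_xs, y_: batch_ys})
    writer.add_summary(summary, step)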

Tensorflow MNIST (Weight and bias variables)

I'm learning how to use TensorFlow with the MNIST tutorial, but I'm stuck on one point of the tutorial.
Here is the code provided:
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
saver = tf.train.Saver()
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
But I actually don't understand at all how the variables W (the weights) and b (the bias) are changed during the computation.
They are initialized to zero, but what happens to them after that?
I don't see anywhere in the code where they are changed.
Thank you very much in advance!
TensorFlow variables maintain their state from one run() call to the next. In your program they will be initialized to zero, and then progressively updated in the training loop.
The code that changes the values of the variables is created, implicitly, by this line:
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
In TensorFlow, a tf.train.Optimizer is a class that creates operations for updating variables, typically based on the gradient of some tensor (e.g. a loss) with respect to those variables. By default, when you
call Optimizer.minimize(), TensorFlow creates operations to update all variables on which the given tensor (in this case cross_entropy) depends.
When you call sess.run(train_step), this runs a graph that includes those update operations, and therefore instructs TensorFlow to update the values of the variables.
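As a rough sketch of what minimize() builds under the hood, it is essentially equivalent to the documented two-step form below (variable names taken from the question):
optimizer = tf.train.GradientDescentOptimizer(0.5)
# minimize(cross_entropy) is roughly equivalent to:
grads_and_vars = optimizer.compute_gradients(cross_entropy)  # gradients of the loss w.r.t. W and b
train_step = optimizer.apply_gradients(grads_and_vars)       # ops that update W and b using those gradients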

How does one use the official Batch Normalization layer in TensorFlow?

I was trying to use batch normalization to train my neural networks using TensorFlow, but it was unclear to me how to use the official layer implementation of Batch Normalization (note this is different from the one from the API).
After some painful digging through their GitHub issues, it seems that one needs a tf.cond to use it properly, and also a 'reuse=True' flag so that the BN shift and scale variables are properly reused. After figuring that out, I provided a small description of how I believe it should be used here.
Now I have written a short script to test it (only a single layer and a ReLU; hard to make it smaller than this). However, I am not 100% sure how to test it. Right now my code runs with no error messages but returns NaNs unexpectedly, which lowers my confidence that the code I gave in the other post is right. Or maybe the network I have is weird. Either way, does someone know what's wrong? Here is the code:
import tensorflow as tf
# download and install the MNIST data automatically
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow.contrib.layers.python.layers import batch_norm as batch_norm
def batch_norm_layer(x, train_phase, scope_bn):
    bn_train = batch_norm(x, decay=0.999, center=True, scale=True,
                          is_training=True,
                          reuse=None, # is this right?
                          trainable=True,
                          scope=scope_bn)
    bn_inference = batch_norm(x, decay=0.999, center=True, scale=True,
                              is_training=False,
                              reuse=True, # is this right?
                              trainable=True,
                              scope=scope_bn)
    z = tf.cond(train_phase, lambda: bn_train, lambda: bn_inference)
    return z
def get_NN_layer(x, input_dim, output_dim, scope, train_phase):
    with tf.name_scope(scope+'vars'):
        W = tf.Variable(tf.truncated_normal(shape=[input_dim, output_dim], mean=0.0, stddev=0.1))
        b = tf.Variable(tf.constant(0.1, shape=[output_dim]))
    with tf.name_scope(scope+'Z'):
        z = tf.matmul(x,W) + b
    with tf.name_scope(scope+'BN'):
        if train_phase is not None:
            z = batch_norm_layer(z, train_phase, scope+'BN_unit')
    with tf.name_scope(scope+'A'):
        a = tf.nn.relu(z) # (M x D1) = (M x D) * (D x D1)
    return a
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# placeholder for data
x = tf.placeholder(tf.float32, [None, 784])
# placeholder that turns BN during training or off during inference
train_phase = tf.placeholder(tf.bool, name='phase_train')
# variables for parameters
hiden_units = 25
layer1 = get_NN_layer(x, input_dim=784, output_dim=hiden_units, scope='layer1', train_phase=train_phase)
# create model
W_final = tf.Variable(tf.truncated_normal(shape=[hiden_units, 10], mean=0.0, stddev=0.1))
b_final = tf.Variable(tf.constant(0.1, shape=[10]))
y = tf.nn.softmax(tf.matmul(layer1, W_final) + b_final)
### training
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean( -tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]) )
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    steps = 3000
    for iter_step in xrange(steps):
        #feed_dict_batch = get_batch_feed(X_train, Y_train, M, phase_train)
        batch_xs, batch_ys = mnist.train.next_batch(100)
        # Collect model statistics
        if iter_step % 1000 == 0:
            batch_xstrain, batch_xstrain = batch_xs, batch_ys # simulates train data
            batch_xcv, batch_ycv = mnist.test.next_batch(5000) # simulates CV data
            batch_xtest, batch_ytest = mnist.test.next_batch(5000) # simulates test data
            # do inference
            train_error = sess.run(fetches=cross_entropy, feed_dict={x: batch_xs, y_: batch_ys, train_phase: False})
            cv_error = sess.run(fetches=cross_entropy, feed_dict={x: batch_xcv, y_: batch_ycv, train_phase: False})
            test_error = sess.run(fetches=cross_entropy, feed_dict={x: batch_xtest, y_: batch_ytest, train_phase: False})
            def do_stuff_with_errors(*args):
                print args
            do_stuff_with_errors(train_error, cv_error, test_error)
        # Run Train Step
        sess.run(fetches=train_step, feed_dict={x: batch_xs, y_: batch_ys, train_phase: True})
    # list of booleans indicating correct predictions
    correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
    # accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels, train_phase: False}))
when I run it I get:
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
(2.3474066, 2.3498712, 2.3461707)
(0.49414295, 0.88536006, 0.91152304)
(0.51632041, 0.393666, nan)
0.9296
It used to be that all of the last values were NaN, and now only a few of them are. Is everything fine or am I being paranoid?
I am not sure if this will solve your problem; the documentation for BatchNorm is not very easy to use or informative, so here is a short recap of how to use simple BatchNorm:
First of all, you define your BatchNorm layer. If you want to use it after an affine/fully-connected layer, you do this (just an example, order can be different/as you desire):
...
inputs = tf.matmul(inputs, W) + b
inputs = tf.layers.batch_normalization(inputs, training=is_training)
inputs = tf.nn.relu(inputs)
...
The function tf.layers.batch_normalization creates internal update operations (for the moving mean and variance). These ops are added to the tf.GraphKeys.UPDATE_OPS collection and must be run explicitly during training. As such, you must set up your optimizer as follows (after all layers have been defined!):
...
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    trainer = tf.train.AdamOptimizer()
    updateModel = trainer.minimize(loss, global_step=global_step)
...
You can read more about it here. I know it's a little late to answer your question, but it might help other people coming across BatchNorm problems in tensorflow! :)
training = tf.placeholder(tf.bool, name='training')
lr_holder = tf.placeholder(tf.float32, [], name='learning_rate')
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    optimizer = tf.train.AdamOptimizer(learning_rate=lr_holder).minimize(cost)
When defining the layers, you need to use the 'training' placeholder:
batchNormal_layer = tf.layers.batch_normalization(pre_batchNormal_layer, training=training)
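For completeness, a minimal sketch of how that placeholder would then be fed at run time (x, y_, batch_xs, batch_ys, accuracy and mnist are assumed from the question's code, and the 0.001 learning rate is an arbitrary example value):
# Training step: batch norm uses batch statistics and updates its moving averages.
sess.run(optimizer, feed_dict={x: batch_xs, y_: batch_ys, training: True, lr_holder: 0.001})
# Evaluation: batch norm uses the accumulated moving mean/variance instead.
acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels, training: False})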
