I'm stuck working on a TensorFlow convolutional neural network for a university project and I hope somebody can help me.
It's supposed to output a picture for a picture input; left is input, right is output. Both are in .jpeg format.
input and output
The weights look like this: the left image shows the weights before learning, the right after a few epochs, and they do not change at all with further training.
The net does not seem to learn anything useful and I have a feeling I forgot something basic.
The accuracy peaks around 5% during training.
weights
Here is what it looks like when I save the input image x.
I don't know if I am making a mistake loading or saving the image.
And this is what the output y of the net looks like.
I based the code on the TensorFlow MNIST tutorial.
Here is my code, which I have shortened to make it more readable:
import tensorflow as tf
from PIL import Image
import numpy as np

def weight_variable(dim, stddev=0.35):
    init = tf.random_normal(dim, stddev=stddev)
    return tf.Variable(init)

def bias_variable(dim, val=0.1):
    init = tf.constant(val, shape=dim)
    return tf.Variable(init)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

def output_pics(pic):  # for weights
    # 1 color (dimension): array cast to uint8 and written to file as JPEG
    ...

def output_pics_color(pic):
    # 3 colors (dimensions): array cast to uint8 and written to file as JPEG
    ...

def show_pic(pic):
    # 3 colors (dimensions): array cast to uint8 and shown in a window
    ...

filesX = [...]       # filenames of inputs for training
filesY = [...]       # filenames of outputs for training
test_filesX = [...]  # filenames of inputs for testing
test_filesY = [...]  # filenames of outputs for testing

px_size = 128  # images are resized to 128x128

filename_queueX = tf.train.string_input_producer(filesX)
filename_queueY = tf.train.string_input_producer(filesY)
filename_testX = tf.train.string_input_producer(test_filesX)
filename_testY = tf.train.string_input_producer(test_filesY)

image_reader = tf.WholeFileReader()
img_name, img_dataX = image_reader.read(filename_queueX)
imageX = tf.image.decode_jpeg(img_dataX)
imageX = tf.image.resize_images(imageX, [px_size, px_size])
imageX.set_shape((px_size, px_size, 3))
imageX = tf.cast(imageX, tf.float32)
...
# same for imageY, test_imageX, test_imageY

trainX = []
trainY = []
testX = []
testY = []
j = 1

with tf.name_scope('model'):
    x = tf.placeholder(tf.float32, [None, px_size, px_size, 3])
    prob = tf.placeholder(tf.float32)

init_op = tf.global_variables_initializer()

# load images into lists
with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(1, 65):
        trainX.append(imageX.eval())
        trainY.append(imageY.eval())
    for i in range(1, 10):
        testX.append(test_imageX.eval())
        testY.append(test_imageY.eval())
    coord.request_stop()
    coord.join(threads)

# layer 1
x_img = tf.reshape(x, [-1, px_size, px_size, 3])
W1 = weight_variable([20, 20, 3, 3])
b1 = bias_variable([3])
y1 = tf.nn.softmax(conv2d(x_img, W1) + b1)
# layer 2
W2 = weight_variable([30, 30, 3, 3])
b2 = bias_variable([3])
y2 = tf.nn.softmax(conv2d(y1, W2) + b2)
# layer 3
W3 = weight_variable([40, 40, 3, 3])
b3 = bias_variable([3])
y3 = tf.nn.softmax(conv2d(y2, W3) + b3)
y = y3

with tf.name_scope('train'):
    y_ = tf.placeholder(tf.float32, [None, px_size, px_size, 3])
    cross_entropy = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=y))
    opt = tf.train.MomentumOptimizer(learning_rate=0.5, momentum=0.1).minimize(cross_entropy)

with tf.name_scope('eval'):
    correct = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

nEpochs = 1000
batchSize = 10
res = 0

with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    trAccs = []
    for i in range(nEpochs):
        if i % 100 == 0:
            train_accuracy = sess.run(accuracy, feed_dict={x: trainX, y_: trainY, prob: 1.0})
            print(train_accuracy)
            output_pics(W1)       # output weights of layer 1 to file
            output_pics_color(x)  # save input image
            output_pics_color(y)  # save net output
        sess.run(opt, feed_dict={x: trainX, y_: trainY, prob: 0.5})
This is an image generation problem, and the model you selected is a very poor fit for image generation tasks.
Ordinary CNNs like this are used for image recognition and object detection tasks.
The MNIST tutorial covers an image classification problem, not an image generation problem, and it is very important to select an appropriate model type for a particular problem. With this model there is little chance of achieving the output you have shown.
I also do not understand how you are calculating accuracy here: argmax-based accuracy is a classification metric and is not meaningful for a generation task.
You have used softmax after every layer, which is a really bad idea; the TensorFlow MNIST tutorial does not do this either. Softmax is only used in the last layer.
In the hidden layers a leaky ReLU or a plain ReLU should be used.
I would suggest you look for a more appropriate deep-learning model, specifically a combination of a Variational Autoencoder and a Generative Adversarial Network (VAE-GAN), or a plain Generative Adversarial Network.
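For illustration only, not a complete fix: a sketch of the same kind of stack with ReLU in the hidden layers and the loss computed on raw logits. The 16-channel width, 5x5 kernels, and Adam step size are arbitrary choices of mine, not taken from your code, and target images are assumed to be scaled to [0, 1]:

import tensorflow as tf

px_size = 128
x = tf.placeholder(tf.float32, [None, px_size, px_size, 3])
y_ = tf.placeholder(tf.float32, [None, px_size, px_size, 3])  # targets scaled to [0, 1]

def conv(inp, k, in_ch, out_ch):
    # plain 'SAME' convolution with bias, no activation baked in
    W = tf.Variable(tf.random_normal([k, k, in_ch, out_ch], stddev=0.1))
    b = tf.Variable(tf.constant(0.1, shape=[out_ch]))
    return tf.nn.conv2d(inp, W, strides=[1, 1, 1, 1], padding='SAME') + b

h1 = tf.nn.relu(conv(x, 5, 3, 16))   # ReLU in the hidden layers
h2 = tf.nn.relu(conv(h1, 5, 16, 16))
logits = conv(h2, 5, 16, 3)          # raw logits, no per-layer softmax
y = tf.nn.sigmoid(logits)            # pixel intensities in [0, 1]

# the *_with_logits losses expect raw logits, not already-activated outputs
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=logits))
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

Even then, do not expect much; the generative models mentioned above are the right tool for this kind of task.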
I have just begun working with TensorFlow and Colab.
I followed a tutorial online on how to build a simple image recognition model in Colab.
From the tutorial, I was able to build a simple model, without completely understanding every step at this point.
But what I would like to know is how I can now save the model I built for use elsewhere.
Here are the final bits of code used to build and test the model.
Placeholder:
# Initialize placeholders
x = tf.placeholder(dtype = tf.float32, shape = [None, 28, 28])
y = tf.placeholder(dtype = tf.int32, shape = [None])
# Flatten the input data
images_flat = tf.contrib.layers.flatten(x)
# Fully connected layer
logits = tf.contrib.layers.fully_connected(images_flat, 62, tf.nn.relu)
# Define a loss function
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels = y, logits = logits))
# Define an optimizer
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)
# Convert logits to label indexes
correct_pred = tf.argmax(logits, 1)
# Define an accuracy metric
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
print("images_flat: ", images_flat)
print("logits: ", logits)
print("loss: ", loss)
print("predicted_labels: ", correct_pred)
Run in session:
tf.set_random_seed(1234)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(201):
    print('EPOCH', i)
    _, accuracy_val = sess.run([train_op, accuracy], feed_dict={x: images28, y: labels})
    if i % 10 == 0:
        print("Loss: ", loss)
    print('DONE WITH EPOCH')
Test on test data:
# Import `skimage`
from skimage import transform
# Load the test data
test_images, test_labels = load_data(test_data_directory)
# Transform the images to 28 by 28 pixels
test_images28 = [transform.resize(image, (28, 28)) for image in test_images]
# Convert to grayscale
from skimage.color import rgb2gray
test_images28 = rgb2gray(np.array(test_images28))
# Run predictions against the full test set.
predicted = sess.run([correct_pred], feed_dict={x: test_images28})[0]
# Calculate correct matches
match_count = sum([int(y == y_) for y, y_ in zip(test_labels, predicted)])
# Calculate the accuracy
accuracy = match_count / len(test_labels)
# Print the accuracy
print("Accuracy: {:.3f}".format(accuracy))
From the above, can someone suggest a bit of code whereby I can save the model to Google Drive? To be honest, I'm not even sure which variable the model is stored in.
Thank you, and sorry for the beginner question.
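One note on where the model "is": in TF1 it lives in the graph's variables (the weights and biases created by tf.contrib.layers.fully_connected), not in a single Python variable, so saving the model means saving those variables. A minimal sketch with tf.train.Saver; the Drive mount point below is Colab's default, and the checkpoint name is an arbitrary choice of mine:

import tensorflow as tf
from google.colab import drive

# Make Google Drive visible to the Colab VM (asks for authorization once).
drive.mount('/content/gdrive')

saver = tf.train.Saver()  # captures the graph's variables

# ... after the training loop, with `sess` still open ...
path = saver.save(sess, '/content/gdrive/My Drive/my_model.ckpt')
print("Model saved to:", path)

# To reuse it elsewhere: rebuild the same graph, then
# saver.restore(sess, '/content/gdrive/My Drive/my_model.ckpt')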
Recently I have been studying GANs, and I am using one to generate MNIST images. My environment is Ubuntu 16.04, TensorFlow, Python 3.
The code runs without any error, but the result shows the network learns nothing; after training, the output image is still a noisy image.
First I designed a generator network: the input is 784-dimensional noise data, which passes through a hidden layer with a ReLU and generates a 784-dimensional image.
Then I designed a discriminator network: the inputs are a real image and a fake image, each passing through a hidden layer with a ReLU, and the output is a 1-dimensional logit.
Then I defined the generator_loss and discriminator_loss and trained the generator and discriminator. It runs, but the result shows the network learns nothing; the loss does not converge.
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/home/zyw/data/tensor_mnist-master/MNIST_data/",one_hot=True)
batch_size = 100
G_in = tf.placeholder(tf.float32,[None,784])
G_h1 = tf.layers.dense(G_in, 128)
G_h1 = tf.maximum(0.01 * G_h1, G_h1)
G_out = tf.tanh(tf.layers.dense(G_h1, 784))
real = tf.placeholder(tf.float32,[None,784])
Dl0 = tf.layers.dense(G_out, 128)
Dl0 = tf.maximum(0.01 * Dl0, Dl0)
p0 = tf.layers.dense(Dl0, 1)
Dl1 = tf.layers.dense(real, 128)
Dl1 = tf.maximum(0.01 * Dl1, Dl1)
p1 = tf.layers.dense(Dl1, 1)
G_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits =p0,labels=tf.ones_like(p0)*0.9))
D_real_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits =p1,labels=tf.ones_like(p1)*0.9))
D_fake_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits =p0,labels=tf.zeros_like(p0)))
D_total_loss = tf.add(D_fake_loss,D_real_loss)
G_train = tf.train.AdamOptimizer(0.01).minimize(G_loss)
D_train = tf.train.AdamOptimizer(0.01).minimize(D_total_loss)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(1000):
        mnist_data, _ = mnist.train.next_batch(batch_size)
        # noise_org = tf.random_normal([batch_size,784],stddev = 0.1,dtype = tf.float32)
        noise_org = np.random.randn(batch_size, 784)
        a, b, dloss = sess.run([D_real_loss, D_fake_loss, D_total_loss, G_train, D_train],
                               feed_dict={G_in: noise_org, real: mnist_data})[:3]
        if i % 100 == 0:
            print(a, b, dloss)
    # test_generative_image
    noise_org = np.random.randn(1, 784)
    image = sess.run(G_out, feed_dict={G_in: noise_org})
    outimage = tf.reshape(image, [28, 28])
    plt.imshow(outimage.eval(), cmap='gray')
    plt.show()
    print('ok')
the result is:
0.80509 0.63548 1.44057
0.33512 0.20223 0.53735
0.332536 0.97737 1.30991
0.328048 0.814452 1.1425
0.326688 0.411907 0.738596
0.325864 0.570807 0.896671
0.325575 0.970406 1.29598
0.325421 1.02487 1.35029
0.325222 1.34089 1.66612
0.325217 0.747129 1.07235
I have added the modified code below, with comments where I made changes; my changes are also described underneath the code.
import tensorflow as tf
import numpy as np
import tensorflow.contrib.slim as slim
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/home/zyw/data/tensor_mnist-master/MNIST_data/",one_hot=True)
batch_size = 100
#define the generator function
def generator(input):
    G_h1 = tf.layers.dense(input, 128)
    # G_h1 = tf.maximum(0.01 * G_h1, G_h1)
    G_out = tf.sigmoid(tf.layers.dense(G_h1, 784))  # sigmoid function added
    return G_out

#define the discriminator function
def discriminator(input):
    Dl0 = tf.layers.dense(input, 128)
    # Dl0 = tf.maximum(0.01 * Dl0, Dl0)
    p0 = tf.layers.dense(Dl0, 1)
    return p0
#Generator
with tf.variable_scope('G'):
    G_in = tf.placeholder(tf.float32, [None, 784])
    G_out = generator(G_in)

real = tf.placeholder(tf.float32, [None, 784])

#Discriminator that takes the real data
with tf.variable_scope('D'):
    D1 = discriminator(real)

#Discriminator that takes fake data
with tf.variable_scope('D', reuse=True):  # need to use the same copy of the discriminator
    D2 = discriminator(G_out)
G_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D2, labels=tf.ones_like(D2)))
D_real_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D1, labels=tf.ones_like(D1)))
D_fake_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=D2, labels=tf.zeros_like(D2)))
D_total_loss = tf.add(D_fake_loss, D_real_loss)
vars = tf.trainable_variables()  # all trainable variables
d_training_vars = [v for v in vars if v.name.startswith('D/')]  # variables associated with the discriminator
g_training_vars = [v for v in vars if v.name.startswith('G/')]  # variables associated with the generator

G_train = tf.train.AdamOptimizer(0.001).minimize(G_loss, var_list=g_training_vars)  # only train the variables associated with the generator
D_train = tf.train.AdamOptimizer(0.001).minimize(D_total_loss, var_list=d_training_vars)  # only train the variables associated with the discriminator
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for i in range(1000):
        mnist_data, _ = mnist.train.next_batch(batch_size)
        # noise_org = tf.random_normal([batch_size,784],stddev = 0.1,dtype = tf.float32)
        noise_org = np.random.randn(batch_size, 784)
        a, b, dloss = sess.run([D_real_loss, D_fake_loss, D_total_loss, G_train, D_train],
                               feed_dict={G_in: noise_org, real: mnist_data})[:3]
        if i % 100 == 0:
            print(a, b, dloss)
    # test_generative_image
    noise_org = np.random.randn(1, 784)
    image = sess.run(G_out, feed_dict={G_in: noise_org})
    outimage = tf.reshape(image, [28, 28])
    plt.imshow(outimage.eval(), cmap='gray')
    plt.show()
    print('ok')
A few points you should note when implementing a GAN:
You need to use the same copy of the discriminator (i.e. share the same weights) when implementing the discriminator loss; in your case Dl0 and Dl1 should share the same parameters.
The generator activation function should be sigmoid, not tanh, since the output of the generator should only vary between 0 and 1 (it is an image).
When training the discriminator, you should only train the variables associated with the discriminator; likewise, when training the generator you should only train the variables associated with the generator.
Sometimes it is important to make sure that the discriminator is more powerful than the generator, since otherwise it would not have sufficient capacity to learn to distinguish accurately between generated and real samples.
These are only the most basic things to note about GANs; there are many other aspects you should consider when developing one. You can get a good basic idea of GANs by reading the following two articles:
http://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow/
http://blog.evjang.com/2016/06/generative-adversarial-nets-in.html
Hope this helps.
I'm trying to build a softmax regression model for CIFAR classification. At first, when I tried to pass my images and labels into the feed dictionary, I got an error saying that feed dictionaries do not accept Tensors. I then converted them into NumPy arrays using .eval(), but the program hangs at the .eval() line and does not continue any further. How can I pass this data into the feed_dict?
CIFARIMAGELOADING.PY
import tensorflow as tf
import os
import tensorflow.models.image.cifar10 as cf
IMAGE_SIZE = 24
BATCH_SIZE = 128
def loadimagesandlabels(size):
    # Load the images from the CIFAR data directory
    FLAGS = tf.app.flags.FLAGS
    data_dir = os.path.join(FLAGS.data_dir, 'cifar-10-batches-bin')
    filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i) for i in xrange(1, 6)]
    filename_queue = tf.train.string_input_producer(filenames)
    read_input = cf.cifar10_input.read_cifar10(filename_queue)

    # Reshape and crop the image
    height = IMAGE_SIZE
    width = IMAGE_SIZE
    reshaped_image = tf.cast(read_input.uint8image, tf.float32)
    cropped_image = tf.random_crop(reshaped_image, [height, width, 3])

    # Generate a batch of images and labels by building up a queue of examples
    print('Filling queue with CIFAR images')
    num_preprocess_threads = 16
    min_fraction_of_examples_in_queue = 0.4
    min_queue_examples = int(BATCH_SIZE * min_fraction_of_examples_in_queue)
    images, label_batch = tf.train.batch([cropped_image, read_input.label], batch_size=BATCH_SIZE,
                                         num_threads=num_preprocess_threads,
                                         capacity=min_queue_examples + 3 * BATCH_SIZE)
    print(images)
    print(label_batch)
    return images, tf.reshape(label_batch, [BATCH_SIZE])
CIFAR.PY
#Set up placeholder vectors for image and labels
x = tf.placeholder(tf.float32, shape = [None, 1728])
y_ = tf.placeholder(tf.float32, shape = [None,10])
W = tf.Variable(tf.zeros([1728,10]))
b = tf.Variable(tf.zeros([10]))
#Implement regression model. Multiply input images x by weight matrix W, add the bias b
#Compute the softmax probabilities that are assigned to each class
y = tf.nn.softmax(tf.matmul(x,W) + b)
#Define cross entropy
#tf.reduce sum sums across all classes and tf.reduce_mean takes the average over these sums
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_*tf.log(y), reduction_indices = [1]))
#Train the model
#Each training iteration we load 128 training examples. We then run the train_step operation
#using feed_dict to replace the placeholder tensors x and y_ with the training examples
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
#Open up a Session
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
for i in range(1000):
    images, labels = CIFARImageLoading.loadimagesandlabels(size=BATCH_SIZE)
    unrolled_images = tf.reshape(images, (1728, BATCH_SIZE))
    # convert labels to their one-hot representations
    # should produce [[1,0,0,...],[0,1,0...],...]
    one_hot_labels = tf.one_hot(indices=labels, depth=NUM_CLASSES, on_value=1.0, off_value=0.0, axis=-1)
    print(unrolled_images)
    print(one_hot_labels)
    images_numpy, labels_numpy = unrolled_images.eval(session=sess), one_hot_labels.eval(session=sess)
    sess.run(train_step, feed_dict={x: images_numpy, y_: labels_numpy})
#Evaluate the model
#.equal returns a tensor of booleans, we want to cast these as floats and then take their mean
#to get percent correctness (accuracy)
print("evaluating")
test_images, test_labels = CIFARImageLoading.loadimagesandlabels(TEST_SIZE)
test_images_unrolled = tf.reshape(test_images, (1728, TEST_SIZE))
test_images_one_hot = tf.one_hot(indices= test_labels, depth=NUM_CLASSES, on_value=1.0, off_value= 0.0, axis=-1)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval(feed_dict={x: test_images_unrolled.eval(), y_: test_images_one_hot.eval()}))
There are a couple of things that you are not understanding quite right. Throughout your graph you work with Tensors. You define Tensors either with tf.placeholder, feeding them via session.run(feed_dict={...}), or with tf.Variable, initializing it with session.run(tf.initialize_all_variables()). You must feed your input this way, and it should be NumPy arrays of the same shape as you declared in the placeholders. Here's a simple example:
images = tf.placeholder(tf.float32, [1728, BATCH_SIZE])  # pick the dtype your data needs
labels = tf.placeholder(tf.int32, [BATCH_SIZE])
'''
Build your network here so you have the variable: Output
'''
images_feed, labels_feed = CIFARImageLoading.loadimagesandlabels(size=BATCH_SIZE)
# here you can see your output
print(sess.run(Output, feed_dict={images: images_feed, labels: labels_feed}))
You do not feed TensorFlow ops with NumPy arrays; within the graph, ops always consume and produce Tensors. The feed_dict, on the other hand, is always fed with NumPy arrays. The point is: you should never have to convert tensors to NumPy arrays for the input, because your input must already be NumPy arrays. If it's a list, you can use np.asarray(list); if it's a tensor, you are doing this wrong.
I do not know what CIFARImageLoading.loadimagesandlabels returns, but I imagine it is not a Tensor; it is probably a NumPy array already, so just get rid of the .eval().
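If it does turn out to be a Tensor (the tf.train.batch call in the question suggests it is), one pattern is to build the pipeline once, start the queue runners, and evaluate the batch tensors once per step; note that tf.train.batch produces nothing until the queue runners are started, which is a likely reason the .eval() call hangs. A sketch against the question's code:

# Build the input pipeline ONCE, outside the training loop.
images, labels = CIFARImageLoading.loadimagesandlabels(size=BATCH_SIZE)
unrolled_images = tf.reshape(images, (BATCH_SIZE, 1728))  # matches the [None, 1728] placeholder
one_hot_labels = tf.one_hot(indices=labels, depth=10, on_value=1.0, off_value=0.0)

sess = tf.Session()
sess.run(tf.initialize_all_variables())

# Without these, the batching queue never fills and eval()/run() blocks forever.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

for i in range(1000):
    # Materialize one batch as NumPy arrays, then feed those.
    images_np, labels_np = sess.run([unrolled_images, one_hot_labels])
    sess.run(train_step, feed_dict={x: images_np, y_: labels_np})

coord.request_stop()
coord.join(threads)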
I am trying to detect micro-events in a long time series. For this purpose, I will train an LSTM network.
Data. Input for each time sample is 11 different features, somewhat normalized to fit 0-1. Output will be one of two classes.
Batching. Due to huge class imbalance I have extracted the data in batches of 60 time samples each, of which at least 5 will always be class 1 and the rest class 0. In this way the class imbalance is reduced from 150:1 to around 12:1. I have then randomized the order of all my batches.
Model. I am attempting to train an LSTM, with an initial configuration of 3 different cells with 5 delay steps. I expect the micro-events to arrive in sequences of at least 3 time steps.
Problem: When I try to train the network it quickly converges towards saying that EVERYTHING belongs to the majority class. When I implement a weighted loss function, at some threshold it switches to saying that EVERYTHING belongs to the minority class. I suspect (without being an expert) that there is no learning in my LSTM cells, or that my configuration is off.
Below is the code for my implementation. I am hoping that someone can tell me:
ar_model.py
import numpy as np
import tensorflow as tf
from tensorflow.models.rnn import rnn
import ar_config
config = ar_config.get_config()
class ARModel(object):

    def __init__(self, is_training=False, config=None):
        # Config
        if config is None:
            config = ar_config.get_config()

        # Placeholders
        self._features = tf.placeholder(tf.float32, [None, config.num_features], name='ModelInput')
        self._targets = tf.placeholder(tf.float32, [None, config.num_classes], name='ModelOutput')

        # Hidden layer
        with tf.variable_scope('lstm') as scope:
            lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(config.num_hidden, forget_bias=0.0)
            cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * config.num_delays)
            self._initial_state = cell.zero_state(config.batch_size, dtype=tf.float32)
            outputs, state = rnn.rnn(cell, [self._features], dtype=tf.float32)

        # Output layer
        output = outputs[-1]
        softmax_w = tf.get_variable('softmax_w', [config.num_hidden, config.num_classes], tf.float32)
        softmax_b = tf.get_variable('softmax_b', [config.num_classes], tf.float32)
        logits = tf.matmul(output, softmax_w) + softmax_b

        # Evaluate
        ratio = (60.00 / 5.00)
        class_weights = tf.constant([ratio, 1 - ratio])
        weighted_logits = tf.mul(logits, class_weights)
        loss = tf.nn.softmax_cross_entropy_with_logits(weighted_logits, self._targets)
        self._cost = cost = tf.reduce_mean(loss)
        self._predict = tf.argmax(tf.nn.softmax(logits), 1)
        self._correct = tf.equal(tf.argmax(logits, 1), tf.argmax(self._targets, 1))
        self._accuracy = tf.reduce_mean(tf.cast(self._correct, tf.float32))
        self._final_state = state

        if not is_training:
            return

        # Optimize
        optimizer = tf.train.AdamOptimizer()
        self._train_op = optimizer.minimize(cost)

    @property
    def features(self):
        return self._features

    @property
    def targets(self):
        return self._targets

    @property
    def cost(self):
        return self._cost

    @property
    def accuracy(self):
        return self._accuracy

    @property
    def train_op(self):
        return self._train_op

    @property
    def predict(self):
        return self._predict

    @property
    def initial_state(self):
        return self._initial_state

    @property
    def final_state(self):
        return self._final_state
ar_train.py
import os
from datetime import datetime
import numpy as np
import tensorflow as tf
from tensorflow.python.platform import gfile
import ar_network
import ar_config
import ar_reader
config = ar_config.get_config()
def main(argv=None):
    if gfile.Exists(config.train_dir):
        gfile.DeleteRecursively(config.train_dir)
    gfile.MakeDirs(config.train_dir)
    train()

def train():
    train_data = ar_reader.ArousalData(config.train_data, num_steps=config.max_steps)
    test_data = ar_reader.ArousalData(config.test_data, num_steps=config.max_steps)

    with tf.Graph().as_default(), tf.Session() as session, tf.device('/cpu:0'):
        initializer = tf.random_uniform_initializer(minval=-0.1, maxval=0.1)
        with tf.variable_scope('model', reuse=False, initializer=initializer):
            m = ar_network.ARModel(is_training=True)
            s = tf.train.Saver(tf.all_variables())
        tf.initialize_all_variables().run()

        for batch_input, batch_target in train_data:
            step = train_data.iter_steps
            dict = {
                m.features: batch_input,
                m.targets: batch_target
            }
            session.run(m.train_op, feed_dict=dict)
            state, cost, accuracy = session.run([m.final_state, m.cost, m.accuracy], feed_dict=dict)

            if not step % 10:
                test_input, test_target = test_data.next()
                test_accuracy = session.run(m.accuracy, feed_dict={
                    m.features: test_input,
                    m.targets: test_target
                })
                now = datetime.now().time()
                print('%s | Iter %4d | Loss= %.5f | Train= %.5f | Test= %.3f' % (now, step, cost, accuracy, test_accuracy))

            if not step % 1000:
                destination = os.path.join(config.train_dir, 'ar_model.ckpt')
                s.save(session, destination)

if __name__ == '__main__':
    tf.app.run()
ar_config.py
class Config(object):
    # Directories
    train_dir = '...'
    ckpt_dir = '...'
    train_data = '...'
    test_data = '...'

    # Data
    num_features = 13
    num_classes = 2
    batch_size = 60

    # Model
    num_hidden = 3
    num_delays = 5

    # Training
    max_steps = 100000

def get_config():
    return Config()
UPDATED ARCHITECTURE:
# Placeholders
self._features = tf.placeholder(tf.float32, [None, config.num_features, config.num_delays], name='ModelInput')
self._targets = tf.placeholder(tf.float32, [None, config.num_output], name='ModelOutput')

# Weights
weights = {
    'hidden': tf.get_variable('w_hidden', [config.num_features, config.num_hidden], tf.float32),
    'out': tf.get_variable('w_out', [config.num_hidden, config.num_classes], tf.float32)
}
biases = {
    'hidden': tf.get_variable('b_hidden', [config.num_hidden], tf.float32),
    'out': tf.get_variable('b_out', [config.num_classes], tf.float32)
}

# Layer in
with tf.variable_scope('input_hidden') as scope:
    inputs = self._features
    inputs = tf.transpose(inputs, perm=[2, 0, 1])  # (BatchSize,NumFeatures,TimeSteps) -> (TimeSteps,BatchSize,NumFeatures)
    inputs = tf.reshape(inputs, shape=[-1, config.num_features])  # (TimeSteps,BatchSize,NumFeatures) -> (TimeSteps*BatchSize,NumFeatures)
    inputs = tf.add(tf.matmul(inputs, weights['hidden']), biases['hidden'])

# Layer hidden
with tf.variable_scope('hidden_hidden') as scope:
    inputs = tf.split(0, config.num_delays, inputs)  # -> n_steps * (batchsize, features)
    cell = tf.nn.rnn_cell.BasicLSTMCell(config.num_hidden, forget_bias=0.0)
    self._initial_state = cell.zero_state(config.batch_size, dtype=tf.float32)
    outputs, state = rnn.rnn(cell, inputs, dtype=tf.float32)

# Layer out
with tf.variable_scope('hidden_output') as scope:
    output = outputs[-1]
    logits = tf.add(tf.matmul(output, weights['out']), biases['out'])
Odd elements
Weighted loss
I am not sure your "weighted loss" does what you want it to do:
ratio = (60.00 / 5.00)
class_weights = tf.constant([ratio, 1 - ratio])
weighted_logits = tf.mul(logits, class_weights)
this is applied before the loss function is calculated (I also think you wanted an element-wise multiplication here, and note that your ratio is above 1, which makes the second weight negative), so it forces your predictions to behave in a certain way before the softmax is applied.
If you want a weighted loss, you should instead compute the plain loss
loss = tf.nn.softmax_cross_entropy_with_logits(logits, self._targets)
and then apply an element-wise multiplication with your weights:
loss = loss * weights
where your weights have a shape like [2,].
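For example, a sketch of per-example weighting using the names from your model (the 12.0 is just the approximate class ratio mentioned in the question):

# Per-class weights, shape [2]: leave class 0 alone, up-weight class 1.
class_weights = tf.constant([1.0, 12.0])
# One weight per example, selected by the one-hot target; shape [batch].
example_weights = tf.reduce_sum(self._targets * class_weights, 1)
# Plain cross-entropy on unmodified logits, then element-wise weighting.
loss = tf.nn.softmax_cross_entropy_with_logits(logits, self._targets)
self._cost = tf.reduce_mean(loss * example_weights)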
However, I would not recommend you use weighted losses at all. Perhaps try increasing the ratio even further than 1:6.
Architecture
As far as I can read, you are using 5 stacked LSTM layers with 3 hidden units each?
Try removing the multi RNN and just using a single LSTM/GRU (maybe even just a vanilla RNN), and jack the hidden units up to ~100-1000.
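For instance, a sketch of the single-cell variant against your code (128 hidden units is an arbitrary starting point, not a recommendation):

# One LSTM cell with a much larger state, instead of MultiRNNCell.
cell = tf.nn.rnn_cell.BasicLSTMCell(128, forget_bias=0.0)
self._initial_state = cell.zero_state(config.batch_size, dtype=tf.float32)
outputs, state = rnn.rnn(cell, [self._features], dtype=tf.float32)
# The output projection must match the new hidden size.
softmax_w = tf.get_variable('softmax_w', [128, config.num_classes], tf.float32)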
Debugging
Often when you are facing problems with an oddly behaving network, it can be a good idea to:
Print everything
Literally print the shapes and values of every tensor in your model: use the session to fetch them and print them out, from your input data and the first hidden representation to your predictions and your losses.
You can also use TensorFlow's tf.Print(): x_tensor = tf.Print(x_tensor, [tf.shape(x_tensor)])
Use TensorBoard
Using TensorBoard summaries on your gradients, accuracy metrics and histograms will reveal patterns in your data that might explain certain behavior, such as what led to exploding weights: maybe your forget bias goes to infinity, or you are not tracking gradients through a certain layer.
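A sketch of summaries that would make this visible, with names following ar_train.py above (the op names are from the TensorFlow release this code targets; newer versions renamed them to tf.summary.scalar, tf.summary.histogram, tf.summary.merge_all and tf.summary.FileWriter):

# Declare summaries next to the ops they observe (e.g. in ARModel.__init__).
tf.scalar_summary('cost', self._cost)
tf.scalar_summary('accuracy', self._accuracy)
tf.histogram_summary('logits', logits)
summary_op = tf.merge_all_summaries()
writer = tf.train.SummaryWriter(config.train_dir, session.graph)

# Inside the training loop: fetch the summaries and write them out.
summary = session.run(summary_op, feed_dict=dict)
writer.add_summary(summary, step)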
Other questions
How large is your dataset?
How long are your sequences?
Are the 13 features categorical or continuous? You should not normalize categorical variables or represent them as integers; instead you should use a one-hot encoding, as sketched below.
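A sketch, with num_categories standing in (hypothetically) for that feature's cardinality:

# Integer category ids in {0, ..., num_categories - 1}.
category = tf.placeholder(tf.int32, [None])
# One column per category instead of a single "normalized" number.
category_one_hot = tf.one_hot(category, depth=num_categories, on_value=1.0, off_value=0.0)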
Gunnar has already made lots of good suggestions. A few more small things worth paying attention to in general for this sort of architecture:
Try tweaking the Adam learning rate (a one-line sketch follows this list). You should determine the proper learning rate by cross-validation; as a rough start, you could just check whether a smaller learning rate saves your model from crashing on the training data.
You should definitely use more hidden units. It's cheap to try larger networks when you first start out on a dataset. Go as large as necessary to avoid the underfitting you've observed. Later you can regularize / pare down the network after you get it to learn something useful.
Concretely, how long are the sequences you are passing into the network? You say you have a 30k-long time sequence, so I assume you are passing in subsections/samples of this sequence?
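As a sketch of the learning-rate suggestion above (1e-4 is just an arbitrary smaller-than-default value to try; Adam's default is 0.001):

# Try an order of magnitude below the default learning rate.
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)
self._train_op = optimizer.minimize(cost)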
I am relatively new to machine learning and currently have almost no experience developing with it.
So my question is: after training and evaluating the CIFAR-10 dataset from the TensorFlow tutorial, how could one test it with sample images?
I could train and evaluate the ImageNet tutorial from the Caffe machine-learning framework, and it was relatively easy to use the trained model on custom applications using the Python API.
Any help would be much appreciated!
This isn't 100% the answer to the question, but it's a similar way of solving it, based on the MNIST NN training example suggested in the comments to the question.
Based on the TensorFlow beginner MNIST tutorial, and thanks to this tutorial, this is a way of training and using your neural network with custom data.
Please note that something similar should be done for tutorials such as CIFAR-10, as @Yaroslav Bulatov mentioned in the comments.
import input_data
import datetime
import numpy as np
import tensorflow as tf
import cv2
from matplotlib import pyplot as plt
import matplotlib.image as mpimg
from random import randint
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x = tf.placeholder("float", [None, 784])
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x,W) + b)
y_ = tf.placeholder("float", [None,10])
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
#Train our model
iter = 1000
for i in range(iter):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

#Evaluating our model:
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print "Accuracy: ", sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})
#1: Using our model to classify a random MNIST image from the original test set:
num = randint(0, mnist.test.images.shape[0])
img = mnist.test.images[num]
classification = sess.run(tf.argmax(y, 1), feed_dict={x: [img]})
'''
#Uncomment this part if you want to plot the classified image.
plt.imshow(img.reshape(28, 28), cmap=plt.cm.binary)
plt.show()
'''
print 'Neural Network predicted', classification[0]
print 'Real label is:', np.argmax(mnist.test.labels[num])
#2: Using our model to classify MNIST digit from a custom image:
# create an array where we can store 1 picture
images = np.zeros((1,784))
# and the correct values
correct_vals = np.zeros((1,10))
# read the image
gray = cv2.imread("my_digit.png", 0 ) #0=cv2.CV_LOAD_IMAGE_GRAYSCALE #must be .png!
# rescale it
gray = cv2.resize(255-gray, (28, 28))
# save the processed images
cv2.imwrite("my_grayscale_digit.png", gray)
"""
all images in the training set have a range from 0-1
and not from 0-255, so we divide our flattened image
(a one-dimensional vector with our 784 pixels)
to use the same 0-1 based range
"""
flatten = gray.flatten() / 255.0
"""
we need to store the flatten image and generate
the correct_vals array
correct_val for a digit (9) would be
[0,0,0,0,0,0,0,0,0,1]
"""
images[0] = flatten
my_classification = sess.run(tf.argmax(y, 1), feed_dict={x: [images[0]]})
"""
we want to run the prediction and the accuracy function
using our generated arrays (images and correct_vals)
"""
print 'Neural Network predicted', my_classification[0], "for your digit"
For further image conditioning (digits should be completely dark on a white background) and better NN training (accuracy > 91%), please check the Advanced MNIST tutorial from TensorFlow or the second tutorial I've mentioned.
The example below is not for the MNIST tutorial but a simple XOR example. Note the train() and test() methods: all that we declare and keep globally are the weights, biases, and session. In the test method we redefine the shape of the input and reuse the same weights and biases (and session) that we refined in training.
import tensorflow as tf
#parameters for the net
w1 = tf.Variable(tf.random_uniform(shape=[2,2], minval=-1, maxval=1, name='weights1'))
w2 = tf.Variable(tf.random_uniform(shape=[2,1], minval=-1, maxval=1, name='weights2'))
#biases
b1 = tf.Variable(tf.zeros([2]), name='bias1')
b2 = tf.Variable(tf.zeros([1]), name='bias2')
#tensorflow session
sess = tf.Session()
def train():
    #placeholders for the training inputs (4 inputs with 2 features each) and outputs (4 outputs which have a value of 0 or 1)
    x = tf.placeholder(tf.float32, [4, 2], name='x-inputs')
    y = tf.placeholder(tf.float32, [4, 1], name='y-inputs')

    #set up the model calculations
    temp = tf.sigmoid(tf.matmul(x, w1) + b1)
    output = tf.sigmoid(tf.matmul(temp, w2) + b2)

    #cost function is avg error over training samples
    cost = tf.reduce_mean(((y * tf.log(output)) + ((1 - y) * tf.log(1.0 - output))) * -1)

    #training step is gradient descent
    train_step = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

    #declare training data
    training_x = [[0,1], [0,0], [1,0], [1,1]]
    training_y = [[1], [0], [1], [0]]

    #init session
    init = tf.initialize_all_variables()
    sess.run(init)

    #training
    for i in range(100000):
        sess.run(train_step, feed_dict={x: training_x, y: training_y})
        if i % 1000 == 0:
            print (i, sess.run(cost, feed_dict={x: training_x, y: training_y}))

    print '\ntraining done\n'

def test(inputs):
    #redefine the shape of the input to a single unit with 2 features
    xtest = tf.placeholder(tf.float32, [1, 2], name='x-inputs')

    #redefine the model in terms of that new input shape
    temp = tf.sigmoid(tf.matmul(xtest, w1) + b1)
    output = tf.sigmoid(tf.matmul(temp, w2) + b2)

    print (inputs, sess.run(output, feed_dict={xtest: [inputs]})[0, 0] >= 0.5)
train()
test([0,1])
test([0,0])
test([1,1])
test([1,0])
I recommend taking a look at the basic MNIST tutorial on the TensorFlow website. It looks like you define some function that generates the type of output that you want, and then run your session, passing it this evaluation function (correct_prediction below) and a dictionary containing whatever arguments you require (x and y_ below).
If you have defined and trained some network that takes an input x and generates a response y based on your inputs, and you know the expected responses for your testing set y_, you can print out every response to your testing set with something like:
correct_prediction = tf.equal(y, y_)  # Check whether your prediction is correct
print(sess.run(correct_prediction, feed_dict={x: test_images, y_: test_labels}))
This is just a modification of what is done in the tutorial, where instead of trying to print each response, they determine the percentage of correct responses. Also note that the tutorial uses one-hot vectors for the prediction y and actual value y_, so in order to return the associated numeral, they have to find which index of these vectors is equal to one with tf.argmax(y, 1).
Edit
In general, if you define something in your graph, you can output it later when you run your graph. Say you define something that determines the result of the softmax function on your output logits as:
graph = tf.Graph()
with graph.as_default():
    ...
    prediction = tf.nn.softmax(logits)
    ...
then you can output this at run time with:
with tf.Session(graph=graph) as sess:
    ...
    feed_dict = { ... }  # define your feed dictionary
    pred = sess.run([prediction], feed_dict=feed_dict)
    # do stuff with your prediction vector