I tried to build a convolutional neural network, but I have stumbled over some really strange problems.
First things first, here's my code:
import tensorflow as tf
import numpy as np
import matplotlib.image as mpimg
import glob
x = []
y = 1
for filename in glob.glob('trainig_data/*.jpg'):
    im = mpimg.imread(filename)
    x.append(im)
    if len(x) == 10:
        break
epochs = 5
weights = [tf.Variable(tf.random_normal([5,5,3,32],0.1)),
           tf.Variable(tf.random_normal([5,5,32,64],0.1)),
           tf.Variable(tf.random_normal([5,5,64,128],0.1)),
           tf.Variable(tf.random_normal([75*75*128,1064],0.1)),
           tf.Variable(tf.random_normal([1064,1],0.1))]
def CNN(x, weights):
    output = tf.nn.conv2d([x], weights[0], [1,1,1,1], 'SAME')
    output = tf.nn.relu(output)
    output = tf.nn.conv2d(output, weights[1], [1,2,2,1], 'SAME')
    output = tf.nn.relu(output)
    output = tf.nn.conv2d(output, weights[2], [1,2,2,1], 'SAME')
    output = tf.nn.relu(output)
    output = tf.reshape(output, [-1,75*75*128])
    output = tf.matmul(output, weights[3])
    output = tf.nn.relu(output)
    output = tf.matmul(output, weights[4])
    output = tf.reduce_sum(output)
    return output
sess = tf.Session()
prediction = CNN(tf.cast(x[0],tf.float32), weights)
cost = tf.reduce_mean(tf.square(prediction-y))
train = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
init = tf.global_variables_initializer()
sess.run(init)
for e in range(epochs):
    print('epoch:',e+1)
    for x_i in x:
        prediction = CNN(tf.cast(x_i,tf.float32), weights)
        sess.run([cost, train])
    print(sess.run(cost))
print('optimization finished!')
print(sess.run(prediction))
Now here are my problems:
The values of the weights and filters are not changing
The variable 'cost' is always 1.0
The prediction always puts out a 0
After doing some debugging I found out that the problem must come from the optimizer, because the cost and the prediction are not 1.0 and 0 before I put the weights through the optimizer.
I hope that was enough information and that you can help me with my problem.
Try changing the way you initialise the weights: use tf.truncated_normal instead of tf.random_normal. Refer to this answer, which explains the difference between the two:
tf.truncated_normal: Outputs random values from a truncated normal distribution. The generated values follow a normal distribution with specified mean and standard deviation, except that values whose magnitude is more than 2 standard deviations from the mean are dropped and re-picked.
tf.random_normal: Outputs random values from a normal distribution.
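For example, here is a sketch of the weight list with tf.truncated_normal, using the same filter shapes as in your question. Note that in your original calls the 0.1 is passed as the mean (the second positional argument), not as the standard deviation, so stddev is set explicitly here:

weights = [tf.Variable(tf.truncated_normal([5, 5, 3, 32], stddev=0.1)),
           tf.Variable(tf.truncated_normal([5, 5, 32, 64], stddev=0.1)),
           tf.Variable(tf.truncated_normal([5, 5, 64, 128], stddev=0.1)),
           tf.Variable(tf.truncated_normal([75*75*128, 1064], stddev=0.1)),
           tf.Variable(tf.truncated_normal([1064, 1], stddev=0.1))]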
The code seems to be odd. In the last line of your CNN function, you use tf.reduce_sum to collapse the output to a single value, which is going to be a positive number (most probably greater than 1, i.e. in (0, inf)), since the ReLU activation function only outputs a positive value for inputs lying on the positive x-axis. So I think you should use tf.nn.softmax_cross_entropy_with_logits() instead of squaring that reduced sum. Also try using the sigmoid activation function.
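For instance, a rough sketch of that idea (my assumptions: y stays a 0/1 label and CNN still returns a single scalar, which is then treated as a raw logit rather than being compared with squared error):

logit = CNN(tf.cast(x[0], tf.float32), weights)   # raw scalar output of the network
cost = tf.nn.sigmoid_cross_entropy_with_logits(labels=tf.constant(float(y)),
                                               logits=logit)
train = tf.train.GradientDescentOptimizer(0.01).minimize(cost)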
I am trying to write a custom loss function for a model that utilizes Monte Carlo (MC) dropout. I want the model to run through each sample in a batch n times before feeding the predictions to the loss function. A toy version of the code is shown below. The model has 24 inputs and 10 outputs with 5000 training samples.
import numpy as np
import tensorflow as tf
X = np.random.rand(5000,24)
y = np.random.rand(5000,10)
def MC_Loss(y_true, y_pred):
    mu = tf.math.reduce_mean(y_pred, axis=0)
    #error = tf.square(y_true-mu)
    error = tf.square(y_true - y_pred)
    var = tf.math.reduce_variance(y_pred, axis=0)
    return tf.math.divide(error, var)/2 + tf.math.log(var)/2 + tf.math.log(2*np.pi)/2
input_layer = tf.keras.layers.Input(shape=(X.shape[1],))
hidden_layer = tf.keras.layers.Dense(units=100,activation='elu')(input_layer)
do_layer = tf.keras.layers.Dropout(rate=0.20)(hidden_layer,training=True)
output_layer = tf.keras.layers.Dense(units=10,activation='sigmoid')(do_layer)
model = tf.keras.models.Model(input_layer,output_layer)
model.compile(loss=MC_Loss,optimizer='Adam')
model.fit(X,y,epochs=100,batch_size=128,shuffle=True)
The current shape of y_true and y_pred is (None, 10), with None being the batch_size. I want to be able to have n values for each sample in the batch, so I can get the mean and standard deviation of each sample to use in the loss function. I want these values because the mean and standard deviation should be unique to each sample, not taken across all samples in a batch. The current shape of mu and sigma is (10,), and I would want them to be (None, 10), which would mean y_true and y_pred have the shape (None, n, 10).
How can I accomplish this?
I believe I found the solution after some experimentation. The modified code is shown below.
import numpy as np
import tensorflow as tf
n = 100
X = np.random.rand(5000,24)
X1 = np.concatenate(([X.reshape(X.shape[0],1,X.shape[1]) for _ in range(n)]),axis=1)
y = np.random.rand(5000,10)
y1 = np.concatenate(([y.reshape(y.shape[0],1,y.shape[1]) for _ in range(n)]),axis=1)
def MC_Loss(y_true, y_pred):
    mu = tf.math.reduce_mean(y_pred, axis=1)
    obs = tf.math.reduce_mean(y_true, axis=1)
    error = tf.square(obs - mu)
    var = tf.math.reduce_variance(y_pred, axis=1)
    return tf.math.divide(error, var)/2 + tf.math.log(var)/2 + tf.math.log(2*np.pi)/2
input_layer = tf.keras.layers.Input(shape=(X.shape[1],))
hidden_layer = tf.keras.layers.Dense(units=100,activation='elu')(input_layer)
do_layer = tf.keras.layers.Dropout(rate=0.20)(hidden_layer,training=True)
output_layer = tf.keras.layers.Dense(units=10,activation='sigmoid')(do_layer)
model = tf.keras.models.Model(input_layer,output_layer)
model.compile(loss=MC_Loss,optimizer='Adam')
model.fit(X1,y1,epochs=100,batch_size=128,shuffle=True)
So what I am now doing is stacking the inputs and outputs along an intermediate axis, creating n identical copies of every input and output sample. TensorFlow shows a warning because the model is created without knowledge of this intermediate axis, but it still trains with no issues and the shapes are as expected.
Note: since y_true now has the shape (None, n, 10), you have to take the mean along the intermediate axis, which gives you the true value since all n copies are identical.
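As a side note (assuming the same n as above), the tiled arrays can also be built more concisely with np.repeat, which produces the same (5000, n, ...) shapes as the concatenate approach:

X1 = np.repeat(X[:, np.newaxis, :], n, axis=1)  # shape (5000, n, 24)
y1 = np.repeat(y[:, np.newaxis, :], n, axis=1)  # shape (5000, n, 10)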
I have a pre-trained model that I load and it effectively works (i.e. I can make predictions). I want to get the gradients of the model with respect to a certain input, however I cannot manage to get any meaningful results. The output is always None.
My code:
sess = tf.Session()
K.set_session(sess)
x = X_test[0].reshape(1,100)
y = np.reshape(Y_test[0], (1,1))
tf_y = tf.convert_to_tensor(y,dtype=np.float32)
model2 = ClassificationModel(config, logging).model
model2.load_weights("class_models/model.382-0.46-0.87.h5")
# predict real x_test
y_hat = model2.predict(x)
tf_y_hat = tf.convert_to_tensor(y_hat, dtype=np.float32)
loss = keras.losses.binary_crossentropy(tf_y,tf_y_hat)
grad, = K.gradients(loss,x)
print(grad)
And the output I get for the print is None. What am I doing wrong? How do I get the gradient given my model?
With your current code, TensorFlow cannot connect x to the computational graph of loss, since loss is created from a numpy array (y_hat) and x is also just a numpy array. The following code should work instead:
tf_x = tf.convert_to_tensor(x, dtype=np.float32)
loss = tf.keras.losses.binary_crossentropy(tf_y, model2(tf_x))
grad, = K.gradients(loss, tf_x)
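To get actual numbers out of grad, you still need to evaluate it in the session (a sketch, assuming the same sess you registered with K.set_session above):

grad_value = sess.run(grad)   # tf_x is a constant tensor, so no feed_dict is needed
print(grad_value.shape)       # should match x, i.e. (1, 100)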
import tensorflow as tf
x = tf.placeholder(tf.float32, [None,4]) # input vector
w1 = tf.Variable(tf.random_normal([4,2])) # weights between first and second layers
b1 = tf.Variable(tf.zeros([2])) # biases added to hidden layer
w2 = tf.Variable(tf.random_normal([2,1])) # weights between second and third layer
b2 = tf.Variable(tf.zeros([1])) # biases added to third (output) layer
def feedForward(x,w,b): # function for forward propagation
    Input = tf.add(tf.matmul(x,w), b)
    Output = tf.sigmoid(Input)
    return Output
>>> Out1 = feedForward(x,w1,b1) # output of first layer
>>> Out2 = feedForward(Out1,w2,b2) # output of second layer
>>> MHat = 50*Out2 # final prediction is in the range (0,50)
>>> M = tf.placeholder(tf.float32, [None,1]) # placeholder for actual (target value of marks)
>>> J = tf.reduce_mean(tf.square(MHat - M)) # cost function -- mean square errors
>>> train_step = tf.train.GradientDescentOptimizer(0.05).minimize(J) # minimize J using Gradient Descent
>>> sess = tf.InteractiveSession() # create interactive session
>>> tf.global_variables_initializer().run() # initialize all weight and bias variables with specified values
>>> xs = [[1,3,9,7],
          [7,9,8,2],   # x training data
          [2,4,6,5]]
>>> Ms = [[47],
          [43],        # M training data
          [39]]
>>> for _ in range(1000): # performing learning process on training data 1000 times
        sess.run(train_step, feed_dict = {x:xs, M:Ms})
>>> print(sess.run(MHat, feed_dict = {x:[[1,3,9,7]]}))
[[ 50.]]
>>> print(sess.run(MHat, feed_dict = {x:[[1,15,9,7]]}))
[[ 50.]]
>>> print(sess.run(tf.transpose(MHat), feed_dict = {x:[[1,15,9,7]]}))
[[ 50.]]
In this code, I am trying to predict the marks M of a student out of 50 given how many hours he/she slept, studied, used electronics and played. These 4 features come under the input feature vector x.
To solve this regression problem, I am using a deep neural network with an input layer of 4 perceptrons (the input features), a hidden layer with two perceptrons and an output layer with one perceptron. I have used sigmoid as the activation function. But I am getting the exact same prediction ([[50.0]]) for M for every input vector I feed in. Can someone please tell me what is wrong with the code above? I highly appreciate the help (in advance)!
You would need to modify your feedForward() function. You don't need to apply sigmoid() at the last layer (simply return the result of the affine transformation), and there is also no need to multiply the output of this function by 50.
def feedForward(X, W1, b1, W2, b2):
    Z = tf.sigmoid(tf.matmul(X, W1) + b1)
    return tf.matmul(Z, W2) + b2

MHat = feedForward(x, w1, b1, w2, b2)
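With that change, the cost, optimizer and training loop from your question can stay exactly as they are; a short sketch of re-running them (assuming the same interactive session and training data):

J = tf.reduce_mean(tf.square(MHat - M))
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(J)
tf.global_variables_initializer().run()
for _ in range(1000):
    sess.run(train_step, feed_dict={x: xs, M: Ms})
print(sess.run(MHat, feed_dict={x: [[1, 3, 9, 7]]}))  # predictions should now vary with the input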
Hope this helps!
Don't forget to let us know if it solved your problem :)
I have highly unbalanced data in a two-class problem that I am trying to solve with a NN in TensorFlow. I was able to find a posting that exactly described the difficulty I'm having and gave a solution which appears to address my problem. However, I'm working with an assistant, and neither of us really knows Python, so TensorFlow is being used like a black box by us. I have decades of experience working in a variety of programming languages and paradigms. That experience gives me a pretty good intuitive grasp of what I see happening in the code my assistant cobbled together to get a working model, but neither of us can follow what is going on well enough to tell exactly where in TensorFlow we need to make edits to get what we want.
I'm hoping someone with a good knowledge of Python and TensorFlow can look at this and just tell us something like, "Hey, just edit the file called xxx and at the lines at yyy," so we can get on with it.
Below, I have a link to the solution we want to implement, and I've also included the code my assistant wrote that initially got us up and running. Our code produces good results when our data is balanced, but when it is highly imbalanced, it tends to classify everything into the larger class to get a better overall score.
Here is a link to the solution we found that looks promising:
Loss function for class imbalanced binary classifier in Tensor flow
I've included the relevant code from this link below. Since I know that where we make these edits will depend on how we are using TensorFlow, I've also included our implementation immediately under it in the same code block with appropriate comments to make it clear what we want to add and what we are currently doing:
# Here is the stuff we need to add some place in the TensorFlow source code:
ratio = 31.0 / (500.0 + 31.0)
class_weight = tf.constant([[ratio, 1.0 - ratio]])
logits = ... # shape [batch_size, 2]
weight_per_label = tf.transpose( tf.matmul(labels
                   , tf.transpose(class_weight)) ) #shape [1, batch_size]
# this is the weight for each datapoint, depending on its label
xent = tf.mul(weight_per_label
       , tf.nn.softmax_cross_entropy_with_logits(logits, labels, name="xent_raw")) #shape [1, batch_size]
loss = tf.reduce_mean(xent) #shape 1
# NOW HERE IS OUR OWN CODE TO SHOW HOW WE ARE USING TensorFlow:
# (Obviously this is not in the same file in real life ...)
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf
import numpy as np
from math import exp
from PreProcessData import load_and_process_training_Data, \
    load_and_process_test_data
from PrintUtilities import printf, printResultCompare
tf.set_random_seed(0)
#==============================================================
# predefine file path
''' Unbalanced Training Data, hence there are 1:11 target and nontarget '''
targetFilePath = '/Volumes/Extend/BCI_TestData/60FeaturesVersion/Train1-35/tar.txt'
nontargetFilePath = '/Volumes/Extend/BCI_TestData/60FeaturesVersion/Train1-35/nontar.txt'
testFilePath = '/Volumes/Extend/BCI_TestData/60FeaturesVersion/Test41/feats41.txt'
labelFilePath = '/Volumes/Extend/BCI_TestData/60FeaturesVersion/Test41/labs41.txt'
# train_x, train_y = load_and_process_training_Data(targetFilePath, nontargetFilePath)
train_x, train_y = load_and_process_training_Data(targetFilePath, nontargetFilePath)
# test_x,test_y = load_and_process_test_data(testFilePath,labelFilePath)
test_x, test_y = load_and_process_test_data(testFilePath,labelFilePath)
# trained neural network path
save_path = "nn_saved_model/model.ckpt"
# number of classes
n_classes = 2 # in this case, target or non_target
# number of hidden layers
num_hidden_layers = 1
# number of nodes in each hidden layer
nodes_in_layer1 = 40
nodes_in_layer2 = 100
nodes_in_layer3 = 30 # We think: 3 layers is dangerous!! try to avoid it!!!!
# number of data features in each blocks
block_size = 3000 # computer may not have enough memory, so we divide the train into blocks
# number of times we iterate through training data
total_iterations = 1000
# terminate training if computed loss < supposed loss
expected_loss = 0.1
# max learning rate and min learning rate
max_learning_rate = 0.002
min_learning_rate = 0.0002
# These are placeholders for some values in graph
# tf.placeholder(dtype, shape=None(optional), name=None(optional))
# It's a tensor to hold our datafeatures
x = tf.placeholder(tf.float32, [None,len(train_x[0])])
# Every row has either [1,0] for targ or [0,1] for non_target. placeholder to hold one hot value
Y_C = tf.placeholder(tf.int8, [None, n_classes])
# variable learning rate
lr = tf.placeholder(tf.float32)
# neural network model
def neural_network_model(data):
    if (num_hidden_layers == 1):
        # layers contain weights and bias for case like all neurons fired a 0 into the layer, we will need result out
        # When using RELUs, make sure biases are initialised with small *positive* values for example 0.1 = tf.ones([K])/10
        hidden_1_layer = {'weights': tf.Variable(tf.random_normal([len(train_x[0]), nodes_in_layer1])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer1]) / 10)}
        # no more bias when coming to the output layer
        output_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer1, n_classes])),
                        'bias': tf.Variable(tf.zeros([n_classes]))}
        # multiplication of the raw input data by their unique weights (starting as random, but will be optimized)
        l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['bias'])
        l1 = tf.nn.relu(l1)
        # We repeat this process for each of the hidden layers, all the way down to our output, where we have the final values still being the multiplication of the input and the weights, plus the output layer's bias values.
        Ylogits = tf.matmul(l1, output_layer['weights']) + output_layer['bias']
    if (num_hidden_layers == 2):
        # layers contain weights and bias for case like all neurons fired a 0 into the layer, we will need result out
        # When using RELUs, make sure biases are initialised with small *positive* values for example 0.1 = tf.ones([K])/10
        hidden_1_layer = {'weights': tf.Variable(tf.random_normal([len(train_x[0]), nodes_in_layer1])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer1]) / 10)}
        hidden_2_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer1, nodes_in_layer2])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer2]) / 10)}
        # no more bias when coming to the output layer
        output_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer2, n_classes])),
                        'bias': tf.Variable(tf.zeros([n_classes]))}
        # multiplication of the raw input data by their unique weights (starting as random, but will be optimized)
        l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['bias'])
        l1 = tf.nn.relu(l1)
        l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['bias'])
        l2 = tf.nn.relu(l2)
        # We repeat this process for each of the hidden layers, all the way down to our output, where we have the final values still being the multiplication of the input and the weights, plus the output layer's bias values.
        Ylogits = tf.matmul(l2, output_layer['weights']) + output_layer['bias']
    if (num_hidden_layers == 3):
        # layers contain weights and bias for case like all neurons fired a 0 into the layer, we will need result out
        # When using RELUs, make sure biases are initialised with small *positive* values for example 0.1 = tf.ones([K])/10
        hidden_1_layer = {'weights': tf.Variable(tf.random_normal([len(train_x[0]), nodes_in_layer1])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer1]) / 10)}
        hidden_2_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer1, nodes_in_layer2])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer2]) / 10)}
        hidden_3_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer2, nodes_in_layer3])),
                          'bias': tf.Variable(tf.ones([nodes_in_layer3]) / 10)}
        # no more bias when coming to the output layer
        output_layer = {'weights': tf.Variable(tf.random_normal([nodes_in_layer3, n_classes])),
                        'bias': tf.Variable(tf.zeros([n_classes]))}
        # multiplication of the raw input data by their unique weights (starting as random, but will be optimized)
        l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['bias'])
        l1 = tf.nn.relu(l1)
        l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['bias'])
        l2 = tf.nn.relu(l2)
        l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['bias'])
        l3 = tf.nn.relu(l3)
        # We repeat this process for each of the hidden layers, all the way down to our output, where we have the final values still being the multiplication of the input and the weights, plus the output layer's bias values.
        Ylogits = tf.matmul(l3, output_layer['weights']) + output_layer['bias']
    return Ylogits  # return the neural network model
# set up the training process
def train_neural_network(x):
    # produce the prediction based on the output of the nn model
    Ylogits = neural_network_model(x)
    # measure the error with the built-in cross entropy function, the value that we want to minimize
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=Ylogits, labels=Y_C))
    # To optimize our cost (cross_entropy), reduce error, default learning_rate is 0.001, but you can change it, this case we use default
    # optimizer = tf.train.GradientDescentOptimizer(0.003)
    optimizer = tf.train.AdamOptimizer(lr)
    train_step = optimizer.minimize(cross_entropy)
    # start the session
    with tf.Session() as sess:
        # We initialize all of our variables first before we start
        sess.run(tf.global_variables_initializer())
        # iterate epoch count times (cycles of feed forward and back prop); each epoch means the network sees all train_data once
        for epoch in range(total_iterations):
            # count the total cost per epoch, declining means better result
            epoch_loss = 0
            i = 0
            decay_speed = 150
            # current learning rate
            learning_rate = min_learning_rate + (max_learning_rate - min_learning_rate) * exp(-epoch/decay_speed)
            # divide the dataset into data_set/batch_size in case we run out of memory
            while i < len(train_x):
                # load train data
                start = i
                end = i + block_size
                batch_x = np.array(train_x[start:end])
                batch_y = np.array(train_y[start:end])
                train_data = {x: batch_x, Y_C: batch_y, lr: learning_rate}
                # train
                # sess.run(train_step, feed_dict=train_data)
                # run optimizer and cost against batch of data.
                _, c = sess.run([train_step, cross_entropy], feed_dict=train_data)
                epoch_loss += c
                i += block_size
            # print iteration status
            printf("epoch: %5d/%d , loss: %f", epoch, total_iterations, epoch_loss)
            # terminate training when loss < expected_loss
            if epoch_loss < expected_loss:
                break
        # how many predictions we made that were perfect matches to their labels
        # test model
        # test data
        test_data = {x: test_x, Y_C: test_y}
        # calculate accuracy
        correct_prediction = tf.equal(tf.argmax(Ylogits, 1), tf.argmax(Y_C, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float'))
        print('Accuracy:', accuracy.eval(test_data))
        # result matrix, return the position of 1 in array
        result = (sess.run(tf.argmax(Ylogits.eval(feed_dict=test_data), 1)))
        answer = []
        for i in range(len(test_y)):
            if test_y[i] == [0,1]:
                answer.append(1)
            elif test_y[i] == [1,0]:
                answer.append(0)
        answer = np.array(answer)
        printResultCompare(result, answer)
        # save the prediction of correctness
        np.savetxt('nn_prediction.txt', Ylogits.eval(feed_dict={x: test_x}), delimiter=',', newline="\r\n")
        # save the nn model for later use again
        # 'Saver' op to save and restore all the variables
        saver = tf.train.Saver()
        saver.save(sess, save_path)
        #print("Model saved in file: %s" % save_path)
# load the trained neural network model
def test_loaded_neural_network(trained_NN_path):
    Ylogits = neural_network_model(x)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        # load saved model
        saver.restore(sess, trained_NN_path)
        print("Loading variables from '%s'." % trained_NN_path)
        np.savetxt('nn_prediction.txt', Ylogits.eval(feed_dict={x: test_x}), delimiter=',', newline="\r\n")
        # test model
        # result matrix
        result = (sess.run(tf.argmax(Ylogits.eval(feed_dict={x: test_x}), 1)))
        # answer matrix
        answer = []
        for i in range(len(test_y)):
            if test_y[i] == [0,1]:
                answer.append(1)
            elif test_y[i] == [1,0]:
                answer.append(0)
        answer = np.array(answer)
        printResultCompare(result, answer)
        # calculate accuracy
        correct_prediction = tf.equal(tf.argmax(Ylogits, 1), tf.argmax(Y_C, 1))
        print(Ylogits.eval(feed_dict={x: test_x}).shape)
train_neural_network(x)
#test_loaded_neural_network(save_path)
So, can anyone help point us to the right place to make the edits that we need to make to resolve our problem? (i.e. what is the name of the file we need to edit, and where is it located.) Thanks in advance!
-gt-
The answer you want:
You should add this code to your train_neural_network(x) function:
ratio = (number of class-1 samples) / ((number of class-0 samples) + (number of class-1 samples))
class_weight = tf.constant([[ratio, 1.0 - ratio]])
Ylogits = neural_network_model(x)
weight_per_label = tf.transpose(tf.matmul(Y_C, tf.transpose(class_weight)))
cross_entropy = tf.reduce_mean(tf.multiply(weight_per_label,
                tf.nn.softmax_cross_entropy_with_logits(logits=Ylogits, labels=Y_C)))
optimizer = tf.train.AdamOptimizer(lr)
train_step = optimizer.minimize(cross_entropy)
instead of these lines:
Ylogits = neural_network_model(x)
# measure the error use build in cross entropy function, the value that we want to minimize
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=Ylogits, labels=Y_C))
# To optimize our cost (cross_entropy), reduce error, default learning_rate is 0.001, but you can change it, this case we use default
# optimizer = tf.train.GradientDescentOptimizer(0.003)
optimizer = tf.train.AdamOptimizer(lr)
train_step = optimizer.minimize(cross_entropy)
More Details:
In a neural network we calculate the error of the prediction with respect to the targets (the true labels); in your case, you use the cross-entropy error, which is the sum of the targets multiplied by the log of the predicted probabilities.
The optimizer of the network backpropagates to minimize this error and achieve higher accuracy.
Without a weighted loss, the weight for each class is equal, so the optimizer mainly reduces the error for the class that has more samples and overlooks the other class.
So in order to prevent this, we should force the optimizer to backpropagate a larger error for the class with fewer samples; to do this, we should multiply the errors by a per-class scalar.
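If you don't want to hard-code the ratio, here is a minimal sketch of computing it from the one-hot training labels (my assumption: each row of train_y is [1,0] or [0,1], as in your comments, and "class 1" is the second column):

labels_arr = np.array(train_y)                             # shape [num_samples, 2], one-hot rows
num_class1 = labels_arr[:, 1].sum()                        # samples whose label is [0, 1]
ratio = num_class1 / float(len(labels_arr))                # (num of class-1 samples) / (total samples)
class_weight = tf.constant([[ratio, 1.0 - ratio]])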
I hope it was useful :)
I've written my first TensorFlow program (using my own data). It works well, or at least it doesn't crash, but I'm getting weird accuracy values of either 0 or 1.
.................................
The previous part of the code is only about handling the CSV file and getting the data into the correct format / shapes.
......................................................
# Tensoflow
x = tf.placeholder(tf.float32,[None,len(Training_Data[0])],name='Train_data')# each input has a 457 lenght
y_ = tf.placeholder(tf.float32,[None, numberOFClasses],name='Labels')#
#w = tf.Variable(tf.zeros([len(Training_Data[0]),numberOFClasses]),name='Weights')
w = tf.Variable(tf.truncated_normal([len(Training_Data[0]),numberOFClasses],stddev=1./10),name='Weights')
b = tf.Variable(tf.zeros([numberOFClasses]),name='Biases')
model = tf.add(tf.matmul(x,w),b)
y = tf.nn.softmax(model)
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
#cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for j in range(len(train_data)):
    if(np.shape(train_data) == (batchSize,numberOFClasses)):
        sess.run(train_step,feed_dict={x:train_data[j],y_:np.reshape(train_labels[j],(batchSize,numberOFClasses)) })
correct_prediction = tf.equal(tf.arg_max(y,1),tf.arg_max(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction,"float"))
accuracy_vector= []
current_class =[]
for i in range(len(Testing_Data)):
    if( np.shape(Testing_Labels[i]) == (numberOFClasses,)):
        accuracy_vector.append(sess.run(accuracy,feed_dict={x:np.reshape(Testing_Data[i],(1,457)),y_:np.reshape(Testing_Labels[i],(1,19))}))#,i)#,Test_Labels[i])
        current_class.append(int(Test_Raw[i][-1]))
Plotting the accuracy_vector delivers the following (plot image omitted; the values are only ever 0 or 1).
Any idea what I'm missing here?
Thanks a lot for any hint!
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
tf.nn.softmax_cross_entropy_with_logits wants unscaled logits.
From the doc:
WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.
This means that the line y = tf.nn.softmax(model) is wrong.
Instead, you want to pass unscaled logits to that function, thus:
y = model
Moreover, once you fix this problem, if the network doesn't work, try lowering the learning rate from 0.01 to something around 1e-3 or 1e-4. (I mention this because 1e-2 is usually a "high" learning rate.)
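Putting both suggestions together, a sketch of the corrected lines (everything else in the question's code stays as it is):

model = tf.add(tf.matmul(x, w), b)
y = model  # unscaled logits: no tf.nn.softmax here
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(1e-3).minimize(cross_entropy)  # smaller learning rate, e.g. 1e-3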
You're testing on batches of size 1, so each prediction is either right or wrong, which means you can only get an accuracy of 0 or 1:
accuracy_vector.append(sess.run(accuracy,feed_dict={x:np.reshape(Testing_Data[i],(1,457)),y_:np.reshape(Testing_Labels[i],(1,19))}))#,i)#,Test_Labels[i])
Just use a bigger batch size:
accuracy_vector.append(sess.run(accuracy,feed_dict={x:np.reshape(Testing_Data[i:i+batch_size],(batch_size,457)),y_:np.reshape(Testing_Labels[i:i+batch_size],(batch_size,19))}))
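Alternatively, if the whole test set fits in memory, you can evaluate the accuracy over all test samples in a single run (a sketch, assuming Testing_Data and Testing_Labels can be reshaped to (N, 457) and (N, 19)):

test_accuracy = sess.run(accuracy, feed_dict={x: np.reshape(Testing_Data, (-1, 457)),
                                              y_: np.reshape(Testing_Labels, (-1, 19))})
print(test_accuracy)  # a single fraction between 0 and 1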