I am trying to solve the Titanic problem on Kaggle, and I am unsure how to get the output for given test data.
I successfully train the network and then call the method make_prediction(x, test_x):
x = tf.placeholder('float', [None, ip_features])
...
def make_prediction(x, test_data):
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        prediction = sess.run(y, feed_dict={x: test_data})
    return prediction
I am not sure how to pass a np.array (test_data in this case) so that I get back a np.array containing the 0/1 predictions.
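I combined your train_neural_network and make_prediction functions into a single function. A new tf.Session that runs tf.global_variables_initializer() starts from freshly initialized weights, so the predictions have to be made in the session that did the training (or from a restored checkpoint). Applying tf.nn.softmax to the model output maps the values into the range 0 to 1 (interpreted as probabilities), and tf.argmax then extracts the index of the column with the higher probability. Note that the placeholder for y in this case needs to be one-hot encoded. (If you are not one-hot encoding y here, then pred_y=tf.round(tf.nn.softmax(model)) would convert the output of softmax into 0 or 1. Keep in mind that softmax over a single output column always returns 1, so with one output unit tf.nn.sigmoid is the function to round.)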
def train_neural_network_and_make_prediction(train_X, test_X):
    model = neural_network_model(x)
    # named arguments are required from TensorFlow 1.0 on
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=model, labels=y))
    optimizer = tf.train.AdamOptimizer().minimize(cost)
    # index of the most probable class for each row
    pred_y = tf.argmax(tf.nn.softmax(model), 1)
    epochs = 10
    with tf.Session() as sess:
        # tf.initialize_all_variables() is deprecated in favor of this call
        sess.run(tf.global_variables_initializer())
        for epoch in range(epochs):
            epoch_cost = 0
            i = 0
            while i < len(titanic_train):
                start = i
                end = i + batch_size
                batch_x = np.array(train_x[start:end])
                batch_y = np.array(train_y[start:end])
                _, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})
                epoch_cost += c
                i += batch_size
            print("Epoch", epoch + 1, "completed with a cost of", epoch_cost)
        # make predictions on test data, still inside the training session
        predictions = pred_y.eval(feed_dict={x: test_X})
    return predictions
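To make the softmax and argmax step concrete, here is a minimal NumPy sketch of what pred_y computes on raw model outputs. The logits values and the column meaning (column 0 = not survived, column 1 = survived) are assumptions chosen for illustration; they do not come from the original code.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum keeps exp() from overflowing."""
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

# Hypothetical raw outputs for 3 passengers.
# Assumed column meaning: [not survived, survived]
logits = np.array([[2.0, -1.0],
                   [0.3, 0.9],
                   [-4.0, 4.0]])

probabilities = softmax(logits)                 # each row sums to 1
predictions = np.argmax(probabilities, axis=1)  # index of the larger probability
print(predictions)
```

Since softmax preserves the ordering within each row, taking argmax of the logits directly would give the same class indices; the softmax is only needed when the probabilities themselves are of interest.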
I have training data with 1000 rows. I am using TensorFlow to train on this data and am trying to divide it into mini-batches of size 32. While training, I get the error below:
InvalidArgumentError: Incompatible shapes: [1000] vs. [32]
[[{{node logistic_loss_1/mul}}]]
By contrast, if I don't divide my training data into mini-batches, or if I use a single mini-batch of size 1000, the code works fine.
I have defined the weights as tf.Variables and run them in a TensorFlow session. See the code below:
def sigmoid_cost(z,Y):
    print("Entered Cost")
    z = tf.squeeze(z)
    Y = tf.cast(Y_train,tf.float64)
    logits = tf.transpose(z)
    labels = (Y)
    print(logits.shape)
    print(labels.shape)
    return tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=labels,logits=logits))

def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,
          num_epochs = 1500, minibatch_size = 32, print_cost = True):
    hidden_layer = 4
    m,n = X_train.shape
    n_y = Y_train.shape[0]
    X = tf.placeholder(tf.float64,shape=(None,n), name="X")
    Y = tf.placeholder(tf.float64,shape=(None),name="Y")
    parameters = init_params(n)
    z4, parameters = fwd_model(X,parameters)
    cost = sigmoid_cost(z4,Y)
    num_minibatch = m/minibatch_size
    print("Getting Minibatches")
    num_minibatch = tf.cast(num_minibatch,tf.int32)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(cost)
    print("Gradient Defination Done")
    init = tf.global_variables_initializer()
    init_op = tf.initialize_all_variables()
    with tf.Session() as sess:
        sess.run(init)
        sess.run(init_op)
        for epoch in range(0,num_epochs):
            minibatches = []
            minibatches = minibatch(X_train,Y_train,minibatch_size)
            minibatch_cost = 0
            for i in range (0,len(minibatches)):
                (X_m,Y_m) = minibatches[i]
                Y_m = np.squeeze(Y_m)
                print("Minibatch %d X shape Y Shape ",i, X_m.shape,Y_m.shape)
                _ , minibatch_cost = sess.run([optimizer, cost], feed_dict={X: X_m, Y: Y_m})
                print("Mini Batch Cost is ",minibatch_cost)
            epoch_cost = minibatch_cost/num_minibatch
            if print_cost == True and epoch % 100 == 0:
                print ("Cost after epoch %i: %f" % (epoch, epoch_cost))
                print(epoch_cost)
For some reason, while running the cost function the size of either X or Y batch is being taken as 32, 100 or vice-versa. Any help would be appreciated.
I think you are getting the above error because of the line Y = tf.cast(Y_train, tf.float64) inside the sigmoid_cost function. Here, Y_train has 1000 rows, but the loss function expects 32 labels (your batch size), one per logit.
It should be Y = tf.cast(Y, tf.float64). In fact, there is no need to cast the data type here, as Y is already of type tf.float64. Check the line below:
Y = tf.placeholder(tf.float64,shape=(None),name="Y")
That's why your code worked fine when you used a single minibatch of size 1000 (the full Y_train data).
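The shape mismatch can be reproduced outside TensorFlow. The following is a rough NumPy sketch, not the asker's code: it uses the numerically stable expression documented for tf.nn.sigmoid_cross_entropy_with_logits, and the label and logit values are randomly generated stand-ins.

```python
import numpy as np

def sigmoid_cross_entropy(labels, logits):
    """Mean sigmoid cross-entropy, using the numerically stable form
    max(x, 0) - x * z + log(1 + exp(-|x|)) with x = logits, z = labels."""
    x = np.asarray(logits, dtype=np.float64)
    z = np.asarray(labels, dtype=np.float64)
    return np.mean(np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x))))

rng = np.random.default_rng(0)
Y_train = rng.integers(0, 2, size=1000)  # hypothetical full label vector (1000 rows)
batch_logits = rng.normal(size=32)       # hypothetical logits for one minibatch
batch_labels = Y_train[:32]              # labels that belong to that minibatch

# The bug: the full label vector is paired with one minibatch of logits
try:
    sigmoid_cross_entropy(Y_train, batch_logits)
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)
print("mismatch:", mismatch_error)

# The fix: use the labels fed for the same minibatch
batch_cost = sigmoid_cross_entropy(batch_labels, batch_logits)
print("batch cost:", batch_cost)
```

With a single batch of 1000 rows both arrays have 1000 entries, which is why the unbatched run never exposed the problem.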
I want to get predictions from my trained TensorFlow model. The following is the code I have for training my model.
def train_model(self, train, test, learning_rate=0.0001, num_epochs=16, minibatch_size=32, print_cost=True, graph_filename='costs'):
    # Ensure that model can be rerun without overwriting tf variables
    ops.reset_default_graph()
    # For reproducibility
    tf.set_random_seed(42)
    seed = 42
    # Get input and output shapes
    (n_x, m) = train.images.T.shape
    n_y = train.labels.T.shape[0]
    costs = []
    # Create placeholders of shape (n_x, n_y)
    X, Y = self.create_placeholders(n_x, n_y)
    # Initialize parameters
    parameters = self.initialize_parameters()
    # Forward propagation
    Z3 = self.forward_propagation(X, parameters)
    # Cost function
    cost = self.compute_cost(Z3, Y)
    # Backpropagation (using Adam optimizer)
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
    # Initialize variables
    init = tf.global_variables_initializer()
    # Start session to compute Tensorflow graph
    with tf.Session() as sess:
        # Run initialization
        sess.run(init)
        # Training loop
        for epoch in range(num_epochs):
            epoch_cost = 0.
            num_minibatches = int(m / minibatch_size)
            seed = seed + 1
            for i in range(num_minibatches):
                # Get next batch of training data and labels
                minibatch_X, minibatch_Y = train.next_batch(minibatch_size)
                # Execute optimizer and cost function
                _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X.T, Y: minibatch_Y.T})
                # Update epoch cost
                epoch_cost += minibatch_cost / num_minibatches
            # Print the cost every epoch
            if print_cost == True:
                print("Cost after epoch {epoch_num}: {cost}".format(epoch_num=epoch, cost=epoch_cost))
                costs.append(epoch_cost)
        # Plot costs
        plt.figure(figsize=(16,5))
        plt.plot(np.squeeze(costs), color='#2A688B')
        plt.xlim(0, num_epochs-1)
        plt.ylabel("cost")
        plt.xlabel("iterations")
        plt.title("learning rate = {rate}".format(rate=learning_rate))
        plt.savefig(graph_filename, dpi=300)
        plt.show()
        # Save parameters
        parameters = sess.run(parameters)
        print("Parameters have been trained!")
        # Calculate correct predictions
        correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))
        # Calculate accuracy on test set
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        print ("Train Accuracy:", accuracy.eval({X: train.images.T, Y: train.labels.T}))
        print ("Test Accuracy:", accuracy.eval({X: test.images.T, Y: test.labels.T}))
        return parameters
After training the model, I want to extract the predictions from it.
So I added:
print(sess.run(accuracy, feed_dict={X: test.images.T}))
But I see the error below after running the above code:
InvalidArgumentError: You must feed a value for placeholder tensor 'Y'
with dtype float and shape [10,?]
[[{{node Y}} = Placeholderdtype=DT_FLOAT, shape=[10,?], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Any help is welcome.
The tensor accuracy is a function of the tensor correct_prediction, which in turn is a function of (among other things) Y.
So you're correctly being told that you need to feed values for that placeholder too.
I'm assuming Y holds your labels, so it should also make intuitive sense that your feed_dict has to contain the correct Y values: accuracy cannot be computed without them.
Hope that helps.
Good luck!
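If the goal is the predictions themselves rather than the accuracy, one option is to evaluate a tensor that depends only on X, for example tf.argmax(Z3), with only X in the feed_dict. The NumPy sketch below illustrates the dependency difference; the function names, logits, and labels are hypothetical, and the arrays follow the (classes, examples) layout implied by the [10,?] placeholder shape.

```python
import numpy as np

def predict_classes(logits):
    """Predicted class per example; needs only the model output.
    logits has shape (n_classes, n_examples), i.e. one column per example."""
    return np.argmax(logits, axis=0)

def accuracy(logits, labels_one_hot):
    """Fraction of correct predictions; needs the true labels as well."""
    return np.mean(predict_classes(logits) == np.argmax(labels_one_hot, axis=0))

# Hypothetical model outputs for 3 classes and 4 examples
logits = np.array([[2.0, 0.1, 0.3, 1.5],
                   [0.5, 1.9, 0.2, 1.4],
                   [0.1, 0.4, 2.2, 0.3]])
# Hypothetical one-hot label columns for the true classes 0, 1, 2, 1
labels = np.eye(3)[:, [0, 1, 2, 1]]

predictions = predict_classes(logits)  # no labels required
score = accuracy(logits, labels)       # labels required
print(predictions, score)
```

The same split applies to the graph above: anything built from Y needs Y fed, while anything built only from Z3 needs only X.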
I am training a model to predict Time Series using an RNN model. This model is trained without any issue. Here's the original code:
tf.reset_default_graph()
num_inputs = 1
num_neurons = 100
num_outputs = 1
learning_rate = 0.0001
num_train_iterations = 2000
batch_size = 1
X = tf.placeholder(tf.float32, [None, time_steps-1, num_inputs])
y = tf.placeholder(tf.float32, [None, time_steps-1, num_outputs])
cell = tf.contrib.rnn.OutputProjectionWrapper(
    tf.contrib.rnn.BasicRNNCell(num_units=num_neurons, activation=tf.nn.relu),
    output_size=num_outputs)
outputs, states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
loss = tf.reduce_mean(tf.square(outputs - y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train = optimizer.minimize(loss)
init = tf.global_variables_initializer()
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.75)
with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
    sess.run(init)
    for iteration in range(num_train_iterations):
        elx,ely = next_batch(training_data, time_steps)
        sess.run(train, feed_dict={X: elx, y: ely})
        if iteration % 100 == 0:
            mse = loss.eval(feed_dict={X: elx, y: ely})
            print(iteration, "\tMSE:", mse)
The problem comes when I change tf.contrib.rnn.BasicRNNCell to tf.contrib.rnn.BasicLSTMCell: there is a huge slowdown, and the loss (the mse variable) becomes NaN. My best guess is that MSE is the wrong loss function and that I should try cross-entropy. I searched for similar code and found that tf.nn.softmax_cross_entropy_with_logits() could be the solution, but I still don't understand how to apply it to my problem.
NaN usually occurs when your gradients blow up.
Here is some code that uses tf.nn.softmax. Have a try.
#Output Layer
logit = tf.add(tf.matmul(H1,w2),b2)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logit,labels=Y)
#Cost
cost = (tf.reduce_mean(cross_entropy))
#Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
#Prediction
y_pred = tf.nn.softmax(logit)
pred = tf.argmax(y_pred, axis=1 )
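A common guard against exploding gradients, which the answer above does not mention, is gradient clipping. TensorFlow 1.x offers tf.clip_by_global_norm for this purpose. The NumPy sketch below only mirrors the idea with hypothetical gradient values: all gradients are scaled by one shared factor so that their joint L2 norm does not exceed clip_norm.

```python
import numpy as np

def clip_by_global_norm(grads, clip_norm):
    """Scale every gradient by the same factor so that the joint L2 norm
    of all gradients is at most clip_norm. Gradients that are already
    within the limit are returned unchanged (scale factor 1)."""
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = clip_norm / max(global_norm, clip_norm)
    return [g * scale for g in grads], global_norm

# Hypothetical gradients for two parameter tensors
grads = [np.array([3.0, 4.0]), np.array([[12.0]])]

clipped, norm_before = clip_by_global_norm(grads, clip_norm=1.0)
norm_after = np.sqrt(sum(np.sum(g ** 2) for g in clipped))
print(norm_before, norm_after)
```

Because one factor is applied to all tensors, the direction of the overall update is preserved and only its length is limited. Whether clipping resolves the NaN in the LSTM case above is an assumption that would need to be tested.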
I am doing a project in which I apply minibatches to a neural network and calculate the epoch cost:
def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001, num_epochs = 1500,
          minibatch_size = 32, print_cost = True):
    ops.reset_default_graph()
    tf.set_random_seed(1)
    seed = 3
    costs = []
    (n_x, m) = X_train.shape
    n_y = Y_train.shape[0]
    #create placeholder
    X, Y = create_placeholder(n_x, n_y)
    # init parameter
    parameters = init_parameter()
    # forward prop
    Z3 = forward_prop(X, parameters)
    # compute cost
    cost = compute_cost(Z3, Y)
    # optimizer
    optimizer = tf.train.AdamOptimizer(learning_rate= learning_rate).minimize(cost)
    # Initialize all variables
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        for epoch in range(num_epochs):
            epoch_cost = 0
            num_minibatche = int(m/ minibatch_size)
            seed = seed + 1
            minibatches = random_mini_batches(X_train, Y_train,
                                              minibatch_size, seed)
            for minibatch in minibatches:
                (minibatch_X, minibatch_Y) = minibatch
                _, minibatch_cost = sess.run([optimizer, cost], feed_dict
                                             = {X: minibatch_X, Y: minibatch_Y})
                epoch_cost += minibatch_cost / num_minibatche
            # Print the cost every epoch
            if print_cost == True and epoch % 100 == 0:
                print ("Cost after epoch " ,epoch, np.mean(epoch_cost))
            if print_cost == True and epoch % 5 == 0:
                costs.append(epoch_cost)
        # plot the cost
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('iterations (per tens)')
        plt.title("Learning rate =" + str(learning_rate))
        plt.show()
        # save the parameters
        parameters = sess.run(parameters)
        print ("Parameters have been trained!")
        # Calculate the correct predictions
        correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))
        # Calculate accuracy on the test set
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        print ("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train}))
        print ("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test}))
        return parameters
So when I run this code, I get this error on the following line:
---->epoch_cost += minibatch_cost / num_minibatche
---->ValueError: operands could not be broadcast together with shapes (32,) (5,) (32,)
I used minibatch_size = 32 and the number of training examples is 1381.
But I am totally confused about why I am getting this error.
The code you've posted is missing a lot of parts, so it's difficult to debug. But based on just what we have, some things here that are supposed to be scalars are, in fact, arrays.
epoch_cost starts out as a scalar zero with epoch_cost = 0. Then you add some value to it, then try to print np.mean( epoch_cost ). Why do you take the mean of a scalar? Looks like the code was different earlier, and the migration to a scalar epoch_cost was not successful.
It is easy to imagine that minibatch_cost is returned as an array from TensorFlow - one cost value for each member of the batch. In that case you would need to apply np.mean() right there, like
epoch_cost += np.mean( minibatch_cost ) / num_minibatche
Maybe even num_minibatche somehow became a vector. It comes from
num_minibatche = int(m/ minibatch_size)
and minibatch_size is supposedly 32, so that's all right. But m comes from
(n_x, m) = X_train.shape
and we know nothing of X_train. Maybe m somehow became a vector, and in turn num_minibatche too. You will need to print the value for num_minibatche once calculated and make sure it's what it's supposed to be.
Hope this helps. If you post the whole code, I can help you more.
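One plausible reading of the shapes in the error message, assuming cost returns one value per example and the last minibatch holds the leftover examples: 1381 = 43 * 32 + 5, so the final batch would contain 5 examples and produce a cost vector of shape (5,), which cannot be added in place to an accumulator of shape (32,). The NumPy sketch below reproduces that situation with made-up cost values and shows the np.mean() variant suggested above.

```python
import numpy as np

m, minibatch_size = 1381, 32           # values from the question
num_minibatches = m // minibatch_size  # 43 full batches
last_batch_size = m % minibatch_size   # 5 examples left over

# Hypothetical per-example costs (assumption: cost is a vector, not a scalar)
full_batch_cost = np.ones(minibatch_size)
last_batch_cost = np.ones(last_batch_size)

# Accumulating vectors: epoch_cost silently becomes an array of shape (32,)
epoch_cost = 0
epoch_cost += full_batch_cost / num_minibatches
try:
    epoch_cost += last_batch_cost / num_minibatches  # shape (5,) into shape (32,)
    broadcast_error = None
except ValueError as err:
    broadcast_error = str(err)
print(broadcast_error)

# Reducing each batch to a scalar first avoids the problem
epoch_cost_scalar = 0.0
for batch_cost in [full_batch_cost] * num_minibatches + [last_batch_cost]:
    epoch_cost_scalar += np.mean(batch_cost) / num_minibatches
print(epoch_cost_scalar)
```

If this is indeed the cause, reducing inside the graph (for example wrapping the per-example loss in tf.reduce_mean within compute_cost) would be an alternative to calling np.mean() after sess.run.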
This is the code from the sentdex tutorials.
How is the data from the MNIST set transferred into the placeholder x?
Please help me; I am just a beginner with TensorFlow, so if it has something to do with the placeholder, please explain.
Thanks in advance!
"""
os.environ removes the warning
"""
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
"""
tensorflow starts below
"""
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/",one_hot=True)
# 10 classes , 0-9
"""
nodes for the hidden layers
"""
n_nodes_hl1 = 500
n_nodes_hl2 = 500
n_nodes_hl3 = 500
n_classes = 10 # 0-9
batch_size = 100
"""
placeholders
"""
x = tf.placeholder('float',[None,784]) # 784 is 28*28 ,i.e., the size of mnist images
y = tf.placeholder('float')
# y is the label of data
def neural_network_model(data):
    # biases are added so that some neurons get fired even when input_data is 0
    hidden_1_layer = {'weights':tf.Variable(tf.random_normal([784,n_nodes_hl1])),'biases':tf.Variable(tf.random_normal([n_nodes_hl1]))}
    hidden_2_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1,n_nodes_hl2])),'biases':tf.Variable(tf.random_normal([n_nodes_hl2]))}
    hidden_3_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2,n_nodes_hl3])),'biases':tf.Variable(tf.random_normal([n_nodes_hl3]))}
    output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl3,n_classes])),
                    'biases':tf.Variable(tf.random_normal([n_classes]))}
    # (input_data * weights) + biases
    l1 = tf.add(tf.matmul(data,hidden_1_layer['weights']) , hidden_1_layer['biases'])
    l1 = tf.nn.relu(l1) # activation func
    l2 = tf.add(tf.matmul(l1,hidden_2_layer['weights']) , hidden_2_layer['biases'])
    l2 = tf.nn.relu(l2) # activation func
    l3 = tf.add(tf.matmul(l2,hidden_3_layer['weights']) , hidden_3_layer['biases'])
    l3 = tf.nn.relu(l3) # activation func
    output = tf.matmul(l3,output_layer['weights']) + output_layer['biases']
    return output

# we now have modeled a neural network
def train_neural_network(x):
    prediction = neural_network_model(x)
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
    # softmax_cross_entropy_with_logits ==> for changing weights
    # we wanna minimize the difference
    # AdamOptimizer optionally has a learning_rate : 0.0001
    optimizer = tf.train.AdamOptimizer().minimize(cost)
    hm_epochs = 5 # cycles of feed forward + back
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch in range(hm_epochs):
            epoch_loss = 0
            for _ in range(int(mnist.train.num_examples/batch_size)):
                epoch_x,epoch_y = mnist.train.next_batch(batch_size)
                _,c = sess.run([optimizer, cost], feed_dict = {x: epoch_x, y: epoch_y})
                epoch_loss += c
            print('Epoch',epoch,'completed out of',hm_epochs,' loss:',epoch_loss)
        correct = tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
        accuracy = tf.reduce_mean(tf.cast(correct,'float')) # cast changes the data type of a tensor
        print('Accuracy: ',accuracy.eval({x:mnist.test.images,y:mnist.test.labels}))

if __name__ == "__main__":
    train_neural_network(x)
To see where the MNIST data is transferred into the tf.placeholder() tensors x and y, focus on these lines:
for _ in range(int(mnist.train.num_examples/batch_size)):
    epoch_x, epoch_y = mnist.train.next_batch(batch_size)
    _, c = sess.run([optimizer, cost], feed_dict = {x: epoch_x, y: epoch_y})
The arrays epoch_x and epoch_y are a pair of (somewhat confusingly named) NumPy arrays that contain a batch of batch_size images and labels respectively from the MNIST training data set. They will contain a different batch in each iteration of the for loop.
The feed_dict argument to sess.run() tells TensorFlow to substitute the value of epoch_x for placeholder x, and the value of epoch_y for placeholder y. Thus TensorFlow will use those values to run a step of the optimization algorithm (Adam, in this case).
Note that the MNIST data is also used on this line:
print('Accuracy: ', accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
...except here the program is using the entire test data set to evaluate the accuracy of the model.
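To picture what arrives in the placeholders on each step, here is a small NumPy sketch of a batching loop. iterate_batches is a hypothetical, simplified stand-in for mnist.train.next_batch, and the zero-filled images and cyclic labels are made-up data with the same per-row shapes as MNIST (784 pixels, 10 one-hot classes).

```python
import numpy as np

def iterate_batches(images, labels, batch_size):
    """Yield consecutive (images, labels) slices of batch_size rows each.
    Hypothetical, simplified stand-in for mnist.train.next_batch."""
    for start in range(0, len(images) - batch_size + 1, batch_size):
        yield images[start:start + batch_size], labels[start:start + batch_size]

# Made-up stand-in data: 1000 flattened 28x28 images with one-hot labels
images = np.zeros((1000, 784), dtype=np.float32)
labels = np.eye(10, dtype=np.float32)[np.arange(1000) % 10]

# Each pair below plays the role of (epoch_x, epoch_y) in the training loop
batch_shapes = [(bx.shape, by.shape) for bx, by in iterate_batches(images, labels, 100)]
print(len(batch_shapes), batch_shapes[0])
```

Each yielded pair corresponds to what feed_dict binds to x and y for one call to sess.run. Because x was declared with shape [None, 784], the first dimension can be a batch of 100 rows during training and the whole test set during evaluation.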