I am able to perform classification with this code. It outputs the probability for each output labels. But I need to convert this so that it can predict the values. That is, I want to add a regression layer at the end instead of softmax. How can I achieve this? Let's say for example I trained the model for label 1,2,3,4,5. But I want the model to predict the values beyond those 5 labels. Example, Given the input, the model may predict 1.3 or 2.5, etc. I want a continuous output rather than a discrete output.
Update
I am trying to achieve a suggested solution from this question
Here
Let's say I have a training data. I train the model for whole number temperatures like 1,2,3,4,5 degrees. Basically, Those output temperatures are the labels. How can I predict the values that lies between two temperatures like 2.5 degree. It is not possible to train for every values of temperature. How can I achieve this?
My model gives probability of each class predicted
Temp Probability
1 .01
2 .05
3 .56
4 .24
5 .14
I want my model to predict the temperature values like 1.2, 2.7, etc. instead of predicting the probability of each class.
input_height = 1 # 1-Dimensional convulotion
input_width = 90 #window
num_labels = 5 #output labels
num_channels = 8 #input columns
batch_size = 10
kernel_size = 60
depth = 60
num_hidden = 1000
learning_rate = 0.0001
training_epochs = 8
total_batches = train_x.shape[0] # batch_size
X = tf.placeholder(tf.float32, shape=[None,input_height,input_width,num_channels],name="input")
# X = tf.placeholder(tf.float32, shape=[None,input_width * num_channels], name="input")
# X_reshaped = tf.reshape(X,[-1,1,90,3])
Y = tf.placeholder(tf.float32, shape=[None,num_labels])
c = apply_depthwise_conv(X,kernel_size,num_channels,depth)
p = apply_max_pool(c,20,2)
c = apply_depthwise_conv(p,6,depth*num_channels,depth//10)
shape = c.get_shape().as_list()
c_flat = tf.reshape(c, [-1, shape[1] * shape[2] * shape[3]])
f_weights_l1 = weight_variable([shape[1] * shape[2] * depth * num_channels * (depth//10), num_hidden])
f_biases_l1 = bias_variable([num_hidden])
f = tf.nn.tanh(tf.add(tf.matmul(c_flat, f_weights_l1),f_biases_l1))
out_weights = weight_variable([num_hidden, num_labels])
out_biases = bias_variable([num_labels])
y_ = tf.nn.softmax(tf.matmul(f, out_weights) + out_biases,name="y_")
loss = -tf.reduce_sum(Y * tf.log(y_))
optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(loss)
correct_prediction = tf.equal(tf.argmax(y_,1), tf.argmax(Y,1)) #difference between correct output and expected output
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
cost_history = np.empty(shape=[1], dtype=float)
with tf.Session() as session:
tf.global_variables_initializer().run()
for epoch in range(training_epochs):
for b in range(total_batches):
offset = (b * batch_size) % (train_y.shape[0] - batch_size)
batch_x = train_x[offset:(offset + batch_size), :, :, :]
batch_y = train_y[offset:(offset + batch_size), :]
_, c = session.run([optimizer, loss], feed_dict={X: batch_x, Y: batch_y})
cost_history = np.append(cost_history, c)
print "Epoch: ", epoch, " Training Loss: ", c, " Training Accuracy: ",session.run(accuracy, feed_dict={X: train_x, Y: train_y})
print "Testing Accuracy:", session.run(accuracy, feed_dict={X: test_x, Y: test_y})
If you want to predict which class is detected, just do an arg_max on the output. The one with the highest probability is the detected class.
predict = tf.argmax(y_)
Related
I am building several simple networks to predict the bike rentals at 500 stations in the upcoming hour, given rentals at all stations in the previous 24 hours. I am working with two architectures, one with a graph convolution (which amounts to updating each station with a learned linear combination of other stations, at each hour) and a FNN layer to prediction, and a second with a graph convolution -> LSTM -> FNN to prediction.
Before I describe more, I'm getting poorer performance for my model which includes an LSTM unit, which is confusing me.
See these two images for a description of each architecture, for each architecture I also add hourly meta-data (weather, time, etc) as variation, they are in the images in red, and not relevant to my question. Image links at the bottom of the post.
[Architecture 1: GCNN + FNN][1]
[Architecture 2: GCNN + LSTM + FNN][2]
Confusingly, the test RMSE for the first model is 3.46, for the second model its 3.57. Could someone please explain to me why the second wouldn't be lower, as it seems to be running the exact same processes, except with an additional LSTM unit.
Here are relevant snippets of my code for the GCNN+FNN model:
def gcnn_ddgf(hidden_layer, node_num, feature_in, horizon, learning_rate, beta, batch_size, early_stop_th, training_epochs, X_training, Y_training, X_val, Y_val, X_test, Y_test, scaler, display_step):
n_output_vec = node_num * horizon # length of output vector at the final layer
early_stop_k = 0 # early stop patience
best_val = 10000
traing_error = 0
test_error = 0
pred_Y = []
tf.reset_default_graph()
batch_size = batch_size
early_stop_th = early_stop_th
training_epochs = training_epochs
# tf Graph input and output
X = tf.placeholder(tf.float32, [None, node_num, feature_in]) # X is the input signal
Y = tf.placeholder(tf.float32, [None, n_output_vec]) # y is the regression output
# define dictionaries to store layers weight & bias
weights_hidden = {}
weights_A = {}
biases = {}
vec_length = feature_in
weights_hidden['h1'] = tf.Variable(tf.random_normal([vec_length, hidden_layer], stddev=0.5))
biases['b1'] = tf.Variable(tf.random_normal([1, hidden_layer], stddev=0.5))
weights_A['A1'] = tf.Variable(tf.random_normal([node_num, node_num], stddev=0.5))
weights_hidden['out'] = tf.Variable(tf.random_normal([hidden_layer, horizon], stddev=0.5))
biases['bout'] = tf.Variable(tf.random_normal([1, horizon], stddev=0.5))
# Construct model
pred= gcn(X, weights_hidden, weights_A, biases, node_num, horizon) #see below
pred = scaler.inverse_transform(pred)
Y_original = scaler.inverse_transform(Y)
cost = tf.sqrt(tf.reduce_mean(tf.pow(pred - Y_original, 2)))
#optimizer = tf.train.RMSPropOptimizer(learning_rate, decay).minimize(cost)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, beta1=beta).minimize(cost)
# Initializing the variables
init = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
for epoch in range(training_epochs):
avg_cost_sq = 0.
num_train = X_training.shape[0]
total_batch = int(num_train/batch_size)
for i in range(total_batch):
_, c = sess.run([optimizer, cost], feed_dict={X: X_training[i*batch_size:(i+1)*batch_size,],
Y: Y_training[i*batch_size:(i+1)*batch_size,]})
avg_cost_sq += np.square(c) * batch_size #/ total_batch
# rest part of training dataset
if total_batch * batch_size != num_train:
_, c = sess.run([optimizer, cost], feed_dict={X: X_training[total_batch*batch_size:num_train,],
Y: Y_training[total_batch*batch_size:num_train,]})
avg_cost_sq += np.square(c) * (num_train - total_batch*batch_size)
avg_cost = np.sqrt(avg_cost_sq / num_train)
# validation
c_val, = sess.run([cost], feed_dict={X: X_val, Y: Y_val})
if c_val < best_val:
# testing
c_tes, preds, Y_true = sess.run([cost, pred, Y_original], feed_dict={X: X_test,Y: Y_test})
best_val = c_val
test_error = c_tes
traing_error = avg_cost
pred_Y = preds
early_stop_k = 0 # reset to 0
# update early stopping patience
if c_val >= best_val:
early_stop_k += 1
# threshold
if early_stop_k == early_stop_th:
break
if epoch % display_step == 0:
print ("Epoch:", '%04d' % (epoch+1), "Training RMSE: ","{:.9f}".format(avg_cost))
print("Validation RMSE: ", c_val)
print("Lowest test RMSE: ", test_error)
print("epoch is ", epoch)
print("training RMSE is ", traing_error)
print("Optimization Finished! the lowest validation RMSE is ", best_val)
print("The test RMSE is ", test_error)
return best_val, pred_Y ,Y_true,test_error
# code that creates the model
def gcn(signal_in, weights_hidden, weights_A, biases, node_num, horizon):
signal_in = tf.transpose(signal_in, [1, 0, 2]) # node_num, batch, feature_in
feature_len = signal_in.shape[2] # feature vector length at the node of the input graph
signal_in = tf.reshape(signal_in, [node_num, -1]) # node_num, batch*feature_in
Adj = 0.5*(weights_A['A1'] + tf.transpose(weights_A['A1']))
Adj = normalize_adj(Adj)
Z = tf.matmul(Adj, signal_in) # node_num, batch*feature_in
Z = tf.reshape(Z, [-1, int(feature_len)]) # node_num * batch, feature_in
signal_output = tf.add(tf.matmul(Z, weights_hidden['h1']), biases['b1'])
signal_output = tf.nn.relu(signal_output) # node_num * batch, hidden_vec
final_output = tf.add(tf.matmul(signal_output, weights_hidden['out']), biases['bout']) # node_num * batch, horizon
# final_output = tf.nn.relu(final_output)
final_output = tf.reshape(final_output, [node_num, -1, horizon]) # node_num, batch, horizon
final_output = tf.transpose(final_output, [1, 0, 2]) # batch, node_num, horizon
final_output = tf.reshape(final_output, [-1, node_num*horizon]) # batch, node_num*horizon
return final_output
And the code for the GCNN+LSTM+FNN model:
def gcnn_ddgf_lstm(node_num, feature_in, learning_rate, beta, batch_size, early_stop_th, training_epochs, X_training,
Y_training, X_val, Y_val, X_test, Y_test, scaler, lstm_layer):
n_output_vec = node_num # length of output vector at the final layer
early_stop_k = 0 # early stop patience
display_step = 1 # frequency of printing results
best_val = 10000
traing_error = 0
test_error = 0
predic_res = []
tf.reset_default_graph()
batch_size = batch_size
early_stop_th = early_stop_th
training_epochs = training_epochs
# tf Graph input and output
X = tf.placeholder(tf.float32, [None, node_num, feature_in]) # X is the input signal
Y = tf.placeholder(tf.float32, [None, n_output_vec]) # y is the regression output
lstm_cell = tf.nn.rnn_cell.LSTMCell(lstm_layer, state_is_tuple=True)
# define dictionaries to store layers weight & bias
weights_hidden = {}
weights_A = {}
biases = {}
weights_A['A1'] = tf.Variable(tf.random_normal([node_num, node_num], stddev=0.5))
weights_hidden['h1'] = tf.Variable(tf.random_normal([lstm_layer, node_num], stddev=0.5))
biases['h1'] = tf.Variable(tf.random_normal([1, node_num], stddev=0.5))
weights_hidden['out'] = tf.Variable(tf.random_normal([node_num, node_num], stddev=0.5))
biases['bout'] = tf.Variable(tf.random_normal([1, node_num], stddev=0.5))
# Construct model
pred= gcn_lstm(X, weights_hidden, weights_A, biases, node_num, lstm_cell)
# pred = scaler.inverse_transform(pred)
# Y_original = scaler.inverse_transform(Y)
cost = tf.sqrt(tf.reduce_mean(tf.pow(pred - Y, 2)))
#optimizer = tf.train.RMSPropOptimizer(learning_rate, decay).minimize(cost)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, beta1=beta).minimize(cost)
# Initializing the variables
init = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
for epoch in range(training_epochs):
avg_cost_sq = 0.
num_train = X_training.shape[0]
total_batch = int(num_train/batch_size)
for i in range(total_batch):
_, c = sess.run([optimizer, cost], feed_dict={X: X_training[i*batch_size:(i+1)*batch_size,],
Y: Y_training[i*batch_size:(i+1)*batch_size,]})
avg_cost_sq += np.square(c) * batch_size #/ total_batch
# rest part of training dataset
if total_batch * batch_size != num_train:
_, c = sess.run([optimizer, cost], feed_dict={X: X_training[total_batch*batch_size:num_train,],
Y: Y_training[total_batch*batch_size:num_train,]})
avg_cost_sq += np.square(c) * (num_train - total_batch*batch_size)
avg_cost = np.sqrt(avg_cost_sq / num_train)
# validation
c_val, = sess.run([cost], feed_dict={X: X_val, Y: Y_val})
if c_val < best_val:
c_tes, preds = sess.run([cost, pred], feed_dict={X: X_test,Y: Y_test})
best_val = c_val
# save model
#saver.save(sess, './bikesharing_gcnn_ddgf')
test_error = c_tes
traing_error = avg_cost
early_stop_k = 0 # reset to 0
# update early stopping patience
if c_val >= best_val:
early_stop_k += 1
# threshold
if early_stop_k == early_stop_th:
pred_Y = scaler.inverse_transform(preds)
Y_true = scaler.inverse_transform(Y_test)
test_err = tf.sqrt(tf.reduce_mean(tf.pow(pred_Y - Y_true, 2)))
break
if epoch % display_step == 0:
print ("Epoch:", '%04d' % (epoch+1), "Training RMSE: ","{:.9f}".format(avg_cost))
print("Validation RMSE: ", c_val)
print("Lowest test RMSE: ", test_error)
print("epoch is ", epoch)
print("training RMSE is ", traing_error)
print("Optimization Finished! the lowest validation RMSE is ", best_val)
print("The scaled test RMSE is ", test_error)
return pred_Y, Y_true
def gcn_lstm(signal_in, weights_hidden, weights_A, biases, node_num, lstm_cell):
signal_in = tf.transpose(signal_in, [1, 0, 2]) # node_num, batch, feature_in
feature_len = signal_in.shape[2] # feature vector length at the node of the input graph
signal_in = tf.reshape(signal_in, [node_num, -1]) # node_num, batch*feature_in
Adj = 0.5*(weights_A['A1'] + tf.transpose(weights_A['A1']))
Adj = normalize_adj(Adj)
Z = tf.matmul(Adj, signal_in) # node_num, batch*feature_in
Z = tf.reshape(Z, [node_num, -1, int(feature_len)]) # node_num, batch, feature_in
Z = tf.transpose(Z,[1,2,0]) # batch, feature_in, node_num
# init_state = cell.zero_state(batch_size, tf.float32)
_, Z = tf.nn.dynamic_rnn(lstm_cell, Z, dtype = tf.float32) # init_state?
dense_output = tf.add(tf.matmul(Z[1], weights_hidden['h1']), biases['h1'])
dense_output = tf.nn.relu(dense_output)
final_output = tf.add(tf.matmul(dense_output, weights_hidden['out']), biases['bout']) # batch, node_num*horizon
return final_output
In particular, should I be weary that _, Z = tf.nn.dynamic_rnn(lstm_cell, Z, dtype = tf.float32) causes my variables defined elsewhere not to train?
Thanks a lot for any help :)
[1]: https://i.stack.imgur.com/MAO2t.png
[2]: https://i.stack.imgur.com/UDjHw.png
I resolved this.
I have three years of bike use data to make the prediction, and was using the ~last three months as my validation/test set. The last few months were winter with lower bike use. I got expected results (GCNN+LSTM outperforms GCNN, though not by much) when I shuffled my training data prior to allocating to sets (with sequences preserved for LSTM)
I have a training data, with 1000 rows. I am using Tensorflow for training this data. Also trying to divide this into mini-batches of size 32. While Training the data, i am getting the error as mentioned below
InvalidArgumentError: Incompatible shapes: [1000] vs. [32]
[[{{node logistic_loss_1/mul}}]]
On the contrary, if i don't divide my training data into minibatches, or use a single minibatch of size 1000, the code works fine.
I have defined weights as tf.Variables and running the tensorflow session. See the code below
def sigmoid_cost(z,Y):
print("Entered Cost")
z = tf.squeeze(z)
Y = tf.cast(Y_train,tf.float64)
logits = tf.transpose(z)
labels = (Y)
print(logits.shape)
print(labels.shape)
return tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=labels,logits=logits))
def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,
num_epochs = 1500, minibatch_size = 32, print_cost = True):
hidden_layer = 4
m,n = X_train.shape
n_y = Y_train.shape[0]
X = tf.placeholder(tf.float64,shape=(None,n), name="X")
Y = tf.placeholder(tf.float64,shape=(None),name="Y")
parameters = init_params(n)
z4, parameters = fwd_model(X,parameters)
cost = sigmoid_cost(z4,Y)
num_minibatch = m/minibatch_size
print("Getting Minibatches")
num_minibatch = tf.cast(num_minibatch,tf.int32)
optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(cost)
print("Gradient Defination Done")
init = tf.global_variables_initializer()
init_op = tf.initialize_all_variables()
with tf.Session() as sess:
sess.run(init)
sess.run(init_op)
for epoch in range(0,num_epochs):
minibatches = []
minibatches = minibatch(X_train,Y_train,minibatch_size)
minibatch_cost = 0
for i in range (0,len(minibatches)):
(X_m,Y_m) = minibatches[i]
Y_m = np.squeeze(Y_m)
print("Minibatch %d X shape Y Shape ",i, X_m.shape,Y_m.shape)
_ , minibatch_cost = sess.run([optimizer, cost], feed_dict={X: X_m, Y: Y_m})
print("Mini Batch Cost is ",minibatch_cost)
epoch_cost = minibatch_cost/num_minibatch
if print_cost == True and epoch % 100 == 0:
print ("Cost after epoch %i: %f" % (epoch, epoch_cost))
print(epoch_cost)
For some reason, while running the cost function the size of either X or Y batch is being taken as 32, 100 or vice-versa. Any help would be appreciated.
I think you are getting above error because of Y = tf.cast(Y_train, tf.float64) line inside sigmoid_cost function. Here, Y_train has 1000 rows, but loss function is expecting 32(which is your batch size).
It should be Y = tf.cast(Y, tf.float64). Infact, there is no need to cast data type here as Y is already of type tf.float64. Check below line:
Y = tf.placeholder(tf.float64,shape=(None),name="Y")
That's why, when you were using a single minibatch of size 1000(full Y_train data), your code was working fine.
I'm trying to create a RNN to guess what notes are being played on a piano, given a sound file of piano notes (WAV format). I'm currently cutting the WAV clips into ten-second chunks (2D), padding shorter sections to 10 seconds with zeroes so the input is all regular. However, when I pass in the clips to the RNN, it gives an output of one less dimension (1D) (when taking the last state - should I be taking the state series?).
I've created a simpler RNN to analyze single notes files (2D) and produce one output (1D), which has been successful. However, when trying to apply this same technique to full clips with multiple notes and notes starting/stopping it seems to break down, as I can't seem to change the output shape.
def weight_variable(shape):
initer = tf.truncated_normal_initializer(stddev=0.01)
return tf.get_variable('W', dtype=tf.float32, shape=shape, initializer=initer)
def bias_variable(shape):
initial = tf.constant(0., shape=shape, dtype=tf.float32)
return tf.get_variable('b', dtype=tf.float32,initializer=initial)
def RNN(x, weights, biases, timesteps, num_hidden):
x = tf.unstack(x, timesteps, 1)
# Define a rnn cell with tensorflow
lstm_cell = rnn.LSTMCell(num_hidden)
states_series, current_state = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)
return tf.matmul(current_state[1], weights) + biases
# return [tf.matmul(temp,weights) + biases for temp in states_series]
# does this even make sense
# x is for data, y is for targets, shapes are [index, time, frequency], [index, time, output note (s)] respectively
x_train, x_valid, y_train, y_valid = load_data() # removed test
print("Size of:")
print("- Training-set:\t\t{}".format(y_train.shape[0]))
print("- Validation-set:\t{}".format(y_valid.shape[0]))
# print("- Test-set\t{}".format(len(y_test)))
learning_rate = 0.001 # The optimization initial learning rate
epochs = 1000 # Total number of training epochs
batch_size = 100 # Training batch size
display_freq = 100 # Frequency of displaying the training results
threshold = 0.7 # Threshold for determining a "note"
num_hidden_units = 15 # Number of hidden units of the RNN
# Placeholders for inputs (x) and outputs(y)
x = tf.placeholder(tf.float32, shape=(None, stepCount, num_input))
y = tf.placeholder(tf.float32, shape=(None, stepCount, n_classes))
# create weight matrix initialized randomly from N~(0, 0.01)
W = weight_variable(shape=[num_hidden_units, n_classes])
# create bias vector initialized as zero
b = bias_variable(shape=[n_classes])
output_logits = RNN(x, W, b, stepCount, num_hidden_units)
y_pred = tf.nn.softmax(output_logits)
# Define the loss function, optimizer, and accuracy, etc.
# (code removed, irrelevant)
# Creating the op for initializing all variables
init = tf.global_variables_initializer()
sess = tf.InteractiveSession()
sess.run(init)
global_step = 0
# Number of training iterations in each epoch
num_tr_iter = int(y_train.shape[0] / batch_size)
for epoch in range(epochs):
print('Training epoch: {}'.format(epoch + 1))
x_train, y_train = randomize(x_train, y_train)
for iteration in range(num_tr_iter):
global_step += 1
start = iteration * batch_size
end = (iteration + 1) * batch_size
x_batch, y_batch = get_next_batch(x_train, y_train, start, end)
# Run optimization op (backprop)
feed_dict_batch = {x: x_batch, y: y_batch}
sess.run(optimizer, feed_dict=feed_dict_batch)
if iteration % display_freq == 0:
# Calculate and display the batch loss and accuracy
loss_batch, acc_batch = sess.run([loss, accuracy],
feed_dict=feed_dict_batch)
print("iter {0:3d}:\t Loss={1:.2f},\tTraining Accuracy={2:.01%}".
format(iteration, loss_batch, acc_batch))
testLoss.append(loss_batch)
testAcc.append(acc_batch)
# Run validation after every epoch
feed_dict_valid = {x: x_valid[:1000].reshape((-1, stepCount, num_input)), y: y_valid[:1000]}
loss_valid, acc_valid = sess.run([loss, accuracy], feed_dict=feed_dict_valid)
print('---------------------------------------------------------')
print("Epoch: {0}, validation loss: {1:.2f}, validation accuracy: {2:.01%}".
format(epoch + 1, loss_valid, acc_valid))
print('---------------------------------------------------------')
validLoss.append(loss_valid)
validAcc.append(acc_batch)
Currently, this is outputting a 1D array of predictions, which really does not make sense in my scenario, but I'm not sure how to change it (it should be outputting predictions for each timestep - i.e. predictions of what notes are playing at each moment in time).
I have a dataset with 5 columns, I am feeding in first 3 columns as my Inputs and the other 2 columns as my outputs. I have successfully executed the program but i am not sure how to test the model by giving my own values as input and getting a predicted output from the model.
Can anyone please help me, How can I actually test the model with my own value after training is done ? I am using Tensorflow in Python..I am able to display accuracy of testing,but How do I actually predict with value if I pass some random input(here,I need to pass 3 input values to get 2 output values)
Here,is my code:
# Implementation of a simple MLP network with one hidden layer. Tested on the iris data set.
# Requires: numpy, sklearn>=0.18.1, tensorflow>=1.0
# NOTE: In order to make the code simple, we rewrite x * W_1 + b_1 = x' * W_1'
# where x' = [x | 1] and W_1' is the matrix W_1 appended with a new row with elements b_1's.
# Similarly, for h * W_2 + b_2
import tensorflow as tf
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
import pandas as pd
RANDOM_SEED = 1000
tf.set_random_seed(RANDOM_SEED)
def init_weights(shape):
""" Weight initialization """
weights = tf.random_normal(shape, stddev=0.1)
return tf.Variable(weights)
def forwardprop(X, w_1, w_2):
"""
Forward-propagation.
IMPORTANT: yhat is not softmax since TensorFlow's softmax_cross_entropy_with_logits() does that internally.
"""
h = tf.nn.sigmoid(tf.matmul(X, w_1)) # The \sigma function
yhat = tf.matmul(h, w_2) # The \varphi function
return yhat
def get_iris_data():
""" Read the iris data set and split them into training and test sets """
df = pd.read_csv("H:\MiniThessis\Sample.csv")
train_X = np.array(df[df.columns[0:3]])
train_Y = np.array(df[df.columns[3:]])
print(train_X)
# Convert into one-hot vectors
#num_labels = len(np.unique(train_Y))
#all_Y = np.eye(num_labels)[train_Y] # One liner trick!
#print()
return train_test_split(train_X, train_Y, test_size=0.33, random_state=RANDOM_SEED)
def main():
train_X, test_X, train_y, test_y = get_iris_data()
# Layer's sizes
x_size = train_X.shape[1] # Number of input nodes: 4 features and 1 bias
h_size = 256 # Number of hidden nodes
y_size = train_y.shape[1] # Number of outcomes (3 iris flowers)
# Symbols
X = tf.placeholder("float", shape=[None, x_size])
y = tf.placeholder("float", shape=[None, y_size])
# Weight initializations
w_1 = init_weights((x_size, h_size))
w_2 = init_weights((h_size, y_size))
# Forward propagation
yhat = forwardprop(X, w_1, w_2)
predict = tf.argmax(yhat, axis=1)
# Backward propagation
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=yhat))
updates = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
# Run SGD
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
for epoch in range(3):
# Train with each example
for i in range(len(train_X)):
sess.run(updates, feed_dict={X: train_X[i: i + 1], y: train_y[i: i + 1]})
train_accuracy = np.mean(np.argmax(train_y, axis=1) == sess.run(predict, feed_dict={X: train_X, y: train_y}))
test_accuracy = np.mean(np.argmax(test_y, axis=1) ==sess.run(predict, feed_dict={X: test_X, y: test_y}))
print("Epoch = %d, train accuracy = %.2f%%, test accuracy = %.2f%%"
% (epoch + 1, 100. * train_accuracy, 100. * test_accuracy))
correct_Prediction = tf.equal((tf.arg_max(predict,1)),(tf.arg_max(y,1)))
best = sess.run([predict], feed_dict={X: np.array([[20.14, 46.93, 1014.66]])})
#print(correct_Prediction)
print(best)
sess.close()
if __name__ == '__main__':
main()
I have a dataset with 5 columns, I am feeding in first 3 columns as my Inputs and the other 2 columns as my outputs.
I have successfully executed the program but i am not sure how to test the model by giving my own values as input and getting a predicted output from the model.
Can anyone please help me, How can I actually test the model with my own value after training is done ?
I am using Tensorflow in Python. I am able to display accuracy of testing,but How do I actually predict with value if I pass some random input (here,I need to pass 3 input values to get 2 output values)
Here is my code:
# Implementation of a simple MLP network with one hidden layer. Tested on the iris data set.
# Requires: numpy, sklearn>=0.18.1, tensorflow>=1.0
# NOTE: In order to make the code simple, we rewrite x * W_1 + b_1 = x' * W_1'
# where x' = [x | 1] and W_1' is the matrix W_1 appended with a new row with elements b_1's.
# Similarly, for h * W_2 + b_2
import tensorflow as tf
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
import pandas as pd
RANDOM_SEED = 1000
tf.set_random_seed(RANDOM_SEED)
def init_weights(shape):
""" Weight initialization """
weights = tf.random_normal(shape, stddev=0.1)
return tf.Variable(weights)
def forwardprop(X, w_1, w_2):
"""
Forward-propagation.
IMPORTANT: yhat is not softmax since TensorFlow's softmax_cross_entropy_with_logits() does that internally.
"""
h = tf.nn.sigmoid(tf.matmul(X, w_1)) # The \sigma function
yhat = tf.matmul(h, w_2) # The \varphi function
return yhat
def get_iris_data():
""" Read the iris data set and split them into training and test sets """
df = pd.read_csv("H:\MiniThessis\Sample.csv")
train_X = np.array(df[df.columns[0:3]])
train_Y = np.array(df[df.columns[3:]])
print(train_X)
# Convert into one-hot vectors
#num_labels = len(np.unique(train_Y))
#all_Y = np.eye(num_labels)[train_Y] # One liner trick!
#print()
return train_test_split(train_X, train_Y, test_size=0.33, random_state=RANDOM_SEED)
def main():
train_X, test_X, train_y, test_y = get_iris_data()
# Layer's sizes
x_size = train_X.shape[1] # Number of input nodes: 4 features and 1 bias
h_size = 256 # Number of hidden nodes
y_size = train_y.shape[1] # Number of outcomes (3 iris flowers)
# Symbols
X = tf.placeholder("float", shape=[None, x_size])
y = tf.placeholder("float", shape=[None, y_size])
# Weight initializations
w_1 = init_weights((x_size, h_size))
w_2 = init_weights((h_size, y_size))
# Forward propagation
yhat = forwardprop(X, w_1, w_2)
predict = tf.argmax(yhat, axis=1)
# Backward propagation
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=yhat))
updates = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
# Run SGD
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
for epoch in range(3):
# Train with each example
for i in range(len(train_X)):
sess.run(updates, feed_dict={X: train_X[i: i + 1], y: train_y[i: i + 1]})
train_accuracy = np.mean(np.argmax(train_y, axis=1) == sess.run(predict, feed_dict={X: train_X, y: train_y}))
test_accuracy = np.mean(np.argmax(test_y, axis=1) ==sess.run(predict, feed_dict={X: test_X, y: test_y}))
print("Epoch = %d, train accuracy = %.2f%%, test accuracy = %.2f%%"
% (epoch + 1, 100. * train_accuracy, 100. * test_accuracy))
correct_Prediction = tf.equal((tf.arg_max(predict,1)),(tf.arg_max(y,1)))
best = sess.run([predict], feed_dict={X: np.array([[20.14, 46.93, 1014.66]])})
#print(correct_Prediction)
print(best)
sess.close()
if __name__ == '__main__':
main()
Note, this answer takes some assumptions on the input, which you did not provide in your original question.
At this point I think your homework will require some more work to be finished. The 4th and 5th column are probably the petal width (a real value) and the Iris type (encoded as a category). This means that as a result you will probably need a real valued output (petal width) and a categorical prediction. This means that your softmax_cross_entry_with_logits will probably not play nicely with the petal width. Also you are applying an argmax as part of the prediction and this willl return the index with the highest value (in this case the petal width or the catergorically encoded value) which also does not make sense. So as a debugging aid why not start with:
print(sess.run([yhat], feed_dict={X: np.array([[20.14, 46.93, 1014.66]])})
This will print out yhat, (based upon the inputs [20.14, 46.93, 1014.66] (that's a massive flower, completely unlike anything inside the dataset) and will contain 2 outputs.