I want to perform a neural network regression on a data set. For testing purposes I have sampled it down to 10,000 rows. The input is 3 columns, the output is 1 column. I use the code below (I've replaced the variable names).
import pandas as pd
import numpy as np
import os
from sklearn.neural_network import MLPRegressor
"""
Prepare
"""
train = os.path.join(r'C:\Documents and Settings', 'input.csv')
df = pd.read_csv(train)
df = df[['A', 'B', 'C','D']]
df = df.dropna().sample(n=10000)
y = df['D'].as_matrix().reshape(10000,1)
x = df[['A', 'B','C']].as_matrix().reshape(10000,3)
print x
print y
print "Length before regression, x: %s, y: %s" % (x.shape, y.shape)
"""
Regression
"""
mlp = MLPRegressor(hidden_layer_sizes=(5, ), activation='relu', verbose=True, learning_rate_init=1, learning_rate='adaptive', max_iter=500,)
mlp.fit(x,y)
mlp.score(x,y)
print mlp.coefs_
print mlp.n_layers_
print mlp.n_outputs_
print mlp.out_activation_
print "res: ",res
res = mlp.predict(x)
r = np.subtract(df['D'].as_matrix(), res)
Running this code gives the following output:
[[ 162. 9. 475.5 ]
[ 105. 6.39 232.5 ]
[ 141. 7.44 373.5 ]
...,
[ 120. 8.41 450.5 ]
[ 120. 8.77 464. ]
[ 160. 8.77 483. ]]
[[ 72. ]
[ 73. ]
[ 74.5]
...,
[ 53. ]
[ 52. ]
[ 73. ]]
Length before regression, x: (10000, 3), y: (10000, 1)
Iteration 1, loss = 43928.72815906
Iteration 2, loss = 3434.26257670
Iteration 3, loss = 2393.24701752
Iteration 4, loss = 1662.31634550
Iteration 5, loss = 1225.37443598
Iteration 6, loss = 997.21761203
Iteration 7, loss = 891.10992049
Iteration 8, loss = 847.20461842
Iteration 9, loss = 830.60945144
Iteration 10, loss = 825.10945455
Iteration 11, loss = 823.39941482
Iteration 12, loss = 822.96788084
Iteration 13, loss = 822.85930250
Iteration 14, loss = 822.83848702
Iteration 15, loss = 822.84245376
Iteration 16, loss = 822.84871312
Iteration 17, loss = 822.83965835
Training loss did not improve more than tol=0.000100 for two consecutive epochs. Stopping.
[array([[-5.33, -5.23, -5.15, -4.86, -5.68],
[-5.28, -5.86, -5.83, -5.98, -6.2 ],
[-5.32, -5.79, -5.02, -4.71, -5.87]]), array([[-5.69],
[-5.06],
[ 4.35],
[ 4.6 ],
[-5.66]])]
3
1
identity
res: [ 95.53 95.53 95.53 ..., 95.53 95.53 95.53]
The resulting res variable is then constant.
I've played around with the prediction and found that input values below 0.01 give a slight change in the result. I also find that out_activation_ is always identity, even though I've set the activation function to relu.
I'm at a loss as to what might cause this behavior. Why does it seem that x needs to be different (normalized?) for fit() than for predict()?
Note: as was commented below, there is no cross-validation in this example. I am aware of that.
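For clarity, this is the kind of scaling I mean by "normalized" (a minimal sketch using scikit-learn's StandardScaler; x and y are the arrays built above, and the MLP hyperparameters are only placeholders):

from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

# Scale the features to zero mean and unit variance before fitting
scaler = StandardScaler()
x_scaled = scaler.fit_transform(x)

mlp = MLPRegressor(hidden_layer_sizes=(5,), activation='relu', max_iter=500)
mlp.fit(x_scaled, y.ravel())

# Apply the same (already fitted) scaler before predicting
res = mlp.predict(scaler.transform(x))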
Related
I am working through the TensorFlow examples from Aurélien Géron's "Hands-On Machine Learning" book. However, I am unable to replicate the simple linear regression example in this supporting notebook. Why doesn't TensorFlow match the NumPy/Scikit-Learn result?
As far as I can tell, there is no optimization (we are using the normal equation, so it's just matrix computations), and the answers seem too different to be precision errors.
import numpy as np
import tensorflow as tf
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]
X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)
with tf.Session() as sess:
    theta_value = theta.eval()
theta_value
Answer:
array([[ -3.74651413e+01],
[ 4.35734153e-01],
[ 9.33829229e-03],
[ -1.06622010e-01],
[ 6.44106984e-01],
[ -4.25131839e-06],
[ -3.77322501e-03],
[ -4.26648885e-01],
[ -4.40514028e-01]], dtype=float32)
###### Compare with pure NumPy
X = housing_data_plus_bias
y = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
print(theta_numpy)
Answer:
[[ -3.69419202e+01]
[ 4.36693293e-01]
[ 9.43577803e-03]
[ -1.07322041e-01]
[ 6.45065694e-01]
[ -3.97638942e-06]
[ -3.78654265e-03]
[ -4.21314378e-01]
[ -4.34513755e-01]]
###### Compare with Scikit-Learn
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(housing.data, housing.target.reshape(-1, 1))
print(np.r_[lin_reg.intercept_.reshape(-1, 1), lin_reg.coef_.T])
Answer:
[[ -3.69419202e+01]
[ 4.36693293e-01]
[ 9.43577803e-03]
[ -1.07322041e-01]
[ 6.45065694e-01]
[ -3.97638942e-06]
[ -3.78654265e-03]
[ -4.21314378e-01]
[ -4.34513755e-01]]
Update: My question sounded similar to this one, but following the recommendation did not fix the problem.
I just compared the results from TensorFlow and NumPy. Since you used dtype=tf.float32 for X and y, I will use np.float32 for the NumPy example as follows:
X_numpy = housing_data_plus_bias.astype(np.float32)
y_numpy = housing.target.reshape(-1, 1).astype(np.float32)
Now let's compare the results of tf.matmul(XT, X) (TensorFlow) and X_numpy.T.dot(X_numpy) (NumPy):
with tf.Session() as sess:
    XTX_value = tf.matmul(XT, X).eval()
XTX_numpy = X_numpy.T.dot(X_numpy)
np.allclose(XTX_value, XTX_numpy, rtol=1e-06) # True
np.allclose(XTX_value, XTX_numpy, rtol=1e-07) # False
So this is a floating-point precision problem. If you change the precision to tf.float64 and np.float64, you will get the same result for theta.
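As a quick sketch of that fix (the same normal-equation computation as above, only with the dtypes changed to 64-bit):

X = tf.constant(housing_data_plus_bias, dtype=tf.float64, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float64, name="y")
XT = tf.transpose(X)
# (X^T X)^-1 X^T y, now computed in double precision
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

with tf.Session() as sess:
    theta_value = theta.eval()
# theta_value should now match the NumPy/Scikit-Learn result computed in float64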
I am working on a regression problem. One of the performance metrics for this problem is "sign accuracy", meaning I want to see whether the predicted value has the same sign as the true value. I know MSE somewhat reflects the closeness between the predicted value and the true value, but I would like to see the sign accuracy during validation.
To be more specific, after training I use the expression below to check the accuracy. What I want from a custom metric is to compute this same expression during validation.
(np.multiply(predict_label,test_label)>0).sum()/float(predict_label.shape[0])
You can implement it in a similar way to accuracy:
from keras import backend as K

def sign_accuracy(y_true, y_pred):
    return K.mean(K.greater(y_true * y_pred, 0.), axis=-1)
To test it:
import numpy as np

y_true = np.random.rand(5, 1) - 0.5
y_pred = np.random.rand(5, 1) - 0.5
acc = K.eval(sign_accuracy(K.variable(y_true), K.variable(y_pred)))
print(y_true)
[[ 0.20410185]
[ 0.12085985]
[ 0.39697642]
[-0.28178138]
[-0.37796012]]
print(y_pred)
[[-0.38281826]
[ 0.14268927]
[ 0.19218624]
[ 0.21394845]
[ 0.04044269]]
print(acc)
[ 0. 1. 1. 0. 0.]
The mean over axis 0 is taken automatically by Keras when you call fit() or evaluate(), so you don't need to sum acc and divide it by y_pred.shape[0].
This metric can also be applied to multidimensional variables:
y_true = np.random.rand(5, 3) - 0.5
y_pred = np.random.rand(5, 3) - 0.5
acc = K.eval(sign_accuracy(K.variable(y_true), K.variable(y_pred)))
print(y_true)
[[ 0.02745352 -0.27927986 -0.47882833]
[-0.40950793 -0.16218984 0.19184008]
[ 0.25002487 -0.08455175 -0.03606459]
[ 0.09315503 -0.19825522 0.19801222]
[-0.32129431 -0.02256616 0.47799333]]
print(y_pred)
[[-0.06733171 0.18156806 0.28396574]
[ 0.04054056 -0.45898607 -0.10661648]
[-0.05162396 -0.34005141 -0.25910923]
[-0.26283177 0.01532359 0.33764032]
[ 0.2754057 0.26896232 0.23089488]]
print(acc)
[ 0. 0.33333334 0.66666669 0.33333334 0.33333334]
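To get this during validation, you would pass the function to model.compile; a rough sketch (the model, loss, optimizer and data names here are only placeholders):

model.compile(optimizer='adam', loss='mse', metrics=[sign_accuracy])
# Keras then reports sign_accuracy and val_sign_accuracy for each epoch
model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=10)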
I was testing out the softmax function from TensorFlow, but the answers I got don't appear to be correct.
So in the code below, kh is a [5,4] matrix and softmaxkh should be the softmax matrix of kh. However, even without doing the calculations, you can tell that the maximum numbers in a particular column or row of kh do not necessarily correspond to the maximum numbers in softmaxkh.
For example, '65' in the middle row of the last column is the highest number in both its column and its row; however, in softmaxkh it is not the highest number in either its row or its column.
import tensorflow as tf
kh = tf.random_uniform(
    shape=[5, 4],
    maxval=67,
    dtype=tf.int32,
    seed=None,
    name=None
)
sess = tf.InteractiveSession()
kh = tf.cast(kh, tf.float32)
softmaxkh = tf.nn.softmax(kh)
print(sess.run(kh))
Which returns
[[ 55. 49. 48. 30.]
[ 21. 39. 20. 11.]
[ 40. 33. 58. 65.]
[ 55. 19. 12. 24.]
[ 17. 8. 14. 0.]]
and
print(sess.run(softmaxkh))
returns
[[ 1.42468502e-21 9.99663830e-01 8.31249167e-07 3.35349847e-04]
[ 3.53262839e-24 1.56288218e-18 1.00000000e+00 3.13913289e-17]
[ 6.10305051e-06 6.69280719e-03 9.93300676e-01 3.03852971e-07]
[ 2.86251861e-20 2.31952296e-16 8.75651089e-27 1.00000000e+00]
[ 5.74948687e-19 2.61026280e-23 9.99993801e-01 6.14417422e-06]]
That is because a random generator such as random_uniform draws different numbers every time the graph is run, so kh and softmaxkh are evaluated from different random draws.
You need to store the result in a Variable to reuse random generated values across different graph runs:
import tensorflow as tf
kh = tf.random_uniform(
    shape=[5, 4],
    maxval=67,
    dtype=tf.int32,
    seed=None,
    name=None
)
kh = tf.cast(kh, tf.float32)
kh = tf.Variable(kh)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
softmaxkh = tf.nn.softmax(kh)
# run graph
print(sess.run(kh))
# run graph again
print(sess.run(softmaxkh))
Alternatively, if those values are used only once but at multiple locations, you could run the graph fetching all the desired outputs at once.
import tensorflow as tf
kh = tf.random_uniform(
    shape=[5, 4],
    maxval=67,
    dtype=tf.int32,
    seed=None,
    name=None
)
kh = tf.cast(kh, tf.float32)
sess = tf.InteractiveSession()
softmaxkh = tf.nn.softmax(kh)
# produces consistent output values
print(sess.run([kh, softmaxkh]))
# also produces consistent values, but different from above
print(sess.run([kh, softmaxkh]))
My task is to predict the five most probable tags for a sentence. I now have unscaled logits from the output (dense) layer:
with tf.name_scope("output"):
    scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
    predictions = tf.nn.top_k(self.scores, 5)  # should be the k highest score

with tf.name_scope("accuracy"):
    labels = input_y  # its shape is (batch_size, num_classes)
    # calculate the top k accuracy
Now predictions look like [3,1,2,50,12] (3, 1, ... are the indexes of the highest scores), while the labels are in "multi-hot" form: [0,1,0,1,1,0...].
In Python, I can simply write
correct_preds = [input_y[i]==1 for i in predictions]
weighted = np.dot(correct_preds, [5,4,3,2,1]) # weighted by rank
recall = sum(correct_preds) / sum(input_y)
precision = sum(correct_preds) / len(correct_preds)
But in TensorFlow, what form should I use to accomplish this task?
Solution
I've coded up an example of how to do the calculations. All of the inputs in this example are coded as tf.constant but of course you can substitute your variables.
The main trick is the matrix multiplications. The first multiplies input_y, reshaped to 2-D, by a [1x5] ones matrix called to_top5. The second multiplies correct_preds by the weighted_matrix.
Code
import tensorflow as tf
input_y = tf.constant( [5,2,9,1] , dtype=tf.int32 )
predictions = tf.constant( [[9,3,5,2,1],[8,9,0,6,5],[1,9,3,4,5],[1,2,3,4,5]])
to_top5 = tf.constant( [[1,1,1,1,1]] , dtype=tf.int32 )
input_y_for_top5 = tf.matmul( tf.reshape(input_y,[-1,1]) , to_top5 )
correct_preds = tf.cast( tf.equal( input_y_for_top5 , predictions ) , dtype=tf.float16 )
weighted_matrix = tf.constant( [[5.],[4.],[3.],[2.],[1.]] , dtype=tf.float16 )
weighted = tf.matmul(correct_preds,weighted_matrix)
recall = tf.reduce_sum(correct_preds) / tf.cast( tf.reduce_sum(input_y) , tf.float16)
precision = tf.reduce_sum(correct_preds) / tf.constant(5.0,dtype=tf.float16)
## training
# Run tensorflow and print the result
with tf.Session() as sess:
print "\n\n=============\n\n"
print "\ninput_y_for_top5"
print sess.run(input_y_for_top5)
print "\ncorrect_preds"
print sess.run(correct_preds)
print "\nweighted"
print sess.run(weighted)
print "\nrecall"
print sess.run(recall)
print "\nprecision"
print sess.run(precision)
print "\n\n=============\n\n"
Output
=============
input_y_for_top5
[[5 5 5 5 5]
[2 2 2 2 2]
[9 9 9 9 9]
[1 1 1 1 1]]
correct_preds
[[ 0. 0. 1. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0.]
[ 1. 0. 0. 0. 0.]]
weighted
[[ 3.]
[ 0.]
[ 4.]
[ 5.]]
recall
0.17651
precision
0.6001
=============
Summary
The above example shows a batch size of 4.
The first sample has a y_label of 5, which means that the element with index 5 is the correct label for that sample. Furthermore, the prediction for the first sample is [9,3,5,2,1], which means that the prediction function thinks element 9 is the most likely, element 3 the next most likely, and so on.
Let's say we want an example with a batch size of 3; then use the following code:
input_y = tf.constant( [5,2,9] , dtype=tf.int32 )
predictions = tf.constant( [[9,3,5,2,1],[8,9,0,6,5],[1,9,3,4,5]])
If we substitute the above lines into the program, we can see that it indeed calculates everything correctly for a batch size of 3.
Inspired by @wontonimo's answer above, I implemented a method using matrix ops, tf.reshape and tf.gather. The label tensor is "multi-hot", e.g. [[0,1,0,1],[1,0,0,1]]. The prediction tensor is obtained from tf.nn.top_k and looks like [[3,1],[0,1]]. Here is the code:
top_k_pred = tf.nn.top_k(logits, 5)
tmp1 = tf.reshape(tf.range(batch_size) * num_classes, (-1, 1))
idx_incre = top_k_pred[1] + tf.concat([tmp1] * 5, 1)
correct_preds = tf.gather(tf.reshape(y_label, (-1,)), tf.reshape(idx_incre, (-1,)))
correct_preds = tf.reshape(correct_preds, (batch_size, 5))
weighted = correct_preds * [[5., 4., 3., 2., 1.]]
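For illustration, here is a self-contained sketch of the same idea with made-up shapes and values (the label and logit tensors below are only hypothetical examples):

import tensorflow as tf

batch_size, num_classes = 2, 6
# Hypothetical multi-hot labels and logits
y_label = tf.constant([[0., 1., 0., 1., 1., 0.],
                       [1., 0., 0., 1., 0., 0.]])
logits = tf.constant([[0.1, 2.0, 0.3, 1.5, 0.2, 0.9],
                      [1.2, 0.4, 0.8, 2.1, 0.5, 0.3]])

top_k_pred = tf.nn.top_k(logits, 5)
# Offset each row's indices so they index into the flattened label tensor
tmp1 = tf.reshape(tf.range(batch_size) * num_classes, (-1, 1))
idx_incre = top_k_pred[1] + tf.concat([tmp1] * 5, 1)
correct_preds = tf.gather(tf.reshape(y_label, (-1,)), tf.reshape(idx_incre, (-1,)))
correct_preds = tf.reshape(correct_preds, (batch_size, 5))
weighted = correct_preds * [[5., 4., 3., 2., 1.]]

with tf.Session() as sess:
    print(sess.run([correct_preds, weighted]))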
Please see the code written below.
x = tf.placeholder("float", [None, 80])
W = tf.Variable(tf.zeros([80,2]))
b = tf.Variable(tf.zeros([2]))
y = tf.nn.softmax(tf.matmul(x,W) + b)
y_ = tf.placeholder("float", [None,2])
So here we see that there are 80 features in the data with only 2 possible outputs. I set the cross_entropy and the train_step like so.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(tf.matmul(x, W) + b, y_)
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
Initialize all variables.
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
Then I use this code to "train" my Neural Network.
g = 0
for i in range(len(x_train)):
    _, w_out, b_out = sess.run([train_step, W, b], feed_dict={x: [x_train[g]], y_: [y_train[g]]})
    g += 1
print "...Trained..."
After training the network, it always produces the same accuracy rate regardless of how many times I train it. That accuracy rate is 0.856067 and I get to that accuracy with this code-
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print sess.run(accuracy, feed_dict={x: x_test, y_: y_test})
0.856067
So this is where the question comes in. Is it because my dimensions are too small? Maybe I should break the features into a 10x8 matrix? Or a 4x20 matrix? etc.
Then I try to get the probabilities of the actual test data producing a 0 or a 1 like so-
test_data_actual = genfromtxt('clean-test-actual.csv',delimiter=',') # Actual Test data
x_test_actual = []
for i in test_data_actual:
    x_test_actual.append(i)
x_test_actual = np.array(x_test_actual)
ans = sess.run(y, feed_dict={x: x_test_actual})
And print out the probabilities:
print ans[0:10]
[[ 1. 0.]
[ 1. 0.]
[ 1. 0.]
[ 1. 0.]
[ 1. 0.]
[ 1. 0.]
[ 1. 0.]
[ 1. 0.]
[ 1. 0.]
[ 1. 0.]]
(Note: it does produce [ 0. 1.] sometimes.)
I then tried to see if applying the expert methodology would produce better results. Please see the following code.
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 1, 1, 1],
                          strides=[1, 1, 1, 1], padding='SAME')
(Please note how I changed the strides in order to avoid errors).
W_conv1 = weight_variable([1, 80, 1, 1])
b_conv1 = bias_variable([1])
Here is where the question comes in again. I define the Tensor (vector/matrix if you will) as 80x1 (so 1 row with 80 features in it); I continue to do that throughout the rest of the code (please see below).
x_ = tf.reshape(x, [-1,1,80,1])
h_conv1 = tf.nn.relu(conv2d(x_, W_conv1) + b_conv1)
Second Convolutional Layer
h_pool1 = max_pool_2x2(h_conv1)
W_conv2 = weight_variable([1, 80, 1, 1])
b_conv2 = bias_variable([1])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
Densely Connected Layer
W_fc1 = weight_variable([80, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 80])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
Dropout
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
Readout
W_fc2 = weight_variable([1024, 2])
b_fc2 = bias_variable([2])
y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
In the above you'll see that I defined the output as 2 possible answers (also to avoid errors).
Then cross_entropy and the train_step.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(tf.matmul(h_fc1_drop, W_fc2) + b_fc2, y_)
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
Start the session.
sess.run(tf.initialize_all_variables())
"Train" the neural network.
g = 0
for i in range(len(x_train)):
    if i % 100 == 0:
        train_accuracy = accuracy.eval(session=sess, feed_dict={x: [x_train[g]], y_: [y_train[g]], keep_prob: 1.0})
    train_step.run(session=sess, feed_dict={x: [x_train[g]], y_: [y_train[g]], keep_prob: 0.5})
    g += 1
print "test accuracy %g"%accuracy.eval(session=sess, feed_dict={
x: x_test, y_: y_test, keep_prob: 1.0})
test accuracy 0.929267
And, once again, it always produces 0.929267 as the output.
The probabilities on the actual data producing a 0 or a 1 are as follows:
[[ 0.92820859 0.07179145]
[ 0.92820859 0.07179145]
[ 0.92820859 0.07179145]
[ 0.92820859 0.07179145]
[ 0.92820859 0.07179145]
[ 0.92820859 0.07179145]
[ 0.96712834 0.03287172]
[ 0.92820859 0.07179145]
[ 0.92820859 0.07179145]
[ 0.92820859 0.07179145]]
As you can see, there is some variance in these probabilities, but it is typically the same result.
I know that this isn't a Deep Learning problem. This is obviously a training problem. I know that there should always be some variance in the training accuracy every time you reinitialize the variables and retrain the network, but I just don't know why or where it's going wrong.
The answer is two-fold.
One problem is with the dimensions/parameters. The other problem is that the features are being placed in the wrong spot.
W_conv1 = weight_variable([1, 2, 1, 80])
b_conv1 = bias_variable([80])
Notice the first two numbers in the weight_variable correspond to the dimensions of the input. The second two numbers correspond to the dimensions of the feature tensor. The bias_variable always takes the final number in the weight_variable.
Second Convolutional Layer
W_conv2 = weight_variable([1, 2, 80, 160])
b_conv2 = bias_variable([160])
Here the first two numbers still correspond to the dimensions of the input. The second two numbers correspond to the number of features and the weighted network that results from the 80 previous features. In this case, we double the weighted network: 80x2=160. The bias_variable then takes the final number in the weight_variable. If you were to finish the code at this point, the last number in the weight_variable would be 1 in order to prevent dimensional errors due to the shape of the input tensor and the output tensor. But, instead, for better predictions, let's add a third convolutional layer.
Third Convolutional Layer
W_conv3 = weight_variable([1, 2, 160, 1])
b_conv3 = bias_variable([1])
Once again, the first two numbers in the weight_variable take the shape of the input. The third number corresponds to the amount of the weighted variables we established in the Second Convolutional Layer. The last number in the weight_variable now becomes 1 so we don't run into any dimension errors on the output that we are predicting. In this case, the output has the dimensions of 1, 2.
W_fc2 = weight_variable([80, 1024])
b_fc2 = bias_variable([1024])
Here, the number of neurons (1024) is completely arbitrary, but the first number in the weight_variable needs to be something that the dimensions of our feature matrix are divisible by. In this case it can be any number (such as 2, 4, 10, 20, 40, 80). Once again, the bias_variable takes the last number in the weight_variable.
At this point, make sure that the last number in h_pool3_flat = tf.reshape(h_pool3, [-1, 80]) corresponds to the first number in the W_fc2 weight_variable.
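For context, here is a sketch of how these pieces might wire together, following the pattern of the earlier layers (h_conv3, h_pool3 and h_pool3_flat simply continue that naming, and W_fc3/b_fc3 are the readout weights referenced further below):

# Third convolutional layer feeding the fully connected part
h_conv3 = tf.nn.relu(conv2d(h_pool2, W_conv3) + b_conv3)
h_pool3 = max_pool_2x2(h_conv3)

# Flatten: the last number here must match the first number in W_fc2
h_pool3_flat = tf.reshape(h_pool3, [-1, 80])
h_fc2 = tf.nn.relu(tf.matmul(h_pool3_flat, W_fc2) + b_fc2)
h_fc2_drop = tf.nn.dropout(h_fc2, keep_prob)

# Readout layer producing the two-class output
W_fc3 = weight_variable([1024, 2])
b_fc3 = bias_variable([2])
y_conv = tf.nn.softmax(tf.matmul(h_fc2_drop, W_fc3) + b_fc3)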
Now when you run your training program you will notice that the outcome varies and won't always guess all 1's or all 0's.
When you want to predict the probabilities, you have to feed x to the softmax variable-> y_conv=tf.nn.softmax(tf.matmul(h_fc2_drop, W_fc3) + b_fc3) like so-
ans = sess.run(y_conv, feed_dict={x: x_test_actual, keep_prob: 1.0})
You can alter the keep_prob variable, but keeping it at a 1.0 always produces the best results. Now, if you print out ans you'll have something that looks like this-
[[ 0.90855026 0.09144982]
[ 0.93020624 0.06979381]
[ 0.98385173 0.0161483 ]
[ 0.93948185 0.06051811]
[ 0.90705943 0.09294061]
[ 0.95702559 0.04297439]
[ 0.95543593 0.04456403]
[ 0.95944828 0.0405517 ]
[ 0.99154049 0.00845954]
[ 0.84375167 0.1562483 ]
[ 0.98449463 0.01550537]
[ 0.97772813 0.02227189]
[ 0.98341942 0.01658053]
[ 0.93026513 0.06973486]
[ 0.93376994 0.06623009]
[ 0.98026556 0.01973441]
[ 0.93210858 0.06789146]
Notice how the probabilities vary. Your training is now working properly.