Related
I cannot evaluate my model because I get this error when I try to print its accuracy.
How can I evaluate my model? I use an LSTM to generate new data from my dataset. I know different metrics like accuracy, precision, and recall, but every time I try to apply them to my generated data I run into this problem.
# scaled is my dataset (already scaled); it contains 6879 rows with values like:
# array([[0.        , 0.        , 0.        , 0.        , 0.        ],
#        [0.        , 0.25      , 0.        , 0.07142857, 0.        ],
#        [0.        , 0.875     , 0.        , 0.07142857, 0.        ],
#        ...,
#        [0.98828125, 0.375     , 0.92050207, 0.5       , 0.        ],
from numpy import array
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

n_steps = 10
n_features = 5

# split a multivariate sequence into overlapping samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences) - 1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

X, y = split_sequences(sequences=scaled, n_steps=n_steps)
print(X.shape, y.shape)

xtrain, xtest, ytrain, ytest = train_test_split(X, y, test_size=0.25, random_state=42)

# define model
LSTM_model = Sequential()
LSTM_model.add(LSTM(100, return_sequences=False, activation='relu', input_shape=(n_steps, n_features)))
#model.add(LSTM(100, activation='relu'))
LSTM_model.add(Dense(n_features))
LSTM_model.compile(optimizer='adam', loss='mse')

# fit model
LSTM_model.fit(xtrain, ytrain, epochs=10, batch_size=100, verbose=1)
LSTM_model.summary()

print(accuracy_score(ytest, LSTM_model.predict(xtest)[:,0,:]))
This is the error:
ValueError Traceback (most recent call last)
<ipython-input-203-0e337cd696dc> in <module>()
1 #yhat = Conv1D_model.predict(X, verbose=0)
----> 2 print(accuracy_score(ytest, Conv1D_model2.predict(xtest)[:,0,:]))
3
1 frames
/usr/local/lib/python3.7/dist-packages/sklearn/metrics/_classification.py in _check_targets(y_true, y_pred)
102 # No metrics support "multiclass-multioutput" format
103 if y_type not in ["binary", "multiclass", "multilabel-indicator"]:
--> 104 raise ValueError("{0} is not supported".format(y_type))
105
106 if y_type in ["binary", "multiclass"]:
ValueError: continuous-multioutput is not supported
Edit 1: From the comments, this does not look like a classification problem, so accuracy_score() won't work, since it requires discrete, comparable labels and predictions.
Without seeing what your predictions look like (compared to your true labels), the biggest problem I see is that you're passing raw predictions to accuracy_score(), so the comparisons will almost never match. ValueError: continuous-multioutput is not supported most likely refers to the fact that the predictions are continuous float values, while the true labels are integer-like, such as 0. or 1.. Your predictions most likely include continuous values like 0.9876, which will never match 0. or 1.. You need to discretize them, either with some thresholding function or by rounding... probably.
if y_type not in ["binary", "multiclass", "multilabel-indicator"]: should also be an indicator that it's looking for either [0, 1] (binary), or [0, 1, 2, ..., n-1] (multiclass), or [[0, 1], [0, 2], [1, 2], ..., [n, m]] (multilabel).
"the set of labels predicted for a sample must exactly match the corresponding set of labels in y_true"
Do something like this:
import numpy as np

preds = LSTM_model.predict(xtest)  # gets predictions as floats
preds = np.rint(preds)             # rounds to the nearest int, where >= .5 becomes 1
print(accuracy_score(ytest, preds))
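Alternatively, since (per Edit 1) this looks more like a regression/generation task than classification, it probably makes more sense to report regression metrics instead of accuracy; a minimal sketch with sklearn, assuming the variables from the code above:
from sklearn.metrics import mean_squared_error, mean_absolute_error

preds = LSTM_model.predict(xtest)
print("MSE:", mean_squared_error(ytest, preds))  # both handle multi-output continuous targets
print("MAE:", mean_absolute_error(ytest, preds))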
Edit 0: You are also missing an activation function in your final Dense layer; without one, you're just getting a roughly linear projection of the previous layer's output through the n_features neurons.
LSTM_model.add(Dense(n_features, activation="softmax"))
LSTM_model.compile(optimizer="adam", loss="mse")
Can someone please explain the dimensionality logic for the input X and the class labels Y for the sparse_categorical_crossentropy loss function?
I checked both the Keras and TF2 docs and examples, and this post:
Cross Entropy vs Sparse, but one point is not clear to me.
Does the Y vector need to be expanded to the same number of columns as the number of classes the model outputs (if I use a softmax output), or does Keras automatically expand Y?
In my case, I have 32x32 input images, and Y is a number between 0 and 10.
So the input is (batch_size, h, w), and Y is (batch_size, 1) with an integer value 0...10.
X = (73257, 32, 32)
Y = (73257, 1)
model.fit(X, Y, epochs=30, validation_split=0.10, batch_size=1, verbose=True)
The model itself is just a Sequential stack of Dense layers with a softmax output.
model = Sequential()
model.add(Dense(32, activation='relu',
                input_shape=input_shape,
                kernel_initializer='he_uniform',
                bias_initializer='ones'))
# ... a bunch of Dense layers, then the softmax output
model.add(Dense(10, activation='softmax'))
The error is about dimensionality.
ValueError: Shape mismatch: The shape of labels (received (1, 1)) should equal the shape of logits except for the last dimension (received (1, 32, 10)).
Thank you.
As mentioned in that post, both categorical cross-entropy (cce) and sparse categorical cross-entropy (scc) use the same loss function; they differ only in the format of the true label Y. Simply put, if Y is an integer you use scc, whereas if Y is one-hot encoded you use cce. So for scc the ground truth Y is mostly 1D, whereas for cce it is mostly 2D. For the ground truth:
- (num_of_samples, n_class_one_hot_encode) <- for cce (2D)
- (num_of_samples, n_class_int) <- for scc (1D)
For example, if we use the cifar10 data set, we can do
import tensorflow as tf
(x_train, y_train), (_, _) = tf.keras.datasets.cifar10.load_data()
# train set / data
x_train = x_train.astype('float32') / 255
sparse = y_train
onehot = y_train
onehot = tf.keras.utils.to_categorical(onehot, num_classes=10)
print(sparse[:5]) # < --- (num_of_samples, n_class_int)
print(onehot[:5]) # < --- (num_of_samples, n_class_one_hot_encode)
[[6]
[9]
[9]
[4]
[1]]
[[0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]]
Now, let's define a simple model and train using the above both and see what happens.
def net():
    input = tf.keras.Input(shape=(32, 32, 3))
    x = tf.keras.layers.Conv2D(16, 3, activation="relu")(input)
    x = tf.keras.layers.MaxPooling2D(3)(x)
    x = tf.keras.layers.GlobalMaxPooling2D()(x)
    x = tf.keras.layers.Dense(10, activation='softmax')(x)
    model = tf.keras.Model(input, x)
    return model
Using cce
model = net()
model.compile(
    loss = tf.keras.losses.CategoricalCrossentropy(),
    metrics = 'accuracy',
    optimizer = 'adam')
his = model.train_on_batch(x_train, onehot, return_dict=True)
print(his)
{'loss': 2.376708984375, 'accuracy': 0.09651999920606613}
one_hot_pred = model.predict(x_train)
print(onehot[0])
print(one_hot_pred[0])
print(onehot[0].shape)
print(one_hot_pred[0].shape)
[0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
[0.1516315 0.1151238 0.11732318 0.10644271 0.08946694 0.1398355
0.05046898 0.04249624 0.11813554 0.06907552]
(10,)
(10,)
Now, using scc
model = net()
model.compile(
    loss = tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics = 'accuracy',
    optimizer = 'adam')
his = model.train_on_batch(x_train, sparse, return_dict=True)
print(his)
{'loss': 2.331458806991577, 'accuracy': 0.10066000372171402}
sparse_pred = model.predict(x_train)
print(sparse[0])
print(sparse_pred[0])
print(sparse[0].shape)
print(sparse_pred[0].shape)
[6]
[0.07184976 0.08837385 0.06910037 0.12347631 0.09542189 0.09981853
0.11247937 0.06707954 0.14902702 0.12337337]
(1,)
(10,)
Observe that the gt and pred shapes for scc are (1,) and (10,) respectively. In this case, the loss computes the logarithm only at the output index that the ground truth points to. For example, the gt here is 6, so the loss will use only the logarithm of pred[6]. Here are some more details on it.
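As a rough sanity check (a minimal sketch, reusing the sparse and sparse_pred arrays from above), you can reproduce the per-sample scc loss by hand:
import numpy as np

# scc takes the negative log of the predicted probability at the ground-truth index (6 here)
manual_loss = -np.log(sparse_pred[0][sparse[0][0]])
keras_loss = tf.keras.losses.SparseCategoricalCrossentropy()(sparse[:1], sparse_pred[:1]).numpy()
print(manual_loss, keras_loss)  # the two values should agree up to float precision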
I'm trying to implement a GAN in Keras, and I want to use the one-sided label smoothing trick, i.e. set the label of a true image to 0.9 instead of 1. However, now the built-in metric binary_crossentropy does not do the correct thing; it's always 0 for true images.
Then I tried to implement my own metric in Keras. I want to convert all 0.9 labels to 1, but I'm new to Keras and I don't know how to do that. Here's what I intend:
# Just pseudocode
def custom_metrics(y_true, y_pred):
    if K.equal(y_true, [[0.9]]):
        y_true = y_true + 0.1
    return metrics.binary_accuracy(y_true, y_pred)
How should I compare and change the y_true label? Thanks in advance!
EDIT:
The output of the following code is:
def custom_metrics(y_true, y_pred):
    print(K.shape(y_true))
    print(K.shape(y_pred))
    y_true = K.switch(K.equal(y_true, 0.9), K.ones_like(y_true), K.zeros_like(y_true))
    return metrics.binary_accuracy(y_true, y_pred)
Tensor("Shape:0", shape=(2,), dtype=int32)
Tensor("Shape_1:0", shape=(2,), dtype=int32)
ValueError: Shape must be rank 0 but is rank 2 for 'cond/Switch' (op: 'Switch') with input shapes: [?,?], [?,?].
You can use tf.where:
y_true = tf.where(K.equal(y_true, 0.9), tf.ones_like(y_true), tf.zeros_like(y_true))
Alternatively, you can use the keras.backend.switch function for that.
keras.backend.switch(condition, then_expression, else_expression)
Your custom metric function would look something like this:
def custom_metrics(y_true, y_pred):
    y_true = K.switch(K.equal(y_true, 0.9), K.ones_like(y_true), K.zeros_like(y_true))
    return metrics.binary_accuracy(y_true, y_pred)
Test code:
import numpy as np
from keras import backend as K

def test_function(y_true):
    print(K.eval(y_true))
    y_true = K.switch(K.equal(y_true, 0.9), K.ones_like(y_true), K.zeros_like(y_true))
    print(K.eval(y_true))

y_true = K.variable(np.array([0, 0, 0, 0, 0, 0.9, 0.9, 0.9, 0.9, 0.9]))
test_function(y_true)
output:
[0. 0. 0. 0. 0. 0.9 0.9 0.9 0.9 0.9]
[0. 0. 0. 0. 0. 1. 1. 1. 1. 1.]
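To actually use the custom metric, pass the function to compile(); a minimal sketch (the name discriminator is just a placeholder for your own model):
discriminator.compile(optimizer='adam',
                      loss='binary_crossentropy',
                      metrics=[custom_metrics])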
My task is to predict the five most probable tags for a sentence. Right now I have unscaled logits from the output (fully connected) layer:
with tf.name_scope("output"):
    scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
    predictions = tf.nn.top_k(self.scores, 5)  # should be the k highest scores

with tf.name_scope("accuracy"):
    labels = input_y  # its shape is (batch_size, num_classes)
    # calculate the top-k accuracy
Now predictions look like [3,1,2,50,12] (3, 1, ... are the indexes of the highest scores), while the labels are in "multi-hot" form: [0,1,0,1,1,0,...].
In plain Python/NumPy, I can simply write
correct_preds = [input_y[i] == 1 for i in predictions]
weighted = np.dot(correct_preds, [5, 4, 3, 2, 1])  # weighted by rank
recall = sum(correct_preds) / sum(input_y)
precision = sum(correct_preds) / len(correct_preds)
but in TensorFlow, what should I use to accomplish this?
Solution
I've coded up an example of how to do the calculations. All of the inputs in this example are coded as tf.constant but of course you can substitute your variables.
The main trick is the two matrix multiplications. The first multiplies input_y, reshaped to 2D, by a [1x5] ones matrix called to_top5 (tiling each label across the top-5 columns). The second multiplies correct_preds by the weighted_matrix.
Code
import tensorflow as tf

input_y = tf.constant([5, 2, 9, 1], dtype=tf.int32)
predictions = tf.constant([[9, 3, 5, 2, 1], [8, 9, 0, 6, 5], [1, 9, 3, 4, 5], [1, 2, 3, 4, 5]])

to_top5 = tf.constant([[1, 1, 1, 1, 1]], dtype=tf.int32)
input_y_for_top5 = tf.matmul(tf.reshape(input_y, [-1, 1]), to_top5)
correct_preds = tf.cast(tf.equal(input_y_for_top5, predictions), dtype=tf.float16)

weighted_matrix = tf.constant([[5.], [4.], [3.], [2.], [1.]], dtype=tf.float16)
weighted = tf.matmul(correct_preds, weighted_matrix)

recall = tf.reduce_sum(correct_preds) / tf.cast(tf.reduce_sum(input_y), tf.float16)
precision = tf.reduce_sum(correct_preds) / tf.constant(5.0, dtype=tf.float16)

# Run TensorFlow and print the result
with tf.Session() as sess:
    print("\n\n=============\n\n")
    print("\ninput_y_for_top5")
    print(sess.run(input_y_for_top5))
    print("\ncorrect_preds")
    print(sess.run(correct_preds))
    print("\nweighted")
    print(sess.run(weighted))
    print("\nrecall")
    print(sess.run(recall))
    print("\nprecision")
    print(sess.run(precision))
    print("\n\n=============\n\n")
Output
=============
input_y_for_top5
[[5 5 5 5 5]
[2 2 2 2 2]
[9 9 9 9 9]
[1 1 1 1 1]]
correct_preds
[[ 0. 0. 1. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 1. 0. 0. 0.]
[ 1. 0. 0. 0. 0.]]
weighted
[[ 3.]
[ 0.]
[ 4.]
[ 5.]]
recall
0.17651
precision
0.6001
=============
Summary
The above example shows a batch size of 4.
The first item in the batch has a y_label of 5, which means that the element with index 5 is the correct label for that item. Furthermore, the prediction for the first item is [9,3,5,2,1], which means that the prediction function thinks element 9 is the most likely, element 3 the next most likely, and so on.
Let's say we want an example with a batch size of 3; then use the following code:
input_y = tf.constant( [5,2,9] , dtype=tf.int32 )
predictions = tf.constant( [[9,3,5,2,1],[8,9,0,6,5],[1,9,3,4,5]])
If we substitute the above lines into the program, we can see that it indeed calculates everything correctly for a batch size of 3.
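For intuition, the same computation for the first item can be sketched in plain NumPy (just an illustration, not part of the original graph):
import numpy as np

label = 5                                     # ground-truth index for the first item
pred = np.array([9, 3, 5, 2, 1])              # top-5 predicted indices for the first item
correct = (pred == label).astype(float)       # -> [0., 0., 1., 0., 0.]
weighted = correct.dot([5., 4., 3., 2., 1.])  # -> 3.0, the label was matched at rank 3
print(correct, weighted)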
Inspired by @wontonimo's answer above, I implemented a method using matrix ops, tf.reshape, and tf.gather. The label tensor is "multi-hot", e.g. [[0,1,0,1],[1,0,0,1]]. The prediction tensor is obtained with tf.nn.top_k and looks like [[3,1],[0,1]]. Here is the code:
top_k_pred = tf.nn.top_k(logits, 5)
tmp1 = tf.reshape(tf.range(batch_size) * num_classes, (-1, 1))
idx_incre = top_k_pred[1] + tf.concat([tmp1] * 5, 1)
correct_preds = tf.gather(tf.reshape(y_label, (-1,)), tf.reshape(idx_incre, (-1,)))
correct_preds = tf.reshape(correct_preds, (batch_size, 5))
weighted = correct_preds * [[5, 4, 3, 2, 1]]  # weight each hit by its rank
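The offset trick may be easier to see with concrete numbers; a small NumPy sketch using the example values above (batch_size=2, num_classes=4, top-2):
import numpy as np

y_label = np.array([[0, 1, 0, 1],
                    [1, 0, 0, 1]])           # multi-hot labels
top_k_idx = np.array([[3, 1],
                      [0, 1]])               # per-row top-k class indices

offsets = (np.arange(2) * 4).reshape(-1, 1)  # [[0], [4]]: start of each row in the flattened labels
flat_idx = top_k_idx + offsets               # [[3, 1], [4, 5]]
correct_preds = y_label.reshape(-1)[flat_idx.reshape(-1)].reshape(2, 2)
print(correct_preds)                         # [[1 1]
                                             #  [1 0]]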
When I print out the predictions, the output includes 3 separate classes (0, 1, and 2), but I only give it 2 separate classes in the training set (0 and 1). I'm not sure why this is happening. I'm trying to elaborate on a tutorial from the TensorFlow Machine Learning Cookbook; this is based on the last example of Chapter 2, if anyone has access to it. Note that there are some errors, but they may be due to incompatibility with the older version used in the text.
Anyway, I am trying to develop a very rigid structure when building my models so I can get it ingrained in muscle memory. I instantiate the tf.Graph beforehand for each tf.Session of a set of computations and also set the number of threads to use. Note that I am using TensorFlow 1.0.1 with Python 3.6.1, so the f"formatstring{var}" syntax won't work if you have an older version of Python.
Where I am getting confused is the last prediction step under the # Accuracy Predictions section. Why am I getting 3 classes in my classification, and why is my accuracy so poor for such a simple problem? I am fairly new to this type of model-based machine learning, so I'm sure it's some syntax error or assumption I've made. Is there an error in my code?
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
import multiprocessing

# Set the number of CPU threads to use
tf_max_threads = tf.ConfigProto(intra_op_parallelism_threads=multiprocessing.cpu_count())

# Data
seed = 0
size = 50
x = np.concatenate((np.random.RandomState(seed).normal(-1, 1, size),
                    np.random.RandomState(seed).normal(2, 1, size)))
y = np.concatenate((np.repeat(0, size),
                    np.repeat(1, size)))
# Containers
loss_data = list()
A_data = list()
# Graph
G_6 = tf.Graph()
n = 25
# Containers
loss_data = list()
A_data = list()
# Iterations
n_iter = 5000
# Train / Test Set
tr_ratio = 0.8
tr_idx = np.random.RandomState(seed).choice(x.size, round(tr_ratio*x.size), replace=False)
te_idx = np.array(list(set(range(x.size)) - set(tr_idx)))
# Build Graph
with G_6.as_default():
    # Placeholders
    pH_x = tf.placeholder(tf.float32, shape=[None, 1], name="pH_x")
    pH_y_hat = tf.placeholder(tf.float32, shape=[None, 1], name="pH_y_hat")
    # Train Set
    x_train = x[tr_idx].reshape(-1, 1)
    y_train = y[tr_idx].reshape(-1, 1)
    # Test Set
    x_test = x[te_idx].reshape(-1, 1)
    y_test = y[te_idx].reshape(-1, 1)
    # Model
    A = tf.Variable(tf.random_normal(mean=10, stddev=1, shape=[1], seed=seed), name="A")
    model = tf.multiply(pH_x, A)
    # Loss
    loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=model, labels=pH_y_hat))
    with tf.Session(graph=G_6, config=tf_max_threads) as sess:
        sess.run(tf.global_variables_initializer())
        # Optimizer
        op = tf.train.GradientDescentOptimizer(0.03)
        train_step = op.minimize(loss)
        # Train linear model
        for i in range(n_iter):
            idx_random = np.random.RandomState(i).choice(x_train.size, size=n)
            x_tr = x[idx_random].reshape(-1, 1)
            y_tr = y[idx_random].reshape(-1, 1)
            sess.run(train_step, feed_dict={pH_x: x_tr, pH_y_hat: y_tr})
            # Iterations
            A_iter = sess.run(A)[0]
            loss_iter = sess.run(loss, feed_dict={pH_x: x_tr, pH_y_hat: y_tr}).mean()
            # Append
            loss_data.append(loss_iter)
            A_data.append(A_iter)
            # Log (to_precision is a formatting helper defined elsewhere)
            if (i + 1) % 1000 == 0:
                print(f"Step #{i + 1}:\tA = {A_iter}", f"Loss = {to_precision(loss_iter)}", sep="\t")
                print()
        # Accuracy Predictions
        A_result = sess.run(A)
        y_ = tf.squeeze(tf.round(tf.nn.sigmoid_cross_entropy_with_logits(logits=model, labels=pH_y_hat)))
        correct_predictions = tf.equal(y_, pH_y_hat)
        accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
        print(sess.run(y_, feed_dict={pH_x: x_train, pH_y_hat: y_train}))
        print("Training:",
              f"Accuracy = {sess.run(accuracy, feed_dict={pH_x: x_train, pH_y_hat: y_train})}",
              f"Shape = {x_train.shape}", sep="\t")
        print("Testing:",
              f"Accuracy = {sess.run(accuracy, feed_dict={pH_x: x_test, pH_y_hat: y_test})}",
              f"Shape = {x_test.shape}", sep="\t")
        # Plot path
        with plt.style.context("seaborn-whitegrid"):
            fig, ax = plt.subplots(nrows=3, figsize=(6, 6))
            pd.Series(loss_data).plot(ax=ax[0], label="loss", legend=True)
            pd.Series(A_data).plot(ax=ax[1], color="red", label="A", legend=True)
            ax[2].hist(x[:size], np.linspace(-5, 5), label="class_0", color="red")
            ax[2].hist(x[size:], np.linspace(-5, 5), label="class_1", color="blue")
            alphas = np.linspace(0, 0.5, len(A_data))
            for i in range(0, len(A_data), 100):
                alpha = alphas[i]
                a = A_data[i]
                ax[2].axvline(a, alpha=alpha, linestyle="--", color="black")
            ax[2].legend(loc="upper right")
            fig.suptitle("training-process", fontsize=15, y=0.95)
Output Results:
Step #1000: A = 6.72 Loss = 1.13
Step #2000: A = 3.93 Loss = 0.58
Step #3000: A = 2.12 Loss = 0.319
Step #4000: A = 1.63 Loss = 0.331
Step #5000: A = 1.58 Loss = 0.222
[ 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 2.
0. 0. 2. 0. 2. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 1. 0.
1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.
0. 0. 0. 0. 0. 0. 0. 0.]
Training: Accuracy = 0.475 Shape = (80, 1)
Testing: Accuracy = 0.5 Shape = (20, 1)
Your model doesn't do classification
You have a linear regression model, i.e., your output variable (model = tf.multiply(pH_x, A)) outputs for each input a single scalar value with an arbitrary range. That's generally what you'd have for a prediction model, one that needs to predict some numeric value, not for a classifier.
Afterwards, you treat it as if it contained a typical n-ary classifier output (e.g. by passing it to sigmoid_cross_entropy_with_logits), but it does not match the expectations of that function. In that case, the 'shape' of the model variable should be multiple values per input datapoint (e.g. 2 in your case), each one a score related to the probability of the corresponding class; these are then often passed through a softmax function to normalize them.
Alternatively, you may want a binary classifier model that outputs a single value (0 or 1) depending on the class; in that case, you want something like the logistic (sigmoid) function after the multiplication, and that would need a different loss function, something like a simple mean squared difference, not sigmoid_cross_entropy_with_logits.
As written, the model currently seems like a mash-up of two different, incompatible tutorials.
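For what it's worth, one way to get 0/1 predictions out of the existing single-logit model is to round the sigmoid of the logits rather than the loss values (the current y_ line rounds the output of sigmoid_cross_entropy_with_logits, which is why values like 2 appear); a minimal sketch, not the book's code:
y_ = tf.round(tf.sigmoid(model))              # predicted class: sigmoid(logit) rounded to 0 or 1
correct_predictions = tf.equal(y_, pH_y_hat)  # both are shape [None, 1], so this compares element-wise
accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))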