I'm trying to plot the ROC curve from a modified version of the CIFAR-10 example provided by TensorFlow. It's now for 2 classes instead of 10.
The outputs of the network are called logits and take the form:
[[-2.57313061  2.57966399]
 [ 0.04221377 -0.04033273]
 [-1.42880082  1.43337202]
 [-2.7692945   2.78173304]
 [-2.48195744  2.49331546]
 [ 2.0941515  -2.10268974]
 [-3.51670194  3.53267646]
 [-2.74760485  2.75617766]
 ...]
First of all, what do these logits actually represent? The final layer in the network is a "softmax linear" of form WX+b.
The model is able to calculate accuracy by calling
top_k_op = tf.nn.in_top_k(logits, labels, 1)
Then once the graph has been initialized:
predictions = sess.run([top_k_op])
predictions_int = np.array(predictions).astype(int)
true_count += np.sum(predictions)
...
precision = true_count / total_sample_count
This works fine.
But now how can I plot a ROC curve from this?
I've been trying the "sklearn.metrics.roc_curve()" function (http://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html#sklearn.metrics.roc_curve) but I don't know what to use as my "y_score" parameter.
Any help would be appreciated!
'y_score' here should be an array with, for each sample, the probability of it being classified as positive (assuming the positive class is labeled 1 in your y_true array).
Actually, if your network uses softmax as the last layer, then the model outputs a probability for each class for each instance. But the data you've given here doesn't conform to that format. I checked the example code: https://github.com/tensorflow/tensorflow/blob/r0.10/tensorflow/models/image/cifar10/cifar10.py
It seems to use a layer called softmax_linear. I know little about this example, but I'd guess you need to process the output with something like the softmax function to turn it into probabilities.
Then just feed it, along with your true labels y_true, to the scikit-learn function:
y_score = np.array(output)[:,1]
roc_curve(y_true, y_score)
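Putting it together, a minimal sketch (assuming logits is the N x 2 array shown in the question and y_true holds the 0/1 labels):
import numpy as np
from sklearn.metrics import roc_curve

logits = np.asarray(logits)  # the N x 2 logits from the question
shifted = logits - logits.max(axis=1, keepdims=True)  # stabilize before exponentiating
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)  # row-wise softmax

y_score = probs[:, 1]  # probability of the positive class
fpr, tpr, thresholds = roc_curve(y_true, y_score)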
import tensorflow as tf

tp = []  # list of true positive rates, to be filled in (see below)
fp = []  # list of false positive rates, to be filled in (see below)
total = len(fp)
writer = tf.train.SummaryWriter("/tmp/tensorboard_roc")
for idx in range(total):
    summt = tf.Summary()
    summt.value.add(tag="roc", simple_value=tp[idx])
    # use the false positive rate as the global_step, so TensorBoard plots tpr against fpr
    writer.add_summary(summt, int(fp[idx] * 100))
writer.flush()
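To populate tp and fp before the loop above, one option is scikit-learn's roc_curve (a sketch, assuming you already have y_true labels and y_score probabilities as in the previous answer):
from sklearn.metrics import roc_curve

fpr, tpr, _ = roc_curve(y_true, y_score)
tp = list(tpr)
fp = list(fpr)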
Then start TensorBoard:
tensorboard --logdir=/tmp/tensorboard_roc
for details and code, you can visit my blog: http://blog.csdn.net/mao_feng/article/details/54731098
I am a newbie to neural networks. I coded a perceptron in Python 3.10 without using any libraries, but I am facing an issue: it returns either True for all the input data or False for all the input data, and I am not sure why this happens.
Details about the project:
The learning rate of the perceptron is set to 0.1.
It is trained on a bunch of randomly generated points (100 of them).
Its purpose is to figure out whether the x-coordinate of a point is greater than its y-coordinate.
It uses the "sign" activation function.
# activation function, returns 1 if positive else -1
def sig(self, n):
    return 1 if n == abs(n) else -1
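For context, the training points are generated roughly like this (a sketch; the real Point class lives in the linked repo, so the names and coordinate ranges here are assumptions):
import random

class Point:
    def __init__(self):
        self.x = random.uniform(-100, 100)
        self.y = random.uniform(-100, 100)
        self.inputs = [self.x, self.y]
        # label is 1 when the x-coordinate exceeds the y-coordinate
        self.label = 1 if self.x > self.y else -1

train_data = [Point() for _ in range(100)]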
The training process looks like this,
# training the perceptron with the training data
for point in train_data:
    neuron.train(point.inputs, point.label)
The training method is defined as,
# training the perceptron with known data
def train(self, train_data, target):
    prediction = self.predict(train_data)
    error = prediction - target
    # nudging the weights by calculating delta weight
    for i in range(len(self.weights)):
        delta_weight = error * train_data[i] * self.learning_rate
        self.weights[i] += delta_weight
Link to github repo: https://github.com/cipherDOT/perceptron
Please help me solve this issue.
I'm using a CNN with an autoencoder to cluster different types of RNA. The clusters are calculated from the compressed representations of the different RNAs. Every RNA has a label corresponding to its type; in my case there are 7 different classes. After the clustering I would like to visualize the results and see which RNA ended up in which cluster, but right now the y_pred value does not correspond to the RNA class but to the cluster that was initialized by k-means.
kmeans = KMeans(n_clusters=self.n_clusters, n_init=20)
self.y_pred = kmeans.fit_predict(self.encoder.predict(x))
y_pred_last = np.copy(self.y_pred)
self.model.get_layer(name='clustering').set_weights([kmeans.cluster_centers_])
print(kmeans.labels_)
self.y_pred = q.argmax(1)
if y is not None:
    acc = np.round(metrics.acc(y, self.y_pred), 5)
    nmi = np.round(metrics.nmi(y, self.y_pred), 5)
    ari = np.round(metrics.ari(y, self.y_pred), 5)
    loss = np.round(loss, 5)
    logdict = dict(iter=ite, acc=acc, nmi=nmi, ari=ari, L=loss[0], Lc=loss[1], Lr=loss[2])
optimizer = 'adam'
dcec.compile(loss=['kld', 'mse'], loss_weights=[args.gamma, 1], optimizer=optimizer)
dcec.fit(x, y=y, tol=args.tol, maxiter=args.maxiter,
         update_interval=args.update_interval,
         save_dir=args.save_dir,
         cae_weights=args.cae_weights)
y_pred = dcec.y_pred
result = list(itertools.chain(y))
with open('datapoints.csv', mode='w', newline='') as data_points:
    data_writer = csv.writer(data_points)
    data_writer.writerow(['id', 'ytrue', 'ypred'])
    truth = y
    prediction = dcec.y_pred
    for i in range(len(result)):
        data_writer.writerow([i, truth[i], prediction[i]])
My problem right now is this part: prediction = dcec.y_pred
The output shows me the correct true label, but not the "correct" predicted label: it returns a value, but that value does not correspond to the RNA types.
I don't know if this is the right path. Mainly I just want to visualize the clusters and see which RNA type was rightly and wrongly classified.
You might not be using the correct function call to get the prediction from the Keras model. I believe you should be doing something like:
prediction = dcec.predict(x)
Additional details are here: https://keras.io/models/model/
I hope this helps.
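In addition, even the raw cluster IDs can be made comparable to the RNA classes. A common approach is to map each cluster to its best-matching true class with the Hungarian algorithm; here is a sketch of that idea (assuming y holds the true labels and the predictions are cluster IDs):
import numpy as np
from scipy.optimize import linear_sum_assignment

def map_clusters_to_classes(y_true, y_pred):
    # contingency matrix: rows are cluster IDs, columns are true classes
    n = max(y_pred.max(), y_true.max()) + 1
    overlap = np.zeros((n, n), dtype=np.int64)
    for cluster, label in zip(y_pred, y_true):
        overlap[cluster, label] += 1
    # the assignment maximizes total overlap (hence the negated matrix)
    rows, cols = linear_sum_assignment(-overlap)
    mapping = dict(zip(rows, cols))
    return np.array([mapping[cluster] for cluster in y_pred])

# e.g. remapped = map_clusters_to_classes(y, dcec.y_pred)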
I am trying to build a neural network in TensorFlow where the cost of a Type I error (false positive) is higher than that of a Type II error (false negative). Is there a way to impose this during the training process (i.e. by inputting a cost matrix)? This is possible with simple models like logistic regression in scikit-learn by specifying the class_weight parameter.
cw = {0: 3, 1: 1}
clf = LogisticRegression(class_weight=cw)
In this case, incorrectly predicting a 0 is 3x more costly than incorrectly predicting a 1. However, class_weight cannot be passed like this to a neural network, so I want to see if it is possible in TensorFlow.
Thanks
You could use tf.nn.weighted_cross_entropy_with_logits and its pos_weight argument.
This argument weights the positive class, as described by the documentation (in TF 2.0 at least):
A value pos_weight > 1 decreases the false negative count, hence increasing the recall.
Conversely setting pos_weight < 1 decreases the false positive count and increases the precision.
In your case, false positives should be the costlier ones, so per the quote above you want pos_weight < 1. You could create a custom loss function like this:
import tensorflow as tf

# Pass the raw logits from your network, not the values after sigmoid activation
class WeightedBinaryCrossEntropy:
    def __init__(self, positive_weight: float):
        self.positive_weight = positive_weight

    def __call__(self, targets, logits, sample_weight=None):
        return tf.nn.weighted_cross_entropy_with_logits(
            targets, logits, pos_weight=self.positive_weight
        )
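To see what pos_weight does, here is a quick check with hypothetical labels and logits (one confident false negative and one confident false positive), using pos_weight=3 just to illustrate the mechanics:
loss_fn = WeightedBinaryCrossEntropy(positive_weight=3.0)
labels = tf.constant([[1.0], [0.0]])
logits = tf.constant([[-2.0], [2.0]])
# the false negative (first row) is penalized 3x as much as the false positive
print(loss_fn(labels, logits))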
Then create a neural network with it, for example using tf.keras (errors are weighted as they were in your question: a mis-predicted 0 costs three times a mis-predicted 1):
import numpy as np

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Dense(32, input_shape=(10,)),
        tf.keras.layers.Activation("relu"),
        tf.keras.layers.Dense(10),
        tf.keras.layers.Activation("relu"),
        # Output one logit for binary classification
        tf.keras.layers.Dense(1),
    ]
)
# Example random data; targets are floats shaped like the logits
data = np.random.random((32, 10))
targets = np.random.randint(2, size=(32, 1)).astype(np.float32)
# Type I errors (false positives) are 3x as costly, so down-weight the positive class
model.compile(optimizer="rmsprop", loss=WeightedBinaryCrossEntropy(positive_weight=1 / 3))
model.fit(data, targets, batch_size=32)
You can use a logarithmic scale. For a 0 incorrectly predicted as 1, y - ŷ = -1 and the loss comes to about 1.72. For a 1 predicted as 0, y - ŷ = 1 and the loss is about 0.63. For y == ŷ the loss is 0. So a 0 incorrectly predicted as 1 is almost three times as costly. (Note the expression below simplifies to abs(1 - exp(ŷ - y)).)
import numpy as np
from math import exp

# y and ŷ are scalars here, just to illustrate the values the loss takes
loss = abs(1 - exp(-np.log(exp(y - ŷ))))

# abs(1 - exp(-np.log(exp(0))))
# Out[53]: 0.0
# abs(1 - exp(-np.log(exp(-1))))
# Out[54]: 1.718281828459045
# abs(1 - exp(-np.log(exp(1))))
# Out[55]: 0.6321205588285577
Then you will have a convex optimization problem. To implement it as a Keras loss, note that the element-wise operations must use backend functions rather than math.exp and np.log:
import keras.backend as K

def custom_loss(y_true, y_pred):
    # equivalent to the expression above: abs(1 - exp(ŷ - y)), element-wise
    return K.mean(K.abs(1 - K.exp(y_pred - y_true)))
Then:
model.compile(loss=custom_loss, optimizer=sgd, metrics=['accuracy'])
For a text dataset of ~20,000 samples, the true and false samples are ~5,000 against ~15,000. A two-channel textCNN built with Keras and Theano is used to do the classification, with F1 score as the evaluation metric. The F1 score is not bad, but the confusion matrix shows that the accuracy on the true samples is relatively low (~40%), and it is actually very important to predict the true samples accurately. I therefore want to design a custom binary cross entropy loss function that increases the weight of mis-classified true samples and makes the model focus more on predicting the true samples accurately.
I tried class_weight (computed with sklearn) in the model.fit method, and it did not work very well, since the weight is applied to all samples of a class instead of only the mis-classified ones.
I also tried and adjusted the method mentioned here: https://github.com/keras-team/keras/issues/2115, but that loss function is a categorical cross entropy and did not work well for the binary classification problem. I tried to modify it into a binary one but ran into some issues concerning the input dimensions.
The sample code of the cost sensitive loss function focusing on the mis-classified samples is:
import keras.backend as K
from itertools import product

def w_categorical_crossentropy(y_true, y_pred, weights):
    nb_cl = len(weights)
    final_mask = K.zeros_like(y_pred[:, 0])
    y_pred_max = K.max(y_pred, axis=1)
    y_pred_max = K.reshape(y_pred_max, (K.shape(y_pred)[0], 1))
    # cast the boolean mask to floats so it can be multiplied below
    y_pred_max_mat = K.cast(K.equal(y_pred, y_pred_max), K.floatx())
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        final_mask += (weights[c_t, c_p] * y_pred_max_mat[:, c_p] * y_true[:, c_t])
    return K.categorical_crossentropy(y_true, y_pred) * final_mask
Actually, a custom loss function for binary classification implemented with Keras and Theano that focuses on the mis-classified samples would be of great value for imbalanced datasets like this one. Please help troubleshoot this. Thanks!
Well, when I have to deal with imbalanced datasets in Keras, what I do is first compute the weights for each class and pass them to the model during training. It will look something like this:
import numpy as np
from sklearn.utils import compute_class_weight

w = compute_class_weight('balanced', classes=np.unique(targets), y=targets)

# here I am adding only two categories with their corresponding weights
# you can spin a loop or continue by hand until you include all of your categories
weights = {
    np.unique(targets)[0]: w[0],  # class 0 with weight w[0]
    np.unique(targets)[1]: w[1],  # class 1 with weight w[1]
}

# then during training you do it like this (plus whatever other arguments you need)
model.fit(x=features, y=targets, class_weight=weights)
I believe this will solve your problem.
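If you still want the loss itself to focus on the mis-classified samples rather than weighting whole classes, a focal-style binary cross entropy is another option. Here is a sketch of that idea (not from the original code; pos_weight and gamma are assumed hyperparameters to tune):
import keras.backend as K

def weighted_focal_bce(pos_weight=3.0, gamma=2.0):
    # (1 - p_t)^gamma down-weights already well-classified samples,
    # while pos_weight up-weights errors on the positive (true) class
    def loss(y_true, y_pred):
        eps = K.epsilon()
        y_pred = K.clip(y_pred, eps, 1.0 - eps)
        pos = -pos_weight * y_true * K.pow(1.0 - y_pred, gamma) * K.log(y_pred)
        neg = -(1.0 - y_true) * K.pow(y_pred, gamma) * K.log(1.0 - y_pred)
        return K.mean(pos + neg)
    return loss

# model.compile(loss=weighted_focal_bce(pos_weight=3.0, gamma=2.0), optimizer='adam')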
I am relatively new to machine learning, especially when it comes to implementing algorithms. I am using Python and the TensorFlow library to implement a neural network to train on a dataset which has about 20 classes. I am able to train and get predictions successfully, but I have a question:
Is it possible to get top k classes along with their probabilities using tensorflow instead of just a single prediction?
If it is possible how can this be done? Thanks for your guidance.
Update 01:
I am adding code of what I am doing. I built a neural network with 3 hidden layers using tanh, sigmoid, and sigmoid respectively as their activation functions, and softmax for the output layer. The code for training and prediction is as follows:
y_pred = None
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        # running epoch number `epoch`
        _, cost = sess.run([optimizer, cost_function], feed_dict={X: tr_features, Y: tr_labels})
        cost_history = np.append(cost_history, cost)
    # predict results based on the trained model
    y_pred = sess.run(tf.argmax(y_, 1), feed_dict={X: ts_features})
Right now y_pred is a list with one class label per test example in ts_features. But instead of a single class label per test example, I am hoping to get the top-k predictions for each example, each of the k predictions accompanied by some kind of probability.
Using tf.nn.top_k():
top_k_values, top_k_indices = tf.nn.top_k(predictions, k=k)
If predictions is a vector of probabilities per class (i.e. predictions[i] = prediction probability for class i), then top_k_values will contain the k highest probabilities in predictions, and top_k_indices will contain the indices of these probabilities, i.e. the corresponding classes.
Supposing that in your code, y_ is the vector of predicted probabilities per class:
k = 3  # replace with your value
# Instead of `y_pred`:
y_k_probs, y_k_pred = sess.run(
    tf.nn.top_k(y_, k=k), feed_dict={X: ts_features})
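Since y_ comes from a softmax layer, the returned values can be read directly as probabilities. If you then want each class index next to its probability, a small illustrative loop:
for probs, classes in zip(y_k_probs, y_k_pred):
    # e.g. [(7, 0.61), (2, 0.20), (14, 0.11)] for one test example
    print(list(zip(classes, probs)))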