I have a fine-tuned model that returns, among other outputs, the predicted (reconstructed) images. I want to get the predicted classes from those images in order to compute a confusion matrix, either manually or with scikit-learn. I can retrieve the true class of each image in my original dataset, but I'm struggling to get the predicted class of each image. So far this is a snippet of my code:
predlist = torch.zeros(0, dtype=torch.long, device='cpu')
lbllist = torch.zeros(0, dtype=torch.long, device='cpu')

with torch.no_grad():
    for i, (inputs, classes) in enumerate(val_loader):
        inputs = inputs.to(device)    # [1, 13, 224, 224]
        classes = classes.to(device)
        # model_ft returns (loss, pred, mask, target_2d):
        #   0 LOSS, 1 PRED [1, 196, 3328], 2 MASK [1, 196], 3 TARGET [1, 13, 224, 224]
        outputs = model_ft(inputs)[1]
        model = models_mae_mod.__dict__['mae_vit_small_patch16'](in_chans=13, feature='raw')
        outputs = model.unpatchify(outputs)   # [1, 13, 224, 224]
        _, preds = torch.max(outputs, 1)      # max over the channel dimension
        predlist = torch.cat([predlist, preds.view(-1).cpu()])
        lbllist = torch.cat([lbllist, classes.view(-1).cpu()])
        if i > 10:
            break
After debugging I got these values:
a = torch.max(outputs, 1)
a
torch.return_types.max(
values=tensor([[[0.5419, 0.3766, 1.0952, ..., 0.9223, 0.7693, 1.0980],
[1.9111, 1.4176, 0.9902, ..., 1.3873, 0.9266, 0.6857],
[0.8174, 0.5505, 0.8097, ..., 0.8501, 0.1761, 1.0284],
...,
[0.5996, 0.4945, 0.8258, ..., 0.8206, 1.3554, 1.1564],
[0.3814, 0.7084, 0.8026, ..., 0.6130, 1.1291, 1.3241],
[1.4426, 1.3198, 0.9262, ..., 0.9011, 0.7266, 0.8977]]],
device='cuda:0'),
indices=tensor([[[ 2, 3, 4, ..., 0, 1, 10],
[ 7, 9, 3, ..., 6, 9, 4],
[ 2, 4, 3, ..., 9, 7, 1],
...,
[10, 0, 1, ..., 11, 9, 3],
[ 6, 7, 10, ..., 7, 11, 8],
[ 6, 7, 8, ..., 7, 4, 4]]], device='cuda:0'))
_.shape
torch.Size([1, 224, 224])
preds.shape
torch.Size([1, 224, 224])
This is just for one image. I know that each index corresponds to a value in the values tensor, but I cannot understand how to get the probability or the prediction for the image as a whole rather than a tensor the size of the image. Do you have any idea how to get that information? How could I get the predicted class for each image?
PS: The indices tensor looks like the prediction, but I'm not sure, since it contains values from 0 to 12 while the dataset only has 10 classes, so it seems to be something else.
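A note on that last observation: the unpatchified output has 13 channels (in_chans=13), and torch.max(outputs, 1) reduces over that channel dimension, so the indices are per-pixel channel indices in the range 0 to 12, not class predictions. A minimal sketch illustrating the shape logic (random stand-in data, shapes taken from the question):

import torch

# Stand-in for the reconstructed image: batch of 1, 13 channels, 224x224 pixels
outputs = torch.randn(1, 13, 224, 224)

values, indices = torch.max(outputs, 1)            # reduce over dim 1 (the 13 channels)
print(indices.shape)                               # torch.Size([1, 224, 224])
print(indices.min().item(), indices.max().item())  # indices lie in 0..12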
I am receiving the following error when I run a convolution operation inside a torch.no_grad() context:
RuntimeError: Only Tensors of floating-point and complex dtype can require gradients.
import torch
import torch.nn as nn

with torch.no_grad():
    ker_t = torch.tensor([[1, -1], [-1, 1]])
    in_t = torch.tensor([[14, 7, 6, 2], [4, 8, 11, 1], [3, 5, 9, 10], [12, 15, 16, 13]])
    print(in_t.shape)
    in_t = torch.unsqueeze(in_t, 0)
    in_t = torch.unsqueeze(in_t, 0)
    print(in_t.shape)
    conv = nn.Conv2d(1, 1, kernel_size=2, stride=2, dtype=torch.long)
    conv.weight[:] = ker_t
    conv(in_t)
Now I am sure that if I turn my input into floats this message will go away, but I want to work with integers. However, I was under the impression that being in a torch.no_grad() context should turn off the need for gradients.
The need for gradients comes from nn.Conv2d itself: when the layer registers its weights, they are created as parameters with requires_grad=True, which is only supported for floating-point and complex dtypes. However, if you are only after the forward pass, you do not need a convolution layer at all: you can use the underlying convolution function:
import torch
import torch.nn.functional as nnf

ker_t = torch.tensor([[1, -1], [-1, 1]])[None, None, ...]   # shape [1, 1, 2, 2]
in_t = torch.tensor([[14, 7, 6, 2], [4, 8, 11, 1], [3, 5, 9, 10], [12, 15, 16, 13]])[None, None, ...]
out = nnf.conv2d(in_t, ker_t, stride=2)
This will give you the following output:
tensor([[[[11, -6],
[ 1, -4]]]])
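If you do want to keep the nn.Conv2d layer API, one workaround (a sketch, assuming a float forward pass followed by a cast back to integers is acceptable) is to build the layer with float weights, freeze them, and convert the result:

import torch
import torch.nn as nn

ker_t = torch.tensor([[1., -1.], [-1., 1.]])
in_t = torch.tensor([[14., 7., 6., 2.],
                     [4., 8., 11., 1.],
                     [3., 5., 9., 10.],
                     [12., 15., 16., 13.]])[None, None, ...]

conv = nn.Conv2d(1, 1, kernel_size=2, stride=2, bias=False)
with torch.no_grad():
    conv.weight.copy_(ker_t[None, None, ...])  # load the integer kernel as floats
conv.weight.requires_grad_(False)              # freeze the weights entirely

out = conv(in_t).long()                        # cast back to integers
print(out)                                     # values: [[11, -6], [1, -4]]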
I'm using Python and have some confusion matrices. I'd like to calculate the precision, recall, and F-measure from a confusion matrix in multiclass classification. My result logs don't contain y_true and y_pred, just the confusion matrix.
Could you tell me how to get these scores from a confusion matrix in multiclass classification?
Let's consider the case of MNIST data classification (10 classes), where for a test set of 10,000 samples we get the following confusion matrix cm (Numpy array):
array([[ 963, 0, 0, 1, 0, 2, 11, 1, 2, 0],
[ 0, 1119, 3, 2, 1, 0, 4, 1, 4, 1],
[ 12, 3, 972, 9, 6, 0, 6, 9, 13, 2],
[ 0, 0, 8, 975, 0, 2, 2, 10, 10, 3],
[ 0, 2, 3, 0, 953, 0, 11, 2, 3, 8],
[ 8, 1, 0, 21, 2, 818, 17, 2, 15, 8],
[ 9, 3, 1, 1, 4, 2, 938, 0, 0, 0],
[ 2, 7, 19, 2, 2, 0, 0, 975, 2, 19],
[ 8, 5, 4, 8, 6, 4, 14, 11, 906, 8],
[ 11, 7, 1, 12, 16, 1, 1, 6, 5, 949]])
In order to get the precision & recall (per class), we need to compute the TP, FP, and FN per class. We don't need TN, but we will compute it too, as it will help us with our sanity check.
The True Positives are simply the diagonal elements:
# numpy should have already been imported as np
TP = np.diag(cm)
TP
# array([ 963, 1119, 972, 975, 953, 818, 938, 975, 906, 949])
The False Positives are the sum of the respective column, minus the diagonal element (i.e. the TP element):
FP = np.sum(cm, axis=0) - TP
FP
# array([50, 28, 39, 56, 37, 11, 66, 42, 54, 49])
Similarly, the False Negatives are the sum of the respective row, minus the diagonal (i.e. TP) element:
FN = np.sum(cm, axis=1) - TP
FN
# array([17, 16, 60, 35, 29, 74, 20, 53, 68, 60])
Now, the True Negatives are a little trickier; let's first think what exactly a True Negative means, with respect to, say class 0: it means all the samples that have been correctly identified as not being 0. So, essentially what we should do is remove the corresponding row & column from the confusion matrix, and then sum up all the remaining elements:
num_classes = 10
TN = []
for i in range(num_classes):
    temp = np.delete(cm, i, 0)    # delete ith row
    temp = np.delete(temp, i, 1)  # delete ith column
    TN.append(sum(sum(temp)))
TN
# [8970, 8837, 8929, 8934, 8981, 9097, 8976, 8930, 8972, 8942]
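Equivalently (a vectorized alternative, not part of the original answer), the TN for each class is everything in the matrix that is not that class's TP, FP, or FN:

TN = cm.sum() - (TP + FP + FN)
# array([8970, 8837, 8929, 8934, 8981, 9097, 8976, 8930, 8972, 8942])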
Let's do a sanity check: for each class, the sum of TP, FP, FN, and TN must equal the size of our test set (here 10,000). Let's confirm that this is indeed the case:
l = 10000
for i in range(num_classes):
    print(TP[i] + FP[i] + FN[i] + TN[i] == l)
The result is
True
True
True
True
True
True
True
True
True
True
Having calculated these quantities, it is now straightforward to get the precision & recall per class:
precision = TP/(TP+FP)
recall = TP/(TP+FN)
which for this example are
precision
# array([ 0.95064166, 0.97558849, 0.96142433, 0.9456838 , 0.96262626,
# 0.986731 , 0.93426295, 0.95870206, 0.94375 , 0.9509018])
recall
# array([ 0.98265306, 0.98590308, 0.94186047, 0.96534653, 0.97046843,
# 0.91704036, 0.97912317, 0.94844358, 0.9301848 , 0.94053518])
Similarly, we can compute related quantities, like specificity (recall that sensitivity is the same thing as recall):
specificity = TN/(TN+FP)
Results for our example:
specificity
# array([0.99445676, 0.99684151, 0.9956512 , 0.99377086, 0.99589709,
# 0.99879227, 0.99270073, 0.99531877, 0.99401728, 0.99455011])
You should now be able to compute these quantities for virtually any size of confusion matrix.
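Since the question also asks for the F-measure, which is not computed above, it follows directly from the per-class precision and recall using the standard definition F1 = 2 * P * R / (P + R):

f1 = 2 * precision * recall / (precision + recall)  # per-class F1 score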
If you have a confusion matrix in the form of:
cmat = [[ 5, 7],
[25, 37]]
The following simple function can be made:
def myscores(smat):
    tp = smat[0][0]
    fp = smat[0][1]
    fn = smat[1][0]
    tn = smat[1][1]
    return tp/(tp+fp), tp/(tp+fn)  # precision, recall
Testing:
print("precision and recall:", myscores(cmat))
Output:
precision and recall: (0.4166666666666667, 0.16666666666666666)
The above function can also be extended to produce other scores; the formulae are listed at https://en.wikipedia.org/wiki/Confusion_matrix
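For instance, here is a sketch of one such extension (same [[tp, fp], [fn, tn]] layout as above) that also returns the F1 score and accuracy:

def myscores_extended(smat):
    tp, fp = smat[0][0], smat[0][1]
    fn, tn = smat[1][0], smat[1][1]
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

print(myscores_extended([[5, 7], [25, 37]]))
# (0.4166666666666667, 0.16666666666666666, 0.23809523809523808, 0.5675675675675675)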
There is a package called 'disarray'.
So, if I have four classes:
import numpy as np
a = np.random.randint(0,4,[100])
b = np.random.randint(0,4,[100])
I can use disarray to calculate 13 metrics:
import pandas as pd
import disarray
from sklearn.metrics import confusion_matrix

# Instantiate the confusion matrix DataFrame with index and columns
cm = confusion_matrix(a, b)
df = pd.DataFrame(cm, index=['a', 'b', 'c', 'd'], columns=['a', 'b', 'c', 'd'])
df.da.export_metrics()
which gives a DataFrame of the computed per-class metrics.
I have one input tensor and an index mapping, and I want to create an output tensor based on that index mapping.
For example:
input = torch.tensor([12, 56, 45, 37], dtype=torch.float)
index_map = torch.tensor([-1, 1, 2, 1, 3, -1])
(-1 in index_map means that position is not mapped to anything)
grad = torch.tensor([4, 2, 5, 3, 6, 7], dtype=torch.float)
I want to copy values from grad to input based on index_map. Values that are mapped to the same index should be averaged. For example, grad[1] and grad[3] are both mapped to index 1, so the final value at index 1 should be (2 + 3) / 2 = 2.5.
The output tensor should be:
tensor([12., 2.5, 5., 6.])
My code is:
input = torch.tensor([12, 56, 45, 37], dtype=torch.float)
input = torch.cat([input, torch.tensor([-1], dtype=torch.float)])  # dummy slot that index -1 writes into
index_map = torch.tensor([-1, 1, 2, 1, 3, -1])
grad_out = torch.tensor([4, 2, 5, 3, 6, 7], dtype=torch.float)
input[index_map] = grad_out
input = input[:4]  # drop the dummy slot
print(input)
The above code copies the latest value written to each index but does not average the values.
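One way to get the averaging behavior (a sketch, assuming PyTorch 1.12+ for scatter_reduce_ with reduce="mean") is to drop the unmapped entries and scatter with a mean reduction:

import torch

input = torch.tensor([12, 56, 45, 37], dtype=torch.float)
index_map = torch.tensor([-1, 1, 2, 1, 3, -1])
grad = torch.tensor([4, 2, 5, 3, 6, 7], dtype=torch.float)

valid = index_map >= 0                 # mask out the unmapped (-1) entries
out = input.clone()
# include_self=False: reduced positions take the mean of the scattered
# sources only; untouched positions keep their original values.
out.scatter_reduce_(0, index_map[valid], grad[valid],
                    reduce="mean", include_self=False)
print(out)  # tensor([12.0000, 2.5000, 5.0000, 6.0000])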
I have a question about Keras function Dropout with the argument of noise_shape.
Question 1:
What is the meaning of "if your inputs have shape (batch_size, timesteps, features) and you want the dropout mask to be the same for all timesteps, you can use noise_shape=(batch_size, 1, features)", and what is the benefit of adding this argument?
Does it mean the number of neurons that are dropped out is the same along the timesteps, i.e. that at every timestep t there would be n neurons dropped?
Question 2:
Do I have to include 'batch_size' in noise_shape when creating models? --> see the following example.
Suppose I have a multivariate time series data in the shape of (10000, 1, 100, 2) --> (number of data, channel, timestep, number of features).
Then I create batches with batch size of 64 --> (64, 1, 100, 2)
If I want to create a CNN model with dropout, I use the Keras functional API:
inp = Input([1, 100, 2])
conv1 = Conv2D(64, kernel_size=(11, 2), strides=(1, 1), data_format='channels_first')(inp)
max1 = MaxPooling2D((2, 1))(conv1)
max1_shape = max1._keras_shape
drop1 = Dropout(0.1, noise_shape=[?, max1._keras_shape[1], 1, 1])
Because the output shape of layer max1 is (None, 64, 50, 1), I cannot assign None to the question mark (which corresponds to batch_size).
How should I cope with this? Should I just use (64, 1, 1) as noise_shape, or should I define a variable called 'batch_size' and then pass it to this argument, like (batch_size, 64, 1, 1)?
Question 1:
It's kind of like a numpy broadcast I think.
Imagine you have 2 batches with 3 timesteps and 4 features (a small example to make it easier to show):
(2, 3, 4)
If you use a noise shape of (2, 1, 4), each batch will have its own
dropout mask that will be applied to all timesteps.
So let's say these are the weights of shape (2, 3, 4):
array([[[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 10, 11, 12, 13]],
[[ 14, 15, 16, 17],
[ 18, 19, 20, 21],
[ 22, 23, 24, 25]]])
And this would be a random dropout mask with noise_shape (2, 1, 4)
(1 means keep and 0 means turn it off):
array([[[ 1, 1, 1, 0]],
[[ 1, 0, 0, 1]]])
So you have these two masks (one for each batch). Then the mask gets broadcast along the timestep axis:
array([[[ 1, 1, 1, 0],
[ 1, 1, 1, 0],
[ 1, 1, 1, 0]],
[[ 1, 0, 0, 1],
[ 1, 0, 0, 1],
[ 1, 0, 0, 1]]])
and applied to the weights:
array([[[ 1, 2, 3, 0],
[ 5, 6, 7, 0],
[ 10, 11, 12, 0]],
[[ 14, 0, 0, 17],
[ 18, 0, 0, 21],
[ 22, 0, 0, 25]]])
Question 2:
I'm not sure about your second question to be honest.
Edit:
What you can do is take the first dimension of the shape of the input,
which should be the batch_size, as proposed in this github issue:
import tensorflow as tf
...
batch_size = tf.shape(inp)[0]
drop1 = Dropout(0.1, noise_shape=[batch_size, max1._keras_shape[1], 1, 1])
As you can see, I'm on the tensorflow backend. I don't know if theano also has these problems, but if it does, you might be able to solve it with the theano shape equivalent.
Below is sample code to see exactly what is happening; the output log is self-explanatory. If you are worried about a dynamic batch_size, simply set the first element of noise_shape to None, i.e. change
dl1 = tk.layers.Dropout(0.2, noise_shape=[_batch_size, 1, _num_features])
to
dl1 = tk.layers.Dropout(0.2, noise_shape=[None, 1, _num_features])
import tensorflow as tf
import tensorflow.keras as tk
import numpy as np

_batch_size = 5
_time_steps = 2
_num_features = 3

input = np.random.random((_batch_size, _time_steps, _num_features))

dl = tk.layers.Dropout(0.2)
dl1 = tk.layers.Dropout(0.2, noise_shape=[_batch_size, 1, _num_features])

out = dl(input, training=True).numpy()
out1 = dl1(input, training=True).numpy()

for i in range(_batch_size):
    print(">>>>>>>>>>>>>>>>>>>>>>>>>>>>", i)
    print("input")
    print(input[i])
    print("out")
    print(out[i])
    print("out1")
    print(out1[i])
The output is:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 0
input
[[0.53853024 0.80089701 0.64374258]
[0.06481775 0.31187039 0.5029061 ]]
out
[[0.6731628 1.0011213 0. ]
[0.08102219 0.38983798 0.6286326 ]]
out1
[[0.6731628 0. 0.8046782 ]
[0.08102219 0. 0.6286326 ]]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1
input
[[0.70746014 0.08990712 0.58195288]
[0.75798534 0.50140453 0.04914242]]
out
[[0.8843252 0.11238389 0. ]
[0.9474817 0.62675565 0. ]]
out1
[[0. 0.11238389 0. ]
[0. 0.62675565 0. ]]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2
input
[[0.85253707 0.55813084 0.70741476]
[0.98812977 0.21565134 0.67909392]]
out
[[1.0656713 0.69766355 0.8842684 ]
[0. 0.26956415 0. ]]
out1
[[1.0656713 0.69766355 0.8842684 ]
[1.2351623 0.26956415 0.84886736]]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3
input
[[0.9837272 0.3504008 0.37425778]
[0.67648931 0.74456052 0.6229444 ]]
out
[[1.2296591 0.438001 0. ]
[0.84561163 0.93070066 0.7786805 ]]
out1
[[0. 0.438001 0.46782222]
[0. 0.93070066 0.7786805 ]]
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 4
input
[[0.45599217 0.80992091 0.04458478]
[0.12214568 0.09821599 0.51525869]]
out
[[0.5699902 1.0124011 0. ]
[0.1526821 0. 0.64407337]]
out1
[[0.5699902 1.0124011 0.05573097]
[0.1526821 0.12276999 0.64407337]]