Compute precision and accuracy using numpy - python

Suppose two lists true_values = [1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0] and predictions = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0]. How can I compute the accuracy and the precision using numpy?

import numpy as np
true_values = np.array([[1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]])
predictions = np.array([[1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0]])
N = true_values.shape[1]
accuracy = (true_values == predictions).sum() / N
TP = ((predictions == 1) & (true_values == 1)).sum()
FP = ((predictions == 1) & (true_values == 0)).sum()
precision = TP / (TP+FP)
This is the most concise way I came up with (assuming no sklearn); there might be an even shorter one, though!
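As a quick check, this gives the same numbers as the sklearn answer below:
print(accuracy, precision)  # 0.3333333333333333 0.375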

This is what sklearn, which uses numpy under the hood, is for:
from sklearn.metrics import precision_score, accuracy_score
accuracy_score(true_values, predictions), precision_score(true_values, predictions)
Output:
(0.3333333333333333, 0.375)

I'm only going to answer for precision, because I posted a duplicate for accuracy and there should be one question per thread:
sum(map(lambda x, y: x == y == 1, true_values, predictions)) / sum(predictions)
0.375
Use np.sum if you absolutely want to use NumPy.
And here is the accuracy, computed as the mean of the element-wise matches:
np.equal(true_values, predictions).mean()

If you really want to calculate it yourself instead of using a library like in Quang Hoang's answer, just count the number of (true|false) (positives|negatives) and plug the values into your formula:
tp = 0
tn = 0
fp = 0
fn = 0
for t, p in zip(true_values, predictions):
    if t == p:
        if p == 1:
            tp += 1   # predicted 1, actually 1
        else:
            tn += 1   # predicted 0, actually 0
    else:
        if p == 1:
            fp += 1   # predicted 1, actually 0: false positive
        else:
            fn += 1   # predicted 0, actually 1: false negative
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
This Wikipedia article has a nice info box with formulas that use exactly these values.
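Since the question asks for numpy, the same four counts can also be computed without the loop (a sketch, assuming the two sequences are converted to 1-D numpy arrays of 0/1):
import numpy as np
tv = np.asarray(true_values)
pr = np.asarray(predictions)
tp = int(((tv == 1) & (pr == 1)).sum())  # 3 for the arrays in the question
tn = int(((tv == 0) & (pr == 0)).sum())  # 1
fp = int(((tv == 0) & (pr == 1)).sum())  # 5
fn = int(((tv == 1) & (pr == 0)).sum())  # 3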

Related

Optimal combination of linked-buckets

Let's say I have the following (always binary) options:
import numpy as np
a=np.array([1, 1, 0, 0, 1, 1, 1])
b=np.array([1, 1, 0, 0, 1, 0, 1])
c=np.array([1, 0, 0, 1, 0, 0, 0])
d=np.array([1, 0, 1, 1, 0, 0, 0])
And I want to find the optimal combination of the above that gets me to at least the following requirements, with minimal overshoot:
req = np.array([50,50,20,20,100,40,10])
For example:
final = X1*a + X2*b + X3*c + X4*d
Does this map to a known operational research problem? Or does it fall under mathematical programming?
Is this NP-hard, or exactly solvable in a reasonable amount of time? (I've assumed it's combinatorially impossible to solve exactly.)
Are there known solutions to this?
Note: the actual arrays are longer (think ~50 elements), and the number of options is around 20.
My current research has led me to some combination of the assignment problem and knapsack, but I'm not too sure.
It's a covering problem, easily solvable using an integer program solver (I used OR-Tools below). If the X variables can be fractional, substitute NumVar for IntVar. If the X variables are 0--1, substitute BoolVar.
import numpy as np
a = np.array([1, 1, 0, 0, 1, 1, 1])
b = np.array([1, 1, 0, 0, 1, 0, 1])
c = np.array([1, 0, 0, 1, 0, 0, 0])
d = np.array([1, 0, 1, 1, 0, 0, 0])
opt = [a, b, c, d]
req = np.array([50, 50, 20, 20, 100, 40, 10])
from ortools.linear_solver import pywraplp
solver = pywraplp.Solver.CreateSolver("SCIP")
x = [solver.IntVar(0, solver.infinity(), "x{}".format(i)) for i in range(len(opt))]
extra = [solver.NumVar(0, solver.infinity(), "y{}".format(j)) for j in range(len(req))]
for j, (req_j, extra_j) in enumerate(zip(req, extra)):
    solver.Add(extra_j == sum(opt_i[j] * x_i for (opt_i, x_i) in zip(opt, x)) - req_j)
solver.Minimize(sum(extra))
status = solver.Solve()
if status == pywraplp.Solver.OPTIMAL:
    print("Solution:")
    print("Objective value =", solver.Objective().Value())
    for i, x_i in enumerate(x):
        print("x{} = {}".format(i, x_i.solution_value()))
else:
    print("The problem does not have an optimal solution.")
Output:
Solution:
Objective value = 210.0
x0 = 40.0
x1 = 60.0
x2 = -0.0
x3 = 20.0
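As a quick check of the reported solution (using the a, b, c, d and req defined above; the multipliers come straight from the solver output):
X = [40, 60, 0, 20]                         # x0..x3 from the output above
final = X[0]*a + X[1]*b + X[2]*c + X[3]*d
print(final)                                # [120 100  20  20 100  40 100] -> meets req everywhere
print((final - req).sum())                  # 210, the reported objective value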

How can I count the number of matching zero elements between two numpy arrays?

I have two numpy arrays of equal size. They contain the values 1, 0, and -1. I can count the number of matching ones and negative ones, but I'm not sure how to count the matching elements that have the same index and value of zero.
I'm a little confused on how to proceed here.
Here is some code:
print(actual_direction.shape)
print(predicted_direction.shape)
act = actual_direction
pre = predicted_direction
part1 = act[pre == 1]
part2 = part1[part1 == 1]
result1 = part2.sum()
part3 = act[pre == -1]
part4 = part3[part3 == -1]
result2 = part4.sum() * -1
non_zeros = result1 + result2
zeros = len(act) - non_zeros
print(f'zeros : {zeros}\n')
print(f'non_zeros : {non_zeros}\n')
final_result = non_zeros + zeros
print(f'result1 : {result1}\n')
print(f'result2 : {result2}\n')
print(f'final_result : {final_result}\n')
Here is the printout:
(11279,)
(11279,)
zeros : 5745.0
non_zeros : 5534.0
result1 : 2217.0
result2 : 3317.0
final_result : 11279.0
So what I've done here is simply subtract the summation of the ones and negative ones from the total length of the array. I can't assume that the difference (zeros: 5745) contains ALL matching elements that contain zeros, can I?
You could try this:
import numpy as np
a=np.array([1,0,0,1,-1,-1,0,0])
b=np.array([1,0,0,1,-1,-1,0,1])
summ = np.sum((a==0) & (b==0))
print(summ)
Output:
3
You can use numpy.ravel() to flatten out the array, then use zip() to compare each element side by side:
import numpy as np
ar1 = np.array([[1, 0, 0],
[0, 1, 1],
[0, 1, 0]])
ar2 = np.array([[0, 0, 0],
[1, 0, 1],
[0, 1, 0]])
count = 0
for e1, e2 in zip(ar1.ravel(), ar2.ravel()):
    if e1 == e2:
        count += 1
print(count)
Output:
6
You can also do this to list all the matches found, as well as print out the amount:
dup = [e1 for e1, e2 in zip(ar1.ravel(), ar2.ravel()) if e1 == e2]
print(dup)
print(len(dup))
Output:
[0, 0, 1, 0, 1, 0]
6
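The same counts can be had without the explicit loop, using element-wise numpy comparisons on the arrays above (a brief sketch):
matches = ar1 == ar2     # boolean array, True where elements at the same position agree
print(matches.sum())     # 6
print(ar1[matches])      # [0 0 1 0 1 0]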
You have two arrays and want to count the positions where both of these are 0, right?
You can check where an array meets your required condition (a == 0), and then use the element-wise 'and' operator & to check where both arrays meet your requirement:
import numpy as np
a = np.array([1, 0, -1, 0, -1, 1, 1, 1, 1])
b = np.array([1, 0, -1, 1, 0, -1, 1, 0, 1])
both_zero = (a == 0) & (b == 0) # [False, True, False, False, False, False, False, False, False]
both_zero.sum() # 1
In your updated question you appear to be interested in the similarities and differences between actual values and predictions. For this, a confusion matrix is ideally suited.
from sklearn.metrics import confusion_matrix
confusion_matrix(a, b, labels=[-1, 0, 1])
will give you a confusion matrix as output telling you how many -1s were predicted as -1, 0 and 1, and the same for 0 and +1:
[[1 1 0] # -1s predicted as -1, 0 and 1
[0 1 1] # 0s predicted as -1, 0 and 1
[1 1 3]] # 1s predicted as -1, 0 and 1
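Tying that back to the original question: the count of positions where both arrays are 0 is just the middle diagonal entry of that matrix (a small sketch using the same labels):
cm = confusion_matrix(a, b, labels=[-1, 0, 1])
cm[1, 1]  # actual 0 and predicted 0 -> 1, same as both_zero.sum() above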

Discrepancies between R optim vs Scipy optimize: Nelder-Mead

I wrote a script that I believe should produce the same results in Python and R, but they are producing very different answers. Each attempts to fit a model to simulated data by minimizing deviance using Nelder-Mead. Overall, optim in R is performing much better. Am I doing something wrong? Are the algorithms implemented in R and SciPy different?
Python result:
>>> res = minimize(choiceProbDev, sparams, (stim, dflt, dat, N), method='Nelder-Mead')
final_simplex: (array([[-0.21483287, -1. , -0.4645897 , -4.65108495],
[-0.21483909, -1. , -0.4645915 , -4.65114839],
[-0.21485426, -1. , -0.46457789, -4.65107337],
[-0.21483727, -1. , -0.46459331, -4.65115965],
[-0.21484398, -1. , -0.46457725, -4.65099805]]), array([107.46037865, 107.46037868, 107.4603787 , 107.46037875,
107.46037875]))
fun: 107.4603786452194
message: 'Optimization terminated successfully.'
nfev: 349
nit: 197
status: 0
success: True
x: array([-0.21483287, -1. , -0.4645897 , -4.65108495])
R result:
> res <- optim(sparams, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N,
method="Nelder-Mead")
$par
[1] 0.2641022 1.0000000 0.2086496 3.6688737
$value
[1] 110.4249
$counts
function gradient
329 NA
$convergence
[1] 0
$message
NULL
I've checked over my code and as far as I can tell this appears to be due to some difference between optim and minimize because the function I'm trying to minimize (i.e., choiceProbDev) operates the same in each (besides the output, I've also checked the equivalence of each step within the function). See for example:
Python choiceProbDev:
>>> choiceProbDev(np.array([0.5, 0.5, 0.5, 3]), stim, dflt, dat, N)
143.31438613033876
R choiceProbDev:
> choiceProbDev(c(0.5, 0.5, 0.5, 3), stim, dflt, dat, N)
[1] 143.3144
I've also tried to play around with the tolerance levels for each optimization function, but I'm not entirely sure how the tolerance arguments match up between the two. Either way, my fiddling so far hasn't brought the two into agreement. Here is the entire code for each.
Python:
# load modules
import math
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom
# initialize values
dflt = 0.5
N = 1
# set the known parameter values for generating data
b = 0.1
w1 = 0.75
w2 = 0.25
t = 7
theta = [b, w1, w2, t]
# generate stimuli
stim = np.array(np.meshgrid(np.arange(0, 1.1, 0.1),
np.arange(0, 1.1, 0.1))).T.reshape(-1,2)
# starting values
sparams = [-0.5, -0.5, -0.5, 4]
# generate probability of accepting proposal
def choiceProb(stim, dflt, theta):
    utilProp = theta[0] + theta[1]*stim[:,0] + theta[2]*stim[:,1]  # proposal utility
    utilDflt = theta[1]*dflt + theta[2]*dflt  # default utility
    choiceProb = 1/(1 + np.exp(-1*theta[3]*(utilProp - utilDflt)))  # probability of choosing proposal
    return choiceProb
# calculate deviance
def choiceProbDev(theta, stim, dflt, dat, N):
    # restrict b, w1, w2 weights to between -1 and 1
    if any([x > 1 or x < -1 for x in theta[:-1]]):
        return 10000
    # initialize
    nDat = dat.shape[0]
    dev = np.array([np.nan]*nDat)
    # for each trial, calculate deviance
    p = choiceProb(stim, dflt, theta)
    lk = binom.pmf(dat, N, p)
    for i in range(nDat):
        if math.isclose(lk[i], 0):
            dev[i] = 10000
        else:
            dev[i] = -2*np.log(lk[i])
    return np.sum(dev)
# simulate data
probs = choiceProb(stim, dflt, theta)
# randomly generated data based on the calculated probabilities
# dat = np.random.binomial(1, probs, probs.shape[0])
dat = np.array([0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1,
0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1,
0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1,
0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
# fit model
res = minimize(choiceProbDev, sparams, (stim, dflt, dat, N), method='Nelder-Mead')
R:
library(tidyverse)
# initialize values
dflt <- 0.5
N <- 1
# set the known parameter values for generating data
b <- 0.1
w1 <- 0.75
w2 <- 0.25
t <- 7
theta <- c(b, w1, w2, t)
# generate stimuli
stim <- expand.grid(seq(0, 1, 0.1),
seq(0, 1, 0.1)) %>%
dplyr::arrange(Var1, Var2)
# starting values
sparams <- c(-0.5, -0.5, -0.5, 4)
# generate probability of accepting proposal
choiceProb <- function(stim, dflt, theta){
  utilProp <- theta[1] + theta[2]*stim[,1] + theta[3]*stim[,2] # proposal utility
  utilDflt <- theta[2]*dflt + theta[3]*dflt # default utility
  choiceProb <- 1/(1 + exp(-1*theta[4]*(utilProp - utilDflt))) # probability of choosing proposal
  return(choiceProb)
}
# calculate deviance
choiceProbDev <- function(theta, stim, dflt, dat, N){
  # restrict b, w1, w2 weights to between -1 and 1
  if (any(theta[1:3] > 1 | theta[1:3] < -1)){
    return(10000)
  }
  # initialize
  nDat <- length(dat)
  dev <- rep(NA, nDat)
  # for each trial, calculate deviance
  p <- choiceProb(stim, dflt, theta)
  lk <- dbinom(dat, N, p)
  for (i in 1:nDat){
    if (dplyr::near(lk[i], 0)){
      dev[i] <- 10000
    } else {
      dev[i] <- -2*log(lk[i])
    }
  }
  return(sum(dev))
}
# simulate data
probs <- choiceProb(stim, dflt, theta)
# same data as in python script
dat <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1,
0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1,
0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1,
0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
# fit model
res <- optim(sparams, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N,
method="Nelder-Mead")
UPDATE:
After printing the estimates at each iteration, it now appears to me that the discrepancy might stem from differences in 'step sizes' that each algorithm takes. Scipy appears to take smaller steps than optim (and in a different initial direction). I haven't figured out how to adjust this.
Python:
>>> res = minimize(choiceProbDev, sparams, (stim, dflt, dat, N), method='Nelder-Mead')
[-0.5 -0.5 -0.5 4. ]
[-0.525 -0.5 -0.5 4. ]
[-0.5 -0.525 -0.5 4. ]
[-0.5 -0.5 -0.525 4. ]
[-0.5 -0.5 -0.5 4.2]
[-0.5125 -0.5125 -0.5125 3.8 ]
...
R:
> res <- optim(sparams, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N, method="Nelder-Mead")
[1] -0.5 -0.5 -0.5 4.0
[1] -0.1 -0.5 -0.5 4.0
[1] -0.5 -0.1 -0.5 4.0
[1] -0.5 -0.5 -0.1 4.0
[1] -0.5 -0.5 -0.5 4.4
[1] -0.3 -0.3 -0.3 3.6
...
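Regarding the step sizes mentioned in the update: scipy's Nelder-Mead accepts an initial_simplex option, so one thing to try (a sketch, not verified to close the gap) is to hand it a starting simplex with the same 0.4 offsets that the R trace shows:
x0 = np.array(sparams, dtype=float)
simplex = np.vstack([x0] + [x0 + 0.4*np.eye(4)[i] for i in range(4)])  # mimic optim's larger initial steps
res = minimize(choiceProbDev, x0, (stim, dflt, dat, N), method='Nelder-Mead',
               options={'initial_simplex': simplex})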
This isn't exactly an answer of "what are the optimizer differences", but I want to contribute some exploration of the optimization problem here. A few take-home points:
the surface is smooth, so derivative-based optimizers might work better (even without an explicitly coded gradient function, i.e. falling back on finite difference approximation - they'd be even better with a gradient function)
this surface is symmetric, so it has multiple optima (apparently two), but it's not highly multimodal or rough, so I don't think a stochastic global optimizer would be worth the trouble
for optimization problems that aren't too high-dimensional or expensive to compute, it's feasible to visualize the global surface to understand what's going on.
for optimization with bounds, it's generally better either to use an optimizer that explicitly handles bounds, or to change the scale of parameters to an unconstrained scale
Here's a picture of the whole surface:
The red contours are the contours of log-likelihood equal to (110, 115, 120) (the best fit I could get was LL=105.7). The best points are in the second column, third row (achieved by L-BFGS-B) and fifth column, fourth row (true parameter values). (I haven't inspected the objective function to see where the symmetries come from, but I think it would probably be clear.) Python's Nelder-Mead and R's Nelder-Mead do approximately equally badly.
parameters and problem setup
## initialize values
dflt <- 0.5; N <- 1
# set the known parameter values for generating data
b <- 0.1; w1 <- 0.75; w2 <- 0.25; t <- 7
theta <- c(b, w1, w2, t)
# generate stimuli
stim <- expand.grid(seq(0, 1, 0.1), seq(0, 1, 0.1))
# starting values
sparams <- c(-0.5, -0.5, -0.5, 4)
# same data as in python script
dat <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1,
0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1,
0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1,
0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
objective functions
Note the use of built-in functions (plogis(), dbinom(..., log=TRUE)) where possible.
# generate probability of accepting proposal
choiceProb <- function(stim, dflt, theta){
  utilProp <- theta[1] + theta[2]*stim[,1] + theta[3]*stim[,2] # proposal utility
  utilDflt <- theta[2]*dflt + theta[3]*dflt # default utility
  choiceProb <- plogis(theta[4]*(utilProp - utilDflt)) # probability of choosing proposal
  return(choiceProb)
}
# calculate deviance
choiceProbDev <- function(theta, stim, dflt, dat, N){
  # restrict b, w1, w2 weights to between -1 and 1
  if (any(theta[1:3] > 1 | theta[1:3] < -1)){
    return(10000)
  }
  ## for each trial, calculate deviance
  p <- choiceProb(stim, dflt, theta)
  lk <- dbinom(dat, N, p, log=TRUE)
  return(sum(-2*lk))
}
# simulate data
probs <- choiceProb(stim, dflt, theta)
model fitting
# fit model
res <- optim(sparams, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N,
method="Nelder-Mead")
## try derivative-based, box-constrained optimizer
res3 <- optim(sparams, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N,
lower=c(-1,-1,-1,-Inf), upper=c(1,1,1,Inf),
method="L-BFGS-B")
py_coefs <- c(-0.21483287, -0.4645897 , -1, -4.65108495) ## transposed?
true_coefs <- c(0.1, 0.25, 0.75, 7) ## transposed?
## start from python coeffs
res2 <- optim(py_coefs, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N,
method="Nelder-Mead")
explore log-likelihood surface
cc <- expand.grid(seq(-1,1,length.out=51),
seq(-1,1,length.out=6),
seq(-1,1,length.out=6),
seq(-8,8,length.out=51))
## utility function for combining parameter values
bfun <- function(x, grid_vars=c("Var2","Var3"), grid_rng=seq(-1,1,length.out=6),
                 type=NULL) {
  if (is.list(x)) {
    v <- c(x$par, x$value)
  } else if (length(x)==4) {
    v <- c(x, NA)
  }
  res <- as.data.frame(rbind(setNames(v, c(paste0("Var",1:4), "z"))))
  for (v in grid_vars)
    res[,v] <- grid_rng[which.min(abs(grid_rng - res[,v]))]
  if (!is.null(type)) res$type <- type
  res
}
resdat <- rbind(bfun(res3,type="R_LBFGSB"),
bfun(res,type="R_NM"),
bfun(py_coefs,type="Py_NM"),
bfun(true_coefs,type="true"))
cc$z <- apply(cc,1,function(x) choiceProbDev(unlist(x), dat=dat, stim=stim, dflt=dflt, N=N))
library(ggplot2)
library(viridisLite)
ggplot(cc,aes(Var1,Var4,fill=z))+
geom_tile()+
facet_grid(Var2~Var3,labeller=label_both)+
scale_fill_viridis_c()+
scale_x_continuous(expand=c(0,0))+
scale_y_continuous(expand=c(0,0))+
theme(panel.spacing=grid::unit(0,"lines"))+
geom_contour(aes(z=z),colour="red",breaks=seq(105,120,by=5),alpha=0.5)+
geom_point(data=resdat,aes(colour=type,shape=type))+
scale_colour_brewer(palette="Set1")
ggsave("liksurf.png",width=8,height=8)
'Nelder-Mead' has always been a problematic optimization method, and its coding in optim is not up-to-date. We will try three other implementations available in R packages.
To avoid passing the other parameters each time, let's define the function fn as
fn <- function(theta)
  choiceProbDev(theta, stim=stim, dflt=dflt, dat=dat, N=N)
Then the solvers dfoptim::nmk(), adagio::neldermead(), and pracma::anms() will all return the same minimum value xmin = 105.7843, but at different positions, for instance
dfoptim::nmk(sparams, fn)
## $par
## [1] 0.1274937 0.6671353 0.1919542 8.1731618
## $value
## [1] 105.7843
These are real local minima while, for example, the Python solution 107.46038 at c(-0.21483287,-1.0,-0.4645897,-4.65108495) is not. Your problem data are obviously not sufficient for fitting the model.
You might try a global optimizer to possibly find a global optimum within certain bounds. To me it looks like all local minima have the same minimum value.
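On the Python side, a sketch of that suggestion using scipy's differential evolution; the bounds on the fourth parameter are an assumption, since the original problem leaves t unbounded:
from scipy.optimize import differential_evolution
bounds = [(-1, 1), (-1, 1), (-1, 1), (-10, 10)]  # last pair is an assumed bound on t
res_global = differential_evolution(choiceProbDev, bounds,
                                    args=(stim, dflt, dat, N), seed=0)
print(res_global.x, res_global.fun)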

True Positive Rate and False Positive Rate (TPR, FPR) for Multi-Class Data in python [duplicate]

This question already has answers here:
How to get precision, recall and f-measure from confusion matrix in Python [duplicate]
(3 answers)
calculate precision and recall in a confusion matrix
(6 answers)
Closed 2 years ago.
How do you compute the true- and false-positive rates of a multi-class classification problem? Say,
y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
y_prediction = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]
The confusion matrix is computed by metrics.confusion_matrix(y_true, y_prediction), but that just shifts the problem.
EDIT after #seralouk's answer. Here, the class -1 is to be considered as the negatives, while 0 and 1 are variations of positives.
Using your data, you can get all the metrics for all the classes at once:
import numpy as np
from sklearn.metrics import confusion_matrix
y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
y_prediction = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]
cnf_matrix = confusion_matrix(y_true, y_prediction)
print(cnf_matrix)
#[[1 1 3]
# [3 2 2]
# [1 3 1]]
FP = cnf_matrix.sum(axis=0) - np.diag(cnf_matrix)
FN = cnf_matrix.sum(axis=1) - np.diag(cnf_matrix)
TP = np.diag(cnf_matrix)
TN = cnf_matrix.sum() - (FP + FN + TP)
FP = FP.astype(float)
FN = FN.astype(float)
TP = TP.astype(float)
TN = TN.astype(float)
# Sensitivity, hit rate, recall, or true positive rate
TPR = TP/(TP+FN)
# Specificity or true negative rate
TNR = TN/(TN+FP)
# Precision or positive predictive value
PPV = TP/(TP+FP)
# Negative predictive value
NPV = TN/(TN+FN)
# Fall out or false positive rate
FPR = FP/(FP+TN)
# False negative rate
FNR = FN/(TP+FN)
# False discovery rate
FDR = FP/(TP+FP)
# Overall accuracy
ACC = (TP+TN)/(TP+FP+FN+TN)
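Printed per class (the order follows the sorted labels [-1, 0, 1]; these numbers also appear in the PyCM output below):
for label, tpr, fpr in zip([-1, 0, 1], TPR, FPR):
    print(f"class {label}: TPR={tpr:.3f}, FPR={fpr:.3f}")
# class -1: TPR=0.200, FPR=0.333
# class 0: TPR=0.286, FPR=0.400
# class 1: TPR=0.200, FPR=0.417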
For a general case where we have a lot of classes, these metrics are represented graphically in the following image:
Another simple way is PyCM (by me), which supports multi-class confusion matrix analysis.
Applied to your problem:
>>> from pycm import ConfusionMatrix
>>> y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
>>> y_prediction = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]
>>> cm = ConfusionMatrix(actual_vector=y_true,predict_vector=y_prediction)
>>> print(cm)
Predict -1 0 1
Actual
-1 1 1 3
0 3 2 2
1 1 3 1
Overall Statistics :
95% CI (0.03365,0.43694)
Bennett_S -0.14706
Chi-Squared None
Chi-Squared DF 4
Conditional Entropy None
Cramer_V None
Cross Entropy 1.57986
Gwet_AC1 -0.1436
Joint Entropy None
KL Divergence 0.01421
Kappa -0.15104
Kappa 95% CI (-0.45456,0.15247)
Kappa No Prevalence -0.52941
Kappa Standard Error 0.15485
Kappa Unbiased -0.15405
Lambda A 0.2
Lambda B 0.27273
Mutual Information None
Overall_ACC 0.23529
Overall_RACC 0.33564
Overall_RACCU 0.33737
PPV_Macro 0.23333
PPV_Micro 0.23529
Phi-Squared None
Reference Entropy 1.56565
Response Entropy 1.57986
Scott_PI -0.15405
Standard Error 0.10288
Strength_Of_Agreement(Altman) Poor
Strength_Of_Agreement(Cicchetti) Poor
Strength_Of_Agreement(Fleiss) Poor
Strength_Of_Agreement(Landis and Koch) Poor
TPR_Macro 0.22857
TPR_Micro 0.23529
Class Statistics :
Classes -1 0 1
ACC(Accuracy) 0.52941 0.47059 0.47059
BM(Informedness or bookmaker informedness) -0.13333 -0.11429 -0.21667
DOR(Diagnostic odds ratio) 0.5 0.6 0.35
ERR(Error rate) 0.47059 0.52941 0.52941
F0.5(F0.5 score) 0.2 0.32258 0.17241
F1(F1 score - harmonic mean of precision and sensitivity) 0.2 0.30769 0.18182
F2(F2 score) 0.2 0.29412 0.19231
FDR(False discovery rate) 0.8 0.66667 0.83333
FN(False negative/miss/type 2 error) 4 5 4
FNR(Miss rate or false negative rate) 0.8 0.71429 0.8
FOR(False omission rate) 0.33333 0.45455 0.36364
FP(False positive/type 1 error/false alarm) 4 4 5
FPR(Fall-out or false positive rate) 0.33333 0.4 0.41667
G(G-measure geometric mean of precision and sensitivity) 0.2 0.30861 0.18257
LR+(Positive likelihood ratio) 0.6 0.71429 0.48
LR-(Negative likelihood ratio) 1.2 1.19048 1.37143
MCC(Matthews correlation coefficient) -0.13333 -0.1177 -0.20658
MK(Markedness) -0.13333 -0.12121 -0.19697
N(Condition negative) 12 10 12
NPV(Negative predictive value) 0.66667 0.54545 0.63636
P(Condition positive) 5 7 5
POP(Population) 17 17 17
PPV(Precision or positive predictive value) 0.2 0.33333 0.16667
PRE(Prevalence) 0.29412 0.41176 0.29412
RACC(Random accuracy) 0.08651 0.14533 0.10381
RACCU(Random accuracy unbiased) 0.08651 0.14619 0.10467
TN(True negative/correct rejection) 8 6 7
TNR(Specificity or true negative rate) 0.66667 0.6 0.58333
TON(Test outcome negative) 12 11 11
TOP(Test outcome positive) 5 6 6
TP(True positive/hit) 1 2 1
TPR(Sensitivity, recall, hit rate, or true positive rate) 0.2 0.28571 0.2
Since there are several ways to solve this, and none is really generic (see https://stats.stackexchange.com/questions/202336/true-positive-false-negative-true-negative-false-positive-definitions-for-mul?noredirect=1&lq=1 and
https://stats.stackexchange.com/questions/51296/how-do-you-calculate-precision-and-recall-for-multiclass-classification-using-co#51301), here is the solution that seems to be used in the paper which I was unclear about:
to count confusion between two foreground pages as false positive
So the solution is to import numpy as np, use y_true and y_prediction as np.array, then:
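Spelling that setup out (the arrays are just the ones from the question, converted as the sentence above says):
import numpy as np
y_true = np.array([1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0])
y_prediction = np.array([-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1])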
FP = np.logical_and(y_true != y_prediction, y_prediction != -1).sum() # 9
FN = np.logical_and(y_true != y_prediction, y_prediction == -1).sum() # 4
TP = np.logical_and(y_true == y_prediction, y_true != -1).sum() # 3
TN = np.logical_and(y_true == y_prediction, y_true == -1).sum() # 1
TPR = 1. * TP / (TP + FN) # 0.42857142857142855
FPR = 1. * FP / (FP + TN) # 0.9

Solving linear regression equation for a sparse matrix

I'd like to solve a multivariate linear regression equation for a vector X with m elements, given n observations Y, assuming the measurements have Gaussian random errors. How can I solve this problem using Python? My problem looks like this:
A simple example of W, when m=5, is given as follows:
P.S. I would like to account for the effect of the errors; specifically, I want to estimate the standard deviation of the errors.
You can do it like this:
import numpy as np
from numpy.linalg import pinv

def myreg(W, Y):
    m, n = Y.shape                        # m observations, n response columns
    k = W.shape[1]                        # number of coefficients
    X = pinv(W.T.dot(W)).dot(W.T).dot(Y)  # least-squares estimate via the normal equations
    Y_hat = W.dot(X)
    Residuals = Y_hat - Y
    MSE = np.square(Residuals).sum(axis=0) / (m - 2)
    X_var = (MSE[:, None] * pinv(W.T.dot(W)).flatten()).reshape(n, k, k)
    Tstat = X / np.sqrt(X_var.diagonal(axis1=1, axis2=2)).T
    return X, Tstat
demo
W = np.array([
[ 1, -1, 0, 0, 1],
[ 0, -1, 1, 0, 0],
[ 1, 0, 0, 1, 0],
[ 0, 1, -1, 0, 1],
[ 1, -1, 0, 0, -1],
])
Y = np.array([2, 4, 1, 5, 3])[:, None]
X, V = myreg(W, Y)
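The P.S. asks about the standard deviation of the errors. A minimal sketch, consistent with the (m - 2) denominator used inside myreg above:
resid = Y - W.dot(X)
sigma_hat = np.sqrt(np.square(resid).sum(axis=0) / (W.shape[0] - 2))
print(sigma_hat)  # estimated standard deviation of the errors, one value per column of Y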
