import ipywidgets as widgets
import matplotlib.pyplot as plt
import numpy as np
from ipywidgets import interact

%matplotlib widget

Medical diagnosis: From patient’s perspective

  • A test for COVID-19 has an accuracy of $90\%$, i.e.,
    $\Pr(\hat{\R{Y}} = \R{Y}) = 0.9$
    • $\R{Y}$: Indicator of infection.
    • $\hat{\R{Y}}$: Diagnosis of infection.
  • Suppose a person is diagnosed to have the virus, i.e., $\hat{\R{Y}} = 1$.
    • Is it likely ($>50\%$ chance) that the person has the virus? Y/N
    • Is the likelihood $90\%$? Y/N
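A quick sanity check for the two questions above, as a minimal sketch: the numbers assume a $1\%$ prevalence and a test whose sensitivity and specificity are both $90\%$ (so its accuracy is $90\%$ for any prevalence). Bayes' rule then gives the chance of infection given a positive diagnosis.

```python
# Why 90% accuracy need not mean a 90% chance of infection after a
# positive test: the answer depends on prevalence. All numbers below
# are assumed for illustration.
prevalence = 0.01
tpr = 0.9  # sensitivity: Pr(diagnosed +ve | infected)
tnr = 0.9  # specificity: Pr(diagnosed -ve | not infected)

# Bayes' rule: Pr(infected | diagnosed +ve)
p_pos = prevalence * tpr + (1 - prevalence) * (1 - tnr)
ppv = prevalence * tpr / p_pos
print(f"PPV = {ppv:.3f}")  # about 0.083, far below 0.9
```

Under these assumptions both answers are "No": a positive diagnosis implies only about an $8\%$ chance of infection.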

Confusion matrix for binary classification

  • TP (True +ve): number of +ve tuples classified as +ve.
  • TN (True -ve): number of -ve tuples classified as -ve.
  • FP (False +ve): number of -ve tuples classified as +ve.
    (F_______ a________ / Type I error)
  • FN (False -ve): number of +ve tuples classified as -ve.
    (M______ d________ / Type II error)
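The four counts can be computed directly from the definitions above. The labels below are toy data assumed purely for illustration, with 1 denoting +ve and 0 denoting -ve.

```python
# Confusion-matrix counts from toy labels (assumed for illustration).
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]

TP = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
TN = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
FP = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # Type I error
FN = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # Type II error
print(TP, TN, FP, FN)  # 2 4 1 1
```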

Accuracy vs Precision

  • Accuracy is $\frac{\op{TP} + \op{TN}}{n}$ where $n = \op{TP} + \op{TN} + \op{FP} + \op{FN}$.
  • Precision is $\frac{\op{TP}}{\hat{P}}$ where $\hat{P} = \op{TP} + \op{FP}$.
  • P_______________ p _______________ v _______________ (PPV)
  • Is it possible that accuracy is high but precision is low?
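One way to probe the last question is with assumed counts in which true negatives dominate: accuracy stays high even though most positive predictions are wrong.

```python
# Assumed counts: 990 true -ves, and 9 of the 10 +ve predictions wrong.
TP, FP, FN, TN = 1, 9, 0, 990
n = TP + TN + FP + FN
accuracy = (TP + TN) / n    # (1 + 990) / 1000 = 0.991
precision = TP / (TP + FP)  # 1 / 10 = 0.1
print(accuracy, precision)
```

So yes: with these counts the accuracy is $99.1\%$ while the precision is only $10\%$.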

Example

  • Accuracy is ____________%.
  • Precision is ____________%.
  • When is accuracy > precision in general?

Negative predictive value (NPV)

  • NPV is $\frac{\op{TN}}{\hat{N}}$ where $\hat{N} = \op{TN} + \op{FN} = n - \hat{P}$.
  • Accuracy is $\frac{\op{TP} + \op{TN}}{n} = \frac{\hat{P} \cdot \op{PPV} + \hat{N} \cdot \op{NPV}}{n} = \frac{\hat{P}}{n} \op{PPV} + \frac{\hat{N}}{n} \op{NPV}$.
  • Accuracy > precision iff $\op{NPV} > \op{PPV}$ (assuming $\hat{N} > 0$).
  • Accuracy = precision iff _________________________________________________________
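The weighted-average identity above can be verified numerically; the counts below are assumed purely for illustration.

```python
# Check that accuracy = (P̂/n)·PPV + (N̂/n)·NPV, i.e., accuracy is a
# weighted average of PPV and NPV with weights P̂/n and N̂/n.
TP, FP, FN, TN = 40, 10, 5, 45
n = TP + FP + FN + TN
P_hat, N_hat = TP + FP, TN + FN
PPV, NPV = TP / P_hat, TN / N_hat
accuracy = (TP + TN) / n
weighted = (P_hat / n) * PPV + (N_hat / n) * NPV
print(accuracy, weighted)  # both 0.85
```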

Example

  • Accuracy is _______________%.
  • Precision is _______________%.
  • NPV is ________________%.

Medical diagnosis: From government’s perspective

  • Suppose the government wants to eradicate COVID-19 as it is highly contagious.
  • If a test is $90\%$ accurate, can the government identify $>50\%$ of infected people? Y/N

Recall

  • Recall is $\frac{\op{TP}}{\op{P}}$ where $\op{P} = \op{TP} + \op{FN}$.
  • S__________ or True positive rate (TPR)
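The government's question above can be answered with assumed counts: when infections are rare, a test can be $90\%$ accurate yet identify only a small fraction of the infected.

```python
# Assumed counts: 10 infected out of 1000, but only 1 is detected.
TP, FN, FP, TN = 1, 9, 91, 899
recall = TP / (TP + FN)                      # 1/10 = 0.1
accuracy = (TP + TN) / (TP + FN + FP + TN)   # 900/1000 = 0.9
print(recall, accuracy)
```

Here the test is $90\%$ accurate yet its recall is only $10\%$, so the answer can be "No".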

Example

  • Accuracy is ____________%.
  • Precision is ____________%.
  • NPV is __________________%.
  • Recall is ___________________________%.
  • When is accuracy > recall?

Specificity

  • Specificity is $\frac{\op{TN}}{N}$ where $N = \op{TN} + \op{FP}$.
    True negative rate (TNR)
  • Accuracy is
    $\frac{\op{TP} + \op{TN}}{n} = \frac{P \cdot \op{TPR} + N \cdot \op{TNR}}{n} = \frac{P}{n} \op{TPR} + \frac{N}{n} \op{TNR}$
  • Accuracy > recall iff $\op{TNR} > \op{TPR}$.
  • Accuracy = recall iff ______________________________________________________
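The second weighted-average identity can be checked the same way; the counts below are assumed for illustration and also exhibit the accuracy > recall condition.

```python
# Check that accuracy = (P/n)·TPR + (N/n)·TNR, and that here
# accuracy exceeds recall because TNR > TPR.
TP, FN, TN, FP = 30, 20, 45, 5
P, N = TP + FN, TN + FP
n = P + N
TPR, TNR = TP / P, TN / N       # recall 0.6, specificity 0.9
accuracy = (TP + TN) / n        # 0.75
weighted = (P / n) * TPR + (N / n) * TNR
print(accuracy, weighted, TPR, TNR)
```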

Example

  • Accuracy is ____________%.
  • Precision is ____________%.
  • NPV is __________________%.
  • Recall is ___________________________%.
  • Specificity is _________________________%.

Class imbalance problem

  • Happens when $P \ll N$ (or $N \ll P$).

  • If $P \ll N$, accuracy can be dominated by ___________ over __________________.

    $\op{Accuracy} = \frac{{\color{grey}{\op{TP}}} + \op{TN}}{n} = {\color{grey}{\frac{P}{n} \cdot \op{TPR}}} + \frac{N}{n} \cdot \op{TNR} = {\color{grey}{\frac{\hat{P}}{n} \cdot \op{PPV}}} + \frac{\hat{N}}{n} \cdot \op{NPV}$
  • How to evaluate the prediction of positive class?

  • Cost/benefit analysis
    • Different per unit cost/benefit assigned to FP, FN, TP, and TN.
    • Minimize total cost or maximize total benefit.
      $\op{Cost} = \op{FP} \cdot \op{Cost}_{\op{FP}} + \op{FN} \cdot \op{Cost}_{\op{FN}} + \op{TP} \cdot \op{Cost}_{\op{TP}} + \op{TN} \cdot \op{Cost}_{\op{TN}}$
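A minimal cost sketch, with per-unit costs assumed for illustration: a missed infection (FN) is penalized far more heavily than a false alarm (FP), reflecting the government's perspective above.

```python
# Total cost under assumed per-unit costs; correct predictions cost 0.
FP, FN, TP, TN = 90, 10, 90, 810
cost_FP, cost_FN, cost_TP, cost_TN = 1.0, 50.0, 0.0, 0.0
total_cost = FP * cost_FP + FN * cost_FN + TP * cost_TP + TN * cost_TN
print(total_cost)  # 90*1 + 10*50 = 590.0
```

A classifier would then be chosen to minimize this total rather than to maximize accuracy.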

F score

$$F_1 := \left( \frac{\op{PPV}^{-1} + \op{TPR}^{-1}}{2} \right)^{-1} = \frac{2 \cdot \op{PPV} \cdot \op{TPR}}{\op{PPV} + \op{TPR}} \quad (\text{F-score/measure})$$
  • Why harmonic mean instead of arithmetic mean?
  • Arithmetic mean $= 0.7$ implies $\op{PPV}, \op{TPR} \geq$ _____
# Code for the above plot
# Create a meshgrid for x and y in the range [0, 1]
x = np.linspace(0, 1, 100)
y = np.linspace(0, 1, 100)
X, Y = np.meshgrid(x, y)
Z = (X + Y) / 2  # Arithmetic mean

# Set up the figure with two subplots: one for 3D and one for contour
fig = plt.figure(figsize=(12, 6), num=1, clear=True)

# 3D subplot
ax1 = fig.add_subplot(121, projection='3d')
ax1.plot_surface(X, Y, Z, cmap='viridis')
ax1.set_title(r'3D Plot of Arithmetic Mean $z=\frac{x+y}{2}$')
ax1.set_xlabel(r'$x$')
ax1.set_ylabel(r'$y$')
ax1.set_zlabel(r'$z$')
ax1.zaxis.set_label_position('lower')

# Contour subplot
ax2 = fig.add_subplot(122)
contour_levels = np.linspace(0, 1, 11)  # Levels: 0, 0.1, ..., 1.0
contour = ax2.contour(X, Y, Z, levels=contour_levels, cmap='viridis')
ax2.set_title('Contour Plot')
ax2.set_xlabel(r'$x$')
ax2.set_ylabel(r'$y$')
fig.colorbar(contour, ax=ax2, shrink=0.5, aspect=5)

# To write to file
fig.savefig('images/arithmetic_mean.svg')

plt.show()
  • Harmonic mean $= 0.7$ implies $\op{PPV}, \op{TPR} \geq$ _____
# Create a meshgrid for x and y in the range [0, 1]
x = np.linspace(0.01, 1, 100)
y = np.linspace(0.01, 1, 100)
X, Y = np.meshgrid(x, y)
Z = ((X**(-1) + Y**(-1)) / 2)**(-1)  # Harmonic mean

# Set up the figure with two subplots: one for 3D and one for contour
fig = plt.figure(figsize=(12, 6), num=2, clear=True)

# 3D subplot
ax1 = fig.add_subplot(121, projection='3d')
ax1.plot_surface(X, Y, Z, cmap='viridis')
ax1.set_title(r'3D Plot of Harmonic Mean $z=\left(\frac{x^{-1}+y^{-1}}{2}\right)^{-1}$')
ax1.set_xlabel(r'$x$')
ax1.set_ylabel(r'$y$')
ax1.set_zlabel(r'$z$')
ax1.zaxis.set_label_position('lower')

# Contour subplot
ax2 = fig.add_subplot(122)
contour_levels = np.linspace(0, 1, 11)  # Levels: 0, 0.1, ..., 1.0
contour = ax2.contour(X, Y, Z, levels=contour_levels, cmap='viridis')
ax2.set_title('Contour Plot')
ax2.set_xlabel(r'$x$')
ax2.set_ylabel(r'$y$')
fig.colorbar(contour, ax=ax2, shrink=0.5, aspect=5)

# To write to file
fig.savefig('images/harmonic_mean.svg')

plt.show()

F-beta score

$$F_{\beta} := \left( \frac{\op{PPV}^{-1} + \beta^2 \op{TPR}^{-1}}{\beta^2 + 1} \right)^{-1} = \frac{(\beta^2 + 1) \cdot \op{PPV} \cdot \op{TPR}}{\beta^2 \cdot \op{PPV} + \op{TPR}} \quad \text{for } \beta > 0$$
  • As $\beta \to \infty$, $F_{\beta} \to$ ____
  • As $\beta \to 0$, $F_{\beta} \to$ ____
def f_beta_score(precision, recall, beta):
    return (1 + beta**2) * (precision * recall) / (beta**2 * precision + recall)

# Create an interactive widget to change beta on a logarithmic scale
beta_slider = widgets.FloatLogSlider(value=2, base=10, min=-2, max=2, step=0.1, 
                                     description=r'$\beta$:', continuous_update=False)

@interact
def plot_f_beta(beta=beta_slider):
    x = np.linspace(0.01, 1, 100)
    y = np.linspace(0.01, 1, 100)
    X, Y = np.meshgrid(x, y)
    Z = f_beta_score(X, Y, beta)
    
    fig = plt.figure(figsize=(12, 6), num=3, clear=True)
    
    ax1 = fig.add_subplot(121, projection='3d')
    surf = ax1.plot_surface(X, Y, Z, cmap='viridis')
    ax1.set_title(r'3D Plot of $F_{\beta}:=\left( \frac{\text{PPV}^{-1} + \beta^2 \text{TPR}^{-1}}{\beta^2 + 1} \right)^{-1}$')
    ax1.set_xlabel('PPV')
    ax1.set_ylabel('TPR')
    ax1.set_zlabel(r'$F_{\beta}$ Score')
    ax1.zaxis.set_label_position('lower')

    ax2 = fig.add_subplot(122)
    contour_levels = np.linspace(0, 1, 11)
    contour = ax2.contour(X, Y, Z, levels=contour_levels, cmap='viridis')
    ax2.set_title('Contour Plot')
    ax2.set_xlabel('PPV')
    ax2.set_ylabel('TPR')
    fig.colorbar(contour, ax=ax2, shrink=0.5, aspect=5)
        
    plt.show()

Threshold-moving

Area under curve (AUC)

  • Obtain the trade-offs of different performance metrics by varying the threshold.
  • Receiver operating characteristic (ROC) curve:
    • Plot of TPR against FPR (false positive rate $= 1 - \op{TNR}$)
    • AUC: ROC area
  • Precision recall curve (PRC):
    • Plot of precision against recall
    • AUC: PRC area
  • Which is better, ROC or PRC?
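Threshold-moving can be sketched directly: sweep a decision threshold over predicted scores, record an (FPR, TPR) point at each threshold, and integrate with the trapezoid rule to estimate the ROC AUC. The scores and labels below are toy data assumed for illustration.

```python
# ROC curve by threshold-moving, then AUC by the trapezoid rule.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

P = sum(y_true)
N = len(y_true) - P
points = [(0.0, 0.0)]  # (FPR, TPR); threshold above all scores
for t in sorted(set(scores), reverse=True):
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t)
    points.append((fp / N, tp / P))  # lowering t moves up/right

auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(points, points[1:]))
print(auc)  # 0.875 for this toy data
```

The PRC is obtained the same way by recording (recall, precision) pairs instead of (FPR, TPR).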

References

  • 8.5.1 Metrics for Evaluating Classifier Performance
  • 8.5.6 Comparing Classifiers based on Cost-Benefits and ROC Curves