Evaluation of binary classifiers Binary classification
The left and right halves respectively contain instances that in fact have, and do not have, the condition. The oval contains instances that are classified (predicted) as positive (having the condition). Green and red respectively contain instances that are correctly (true) and wrongly (false) classified.
TP = true positive; TN = true negative; FP = false positive (type I error); FN = false negative (type II error); TPR = true positive rate; FPR = false positive rate; PPV = positive predictive value; NPV = negative predictive value.
There are many metrics that can be used to measure the performance of a classifier or predictor; different fields prefer different metrics because they have different goals. For example, in medicine sensitivity and specificity are used, while in information retrieval precision and recall are preferred. An important distinction is between metrics that are independent of the prevalence (how often each category occurs in the population) and metrics that depend on the prevalence. Both types are useful, but they have different properties.
Given a classification of a specific data set, there are four basic data: the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). These can be arranged into a 2×2 contingency table, with columns corresponding to the actual value (condition positive, CP, or condition negative, CN) and rows corresponding to the classification value (test outcome positive or test outcome negative). There are eight basic ratios that one can compute from this table, which come in four complementary pairs (each pair summing to 1). These are obtained by dividing each of the four numbers by the sum of its row or column, yielding eight numbers, which can be referred to generically in the form "true positive row ratio" or "false negative column ratio", though there are conventional terms. There are thus two pairs of column ratios and two pairs of row ratios, and one can summarize these with four numbers by choosing one ratio from each pair; the other four numbers are the complements.
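The construction above can be sketched in a few lines of Python. This is only an illustration: the function names `contingency_table` and `row_and_column_ratios` are hypothetical, labels are assumed to be booleans (True meaning "has the condition" or "test positive"), and every row and column sum is assumed to be nonzero.

```python
from typing import Dict, Sequence


def contingency_table(actual: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, int]:
    """Count the four cells of the 2x2 table (hypothetical helper)."""
    table = {"tp": 0, "fn": 0, "fp": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            table["tp"] += 1      # condition positive, test positive
        elif a and not p:
            table["fn"] += 1      # condition positive, test negative
        elif not a and p:
            table["fp"] += 1      # condition negative, test positive
        else:
            table["tn"] += 1      # condition negative, test negative
    return table


def row_and_column_ratios(table: Dict[str, int]) -> Dict[str, Dict[str, float]]:
    """Divide each cell by the sum of its row and by the sum of its column.

    Rows are test outcomes (positive: tp, fp; negative: fn, tn).
    Columns are actual values (positive: tp, fn; negative: fp, tn).
    Assumes no row or column sum is zero.
    """
    tp, fn, fp, tn = table["tp"], table["fn"], table["fp"], table["tn"]
    row_sum = {"tp": tp + fp, "fp": tp + fp, "fn": fn + tn, "tn": fn + tn}
    col_sum = {"tp": tp + fn, "fn": tp + fn, "fp": fp + tn, "tn": fp + tn}
    return {
        cell: {
            "row": table[cell] / row_sum[cell],
            "column": table[cell] / col_sum[cell],
        }
        for cell in ("tp", "fn", "fp", "tn")
    }


actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
table = contingency_table(actual, predicted)
ratios = row_and_column_ratios(table)
print(table)
print(ratios["tp"]["column"] + ratios["fn"]["column"])  # a complementary pair
```

Each cell yields one row ratio and one column ratio, giving the eight numbers; cells that share a row (or a column) form the complementary pairs.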
The column ratios are the true positive rate (TPR, aka sensitivity or recall), with complement the false negative rate (FNR); and the true negative rate (TNR, aka specificity, SPC), with complement the false positive rate (FPR). These are the proportion of the population with the condition (resp., without the condition) for which the test is correct (or, complementarily, for which the test is incorrect); these are independent of prevalence.
The row ratios are the positive predictive value (PPV, aka precision), with complement the false discovery rate (FDR); and the negative predictive value (NPV), with complement the false omission rate (FOR). These are the proportion of the population with a given test result for which the test is correct (or, complementarily, for which the test is incorrect); these depend on prevalence.
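The difference between the prevalence-independent column ratios and the prevalence-dependent row ratios can be illustrated numerically. The sketch below holds TPR and TNR fixed and recomputes PPV and NPV at several prevalences; the function name `predictive_values` and the example values (0.9 for both rates) are assumptions chosen for illustration.

```python
from typing import Tuple


def predictive_values(tpr: float, tnr: float, prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for fixed TPR and TNR at a given prevalence.

    Works with population fractions rather than raw counts.
    Assumes 0 < prevalence < 1 and that both test outcomes occur.
    """
    pos = prevalence            # fraction with the condition
    neg = 1.0 - prevalence      # fraction without the condition
    true_pos = tpr * pos
    false_neg = (1.0 - tpr) * pos
    true_neg = tnr * neg
    false_pos = (1.0 - tnr) * neg
    ppv = true_pos / (true_pos + false_pos)   # row ratio: test positive
    npv = true_neg / (true_neg + false_neg)   # row ratio: test negative
    return ppv, npv


for prevalence in (0.5, 0.1, 0.01):
    ppv, npv = predictive_values(0.9, 0.9, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these example values, PPV falls from 0.9 at a prevalence of 0.5 to roughly 0.08 at a prevalence of 0.01, even though TPR and TNR are unchanged.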
In diagnostic testing, the main ratios used are the true column ratios (true positive rate and true negative rate), where they are known as sensitivity and specificity. In information retrieval, the main ratios are the true positive ratios (row and column), that is, positive predictive value and true positive rate, where they are known as precision and recall.
One can take ratios of a complementary pair of ratios, yielding four likelihood ratios (two column ratios of ratios, two row ratios of ratios). This is primarily done for the column (condition) ratios, yielding the likelihood ratios in diagnostic testing. Taking the ratio of one of these groups of ratios yields a final ratio, the diagnostic odds ratio (DOR). This can also be defined directly as (TP×TN)/(FP×FN) = (TP/FN)/(FP/TN); it has a useful interpretation as an odds ratio, and it is prevalence-independent.
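A short sketch can check that the two routes to the DOR agree. The names "positive likelihood ratio" (TPR/FPR) and "negative likelihood ratio" (FNR/TNR) are the conventional terms for the two column likelihood ratios and do not appear in the text above; the function name and the example counts are made up for illustration, and all four counts are assumed to be nonzero.

```python
from typing import Dict


def likelihood_ratios(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Column likelihood ratios and the diagnostic odds ratio (hypothetical helper).

    Assumes all four counts are nonzero.
    """
    tpr = tp / (tp + fn)
    fnr = fn / (tp + fn)
    tnr = tn / (fp + tn)
    fpr = fp / (fp + tn)
    lr_pos = tpr / fpr            # positive likelihood ratio
    lr_neg = fnr / tnr            # negative likelihood ratio
    return {
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor_from_ratios": lr_pos / lr_neg,
        "dor_direct": (tp * tn) / (fp * fn),
    }


# Example counts, invented for illustration only.
result = likelihood_ratios(tp=90, fn=10, fp=20, tn=180)
print(result)
```

For these counts both the ratio of likelihood ratios and the direct formula give a DOR of 81, up to floating-point rounding.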
There are a number of other metrics, most simply the accuracy or fraction correct (FC), which measures the fraction of all instances that are correctly categorized; the complement is the fraction incorrect (FiC). The F-score combines precision and recall into one number via a choice of weighting, most simply equal weighting, as the balanced F-score (F1 score). Some metrics come from regression coefficients: the markedness and the informedness, and their geometric mean, the Matthews correlation coefficient. Other metrics include Youden's J statistic, the uncertainty coefficient, the phi coefficient, and Cohen's kappa.
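Several of these summary metrics can be computed directly from the four counts. The sketch below uses the standard formulas for accuracy, F1, informedness (TPR + TNR − 1), markedness (PPV + NPV − 1), and the Matthews correlation coefficient; the function name `summary_metrics` and the example counts are assumptions, and every row and column sum is assumed to be nonzero.

```python
import math
from typing import Dict


def summary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Single-number summaries of a 2x2 table (hypothetical helper).

    Assumes every row and column sum is nonzero.
    """
    total = tp + fn + fp + tn
    tpr = tp / (tp + fn)          # recall / sensitivity
    tnr = tn / (fp + tn)          # specificity
    ppv = tp / (tp + fp)          # precision
    npv = tn / (fn + tn)
    accuracy = (tp + tn) / total
    f1 = 2 * ppv * tpr / (ppv + tpr)      # equal weighting of precision and recall
    informedness = tpr + tnr - 1
    markedness = ppv + npv - 1
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "fraction_incorrect": 1 - accuracy,
        "f1": f1,
        "informedness": informedness,
        "markedness": markedness,
        "mcc": mcc,
    }


# Example counts, invented for illustration only.
m = summary_metrics(tp=90, fn=10, fp=20, tn=180)
print(m)
# When informedness and markedness are both positive, their geometric mean equals MCC.
print(math.sqrt(m["informedness"] * m["markedness"]), m["mcc"])
```

The last line checks the geometric-mean relationship numerically for a classifier that does better than chance; when both quantities are negative, the Matthews correlation coefficient takes the negative of the geometric mean.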