In another commentary, the importance of the shape of an ROC curve, rather than the area under the curve (AUC), was highlighted. That topic is rarely, if ever, discussed. In another essay on biomarker discovery, disease prevalence as a driver of analysis was examined. Combining prevalence with ROC analysis also highlights the limitations of the latter.
For those who know ROCs well, this discussion is likely obvious. For many practical users of ROCs, however, it could be surprising.
In addition to AUCs, sensitivities, and specificities, positive and negative predictive values (PPVs and NPVs) are mentioned more and more, as they should be, given their clinical importance. Sometimes (often?) when PPV and NPV are discussed, the formula assumes that the diseased and normal populations are equal in size (in effect, a 50% prevalence). (1)
This does not take into account the actual disease prevalence, which most often has a dramatic effect on the results.
From this example, one can see the classic issue with screening assays. Even with decent sensitivity and specificity, the PPV is untenable. For most diseases requiring screening, the prevalence is even lower, <0.1%, and the situation is worse (95% sensitivity and specificity lead to only a 2% PPV).
Additionally, at low disease prevalence the NPV is 99+% even for very, very poor assays. (“You very probably don’t have the disease.”)
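The arithmetic behind these numbers can be sketched in a few lines of Python. This is a minimal illustration using Bayes' theorem, not a reproduction of the original example: the function name predictive_values, the balanced 50% case, and the "poor assay" values of 50% sensitivity and specificity are assumptions chosen for demonstration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Bayes' theorem.

    All inputs are fractions between 0 and 1. Degenerate inputs that
    produce a zero denominator are not handled in this sketch.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed balanced study design: equal numbers of diseased and normal subjects.
ppv_balanced, npv_balanced = predictive_values(0.95, 0.95, 0.50)

# Screening setting from the text: 95% sensitivity/specificity, 0.1% prevalence.
ppv_screen, npv_screen = predictive_values(0.95, 0.95, 0.001)

# Hypothetical very poor assay (no better than a coin flip) at 0.1% prevalence.
ppv_poor, npv_poor = predictive_values(0.50, 0.50, 0.001)

print(f"balanced:  PPV={ppv_balanced:.1%}  NPV={npv_balanced:.1%}")
print(f"screening: PPV={ppv_screen:.1%}  NPV={npv_screen:.2%}")
print(f"poor:      PPV={ppv_poor:.2%}  NPV={npv_poor:.1%}")
```

Under these assumptions, the same 95%/95% assay that shows a 95% PPV in a balanced design falls to just under 2% PPV at 0.1% prevalence, while even the coin-flip assay keeps an NPV above 99%.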
As can be seen, at higher disease prevalence, it is hard to rule out disease (the NPV drops).
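To see how both predictive values move with prevalence, one can sweep a fixed assay across a range of prevalences. The sketch below assumes a hypothetical assay with 80% sensitivity and 80% specificity; those values, the prevalence grid, and the helper name ppv_npv are illustrative choices, not figures from any study.

```python
def ppv_npv(sens, spec, prev):
    """Compute (PPV, NPV) for a given sensitivity, specificity, and prevalence."""
    ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv


# Hypothetical assay performance (assumed for illustration only).
SENS, SPEC = 0.80, 0.80

# Prevalences from a screening population up to a highly enriched cohort.
PREVALENCES = [0.001, 0.01, 0.10, 0.50, 0.90]

rows = [(prev, *ppv_npv(SENS, SPEC, prev)) for prev in PREVALENCES]

print(f"{'prevalence':>10} {'PPV':>7} {'NPV':>7}")
for prev, ppv, npv in rows:
    print(f"{prev:>10.1%} {ppv:>7.1%} {npv:>7.1%}")
```

With the assay held constant, PPV rises and NPV falls as prevalence increases. At 90% prevalence this hypothetical assay's NPV is only about 31%, so a negative result does little to rule out disease.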
Disease prevalence is critical (and can be manipulated by inclusion and exclusion criteria of a study).
The implication of all this is that the rules of thumb regarding AUCs don't have value except in light of disease prevalence, which is very often not mentioned in study results. Even though NPV and PPV are mentioned more and more, and calculated using disease prevalence, authors and readers gravitate towards the comfortable AUC metric. This is compounded by the choice of cut-offs (and as we know from the shape discussion, perhaps two cut-offs should most often be chosen: one best for PPV and one for NPV).
Companies harp on AUCs, which are often mediocre themselves.
So, we’re back to that simple and misleading metric again. Shape is often more important. When considering ROCs, it is crucial to understand the curve’s shape, the epidemiology (e.g., prevalence), and of course the intended clinical use of the test.
(1) This leads to PPV equaling true positives divided by (true positives plus false positives). For NPV, the value would be true negatives divided by (true negatives plus false negatives).