“Humans, it turns out, are not very good at Bayesian inference, at least when verbal reasoning is involved. The problem is that we tend to neglect the cause’s prior probability.”
-Pedro Domingos, in The Master Algorithm
We remain in the most challenging global health crisis of our generation, with the real-time (or not so real-time) reverse transcriptase polymerase chain reaction (rRT-PCR) screening tests still not widely available. Even when the tests are available, the results do not arrive in a timely manner, which renders these screening tests irrelevant. Last time, we delineated why a high sensitivity as well as a high negative predictive value are preferred, given that a false negative test carries much greater consequences than a false positive. Again, the general observation is that we humans tend to over-rely on the screening test itself (especially if it is in print, as in the electronic health record, which is often full of erroneous information) and concomitantly underestimate the importance of a pre-test assessment (patient symptoms, local prevalence of disease, exposure risk, etc.). In one common scenario of a patient with suspicion for COVID-19, how would we interpret a negative screening test?
We should now discuss the essential coupling of pre-test assessment and screening test in the context of the English theologian and mathematician Thomas Bayes and his Theorem (and its concept of prior probabilities), and how important this is for ruling disease in or out (likelihood of disease). Clinicians will often forgo their strong clinical suspicion for or against disease (very often counter to their judgment) and act instead on the screening test result (positive or negative). Under Bayes’ theorem and its concept of a pre-test probability, anyone who is tested for COVID-19 should first have an estimate of the probability of infection, as this heavily affects the likelihood that a negative test is correctly interpreted as a true negative. In other words, in a person with a higher probability of having COVID-19, the post-test probability that a negative result is a false negative is much higher than in someone who has a lower probability of disease. This likelihood also depends on the sensitivity of the test: the lower the sensitivity, the higher the likelihood of a false negative. The post-test probability can be estimated from the pre-test probability and the test’s likelihood ratio with the use of a Fagan nomogram (with the appropriate conversions between probabilities and odds). Naive Bayes, the widely used supervised machine learning probabilistic classifier, assumes that the features are independent of each other given the class; the “naive” connotation is given because this assumption (called class conditional independence) is usually not true.
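The pre-test to post-test conversion described above can be sketched in a few lines of Python, using the odds form of Bayes’ theorem that a Fagan nomogram encodes graphically. The function name and the sensitivity (0.70) and specificity (0.99) values are assumptions chosen purely for illustration; they are not measured figures for any particular COVID-19 test.

```python
def post_test_probability(pre_test_prob, sensitivity, specificity, test_positive):
    """Probability of disease after a test result (Bayes' theorem, odds form).

    post-test odds = pre-test odds * likelihood ratio
    """
    for name, value in [("pre_test_prob", pre_test_prob),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)]:
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    if test_positive:
        # Positive likelihood ratio: sensitivity / (1 - specificity)
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        # Negative likelihood ratio: (1 - sensitivity) / specificity
        likelihood_ratio = (1.0 - sensitivity) / specificity

    pre_test_odds = pre_test_prob / (1.0 - pre_test_prob)   # probability -> odds
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)          # odds -> probability


# Assumed, illustrative test characteristics (not real COVID-19 figures).
SENSITIVITY = 0.70
SPECIFICITY = 0.99

if __name__ == "__main__":
    # Hypothetical low- and high-suspicion patients, both with a NEGATIVE test.
    for label, prior in [("low pre-test probability", 0.05),
                         ("high pre-test probability", 0.50)]:
        p = post_test_probability(prior, SENSITIVITY, SPECIFICITY, test_positive=False)
        print(f"{label} ({prior:.0%}): probability of disease despite a negative test = {p:.1%}")
```

Under these assumed numbers, the same negative result leaves a far larger residual probability of disease in the high-suspicion patient than in the low-suspicion one, which is the point of estimating the pre-test probability first.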
A special thank you to my friend Dr. Alfonso Limon, who continues to teach me data science and artificial intelligence (including a few nuances about Bayes’ Theorem).
One may be thinking that a negative predictive value would be helpful here, but this calculation depends on the prevalence of disease (not easily known due to inadequate testing for COVID-19). One possible solution to this conundrum would be to perform frequent testing or, better yet, to assume that anyone with symptoms is positive and apply the appropriate quarantine measures (although many infected people are asymptomatic). Of note, given the much higher specificity of the screening test for COVID-19, patients with either a low or a high pre-test probability are very likely to have COVID-19 infection if they test positive on the screening test.
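To make the dependence of predictive values on prevalence concrete, here is a minimal sketch that computes the positive and negative predictive values across a range of prevalences. The function name, the prevalence values, and the sensitivity (0.70) and specificity (0.99) are all assumptions for illustration only, not measured characteristics of any COVID-19 test.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")

    # Expected fractions of the tested population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    ppv = true_pos / (true_pos + false_pos)   # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)   # P(no disease | negative test)
    return ppv, npv


# Assumed, illustrative test characteristics (not real COVID-19 figures).
SENSITIVITY = 0.70
SPECIFICITY = 0.99

if __name__ == "__main__":
    # Hypothetical prevalence values spanning low to high.
    for prevalence in (0.01, 0.05, 0.20, 0.50):
        ppv, npv = predictive_values(prevalence, SENSITIVITY, SPECIFICITY)
        print(f"prevalence {prevalence:>4.0%}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With test characteristics held fixed, the negative predictive value falls as prevalence rises, so an unknown prevalence leaves the negative predictive value unknown as well.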