Using Bayes’ Theorem and Conditional Probability in the Medical Field
Over the years, statistics (a branch of mathematics) has been used to calculate the probability of events. Statistical tests are based on data that has been collected for a specific purpose. In this paper, we look at how Bayes’ theorem can be applied to medical tests and the decisions based on them. In practice, no medical test is ‘100%’ accurate, so some results will always be wrong.
For instance, suppose a diagnostic procedure used to test for Chronic Obstructive Pulmonary Disease (COPD) among smokers and non-smokers is ‘99%’ accurate. Its result will then be correct only 99% of the time; that is, there is always a 1% chance that the result is incorrect. The ability of a test to correctly identify COPD in a person who has the condition is its sensitivity, while the ability to correctly identify the absence of COPD in a person who does not have the condition is its specificity. Now, let us look closely at what this means using Bayes’ theorem:
$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)}$$
P(A|B) is the probability that A happens given that B happens, while P(B|A) is the probability that B happens given that A happens. P(A) and P(B) are the probabilities of A and B happening on their own. Returning to our COPD example, consider the results of a spirometry screening test carried out among district residents (both smokers and non-smokers) during an anti-smoking campaign. The probabilities are shown in Table 1, with green representing a correct diagnosis and red representing an incorrect diagnosis.
Table 1: COPD Spirometry Screening Results
COPD Spirometry Screening | COPD (1%) | No COPD (99%) |
Tested Positive | 90% | 7.6% |
Tested Negative | 10% | 92.4% |
Table 1 indicates that 1% of the district residents have COPD whereas 99% do not. Residents who have COPD are shown in the first column: the spirometry screening gives them a 90% chance of testing positive and a 10% chance of testing negative. Residents who do not have COPD are in the second column: they have a 7.6% chance of testing positive and a 92.4% chance of testing negative. In other words, 90% of residents with COPD test positive (the sensitivity) and 92.4% of residents without COPD test negative (the specificity). These percentages apply within each column, however, so we still need to weight them by how common COPD is to find the chances of a true positive, a false positive and so on. Therefore:
- To calculate the chance of a true positive, we have
= Chance you have COPD * Chance the spirometry result captured it
= 1% * 90%
= 0.009
- To calculate the chance of a false positive, we have
= Chance you don’t have COPD * Chance the spirometry result is positive anyway
= 99% * 7.6%
= 0.07524
- The false negative and true negative follow in the same way: 1% * 10% = 0.001 and 99% * 92.4% = 0.91476.
Our table will now look like this:
COPD Spirometry Screening | COPD (1%) | No COPD (99%) |
Tested Positive | True Positive 0.009 | False Positive 0.07524 |
Tested Negative | False Negative 0.001 | True Negative 0.91476 |
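The four cells of the table above can be reproduced with a short Python sketch. The function name `joint_table` and the variable names are illustrative assumptions, not part of the original analysis; the input figures are the ones given in Table 1.

```python
# Joint probabilities for the COPD screening example (inputs from Table 1).
prevalence = 0.01            # P(COPD)
sensitivity = 0.90           # P(positive | COPD)
false_positive_rate = 0.076  # P(positive | no COPD)


def joint_table(prevalence, sensitivity, false_positive_rate):
    """Return the four joint probabilities (hypothetical helper)."""
    no_disease = 1.0 - prevalence
    return {
        "true_positive": prevalence * sensitivity,
        "false_negative": prevalence * (1.0 - sensitivity),
        "false_positive": no_disease * false_positive_rate,
        "true_negative": no_disease * (1.0 - false_positive_rate),
    }


table = joint_table(prevalence, sensitivity, false_positive_rate)
for name, value in table.items():
    print(f"{name}: {value:.5f}")
```

Because the four cells cover every possible outcome, they should add up to 1, which is a quick way to check the arithmetic.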
Thus the actual chance of having COPD when you have a positive result is
$$P(\text{COPD} \mid \text{positive}) = \frac{0.009}{0.009 + 0.07524} = \frac{0.009}{0.08424} \approx 0.10684$$
The chance of having COPD given a positive test is therefore 0.10684, or about 10.7%, rather than the 90% that a quick reading of Table 1 might suggest. In conclusion, drawing from both tables, if we took 100 district residents, only 1 of them would have COPD (1%), and that resident would have a 90% chance of testing positive. Among the remaining 99 residents, 7.6% would test positive, which is approximately 8 residents. So out of roughly 9 positive results, only 1 would be a true positive. The exact figure is 10.684%, or approximately 1 in every 10 positive results.
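The same result can be checked in Python, both with Bayes’ theorem and with the head-count of 100 residents. This is a minimal sketch: the function name `posterior` and the variable names are assumptions made for illustration, and the inputs are the Table 1 figures.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem (hypothetical helper)."""
    # P(positive) = true positives + false positives
    evidence = prior * sensitivity + (1.0 - prior) * false_positive_rate
    return prior * sensitivity / evidence


ppv = posterior(prior=0.01, sensitivity=0.90, false_positive_rate=0.076)
print(f"P(COPD | positive) = {ppv:.5f}")

# Head-count view: the same calculation for 100 residents.
residents = 100
with_copd = residents * 0.01                  # 1 resident
true_pos = with_copd * 0.90                   # 0.9 expected true positives
false_pos = (residents - with_copd) * 0.076   # 7.524, roughly 8 residents
ratio = true_pos / (true_pos + false_pos)
print(f"Expected false positives: {false_pos:.3f} (about {round(false_pos)})")
print(f"Share of positives that are correct: {ratio:.5f}")
```

Both routes give the same answer because the head-count is simply the probabilities multiplied by 100.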
In our second example, we take a look at lung cancer. CDC (2015) links 80% to 90% of lung cancer cases to cigarette smoking; further, people who smoke are 30 times more likely to get lung cancer than non-smokers. CDC also notes that 15 out of 100 adults (those aged 18 years and above) smoke cigarettes. The National Cancer Institute (2015) reports that, among people given lung cancer Computed Tomography (CT) scans up to three times annually, 39.1% recorded at least one false positive, with a reported test accuracy of 94.5%. The US adult population is approximately 243,333,333, which implies that about 36,500,000 adults smoke cigarettes. Here is how the figures look in table format:
Table 2: Sample CT Scan test results
CT Scan | Smokers (15%) | Non-smokers | Totals |
Tested Positive | True Positive 34,492,500 (94.5%) | False Positive 80,871,834 (39.1%) | 115,364,334 |
Tested Negative | False Negative 2,007,500 (5.5%) | True Negative 125,961,499 (60.9%) | 127,968,999 |
Totals | 36,500,000 | 206,833,333 | 243,333,333 |
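The counts in Table 2 can be rebuilt from the four inputs quoted in the text. The sketch below is illustrative only, and the variable names are assumptions. It follows the table in applying the 94.5% accuracy to smokers and the 39.1% false positive rate to non-smokers. Because of rounding, the non-smoker cells may differ from the table by one person.

```python
# Inputs quoted in the text.
adults = 243_333_333
smoking_rate = 0.15          # 15 out of 100 adults smoke
accuracy = 0.945             # applied to the smokers column, as in Table 2
false_positive_rate = 0.391  # applied to the non-smokers column, as in Table 2

smokers = round(adults * smoking_rate)
non_smokers = adults - smokers

true_pos = round(smokers * accuracy)
false_neg = smokers - true_pos
false_pos = round(non_smokers * false_positive_rate)
true_neg = non_smokers - false_pos

print(f"Smokers:      positive {true_pos:,}  negative {false_neg:,}  total {smokers:,}")
print(f"Non-smokers:  positive {false_pos:,}  negative {true_neg:,}  total {non_smokers:,}")
print(f"Totals:       positive {true_pos + false_pos:,}  negative {false_neg + true_neg:,}")
```

Deriving each negative cell by subtraction keeps the column totals exact even when the percentages do not divide evenly.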
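Using the Bayes’ theorem formula given earlier in this paper, event A (being a smoker in the US) has a probability of 15%; event B (a positive result) is taken as 85% on average (the mid-point of 80% to 90%); and the ‘probability of event B given A’ is taken as about 333%, or 3.33 (from ‘30 times more likely’). A value above 100% is a relative weighting rather than a true probability, so the figure that follows is only indicative. Thus: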
$$P(A \mid B) = \frac{3.33 \times 0.15}{0.85} \approx 0.58764$$
The computed probability of a smoker having lung cancer is 0.58764 (58.76%). Compared with the initial projection of 80% to 90%, this raises questions about the accuracy of CT scan test results.
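For readers who want to retrace the arithmetic, here is a minimal Python sketch using the three figures from the text. The variable names are assumptions. Note that the value used for ‘B given A’ is greater than 1, so it behaves as a ratio rather than a probability, and that the quotient rounds to 0.58765 (the 0.58764 quoted above is the same number truncated).

```python
p_a = 0.15        # share of US adults who smoke (event A)
p_b = 0.85        # mid-point of 80% to 90% (event B)
b_given_a = 3.33  # figure the text uses for "B given A"; above 1, so a ratio

# Same form as Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
result = b_given_a * p_a / p_b
print(f"{result:.5f}")
```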
In real life, it is important to remember that CT scan tests, however popular, will not be carried out on all existing smokers or on the entire population. Thus the number of false positives among non-smokers shown in Table 2 is unlikely to be recorded in practice. However, people with false negative and false positive results will continue to be monitored by their doctors for any symptoms of the disease, and will be tested again if symptoms appear. Doctors will also carry out appropriate follow-up tests, especially on positive screening results, thereby reducing the risk of prescribing lung cancer drugs to persons who are otherwise healthy. Personally, I believe it is this practice of carrying out extensive tests, investigations and re-tests, which others might consider invasive and burdensome, that supports the rationale for complementing medical tests with the test results (data), calculations, probability theorems (Bayes’ theorem), assumptions and other external factors, thereby providing those in the medical profession with comprehensive information to make accurate and timely decisions.