Evidence-based medicine and clinical diagnosis


I just started reading The evidence base of clinical diagnosis, edited by J. André Knottnerus and Frank Buntinx (Wiley, 2009, 2nd ed.).

Evidence-based medicine seeks to inform clinical decision makers with reliable and scientifically established facts in therapeutic, prognostic and diagnostic research. There is also a dedicated journal from the BMJ group.

Diagnostic testing can be seen as

the collection of additional information intended to (further) clarify the character and prognosis of the patient's condition and can include patients' characteristics, symptoms and signs, history and physical examination items, or additional tests using laboratory or other technical facilities.

The authors emphasize the need to clearly articulate the main objectives of a diagnostic study (detecting or excluding disorders, contributing to the decision-making process, assessing prognosis, monitoring the clinical course of a disorder, or measuring physical fitness) and in particular to clearly formulate the relevant questions: (Phase I) Do patients with the target disorder have different test results from normal individuals? (Phase II) Are patients with certain test results more likely to have the target disorder than patients with other test results? (Phase III) Among patients in whom it is clinically sensible to suspect the target disorder, does the level of the test result distinguish those with and without the target disorder? (Phase IV) Do patients who undergo this diagnostic test fare better (in their ultimate health outcomes) than similar patients who do not? (Phase V) Does use of the diagnostic test lead to better health outcomes at an acceptable cost?

Methodological challenges are multiple: there are complex many-to-many relations between diagnostic and nosological outcomes; a gold standard to which a diagnostic test can be compared rarely exists; spectrum and selection bias can seriously affect the validity of the findings; "soft" measures (pain, feeling unwell, etc.) should be part of the assessment of the clinical outcome; observer variability and bias can threaten the reliability of the assessment; discrimination does not mean usefulness, in the sense that a diagnostic test may add too little to what is already known from a clinical perspective; and the indication area of a test is defined by a moderate prior probability of a particular disorder.
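The last two points can be made concrete with Bayes' theorem in its odds form. The sketch below is only an illustration: the function name `post_test_probability` and the sensitivity and specificity of 0.90 are assumptions chosen for the example, not figures from the book. It shows how much a positive result shifts the probability of disease at low, moderate, and high prior probabilities.

```python
def post_test_probability(prior: float, sensitivity: float, specificity: float) -> float:
    """Probability of disease after a positive test, via the odds form of Bayes' theorem."""
    positive_lr = sensitivity / (1.0 - specificity)  # likelihood ratio of a positive result
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * positive_lr
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical test characteristics (assumed for illustration only).
SENSITIVITY = 0.90
SPECIFICITY = 0.90

for prior in (0.01, 0.50, 0.95):
    posterior = post_test_probability(prior, SENSITIVITY, SPECIFICITY)
    gain = posterior - prior
    print(f"prior={prior:.2f}  post-test={posterior:.3f}  gain={gain:+.3f}")
```

With these assumed values the gain in probability is largest at the moderate prior and small at both extremes, which is one way to read the remark that a test with good discrimination may still add little to what the clinician already knows.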

Another important aspect of a diagnostic study: What do you mean by "normal" and "the normal range"? There is always this infamous "Gaussian" definition, which assumes that the diagnostic test results for a target population follow a Gaussian distribution, meaning that the mean ± 2 standard deviations encloses 95% of the values. Besides the obvious fact that diagnostic readings do not obey a normal law, a serious issue with this approach is the diagnosis of nondisease, since it amounts to saying that the diseases represented by the 2.5% of test results located at both ends of the Gaussian distribution have exactly the same estimated frequency. Using a percentile range is no more satisfactory, even if the shape of the distribution is ignored: when the lower 95% of test results are called "normal", it still suggests that the underlying prevalence of all diseases is exactly the same, 5%, notwithstanding the fact that several independent readings will decrease the likelihood of any patient being called "normal". Both approaches fall into the trap of the upper-limit syndrome of nondisease.
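The point about several independent readings is simple arithmetic: if each test labels 95% of disease-free people "normal" and the tests are independent, the chance of being "normal" on all of them is 0.95 raised to the number of tests. A minimal sketch, where the function name `prob_all_normal` is made up for this example and independence is an assumption that real test panels rarely satisfy exactly:

```python
def prob_all_normal(n_tests: int, normal_fraction: float = 0.95) -> float:
    """Chance that a disease-free patient falls in the "normal" range on every one
    of n_tests independent tests, each calling normal_fraction of results normal."""
    return normal_fraction ** n_tests


for n in (1, 5, 20):
    print(f"{n:2d} tests: P(all normal) = {prob_all_normal(n):.3f}")
```

Under these assumptions, a healthy patient who undergoes 20 such tests has only about a 36% chance of being called "normal" on all of them.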

Like a Phase I study, a Phase II investigation does not meet methodological standards either and should be considered explanatory; it should tell us whether a promising diagnostic test is worth further, costlier evaluation. Common threats to Phase III studies include:

  • Lack of an independent, blind comparison with a gold standard of diagnosis.
  • Loss of the reference standard.
  • Uninterpretable or indeterminate result from the reference standard.
  • Diagnostic test result lost, never performed or indeterminate.
  • Interpretation of reference standard not blind to outcome from the diagnostic test (and vice versa).
  • Selection of the "upper limit of normal" or cut-point for the diagnostic test is under the control of the investigator.

Of particular concern is the fact that a lack of results on the gold standard or the diagnostic test adds a third row/column to the classical 2-by-2 cross-classification table, which most of the time will lead to biased estimates of sensitivity and specificity. Phase IV questions generally arise with diagnostic tests for the early detection of asymptomatic disease, and they require follow-up of randomized patients. Phase V typically has to do with cost-effectiveness studies: we are generally seeking a more effective but less costly testing strategy.
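To see why the extra row/column matters, here is a small sketch comparing two ways of handling indeterminate test results. All counts are invented for illustration, and the "worst case" rule (counting every indeterminate result as a test error) is just one possible convention, not the approach recommended in the book.

```python
def sens_spec(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Sensitivity and specificity from the four cells of a 2-by-2 table."""
    sensitivity = tp / (tp + fn)  # diseased patients correctly testing positive
    specificity = tn / (tn + fp)  # disease-free patients correctly testing negative
    return sensitivity, specificity


# Hypothetical counts (assumed for illustration only).
tp, fn, fp, tn = 80, 20, 10, 90
indeterminate_diseased = 10   # diseased patients with an indeterminate test result
indeterminate_healthy = 20    # disease-free patients with an indeterminate test result

# Complete-case analysis: indeterminate results are silently dropped.
complete_case = sens_spec(tp, fn, fp, tn)

# Worst-case analysis: every indeterminate result is counted as a test error.
worst_case = sens_spec(tp, fn + indeterminate_diseased, fp + indeterminate_healthy, tn)

print("complete case: sensitivity=%.3f specificity=%.3f" % complete_case)
print("worst case:    sensitivity=%.3f specificity=%.3f" % worst_case)
```

The gap between the two sets of estimates gives a rough sense of how much the reported accuracy depends on what is done with the third row/column.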

Other books of interest:

  1. Pepe, MS. The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press, 2003. But see The Diagnostic and Biomarkers Statistical (DABS) Center.
  2. Broemeling, LD. Bayesian Biostatistics and Diagnostic Medicine. Chapman & Hall, 2007.
