Mendelian randomization

June 15, 2012

Some time ago, I proposed a paper on Mendelian randomization for the Journal Club on Cross Validated. Apparently, the proposal fell through, but here are the main ideas from that paper.

The paper in question is freely available on PLoS Medicine: Sheehan NA, Didelez V, Burton PR, Tobin MD (2008). Mendelian Randomisation and Causal Inference in Observational Epidemiology. PLoS Med 5(8): e177.

An older paper is available in the International Journal of Epidemiology: ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Another paper, Mendelian randomization in family data (BMC Proc. 2009; 3(Suppl 7): S45), applies Mendelian randomization with aggregation at the family level.

In a few words, the idea behind Mendelian randomization (which, from a genetic perspective, is all about the random assortment of genes from parents to offspring that occurs during gamete formation and conception¹) is to use a known genetic variant as a proxy for an intermediate phenotype (the exposure) in order to assess its relation to the disease of interest despite potential confounding, much like an instrumental variable in econometrics. Most importantly, this genetic variant is unrelated to the confounding factor(s), but it is predictive of the exposure. The genetic variant also has no direct effect on the outcome: conditional on the exposure and the confounders, it is independent of the outcome. Under these conditions, testing the association between the genetic variant and the outcome amounts to testing for the causal effect exposure → outcome.
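To make the instrumental-variable analogy concrete, here is a minimal simulation sketch. It is not taken from the paper: the allele frequency, effect sizes, sample size, and the helper name `simulate` are all assumptions chosen for illustration, and the Wald ratio used here is only one common instrumental-variable estimator for a linear setting.

```python
import numpy as np


def simulate(n=200_000, beta=0.3, seed=0):
    """Simulate G -> X -> Y with an unobserved confounder U of X and Y.

    All numeric settings are illustrative assumptions.
    Returns (naive, wald): the confounded OLS slope of Y on X and the
    Wald ratio estimate that uses G as an instrument.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n)        # genotype: copies of the variant allele (0, 1 or 2)
    u = rng.normal(size=n)                  # unobserved confounder, independent of G
    x = 0.5 * g + u + rng.normal(size=n)    # exposure: depends on G and U
    y = beta * x + u + rng.normal(size=n)   # outcome: G acts only through X

    def slope(a, b):
        # OLS slope of b regressed on a
        return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

    naive = slope(x, y)                     # biased by the confounder U
    wald = slope(g, y) / slope(g, x)        # (G-Y association) / (G-X association)
    return naive, wald


naive, wald = simulate()
print(f"naive OLS estimate:  {naive:.2f}")
print(f"Wald ratio estimate: {wald:.2f} (simulated causal effect: 0.30)")
```

With these settings the naive regression of outcome on exposure is inflated by the confounder, whereas the ratio of the two genotype associations should land close to the simulated effect of 0.3. Setting `beta=0` removes the genotype-outcome association altogether, which is the sense in which testing that association tests for a causal effect.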

Several limitations of Mendelian randomization are discussed in the paper, including linkage disequilibrium, genetic heterogeneity (when a phenotype is influenced by several alleles, generally at different loci), pleiotropy (when a genetic variant has more than one phenotypic effect), and population stratification (when the relation between allele frequencies and disease or exposure varies across subgroups), to name a few. Figures 2-5 of the paper provide nice depictions of what happens in those cases.
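As a rough illustration of why pleiotropy matters, the sketch below adds a direct genotype → outcome path to a simple linear simulation. Again, this is an assumption-laden toy example rather than anything from the paper: the function name `wald_ratio`, the parameter `direct_effect`, and all numeric values are hypothetical.

```python
import numpy as np


def wald_ratio(direct_effect, n=200_000, beta=0.3, seed=1):
    """Wald ratio estimate when G may also affect Y directly (pleiotropy).

    direct_effect = 0 corresponds to a valid instrument; any other value
    opens a path from G to Y that bypasses the exposure X.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n)        # genotype (0, 1 or 2 variant alleles)
    u = rng.normal(size=n)                  # unobserved confounder
    x = 0.5 * g + u + rng.normal(size=n)    # exposure
    # outcome: causal effect of X, plus an optional direct effect of G
    y = beta * x + direct_effect * g + u + rng.normal(size=n)
    cov_gy = np.cov(g, y)[0, 1]
    cov_gx = np.cov(g, x)[0, 1]
    return cov_gy / cov_gx


print(f"valid instrument:      {wald_ratio(0.0):.2f}")
print(f"pleiotropic variant:   {wald_ratio(0.2):.2f}")
```

In this linear setup the estimate is shifted by roughly the direct effect divided by the genotype-exposure effect (here 0.2 / 0.5), so the pleiotropic variant points to a causal effect well above the simulated 0.3 even though the exposure itself is unchanged.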


  1. Chen, L, Smith, GD, Harbord, RM, and Lewis, SJ (2008). Alcohol intake and blood pressure: A systematic review implementing a Mendelian randomization approach. PLoS Med 5: e52.
  2. Hernán, MA and Robins, JM (2006). Instruments for causal inference: An epidemiologist’s dream? Epidemiology 17: 360–372.
  3. Smith, GD, Ebrahim, S, Lewis, S, Hansell, AL, Palmer, LJ, and Burton, PR (2005). Genetic epidemiology and public health: Hope, hype, and future prospects. Lancet 366: 1484–1498.
  4. Smith, GD and Ebrahim, S (2003). ‘Mendelian randomization’: Can genetic epidemiology contribute to understanding environmental determinants of disease? International Journal of Epidemiology 32(1): 1–22.
  5. Katan, MB (1986). Apolipoprotein E isoforms, serum cholesterol, and cancer. Lancet i: 507–508.
  6. Cambien, F (2003). On Mendelian Randomisation. GeneCanvas.
  7. Didelez, V and Sheehan, NA (2007). Mendelian randomisation as an instrumental variable approach to causal inference. Statistical Methods in Medical Research 16: 309–330.
  8. Lawlor, DA, Harbord, RM, Sterne, JAC, Timpson, N, and Smith, GD (2008). Mendelian randomization: Using genes as instruments for making causal inferences in epidemiology. Statistics in Medicine 27: 1133–1163.

  1. In the context of a genetic association study, Mendelian laws ensure that comparing groups of individuals defined by genotype is equivalent to a randomized comparison: genetic or non-genetic traits are expected to be distributed randomly across genotypes, except those that are affected by the polymorphism under study. This is what allows a bias-free comparison of phenotypes across genotypes, and it motivates the idea that such results might bring insight into causal pathways.

