Processing math: 100%

ANOVA and the linear model

Christophe Lalanne

November 25, 2014

Synopsis

These slides are intended to provide some complementary material to multiple comparisons in ANOVA modeling, and the analysis of correlated using mixed-effect models.




orthogonal contrasts • post-hoc comparisons • unbalanced data paired samples • intraclass correlation • variance components • random effects

ANOVA and multiple testing

The analysis of variance is not a mathematical theorem, but rather a convenient method of arranging the arithmetic. Ronald Fisher (1890-1962)

Back to the ANOVA model

One-way ANOVA: yij=μ+αi+εij,εijN(0,σ2).

Two-way ANOVA yijk=μ+αi+βj+γij+εijk,εijkN(0,σ2).

When a factor has more than 2 levels, or when the interaction between the two factors is of interest, this involves post-hoc multiple comparisons of pairs of means.

Orthogonal contrasts

With k samples (treatments), we can define k1 orthogonal contrasts ϕ=ki=1ci¯xi,ici=0etϕtuϕtv=0 and use as a test statistic ϕsϕ, where s2ϕ=s2ic2ini, which is distributed as a Student's t.

  • contr.treatment: treatment contrast (dummy coding), k1 last levels compared to baseline category
  • contr.sum: deviation contrast, k1 first levels compared to the mean of the group means
  • contr.poly: polynomial contrast (factor with ordered levels only), to test for linear, quadratic, cublic trend.

contr.treatment(levels(grp <- gl(3, 2, 36, c("A","B","C"))))
##   B C
## A 0 0
## B 1 0
## C 0 1
y <- 0.5 * as.numeric(grp) + rnorm(36);
coef(lm(y ~ grp))
## (Intercept)        grpB        grpC 
##   0.3611311   0.6080719   1.1492023
coef(lm(y ~ grp, contrasts = list(grp = "contr.sum")))
## (Intercept)        grp1        grp2 
##  0.94688920 -0.58575808  0.02231383

Application

Consider a factor A with 4 levels and 8 observations per group.

set.seed(101)
n <- 8
A <- gl(4, n, 4*n, labels = paste("a", 1:4, sep = ""))
y <- 0.5 * as.numeric(A) + rnorm(4*n)
d <- data.frame(y, A)
print(mm <- tapply(y, A, mean))
##        a1        a2        a3        a4 
## 0.7195790 0.9945359 0.8941996 2.2276571

One-way ANOVA model:
H0:μ1=μ2=μ3=μ4, SS=11.38, F(3,28)=6.036 (p=0.00264).
Let us assume that we are interested in the following comparisons: H(1)0:(μ1+μ2)/2=(μ3+μ4)/2, H(2)0:μ1=μ2 and H(3)0:μ3=μ4.

cm <- cbind(c(-1,-1,1,1), c(-1,1,0,0), c(0,0,-1,1))
contrasts(d$A) <- cm
summary(lm(y ~ A, data = d))
## 
## Call:
## lm(formula = y ~ A, data = d)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.63805 -0.33598  0.07134  0.46073  1.43322 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   1.2090     0.1401   8.628 2.25e-09 ***
## A1            0.3519     0.1401   2.512  0.01806 *  
## A2            0.1375     0.1982   0.694  0.49355    
## A3            0.6667     0.1982   3.365  0.00224 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7927 on 28 degrees of freedom
## Multiple R-squared:  0.3927, Adjusted R-squared:  0.3277 
## F-statistic: 6.036 on 3 and 28 DF,  p-value: 0.002642

as.numeric((t(cm) %*% mm) / c(4, 2, 2))
## [1] 0.3519355 0.1374784 0.6667288
library(multcomp)
summary(glht(aov(y ~ A, d), linfct = mcp(A = c("a1 - a2 = 0"))))
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: User-defined Contrasts
## 
## 
## Fit: aov(formula = y ~ A, data = d)
## 
## Linear Hypotheses:
##              Estimate Std. Error t value Pr(>|t|)
## a1 - a2 == 0  -0.2750     0.3963  -0.694    0.494
## (Adjusted p values reported -- single-step method)

Multiples comparisons

With k samples, there are k(k1)/2 pairs of means to compare. With k=4 and α=0.05, the family wise error rate (FWER) becomes 1(10.05)6=0.265, assuming tests are independent.

There are two general (and conservative) strategies to control the inflation of Type I errors: (Christensen, 2002)

  • consider a different test statistic (e.g., HSD Tukey)
  • consider a different nominal Type I error (e.g., Bonferroni correction)
p.adjust.methods
## [1] "holm"       "hochberg"   "hommel"     "bonferroni" "BH"        
## [6] "BY"         "fdr"        "none"

Application

Usually, Bonferroni correction is applied to unplanned comparisons, and it can be restricted to a subset of all possible tests.

pairwise.t.test(y, A, p.adjust.method = "bonf")
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  y and A 
## 
##    a1     a2     a3    
## a2 1.0000 -      -     
## a3 1.0000 1.0000 -     
## a4 0.0042 0.0255 0.0134
## 
## P value adjustment method: bonferroni

Tukey HSD tests can be used following a significant ANOVA, and they are applied to all pairs of means.

TukeyHSD(aov(y ~ A, d))
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = y ~ A, data = d)
## 
## $A
##             diff        lwr       upr     p adj
## a2-a1  0.2749569 -0.8071461 1.3570599 0.8986093
## a3-a1  0.1746206 -0.9074824 1.2567236 0.9708799
## a4-a1  1.5080781  0.4259751 2.5901811 0.0037260
## a3-a2 -0.1003363 -1.1824393 0.9817667 0.9941741
## a4-a2  1.2331212  0.1510182 2.3152242 0.0208968
## a4-a3  1.3334575  0.2513545 2.4155605 0.0113289

Unbalanced data

When the number of observations differ in each treatment, we are more concerned with how to compute sum of squares (Herr, 1986).

For two factors, A and B:

  • Type I (default): SS(A), SS(B|A), then SS(AB|B, A)
  • Type II: SS(A|B), then SS(B|A) (no interaction)
  • Type III: SS(A|B, AB), SS(B|A, AB) (interpret each main effect after having accounted for other main effects and the interaction)

See also, Venables, W.N. (2000). Exegeses on Linear Models. Paper presented to the S-PLUS User’s Conference Washington, DC, 8-9th October, 1998. (§ 5.1)

Analysis of correlated data

Evelyn Hall: I would like to know how (if) I can extract some of the information from the summary of my nlme.
Simon Blomberg: This is R. There is no if. Only how.
—Evelyn Hall and Simon ‘Yoda’ Blomberg
R-help (April 2005)

Mixed-effects models

Compared to standard linear models, mixed-effect models further include random-effect terms that allow to reflect correlation between statistical units. Such clustered data may arise as the result of grouping of individuals (e.g., students nested in schools), within-unit correlation (e.g., longitudinal data), or a mix thereof (e.g., performance over time for different groups of subjects) (McCulloch and Searle, 2001; Lindsey, 1999; Gelman and Hill, 2007).

This approach belongs to conditional models, as opposed to marginal models, where we are interested in modeling population-averaged effects by assuming a working (within-unit) correlation matrix. ME models are not restricted to a single level of clustering, and they give predicted values for each cluster or level in the hierarchy.

Fixed vs. random effects

There seems to be little agreement about what fixed and random effects really means (Gelman, 2005).

As a general decision workflow, we can ask whether we are interested in just estimating parameters for the random-effect terms, or get predictions at the individual level.

Analysis of paired data

The famous 'sleep' study is a good illustration of the importance of using pairing information when available.

a <- t.test(extra ~ group, data = sleep, var.equal = TRUE)
## [1] "t(18)=-1.86, p=0.07919"
b <- t.test(extra ~ group, data = sleep, paired = TRUE)
## [1] "t(9)=-4.06, p=0.00283"

Generally, ignoring within-unit correlation results in a less powerful test: Considering that Cov(X1,X2)=0 amounts to over-estimate variance of the differences, since Cov(X1,X2) will generally be positive.

A positive covariance in this case means that subjects having higher values on the first level will also have higher values on the second level.

Analysis of repeated measures

Lack of digestive enzymes in the intestine can cause bowel absorption problems, with excess fat in the feces. Pancreatic enzyme supplements can be given to ameliorate the problem (Vittinghoff, Glidden, Shiboski, and McCulloch, 2005).

ID None Tablet Capsule Coated
1 1.00 44.50 7.30 3.40 12.40
2 2.00 33.00 21.00 23.10 25.40
3 3.00 19.10 5.00 11.80 22.00
4 4.00 9.40 4.60 4.60 5.80
5 5.00 71.30 23.30 25.60 68.20
6 6.00 51.20 38.00 36.00 52.60

R's wide and long format

Wide format: Each level in a separate column.

d <- read.table("../data/pilltype.dat", header = TRUE)
head(d)
##   ID None Tablet Capsule Coated
## 1  1 44.5    7.3     3.4   12.4
## 2  2 33.0   21.0    23.1   25.4
## 3  3 19.1    5.0    11.8   22.0
## 4  4  9.4    4.6     4.6    5.8
## 5  5 71.3   23.3    25.6   68.2
## 6  6 51.2   38.0    36.0   52.6

Long format: All levels in the same column.

Note that we often need to keep an 'ID' variable to identify subjects and repeated measurements.

library(reshape2)
head(fat <- melt(d, id.vars = "ID", 
                 variable.name = "pilltype", 
                 value.name = "fecfat"), n = 7)
##   ID pilltype fecfat
## 1  1     None   44.5
## 2  2     None   33.0
## 3  3     None   19.1
## 4  4     None    9.4
## 5  5     None   71.3
## 6  6     None   51.2
## 7  1   Tablet    7.3

Variance components

There is only one predictor, Pill type, which is attached to subject and period of time (subsumed under the repeated administration of the different treatment).

Here are different ways of decomposing the total variance:

  • One-way ANOVA:
    aov(fecfat ~ pilltype, data = fat)
  • Two-way ANOVA:
    aov(fecfat ~ pilltype + subject, data = fat)
  • RM ANOVA:
    aov(fecfat ~ pilltype + Error(subject), data = fat)

Source DF SS MS M1 M2*/M3
pilltype 3 2009 669.5 669.5/359.7 (p=0.17 ) 669.5/107.0 (p=0.01)
subject 5 5588 1117.7 1117.7/107.0 (p=0.00*)
residuals 15 1605 107.0

  • The first model, which assumes independent observations, does not remove variability between subjects (about 77.8% of residual variance).

  • The last two models incorporate subject-specific effects:

yij=μ+subjecti+pilltypej+εij,εijN(0,σ2ε)

  • In the third model, we further assume subjectiN(0,σ2s), independent of εij.

The inclusion of a random effect specific to subjects allows to model several types of within-unit correlation at the level of the outcome. What is the correlation between measurements taken from the same individual? We know that Cor(yij,yik)=Cov(yij,yik)Var(yij)Var(yik). Because μ and pilltype are fixed, and εijsubjecti, we have Cov(yij,yik)=Cov(subjecti,subjecti)=Var(subjecti)=σ2s, and variance components follow from Var(yij)=Var(subjecti)+Var(εij)=σ2s+σ2ε, which is assumed to hold for all observations.

So that, we have Cor(yij,yik)=σ2sσ2s+σ2ε which is the proportion of the total variance that is due to subjects. It is also known as the intraclass correlation, ρ, and it measures the closeness of observations on different subjects (or within-cluster similarity).

  • Subject-to-subject variability simultaneously raises or lowers all the observations on a subject.
  • The variance-covariance structure in the above model is called compound symmetry.

Estimating ρ

Observations on the same subject are modeled as correlated through their shared random subject effect. Using the random intercept model defined above, we can estimate ρ as follows: (default method is known as REML)

First, we fit a random intercept model by specifying which variable is used to identify subjects (ID):

library(nlme)
lme.fit <- lme(fecfat ~ pilltype, data = fat, random = ~ 1 | ID)
anova(lme.fit)
##             numDF denDF   F-value p-value
## (Intercept)     1    15 14.265686  0.0018
## pilltype        3    15  6.257391  0.0057

intervals(lme.fit)
## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                     lower       est.     upper
## (Intercept)      21.58081  38.083333 54.585860
## pilltypeTablet  -34.27929 -21.550000 -8.820713
## pilltypeCapsule -33.39595 -20.666667 -7.937380
## pilltypeCoated  -19.74595  -7.016667  5.712620
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: ID 
##                    lower     est.    upper
## sd((Intercept)) 8.001119 15.89557 31.57924
## 
##  Within-group standard error:
##     lower      est.     upper 
##  7.232394 10.344027 14.794394

We want to compute Cor(yij,yik)=σ2sσ2s+σ2ε.

From the preceding output, we can extract everything we need using VarCorr:

sigma.s <- as.numeric(VarCorr(lme.fit)[1,2])
sigma.eps <- as.numeric(VarCorr(lme.fit)[2,2])
sigma.s^2 / (sigma.s^2+sigma.eps^2)
## [1] 0.7025064

From the ANOVA table, we can also compute ˆρ:

ms <- anova(lm(fecfat ~ pilltype + ID, data = fat))[[3]]
vs <- (ms[2]-ms[3]) / nlevels(fat$pilltype)
vr <- ms[3]                                
vs / (vs+vr)
## [1] 0.6389674

We could also use Generalized Least Squares, imposing compound symmetry:

gls.fit <- gls(fecfat ~ pilltype, data = fat, 
               corr = corCompSymm(form= ~ 1 | ID))
anova(gls.fit)
## Denom. DF: 20 
##             numDF   F-value p-value
## (Intercept)     1 14.265652  0.0012
## pilltype        3  6.257396  0.0036
## intervals(gls.fit) 

We would get ˆρ=0.7025074.

The final picture

The imposed variance-covariance structure is clearly reflected in the predicted values.

Model diagnostic

Inspection of the distribution of the residuals and residuals vs. fitted values plots are useful diagnostic tools. It is also interesting to examine the distribution of random effects (intercepts and/or slopes).

Some remarks

  • For a balanced design, the residual variance for the within-subject ANOVA and random-intercept model will be identical (the REML estimator is equivalent to ANOVA MSs). Likewise, Pill type effects and overall mean are identical.
  • Testing the significance of fixed effects can be done using ANOVA (F-value) or by model comparison. In the latter case, we need to fit model by ML method (and not REML) because models will include differents fixed effects.
## anova(lme.fit)
lme.fit <- update(lme.fit, method="ML")
lme.fit0 <- update(lme.fit, fixed= . ~ - pilltype)
anova(lme.fit, lme.fit0)
##          Model df      AIC      BIC     logLik   Test  L.Ratio p-value
## lme.fit      1  6 201.9581 209.0264  -94.97905                        
## lme.fit0     2  3 210.5667 214.1008 -102.28334 1 vs 2 14.60857  0.0022

Variance-covariance structure

Other VC matrices can be choosen, depending on study design (Pinheiro and Bates, 2000), e.g., Unstructured, First-order auto-regressive, Band-diagonal, AR(1) with heterogeneous variance.

The random-intercept model ensures that the VC structure will be constrained as desired. With repeated measures ANOVA, a common strategy is to use Greenhouse-Geisser or Huynh-Feldt correction to correct for sphericity violations, or to rely on MANOVA which is less powerful, e.g. (Abdi, 2010; Zar, 1998).



Mixed-effect models are more flexible as they allow to make inference on the correlation structure, and to perform model comparisons.

References

References

[1] H. Abdi. "The Greenhouse-Geisser correction". In: Encyclopedia of Research Design. Ed. by N. Salkind. Thousand Oaks, CA: Sage, 2010.

[2] R. Christensen. Plane Answers to Complex Questions: The Theory of Linear Models. 3rd. Springer, 2002.

[3] A. Gelman. "Analysis of variance-why it is more important than ever". In: Annals of Statistics 33.1 (2005), pp. 1-53.

[4] A. Gelman and J. Hill. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, 2007.

[5] D. Herr. "On the History of ANOVA in Unbalanced, Factorial Designs: The First 30 Years". In: The American Statistician 40.4 (1986), pp. 265-270.

[6] J. Lindsey. Models for Repeated Measurements. 2nd. Oxford University Press, 1999.

[7] C. McCulloch and S. Searle. Generalized, Linear, and Mixed Models. Wiley, 2001.

[8] J. Pinheiro and D. Bates. Mixed-Effects Models in S and S-PLUS. Springer, 2000.

[9] E. Vittinghoff, D. Glidden, S. Shiboski, et al. Regression Methods in Biostatistics. Linear, Logistic, Survival, and Repeated Measures Models. Springer, 2005.

[10] J. Zar. Biostatistical Analysis. 4th. Pearson, Prentice Hall, 1998.