# aliquote.org

A few years ago I ran a workshop where I showed how to carry out a simple (bilateral) permutation test in R. Assuming you have a matrix d with two columns of values, and no missing values, it reads:

s0 <- mean(d[,1]) - mean(d[,2])               ## observed statistic
x <- c(d[,1], d[,2])
idx <- combn(seq(along = 1:20), 10)           ## {20 \choose 10}
f <- function(k) mean(x[k]) - mean(x[-k])
s <- apply(idx, 2, f)                         ## test statistic s
pobs <- sum(abs(s) >= abs(s0)) / length(s)


Using the coin package the same test is done as follows:

library(coin)
oneway_test(value ~ variable, data = reshape2::melt(d[1:2]),
alternative = "two.sided", distribution = "exact")


I also showed how to devise a bootstrap test of hypothesis in R. This time, we need to take care of correctly centering the group means, though:

x1 <- d[,1] - mean(d[,1]) + mean(x)           ## x is c(d[,1], d[,2])
x2 <- d[,2] - mean(d[,2]) + mean(x)
B <- 999                                      ## no. bootstrap samples
s <- numeric(B)                               ## vector of test statistics
for (i in 1:B) {
x1s <- sample(x1, replace=TRUE)
x2s <- sample(x2, replace=TRUE)
s[i] <- mean(x1s) - mean(x2s)
}
pobs <-  (1 + sum(abs(s) > abs(s0))) / (B+1)  ## s0 is mean(d[,1]) - mean(d[,2])


Note that the for loop can easily be replaced with a function call in a replicate statement, of course. Here is some results with the ToothGrowth dataset we discussed earlier, which can be prepared as follows:

> data(ToothGrowth)
> d <- as.data.frame(with(ToothGrowth, split(len, supp)))
> x <- ToothGrowth\$len
> s0 <- mean(d[,1]) - mean(d[,2])
--8<-- copy/paste code above ---
> pobs
 0.043


Of course, we could compute confidence normal, percentile or BCA confidence intervals, with little effort. Stata does it smoothly, as usual:

. use toothgrowth, clear
. bootstrap dobs = (r(mu_1) - r(mu_2)), reps(1000) seed(101): ttest len, by(supp)
. estat bootstrap, all

Bootstrap results                               Number of obs      =        60
Replications       =      1000

command:  ttest len, by(supp)
diff:  r(mu_1) - r(mu_2)

------------------------------------------------------------------------------
|    Observed               Bootstrap
|       Coef.       Bias    Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
diff |         3.7   .0643175   1.8568749    .0605921   7.339408   (N)
|                                      -.0248661   7.282081   (P)
|                                       -.178198   7.126667  (BC)
------------------------------------------------------------------------------
(N)    normal confidence interval
(P)    percentile confidence interval
(BC)   bias-corrected confidence interval