1.
The output below shows a summary of a data frame d and the results of
fitting a one-way ANOVA model to the data.
> summary(d)
      resp        grp
 Min.   : 6.847   A:10
 1st Qu.: 8.905   B:10
 Median :10.063   C:10
 Mean   :10.095
 3rd Qu.:11.000
 Max.   :13.033
> summary(aov(resp ~ grp, data=d))
            Df Sum Sq Mean Sq F value   Pr(>F)
grp          ?  61.42  30.708    ???? 1.33e-07
Residuals   27  27.52   1.019
What are the values for the missing degrees of freedom (DF=) and F-statistic
(F=)?
- DF=3 and F=30.13
- DF=2 and F=30.13
- DF=2 and F=2.23
- Don't know.
2.
Below are results from a two-way ANOVA with factors x1 and x2, and
responses collected on 100 subjects.
            Df Sum Sq Mean Sq F value   Pr(>F)
x1           1   1077    1077   4.893 0.029385 *
x2           1   3255    3255  14.788 0.000219 ***
x1:x2        1   1338    1338   6.081 0.015480 *
Residuals   94  20688     220
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
How many missing observations are present in the data frame?
- None.
- 2.
- 1.
- Don't know.
3.
From the preceding ANOVA table, how would you compute the partial
effect-size measure for x2?
- 3255/20688.
- 3255/(3255+20688).
- 3255/(3255+1338+20688).
- Don't know.
4.
Using the data set described in Question 1, but restricted to levels A
and B of factor grp, we fit a regression line to the data (20
observations). Some results are provided below:
> with(d, tapply(resp, grp, mean))
        A         B         C
10.247899 11.765453  8.270759
> lm(resp ~ grp, data=d, subset = grp != "C")

Call:
lm(formula = resp ~ grp, data = d, subset = grp != "C")

Coefficients:
(Intercept)         grpB
     10.248        ?????

What is the value of the estimate for the slope parameter?
- 11.765.
- 1.518.
- 0.759.
- Don't know.
5.
Would you expect to observe the same results (value of the test
statistic and its corresponding p-value) when using a two-tailed
Student t-test vs. a simple linear regression to assess differences
between the two groups in the preceding case?
- Yes.
- No.
- Don't know.
6.
Here are some data from an experiment in plant physiology, which record
the length, in coded units, of pea sections grown in tissue culture with
auxin present [R.R. Sokal and F.J. Rohlf, Biometry, 3rd ed., W.H. Freeman
and Company, 1995]. The purpose of the experiment was to test the effects
of various sugars on growth as measured by length (pea diameter measured
in ocular units, × 0.114 = mm). Four experimental groups, representing
three different sugars (X2G, 2% glucose; X2F, 2% fructose; X2S, 2%
sucrose) and one mixture of sugars (X1G1F, 1% glucose + 1% fructose),
were used, plus one control (C) without sugar. The null hypothesis is
that there is no added component due to treatment effects among the five
groups. The data have been altered for the purpose of the exercise.
    C X2G X2F X1G1F X2S
1  75  57  58    58  62
2  67  58  61    59  66
3  70  60  NA    58  65
4  75  59  58    61  63
5  65  62  57    57  64
6  71  60  56    NA  62
7  67  60  61    58  NA
8  67  57  60    57  NA
9  76  NA  57    57  62
10 68  61  58    59  67
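As a minimal sketch (assuming the table above is stored in a wide data
frame named peas), the conversion to the long format used in the rest of
this question could be done with stack() from base R, one of several
possibilities:

peas <- stack(peas)                    # long format: columns 'values' and 'ind'
names(peas) <- c("value", "tx")        # rename to match the question
summary(aov(value ~ tx, data = peas))  # should reproduce the table shown below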
Assuming the data frame peas has been converted to the long format,
where the explanatory variable is now tx and the response variable is
value, the ANOVA table is shown below:
            Df Sum Sq Mean Sq F value   Pr(>F)
tx           4  989.6  247.41   42.37 7.13e-14 ***
Residuals   40  233.6    5.84
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
What command can be used to find the degrees of freedom for the
residual sum of squares?
- nrow(peas)-nlevels(peas$tx)
- length(peas$value)-nlevels(peas$tx)
- sum(!is.na(peas$value))-nlevels(peas$tx)
- Don't know.
7.
What command could we use to compute a 95% confidence interval for the
Pearson correlation coefficient estimated from the following series of
observations?
x1  11 12 14 11 13 15 14 15 10 13 14 11 13  8  9
x2  12 13 14 11 13 16 15 16 11 14 15 12 14  8 10
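The two series can be entered in R as numeric vectors (the names x1 and
x2 are taken from the listing above) before trying any of the commands
below:

x1 <- c(11, 12, 14, 11, 13, 15, 14, 15, 10, 13, 14, 11, 13, 8, 9)
x2 <- c(12, 13, 14, 11, 13, 16, 15, 16, 11, 14, 15, 12, 14, 8, 10)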
- confint(cor(x1, x2))
- cor(x1, x2, conf.level=0.95)
- cor.test(x1, x2, conf.level=0.95)
- Don't know.
8.
In a study on the cognitive performance of twenty children from four
different age groups (5, 6, 7 and 8 years), we obtained the following
results from a linear regression model treating age group as a
numerical variable:
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.7521     0.7398   1.017 0.323572
age           0.5053     0.1117   4.525 0.000299

Residual standard error: 0.5554 on 17 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.5464,  Adjusted R-squared:  0.5197
F-statistic: 20.48 on 1 and 17 DF,  p-value: 0.0002991
Leaving aside the missing observation, the estimated variances are
Var(x)=1.374 and Var(y)=0.642. What is the value of the Pearson
correlation coefficient between x and y?
- 0.739
- 0.345
- 0.592
- Don't know.
9.
If we were to use an ANOVA model, treating age group as a factor, would
we get the same p-value for the F-test assessing the whole model?
- Yes.
- No.
- Don't know.
10.
Is the hypothesis of normality required to compute the slope of a
regression line by ordinary least squares?
- Yes, but only that of the x-variable.
- Yes, but only that of the y-variable.
- Yes, both the x- and y-variables should be normally distributed.
- No.
- Don't know.