%>%
palmerpenguins ggplot(aes(x = species, y = flipper_length_mm)) +
geom_violin() +
geom_boxplot(width = 0.1) +
geom_jitter(size = 1, shape = 1, width = 0.15) +
xlab('Species') +
ylab('Flipper length (mm)')
6 Analysis of Variance
An ANOVA test with two groups is equivalent to a two-sample t-test. Unlike two-sample t-tests, ANOVA can simultaneously test many groups without inflating the Type I error rate beyond \(\alpha\).
Here is an example of using ANOVA to test whether the mean flipper length is different between three different species of penguins.
6.1 Visualizing data
Since we have large sample sizes, it would be helpful to visualize the data with a violin plot (or a boxplot would work too).
6.2 Hypotheses
The null hypothesis for an ANOVA test is that the population means, \(\mu_i\), are the same for all treatments. The alternate hypothesis is that at least one population mean is different from the rest. ANOVA cannot give us information about which group’s population mean is different from the rest.
- \(H_0\): Mean flipper length is equal among all three species (\(\mu_1 = \mu_2 = \mu_3\)).
- \(H_A\): At least one species’ mean flipper length is different from the rest.
We’ll be conducting a fixed-effects ANOVA test, which uses the F-test statistic.
6.3 Checking assumptions
ANOVA makes the same assumptions as a two-sample t-test:
- Both samples represent random samples obtained from their populations.
- The variable in question is normally distributed in each population.
- The variance of the variable is the same in each population.
We can check the normality assumption using a normal quantile plot:
%>%
palmerpenguins ggplot(aes(sample = flipper_length_mm)) +
stat_qq(shape = 1) +
stat_qq_line() +
facet_grid(~ species) +
xlab('Normal quantile') +
ylab('Flipper length (mm)')
We can check the homoscedasticity (equal variances) assumption using the Bartlett’s test:
bartlett.test(flipper_length_mm ~ species, data = palmerpenguins)
Bartlett test of homogeneity of variances
data: flipper_length_mm by species
Bartlett's K-squared = 0.80702, df = 2, p-value = 0.668
Since the P-value is greater than 0.05, we fail to reject the null hypothesis of equal variances.
6.4 Performing an ANOVA test
There are two steps to performing an ANOVA test in R. First, we need to create a linear model using the lm
function. Then, we can pass that to the anova
method to generate the appropriate ANOVA results.
<- lm(flipper_length_mm ~ species, data = palmerpenguins)
palmerpenguins.lm anova(palmerpenguins.lm)
Analysis of Variance Table
Response: flipper_length_mm
Df Sum Sq Mean Sq F value Pr(>F)
species 2 50526 25262.9 567.41 < 2.2e-16 ***
Residuals 330 14693 44.5
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
\(df\) | Sum of Squares | Mean of Squares | \(F\)-ratio | \(P\) | |
---|---|---|---|---|---|
Species | 2 | 50526 | 25262.9 | 567.41 | 2.2e-16 |
Residuals | 330 | 14693 | 44.5 |
Due to the low P-value, we can reject the null hypothesis that the mean flipper length is equal. But now the question is, which group mean(s) differ?
6.5 Post-hoc Tukey
The Tukey-Kramer post-hoc test (yes, a mouthful) compares all group means to all other group means. This is preferable to doing multiple two-sample t-tests because the latter inflates the Type I error rate.
To conduct this test in R, we first need to re-do the ANOVA test using the aov
method and pass that onto TukeyHSD()
. An example is shown below:
aov(flipper_length_mm ~ species, data = palmerpenguins) %>%
TukeyHSD(conf.level = 0.95)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = flipper_length_mm ~ species, data = palmerpenguins)
$species
diff lwr upr p adj
Chinstrap-Adelie 5.72079 3.414364 8.027215 0
Gentoo-Adelie 27.13255 25.192399 29.072709 0
Gentoo-Chinstrap 21.41176 19.023644 23.799885 0