In this short tutorial, we perform a hypothesis test on the “cuckoo” dataset.
Cuckoo dataset
The common cuckoo does not build its own nest: it prefers to lay its eggs in another bird’s nest. It has been known since 1892 that the type of cuckoo eggs differs between locations. In a study from 1940, it was shown that cuckoos return to the same nesting area each year, and that they always pick the same bird species to be a “foster parent” for their eggs.
Over the years, this has led to the development of geographically determined subspecies of cuckoos. These subspecies have evolved in such a way that their eggs look as similar as possible to those of their foster parents.
The cuckoo dataset contains information on 120 cuckoo eggs, obtained from randomly selected “foster” nests. For these eggs, researchers have measured the length (in mm) and established the type (species) of foster parent. The type column is coded as follows:
type=1: Meadow pipit
type=2: Tree pipit
type=3: Dunnock
type=4: European robin
type=5: White wagtail
type=6: Eurasian wren
Goal
The researchers want to test if the type of foster parent has an effect on the average length of the cuckoo eggs.
In theory, they want to study this for all six species. However, a t-test can only be used to study mean differences between two groups. If we want to analyze multiple groups, there are two options.
1. We perform t-tests on all pairwise combinations of types. With n = 6 types, this means we need to perform n*(n-1)/2 = 15 t-tests.
2. We perform an ANOVA.
The second strategy is much more efficient and has a higher statistical power. We will learn about ANOVA in chapter 7.
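The count of 15 pairwise comparisons can be checked in base R. This is only a small sketch; the variable names `n_types`, `n_pairs` and `all_pairs` are ours and not part of the tutorial.

```r
# Number of foster parent types in the dataset (coded 1 to 6)
n_types <- 6

# Number of unordered pairs: n * (n - 1) / 2
n_pairs <- choose(n_types, 2)
n_pairs

# Enumerate the pairs explicitly: one column per pairwise comparison
all_pairs <- combn(n_types, 2)
ncol(all_pairs)
```

Both `n_pairs` and `ncol(all_pairs)` give the same count, once from the formula and once by listing every pair.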
In the current tutorial, we will assess a single pairwise comparison, between the European robin and the Eurasian wren. In later tutorials, we revisit this dataset and analyze all data.
Load the required libraries
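As a hedged starting point: the tidying code further down uses `%>%`, `filter()` and `mutate()`, which are provided by dplyr, and a plot of the data could be made with ggplot2. Which packages the course expects here is our assumption; loading the full tidyverse would also cover both.

```r
# Assumption: dplyr for data wrangling, ggplot2 for plotting
library(dplyr)
library(ggplot2)
```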
Import the data
Data Exploration
Data tidying
For this exercise, we only care about the European robin and the Eurasian wren. Therefore, we can remove the observations of the other types. In addition, it seems that the type column is not stored as a factor. Let’s fix this:
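# Keep only the European robin (type 4) and the Eurasian wren (type 6),
# then convert type to a factor
Cuckoo <- Cuckoo %>%
  filter(type %in% c("4", "6")) %>%
  mutate(type = as.factor(type))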
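To see what this tidying step does, here is a minimal sketch on a small simulated data frame. The object `toy` and its values are made up for illustration and are not the real cuckoo measurements; we also assume that type is stored as an integer before tidying.

```r
library(dplyr)

# Simulated stand-in for the Cuckoo data (NOT the real measurements)
set.seed(1)
toy <- data.frame(
  type   = rep(1:6, each = 4),
  length = round(rnorm(24, mean = 22, sd = 1), 2)
)

# Before tidying: type is a plain integer column
class(toy$type)

# Same two steps as in the tutorial: keep types 4 and 6, convert to factor
toy_sub <- toy %>%
  filter(type %in% c("4", "6")) %>%
  mutate(type = as.factor(type))

# After tidying: a factor with only the two levels of interest
class(toy_sub$type)
levels(toy_sub$type)
nrow(toy_sub)
```

Converting after filtering has the side effect that the factor only contains the levels 4 and 6, so the unused types do not show up as empty groups in later tables or plots.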
Data exploration
How many eggs do we have for each type?
Cuckoo %>%
count(...)
# in base R, you could use the table function
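One possible way to count observations per group, shown on simulated data so that the exercise itself stays open. The data frame `toy` and its values are hypothetical.

```r
library(dplyr)

# Simulated stand-in for the filtered Cuckoo data (NOT the real measurements)
toy <- data.frame(
  type   = factor(rep(c("4", "6"), times = c(5, 3))),
  length = c(22.1, 22.5, 23.0, 21.9, 22.7, 21.0, 21.3, 20.9)
)

# dplyr: one row per type, with the number of observations in column n
counts <- toy %>%
  count(type)
counts

# Base R alternative: a frequency table of the type column
table(toy$type)
```

Both approaches report the same group sizes; `count()` returns a data frame, which is convenient if the counts feed into further dplyr steps.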
Visualize the data
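A minimal sketch of one way to visualize egg length per type, assuming ggplot2 is available. The data are simulated with arbitrary means and standard deviations, so the picture says nothing about the real cuckoo eggs; the boxplot with overlaid points is our choice, not necessarily the plot intended by the tutorial.

```r
library(ggplot2)

# Simulated stand-in for the filtered Cuckoo data (NOT the real measurements)
set.seed(42)
toy <- data.frame(
  type   = factor(rep(c("4", "6"), each = 15)),
  length = c(rnorm(15, mean = 22.5, sd = 0.7),
             rnorm(15, mean = 21.1, sd = 0.7))
)

# Boxplot per type, with the individual eggs drawn on top
p <- ggplot(toy, aes(x = type, y = length)) +
  geom_boxplot(outlier.shape = NA) +        # outliers are already shown as points
  geom_jitter(width = 0.1, height = 0) +    # jitter horizontally only
  labs(x = "Foster parent type", y = "Egg length (mm)")

print(p)
```

Showing the raw points next to the boxplot is useful with small groups, because it makes the group sizes and any unusual observations visible.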
1. What do you observe?
2. How will you model the data?
3. Translate the research question into a null and an alternative hypothesis.
4. Which test will you use to assess the research hypothesis?
5. Formulate the assumptions of the test and assess the assumptions using diagnostic plots.
6. If all assumptions to perform the test are met, complete the analysis and formulate a proper conclusion.
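As a hedged illustration of how the last steps could look in base R, the sketch below runs on simulated data, so its output is not the answer to the exercise. We assume a two-sided two-sample t-test with equal variances; whether that variant is appropriate for the real data depends on the outcome of the assumption checks.

```r
# Simulated stand-in for the filtered Cuckoo data (NOT the real measurements)
set.seed(42)
toy <- data.frame(
  type   = factor(rep(c("4", "6"), each = 15)),
  length = c(rnorm(15, mean = 22.5, sd = 0.7),
             rnorm(15, mean = 21.1, sd = 0.7))
)

# Normality within each group: one QQ-plot per type
par(mfrow = c(1, 2))
for (lev in levels(toy$type)) {
  x <- toy$length[toy$type == lev]
  qqnorm(x, main = paste("type", lev))
  qqline(x)
}
par(mfrow = c(1, 1))

# Equal variances: compare the standard deviation of the two groups
group_sd <- tapply(toy$length, toy$type, sd)
group_sd

# H0: mean length is equal for both types; H1: the means differ (two-sided)
res <- t.test(length ~ type, data = toy, var.equal = TRUE)
res
```

If the spread clearly differs between the groups, dropping `var.equal = TRUE` gives the Welch t-test, which does not assume equal variances.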