In this tutorial, we perform a hypothesis test on the “smelly armpit” dataset.

1 Smelly armpit dataset

Smelly armpits are not caused by sweat itself. The smell is caused by specific micro-organisms belonging to the group of Corynebacterium spp. that metabolise sweat. Another abundant group of bacteria is Staphylococcus spp.; these bacteria do not metabolise sweat into smelly compounds.

The CMET group at Ghent University does research on transplanting the armpit microbiome to help people with smelly armpits.

  • Proposed Therapy:
    1. Remove armpit-microbiome with antibiotics
    2. Influence armpit microbiome with microbial transplant (https://youtu.be/9RIFyqLXdVw)
  • Experiment:

    • 20 students with smelly armpits are assigned to one of two treatment groups:
      • placebo (antibiotics only)
      • transplant (antibiotics followed by a microbial transplant)
    • The microbiome is sampled 6 weeks after the treatment
    • The relative abundance of Staphylococcus spp., as a percentage of Corynebacterium spp. + Staphylococcus spp. combined, is measured via DGGE (Denaturing Gradient Gel Electrophoresis).
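
Read as a formula, the measured quantity is the share of Staphylococcus spp. in the combined signal of both genera, expressed as a percentage. The symbols S and C are notation introduced here for illustration only; they stand for the measured amounts of Staphylococcus spp. and Corynebacterium spp., respectively:

```latex
\[
  \text{rel} = 100 \times \frac{S}{S + C}
\]
```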

2 Goal

The overarching goal of this research was to assess whether the relative abundance of Staphylococcus spp. in the microbiome of the armpit is affected by transplanting the microbiome. To this end, the researchers randomized patients to two treatments: a treatment with antibiotics only and a treatment with antibiotics followed by a microbial transplant.

In this tutorial on hypothesis testing, we will use a formal statistical test to generalize the results from the sample to the population.

Load the libraries

library(tidyverse)

3 Import the dataset

ap <- read_csv("https://raw.githubusercontent.com/GTPB/PSLS20/master/data/armpit.csv")
## Parsed with column specification:
## cols(
##   trt = col_character(),
##   rel = col_double()
## )
glimpse(ap)
## Observations: 20
## Variables: 2
## $ trt <chr> "placebo", "placebo", "placebo", "placebo", "placebo", "placebo",…
## $ rel <dbl> 54.99208, 31.84466, 41.09948, 59.52064, 63.57341, 41.48649, 30.44…
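
Before plotting, it can help to tabulate a few numbers per treatment group. The sketch below uses a small toy data frame with the same two columns as `ap` (`trt` and `rel`); its values are made up for illustration and are not the armpit data. The names `toy` and `group_summary` are likewise assumptions of this example.

```r
library(dplyr)

# Toy stand-in with the same columns as `ap`.
# These values are invented for illustration; they are NOT the armpit data.
toy <- tibble(
  trt = rep(c("placebo", "transplant"), each = 4),
  rel = c(40, 50, 45, 55, 60, 70, 65, 75)
)

# Number of observations, mean and median of `rel` within each group
group_summary <- toy %>%
  group_by(trt) %>%
  summarise(
    n      = n(),
    mean   = mean(rel),
    median = median(rel)
  )

group_summary
```

With the tutorial data, the same pipeline can be applied by replacing `toy` with `ap`.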

4 Data visualization

It is always a good idea to first have a quick look at the raw data:

ap %>% ggplot(aes(x=trt,y=rel,fill=trt)) + 
  geom_boxplot(outlier.shape=NA) + 
  geom_point(position="jitter") +
  ylab("relative abundance (%)") +
  xlab("treatment group") + 
  stat_summary(fun=mean, geom="point", shape=5, size=3, color="black", fill="black")

We clearly see that, on average, the subjects who had a microbial transplant have a higher relative abundance of Staphylococcus spp. But is this difference significant?

We can test this with an unpaired, two-sample t-test. But before we can start the analysis, we must check if all assumptions to perform a t-test are met.
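
Written out formally, the two-sided test compares the population means of the two treatment groups. The symbols below are notation introduced here, not taken from the original text:

```latex
\[
  H_0: \mu_{\text{placebo}} = \mu_{\text{transplant}}
  \qquad \text{versus} \qquad
  H_1: \mu_{\text{placebo}} \neq \mu_{\text{transplant}}
\]
```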

5 Check the assumptions

  1. The observations are independent of each other (in both groups)
  2. The data (rel) must be normally distributed (in both groups)

Additionally, we must check if the variances are similar for both groups. If so, we can use a t-test with a pooled variance (see theory). If not, we must rely on the Welch t-test, which can deal with unequal variances.
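
To make the role of the pooled variance concrete, here is a minimal sketch that computes the pooled-variance t statistic step by step and compares it with `t.test(..., var.equal = TRUE)`. The helper `pooled_t()` and the toy vectors are hypothetical and only serve as illustration; the values are not the armpit data.

```r
# Pooled-variance two-sample t statistic, computed step by step.
# `x` and `y` are numeric vectors with the observations of the two groups.
pooled_t <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  # Pooled variance: group variances weighted by their degrees of freedom
  s2_pooled <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2)
  # Standard error of the difference in means
  se <- sqrt(s2_pooled * (1 / n1 + 1 / n2))
  t_stat <- (mean(x) - mean(y)) / se
  df <- n1 + n2 - 2
  # Two-sided p-value from the t distribution with n1 + n2 - 2 df
  p_value <- 2 * pt(-abs(t_stat), df)
  list(t = t_stat, df = df, p_value = p_value)
}

# Toy values, invented for illustration (NOT the armpit data)
x <- c(40, 50, 45, 55)
y <- c(60, 70, 65, 75)

by_hand  <- pooled_t(x, y)
built_in <- t.test(x, y, var.equal = TRUE)

by_hand$t
unname(built_in$statistic)
```

The Welch test differs in that it uses `sqrt(var(x) / n1 + var(y) / n2)` as the standard error together with adjusted degrees of freedom; this is what `t.test()` does by default (`var.equal = FALSE`).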

The first assumption is met, as we may assume that there are no specific patterns of correlation in our group of 20 randomly selected subjects. Note, however, that because we sampled only from the population of ‘students’, we can only generalize our findings to that population.

To check the normality assumption, we will use QQ plots.

ap %>%
  ggplot(aes(sample=rel)) +
  geom_qq() +
  geom_qq_line() +
  facet_grid(cols = vars(trt))

We can see that, in both groups, the points lie close to the quantile-quantile line (black line). As such, we see no indication that the normality assumption is violated.
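
With small samples, QQ plots of truly normal data can already look somewhat irregular. One way to calibrate the eye is to draw QQ plots for simulated normal samples of a comparable size. The sketch below assumes 10 observations per group (an assumption; the text only states 20 subjects in total), and the seed and the number of panels are arbitrary choices.

```r
library(ggplot2)

set.seed(123)    # arbitrary seed, only for reproducibility
n_obs    <- 10   # assumed number of observations per group
n_panels <- 9    # arbitrary number of simulated samples

# Each panel holds one sample drawn from a standard normal distribution
sim <- data.frame(
  panel = rep(seq_len(n_panels), each = n_obs),
  value = rnorm(n_panels * n_obs)
)

# Same QQ plot layers as used for the real data, one panel per simulated sample
qq_sim <- ggplot(sim, aes(sample = value)) +
  geom_qq() +
  geom_qq_line() +
  facet_wrap(vars(panel))

qq_sim
```

If the QQ plots of the real data deviate no more from the line than these simulated panels do, the deviations are compatible with sampling variability.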

For the third assumption, we must compare the within-group variability of both groups. We can do this visually:

ap %>%  ggplot(aes(x=trt,y=rel)) + 
  geom_boxplot(outlier.shape=NA) + 
  geom_point(position="jitter") +
  ylab("relative abundance (%)") +
  xlab("treatment group") + 
  stat_summary(fun=mean, geom="point", shape=5, size=3, color="black", fill="black")

Here we can see that the height of the boxes (the interquartile range), as well as the length of the whiskers, is approximately equal for both groups.
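
The visual impression can be complemented with numbers. The sketch below computes the standard deviation and interquartile range per group, plus the ratio of the larger to the smaller variance. It runs on a toy data frame with the same columns as `ap`; the values are made up for illustration and are not the armpit data, and the names `toy`, `spread_summary` and `var_ratio` are assumptions of this example.

```r
library(dplyr)

# Toy stand-in with the same columns as `ap`.
# These values are invented for illustration; they are NOT the armpit data.
toy <- tibble(
  trt = rep(c("placebo", "transplant"), each = 4),
  rel = c(40, 50, 45, 55, 58, 72, 65, 75)
)

# Within-group spread: standard deviation and interquartile range
spread_summary <- toy %>%
  group_by(trt) %>%
  summarise(
    sd_rel  = sd(rel),
    iqr_rel = IQR(rel)
  )

# Ratio of the larger to the smaller group variance (1 means identical spread)
var_ratio <- max(spread_summary$sd_rel)^2 / min(spread_summary$sd_rel)^2

spread_summary
var_ratio
```

With the tutorial data, replace `toy` with `ap`. A ratio close to 1 supports the use of a pooled variance.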

As all three assumptions are met, we may continue with the unpaired two-sample t-test.

6 Two-sample t-test (unpaired)

placebo_rel <- ap %>%
  filter(trt=="placebo") %>%
  pull(rel)

transplant_rel <- ap %>%
  filter(trt=="transplant") %>%
  pull(rel)

output <- t.test(placebo_rel,transplant_rel,conf.level = 0.95,var.equal = TRUE)
output
## 
##  Two Sample t-test
## 
## data:  placebo_rel and transplant_rel
## t = -5.0334, df = 18, p-value = 8.638e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -31.53191 -12.96072
## sample estimates:
## mean of x mean of y 
##  44.15496  66.40127
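
As a sanity check, the p-value and the confidence interval can be reconstructed from the other numbers in this output. The sketch below copies the group means, the t statistic and the degrees of freedom from the output above; because these inputs are rounded, the results agree with the reported values only up to rounding. The variable names are chosen here for illustration.

```r
# Values copied from the t.test() output above
mean_placebo    <- 44.15496
mean_transplant <- 66.40127
t_stat          <- -5.0334
df              <- 18

# Two-sided p-value: probability of a t statistic at least this extreme under H0
p_value <- 2 * pt(-abs(t_stat), df)

# Estimated difference in means (placebo minus transplant) and its standard error,
# recovered from t = difference / standard error
diff_means <- mean_placebo - mean_transplant
se_diff    <- diff_means / t_stat

# 95% confidence interval: estimate +/- t quantile times standard error
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se_diff

signif(p_value, 3)
round(ci, 2)
```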

7 Conclusion

We may conclude that, at the 5% significance level, the mean relative abundance of Staphylococcus spp. in subjects who received the placebo treatment is significantly lower (p = 9 × 10^{-5}) than in subjects who received a microbial transplant. The relative abundance is on average 22.25 percentage points lower (95% CI: [12.96, 31.53]) with the placebo treatment than with the transplant.
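
The numbers quoted in a conclusion like this can be extracted from the object returned by `t.test()` instead of being copied by hand. That object has class `htest` and contains, among others, the components `p.value`, `estimate` and `conf.int`. The sketch below uses toy vectors that stand in for `placebo_rel` and `transplant_rel`; the values are made up and are not the armpit data.

```r
# Toy values, invented for illustration (NOT the armpit data)
x <- c(40, 50, 45, 55)   # stands in for placebo_rel
y <- c(60, 70, 65, 75)   # stands in for transplant_rel

res <- t.test(x, y, conf.level = 0.95, var.equal = TRUE)

p_value <- res$p.value

# estimate[1] is the mean of x, estimate[2] the mean of y
diff_y_minus_x <- unname(res$estimate[2] - res$estimate[1])

# conf.int refers to mean(x) - mean(y); flip sign and order
# to obtain the interval for mean(y) - mean(x)
ci_y_minus_x <- rev(-as.numeric(res$conf.int))

round(diff_y_minus_x, 2)
round(ci_y_minus_x, 2)
```

Flipping the sign of the interval keeps the direction of the reported difference and its confidence limits consistent.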
