In this tutorial, we will explore the “smelly armpit” dataset.

1 Smelly armpit dataset

Smelly armpits are not caused by sweat itself. The smell is caused by specific micro-organisms belonging to the group of Corynebacterium spp. that metabolise sweat. Another group of abundant bacteria are the Staphylococcus spp.; these bacteria do not metabolise sweat into smelly compounds.

The CMET group at Ghent University does research on transplanting the armpit microbiome to help people with smelly armpits.

  • Proposed Therapy:
    1. Remove the armpit microbiome with antibiotics
    2. Influence armpit microbiome with microbial transplant (https://youtu.be/9RIFyqLXdVw)
  • Experiment:

    • 20 students with smelly armpits are assigned to one of two treatment groups:
      • placebo (antibiotics only)
      • transplant (antibiotics followed by a microbial transplant)
    • The microbiome is sampled 6 weeks after the treatment.
    • The relative abundance of Staphylococcus spp., expressed relative to the total of Corynebacterium spp. + Staphylococcus spp. in the microbiome, is measured via DGGE (Denaturing Gradient Gel Electrophoresis).
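To make the measured outcome concrete, the short sketch below computes such a relative abundance for a single hypothetical subject. The signal values `staph` and `coryne` are made-up numbers chosen for illustration only; they are not taken from the study.

```r
# Hypothetical DGGE signal for one subject (made-up values, arbitrary units)
staph  <- 30   # Staphylococcus spp.
coryne <- 20   # Corynebacterium spp.

# Relative abundance of Staphylococcus spp. as a percentage of both groups together
rel <- 100 * staph / (staph + coryne)
rel
```

A value above 50% would mean that Staphylococcus spp. make up the larger share of the two groups.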

Load the libraries

library(tidyverse)

2 Import the dataset

ap <- ... ## fill in the code to import the dataset
glimpse(ap)

3 Goal

The overarching goal of this research was to assess whether the relative abundance of Staphylococcus spp. in the microbiome of the armpit is affected by transplanting the microbiome. To this end, the researchers randomized the patients to two treatments: a treatment with antibiotics only, and a treatment with antibiotics followed by a microbial transplant.

In the tutorial on hypothesis testing we will use a formal statistical test to generalize the results from the sample to the population. For this test to be valid we have to assess the following assumptions:

  1. The data in each treatment group is normally distributed.
  2. The data from the two treatment groups has the same variance.

A statistical analysis always starts with a data exploration to get insight into the nature and distribution of the data, and to assess the assumptions of the downstream data analysis. Mastering this data exploration step is the purpose of the current tutorial.
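One possible way to look at the two assumptions is sketched below on simulated data. This is only an illustration: the data frame `toy` is generated at random and is not the armpit dataset, the column names `trt` and `rel` are borrowed from the descriptive statistics section of this tutorial, and the choice of QQ-plots and group-wise standard deviations is our own suggestion rather than a prescribed method.

```r
library(tidyverse)

# Simulated data for illustration only (NOT the study data)
set.seed(1)
toy <- tibble(
  trt = rep(c("placebo", "transplant"), each = 10),
  rel = c(rnorm(10, mean = 50, sd = 10), rnorm(10, mean = 60, sd = 10))
)

# Assumption 1: one QQ-plot per treatment group; points that stay close to
# the reference line are compatible with a normal distribution
qq <- toy %>%
  ggplot(aes(sample = rel)) +
  geom_qq() +
  geom_qq_line() +
  facet_grid(cols = vars(trt)) +
  theme_bw()
qq

# Assumption 2: compare the spread of the two groups via their standard deviations
spread <- toy %>%
  group_by(trt) %>%
  summarize(sd = sd(rel))
spread
```

With only 10 observations per group these checks are rough; they can reveal strong deviations but cannot prove that the assumptions hold.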

4 Data visualization

A crucial first step in a data analysis is to visualize and to explore the raw data.

First, try to make a histogram of the data. Fill in the missing parts in the chunk of code below to get a good-looking visualization:

ap %>%
  ggplot(aes(x=...,fill=...)) + ## fill in the correct values for x and fill 
  geom_histogram() +
  facet_grid(rows = vars(...)) + ## fill in to put the histograms for both treatment conditions in a separate row
  theme_bw() +
  xlab("relative abundance (%)")

Based on this plot, it seems that the relative abundance is higher for subjects who received the transplant. However, given the small sample size, the histograms are not optimally informative. A better option for these data would be to show the data in a boxplot:

ap %>%
  ggplot(aes(x=...,y=...,fill=...)) +
  ... ## fill in the layer(s) needed to draw the boxplot

What do you observe?
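As a point of comparison, the sketch below draws a boxplot with the raw observations on top for a small simulated dataset. The data frame `toy` is randomly generated and is not the armpit dataset; the column names `trt` and `rel` follow the ones used in the descriptive statistics section, and showing the individual points is our own design choice for small samples.

```r
library(tidyverse)

# Simulated data for illustration only (NOT the study data)
set.seed(2)
toy <- tibble(
  trt = rep(c("placebo", "transplant"), each = 10),
  rel = c(rnorm(10, mean = 50, sd = 10), rnorm(10, mean = 60, sd = 10))
)

p <- toy %>%
  ggplot(aes(x = trt, y = rel, fill = trt)) +
  # hide the outlier symbols because every raw observation is drawn below
  geom_boxplot(outlier.shape = NA) +
  # jitter the points horizontally so that they do not overlap
  geom_point(position = position_jitter(width = 0.2)) +
  theme_bw() +
  ylab("relative abundance (%)")
p
```

Adding the individual points makes it visible how few observations each box summarizes.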

4.1 Descriptive statistics

Here, we will generate some informative descriptive statistics for the dataset.

We first summarize the data by calculating the mean, standard deviation, number of observations and standard error, and store the result in an object named apRelSum via apRelSum <-

  1. We pipe the ap data frame to the group_by function to group the data by treatment (trt): group_by(trt)
  2. We pipe the result to the summarize_at function to summarize the “rel” variable and calculate the mean, standard deviation and the number of observations
  3. We pipe the result to the mutate function to add a new variable named se to the data frame, which holds the standard error
## Use the instructions from above to generate the summary statistics
...
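The standard error of the mean is the standard deviation divided by the square root of the number of observations. The sketch below works this out for a single made-up vector `x`; the numbers are arbitrary and serve only to show the calculation that the mutate step performs per treatment group.

```r
# Made-up measurements for illustration only
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

n  <- length(x)        # number of observations
s  <- sd(x)            # sample standard deviation
se <- s / sqrt(n)      # standard error of the mean
c(mean = mean(x), sd = s, n = n, se = se)
```

Because the standard error shrinks with the square root of the sample size, quadrupling the number of observations halves it for the same standard deviation.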

This concludes the data exploration. Tomorrow, we will learn how to formally test if the observed difference is statistically significant.
