Statistical inference tutorial

On the second day of the “Practical Statistics for the Life Sciences (2020)” course, we will have three tutorials on data exploration, based on different datasets:

1) The captopril dataset:

The captopril dataset holds information on a small experiment with 15 patients that have increased blood pressure values. More specifically, for each patient we will have four values; one value for systolic blood pressure and one for diasystolyic, both before and after treating the patient with a drug named captopril.

In this tutorial on data exploration, you will learn how to deal with data from paired experimental designs.

  • Exercise: Exercise_1
  • Data: “https://raw.githubusercontent.com/GTPB/PSLS20/master/data/captopril.txt”
  • Solution: Solution_1

2) The armpit dataset:

Smelly armpits are not caused by sweat, itself. The smell is caused by specific micro-organisms belonging to the group of Corynebacterium spp. that metabolise sweat. Another group of abundant bacteria are the Staphylococcus spp., which do not metabolise sweat in smelly compounds.

The CMET-groep at Ghent University does research on transplanting the armpit microbiome to help people with smelly armpits. To test the effect of transplanting the microbiome, they conducted an experiment on two groups of volunteers: one group was treated with a placebo, while the other had a microbiome transplant.

In this tutorial, you will use your acquired skills on hypothesis testing to test whether or not the microbial transplant has helped in decreasing the (average) relative abundance Staphylococcus spp. in the microbiome of the armpit.

  • Exercise: Exercise_2
  • Data: “https://raw.githubusercontent.com/GTPB/PSLS20/master/data/armpit.csv”
  • Solution: Solution_2

3) The Shrimps dataset:

PCBs are often present in coolants, and are know to accumulate easily in the adipose tissue of shrimps. In this experiment, two groups of 18 samples (each 100 grams) of shrimps each were cultivated in different conditions, one control condition and one condition where the medium was polluted with PCBs.

The research question is; is there an effect of the growth condition on the PCB concentration in the adipose tissue of shrimps?

  • Exercise: Exercise_3
  • Data: “https://raw.githubusercontent.com/GTPB/PSLS20/master/data/shrimps.txt”
  • Solution: Solution_3

4) The Cuckoo dataset:

The common cuckoo does not build its own nest: it prefers to lay its eggs in another birds’ nest. It is known, since 1892, that the type of cuckoo bird eggs are different between different locations. In a study from 1940, it was shown that cuckoos return to the same nesting area each year, and that they always pick the same bird species to be a “foster parent” for their eggs.

Over the years, this has lead to the development of geographically determined subspecies of cuckoos. These subspecies have evolved in such a way that their eggs look as similar as possible as those of their foster parents.

The cuckoo dataset contains information on 120 Cuckoo eggs, obtained from randomly selected “foster” nests. The researchers want to test if the type of foster parent has an effect on the average length of the cuckoo eggs.

  • Exercise: Exercise_4
  • Data: “https://raw.githubusercontent.com/GTPB/PSLS20/master/data/Cuckoo.txt”
  • solution_4