On the third day of the “Practical Statistics for the Life Sciences” course, we will have exercises on linear regression, based on different datasets:

  • The puromycin dataset
  • The poison dataset
  • Armpit example

For your reference, you can find the code for the brca case study from the lecture for chapter 6:

  • Breast cancer example

1 The puromycin dataset

Data on the velocity of an enzymatic reaction were obtained by Treloar (1974). The number of counts per minute of radioactive product from the reaction was measured as a function of substrate concentration in parts per million (ppm) and from these counts the initial rate (or velocity) of the reaction was calculated (counts/min/min). The experiment was conducted once with the enzyme treated with Puromycin, and once with the enzyme untreated.

The goal is to assess whether there is an association between the substrate concentration and the reaction rate for the treated enzyme.

The aims of this exercise are to

  1. Independently perform a linear regression analysis on a new case study
  2. Critically assess the assumptions for linear regression
  3. Grasp how an appropriate transformation can remediate invalid assumptions
  4. Get more proficient in formulating the conclusion of a statistical analysis in terms of the research question

  • Exercise: Exercise1

  • Data path: Not required

  • Solution: Solution1
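The workflow behind these aims (fit a line, inspect the residuals, try a transformation, refit) can be sketched in a few lines. The snippet below is a minimal illustration in plain Python, not the course solution: the helper names `fit_line` and `residuals` are hypothetical, the numbers are made-up toy values that merely level off like an enzyme rate (they are not the Treloar measurements), and the choice of a log10 transformation of the concentration is an assumption made for the sake of the example.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return b0, b1, 1 - ss_res / ss_tot


def residuals(x, y, b0, b1):
    """Observed minus fitted values for the line y = b0 + b1 * x."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Made-up toy data (NOT the Treloar measurements): the rate levels off
# as the concentration increases.
conc = [0.02, 0.05, 0.1, 0.2, 0.5, 1.0]
rate = [50.0, 90.0, 120.0, 150.0, 190.0, 210.0]

# Fit on the original scale and inspect the residuals.
b0_raw, b1_raw, r2_raw = fit_line(conc, rate)
res_raw = residuals(conc, rate, b0_raw, b1_raw)

# Refit after log-transforming the predictor (an assumed transformation).
log_conc = [math.log10(c) for c in conc]
b0_log, b1_log, r2_log = fit_line(log_conc, rate)
res_log = residuals(log_conc, rate, b0_log, b1_log)

print("raw scale: slope", b1_raw, "R^2", r2_raw)
print("log scale: slope", b1_log, "R^2", r2_log)
print("residuals, raw scale:", res_raw)
print("residuals, log scale:", res_log)
```

A systematic run of signs in the residuals of the first fit would point to a violation of the linearity assumption. Note that after the transformation the slope refers to a change per unit of log10 concentration, so the conclusion has to be phrased on that scale.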

2 The poison dataset

In this experiment, 96 fish (dojofish, goldfish and zebrafish) were placed separately in a tank with two litres of water and a certain dose (in mg) of the poison EI-43,064. The resistance of the fish against the poison was measured as the number of minutes the fish survived after the poison was added (Surv_time, in minutes). Additionally, the weight of each fish was measured.

In this tutorial, we will study the association between dose and survival time by using a linear regression model.

The aims of this exercise are to sharpen your skills in

  1. Data exploration for datasets with multiple variables

  2. Assessing the assumptions of the linear model and remediating the fit using transformations

  3. Interpreting the linear model parameters in the context of the research question

  • Exercise: Exercise2

  • Data path:

    https://raw.githubusercontent.com/statOmics/PSLSData/main/poison.csv

  • Solution: Solution2
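As a worked illustration of the third aim, suppose (as an assumption; the exercise itself may lead to a different choice) that a log2 transformation of the survival time turns out to be needed to meet the model assumptions. The slope then has a multiplicative interpretation on the original scale. Here `dose` is a placeholder label for the dose variable, and d is an arbitrary dose value:

```latex
\begin{align*}
\log_2(\text{Surv\_time}_i) &= \beta_0 + \beta_1\,\text{dose}_i + \epsilon_i,
  \qquad \epsilon_i \sim N(0, \sigma^2) \\[4pt]
\frac{\text{geometric mean survival time at dose } d + 1}
     {\text{geometric mean survival time at dose } d}
  &= \frac{2^{\beta_0 + \beta_1 (d + 1)}}{2^{\beta_0 + \beta_1 d}}
   = 2^{\beta_1}
\end{align*}
```

Under this assumed model, a dose increase of 1 mg multiplies the geometric mean survival time by a factor of 2^β1, so a negative slope corresponds to a factor below 1, that is, a shorter survival time.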


3 Armpit example

Smelly armpits are not caused by sweat itself. The smell is caused by specific micro-organisms belonging to the group of Corynebacterium spp. that metabolise sweat. Another group of abundant bacteria are the Staphylococcus spp., which do not metabolise sweat into smelly compounds.

The CMET group at Ghent University does research on transplanting the armpit microbiome to help people with smelly armpits. To test the effect of transplanting the microbiome, they conducted an experiment on two groups of volunteers: one group was treated with a placebo, while the other had a microbiome transplant. The goal was to assess whether a microbial transplant can affect the relative abundance of Staphylococcus spp. in the microbiome of the armpit.

In the previous tutorial, we analysed the armpit data using a two-sample t-test. Add a section to the script file for the armpit example in which you model the data using a linear model. Compare the output of the linear model to the results you obtained with the two-sample t-test and formulate your conclusion based on the output of the linear model.

The aims of this exercise are to

  1. Learn how the linear regression model can also be used to compare two group means
  2. Grasp how a two-sample t-test can be recast as a test on the slope term for a dummy variable
  • Exercise: Exercise3

  • Data path:

    https://raw.githubusercontent.com/statOmics/PSLSData/main/armpit.csv

  • Solution: Solution3
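The link between the two analyses can be checked numerically. The sketch below, in plain Python with hypothetical helper names (`pooled_t`, `slope_t`) and made-up toy values rather than the armpit data, codes the group as a dummy variable (0 = placebo, 1 = transplant) and compares the slope test with the two-sample t-test. The equivalence holds for the t-test that assumes equal variances in both groups; a Welch-type test with unequal variances would give a slightly different statistic.

```python
import math


def pooled_t(group0, group1):
    """Equal-variance two-sample t statistic for mean(group1) - mean(group0)."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = sum(group0) / n0, sum(group1) / n1
    ss0 = sum((v - m0) ** 2 for v in group0)
    ss1 = sum((v - m1) ** 2 for v in group1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))


def slope_t(x, y):
    """Intercept, slope and slope t statistic for the fit y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(ss_res / (n - 2) / sxx)
    return b0, b1, b1 / se_b1


# Made-up relative abundances (NOT the armpit data).
placebo = [20.0, 35.0, 30.0, 25.0]
transplant = [50.0, 60.0, 45.0, 65.0]

# Dummy variable: 0 for the placebo group, 1 for the transplant group.
x = [0] * len(placebo) + [1] * len(transplant)
y = placebo + transplant

b0, b1, t_lm = slope_t(x, y)
t_two_sample = pooled_t(placebo, transplant)

print("intercept (mean of the placebo group):", b0)
print("slope (difference in group means):", b1)
print("t statistic from the slope:", t_lm)
print("t statistic from the two-sample t-test:", t_two_sample)
```

With this coding, the intercept estimates the mean of the reference group and the slope estimates the difference between the two group means, which is why the test on the slope reproduces the equal-variance t-test.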
