Sub-continental origin of European haplotypes derived from admixed genomes


Course Description

Genetic and genomic data are increasingly used by ecologists and evolutionary biologists. It has thus become important for biologists with different levels of experience to be able to produce and analyse such data. In this course we will take a practical approach to the analysis of genetic and genomic data, but we will also provide some of the theoretical background required to understand the outputs of the software used. The course mixes lectures, where important notions are introduced, with practicals, where freely available software is used. Although it is not the focus of the course, we will also introduce and discuss genealogical (coalescent-based) simulation methods and forward-in-time simulation methods. Altogether, this will allow us to discuss the potential and limitations of the tools available to the community.

In this five-day course we will introduce the main concepts that underlie many of the models frequently used in population genetics. We will focus on the importance of demographic history (e.g. effective population sizes and migration patterns) in shaping genetic data. We will go through the basic notions that are central to population genetics, with particular emphasis on the statistics used to measure genetic diversity and population differentiation. The course will also include a short introduction to coalescent theory, Bayesian inference in population genetics and data simulation. We will also introduce recently developed methods for analysing genomic data, such as the PSMC method of Li and Durbin, which reconstructs the demographic history of a species or population from the genome of a single individual.
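As an illustration of the kind of diversity and differentiation statistics mentioned above, here is a minimal sketch in Python. It is not taken from the course material: the function names are hypothetical, and it assumes a single biallelic locus, known allele frequencies and equally sized subpopulations.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs):
    """F_ST = (H_T - H_S) / H_T for equally sized subpopulations (assumption).

    freqs holds the frequency of one allele in each subpopulation.
    """
    # H_S: mean expected heterozygosity within subpopulations.
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    # H_T: expected heterozygosity of the pooled population.
    p_bar = sum(freqs) / len(freqs)
    h_t = expected_heterozygosity(p_bar)
    # A locus that is monomorphic overall carries no differentiation signal.
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t
```

With identical allele frequencies in all subpopulations the function returns 0 (no differentiation), and with alternative alleles fixed in two subpopulations it returns 1 (complete differentiation).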

Most of the theory will be put into practice in practical sessions, in which we will analyse real and/or simulated datasets. In these sessions, we will look at measures of genetic diversity and differentiation and use methods to infer demographic history. We will learn how to perform coalescent simulations of genetic/genomic data (mainly using Richard Hudson’s ms program). We will also show how to simulate data for PSMC analyses, which will allow users to compare the PSMC results obtained from real data with those obtained under the models they simulated. We will also look at how habitat fragmentation can be simulated using an in-house program. Some exercises will make use of R scripts (R being a freely available statistical environment). Basic knowledge of R is a prerequisite, but we will provide a short introduction to R. R is a very powerful tool for analysing the output of many population genetics programs, and it can also be used to simulate genetic data under simple demographic scenarios.
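To give a flavour of what a coalescent simulation under a simple demographic scenario involves, here is a minimal sketch in Python. It is not the ms program and not the course's own scripts. It assumes a single panmictic population of constant size, no recombination and the infinite-sites mutation model, and all function names are hypothetical. Time is measured in units of 2N generations and theta = 4Nμ.

```python
import random


def simulate_tree(n, rng):
    """Return (TMRCA, total branch length) for a sample of n lineages.

    Times are in units of 2N generations; while k lineages remain,
    the waiting time to the next coalescence is Exp(k(k-1)/2).
    """
    tmrca = 0.0
    total_length = 0.0
    for k in range(n, 1, -1):
        t = rng.expovariate(k * (k - 1) / 2.0)
        tmrca += t
        total_length += k * t  # k branches each grow by t
    return tmrca, total_length


def poisson(rng, lam):
    """Draw a Poisson(lam) count by counting unit-rate arrivals in [0, lam)."""
    count, t = 0, rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_segregating_sites(n, theta, rng):
    """Number of segregating sites: Poisson with mean (theta / 2) * tree length."""
    _, total_length = simulate_tree(n, rng)
    return poisson(rng, theta / 2.0 * total_length)


if __name__ == "__main__":
    rng = random.Random(1)
    counts = [simulate_segregating_sites(10, 5.0, rng) for _ in range(1000)]
    print("mean number of segregating sites:", sum(counts) / len(counts))
```

Under these assumptions, the average number of segregating sites over many replicates should approach theta multiplied by the sum of 1/i for i from 1 to n-1 (Watterson's expectation), which gives a simple check on the simulation.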


Detailed Program

Note - All the datasets and documentation used for this training course are available via the button below. You need to unzip the file and follow the instructions in the documentation.

Download PGDH18 Datasets and Docs (File Size: 660.0 MB)


Instructors


The source for this course webpage is on GitHub.


Creative Commons License
PGDH18 by GTPB is licensed under a Creative Commons Attribution 4.0 International License