Course Description
Reference genomes are central to most bioinformatics approaches. However, using a single reference genome to guide an analysis can result in reference bias: other genomes appear more similar to the reference than they actually are. We might miss or misunderstand information about genome variability and relationships that cannot be expressed relative to a chosen reference genome. Moreover, new assembly methods are making it easier than ever to generate high-quality complete genome assemblies. To obtain a full understanding of variation between multiple whole genomes, we need a model that expresses many genomes and their mutual alignment. A general solution to these problems is to use a pangenome graph, wherein genomes are described as labeled walks through an underlying sequence graph. In this course, we will work with methods to build such graphs from whole genome assemblies, and to use the built graphs in an array of downstream applications in comparative genomics, evolution, variation analysis, sequence alignment, and phenotype association.
Target Audience
This course is oriented towards biologists and bioinformaticians. The course will be of particular interest to researchers investigating organisms without a reference genome or populations featuring high levels of genetic diversity, to researchers in comparative genomics, and to researchers who are assembling pangenomes of any species.
Detailed Program
Introduction
PGGB
Pangenome visualization
Homo sapiens pangenomes
ODGI
Saccharomyces cerevisiae pangenomes
Read mapping and variant calling
Other notes
Learning Objectives
Participants will develop an understanding of pangenome concepts, and refine this through practical experience with methods to build and interrogate pangenome graphs. We will apply these methods to difficult study questions wherein we need to understand the relationship between many genomes, or account for variability when we analyze new genomes. Participants will leave with a deep understanding of pangenome methods based on whole genome assemblies.
Instructors
Erik Garrison
Affiliation: University of Tennessee Health Science Center, Memphis, TN, US
Andrea Guarracino
Affiliation: Human Technopole, Milan, IT
Helpers
Flavia Villani
Affiliation: University of Tennessee Health Science Center, Memphis, TN, US
Njagi Mwaniki
Affiliation: University of Pisa, Pisa, IT
The source for this course webpage is on GitHub.
Web_course_template by GTPB is licensed under a Creative Commons Attribution 4.0 International License.