Course Description

Python is an object-oriented programming language that is well suited to biological data analysis. The course assumes no prior programming knowledge and introduces participants to the basic concepts of Python: calculating, organizing data, reading and writing files, program logic, and writing larger programs. All the examples and practical sessions focus on solving biological problems. In particular, the sessions will cover:

  • working with DNA and protein sequences
  • retrieving data from files and manipulating it
  • running applications, such as BLAST, locally and from a script
  • finding motifs in sequences
  • parsing common file formats (UniProt, GenBank, PDB, BLAST) with Biopython
  • ways to find and correct program errors
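
To give a flavour of the first topics in the list, here is a minimal sketch in plain Python (no Biopython required). The sequence, the motif and the function names are illustrative assumptions and are not taken from the course materials or datasets.

```python
import re

# Hypothetical example sequence, not one of the course datasets.
dna = "ATGCGTATAAAGGCTATAAATGA"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def gc_content(seq):
    """Return the fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq, motif):
    """Return the 0-based start positions of all matches of a motif.

    The lookahead (?=...) lets overlapping matches be reported too.
    """
    return [match.start() for match in re.finditer(f"(?={motif})", seq)]


print(reverse_complement(dna))
print(round(gc_content(dna), 2))
print(find_motif(dna, "TATAAA"))
```

The motif is treated as a regular expression, so a pattern such as "TATA[AT]A" would also work; a real analysis would more likely use Biopython's sequence objects, which the course introduces later.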

The course will be highly interactive and the students will continuously put theory into practice while learning.

Target Audience

This course is aimed at end users of bioinformatics databases and tools who need to manage large files and/or large numbers of files, and who want to develop hands-on skills for analysing them independently by writing their own scripts or adapting somebody else’s.

Course Documentation

All the datasets used in this training course are available throughout the documentation.

Day 1

Intro | Python Shell | Python Programs | Structures and Modules

Day 2

Flipped Lesson: Homework Loops | In class assessment

Repeating Things | File Formats | Parsing I | Parsing II

Day 3

Flipped Lesson: Homework Functions | In class assessment

Functions | Best Practices | Error Handling | Data Columns | Tabular Data

Day 4

Data Searching | BLAST Pipeline | Python Libraries

Learning objectives and Course Pre-requisites


The source for this course webpage is on GitHub.

Creative Commons License
PPB18 by GTPB is licensed under a Creative Commons Attribution 4.0 International License.