To help you get familiar with R and RStudio, we'll build a phylogenetic tree.
We will use several libraries in this course, most of which have already been installed for you. However, it's good to know how to install them yourself.
Libraries are installed with the install.packages command, for example install.packages("ape").
Most libraries are not loaded by default, so you'll need to load them yourself. Try the following command:
library(ape)
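The install and load steps can be combined so that a library is only installed when it is missing. The following is a minimal sketch, assuming a CRAN mirror is configured and the machine has internet access; it is one common pattern, not a required part of the course setup.

```r
# Install 'ape' only if it is not already available.
# Assumption: a CRAN mirror is configured and the network is reachable.
if (!requireNamespace("ape", quietly = TRUE)) {
  install.packages("ape")
}

# Load the library for this session.
library(ape)
```

Running this a second time skips the installation and simply loads the library.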
Most of the time, we will use FASTA-formatted sequence data, in which each sequence has a name line beginning with > followed by the sequence itself:
>SequenceName
CTCTTGGCTCTGTTGTCCTGTTTGACCATCCCAGCTTCCGCTTATGAAG
CATGTCACAAACGACTGCTCCAACTCGAGCATTGTCTATGAGGCAGCGG
TGCGTGCCCTGTGTTCGGGAGGGCAACACCTCCCGCTGCTGGGTATCGC
AACTCCAGTGTCCCCACCACGACGATACGGCGCCATGTCGATTTGCTCG
GCCATGTACGTGGGGGATCTCTGCGGATCTGTTTTCCTCGTCTCCCAAC
TATCAGACGGTACAGGACTGCAATTGTTCAATCTATCCTGGCCACGTAA
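To make the structure of a FASTA record concrete, here is a rough base R sketch that separates the name line from the sequence lines. The record is made up for illustration, and this is not how ape parses files internally; it only shows that the header starts with > and that wrapped sequence lines belong together.

```r
# A tiny, made-up FASTA record held as lines of text.
fasta_lines <- c(
  ">SequenceName",
  "CTCTTGGCTC",
  "TGTTGTCCTG"
)

# Header lines start with ">"; everything else is sequence.
is_header <- startsWith(fasta_lines, ">")

# Strip the ">" to recover the sequence name.
seq_name <- sub("^>", "", fasta_lines[is_header])

# Join the wrapped sequence lines into a single string.
seq_string <- paste(fasta_lines[!is_header], collapse = "")

seq_name           # "SequenceName"
nchar(seq_string)  # 20
```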
myseqs <- read.dna("ray2000.fas",format="fasta",as.matrix=TRUE)
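R uses <- to assign values to names; here, the output of read.dna is stored under the name myseqs. To see the help page for a function, put ? in front of its name:
?read.dna
To search all the help pages for a term, use ??:
??dna
The as.matrix=TRUE argument makes read.dna return the sequences as a matrix.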
Nothing seemed to happen! That is because nothing went wrong: R prints nothing when an assignment succeeds. If we want to see the data, we can just type in the name we gave it.
myseqs
## 71 DNA sequences in binary format stored in a matrix.
##
## All sequences of same length: 411
##
## Labels: AF271819 AF271820 AF271821 AF271823 AF271824 AF271822 ...
##
## Base composition:
##     a     c     g     t
## 0.177 0.305 0.256 0.262
image(myseqs)
image(myseqs[1:10,100:200])
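In the subsetting used above, rows are sequences and columns are alignment positions. The sketch below explores this on a small made-up alignment (the names toy_chars and toy_seqs and the data are hypothetical) so that it runs without the ray2000.fas file.

```r
library(ape)

# A made-up alignment of three short sequences (hypothetical data).
toy_chars <- rbind(
  s1 = c("a", "c", "g", "t", "a", "c"),
  s2 = c("a", "c", "g", "t", "t", "c"),
  s3 = c("a", "t", "g", "t", "a", "c")
)

# Convert the character matrix to ape's binary DNA format.
toy_seqs <- as.DNAbin(toy_chars)

dim(toy_seqs)        # 3 sequences, 6 sites
rownames(toy_seqs)   # "s1" "s2" "s3"

# Rows select sequences, columns select alignment positions.
toy_subset <- toy_seqs[1:2, 2:4]
dim(toy_subset)      # 2 sequences, 3 sites

# Proportions of a, c, g and t across the whole alignment.
base.freq(toy_seqs)
```

The same calls should work on myseqs, where dim(myseqs) would report the 71 sequences and 411 sites shown in the printed summary.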
myseqs.dist <- dist.dna(myseqs,model="TN93")
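The dist.dna command computes a distance for every pair of sequences. As a small illustration, the sketch below uses a made-up alignment and, for simplicity, the "raw" model (the proportion of sites that differ) rather than the TN93 model used above, so that the numbers can be checked by hand. The names toy_chars, toy_seqs and toy_dist are hypothetical.

```r
library(ape)

# Four made-up sequences of ten sites each (hypothetical data).
toy_chars <- rbind(
  s1 = c("a", "c", "g", "t", "a", "c", "g", "t", "a", "c"),
  s2 = c("a", "c", "g", "t", "a", "c", "g", "t", "a", "a"),
  s3 = c("a", "c", "g", "a", "a", "c", "c", "t", "a", "c"),
  s4 = c("t", "c", "g", "a", "a", "c", "c", "t", "a", "c")
)
toy_seqs <- as.DNAbin(toy_chars)

# "raw" gives the proportion of differing sites (swapped in for TN93 here).
toy_dist <- dist.dna(toy_seqs, model = "raw")

# View the pairwise distances as a square matrix.
as.matrix(toy_dist)

# s1 and s2 differ at 1 of 10 sites.
as.matrix(toy_dist)["s1", "s2"]  # 0.1
```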
The nj command generates a neighbour-joining tree from a distance matrix.
myseqs.njtree <- nj(myseqs.dist)
myseqs.njtree
##
## Phylogenetic tree with 71 tips and 69 internal nodes.
##
## Tip labels:
## AF271819, AF271820, AF271821, AF271823, AF271824, AF271822, ...
##
## Unrooted; includes branch lengths.
plot(myseqs.njtree,type="unrooted",cex=0.5)
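The whole workflow, from sequences to a plotted tree, can be tried end to end on a small made-up alignment. This is a sketch under stated assumptions: the data and the toy_ names are hypothetical, and the "raw" distance is used in place of TN93 because it behaves predictably on very short sequences.

```r
library(ape)

# Four made-up sequences of ten sites each (hypothetical data).
toy_chars <- rbind(
  s1 = c("a", "c", "g", "t", "a", "c", "g", "t", "a", "c"),
  s2 = c("a", "c", "g", "t", "a", "c", "g", "t", "a", "a"),
  s3 = c("a", "c", "g", "a", "a", "c", "c", "t", "a", "c"),
  s4 = c("t", "c", "g", "a", "a", "c", "c", "t", "a", "c")
)
toy_seqs <- as.DNAbin(toy_chars)

# Pairwise distances (raw proportion of differing sites, not TN93).
toy_dist <- dist.dna(toy_seqs, model = "raw")

# Neighbour-joining tree from the distance matrix.
toy_tree <- nj(toy_dist)

Ntip(toy_tree)       # 4 tips, one per sequence
Nnode(toy_tree)      # 2 internal nodes
is.rooted(toy_tree)  # FALSE: nj returns an unrooted tree

# Draw the tree in the same unrooted layout used above.
plot(toy_tree, type = "unrooted")
```

As with the 71-sequence tree above, the number of internal nodes is two fewer than the number of tips.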