Practical Exercise - EM Algorithm
The main locus for the blood type of mice is called Ag-B (B). Several alleles are associated to this locus but for some crossovers Mendel’s laws do not seem to hold. A mating AaBb x AaBb, originated a progeny, yielding
Genotype | Frequency | Probability |
---|---|---|
AABB | 11 | |
AABb | 14 | |
AAbb | 1 | |
AaBB | 10 | |
AaBb | 27 | |
Aabb | 12 | |
aaBB | 3 | |
aaBb | 13 | |
aabb | 11 |
Estimate the recombination fraction, , from these data by the EM algorithm.
Step 1
- Read the data and state ao many recombinant gametes are there for each genotype.
Click Here to see the answer
nAABB<-11 # 0 recombinant gametes
nAABb<-14 # 1 recombinant gamete
nAAbb<-1 # 2 recombinant gametes
nAaBB<-10 # 1 recombinant gamete
nAaBb<-27 # 0 or 2 recombinant gametes
nAabb<-12 # 1 recombinant gamete
naaBB<-3 # 2 recombinant gametes
naaBb<-13 # 1 recombinant gamete
naabb<-11 # 0 recombinant gametes
- Calculate , the number of individuals from 1 recombinant gametes ().
Click Here to see the answer
n1 <- nAABb + nAaBB + nAabb + naaBb
n1
- Calculate , the number of individuals from 2 recombinant gametes ().
Note that .
Click Here to see the answer
n2.star <- NULL
n2 <- nAAbb + naaBB + n2.star
- Calculate n, the total number of individuals ().
Click Here to see the answer
n <- n1 + nAAbb + naaBB + nAABB + nAaBb + naabb
n
Step 2
- Initialize .
Click Here to see the answer
r <- 0.3
Step 3 - E (Expectation)
- Create function in order to calculate the expected value for ,
: random variable representing the number of individuals from 2 recombinant gametes, among individuals.
with
then, .
Click Here to see the answer
expected <- function(r)
{
n2.star <- nAaBb*r^2/(r^2+(1-r)^2)
n2.star
}
Step 4 - M (Maximization)
- Create function \texttt{update.theta} in order to update according to:
meaning that the proportion of recombinant gametes is calculated as the total number of recombinant gametes (0, 1 or 2 for each individual) over the total number of gametes for n individuals.
Click Here to see the answer
update.theta <- function(n2.star)
{
r <- (n1+2*(nAAbb+naaBB+n2.star))/(2*n)
r
}
Step 5 - Iterative procedure
- Compute the cycle.
Click Here to see the answer
i<-0
er<-1
error<-10^(-5)
while(er>=error)
{
# Step E
n2.star<-expected(r)
# Step M
r.updated<-update.theta(n2.star)
# Stop criteria
er<-abs(r-r.updated)
i<-i+1
r<-r.updated
cat(i,r,"\n")
}
Step 6 - Print the results
Click Here to see the answer
cat("\nThe final solution, after",i,"iterations, is r* =",r,"\n")
</p></details>
### Back Back to [main page](/ABSTAT18/).