Practical Exercise - Gibbs Sampling
At the end of this challenge, you should be able to:
(1) understand the Gibbs Sampler mechanism,
(2) how to run it for the Multinomial-Dirichlet distributions with known x,
(3) implement the main functions in .
Step 1. Simulate the data - Multinomial Distribution
- Learn about Multinomial ditribution. For the three-dimensional case we have:
where and
- Search for the function which generates multinomially distributed random number vectors and computes multinomial probabilities.
Click Here to see the answer
?rmultinom
- Simulate 1 random vector, , following a Multinomial distribution with parameters n=1000 and . Store the simulated data in an object named .
Click Here to see the answer
theta<-c(0.2,0.3,0.5)
data<-rmultinom(1,1000,theta)
data
- Calculate the probability of observing the vector (220,350,430), that is, .
Click Here to see the answer
dmultinom(c(220,350,430),1000,theta)
Step 2. Dirichlet Distribution - Prior / Posterior Distribution
- Learn about Dirichlet ditribution. For the three-dimensional case we have:
where and ,
- Search for the function which generates Dirichlet distributed random number vectors and computes Dirichlet probabilities.
Click Here to see the answer
?rdirichlet
library(gtools)
- Simulate a random vector , from a Dirichlet distribution with hyperparameters . Store the simulated vector in and vector in .
Click Here to see the answer
a.0<-c(1,1,1)
theta.0<-rdirichlet(1,a.0)
theta.0
- Calculate the probability density for the vector (0.15,0.25,0.6), that is, .
Click Here to see the answer
ddirichlet(c(0.15,0.25,0.6),a.0) # Note: prob density function > 0
- The Dirichlet distribution is the conjugate prior of the Multinomial distribution, by achiving the following result: .
Simulate knowing x and , directly from de posterior distribution (Dirichlet).
Click Here to see the answer
data<-t(data) # need a row vector
updated.theta<-rdirichlet(1,a.0+data) # Note: prob density function > 0
updated.theta
Step 3. The Gibbs Sampler.
-
Develop a function in for the Gibbs Sampler. Consider 5000 iterations.
-
Store the updated parameters for each iteration in a matrix of order .
Click Here to see the answer
nr.iter<-5000
theta.all.iter<-matrix(0,5000,3)
a<-a.0
for(i in 1:nr.iter)
{
theta.all.iter[i,]<-rdirichlet(1,a)
a<-a+data
# cat(i,"\n")
}
tail(theta.all.iter) # last 6 rows
Step 4. Trace / Parameters Estimation
- Represent graphically the trace for each parameter along the 5000 iterations.
Click Here to see the answer
par(mfrow=c(1,3))
plot(1:nr.iter,theta.all.iter[,1],ylim=c(0,0.5),main=expression(paste("Trace for ",theta[1])),ylab=expression(theta),xlab="iteration")
plot(1:nr.iter,theta.all.iter[,2],ylim=c(0,0.5),main=expression(paste("Trace for ",theta[2])),ylab=expression(theta),xlab="iteration")
plot(1:nr.iter,theta.all.iter[,3],ylim=c(0,0.5),main=expression(paste("Trace for ",theta[3])),ylab=expression(theta),xlab="iteration")
-
Evaluate the need of setting a period of burn-in.
-
Find estimates of the parameters according to the decisions made in previous bullet.
Click Here to see the answer
theta.final<-apply(theta.all.iter[1001:5000,],2,mean) # 2: 'by column'
round(theta.final,2)
Back
Back to main page.