At the end of this challenge, you should be able to:

(1) understand the Gibbs Sampler mechanism,

(2) how to run it for the Multinomial-Dirichlet distributions with known x,

(3) implement the main functions in .

Step 1. Simulate the data - Multinomial Distribution

eq1

where and

  • Search for the function which generates multinomially distributed random number vectors and computes multinomial probabilities.
Click Here to see the answer

?rmultinom


  • Simulate 1 random vector, , following a Multinomial distribution with parameters n=1000 and . Store the simulated data in an object named .
Click Here to see the answer

theta<-c(0.2,0.3,0.5)
data<-rmultinom(1,1000,theta)
data


  • Calculate the probability of observing the vector (220,350,430), that is, .
Click Here to see the answer

dmultinom(c(220,350,430),1000,theta)



Step 2. Dirichlet Distribution - Prior / Posterior Distribution

where and ,

  • Search for the function which generates Dirichlet distributed random number vectors and computes Dirichlet probabilities.
Click Here to see the answer

?rdirichlet
library(gtools)


  • Simulate a random vector , from a Dirichlet distribution with hyperparameters . Store the simulated vector in and vector in .
Click Here to see the answer

a.0<-c(1,1,1)
theta.0<-rdirichlet(1,a.0)
theta.0


  • Calculate the probability density for the vector (0.15,0.25,0.6), that is, .
Click Here to see the answer

ddirichlet(c(0.15,0.25,0.6),a.0) # Note: prob density function > 0


  • The Dirichlet distribution is the conjugate prior of the Multinomial distribution, by achiving the following result: .

Simulate knowing x and , directly from de posterior distribution (Dirichlet).

Click Here to see the answer

data<-t(data) # need a row vector
updated.theta<-rdirichlet(1,a.0+data) # Note: prob density function > 0
updated.theta



Step 3. The Gibbs Sampler.

  • Develop a function in for the Gibbs Sampler. Consider 5000 iterations.

  • Store the updated parameters for each iteration in a matrix of order .

Click Here to see the answer

nr.iter<-5000
theta.all.iter<-matrix(0,5000,3)

a<-a.0
for(i in 1:nr.iter)
{
  theta.all.iter[i,]<-rdirichlet(1,a)
  a<-a+data
  # cat(i,"\n")
}
tail(theta.all.iter)  # last 6 rows



Step 4. Trace / Parameters Estimation

  • Represent graphically the trace for each parameter along the 5000 iterations.
Click Here to see the answer

par(mfrow=c(1,3))

plot(1:nr.iter,theta.all.iter[,1],ylim=c(0,0.5),main=expression(paste("Trace for ",theta[1])),ylab=expression(theta),xlab="iteration")
plot(1:nr.iter,theta.all.iter[,2],ylim=c(0,0.5),main=expression(paste("Trace for ",theta[2])),ylab=expression(theta),xlab="iteration")
plot(1:nr.iter,theta.all.iter[,3],ylim=c(0,0.5),main=expression(paste("Trace for ",theta[3])),ylab=expression(theta),xlab="iteration")


  • Evaluate the need of setting a period of burn-in.

  • Find estimates of the parameters according to the decisions made in previous bullet.

Click Here to see the answer

theta.final<-apply(theta.all.iter[1001:5000,],2,mean) # 2: 'by column'
round(theta.final,2)


Back

Back to main page.