- Markov chain Monte Carlo, again
- Gibbs sampling
- Output analysis for MCMC
- Convergence diagnostics
- Examples: Capture-recapture and toy example
Count | Parameter | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|---|
Captured | \(c_i\) | 30 | 22 | 29 | 26 | 31 | 32 | 35 |
Newly Caught | \(m_i\) | 30 | 8 | 17 | 7 | 9 | 8 | 5 |
Let \(N\) be the population size, \(I\) the number of census attempts, with \(c_i\) animals captured on attempt \(i\) (\(I=7\) in our case), and \(r\) the total number of distinct animals captured (\(r = \sum_{i=1}^I m_i = 84\))
Then \[ L(N,\alpha | c,r) \propto \frac{N!}{(N-r)!} \prod_{i=1}^{I} \alpha_i ^{c_i} (1-\alpha_i) ^ {N-c_i} \]
Then we can consider the chain \[ \left(N , \alpha\right) \rightarrow \left(N' , \alpha\right) \rightarrow \left(N' , \alpha'\right) \] or \[ \left(N , \alpha\right) \rightarrow \left(N , \alpha'\right) \rightarrow \left(N' , \alpha'\right) , \] where both involve a "block" update of \(\alpha\)
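With a flat prior on \(N\) and independent Beta(1/2, 1/2) priors on each \(\alpha_i\) (the prior choices implied by the sampler below), the full conditionals work out to \[ \alpha_i \mid N, c \sim \text{Beta}\left( c_i + \tfrac{1}{2}, \, N - c_i + \tfrac{1}{2} \right) \quad \text{and} \quad N - r \mid \alpha, c \sim \text{NB}\left( r + 1, \, 1 - \prod_{i=1}^{I} (1 - \alpha_i) \right) , \] so both coordinates can be updated with standard draws.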
First, we enter the data in R
```r
captured <- c(30, 22, 29, 26, 31, 32, 35)
new.captures <- c(30, 8, 17, 7, 9, 8, 5)
total.r <- sum(new.captures)
```
The following R code implements the Gibbs sampler
```r
gibbs.chain <- function(n, N.start = 94, alpha.start = rep(.5, 7)) {
  output <- matrix(0, nrow = n, ncol = 8)
  for (i in 1:n) {
    # update N | alpha: (N - r) ~ NB(r + 1, 1 - prod(1 - alpha));
    # note N is drawn first given alpha, so N.start itself is never used
    neg.binom.prob <- 1 - prod(1 - alpha.start)
    N.new <- rnbinom(1, total.r + 1, neg.binom.prob) + total.r
    # update alpha | N: alpha_i ~ Beta(c_i + 1/2, N - c_i + 1/2)
    beta1 <- captured + .5
    beta2 <- N.new - captured + .5
    alpha.new <- rbeta(7, beta1, beta2)
    output[i, ] <- c(N.new, alpha.new)
    N.start <- N.new
    alpha.start <- alpha.new
  }
  return(output)
}
```
How can we tell if the chain is mixing well?
We start with a short preliminary run and inspect trace plots and autocorrelation plots
```r
trial <- gibbs.chain(1000)
plot.ts(trial[, 1], main = "Trace Plot for N")
for (i in 1:7) {
  plot.ts(trial[, (i + 1)], main = paste("Alpha", i))
}
acf(trial[, 1], main = "Lag Plot for N")
for (i in 1:7) {
  acf(trial[, (i + 1)], main = paste("Lag Alpha", i))
}
```
Now for a longer simulation to estimate posterior means and a 90% Bayesian credible interval
```r
sim <- gibbs.chain(10000)
N <- sim[, 1]
alpha1 <- sim[, 2]
```
```r
par(mfrow = c(1, 2))
hist(N, freq = FALSE, main = "Estimated Marginal Posterior for N")
hist(alpha1, freq = FALSE, main = "Estimated Marginal Posterior for Alpha 1")
```
```r
par(mfrow = c(1, 1))
```
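Before turning to `mcmcse`, crude summaries can be read straight off the simulated draws; a minimal sketch (note these carry no measure of Monte Carlo error):

```r
mean(N)                      # point estimate of the posterior mean of N
quantile(N, c(0.05, 0.95))   # endpoints of a 90% credible interval for N
```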
```r
library(mcmcse)
```
```
## mcmcse: Monte Carlo Standard Errors for MCMC
## Version 1.2-1 created on 2016-03-24.
## copyright (c) 2012, James M. Flegal, University of California, Riverside
##                     John Hughes, University of Minnesota
##                     Dootika Vats, University of Minnesota
## For citation information, type citation("mcmcse").
## Type help("mcmcse-package") to get started.
```
```r
ess(N)
```

```
##       se 
## 8421.642
```
```r
ess(alpha1)
```

```
##       se 
## 8519.459
```
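How large an effective sample size is large enough? Assuming a version of `mcmcse` that includes `minESS` (as I believe 1.2-1 does), the minimum effective sample size needed for a given precision can be computed directly:

```r
# minimum ESS to estimate a p-dimensional mean with 95% confidence
# and 5% relative precision; here p = 8 (N and the 7 alphas)
minESS(p = 8, alpha = .05, eps = .05)
```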
```r
par(mfrow = c(1, 2))
estvssamp(N)
estvssamp(alpha1)
```
```r
par(mfrow = c(1, 1))
```
```r
mcse(N)
```

```
## $est
## [1] 89.5495
## 
## $se
## [1] 0.0307835
```
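By default `mcse` uses a batch means estimator of the Monte Carlo standard error; the following hand-rolled version is a minimal sketch of that estimator, for intuition only (the package handles batch size choices and edge cases more carefully):

```r
bm.se <- function(x, b = floor(sqrt(length(x)))) {
  a <- floor(length(x) / b)                    # number of batches
  batch.means <- colMeans(matrix(x[1:(a * b)], nrow = b))
  sigma2.hat <- b * var(batch.means)           # batch means variance estimate
  sqrt(sigma2.hat / (a * b))                   # standard error of the mean
}
bm.se(N)   # should be close to mcse(N)$se
```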
```r
mcse.q(N, .05)
```

```
## $est
## [1] 86
## 
## $se
## [1] 0.02930293
```

```r
mcse.q(N, .95)
```

```
## $est
## [1] 95
## 
## $se
## [1] 0.06565672
```

```r
mcse(alpha1)
```

```
## $est
## [1] 0.3374951
## 
## $se
## [1] 0.0005427247
```

```r
mcse.q(alpha1, .05)
```

```
## $est
## [1] 0.2556669
## 
## $se
## [1] 0.0009743541
```

```r
mcse.q(alpha1, .95)
```

```
## $est
## [1] 0.4221397
## 
## $se
## [1] 0.001015246
```
```r
current <- sim[10000, ]  # start from here if you need more simulations
sim <- rbind(sim, gibbs.chain(10000, N.start = current[1],
                              alpha.start = current[2:8]))
N.big <- sim[, 1]
```
```r
hist(N.big, freq = FALSE, main = "Estimated Marginal Posterior for N")
```
```r
ess(N)
```

```
##       se 
## 8421.642
```

```r
ess(N.big)
```

```
##       se 
## 14446.78
```
```r
estvssamp(N.big)
```
```r
mcse(N)
```

```
## $est
## [1] 89.5495
## 
## $se
## [1] 0.0307835
```

```r
mcse(N.big)
```

```
## $est
## [1] 89.50495
## 
## $se
## [1] 0.02318482
```

```r
mcse.q(N, .05)
```

```
## $est
## [1] 86
## 
## $se
## [1] 0.02930293
```

```r
mcse.q(N.big, .05)
```

```
## $est
## [1] 86
## 
## $se
## [1] 0.01798403
```

```r
mcse.q(N, .95)
```

```
## $est
## [1] 95
## 
## $se
## [1] 0.06565672
```

```r
mcse.q(N.big, .95)
```

```
## $est
## [1] 95
## 
## $se
## [1] 0.04391722
```
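As a rough sanity check, doubling the run length should shrink these standard errors by about \(\sqrt{2} \approx 1.41\); the observed ratios, computed from the output above, are in that ballpark:

```r
0.0307835 / 0.02318482    # se ratio for the posterior mean of N, about 1.33
0.06565672 / 0.04391722   # se ratio for the .95 quantile of N, about 1.50
```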
The Gelman and Rubin diagnostic, available in the `coda` package, is also used as a stopping criterion
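As an illustration on the capture-recapture sampler from above (the dispersed starting values here are arbitrary choices), the diagnostic can be computed as follows:

```r
library(coda)
# run a few chains from dispersed starting points and compare
# between-chain to within-chain variability for N
chains <- mcmc.list(
  mcmc(gibbs.chain(10000, N.start = 90)[, 1]),
  mcmc(gibbs.chain(10000, N.start = 150)[, 1]),
  mcmc(gibbs.chain(10000, N.start = 300)[, 1])
)
gelman.diag(chains)
```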
Stopped the simulation when
\[
\begin{aligned}
\text{BM:} &\quad t_{.975,\, a - 1} \, \frac{ \hat{\sigma}_{BM} }{ \sqrt{n} } + I(n < 400) < 0.04 \\
\text{GRD:} &\quad \hat{R}_{0.975} + I(n < 400) < 1.005
\end{aligned}
\]

- 1000 independent replications
    + Starting from \(\bar{y}\) for BM
    + Starting from draws from \(\pi\) for GRD
- Used 4 chains for GRD
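A minimal sketch of the BM half-width rule, assuming `mcmcse` for the standard error (the batch count `a` below follows `mcse`'s default batch size of roughly \(\sqrt{n}\)):

```r
# TRUE once the batch means fixed-width stopping rule is satisfied
bm.stop <- function(x, eps = 0.04) {
  n <- length(x)
  a <- floor(n / floor(sqrt(n)))             # number of batches at the default batch size
  half.width <- qt(.975, a - 1) * mcse(x)$se
  half.width + (n < 400) < eps               # I(n < 400) forces at least 400 draws
}
```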
Plots of \(\bar{\mu}_{n}\) vs. \(n\) for both stopping methods
MSE | BM | GRD |
---|---|---|
MSE for \(E(\mu | y)\) | 3.73e-05 (1.8e-06) | 0.000134 (9.2e-06) |
MSE for \(E(\lambda | y)\) | 0.000393 (1.8e-05) | 0.00165 (0.00012) |
Histograms of \(\bar{\mu}_{n}\) for both stopping methods.