## Agenda

• Markov chain Monte Carlo, again
• Gibbs sampling
• Output analysis for MCMC
• Convergence diagnostics
• Examples: Capture-recapture and toy example

## Gibbs Sampling

1. Select starting values $$x_0$$ and set $$t=0$$
2. Generate in turn (deterministic scan Gibbs sampler)
• $$x^{(1)}_{t+1} \sim f( x^{(1)} | x^{(-1)}_t)$$
• $$x^{(2)}_{t+1} \sim f( x^{(2)} | x^{(1)}_{t+1}, x^{(3)}_t, \dots, x^{(p)}_t)$$
• $$x^{(3)}_{t+1} \sim f( x^{(3)} | x^{(1)}_{t+1}, x^{(2)}_{t+1}, x^{(4)}_t, \dots, x^{(p)}_t)$$
• $$x^{(p)}_{t+1} \sim f( x^{(p)} | x^{(-p)}_{t+1})$$
3. Increment $$t$$ and go to Step 2

## Gibbs Sampling

• Common to have one or more components not available in closed form
• Then one can just use a MH sampler for those components known as a Metropolis within Gibbs or Hybrid Gibbs sampling
• Common to block'' groups of random variables

## Example: Capture-recapture

• Data from a fur seal pup capture-recapture study for $$i=7$$ census attempts
• Goal is to estimate the number of pups in a fur seal colony using a capture-recapture study
Count Parameter 1 2 3 4 5 6 7
Captured $$c_i$$ 30 22 29 26 31 32 35
Newly Caught $$m_i$$ 30 8 17 7 9 8 5

## Example: Capture-recapture

• Let $$N$$ be the population size, $$I$$ be the number of census attempts where $$c_i$$ were captured ($$I=7$$ in our case), and $$r$$ be the total number captured ($$r = \sum_{i=1}^I m_i = 84$$)

• We consider a separate unknown capture probability for each census $$(\alpha_1, \dots, \alpha_I)$$ where the animals are equally catchable''
• Then $L(N,\alpha | c,r) \propto \frac{N!}{(N-r)!} \prod_{i=1}^{I} \alpha_i ^{c_i} (1-\alpha_i) ^ {N-c_i}$

## Example: Capture-recapture

• Assume $$N$$ and $$\alpha$$ are apriori independent with $f(N) \propto 1 \mbox{ and } f(\alpha_i | \theta_1, \theta_2) \stackrel{i.i.d.}{\sim} \mbox{Beta} (\theta_1, \theta_2)$
• We use $$\theta_1 = \theta_2 = 1/2$$, which is the Jeffrey's Prior
• The resulting posterior is proper when $$I>2$$ and recommended when $$I>5$$

## Example: Capture-recapture

• Then it is easy to show the posterior is $f(N,\alpha | c,r) \propto \frac{N!}{(N-r)!} \prod_{i=1}^{I} \alpha_i ^{c_i} (1-\alpha_i) ^ {N-c_i} \prod_{i=1}^{I} \alpha_i^{-1/2} (1-\alpha_i)^{-1/2} .$
• Further, one can show \begin{aligned} N - 84 | \alpha & \sim \mbox{NB} \left( 85, 1 - \prod_{i=1}^{I} (1-\alpha_i) \right) \mbox{ and }\\ \alpha_i | N & \sim \mbox{Beta} \left( c_i + 1/2 , N -c_i + 1/2 \right) \mbox{ for all }i \end{aligned}

## Example: Capture-recapture

Then we can consider the chain $\left(N , \alpha\right) \rightarrow \left(N' , \alpha\right) \rightarrow \left(N' , \alpha'\right)$ or $\left(N , \alpha\right) \rightarrow \left(N , \alpha'\right) \rightarrow \left(N' , \alpha'\right) ,$ where both involve a block'' update of $$\alpha$$

## Example: Capture-recapture

First, we can write the data into R

captured <- c(30, 22, 29, 26, 31, 32, 35)
new.captures <- c(30, 8, 17, 7, 9, 8, 5)
total.r <- sum(new.captures)

## Example: Capture-recapture

The following R code implements the Gibbs sampler

gibbs.chain <- function(n, N.start = 94, alpha.start = rep(.5,7)) {
output <- matrix(0, nrow=n, ncol=8)
for(i in 1:n){
neg.binom.prob <- 1 - prod(1-alpha.start)
N.new <- rnbinom(1, 85, neg.binom.prob) + total.r
beta1 <- captured + .5
beta2 <- N.new - captured + .5
alpha.new <- rbeta(7, beta1, beta2)
output[i,] <- c(N.new, alpha.new)
N.start <- N.new
alpha.start <- alpha.new
}
return(output)
}

## MCMC output analysis

How can we tell if the chain is mixing well?

• Trace plots or time-series plots
• Autocorrelation plots
• Plot of estimate versus Markov chain sample size
• Effective sample size (ESS) $\text{ESS} (n) = \frac{n}{1+2 \sum_{k=1}^{\infty} \rho_{k}(g)},$ where $$\rho_{k}(g)$$ is the autocorrelation of lag $$k$$ for $$g$$
• Alternative, ESS can be written as $\text{ESS} (n) = \frac{n}{\sigma ^2 / \text{Var} g}$ where $$\sigma ^2$$ is the asymptotic variance from a Markov chain CLT

## Example: Capture-recapture

Then we consider some preliminary simulations to ensure the chain is mixing well

trial <- gibbs.chain(1000)
plot.ts(trial[,1], main = "Trace Plot for N")
for(i in 1:7){
plot.ts(trial[,(i+1)], main = paste("Alpha", i))
}

acf(trial[,1], main = "Lag Plot for N")
for(i in 1:7){
acf(trial[,(i+1)], main = paste("Lag Alpha", i))
}

## Example: Capture-recapture

Now for a more complete simulation to estimate posterior means and a 90% Bayesian credible region

sim <- gibbs.chain(10000)
N <- sim[,1]
alpha1 <- sim[,2]

## Example: Capture-recapture

par(mfrow=c(1,2))
hist(N, freq=F, main="Estimated Marginal Posterior for N")
hist(alpha1, freq=F, main ="Estimating Marginal Posterior for Alpha 1")

par(mfrow=c(1,1))

## Example: Capture-recapture

library(mcmcse)
## mcmcse: Monte Carlo Standard Errors for MCMC
## Version 1.2-1 created on 2016-03-24.
## copyright (c) 2012, James M. Flegal, University of California,Riverside
##                     John Hughes, University of Minnesota
##                     Dootika Vats, University of Minnesota
##  For citation information, type citation("mcmcse").
##  Type help("mcmcse-package") to get started.
ess(N)
##       se
## 8421.642
ess(alpha1)
##       se
## 8519.459

## Example: Capture-recapture

par(mfrow=c(1,2))
estvssamp(N)
estvssamp(alpha1)

par(mfrow=c(1,1))

## Example: Capture-recapture

mcse(N)
## $est ## [1] 89.5495 ## ##$se
## [1] 0.0307835
mcse.q(N, .05)
## $est ## [1] 86 ## ##$se
## [1] 0.02930293
mcse.q(N, .95)
## $est ## [1] 95 ## ##$se
## [1] 0.06565672

## Example: Capture-recapture

mcse(alpha1)
## $est ## [1] 0.3374951 ## ##$se
## [1] 0.0005427247
mcse.q(alpha1, .05)
## $est ## [1] 0.2556669 ## ##$se
## [1] 0.0009743541
mcse.q(alpha1, .95)
## $est ## [1] 0.4221397 ## ##$se
## [1] 0.001015246

## Example: Capture-recapture

current <- sim[10000,] # start from here is you need more simulations
sim <- rbind(sim, gibbs.chain(10000, N.start = current[1], alpha.start = current[2:8]))
N.big <- sim[,1]

## Example: Capture-recapture

hist(N.big, freq=F, main="Estimated Marginal Posterior for N")

## Example: Capture-recapture

ess(N)
##       se
## 8421.642
ess(N.big)
##       se
## 14446.78

## Example: Capture-recapture

estvssamp(N.big)