Agenda

  • Diasorin Example Continued
  • Inference for Rates
  • Two-Sample Poisson Exercise

Diasorin Example

  • Renal osteodystrophy is a bone disease that occurs when the kidneys fail to maintain proper levels of calcium and phosphorus in the blood. Monitoring patients with loss of kidney function for lower-than-normal bone turnover aids in managing the disease. A commercially available diagnostic assay, Diasorin, claims to be able to distinguish patients with low bone turnover from those with normal bone turnover.
  • A cross-section of 34 kidney patients from the bone registry at the University of Kentucky was identified as low or normal turnover by other means and then given the commercial assay to determine whether it could correctly classify them. Boxplots of the raw data show marked skewness, making a normal sampling model untenable, but boxplots and quantile plots of the log-transformed data seem reasonably normal.

Diasorin Example

set.seed(1)
n1 <- 19
n2 <- 15
low <- c(91, 46,  95,  60, 33, 410, 105, 43, 189,1097, 54,178, 114, 137, 233, 101, 25,70,357)
normal <- c(370, 267, 99,157, 75,1281, 48, 298, 268, 62,804,430,171,694,404)
diasorin <- c(low,normal)
group <- c(rep(1,n1),rep(2,n2))
group1 <- factor(group, 1:2, c('Low','Normal')) 

Priors

  • We elicit independence priors for each population.
  • For the low bone turnover group \(\mu_L = \mu_1 \sim N(4.87,0.00288)\).
  • For the normal bone turnover group \(\mu_N = \mu_2 \sim N(5.39,0.00280)\). (In both cases the second parameter is a precision, matching its use in the full conditionals below.)
  • We consider gamma priors with modes \(\tau_{0,1} = 37.60\) and \(\tau_{0,2} = 46.53\).
  • Consider \(d = 0.001\) so \[ \tau_1 \sim \text{Gamma}(1.0376,0.001) \text{ and } \tau_2 \sim \text{Gamma}(1.04653,0.001). \]
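  • As an arithmetic check (ours, not from the text): a Gamma(\(a\),\(d\)) density with \(a > 1\) has mode \((a-1)/d\), so fixing \(d\) and matching the elicited modes determines the shape parameters.

```r
# Gamma(a, d) has mode (a - 1)/d for a > 1; fixing d = 0.001 and matching
# the elicited modes 37.60 and 46.53 recovers the shapes used above
d <- 0.001
c(1 + 37.60 * d, 1 + 46.53 * d)  # 1.0376 and 1.04653
```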

Posterior Sampling

  • Use the Gibbs sampler to obtain correlated samples from the posterior.
  • First notice the posterior independence \[ p(\mu_1, \tau_1, \mu_2, \tau_2 | y_1, y_2) = p(\mu_1, \tau_1|y_1) p(\mu_2, \tau_2|y_2). \]
  • Set \[ \hat{\mu} = \hat{\mu} (\tau) = \frac{n\tau}{n\tau + b} \bar{y} + \frac{b}{n\tau + b} a. \]

Posterior Sampling

  • Recall the one sample full conditional distributions \[ \mu | \tau, y \sim N\left( \hat{\mu}, \frac{1}{n\tau + b} \right) \] and \[ \tau | \mu, y \sim \text{Gamma} \left( c+n/2, d + \left\{ (n-1)s^2 + n (\bar{y} - \mu)^2 \right\} /2 \right), \] or \[ \tau | \mu, y \sim \text{Gamma} \left( c+n/2, d + \sum_{i=1}^n (y_i - \mu)^2 /2 \right). \]

Posterior Sampling

  • The Gibbs sampler for one sample is as follows.
gibbs <- function(reps, mu, tau, y.bar, s2, n, a, b, c, d){
  output <- matrix(0, nrow=reps, ncol=2)
  a.gamma <- c + n/2
  for(i in 1:reps){
    # full conditional for mu: N(mu.hat, 1/(n*tau + b))
    mu.hat <- n * tau / (n * tau + b) * y.bar + b / (n * tau + b) * a
    sd.norm <- 1/sqrt(n*tau + b)
    mu <- rnorm(1, mu.hat, sd.norm)
    # full conditional for tau: note the factor of n on (y.bar - mu)^2
    b.gamma <- d + ((n-1)*s2 + n*(y.bar - mu)^2)/2
    tau <- rgamma(1, a.gamma, b.gamma)
    output[i,] <- c(mu, tau)
  }
  return(output)
}

Posterior Sampling

  • Simulate for 50,000 iterations.
y.bar.l <- mean(log(low))
s2.l <- var(log(low))
y.bar.n <- mean(log(normal))
s2.n <- var(log(normal))

sample.l <- gibbs(50000, mu=5, tau=1, y.bar = y.bar.l, s2 = s2.l, n = n1, 
                  a=4.87, b=0.00288, c=1.0376, d=0.001)
sample.n <- gibbs(50000, mu=5, tau=1, y.bar = y.bar.n, s2 = s2.n, n = n2, 
                  a=5.39, b=0.00280, c=1.04653, d=0.001)

Posterior Sampling

mu1 <- sample.l[,1]
tau1 <- sample.l[,2]
mu2 <- sample.n[,1]
tau2 <- sample.n[,2]
mu.diff <- mu2-mu1
tau.ratio <- tau2/tau1
post.sum <- function(vec){
  return(c(mean(vec), sd(vec), quantile(vec, probs=c(0.025, .5, .975))))
}
sum.table <- rbind(post.sum(mu1), post.sum(mu2), post.sum(mu.diff), post.sum(tau1), 
                   post.sum(tau2), post.sum(tau.ratio))
rownames(sum.table) <- c("mu1", "mu2", "mu2-mu1", "tau1", "tau2", "tau2/tau1")
colnames(sum.table) <- c("mean", "sd", "2.5%", "50%", "97.5%")

Posterior Sampling

  • A common summary of results for two normal samples, computed with our informative prior.
  • The mean and median of the normal group are larger than those of the low group.
round(sum.table, 3)
##            mean    sd  2.5%   50% 97.5%
## mu1       4.705 0.210 4.566 4.705 5.117
## mu2       5.489 0.241 5.332 5.489 5.968
## mu2-mu1   0.784 0.319 0.572 0.784 1.415
## tau1      1.316 0.406 1.024 1.274 2.227
## tau2      1.295 0.443 0.976 1.246 2.299
## tau2/tau1 1.088 0.543 0.711 0.975 2.439

Posterior Density

plot(density(mu1,n=1500), type='l', xlab="", main="", ylab="", 
     xlim=c(4, 6.2), ylim=c(0,2), lwd=3)
lines(density(mu2,n=1500),lty=2,lwd=3)
text(4.2,1,"Low",lwd=2,cex=1.2)
text(6,1,"Normal",lwd=2,cex=1.2)

Posterior Density

Predictive Probabilities

lowf <- rnorm(50000, mu1, 1/sqrt(tau1))
normalf <- rnorm(50000, mu2, 1/sqrt(tau2))
plot(density(lowf,n=1500), type='l', xlab="", main="", ylab="", 
     xlim=c(2,8), ylim=c(0,0.5), lwd=3)
lines(density(normalf,n=1500),lty=2,lwd=3)
mtext("log (Diasorin Score)",line=3,side=1,cex=1.5)
mtext("Predictive Density",line=2.5,side=2,cex=1.5)
text(3.2,0.3,"Low",lwd=2,cex=1.2)
text(7,.3,"Normal",lwd=2,cex=1.2)

Predictive Probabilities

  • Diasorin data: Predictive densities on log scale.

Predictive Probabilities

  • The mean and median of the normal group are larger than those of the low group; indeed \(P(\mu_2 > \mu_1 \mid y) \approx 0.99\).
mean(as.numeric(mu2>mu1))
## [1] 0.9914
  • The predictive probability that a new outcome from the normal group exceeds one from the low group is only about 0.72.
mean(as.numeric(normalf>lowf))
## [1] 0.72214

Predictive Probabilities

  • To address allocation issues more carefully, we used 5.1 as a cutoff to decide who should be assigned to the low group and who should be assigned to the normal group based on their log Diasorin scores.
mean(as.numeric(normalf>5.1))
## [1] 0.66176
  • The predictive probability that someone from the normal group scores above this value is about 66%.
mean(as.numeric(lowf<5.1))
## [1] 0.66744
  • The predictive probability that someone from the low group scores below 5.1 is also about 66%. So with this cutoff value, whether you are normal or low, there is roughly a 34% chance of being misclassified.
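  • A natural follow-up is to search for the cutoff that minimizes the total misclassification probability. A minimal sketch with hypothetical stand-in predictive draws (`lowf.sim` and `normalf.sim` are placeholders; in practice the `lowf` and `normalf` vectors above would be used):

```r
# Grid search for the cutoff minimizing the sum of the two misclassification
# probabilities; the draws here are illustrative stand-ins, not the real ones
set.seed(1)
lowf.sim    <- rnorm(50000, 4.7, 0.9)  # stand-in for lowf
normalf.sim <- rnorm(50000, 5.5, 0.9)  # stand-in for normalf
cuts <- seq(4, 6.5, by = 0.01)
miss <- sapply(cuts, function(k) mean(lowf.sim > k) + mean(normalf.sim < k))
cuts[which.min(miss)]  # for these symmetric stand-ins, near the midpoint 5.1
```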

Original Scale

  • We commonly need to transform data before a normal analysis, but we then want to draw conclusions on the original scale.
low.act <- exp(lowf)
normal.act <- exp(normalf)
plot(density(low.act,n=2500,from=0), type='l', xlab="", main="", ylab="", 
     xlim=c(0,1000), ylim=c(0,0.006), lwd=3)
lines(density(normal.act,n=2500,from=0),lty=2,lwd=3)
lines(c(0,0),c(0,0.000593),lwd=3,lty=1)
mtext("Diasorin Score",line=3,side=1,cex=1.5)
mtext("Predictive Density",line=2.5,side=2,cex=1.5)
text(150,0.0055,"Low",lwd=2,cex=1.2)
text(380,0.0018,"Normal",lwd=2,cex=1.2)

Original Scale

  • Diasorin data: Predictive densities on original scale.

Original Scale

  • Predictive distributions are so skewed that means and standard deviations provide little useful information, so we do not report them.
  • Focus on medians and probability intervals.
rbind(quantile(exp(mu1), probs=c(0.05, .5, .95)), quantile(exp(mu2), probs=c(0.05, .5, .95)))
##             5%      50%      95%
## [1,]  78.32559 110.5186 155.4947
## [2,] 162.87723 242.0655 359.1196
  • Posterior median of Diasorin value among low bone turnover patients was 110 with 90% PI (78, 155).
  • Posterior median of Diasorin value among normal bone turnover patients was 242 with 90% PI (163, 359).

Sensitivity Analysis

  • Could (or should) perform a sensitivity analysis.
  • Set \(d = 0.01\) in the priors for the \(\tau\)s, but there is little effect on the estimates in the informative analysis.
  • Could also consider priors with very large variances for the \(\mu\)s, but the results do not change dramatically.
  • Activity: Modify my code to do your own sensitivity analysis. What kinds of priors lead to noticeable differences?

Inference for Rates

  • Poisson distribution for modeling one and two samples.
  • Used to model counts, e.g. (i) the number of phone calls in a day, (ii) the number of raisins in a cookie, (iii) the number of dry oil wells in Texas, and (iv) the number of blips on a Geiger counter within 10 minutes.
  • Say a random variable \(y\) has a Poisson(\(\theta\)) distribution provided \[ f(y|\theta) = \frac{\theta^y e^{-\theta}}{y!}, \; y = 0, 1, \dots \]
  • Recall the mean and variance are both \(\theta\).
  • May also need to perform inference for count data over different time intervals, areas, pages, and so on. Need to reparameterize the distribution to adjust for 'size', say \(M_i\). Then we may consider \[ y_i | \theta \sim \text{Poisson}(\theta M_i). \]
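  • Under a Gamma(\(a\),\(b\)) prior, this size-adjusted model remains conjugate, with the sizes accumulating in the rate parameter (a standard result, noted here for completeness): \[ p(\theta | y) \propto \prod_{i=1}^n (\theta M_i)^{y_i} e^{-\theta M_i} \, \theta^{a-1} e^{-b\theta} \propto \theta^{\sum_i y_i + a - 1} e^{-(b + \sum_i M_i) \theta}, \] so \( \theta | y \sim \text{Gamma}\left(a + \sum_i y_i, \, b + \sum_i M_i\right) \).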

Armadillo Hunting: Poisson Data

  • The reading from Chapter 1 introduced the Ache tribe of Paraguay. Recall that armadillos comprise the vast majority of food calories consumed by the Ache, and it is of interest to quantify the typical number of armadillos killed in a day.
  • Data involve observations on \(n = 38\) Ache men. Let \(y_i\) be the number of armadillos killed by the \(i\)th man on a given day.
  • Poisson distribution provides a natural model for the number of events occurring haphazardly over a given amount of time, so we assume \[ y_1, \dots, y_n | \theta \stackrel{iid}{\sim} \text{Poisson}(\theta). \]
  • Refer to \(\theta\) as the kill rate.
  • Gamma prior distribution conveniently provides a model for prior information on the mean daily number of kills.

One-Sample Poisson Data

  • Consider a sample of counts \(y = (y_1, \dots, y_n)′\) modeled as \[ y_1, \dots, y_n | \theta \stackrel{iid}{\sim} \text{Poisson}(\theta). \]
  • The likelihood is then \[ L(\theta | y) \propto \prod_{i=1}^n \theta^{y_i} e^{-\theta} = \theta^{n \bar{y}} e^{-n \theta}. \]
  • Conjugate prior for \(\theta\) is Gamma(\(a\),\(b\)) with density \[ p(\theta) \propto \theta^{a-1} e^{-b\theta}. \]

One-Sample Poisson Data

  • Posterior density becomes \[ \begin{aligned} p(\theta | y) & \propto L(\theta | y) p(\theta) \\ & = \theta^{n \bar{y}} e^{-n \theta} \theta^{a-1} e^{-b\theta} \\ & = \theta^{n \bar{y} + a -1} e^{-(n+b) \theta}, \end{aligned} \] so \[ \theta | y \sim \text{Gamma} (n \bar{y} + a, n + b ). \]
  • The gamma prior is a DAP having \(b\) prior observations with a total of \(a\) counts.
  • As with many DAPs, the posterior mean can be written as a weighted average of the sample and prior means: \[ \frac{n \bar{y} + a}{n + b} = \frac{n}{n+b} \, \bar{y} + \frac{b}{n+b} \, \frac{a}{b}. \]
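  • A quick numerical check of the weighted-average form, with toy numbers of our own choosing:

```r
# Toy check: the Gamma posterior mean (sum(y) + a)/(n + b) equals the
# weighted average of the sample mean and the prior mean a/b
a <- 2; b <- 4
y <- c(0, 1, 0, 2, 1); n <- length(y)
post.mean <- (sum(y) + a) / (n + b)
weighted  <- n/(n + b) * mean(y) + b/(n + b) * (a/b)
all.equal(post.mean, weighted)  # TRUE
```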

Armadillo Hunting: Poisson Data

  • Let \(y_i\) be the number of armadillos killed by the \(i\)th man on a given day. Assume \[ y_1, \dots, y_n | \theta \stackrel{iid}{\sim} \text{Poisson}(\theta). \]
  • Eliciting the prior \[ \theta \sim \text{Gamma} (1.11, 1.61). \]
  • The data give \(n=38\) and \(\sum_{i=1}^{38} y_i = 10\), so the posterior is \[ \theta | y \sim \text{Gamma} (11.11, 39.61). \]
  • We know the exact posterior distribution, but we need computer routines to find probability intervals; alternatively, we could simulate from the posterior.
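  • For example, exact quantiles come from `qgamma`, and Monte Carlo simulation gives essentially the same answer (a sketch; these intervals are not reported in the text):

```r
# Exact median and 95% equal-tailed interval for theta | y ~ Gamma(11.11, 39.61)
qgamma(c(0.025, 0.5, 0.975), shape = 11.11, rate = 39.61)
# Monte Carlo version of the same summaries
set.seed(1)
theta <- rgamma(100000, shape = 11.11, rate = 39.61)
quantile(theta, probs = c(0.025, 0.5, 0.975))
```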

Armadillo Hunting: Poisson Data

  • Prior (dashed) and posterior (solid) distributions for kill rate \(\theta\).

Informative Priors

  • Typically construct an informative prior by specifying a best guess and a percentile.
  • For example, if we were modeling the number of phone calls received each day by a call center, \(\theta\) is the mean number received.
  • On any given day, there is variability about this mean. My best guess is that the mean is 100. I have a lot of uncertainty about the best guess of 100, but I am 95% sure that the average number of phone calls per day is fewer than 300. We want to specify a distribution that satisfies these constraints.
  • We can equate these numbers to a mean or median and a tail-probability statement.
  • If we assume the prior is a gamma distribution, then we can find the corresponding prior parameter values.
  • An expert on Ache hunting practices believes that Ache men typically kill an armadillo every other day, and thus provides a best guess for \(\theta\) of 0.5 armadillos per day, taken to be the median of the prior distribution. The expert is also 95% sure that the kill rate is no greater than 2 armadillos per day. This leads to the example prior \[ \theta \sim \text{Gamma} (1.11, 1.61). \]
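  • One way to reproduce such an elicitation numerically (our sketch, not the book's routine): solve for the gamma parameters whose median is 0.5 and whose 95th percentile is 2 by minimizing squared quantile errors.

```r
# Match median 0.5 and 95th percentile 2; optimize on the log scale so the
# shape and rate stay positive (illustrative elicitation sketch)
obj <- function(p) {
  a <- exp(p[1]); b <- exp(p[2])
  (qgamma(0.50, a, b) - 0.5)^2 + (qgamma(0.95, a, b) - 2)^2
}
fit <- optim(c(0, 0), obj)  # Nelder-Mead from (a, b) = (1, 1)
round(exp(fit$par), 2)      # close to the elicited Gamma(1.11, 1.61)
```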

Reference Priors

  • Jeffreys' prior for this problem is \[ p(\theta) \propto 1 / \sqrt{\theta}, \] which is equivalent to an improper Gamma(\(0.5, 0\)).
  • Posterior becomes \[ \theta | y \sim \text{Gamma} (n \bar{y} + 0.5, n). \]
  • Posterior mean is \(\bar{y} + 1/(2n)\) with posterior standard deviation \(\sqrt{\bar{y} + 1/(2n)} / \sqrt{n}\).
  • If the sample size is large, the large sample normal approximation kicks in and inferences will be nearly identical to those based on large sample MLE theory.
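  • For the armadillo data (\(n = 38\), \(\sum y_i = 10\)) we can compare summaries under the two priors; with this much data they are already close (our comparison, not computed in the text):

```r
# Posterior medians and 95% intervals under the informative Gamma(1.11, 1.61)
# prior and Jeffreys' improper Gamma(0.5, 0) prior, for n = 38, sum(y) = 10
probs <- c(0.025, 0.5, 0.975)
rbind(informative = qgamma(probs, shape = 10 + 1.11, rate = 38 + 1.61),
      jeffreys    = qgamma(probs, shape = 10 + 0.5,  rate = 38))
```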

Two-Sample Poisson Data

EXERCISE 5.38. Suppose we have completely independent count data \[ y_{11}, \dots, y_{1 n_1} | \theta_1 \stackrel{iid}{\sim} \text{Poisson}(\theta_1) \] and \[ y_{21}, \dots, y_{2 n_2} | \theta_2 \stackrel{iid}{\sim} \text{Poisson}(\theta_2). \] Assume priors \[ \theta_1 \sim \text{Gamma} (a_1, b_1) \perp \theta_2 \sim \text{Gamma} (a_2, b_2). \] Derive the analytical form of the posterior distribution of \((\theta_1, \theta_2)\) and characterize it as a well-known distribution. Also identify the joint posterior when independent Jeffreys' priors are used for \(\theta_1\) and \(\theta_2\). Finally, write R or WinBUGS code to handle the model, including making inferences about the risk difference and risk ratio.