--- title: 'Monte Carlo Similations' author: "James M. Flegal" output: ioslides_presentation: smaller: yes --- ## Agenda - Ordinary Monte Carlo - Examples - Monte Carlo integration - Bootstrap - Toy collector exercise ## Ordinary Monte Carlo The “Monte Carlo method” refers to the theory and practice of learning about probability distributions by simulation rather than calculus. In ordinary Monte Carlo (OMC) we use _IID_ simulations from the distribution of interest. Suppose $X_1, X_2, \dots$ are _IID_ simulations from some distribution, and suppose we want to know an expectation $\theta = E\left[ Y_1 \right] = E\left[ g(X_1) \right].$ The law of large numbers (LLN) then says $\bar{y}_n = \frac{1}{n} \sum_{i=1}^n Y_i = \frac{1}{n} \sum_{i=1}^n g(X_i)$ converges in probability to $\theta$. ## Ordinary Monte Carlo The central limit theorem (CLT) says $\frac{\sqrt{n} (\bar{y}_n - \theta) }{\sigma} \stackrel{d}{\rightarrow} \mbox{N}(0,1) .$ That is, for sufficiently large $n$, $\bar{y}_n \sim \mbox{N}( \theta , \sigma^2 / n).$ Further, we can estimate the standard error $\sigma / \sqrt{n}$ with $s_n / \sqrt{n}$ where $s_n$ is the sample standard deviation. ## Ordinary Monte Carlo We can also use the CLT form a confidence interval with $Pr( \bar{y}_n - 1.96 s_n / \sqrt{n} < \mbox{E} Y_1 < \bar{y}_n + 1.96 s_n / \sqrt{n} ) \approx 0.95 .$ Or we could simulate until a half-width (or width) of this confidence interval is sufficiently small, say less than $\epsilon>0$. That is, simulate until $1.96 s_n / \sqrt{n} < \epsilon .$ ## Ordinary Monte Carlo The theory of OMC is just the theory of frequentist statistical inference. The only differences are that - the “data” $X_1, \dots, X_n$ are computer simulations rather than measurements on objects in the real world - the “sample size” $n$ is the number of computer simulations rather than the size of some real world data - the unknown parameter $\theta$ is in principle completely known, given by some integral, which we are unable to do. ## Ordinary Monte Carlo Everything works just the same when the data $X_1, X_2, \dots$, which are computer simulations are vectors. But the functions of interest $g(X_1), g(X_2), \dots$ are scalars. OMC works great, but it can be very difficult to simulate IID simulations of random variables or random vectors whose distribution is not brand name distributions ## Approximating the Binomial Suppose we flip a coin 10 times and we want to know the probability of getting more than 3 heads. This is trivial for the Binomial distribution, which we'll ignore. {r} runs <- 10000 one.trial <- function(){ sum(sample(c(0,1),10,replace=T)) > 3 } mc.binom <- sum(replicate(runs,one.trial()))/runs mc.binom pbinom(3,10,0.5,lower.tail=FALSE)  Exercise: Program this example and estimate the Monte Carlo standard error ## Aproximating $\pi$ - Area of a circle is $\pi r^2$ - If we draw a square containing that circle its area will be $4r^2$ - So the ratio of the area of the circle to the area of the square is $\frac{\pi r^2}{4r^2} = \frac{\pi}{4}$ - Given this fact, we can empirically determine the ratio of the area of the circle to the area of the square we can simply multiply this number by 4 and we'll get our approximation of $\pi$. - How? ## Aproximating $\pi$ - Randomly sample $x$ and $y$ values on the unit square centered at 0 - If $x^2 + y^2 \le .5^2$ then the point is in the circle - The ratio of points in the circle multiplied by 4 is our estimate of $\pi$ {r} runs <- 100000 xs <- runif(runs,min=-0.5,max=0.5) ys <- runif(runs,min=-0.5,max=0.5) in.circle <- xs^2 + ys^2 <= 0.5^2 mc.pi <- (sum(in.circle)/runs)*4  ## Aproximating $\pi$ {r} plot(xs,ys,pch='.',col=ifelse(in.circle,"blue","grey") ,xlab='',ylab='',asp=1, main=paste("MC Approximation of Pi =",mc.pi))  ## Example: Integration Let $X \sim \Gamma (3/2, 1)$, i.e.\ $f(x) = \frac{2}{\sqrt{\pi}} \sqrt{x} e^{-x} I(x>0) .$ Suppose we want to find \begin{aligned} \theta & = \mbox{E} \left[ \frac{1}{(X+1)\log (X+3)} \right]\\ & = \int_{0}^{\infty} \frac{1}{(x+1)\log (x+3)} \frac{2}{\sqrt{\pi}} \sqrt{x} e^{-x} dx . \end{aligned} - The expectation (or integral) $\theta$ is intractable, we don't know how to compute it analytically - Further, suppose we want to estimate this quantity such that a 95% CI length is less than 0.002 ## Example: Integration {r} n <- 1000 x <- rgamma(n, 3/2, scale=1) mean(x) y <- 1/((x+1)*log(x+3)) est <- mean(y) est mcse <- sd(y) / sqrt(length(y)) interval <- est + c(-1,1)*1.96*mcse interval  ## Example: Sequential stopping rule {r} eps <- 0.002 len <- diff(interval) plotting.var <- c(est, interval) while(len > eps){ new.x <- rgamma(n, 3/2, scale=1) new.y <- 1/((new.x+1)*log(new.x+3)) y <- cbind(y, new.y) est <- mean(y) mcse <- sd(y) / sqrt(length(y)) interval <- est + c(-1,1)*1.96*mcse len <- diff(interval) plotting.var <- rbind(plotting.var, c(est, interval)) } list(interval, length(y)) temp <- seq(1000, length(y), 1000)  ## Example: Sequential stopping rule {r} plot(temp, plotting.var[,1], type="l", ylim=c(min(plotting.var), max(plotting.var)), main="Estimates of the Mean", xlab="Iterations", ylab="Estimate") points(temp, plotting.var[,2], type="l", col="red") points(temp, plotting.var[,3], type="l", col="red") legend("topright", legend=c("CI", "Estimate"), lty=c(1,1), col=c(2,1))  ## High-dimensional examples - [FiveThirtyEight's Election Forecast](https://projects.fivethirtyeight.com/2018-midterm-election-forecast/house/?ex_cid=rrpromo) - [FiveThirtyEight's NBA Predictions](https://projects.fivethirtyeight.com/2020-nba-predictions/?ex_cid=rrpromo) - [Vanguard's Retirement Nest Egg Calculator](https://retirementplans.vanguard.com/VGApp/pe/pubeducation/calculators/RetirementNestEggCalc.jsf) - [Fisher's Exact Test in R](https://stat.ethz.ch/R-manual/R-devel/library/stats/html/fisher.test.html) ## Permutations with sample() - sample() is powerful -- it works on any object that has a defined length(). - Permutations {r} sample(5) sample(1:6) replicate(3,sample(c("Curly","Larry","Moe","Shemp")))  ## Resampling with sample() Resampling from any existing distribution gives **bootstrap** estimators {r} bootstrap.resample <- function (object) sample (object, length(object), replace=TRUE) replicate(5, bootstrap.resample (6:10))  Recall: the *jackknife* removed one point from the sample and recalculated the statistic of interest. Here we resample the same length with replacement. ## Bootstrap test The 2-sample t-test checks for differences in means according to a known null distribution. Let's resample and generate the sampling distribution under the bootstrap assumption: {r} library(MASS) diff.in.means <- function(df) { mean(df[df$Sex=="M","Hwt"]) - mean(df[df$Sex=="F","Hwt"]) } resample.diffs <- replicate(1000, diff.in.means(cats[bootstrap.resample(1:nrow(cats)),]))  ## Bootstrap test {r} hist(resample.diffs); abline(v=diff.in.means(cats), col=2, lwd=3)  ## Summary - Ordinary Monte Carlo - Repeated random sampling to obtain numerical results - Using randomness to solve problems - Most useful when it is difficult or impossible to use other approaches - Can you solve [The Riddler](http://fivethirtyeight.com/tag/the-riddler/)? ## Exercise: Toy Collector Children (and some adults) are frequently enticed to buy breakfast cereal in an effort to collect all the action figures. Assume there are 15 action figures and each cereal box contains exactly one with each figure being equally likely. - Find the expected number of boxes needed to collect all 15 action figures. - Find the standard deviation of the number of boxes needed to collect all 15 action figures. - Now suppose we no longer have equal probabilities, instead let Figure | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O --- | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - Probability | .2 | .1| .1| .1| .1| .1| .05| .05| .05| .05| .02| .02| .02| .02| .02 - Estimate the expected number of boxes needed to collect all 15 action figures. - What is the uncertainty of your estimate? - What is the probability you bought more than 50 boxes? 100 boxes? 200 boxes?