---
title: 'Distributions'
author: "James M. Flegal"
output:
ioslides_presentation:
smaller: yes
---
## Agenda
- Random number generation
- Distributions in R
- Distributional Models
- Moments, generalized moments, likelihood
- Visual comparisons and basic testing
## Random number generation
How does R get “random” numbers?
- It doesn’t, instead it uses pseudorandom numbers that we hope are indistinguishable from random numbers
- Pseudorandom generators produce a deterministic sequence that is indistinguishable from a true random sequence if you don’t know how it started
- Why? Truly random numbers are expensive
## Random number generation
- Pseudorandom generators produce a sequence of Uniform$(0,1)$ random variates
- [Linear congruential generator](https://en.wikipedia.org/wiki/Linear_congruential_generator) is one of the oldest and best-known pseudorandom number generator algorithms
- Other distributions usually based such a sequence
- Example: If $U \sim \text{Unif}(0,1)$ then $-\beta \ln (1-U) \sim \text{Exp}(\beta)$
- More on this later ...
## Example: `runif()`
```{r}
runif(1:10)
set.seed(10)
runif(1:10)
set.seed(10)
runif(1:10)
```
## Linear congruential generator
- Easily implemented and fast
- Modulo arithmetic via storage-bit truncation
- Defined as
\[
X_{n+1}=\left(aX_{n}+c\right)~~{\bmod {~}}~m
\]
where $X$ is the sequence of pseudorandom values
## Linear congruential generator
```{r}
seed <- 10
new.random <- function (a=5, c=12, m=16) {
out <- (a*seed + c) %% m
seed <<- out
return(out)
}
out.length <- 20
variates <- rep (NA, out.length)
for (kk in 1:out.length) variates[kk] <- new.random()
variates
```
Unfortunately, this sequence has period 8
## Linear congruential generator
```{r}
variates <- rep (NA, out.length)
for (kk in 1:out.length) variates[kk] <- new.random(a=131, c=7, m=16)
variates
```
```{r}
variates <- rep (NA, out.length)
for (kk in 1:out.length) variates[kk] <- new.random(a=129, c=7, m=16)
variates
```
## Linear congruential generator
```{r}
variates <- rep (NA, out.length)
for (kk in 1:out.length) variates[kk] <- new.random(a=1664545, c=1013904223, m=2^32)
variates
```
## How to distinguish non-randomness?
- Look at period
- Missing some values
- Proper distribution in the limit
- Autocorrelation
- Gets harder for higher dimensions
## Distributions
- Beta, Binomial, Cauchy, Chi-Square, Exponential, F, Gamma, Geometric, Hypergeometric, Logistic, Log Normal, Negative Binomial, Normal, Poisson, Student t, Studentized Range, Uniform, Weibull, Wilcoxon Rank Sum Statistic, Wilcoxon Signed Rank Statistic
- Parameters of these distributions may not agree with textbooks
- Lots of distributions, but they all work the same way
## Distributions
Every distribution that R handles has four functions. There is a root name, for example, the root name for the normal distribution is `norm`. This root is prefixed by one of the letters
- `p` for "probability", the cumulative distribution function (c. d. f.)
- `q` for "quantile", the inverse c. d. f.
- `d` for "density", the density function (p. f. or p. d. f.)
- `r` for "random", a random variable having the specified distribution
## Distributions
- Beta, Binomial, Cauchy, Chi-Square, Exponential, F, Gamma, Geometric, Hypergeometric, Logistic, Log Normal, Negative Binomial, Normal, Poisson, Student t, Studentized Range, Uniform, Weibull, Wilcoxon Rank Sum Statistic, Wilcoxon Signed Rank Statistic
- `beta`, `binom`, `cauchy`, `chisq`, `exp`, `f`, `gamma`, `geom`, `hyper`, `logis`, `lnorm`, `nbinom`, `norm`, `pois`, `t`, `tukey`, `unif`, `weibull`, `wilcox`, `signrank`
- Exponential distribution has `pexp`, `qexp`, `dexp`, and `rexp`
## Exponential distribution
```{r}
this.range <- seq(0, 8, .05)
plot (this.range, dexp(this.range), ty="l", main="Exponential Distributions",
xlab="x", ylab="f(x)")
lines (this.range, dexp(this.range, rate=0.5), col="red")
lines (this.range, dexp(this.range, rate=0.2), col="blue")
```
## Exponential distribution
```{r}
this.range <- seq(0, 8, .05)
plot (this.range, pexp(this.range), ty="l", main="Exponential Distributions",
xlab="x", ylab="P(X