class: center, middle, inverse, title-slide # Rejection sampling, part I ### Niels Richard Hansen ### September 15, 2020 --- ## von Mises distribution The density on `\((-\pi, \pi]\)` is `$$f(x) = \frac{1}{2 \pi I_0(\kappa)} \ e^{\kappa \cos(x - \mu)}$$` for `\(\kappa > 0\)` and `\(\mu \in (-\pi, \pi]\)` parameters and `\(I_0\)` is a modified Bessel function. ```r library("movMF") xy <- rmovMF(500, 0.5 * c(cos(-1.5), sin(-1.5))) ## rmovMF represents samples as elements on the unit circle x <- acos(xy[, 1]) * sign(xy[, 2]) ``` --- ## von Mises distribution ```r hist(x, breaks = seq(-pi, pi, length.out = 15), prob = TRUE) rug(x) density(x, bw = "SJ", cut = 0) %>% lines(col = "red", lwd = 2) curve(exp(0.5 * cos(x + 1.5)) / (2 * pi * besselI(0.5, 0)), -pi, pi, col = "blue", lwd = 2, add = TRUE) ``` <img src="Reject_part_I_files/figure-html/vMhist-1.png" width="600" style="display: block; margin: auto;" /> --- ## Pseudo random numbers Simulation of random variables require a pseudo random number generator, and this is a research field in itself, see `?RNG` in R for the algorithms available and [Section 4.1 in CSwR](https://cswr.nrhstat.org/4-1-pseudorandom-number-generators.html) for more details. The R default (v. 4.0.2) is the 32-bit *Mersenne Twister*, which generates integers in the range `$$\{0, 1, \ldots, 2^{32} -1\}.$$` It has a long period and all combinations of consecutive integers up to dimension 623 occur equally often in a period. It has good statistical properties. Pseudo random numbers in `\((0, 1)\)` are returned by `runif` by division with `\(2^{32}\)` and a fix to prevent the algorithm from returning 0. --- ## Rejection sampling Let `\(Y_1, Y_2, \ldots\)` be i.i.d. with density `\(g\)` on `\(\mathbb{R}\)` and `\(U_1, U_2, \ldots\)` be i.i.d. uniform and independent of the `\(Y_i\)`-s. Define `$$\sigma = \inf\{n \geq 1 \mid U_n \leq \alpha f(Y_n) / g(Y_n)\},$$` for `\(\alpha \in (0, 1]\)` and `\(f\)` a density. **Theorem:** If `\(\alpha f(y) \leq g(y)\)` for all `\(y \in \mathbb{R}\)` then the distribution of `\(Y_{\sigma}\)` has density `\(f\)`. --- ## Normalizing constants If `\(f(y) = c q(y)\)` and `\(g(y) = d p(y)\)` for (unknown) normalizing constants `\(c, d > 0\)` and `\(\alpha' q \leq p\)` then <br> <br> `$$\underbrace{\left(\frac{\alpha' d}{c}\right)}_{= \alpha} \ f \leq g.$$` Moreover, `$$u > \frac{\alpha f(y)}{g(y)} \Leftrightarrow u > \frac{\alpha' q(y)}{p(y)},$$` and rejection sampling can be implemented without computing `\(c\)` or `\(d\)`. --- ## von Mises rejection sampling Rejection sampling using the uniform proposal, `\(g(y) \propto 1\)` Since `$$e^{\kappa(\cos(y) - 1)} = \alpha' e^{\kappa \cos(y)} \leq 1,$$` where `\(\alpha' = \exp(-\kappa)\)` we reject if `$$U > e^{\kappa(\cos(Y) - 1)}.$$` --- ## von Mises rejection sampling ```r vMsim_slow <- function(n, kappa) { y <- numeric(n) for(i in 1:n) { reject <- TRUE while(reject) { y0 <- runif(1, - pi, pi) u <- runif(1) reject <- u > exp(kappa * (cos(y0) - 1)) } y[i] <- y0 } y } ``` --- ## von Mises rejection sampling ```r f <- function(x, k) exp(k * cos(x)) / (2 * pi * besselI(k, 0)) x <- vMsim_slow(100000, 0.5) hist(x, breaks = seq(-pi, pi, length.out = 20), prob = TRUE) curve(f(x, 0.5), -pi, pi, col = "blue", lwd = 2, add = TRUE) x <- vMsim_slow(100000, 2) hist(x, breaks = seq(-pi, pi, length.out = 20), prob = TRUE) curve(f(x, 2), -pi, pi, col = "blue", lwd = 2, add = TRUE) ``` <div class="figure" style="text-align: center"> <img src="Reject_part_I_files/figure-html/vMsim2-1.png" alt=" " width="50%" /><img src="Reject_part_I_files/figure-html/vMsim2-2.png" alt=" " width="50%" /> <p class="caption"> </p> </div> --- ## Profiling ```r profvis(vMsim_slow(10000, 5)) ``` <img src="vMsim_slow.png" width="70%" style="display: block; margin: auto;" /> --- ## Summary Calling random number generators in R sequentially is relatively slow. It is faster to generate all random numbers once and store them in a vector. How to do that for rejection sampling with an unknown number of rejections?