## The Alias Method: Efficient Sampling with Many Discrete Outcomes

[latexpage] When implementing algorithms for inference and learning with probabilistic models, it commonly comes up that one needs to sample from a discrete distribution. That is, from a multinomial distribution with parameter $\boldsymbol{\pi}\in\mathbb{R}^K$, such that $\pi_k\geq 0$ and $\sum_{k}\pi_k=1$. A somewhat more common occurrence is that we have a $\boldsymbol{\phi}\in\mathbb{R}^K$ where $\phi_k\geq 0$, but we don’t know the normalization constant. That is, our $\boldsymbol{\phi}$ is only proportional to the multinomial parameter $\boldsymbol{\pi}$. We want to rapidly generate a variate according to $\boldsymbol{\pi}$, given $\boldsymbol{\pi}$, something easily done with (Matlab) code such as this (paraphrased from Tom Minka‘s Lightspeed Toolbox): cdf = cumsum(phi); samp_k = sum(cdf < rand()*cdf(end)) + 1; This is nice and simple, but you'll notice that it has ...
[latexpage]Much of what we do when we analyze data and invent algorithms is think about estimators for unknown quantities, even when we don’t directly phrase things this way.  One type of estimator that we commonly encounter is the Monte Carlo estimator, which approximates expectations via the sample mean.  That is, many problems in which we are interested involve a distribution $\pi$ on a space $\mathcal{X}$, where we wish to calculate the expectation of a function $f(x)$: \begin{align*} \hat{f}_{\pi} &= \int_{\mathcal{X}} \pi(x)\,f(x)\,\mathrm{d}x\\ &\approx \frac{1}{N}\sum_{n=1}^N f(x_n) \text{\qquad where } x_n \sim \pi. \end{align*} This is very nice because it gives you an unbiased estimator of $\hat{f}_\pi$.  That is, the expectation of this estimator is the desired quantity.  However, one issue that comes …