I recently uploaded the paper “Parallel MCMC with Generalized Elliptical Slice Sampling” to the arXiv. I’d like to highlight one trick that we used, but first I’ll give some background. Markov chain Monte Carlo (MCMC) is a class of algorithms for generating samples from a specified probability distribution (in the continuous setting, the distribution is generally specified by its density function). Elliptical slice sampling is an MCMC algorithm that can be used to sample distributions of the form
$$p(x) \propto \mathcal{N}(x; \mu, \Sigma)\, L(x),$$
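where $\mathcal{N}(x; \mu, \Sigma)$ is a multivariate Gaussian prior with mean $\mu$ and covariance matrix $\Sigma$, and $L(x)$ is a likelihood function. Suppose we want to generalize this algorithm to sample from arbitrary continuous probability distributions. We could simply factor the distribution $p(x)$ as
$$p(x) = \mathcal{N}(x; \mu, \Sigma) \cdot \frac{p(x)}{\mathcal{N}(x; \mu, \Sigma)}$$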
for any Gaussian $\mathcal{N}(x; \mu, \Sigma)$. In this setting, $\mathcal{N}(x; \mu, \Sigma)$ won’t be a prior, and $p(x) / \mathcal{N}(x; \mu, \Sigma)$ won’t be a likelihood function, but this is enough to apply elliptical slice sampling.
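For readers who have not seen it, here is a minimal sketch of a single elliptical slice sampling update, written from the standard description of the algorithm rather than taken from the paper's code. The function name `elliptical_slice_step` and its arguments are illustrative assumptions: `draw_gaussian` is any callable that returns one draw from $\mathcal{N}(\mu, \Sigma)$, and `log_lik` returns $\log L(x)$ up to an additive constant.

```python
import math
import random


def elliptical_slice_step(x, mu, draw_gaussian, log_lik, rng=random):
    """One update that leaves p(x) ∝ N(x; mu, Sigma) L(x) invariant.

    x, mu         : lists of floats (current state, Gaussian mean)
    draw_gaussian : hypothetical callable returning one draw from N(mu, Sigma)
    log_lik       : hypothetical callable returning log L(x)
    """
    aux = draw_gaussian()  # auxiliary point that defines the ellipse
    # Slice height: log L(x) + log(u) with u uniform on (0, 1].
    log_y = log_lik(x) + math.log(1.0 - rng.random())
    # Initial angle and the bracket it will be shrunk within.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    lo, hi = theta - 2.0 * math.pi, theta
    while True:
        c, s = math.cos(theta), math.sin(theta)
        # Point on the ellipse through x and aux, centred at mu.
        proposal = [m + (xi - m) * c + (ai - m) * s
                    for xi, ai, m in zip(x, aux, mu)]
        if log_lik(proposal) > log_y:
            return proposal
        # Rejected: shrink the bracket toward theta = 0 (the current state).
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)
```

As a usage example, with a standard normal "prior" in one dimension and `log_lik = lambda z: -0.5 * (2.0 - z[0]) ** 2`, repeated calls should produce draws whose distribution is approximately $\mathcal{N}(1, 1/2)$. Note that the update has no step size to tune; the only inputs are the Gaussian and the likelihood.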
Will this work well? Probably not. Elliptical slice sampling works because it makes use of the structure induced by the Gaussian prior. For an arbitrarily chosen Gaussian, this won’t be the case. On the other hand, if we choose a Gaussian that closely approximates $p(x)$, then elliptical slice sampling will probably work well.
This reasoning motivates us to build a Gaussian approximation to $p(x)$. The better our approximation, the better we can expect the sampling algorithm to work. Gaussians all have light tails (the density function diminishes at least exponentially fast as $\|x\| \to \infty$). So, in order to give ourselves more flexibility, we broaden our class of approximations to the class of multivariate $t$ distributions, whose tails decay only polynomially. A multivariate $t$ distribution with parameters $\nu$, $\mu$, and $\Sigma$ (referred to as the degrees of freedom, mean, and covariance parameters respectively) can be written (somewhat suggestively) as
$$T_\nu(x; \mu, \Sigma) = \int_0^\infty \mathrm{IG}\!\left(s; \tfrac{\nu}{2}, \tfrac{\nu}{2}\right) \mathcal{N}(x; \mu, s\Sigma)\, ds,$$
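where $\mathrm{IG}(s; \alpha, \beta)$ is the density function of an inverse-gamma distribution with shape $\alpha$ and scale $\beta$. Now, using a multivariate $t$ approximation, we can write our original density function as
$$p(x) = T_\nu(x; \mu, \Sigma) \cdot \frac{p(x)}{T_\nu(x; \mu, \Sigma)}.$$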
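Packaging up $p(x) / T_\nu(x; \mu, \Sigma)$ as a likelihood function $L(x)$, we are left with the task of generating samples from the probability distribution given by
$$p(x) = L(x) \int_0^\infty \mathrm{IG}\!\left(s; \tfrac{\nu}{2}, \tfrac{\nu}{2}\right) \mathcal{N}(x; \mu, s\Sigma)\, ds.$$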
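How can this be done? Here’s the trick. Define a joint distribution
$$p(x, s) = L(x)\, \mathrm{IG}\!\left(s; \tfrac{\nu}{2}, \tfrac{\nu}{2}\right) \mathcal{N}(x; \mu, s\Sigma),$$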
so we have
$$\int_0^\infty p(x, s)\, ds = p(x).$$
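This tells us that the marginal distribution of $x$ in $p(x, s)$ is our target distribution $p(x)$. Therefore, to generate samples from $p(x)$, it suffices to generate samples from $p(x, s)$ and then to disregard the $s$ coordinate. In order to generate samples from $p(x, s)$, we can alternately sample the conditional distributions
$$p(x \mid s) \propto L(x)\, \mathcal{N}(x; \mu, s\Sigma),$$
$$p(s \mid x) \propto \mathrm{IG}\!\left(s; \tfrac{\nu}{2}, \tfrac{\nu}{2}\right) \mathcal{N}(x; \mu, s\Sigma).$$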
The first conditional can be sampled with elliptical slice sampling. The second conditional is still an inverse-gamma distribution (because the inverse gamma is a conjugate prior for the scale $s$ multiplying the Gaussian covariance), and so it can be sampled exactly. This approach allows us to generalize elliptical slice sampling to arbitrary continuous distributions.
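To make the alternation concrete, here is a minimal sketch of the resulting sampler in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the names `generalized_ess`, `ess_step`, `mahalanobis_sq`, and `draw_gaussian` are made up for this example, the $t$ parameters ($\nu$, $\mu$, and a lower-triangular Cholesky factor of $\Sigma$) are taken as given rather than fitted, and `log_p` is any function returning the log of the target density up to a constant. Writing $D$ for the dimension and $q = (x - \mu)^\top \Sigma^{-1} (x - \mu)$, multiplying the two densities in the second conditional shows that it is $\mathrm{IG}\big(s; \tfrac{\nu + D}{2}, \tfrac{\nu + q}{2}\big)$, which is what the code draws from.

```python
import math
import random


def mahalanobis_sq(x, mu, chol):
    """q = (x - mu)^T Sigma^{-1} (x - mu), where Sigma = chol chol^T."""
    z = []
    for i, row in enumerate(chol):  # forward substitution: chol z = x - mu
        acc = x[i] - mu[i] - sum(row[j] * z[j] for j in range(i))
        z.append(acc / row[i])
    return sum(v * v for v in z)


def draw_gaussian(mu, chol, scale, rng):
    """One draw from N(mu, scale * Sigma)."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    root = math.sqrt(scale)
    return [m + root * sum(row[j] * z[j] for j in range(i + 1))
            for i, (m, row) in enumerate(zip(mu, chol))]


def ess_step(x, mu, aux, log_lik, rng):
    """Elliptical slice update using the Gaussian draw `aux`."""
    log_y = log_lik(x) + math.log(1.0 - rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    lo, hi = theta - 2.0 * math.pi, theta
    while True:
        c, s = math.cos(theta), math.sin(theta)
        prop = [m + (xi - m) * c + (ai - m) * s
                for xi, ai, m in zip(x, aux, mu)]
        if log_lik(prop) > log_y:
            return prop
        if theta < 0.0:  # shrink the bracket toward the current state
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)


def generalized_ess(log_p, x0, nu, mu, chol, n_samples, rng=random):
    """Alternate s | x (exact) and x | s (elliptical slice sampling)."""
    dim = len(mu)

    def log_lik(x):
        # log L(x) = log p(x) - log T_nu(x; mu, Sigma), up to a constant.
        q = mahalanobis_sq(x, mu, chol)
        return log_p(x) + 0.5 * (nu + dim) * math.log1p(q / nu)

    x, samples = list(x0), []
    for _ in range(n_samples):
        # s | x ~ IG((nu + D)/2, (nu + q)/2): reciprocal of a gamma draw
        # with the same shape and with scale 2 / (nu + q).
        q = mahalanobis_sq(x, mu, chol)
        s = 1.0 / rng.gammavariate(0.5 * (nu + dim), 2.0 / (nu + q))
        # x | s is proportional to L(x) N(x; mu, s Sigma).
        x = ess_step(x, mu, draw_gaussian(mu, chol, s, rng), log_lik, rng)
        samples.append(x)
    return samples
```

For example, `generalized_ess(log_p, [0.0, 0.0], 5.0, [0.5, -0.5], [[1.2, 0.0], [0.3, 1.0]], 20000)` runs the sampler with a deliberately imperfect $t$ approximation; the $s$ values are never stored, which is the "disregard the $s$ coordinate" step. How well the chain mixes depends on how closely the $t$ distribution matches the target.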