ICML Highlight: Fast Dropout Training

In this post, I’ll summarize one of my favorite papers from ICML 2013: Fast Dropout Training, by Sida Wang and Christopher Manning. This paper derives an analytic approximation to dropout, a randomized regularization method recently proposed for training deep nets that has allowed big improvements in predictive accuracy.   Their approximation gives a roughly 10-times speedup under certain conditions.  Much more interestingly, the authors also show strong connections to existing regularization methods, shedding light on why dropout works so well. Continue reading “ICML Highlight: Fast Dropout Training”

An Auxiliary Variable Trick for MCMC

[latexpage]I recently uploaded the paper “Parallel MCMC with Generalized Elliptical Slice Sampling” to the arXiv. I’d like to highlight one trick that we used, but first I’ll give some background. Markov chain Monte Carlo (MCMC) is a class of algorithms for generating samples from a specified probability distribution $\pi({\bf x})$ (in the continuous setting, the distribution is generally specified by its density function). Elliptical slice sampling is an MCMC algorithm that can be used to sample distributions of the form

\begin{equation}

\pi({\bf x}) \propto \mathcal N({\bf x};\boldsymbol\mu,\boldsymbol\Sigma) L({\bf x}),

\end{equation}

where $\mathcal N({\bf x};\boldsymbol\mu,\boldsymbol\Sigma)$ is a multivariate Gaussian prior with mean $\boldsymbol\mu$ and covariance matrix $\boldsymbol\Sigma$, and $L({\bf x})$ is a likelihood function. Suppose we want to generalize this algorithm to sample from arbitrary continuous probability distributions. We could simply factor the distribution $\pi({\bf x})$ as

\begin{equation}

\pi({\bf x}) = \mathcal N({\bf x};\boldsymbol\mu,\boldsymbol\Sigma) \cdot \frac{\pi({\bf x})}{\mathcal N({\bf x};\boldsymbol\mu,\boldsymbol\Sigma)},

\end{equation} Continue reading “An Auxiliary Variable Trick for MCMC”

High-Dimensional Probability Estimation with Deep Density Models

[latexpage]

Ryan Adams and I just uploaded to the arXiv our paper “High-Dimensional Probability Estimation with Deep Density Models”. In this work, we introduce the deep density model (DDM), a new approach for density estimation. Continue reading “High-Dimensional Probability Estimation with Deep Density Models”