Bayesian nonparametrics in the real world


Bayesian nonparametrics allow the contruction of statistical models whose complexity is determined by the observed data. This is accomplished by specifying priors over infinite dimensional distributions. The most widely used Bayesian nonparametric priors in machine learning are the Dirichlet process, the beta process and their corresponding marginal processes the Chinese restaurant process and the Indian buffet process respectively. The Dirichlet process provides a prior for the mixing measure of an infinite mixture model and the beta process can be used as a prior for feature popularity in a latent feature model. The hierarchical Dirichlet process (HDP) also appears frequently in machine learning underlying topic models with an infinite number of topics. A main selling point of Bayesian nonparametrics has been that the complexity of the model can grow as more data is observed. While true in theory, current inference algorithms for Bayesian nonparametric models fail to scale to the massive data sets available making the model impractical. In this post I review traditional inference algorithms for Bayesian nonparametric models and discuss a new approach for handling massive data sets using stochastic variational inference. Continue reading “Bayesian nonparametrics in the real world”

Turning Theory into Algorithms

[latexpage] Some of the common complaints I hear about (learning) theoretical work run along the lines of “those bounds are meaningless in practice,” “that result doesn’t apply to any algorithm someone would actually use,” and “you lost me as soon as martingales/Banach spaces/measure-theoretic niceties/… got involved.” I don’t have a good answer for the latter concern, but a very nice paper by Sasha Rakhlin, Ohad Shamir, and Karthik Sridharan at NIPS this year goes some ways toward address the first two criticisms. Their paper, “Relax and Randomize: From Value to Algorithms,” (extended version here) is concerned with transforming non-constructive online regret bounds into useful algorithms. Continue reading “Turning Theory into Algorithms”