I would like to briefly introduce disconnectivity graphs — striking visualizations of multidimensional energy landscapes that I had never seen before. While it’s not immediately obvious how useful they are, it should be straightforward to adapt them for visualizing probability distributions. A quick Google search for ‘disconnectivity graph’ will turn up lots of examples. These things look like chandeliers and are meant to summarize the potential energy surface of a molecule, potentially with many degrees of freedom and many local optima

## Markov chain centenary

I just attended a fun event, Celebrating 100 Years of Markov Chains, at the Institute for Applied Computational Science. There were three talks and they were taped, so hopefully you will be able to find the videos through the IACS website in the near future. Below, I will review some highlights of the first two talks by Brian Hayes and Ryan Adams; I’m skipping the last one because it was more of a review of concepts building up to and surrounding Markov chain Monte Carlo (MCMC). The first talk was intriguingly called “First Links in the Markov Chain: Poetry and Probability”

## Hashing, streaming and sketching

One of the questions in the air at NIPS 2012 was: how do we make machine learning algorithms scale to large datasets? There are two main approaches: (1) developing parallelizable ML algorithms and integrating them with large parallel systems and (2) developing more efficient algorithms. More often than not, the latter approach requires some sort of relaxation of the underlying task. Hashing, streaming algorithms and sketching are increasingly employed to obtain efficient approximate algorithms for problems that arise in ML tasks. Below, I highlight a few examples, mostly from NIPS 2012, with several coming from the Big Learning workshop. Nearest neighbor search (or similarity search) appears in many “meta” ML tasks such as information retrieval and near-duplicate detection. Many approximate approaches are …
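
To make the hashing idea concrete, here is a minimal sketch of MinHash, one well-known hashing technique for near-duplicate detection. It is not taken from any of the NIPS 2012 papers; it is only an illustration of how a small hash-based summary can stand in for an exact similarity computation. The function names (`shingles`, `minhash_signature`, `estimate_jaccard`), the word-shingle size and the number of hash functions are all assumptions chosen for the example.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (k is an illustrative choice)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def _hash(item, seed):
    """Seeded 64-bit hash of a string, built on BLAKE2b from the standard library."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=256):
    """For each seeded hash function, keep the minimum hash value over the set."""
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots; approximates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity, used here only to check the estimate."""
    return len(a & b) / len(a | b)


doc_a = shingles("the quick brown fox jumps over the lazy dog near the river bank")
doc_b = shingles("the quick brown fox jumps over the lazy cat near the river bank")

sig_a = minhash_signature(doc_a)
sig_b = minhash_signature(doc_b)

print("exact:", exact_jaccard(doc_a, doc_b))
print("estimate:", estimate_jaccard(sig_a, sig_b))
```

The signatures have a fixed length regardless of document size, so comparing two documents costs the same no matter how long they are. This is the kind of relaxation mentioned above: the exact answer is traded for an estimate whose error shrinks as the number of hash functions grows.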