Scalable and Interpretable Learning with Probabilistic Models for Knowledge Discovery

Liu, S. (2023). Scalable and Interpretable Learning with Probabilistic Models for Knowledge Discovery [PhD thesis]. Princeton University.
Novel machine learning methods are at the heart of a transformation underway in science and engineering. Probabilistic models have served as the foundational learning models for knowledge discovery. As surrogate models, they enable efficient black-box optimization or active learning of the system behavior under a limited budget for evaluating or querying the complex system. Another important use case is to use probabilistic models as generative models to generate de novo designs with desired properties or samples from the equilibrium distribution of physical systems. However, to fully unleash the potential of probabilistic models for knowledge discovery, it is imperative to develop models that are scalable to growing data size and complexity while remaining interpretable to domain experts. In this dissertation, I start with the development of a novel approach that addresses the sparse solution identification problem in Bayesian optimization with probabilistic surrogate models. The discovery of sparse solutions not only enhances the interpretability of the solutions to humans for understanding the system behavior but also facilitates streamlined deployment and maintenance with a reduced number of parameters. Next, I present a novel approach utilizing deep learning to enhance the scalability of Gaussian process inference. Gaussian processes are extensively used as probabilistic surrogate models in knowledge discovery, but their practical usage is hindered by the high cost associated with the identification of kernel hyperparameters in GP regression, which involves costly marginal likelihoods. I show how to side-step the need for expensive marginal likelihoods by employing “amortized” inference over hyperparameters. This is achieved through training a single neural network, which consumes a set of data and produces an estimate of the kernel function, useful across different tasks.
Finally, I introduce marginalization models, a new family of generative models for high-dimensional discrete data, which is ubiquitous in scientific discovery. Marginalization models offer scalable and flexible generative modeling with tractable likelihoods through explicit modeling of all induced marginal distributions with neural networks. Direct modeling of the marginals enables efficient marginal inference and scalable training of any-order generative models for sampling from a given (unnormalized) probability function, which overcomes major limitations of previous methods with exact likelihoods.
@phdthesis{liu2023thesis,
  year = {2023},
  author = {Liu, Sulin},
  title = {Scalable and Interpretable Learning with Probabilistic Models for Knowledge Discovery},
  month = oct,
  school = {Princeton University},
  address = {Princeton, NJ},
  keywords = {Bayesian optimization, generative models, knowledge discovery, probabilistic machine learning}
}