Thursday, February 24, 2022
1:00 PM Eastern Time
Michele Peruzzi, PhD
Postdoctoral Associate, Department of Statistical Science, Duke University
Abstract: In this talk, I will consider the problem of fitting Bayesian models with spatial random effects to large-scale multivariate, multi-type data from satellite imaging, land-based weather and air quality sensors, and citizen science, with diverse applications in the environmental sciences, ecology, and public health.
In these contexts, the goal of quantifying spatial associations via random effects in Bayesian hierarchical models can be achieved by letting a Gaussian process (GP) characterize dependence in space, time, and across outcomes. GPs have desirable properties and lead to extremely flexible models that can accurately quantify uncertainty. Unfortunately, GPs are notorious for scaling poorly to large data sets. While the literature on scalable GPs has primarily focused on their well-known bottlenecks in models of univariate continuous outcomes, I consider the more challenging hurdles to efficient computation facing latent models of multivariate non-Gaussian outcomes.
I introduce spatial meshing and manifold preconditioning as tools for efficient computation in multivariate Bayesian models of spatially referenced non-Gaussian data. First, I outline spatial meshing as a tool for building scalable processes using patterned directed acyclic graphs on partitioned spatial domains. Then, I present manifold preconditioning as a novel Langevin method that improves sampling performance for the non-Gaussian multivariate data common in studies of species communities.
Beyond these main topics, I discuss further strategies for improving Markov chain Monte Carlo performance, and I conclude with applications showcasing the flexibility of the proposed methodologies. All presented methods are implemented in the high-performance R package ‘meshed’, available on CRAN.