SQUIDS-Statistics Joint Seminar: Automatic Tuning for Gradient-based Bayesian Inference
| Dates: | 11 February 2026 |
| Times: | 13:00 - 14:00 |
| What is it: | Seminar |
| Organiser: | Department of Mathematics |
| Who is it for: | University staff, Current University students |
| Speaker: | Christopher Nemeth |
Speaker: Professor Christopher Nemeth (Lancaster University)
Abstract: In Bayesian inference, the central computational task is to approximate a posterior distribution, often by designing dynamics whose stationary law is the posterior, or by directly minimising a variational objective such as a KL divergence. A unifying way to view many of these approaches is as optimisation over probability measures, where one seeks to minimise a functional F(\mu) on a Wasserstein space (most notably F(\mu)=\mathrm{KL}(\mu\|\pi) for a target posterior \pi), with close connections to Langevin-type samplers and particle-based variational methods. A persistent practical obstacle is that time-discretised Wasserstein gradient flows typically require careful step-size tuning: too small a step yields prohibitively slow mixing and convergence, while too large a step can destabilise the iterates and undermine theoretical guarantees. Worse still, the “optimal” fixed step sizes suggested by non-asymptotic analyses usually depend on unknown problem quantities (properties of the posterior, the minimiser, and the evolving iterate law), making principled tuning difficult and often forcing practitioners into expensive trial-and-error tuning.
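For concreteness, the forward Euler discretisation of the Langevin dynamics alluded to above is the unadjusted Langevin algorithm (ULA). The sketch below is a minimal illustration (plain NumPy; the Gaussian target, step size, and function names are assumptions made only for this example, not material from the talk) and shows exactly where the problematic step-size parameter enters the iteration.

```python
import numpy as np

def ula_sample(grad_log_pi, x0, step_size, n_iters, rng=None):
    """Unadjusted Langevin algorithm (ULA): forward Euler discretisation of
    Langevin dynamics targeting pi, i.e.
    x_{k+1} = x_k + h * grad log pi(x_k) + sqrt(2h) * N(0, I)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_iters, x.size))
    for k in range(n_iters):
        noise = rng.standard_normal(x.size)
        x = x + step_size * grad_log_pi(x) + np.sqrt(2.0 * step_size) * noise
        samples[k] = x
    return samples

# Illustrative target: standard Gaussian, so grad log pi(x) = -x.
# The fixed step_size is the quantity that must normally be hand-tuned:
# too small mixes slowly, too large biases or destabilises the chain.
chain = ula_sample(lambda x: -x, x0=np.zeros(2), step_size=0.1, n_iters=5_000)
```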
This talk presents FUSE (Functional Upper Bound Step-Size Estimator): a principled, adaptive, tuning-free family of step-size schedules tailored to two canonical discretisations of Wasserstein gradient flows, the forward-flow and forward Euler schemes. The resulting methodology yields tuning-free variants of widely used gradient-based samplers and particle optimisers, including the unadjusted Langevin algorithm (ULA), stochastic gradient Langevin dynamics (SGLD), mean-field Langevin dynamics, Stein variational gradient descent (SVGD), and variational gradient descent (VGD), and more broadly applies to stochastic optimisation problems on the space of measures. Under mild conditions (notably geodesic convexity and locally bounded stochastic gradients), the theory recovers the performance of optimally tuned methods up to logarithmic factors, in both nonsmooth and smooth regimes. Empirically, across representative sampling and learning benchmarks, the proposed algorithms achieve performance comparable to the best hand-tuned baselines, without any step-size tuning.
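As one example of the particle optimisers named above, the sketch below is a standard Stein variational gradient descent (SVGD) update with a fixed step size. It is not the FUSE schedule from the talk; it only illustrates the step-size parameter (and, here, a kernel bandwidth) that such methods normally require and that FUSE is designed to set automatically. The RBF kernel, bandwidth, and Gaussian target are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, bandwidth):
    """Pairwise RBF kernel K[i, j] = exp(-||x_i - x_j||^2 / h) and, for each
    particle i, the repulsion term sum_j grad_{x_j} k(x_j, x_i)."""
    diffs = X[:, None, :] - X[None, :, :]                  # (n, n, d), x_i - x_j
    K = np.exp(-np.sum(diffs ** 2, axis=-1) / bandwidth)   # (n, n)
    grad_K = (2.0 / bandwidth) * (K[:, :, None] * diffs).sum(axis=1)  # (n, d)
    return K, grad_K

def svgd_step(X, grad_log_pi, step_size, bandwidth=1.0):
    """One SVGD update:
    x_i <- x_i + eps * (1/n) sum_j [ k(x_j, x_i) grad log pi(x_j)
                                     + grad_{x_j} k(x_j, x_i) ]."""
    n = X.shape[0]
    K, grad_K = rbf_kernel(X, bandwidth)
    grads = np.stack([grad_log_pi(x) for x in X])          # (n, d)
    phi = (K @ grads + grad_K) / n                          # driving vector field
    return X + step_size * phi

# Illustration: particles transported towards a standard Gaussian target,
# with step_size fixed by hand (the tuning burden FUSE aims to remove).
rng = np.random.default_rng(0)
particles = rng.standard_normal((100, 2)) * 3.0
for _ in range(200):
    particles = svgd_step(particles, lambda x: -x, step_size=0.05)
```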
Travel and Contact Information
Frank Adams Room 2
Alan Turing Building
Manchester