Statistical inference with stochastic gradient algorithms
Negrea, Jeffrey; Yang, Jun; Feng, Haoyue; Roy, Daniel; Huggins, Jonathan
The tuning of stochastic gradient algorithms (SGAs) for optimization and sampling is often based on
heuristics and trial-and-error rather than generalizable theory. We address this theory–practice gap by
characterizing the large-sample statistical asymptotics of SGAs via a joint step-size–sample-size scaling
limit. We show that iterate averaging with a large fixed step size is robust to the choice of tuning
parameters and asymptotically has covariance proportional to that of the MLE sampling distribution. We
also prove a Bernstein–von Mises-like theorem to guide tuning, including for generalized posteriors that
are robust to model misspecification. Numerical experiments validate our results and recommendations in
realistic finite-sample regimes. Our work lays the foundation for a systematic analysis of other stochastic
gradient Markov chain Monte Carlo algorithms for a wide range of models.
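As a rough illustration of the iterate-averaging scheme referenced in the abstract, the sketch below runs constant-step-size SGD on a toy linear regression and reports the Polyak–Ruppert average of the iterates. The model, step size, iteration count, and burn-in are illustrative assumptions for the sketch, not the paper's exact algorithm or tuning recommendations.

```python
import numpy as np

# Minimal sketch (illustrative assumptions, not the paper's exact setup):
# constant large-step-size SGD with Polyak-Ruppert iterate averaging
# on a toy linear regression problem.
rng = np.random.default_rng(0)
n, d = 5_000, 3
X = rng.normal(size=(n, d))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + rng.normal(scale=1.0, size=n)

step_size = 0.1          # large, fixed step size (illustrative choice)
n_iters = 20_000
burn_in = n_iters // 2   # discard early iterates before averaging

theta = np.zeros(d)
running_sum = np.zeros(d)
kept = 0
for t in range(n_iters):
    i = rng.integers(n)                       # single-sample stochastic gradient
    grad = (X[i] @ theta - y[i]) * X[i]
    theta = theta - step_size * grad
    if t >= burn_in:
        running_sum += theta                  # accumulate post-burn-in iterates
        kept += 1

theta_avg = running_sum / kept                # iterate average
print("averaged iterate:", theta_avg)
print("true parameter:  ", theta_true)
```

In this toy setting the averaged iterate lands close to the true parameter even with a coarse fixed step size, which is the kind of robustness to tuning that the paper characterizes asymptotically.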