Place: Online Seminar. Please sign up for our mailing list at www.physicsmeetsml.org for the Zoom link.
Speaker: David Berman, Queen Mary University
Abstract: Statistical inference is the process of determining a probability distribution over the space of parameters of a model given a data set. As more data become available, this probability distribution is updated via the application of Bayes' theorem. We present a treatment of this Bayesian updating process as a continuous dynamical system. Statistical inference is then governed by a first-order differential equation describing a trajectory or flow in the information geometry determined by a parametric family of models. We solve this equation for some simple models and show that when the Cramér-Rao bound is saturated, the learning rate is governed by a simple 1/T power law, with T a time-like variable denoting the quantity of data. We illustrate this with both analytic and numerical examples based on Gaussians and the inference of the coupling constant in the Ising model. Finally, we compare the qualitative behaviour exhibited by Bayesian flows to the training of various neural networks on benchmark data sets such as MNIST and CIFAR10, and show that for networks exhibiting small final losses the simple power law is also satisfied.
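The 1/T power law mentioned in the abstract can be seen in the simplest conjugate-Gaussian setting. The sketch below is a minimal illustration, not code from the talk: it assumes a Gaussian likelihood with known noise scale, a standard normal prior on the mean, and sequentially applies the standard conjugate update. With unit prior precision and unit noise variance, the posterior variance after T observations is exactly 1/(T + 1), i.e. it decays like 1/T.

```python
import random

# Hypothetical setup (illustration only): infer the mean of a Gaussian
# with known noise sigma, starting from a N(0, 1) prior.
random.seed(0)
mu_true, sigma = 2.0, 1.0
post_mean, post_var = 0.0, 1.0  # prior N(0, 1)

variances = []
for t in range(1, 1001):
    x = random.gauss(mu_true, sigma)
    # Conjugate Gaussian update: precisions add, means combine
    # weighted by precision.
    precision = 1.0 / post_var + 1.0 / sigma**2
    post_mean = (post_mean / post_var + x / sigma**2) / precision
    post_var = 1.0 / precision
    variances.append(post_var)

# With unit prior precision and unit noise variance, the posterior
# precision after T data points is 1 + T, so post_var = 1/(T + 1):
print(post_var)  # 1/1001 after T = 1000 observations
```

Here T plays the role of the time-like variable in the abstract: doubling the amount of data halves the posterior variance, the hallmark of the 1/T law.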