Authors
Max Welling, Yee Whye Teh, Christophe Andrieu, Jakub Kominiarczuk, Ted Meeds, Babak Shahbaba, Sebastian Vollmer
Publication date
2014/10
Journal
ISBA Bulletin
Volume
21
Issue
4
Pages
8-11
Description
Over the last half century, and particularly since the advent of Markov Chain Monte Carlo methods, Bayesian inference has enjoyed tremendous successes, so much so that some quarters have proclaimed the age-old philosophical arguments between Frequentism and Bayesianism resolved, and that the twenty-first century is the Bayesian century. Whether or not such proclamations are controversial, the twenty-first century has brought a new challenge to Bayesian inference that may render such philosophical arguments moot without a serious discussion of computation. This is the challenge of Big Data. Indeed, the growth in the volume and variety of data in the twenty-first century has been staggering, and one could ask whether Bayesian inference is still a relevant statistical framework in this context. Does one still have to worry about model uncertainty and overfitting if there is so much data that uncertainties are negligible for all practical purposes? We believe there are a number of responses to this critique, and that they rather point in the opposite direction.
Firstly, big data often means large p, with p ≫ N. For instance, modern MRI scanners record millions of voxels, possibly over time, for only a handful of patients. As another example, the cost of sequencing entire genomes is falling faster than Moore's law, implying that within a few years a doctor could routinely sequence every patient's genome: this is four billion base pairs per patient. Clearly, model uncertainty remains an important problem. Secondly, most of the data may not be relevant to certain model components. For instance, in recommendation systems with billions of user-product …
Total citations
[Citations-per-year chart, 2015–2022]
Scholar articles
M Welling, YW Teh, C Andrieu, J Kominiarczuk… - ISBA Bulletin, 2014