Event

Julyan Arbel, INRIA, Université Grenoble Alpes, France

Tuesday, May 23, 2017, 15:30 to 16:30
Seminar Statistique Sherbrooke, 2500 Boul. de l'Université, Sherbrooke, QC, CA

Bayesian nonparametric inference for discovery probabilities.

The longstanding problem of discovery probabilities dates back to World War II, when Alan Turing worked on breaking the Axis forces' Enigma machine at Bletchley Park. The problem can be sketched simply as follows. An experimenter who samples units (say, animals) from a population and records their type (say, species) asks: what is the probability that the next sampled animal belongs to a species already observed a given number of times, or that it belongs to a newly discovered species? Applications are not limited to ecology but span bioinformatics, genetics, machine learning, multi-armed bandits, and so on. In this talk I describe a Bayesian nonparametric (BNP) approach to the problem and compare it with the original and highly popular Good-Turing estimators. More specifically, I start by recalling some basics about the Dirichlet process, which is the cornerstone of the BNP paradigm. Then I present a closed-form expression for the posterior distribution of discovery probabilities, which naturally leads to simple credible intervals. Next I describe asymptotic approximations of the BNP estimators for large sample sizes, and I conclude by illustrating the proposed results on a benchmark genomic dataset of Expressed Sequence Tags.
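
To make the two kinds of estimators concrete, here is a minimal Python sketch. It is not taken from the talk. It assumes the classical Good-Turing point estimate, (l + 1) m_{l+1} / n, where m_l is the number of species seen exactly l times in a sample of size n. For the Bayesian side it assumes the simplest case, a Dirichlet process prior with concentration parameter `theta`, whose predictive rule gives theta / (theta + n) for a new species and l m_l / (theta + n) for a species seen l times. The function names, the toy sample, and the default `theta=1.0` are illustrative assumptions; the closed-form posterior and credible intervals mentioned in the abstract are not reproduced here.

```python
from collections import Counter


def frequency_counts(sample):
    """Return (n, m), where m[l] is the number of species seen exactly l times."""
    species_counts = Counter(sample)           # species -> number of occurrences
    m = Counter(species_counts.values())       # frequency l -> number of species
    return len(sample), m


def good_turing(sample, l):
    """Good-Turing estimate that the next unit is a species seen l times so far.

    l = 0 gives the probability of discovering a new species.
    """
    n, m = frequency_counts(sample)
    return (l + 1) * m.get(l + 1, 0) / n


def dirichlet_process(sample, l, theta=1.0):
    """Same probability under a Dirichlet process prior (assumed, theta > 0)."""
    n, m = frequency_counts(sample)
    if l == 0:
        return theta / (theta + n)             # mass reserved for new species
    return l * m.get(l, 0) / (theta + n)       # mass on species seen l times


# Hypothetical toy sample: species "a" seen 3 times, "b" twice, "c" and "d" once.
sample = ["a", "a", "a", "b", "b", "c", "d"]
for l in range(4):
    print(l, good_turing(sample, l), dirichlet_process(sample, l))
```

One difference is visible even on this toy sample: the Good-Turing estimate for frequency l depends on the number of species seen l + 1 times, so it is zero for the most frequent species, whereas the Dirichlet process estimate depends on the species seen exactly l times.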
