Authors
Javed A Aslam, Virgil Pavlu, Emine Yilmaz
Publication date
2006/8/6
Book
Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Pages
541-548
Description
We consider the problem of large-scale retrieval evaluation, and we propose a statistical method for evaluating retrieval systems using incomplete judgments. Unlike existing techniques that (1) rely on effectively complete, and thus prohibitively expensive, relevance judgment sets, (2) produce biased estimates of standard performance measures, or (3) produce estimates of non-standard measures thought to be correlated with these standard measures, our proposed statistical technique produces unbiased estimates of the standard measures themselves. Our proposed technique is based on random sampling. While our estimates are unbiased by statistical design, their variance is dependent on the sampling distribution employed; as such, we derive a sampling distribution likely to yield low variance estimates. We test our proposed technique using benchmark TREC data, demonstrating that a sampling pool derived …
Total citations
[Chart: citations per year, 2006 to 2024]
Scholar articles
JA Aslam, V Pavlu, E Yilmaz - Proceedings of the 29th annual international ACM …, 2006