Authors
Tetsuya Sakai
Publication date
2014/6/26
Journal
ACM SIGIR Forum
Volume
48
Issue
1
Pages
3-12
Publisher
ACM
Description
IR revolves around evaluation. Therefore, IR researchers should employ sound evaluation practices. Nowadays many of us know that statistical significance testing is not enough, but not all of us know exactly what to do about it. This paper provides suggestions on how to report effect sizes and confidence intervals along with p-values, in the context of comparing IR systems using test collections. Hopefully, these practices will make IR papers more informative, and help researchers form more reliable conclusions that "add up." Finally, I pose a specific question for the IR community: should IR journal editors and SIGIR PC chairs require (rather than encourage) reporting of effect sizes and confidence intervals?
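As a rough illustration of the kind of reporting the description refers to, the following sketch computes a mean difference, a paired t statistic, a standardised effect size, and a 95% confidence interval for two systems scored on the same topics. It is not taken from the paper: the function name `paired_summary`, the per-topic scores, and the choice of effect size (mean difference divided by the standard deviation of the differences) are assumptions made for illustration. The critical value 2.262 is the two-sided 95% value of the t distribution with 9 degrees of freedom, passed in by hand so that only the standard library is needed; the p-value itself would come from a statistics package.

```python
import math
import statistics


def paired_summary(scores_x, scores_y, t_crit):
    """Summarise a paired comparison of two systems over the same topics.

    t_crit is the two-sided critical value of the t distribution with
    n - 1 degrees of freedom, supplied by the caller (an assumption of
    this sketch, to avoid depending on a statistics package).
    """
    if len(scores_x) != len(scores_y):
        raise ValueError("both systems must be scored on the same topics")

    # Per-topic score differences between system X and system Y.
    diffs = [x - y for x, y in zip(scores_x, scores_y)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)

    # Paired t statistic: mean difference over its standard error.
    std_err = sd_diff / math.sqrt(n)
    t_stat = mean_diff / std_err

    # Standardised effect size: mean difference in units of sd_diff.
    effect_size = mean_diff / sd_diff

    # Confidence interval for the mean difference.
    margin = t_crit * std_err
    return {
        "n": n,
        "mean_diff": mean_diff,
        "t_stat": t_stat,
        "effect_size": effect_size,
        "ci": (mean_diff - margin, mean_diff + margin),
    }


# Hypothetical per-topic scores for two systems on 10 topics.
system_x = [0.50, 0.62, 0.41, 0.73, 0.55, 0.38, 0.66, 0.47, 0.59, 0.70]
system_y = [0.45, 0.60, 0.43, 0.65, 0.50, 0.36, 0.60, 0.49, 0.52, 0.61]

result = paired_summary(system_x, system_y, t_crit=2.262)  # df = 9, 95%
print(result)
```

Unlike the t statistic, the effect size here does not grow with the number of topics, which is one reason to report it next to the p-value.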
Total citations
Citations per year, 2014 to 2024 (bar chart; per-year counts are not recoverable from the extracted text)