Authors
J Ross Beveridge, Kai She, Bruce Draper, Geof H Givens
Publication date
2001/12
Source
Proc. 3rd Workshop on the Empirical Evaluation of Computer Vision Systems
Description
This paper reviews some of the major issues associated with the statistical evaluation of human identification algorithms, emphasizing comparisons between algorithms on the same set of sample images. A general notation is developed and common performance metrics are defined. A simple success/failure evaluation methodology is developed in which recognition rate depends on a binomially distributed random variable, the recognition count, and the conditions under which this model is appropriate are discussed. Some nonparametric techniques are also introduced, including bootstrapping. When applied to estimating the distribution of recognition count for a single set of i.i.d. sampled probe images, bootstrapping is equivalent to the parametric binomial model. Bootstrapping applied to recognition rate over resampled sets of images can be problematic. Specifically, sampling with replacement to form image probe sets is shown to introduce a conflict between the assumptions required by bootstrapping and the way recognition rate is computed. In part to overcome this difficulty with bootstrapping, a different nonparametric Monte Carlo method is introduced, and its utility is illustrated with an extended example. This method permutes the choice of gallery and probe images. It is used to answer two questions. Question 1: How much does recognition rate vary when comparing images of individuals taken on different days using the same camera? Question 2: When is the observed difference in recognition rates for two distinct algorithms significant relative to this variation? Two important general features of nonparametric methods are illustrated by the …
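The stated equivalence between bootstrapping the recognition count and the binomial model can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the probe set size (100), the number of successes (80), the replicate count, and the helper names `binomial_pmf` and `bootstrap_counts`. The sketch resamples a fixed list of success/failure outcomes with replacement and compares the mean and variance of the resampled counts with those of a Binomial(n, p̂) distribution.

```python
import math
import random


def binomial_pmf(n, p):
    """Exact Binomial(n, p) probabilities for recognition counts 0..n."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def bootstrap_counts(outcomes, replicates, rng):
    """Resample outcomes with replacement; return one recognition count per replicate."""
    n = len(outcomes)
    return [sum(rng.choices(outcomes, k=n)) for _ in range(replicates)]


# Hypothetical data: 100 probe images, 80 recognized correctly (1 = success).
rng = random.Random(0)
n, successes = 100, 80
outcomes = [1] * successes + [0] * (n - successes)
p_hat = successes / n

# Parametric model: recognition count ~ Binomial(n, p_hat).
pmf = binomial_pmf(n, p_hat)
binom_mean = n * p_hat
binom_var = n * p_hat * (1 - p_hat)

# Nonparametric model: bootstrap distribution of the recognition count.
counts = bootstrap_counts(outcomes, 20000, rng)
boot_mean = sum(counts) / len(counts)
boot_var = sum((c - boot_mean) ** 2 for c in counts) / len(counts)

print("binomial  mean, variance:", binom_mean, binom_var)
print("bootstrap mean, variance:", boot_mean, boot_var)
```

Each resampled outcome is an independent draw that succeeds with probability p̂, so the bootstrap count follows Binomial(n, p̂) by construction; the simulation only approximates that distribution by sampling. The sketch does not cover the problematic case named in the abstract, where whole probe sets are resampled and recognition rate is recomputed.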
Total citations
[Chart: citations per year, 2001 to 2022]