Authors
Rina Foygel Barber, Emmanuel J Candès, Richard J Samworth
Publication date
2020/6/1
Journal
The Annals of Statistics
Volume
48
Issue
3
Pages
1409-1431
Publisher
Institute of Mathematical Statistics
Description
We consider the variable selection problem, which seeks to identify important variables influencing a response Y out of many candidate features X₁, …, Xₚ. We wish to do so while offering finite-sample guarantees about the fraction of false positives, that is, selected variables Xⱼ that in fact have no effect on Y after the other features are known. When the number of features p is large (perhaps even larger than the sample size n), and we have no prior knowledge regarding the type of dependence between Y and X, the model-X knockoffs framework nonetheless allows us to select a model with a guaranteed bound on the false discovery rate, as long as the distribution of the feature vector X = (X₁, …, Xₚ) is exactly known. This model selection procedure operates by constructing “knockoff copies” of each of the p features, which are then used as a control group to ensure that the model selection algorithm is not choosing …
Total citations
2019: 11, 2020: 14, 2021: 17, 2022: 29, 2023: 30, 2024: 27
Scholar articles
RF Barber, EJ Candès, RJ Samworth - The Annals of Statistics, 2020
RF Barber, E Candes, R Samworth - 2020
R Foygel Barber, EJ Candès, RJ Samworth - arXiv e-prints, 2018