Data Valuation using Reinforcement Learning

Authors

Jinsung Yoon, Sercan O Arik, Tomas Pfister

Publication date

2020

Conference

ICML

Description

Quantifying the value of data is a fundamental problem in machine learning and has multiple important use cases: (1) building insights about the dataset and task, (2) domain adaptation, (3) corrupted sample discovery, and (4) robust learning. We propose Data Valuation using Reinforcement Learning (DVRL) to adaptively learn data values jointly with the predictor model. DVRL uses a data value estimator (DVE) to learn how likely each datum is to be used in training the predictor model. The DVE is trained using a reinforcement signal that reflects performance on the target task. We demonstrate that DVRL yields superior data value estimates compared to alternative methods across numerous datasets and application scenarios. The corrupted sample discovery performance of DVRL is close to optimal in many regimes (i.e., as if the noisy samples were known a priori), and for domain adaptation and robust learning DVRL significantly outperforms the state of the art by 14.6% and 10.8%, respectively.
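
The training loop described above (a DVE that outputs per-sample selection probabilities, a predictor trained on the sampled subset, and a reward taken from target-task performance) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the linear DVE, the logistic-regression predictor, the negative-log-loss reward, the moving-average baseline, the toy data, and every function name and hyperparameter below are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def make_toy(n, flip, rng):
    """Hypothetical toy task: 2-D points, labels flipped with probability `flip`."""
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)
    noisy = rng.random(n) < flip
    return X, np.where(noisy, 1 - y, y), noisy

def fit_logreg(X, y, w, steps=100, lr=0.5):
    """Stand-in predictor: logistic regression weighted by the selection mask w."""
    theta = np.zeros(X.shape[1])
    denom = max(w.sum(), 1.0)  # avoid dividing by zero if nothing is selected
    for _ in range(steps):
        theta -= lr * X.T @ (w * (sigmoid(X @ theta) - y)) / denom
    return theta

def val_reward(theta, Xv, yv):
    """Assumed reward: negative log loss on a small clean validation set."""
    p = np.clip(sigmoid(Xv @ theta), 1e-6, 1 - 1e-6)
    return float(np.mean(yv * np.log(p) + (1 - yv) * np.log(1 - p)))

def dve_features(X, y):
    """DVE input: features, label-signed features, signed label, and a bias term."""
    s = (2 * y - 1)[:, None]
    return np.hstack([X, s * X, s, np.ones((len(y), 1))])

def train_dvrl_sketch(X, y, Xv, yv, iters=200, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    F = dve_features(X, y)
    phi = np.zeros(F.shape[1])  # parameters of the (linear) data value estimator
    baseline = None
    for _ in range(iters):
        p = sigmoid(F @ phi)                           # selection probabilities
        s = (rng.random(len(y)) < p).astype(float)     # sample which data to use
        r = val_reward(fit_logreg(X, y, s), Xv, yv)    # reinforcement signal
        adv = 0.0 if baseline is None else r - baseline
        baseline = r if baseline is None else 0.9 * baseline + 0.1 * r
        # REINFORCE: the gradient of the Bernoulli log-probability is F^T (s - p).
        phi += lr * adv * (F.T @ (s - p)) / len(y)
    return sigmoid(F @ phi)                            # estimated data values

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y, noisy = make_toy(300, 0.2, rng)
    Xv, yv, _ = make_toy(100, 0.0, rng)
    values = train_dvrl_sketch(X, y, Xv, yv)
    print("mean value, clean samples:", values[~noisy].mean())
    print("mean value, noisy samples:", values[noisy].mean())
```

The returned vector holds one selection probability per training sample, which plays the role of the learned data value in this sketch. Whether the noisy samples end up with visibly lower values depends on the seed, the learning rate, and the number of iterations, so the printed means should be read as a diagnostic rather than a guaranteed result.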

Total citations

Cited by 190

2020: 6, 2021: 27, 2022: 56, 2023: 59, 2024: 42

Scholar articles

Data valuation using reinforcement learning

J Yoon, S Arik, T Pfister - International Conference on Machine Learning, 2020