Learning to limit data collection via scaling laws: A computational interpretation for the legal principle of data minimization

Authors

Divya Shanmugam, Fernando Diaz, Samira Shabanian, Michèle Finck, Asia Biega

Publication date

2022/6/21

Book

Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency

Pages

839-849

Description

Modern machine learning systems are increasingly characterized by extensive personal data collection, despite the diminishing returns and increasing societal costs of such practices. Yet, data minimization is one of the core data protection principles enshrined in the European Union's General Data Protection Regulation (GDPR) and requires that only personal data that is adequate, relevant and limited to what is necessary is processed. However, the principle has seen limited adoption due to the lack of technical interpretation.

In this work, we build on literature in machine learning and law to propose FIDO, a Framework for Inhibiting Data Overcollection. FIDO learns to limit data collection based on an interpretation of data minimization tied to system performance. Concretely, FIDO provides a data collection stopping criterion by iteratively updating an estimate of the performance curve, or the relationship between …

Total citations

Cited by 16

Citations per year: 2022: 1, 2023: 9, 2024: 6
