Authors
Divya Shanmugam, Fernando Diaz, Samira Shabanian, Michèle Finck, Asia Biega
Publication date
2022/6/21
Book
Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
Pages
839-849
Description
Modern machine learning systems are increasingly characterized by extensive personal data collection, despite the diminishing returns and increasing societal costs of such practices. Yet, data minimisation is one of the core data protection principles enshrined in the European Union’s General Data Protection Regulation (‘GDPR’) and requires that only personal data that is adequate, relevant and limited to what is necessary is processed. However, the principle has seen limited adoption due to the lack of technical interpretation.
In this work, we build on literature in machine learning and law to propose FIDO, a Framework for Inhibiting Data Overcollection. FIDO learns to limit data collection based on an interpretation of data minimization tied to system performance. Concretely, FIDO provides a data collection stopping criterion by iteratively updating an estimate of the performance curve, or the relationship between …
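The idea of a performance-based stopping criterion can be illustrated with a minimal sketch. This is not the authors' FIDO algorithm, whose details are cut off in the description above. It assumes, purely for illustration, that the performance curve follows a power law in the number of samples, and it stops collecting once the predicted gain from the next batch falls below a chosen threshold. The names `fit_power_law`, `should_stop`, `collect_until_stop`, and the parameter `min_gain` are hypothetical.

```python
import numpy as np


def fit_power_law(sizes, errors):
    """Fit errors ~ a * sizes**(-b) by least squares in log-log space.

    Assumption: a power law is used here only as an example curve family.
    """
    slope, intercept = np.polyfit(np.log(sizes), np.log(errors), deg=1)
    return float(np.exp(intercept)), float(-slope)


def should_stop(sizes, errors, next_size, min_gain):
    """Return True if the predicted error reduction from growing the
    dataset to next_size is smaller than min_gain (hypothetical threshold)."""
    a, b = fit_power_law(sizes, errors)
    predicted_gain = a * sizes[-1] ** (-b) - a * next_size ** (-b)
    return predicted_gain < min_gain


def collect_until_stop(measure_error, start, step, min_gain, max_size):
    """Grow the dataset in fixed steps, refitting the curve estimate after
    each measurement, and stop once further collection is predicted to
    yield less than min_gain."""
    sizes, errors = [], []
    n = start
    while n <= max_size:
        sizes.append(n)
        errors.append(measure_error(n))
        # Wait for at least three points before trusting the fitted curve.
        if len(sizes) >= 3 and should_stop(sizes, errors, n + step, min_gain):
            break
        n += step
    return sizes, errors


# Synthetic stand-in for "train a model on n samples and measure its error".
def synthetic_error(n):
    return 2.0 * n ** -0.5


sizes, errors = collect_until_stop(
    synthetic_error, start=100, step=100, min_gain=0.005, max_size=5000
)
print("stopped collecting at", sizes[-1], "samples")
```

In practice `measure_error` would retrain and evaluate a model at each dataset size, and the threshold would have to reflect what counts as a meaningful performance improvement for the system in question.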
Scholar articles
D Shanmugam, F Diaz, S Shabanian, M Finck, A Biega - Proceedings of the 2022 ACM Conference on Fairness …, 2022