Delegating Data Collection in Decentralized Machine Learning

Nivasini Ananthakrishnan, Stephen Bates, Michael Jordan, Nika Haghtalab
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:478-486, 2024.

Abstract

Motivated by the emergence of decentralized machine learning (ML) ecosystems, we study the delegation of data collection. Taking the field of contract theory as our starting point, we design optimal and near-optimal contracts that deal with two fundamental information asymmetries that arise in decentralized ML: uncertainty in the assessment of model quality and uncertainty regarding the optimal performance of any model. We show that a principal can cope with such asymmetry via simple linear contracts that achieve a $1-1/e$ fraction of the optimal utility. To address the lack of a priori knowledge regarding the optimal performance, we give a convex program that can adaptively and efficiently compute the optimal contract. We also analyze the optimal utility and linear contracts for the more complex setting of multiple interactions.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-ananthakrishnan24a,
  title = {Delegating Data Collection in Decentralized Machine Learning},
  author = {Ananthakrishnan, Nivasini and Bates, Stephen and Jordan, Michael and Haghtalab, Nika},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = {478--486},
  year = {2024},
  editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = {238},
  series = {Proceedings of Machine Learning Research},
  month = {02--04 May},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v238/ananthakrishnan24a/ananthakrishnan24a.pdf},
  url = {https://proceedings.mlr.press/v238/ananthakrishnan24a.html},
  abstract = {Motivated by the emergence of decentralized machine learning (ML) ecosystems, we study the delegation of data collection. Taking the field of contract theory as our starting point, we design optimal and near-optimal contracts that deal with two fundamental information asymmetries that arise in decentralized ML: uncertainty in the assessment of model quality and uncertainty regarding the optimal performance of any model. We show that a principal can cope with such asymmetry via simple linear contracts that achieve a $1-1/e$ fraction of the optimal utility. To address the lack of a priori knowledge regarding the optimal performance, we give a convex program that can adaptively and efficiently compute the optimal contract. We also analyze the optimal utility and linear contracts for the more complex setting of multiple interactions.}
}
Endnote
%0 Conference Paper
%T Delegating Data Collection in Decentralized Machine Learning
%A Nivasini Ananthakrishnan
%A Stephen Bates
%A Michael Jordan
%A Nika Haghtalab
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-ananthakrishnan24a
%I PMLR
%P 478--486
%U https://proceedings.mlr.press/v238/ananthakrishnan24a.html
%V 238
%X Motivated by the emergence of decentralized machine learning (ML) ecosystems, we study the delegation of data collection. Taking the field of contract theory as our starting point, we design optimal and near-optimal contracts that deal with two fundamental information asymmetries that arise in decentralized ML: uncertainty in the assessment of model quality and uncertainty regarding the optimal performance of any model. We show that a principal can cope with such asymmetry via simple linear contracts that achieve a $1-1/e$ fraction of the optimal utility. To address the lack of a priori knowledge regarding the optimal performance, we give a convex program that can adaptively and efficiently compute the optimal contract. We also analyze the optimal utility and linear contracts for the more complex setting of multiple interactions.
APA
Ananthakrishnan, N., Bates, S., Jordan, M. & Haghtalab, N. (2024). Delegating Data Collection in Decentralized Machine Learning. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:478-486. Available from https://proceedings.mlr.press/v238/ananthakrishnan24a.html.
