Multi-distribution Learning: From Worst-Case Optimality to Lexicographic Min-Max Optimality

Guanghui Wang, Umar Syed, Robert E. Schapire, Jacob Abernethy
Proceedings of The 37th International Conference on Algorithmic Learning Theory, PMLR 313:1-19, 2026.

Abstract

We study multi-distribution learning (MDL), where the goal is to train a model that performs well across a set of underlying groups or distributions. The predominant performance metric in MDL is min-max optimality, which guarantees the smallest possible error on the “hardest” distribution. Optimizing this metric alone has an unfortunate side effect: it can produce models that sacrifice achievable performance gains on the non-worst-case groups. In the present work we propose a natural alternative, the lexicographic min-max (lex-min-max) objective, which promotes balanced performance by sequentially minimizing the worst, second-worst, and subsequent group losses. Despite its non-convex nature, we show that obtaining an (approximate) lex-min-max solution can be as easy as achieving (approximate) min-max optimality. We develop an efficient algorithm that directly approximates lex-min-max optimality by running stochastic no-regret dynamics on a regularized variant of the classical min-max objective. The method is simple to implement, and it advances the frontier of multi-distribution learning by providing stronger, hierarchy-aware performance guarantees.
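
To make the new objective concrete, one standard way to formalize lexicographic min-max optimality is sketched below. The notation is ours and is only consistent with the abstract; the paper's exact definitions (e.g., over randomized models) may differ. Write $L_i(w)$ for the expected loss of model $w$ on distribution $i$, for $i = 1, \dots, k$:

\[
w^\star \ \text{is lex-min-max optimal} \iff
\mathrm{sort}^{\downarrow}\big(L_1(w^\star),\dots,L_k(w^\star)\big)
\ \preceq_{\mathrm{lex}}\
\mathrm{sort}^{\downarrow}\big(L_1(w),\dots,L_k(w)\big)
\quad \text{for all models } w,
\]

where $\mathrm{sort}^{\downarrow}$ sorts a loss vector in non-increasing order and $\preceq_{\mathrm{lex}}$ compares sorted vectors coordinate by coordinate: worst losses first, with ties broken by the second-worst loss, and so on. Ordinary min-max optimality constrains only the first coordinate, $\max_i L_i(w^\star) = \min_w \max_i L_i(w)$; lex-min-max additionally pins down the entire sorted loss profile.

The following is a minimal sketch of the kind of stochastic no-regret dynamics the abstract refers to, applied here to the plain (unregularized) min-max objective: a Hedge-style max player reweights the k distributions while an SGD min player updates the model. All function names, data, and step sizes are illustrative assumptions, and this is not the paper's algorithm; in particular, the paper runs such dynamics on a regularized variant of the objective so as to recover the full lexicographic hierarchy, which this sketch does not attempt.

import numpy as np

def stochastic_minmax_dynamics(stoch_loss_and_grad, d, k, T=2000,
                               eta_w=0.05, eta_p=0.05, seed=0):
    """Hedge over groups (max player) vs. SGD on the model (min player).

    stoch_loss_and_grad(i, w) must return a noisy (loss, gradient) pair
    for group i at model w.  When each group loss is convex, the averaged
    iterate approximates a solution of  min_w max_p sum_i p_i L_i(w).
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(d)                   # model iterate (min player)
    logits = np.zeros(k)              # log-weights of the Hedge player
    w_avg = np.zeros(d)
    for t in range(T):
        p = np.exp(logits - logits.max())
        p /= p.sum()                  # current distribution over groups
        i = rng.choice(k, p=p)        # max player samples one group
        loss, grad = stoch_loss_and_grad(i, w)
        w -= eta_w * grad                  # min player: stochastic gradient step
        logits[i] += eta_p * loss / p[i]   # importance-weighted Hedge update
        w_avg += (w - w_avg) / (t + 1)     # running average of iterates
    return w_avg

# Toy usage on hypothetical data: k least-squares groups sharing one model.
rng = np.random.default_rng(1)
d, k = 5, 3
A = [rng.normal(size=(20, d)) for _ in range(k)]
b = [rng.normal(size=20) for _ in range(k)]

def stoch_loss_and_grad(i, w):
    j = rng.integers(20)              # one random example from group i
    r = A[i][j] @ w - b[i][j]         # residual on that example
    return 0.5 * r * r, r * A[i][j]   # squared loss and its gradient

w_hat = stochastic_minmax_dynamics(stoch_loss_and_grad, d, k)

A naive route to lex-min-max would re-run such a solver k times, each pass holding the already-minimized worst losses fixed while shrinking the next-worst one; as the abstract notes, the paper instead approximates the whole hierarchy in a single run through a regularized min-max objective.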

Cite this Paper


BibTeX
@InProceedings{pmlr-v313-wang26a,
  title     = {Multi-distribution Learning: From Worst-Case Optimality to Lexicographic Min-Max Optimality},
  author    = {Wang, Guanghui and Syed, Umar and Schapire, Robert E. and Abernethy, Jacob},
  booktitle = {Proceedings of The 37th International Conference on Algorithmic Learning Theory},
  pages     = {1--19},
  year      = {2026},
  editor    = {Telgarsky, Matus and Ullman, Jonathan},
  volume    = {313},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--26 Feb},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v313/main/assets/wang26a/wang26a.pdf},
  url       = {https://proceedings.mlr.press/v313/wang26a.html},
  abstract  = {We study multi-distribution learning (MDL), where the goal is to train a model that performs well across a set of underlying groups or distributions. The predominant performance metric in MDL is min-max optimality, which guarantees minimum error on the “hardest” distribution presented. Optimizing this metric has the unfortunate side effect of producing models that potentially sacrifice performance gains on non-worst-case groups. In the present work we propose a natural alternative, the lexicographic min-max (lex-min-max) objective, which promotes balanced performance by sequentially minimizing the worst, second-worst, and subsequent group losses. Despite its non-convex nature, we show that obtaining an (approximate) lex-min-max solution can be as easy as achieving (approximate) min-max optimality. We develop an efficient algorithm that directly approximates lex-min-max optimality via implementing stochastic no-regret dynamics on a regularized variant of the classical min-max objective. Our method is efficient and easy to implement, and it advances the frontier of multi-distribution learning by providing stronger, hierarchy-aware performance guarantees.}
}
Endnote
%0 Conference Paper
%T Multi-distribution Learning: From Worst-Case Optimality to Lexicographic Min-Max Optimality
%A Guanghui Wang
%A Umar Syed
%A Robert E. Schapire
%A Jacob Abernethy
%B Proceedings of The 37th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Matus Telgarsky
%E Jonathan Ullman
%F pmlr-v313-wang26a
%I PMLR
%P 1--19
%U https://proceedings.mlr.press/v313/wang26a.html
%V 313
%X We study multi-distribution learning (MDL), where the goal is to train a model that performs well across a set of underlying groups or distributions. The predominant performance metric in MDL is min-max optimality, which guarantees minimum error on the “hardest” distribution presented. Optimizing this metric has the unfortunate side effect of producing models that potentially sacrifice performance gains on non-worst-case groups. In the present work we propose a natural alternative, the lexicographic min-max (lex-min-max) objective, which promotes balanced performance by sequentially minimizing the worst, second-worst, and subsequent group losses. Despite its non-convex nature, we show that obtaining an (approximate) lex-min-max solution can be as easy as achieving (approximate) min-max optimality. We develop an efficient algorithm that directly approximates lex-min-max optimality via implementing stochastic no-regret dynamics on a regularized variant of the classical min-max objective. Our method is efficient and easy to implement, and it advances the frontier of multi-distribution learning by providing stronger, hierarchy-aware performance guarantees.
APA
Wang, G., Syed, U., Schapire, R.E. & Abernethy, J. (2026). Multi-distribution Learning: From Worst-Case Optimality to Lexicographic Min-Max Optimality. Proceedings of The 37th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 313:1-19. Available from https://proceedings.mlr.press/v313/wang26a.html.
