Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning

Sebastien Lachapelle, Tristan Deleu, Divyat Mahajan, Ioannis Mitliagkas, Yoshua Bengio, Simon Lacoste-Julien, Quentin Bertrand
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:18171-18206, 2023.

Abstract

Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding is limited. In this work, we provide evidence that disentangled representations coupled with sparse task-specific predictors improve generalization. In the context of multi-task learning, we prove a new identifiability result that provides conditions under which maximally sparse predictors yield disentangled representations. Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem. Finally, we explore a meta-learning version of this algorithm based on group Lasso multiclass SVM predictors, for which we derive a tractable dual formulation. It obtains competitive results on standard few-shot classification benchmarks, while each task uses only a fraction of the learned representations.
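To make the abstract's central idea concrete, below is a minimal sketch (not the authors' released code) of a shared encoder with task-specific linear heads, where a group-Lasso penalty on each head encourages every task to rely on only a few latent features. Note that the paper's actual formulation is bi-level (sparse predictors are fit in an inner problem), whereas this joint, single-level version only conveys the core idea; all dimensions, data, and the penalty weight `lam` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

input_dim, latent_dim, n_classes, n_tasks = 32, 16, 5, 4
encoder = nn.Sequential(nn.Linear(input_dim, latent_dim), nn.ReLU())
heads = nn.ModuleList(nn.Linear(latent_dim, n_classes) for _ in range(n_tasks))

def group_lasso(head: nn.Linear) -> torch.Tensor:
    # One group per latent feature: the column of weights feeding that
    # feature into all classes. Zeroing a group drops the feature entirely.
    return head.weight.norm(dim=0).sum()

lam = 0.1  # sparsity strength (illustrative value)
opt = torch.optim.Adam([*encoder.parameters(), *heads.parameters()], lr=1e-3)
for _ in range(100):
    loss = torch.zeros(())
    for head in heads:
        x = torch.randn(8, input_dim)          # stand-in batch for this task
        y = torch.randint(0, n_classes, (8,))  # stand-in labels
        loss = loss + F.cross_entropy(head(encoder(x)), y) + lam * group_lasso(head)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, inspecting `head.weight.norm(dim=0)` for each head shows which latent features that task kept; near-zero groups correspond to unused features, matching the abstract's claim that each task uses only a fraction of the learned representation.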

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-lachapelle23a,
  title     = {Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning},
  author    = {Lachapelle, Sebastien and Deleu, Tristan and Mahajan, Divyat and Mitliagkas, Ioannis and Bengio, Yoshua and Lacoste-Julien, Simon and Bertrand, Quentin},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {18171--18206},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/lachapelle23a/lachapelle23a.pdf},
  url       = {https://proceedings.mlr.press/v202/lachapelle23a.html},
  abstract  = {Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding is limited. In this work, we provide evidence that disentangled representations coupled with sparse task-specific predictors improve generalization. In the context of multi-task learning, we prove a new identifiability result that provides conditions under which maximally sparse predictors yield disentangled representations. Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem. Finally, we explore a meta-learning version of this algorithm based on group Lasso multiclass SVM predictors, for which we derive a tractable dual formulation. It obtains competitive results on standard few-shot classification benchmarks, while each task uses only a fraction of the learned representations.}
}
Endnote
%0 Conference Paper
%T Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning
%A Sebastien Lachapelle
%A Tristan Deleu
%A Divyat Mahajan
%A Ioannis Mitliagkas
%A Yoshua Bengio
%A Simon Lacoste-Julien
%A Quentin Bertrand
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-lachapelle23a
%I PMLR
%P 18171--18206
%U https://proceedings.mlr.press/v202/lachapelle23a.html
%V 202
%X Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding is limited. In this work, we provide evidence that disentangled representations coupled with sparse task-specific predictors improve generalization. In the context of multi-task learning, we prove a new identifiability result that provides conditions under which maximally sparse predictors yield disentangled representations. Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem. Finally, we explore a meta-learning version of this algorithm based on group Lasso multiclass SVM predictors, for which we derive a tractable dual formulation. It obtains competitive results on standard few-shot classification benchmarks, while each task uses only a fraction of the learned representations.
APA
Lachapelle, S., Deleu, T., Mahajan, D., Mitliagkas, I., Bengio, Y., Lacoste-Julien, S., & Bertrand, Q. (2023). Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:18171-18206. Available from https://proceedings.mlr.press/v202/lachapelle23a.html.