The Lie-Group Bayesian Learning Rule

Eren Mehmet Kiral, Thomas Moellenhoff, Mohammad Emtiyaz Khan
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:3331-3352, 2023.

Abstract

The Bayesian Learning Rule provides a framework for generic algorithm design but can be difficult to use for three reasons. First, it requires a specific parameterization of the exponential family. Second, it uses gradients that can be difficult to compute. Third, its update may not always stay on the manifold. We address these difficulties by proposing an extension based on Lie groups, where posteriors are parametrized through transformations of an arbitrary base distribution and updated via the group's exponential map. This resolves all three difficulties in many cases: the group's action provides flexible parametrizations, reparameterization makes gradient computation simple, and the updates always stay on the manifold. We use the new learning rule to derive a new deep-learning algorithm with desirable biologically plausible attributes for learning sparse features. Our work opens a new frontier for the design of new algorithms that exploit Lie-group structures.
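To make the mechanics concrete, below is a minimal numerical sketch of the idea, not the paper's algorithm: it fits a Gaussian posterior to a toy quadratic loss by letting the affine group act on a standard-normal base distribution, estimating gradients by reparameterization, and updating the scale multiplicatively through the group's exponential map so that it stays on the manifold (here, sigma > 0). The loss, step size, and sample count are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy loss l(w) = (w - target)^2; the paper targets deep-learning losses.
target = 3.0
def loss_grad(w):
    return 2.0 * (w - target)   # dl/dw

mu, sigma = 0.0, 1.0            # group element g = (sigma, mu) of the affine group
rho, steps, S = 0.05, 2000, 32  # step size, iterations, Monte Carlo samples

for _ in range(steps):
    z = rng.standard_normal(S)  # samples from the base distribution N(0, 1)
    w = mu + sigma * z          # reparameterization: w = g . z, so w ~ N(mu, sigma^2)
    g = loss_grad(w)
    grad_mu = g.mean()                         # d/dmu of E[l(w)]
    grad_sigma = (g * z).mean() - 1.0 / sigma  # d/dsigma of E[l(w)] minus entropy term
    # Lie-group updates: additive exponential map for the translation part,
    # multiplicative exponential map for the scaling part -- sigma stays positive.
    mu = mu - rho * grad_mu
    sigma = sigma * np.exp(-rho * sigma * grad_sigma)

print(mu, sigma)  # approaches (3.0, 1/sqrt(2)) for this quadratic loss
```

For this quadratic loss the iterates approach the minimizer of the variational free energy (mu = 3, sigma = 1/sqrt(2)), and the multiplicative update can never leave the positive reals, illustrating the "always stays on the manifold" property; the same pattern extends to richer groups acting on richer base distributions.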

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-kiral23a,
  title     = {The Lie-Group Bayesian Learning Rule},
  author    = {Kiral, Eren Mehmet and Moellenhoff, Thomas and Khan, Mohammad Emtiyaz},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {3331--3352},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/kiral23a/kiral23a.pdf},
  url       = {https://proceedings.mlr.press/v206/kiral23a.html}
}
Endnote
%0 Conference Paper
%T The Lie-Group Bayesian Learning Rule
%A Eren Mehmet Kiral
%A Thomas Moellenhoff
%A Mohammad Emtiyaz Khan
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-kiral23a
%I PMLR
%P 3331--3352
%U https://proceedings.mlr.press/v206/kiral23a.html
%V 206
APA
Kiral, E. M., Moellenhoff, T., & Khan, M. E. (2023). The Lie-Group Bayesian Learning Rule. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:3331-3352. Available from https://proceedings.mlr.press/v206/kiral23a.html.