Exploring Large Action Sets with Hyperspherical Embeddings using von Mises-Fisher Sampling

Walid Bendada, Guillaume Salha-Galvan, Romain Hennequin, Théo Bontempelli, Thomas Bouabça, Tristan Cazenave
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:3677-3711, 2025.

Abstract

This paper introduces von Mises-Fisher exploration (vMF-exp), a scalable method for exploring large action sets in reinforcement learning problems where hyperspherical embedding vectors represent these actions. vMF-exp involves initially sampling a state embedding representation using a von Mises-Fisher distribution, then exploring this representation’s nearest neighbors, which scales to virtually unlimited numbers of candidate actions. We show that, under theoretical assumptions, vMF-exp asymptotically maintains the same probability of exploring each action as Boltzmann Exploration (B-exp), a popular alternative that, nonetheless, suffers from scalability issues as it requires computing softmax values for each action. Consequently, vMF-exp serves as a scalable alternative to B-exp for exploring large action sets with hyperspherical embeddings. Experiments on simulated data, real-world public data, and the successful large-scale deployment of vMF-exp on the recommender system of a global music streaming service empirically validate the key properties of the proposed method.
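The two-step procedure described in the abstract (sample a vector around the state embedding from a von Mises-Fisher distribution, then pick its nearest neighbor among the action embeddings) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names `sample_vmf`, `vmf_exp`, and `boltzmann_probs` are hypothetical. The sampler uses Wood's (1994) rejection scheme, which is one standard choice. The brute-force nearest-neighbor search stands in for the approximate nearest-neighbor index that a large-scale system would presumably use. The B-exp baseline is assumed to apply a softmax to inner products scaled by the same concentration parameter `kappa`.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sample_vmf(mu, kappa, rng):
    """Draw one sample from vMF(mu, kappa) on the unit sphere.

    Uses Wood's (1994) rejection sampler. Assumes mu has unit norm,
    len(mu) >= 2, and kappa > 0.
    """
    d = len(mu)
    b = (-2.0 * kappa + math.sqrt(4.0 * kappa ** 2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1.0 - b) / (1.0 + b)
    c = kappa * x0 + (d - 1) * math.log(1.0 - x0 ** 2)
    while True:
        z = rng.betavariate((d - 1) / 2.0, (d - 1) / 2.0)
        w = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
        u = 1.0 - rng.random()  # in (0, 1], so log(u) is defined
        if kappa * w + (d - 1) * math.log(1.0 - x0 * w) - c >= math.log(u):
            break
    # Uniform unit direction in the subspace orthogonal to mu.
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    proj = dot(g, mu)
    tangent = normalize([gi - proj * mi for gi, mi in zip(g, mu)])
    scale = math.sqrt(max(0.0, 1.0 - w * w))
    return [w * mi + scale * ti for mi, ti in zip(mu, tangent)]


def vmf_exp(state, actions, kappa, rng):
    """vMF-exp sketch: sample around the state, return the nearest action index.

    Brute-force search is an illustrative stand-in for an approximate
    nearest-neighbor index.
    """
    v_tilde = sample_vmf(state, kappa, rng)
    return max(range(len(actions)), key=lambda i: dot(v_tilde, actions[i]))


def boltzmann_probs(state, actions, kappa):
    """B-exp sketch: softmax over all actions (cost grows with the action set)."""
    scores = [kappa * dot(state, a) for a in actions]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n_actions, kappa = 8, 1000, 10.0
    state = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
    actions = [normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
               for _ in range(n_actions)]
    print("vMF-exp picked action", vmf_exp(state, actions, kappa, rng))
```

Note the difference in per-step cost: `boltzmann_probs` must score every action before sampling, whereas `vmf_exp` only needs one vMF draw plus a nearest-neighbor lookup, which is the step that can be delegated to an index.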

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-bendada25a,
  title = {Exploring Large Action Sets with Hyperspherical Embeddings using von Mises-{F}isher Sampling},
  author = {Bendada, Walid and Salha-Galvan, Guillaume and Hennequin, Romain and Bontempelli, Th\'{e}o and Bouab\c{c}a, Thomas and Cazenave, Tristan},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {3677--3711},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/bendada25a/bendada25a.pdf},
  url = {https://proceedings.mlr.press/v267/bendada25a.html},
  abstract = {This paper introduces von Mises-Fisher exploration (vMF-exp), a scalable method for exploring large action sets in reinforcement learning problems where hyperspherical embedding vectors represent these actions. vMF-exp involves initially sampling a state embedding representation using a von Mises-Fisher distribution, then exploring this representation’s nearest neighbors, which scales to virtually unlimited numbers of candidate actions. We show that, under theoretical assumptions, vMF-exp asymptotically maintains the same probability of exploring each action as Boltzmann Exploration (B-exp), a popular alternative that, nonetheless, suffers from scalability issues as it requires computing softmax values for each action. Consequently, vMF-exp serves as a scalable alternative to B-exp for exploring large action sets with hyperspherical embeddings. Experiments on simulated data, real-world public data, and the successful large-scale deployment of vMF-exp on the recommender system of a global music streaming service empirically validate the key properties of the proposed method.}
}
Endnote
%0 Conference Paper
%T Exploring Large Action Sets with Hyperspherical Embeddings using von Mises-Fisher Sampling
%A Walid Bendada
%A Guillaume Salha-Galvan
%A Romain Hennequin
%A Théo Bontempelli
%A Thomas Bouabça
%A Tristan Cazenave
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-bendada25a
%I PMLR
%P 3677--3711
%U https://proceedings.mlr.press/v267/bendada25a.html
%V 267
%X This paper introduces von Mises-Fisher exploration (vMF-exp), a scalable method for exploring large action sets in reinforcement learning problems where hyperspherical embedding vectors represent these actions. vMF-exp involves initially sampling a state embedding representation using a von Mises-Fisher distribution, then exploring this representation’s nearest neighbors, which scales to virtually unlimited numbers of candidate actions. We show that, under theoretical assumptions, vMF-exp asymptotically maintains the same probability of exploring each action as Boltzmann Exploration (B-exp), a popular alternative that, nonetheless, suffers from scalability issues as it requires computing softmax values for each action. Consequently, vMF-exp serves as a scalable alternative to B-exp for exploring large action sets with hyperspherical embeddings. Experiments on simulated data, real-world public data, and the successful large-scale deployment of vMF-exp on the recommender system of a global music streaming service empirically validate the key properties of the proposed method.
APA
Bendada, W., Salha-Galvan, G., Hennequin, R., Bontempelli, T., Bouabça, T. & Cazenave, T. (2025). Exploring Large Action Sets with Hyperspherical Embeddings using von Mises-Fisher Sampling. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:3677-3711. Available from https://proceedings.mlr.press/v267/bendada25a.html.
