Source Identification for Mixtures of Product Distributions

Spencer Gordon, Bijan H Mazaheri, Yuval Rabani, Leonard Schulman
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:2193-2216, 2021.

Abstract

We give an algorithm for source identification of a mixture of k product distributions on n bits. This is a fundamental problem in machine learning with many applications. Our algorithm identifies the source parameters of an identifiable mixture, given, as input, approximate values of multilinear moments (derived, for instance, from a sufficiently large sample), using $2^{O(k^2)}n^{O(k)}$ arithmetic operations. Our result is the first explicit bound on the computational complexity of source identification of such mixtures. The running time improves previous results by Feldman, O’Donnell, and Servedio (FOCS 2005) and Chen and Moitra (STOC 2019) that guaranteed only learning the mixture (without parametric identification of the source). Our analysis gives a quantitative version of a qualitative characterization of identifiable sources that is due to Tahmasebi, Motahari, and Maddah-Ali (ISIT 2018).

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-gordon21a, title = {Source Identification for Mixtures of Product Distributions}, author = {Gordon, Spencer and Mazaheri, Bijan H and Rabani, Yuval and Schulman, Leonard}, booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory}, pages = {2193--2216}, year = {2021}, editor = {Belkin, Mikhail and Kpotufe, Samory}, volume = {134}, series = {Proceedings of Machine Learning Research}, month = {15--19 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v134/gordon21a/gordon21a.pdf}, url = {https://proceedings.mlr.press/v134/gordon21a.html}, abstract = {We give an algorithm for source identification of a mixture of k product distributions on n bits. This is a fundamental problem in machine learning with many applications. Our algorithm identifies the source parameters of an identifiable mixture, given, as input, approximate values of multilinear moments (derived, for instance, from a sufficiently large sample), using $2^{O(k^2)}n^{O(k)}$ arithmetic operations. Our result is the first explicit bound on the computational complexity of source identification of such mixtures. The running time improves previous results by Feldman, O’Donnell, and Servedio (FOCS 2005) and Chen and Moitra (STOC 2019) that guaranteed only learning the mixture (without parametric identification of the source). Our analysis gives a quantitative version of a qualitative characterization of identifiable sources that is due to Tahmasebi, Motahari, and Maddah-Ali (ISIT 2018).} }
Endnote
%0 Conference Paper %T Source Identification for Mixtures of Product Distributions %A Spencer Gordon %A Bijan H Mazaheri %A Yuval Rabani %A Leonard Schulman %B Proceedings of Thirty Fourth Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2021 %E Mikhail Belkin %E Samory Kpotufe %F pmlr-v134-gordon21a %I PMLR %P 2193--2216 %U https://proceedings.mlr.press/v134/gordon21a.html %V 134 %X We give an algorithm for source identification of a mixture of k product distributions on n bits. This is a fundamental problem in machine learning with many applications. Our algorithm identifies the source parameters of an identifiable mixture, given, as input, approximate values of multilinear moments (derived, for instance, from a sufficiently large sample), using $2^{O(k^2)}n^{O(k)}$ arithmetic operations. Our result is the first explicit bound on the computational complexity of source identification of such mixtures. The running time improves previous results by Feldman, O’Donnell, and Servedio (FOCS 2005) and Chen and Moitra (STOC 2019) that guaranteed only learning the mixture (without parametric identification of the source). Our analysis gives a quantitative version of a qualitative characterization of identifiable sources that is due to Tahmasebi, Motahari, and Maddah-Ali (ISIT 2018).
APA
Gordon, S., Mazaheri, B.H., Rabani, Y. & Schulman, L.. (2021). Source Identification for Mixtures of Product Distributions. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:2193-2216 Available from https://proceedings.mlr.press/v134/gordon21a.html.

Related Material