Finding k in Latent $k-$ polytope

Chiranjib Bhattacharyya, Ravindran Kannan, Amit Kumar
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:894-903, 2021.

Abstract

The recently introduced Latent $k-$ Polytope($\LkP$) encompasses several stochastic Mixed Membership models including Topic Models. The problem of finding $k$, the number of extreme points of $\LkP$, is a fundamental challenge and includes several important open problems such as determination of number of components in Ad-mixtures. This paper addresses this challenge by introducing Interpolative Convex Rank(\INR) of a matrix defined as the minimum number of its columns whose convex hull is within Hausdorff distance $\varepsilon$ of the convex hull of all columns. The first important contribution of this paper is to show that under \emph{standard assumptions} $k$ equals the \INR of a \emph{subset smoothed data matrix} defined from Data generated from an $\LkP$. The second important contribution of the paper is a polynomial time algorithm for finding $k$ under standard assumptions. An immediate corollary is the first polynomial time algorithm for finding the \emph{inner dimension} in Non-negative matrix factorisation(NMF) with assumptions which are qualitatively different than existing ones such as \emph{Separability}. %An immediate corollary is the first polynomial time algorithm for finding the \emph{inner dimension} in Non-negative matrix factorisation(NMF) with assumptions considerably weaker than \emph{Separability}.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-bhattacharyya21a, title = {Finding k in Latent $k-$ polytope}, author = {Bhattacharyya, Chiranjib and Kannan, Ravindran and Kumar, Amit}, booktitle = {Proceedings of the 38th International Conference on Machine Learning}, pages = {894--903}, year = {2021}, editor = {Meila, Marina and Zhang, Tong}, volume = {139}, series = {Proceedings of Machine Learning Research}, month = {18--24 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v139/bhattacharyya21a/bhattacharyya21a.pdf}, url = {https://proceedings.mlr.press/v139/bhattacharyya21a.html}, abstract = {The recently introduced Latent $k-$ Polytope($\LkP$) encompasses several stochastic Mixed Membership models including Topic Models. The problem of finding $k$, the number of extreme points of $\LkP$, is a fundamental challenge and includes several important open problems such as determination of number of components in Ad-mixtures. This paper addresses this challenge by introducing Interpolative Convex Rank(\INR) of a matrix defined as the minimum number of its columns whose convex hull is within Hausdorff distance $\varepsilon$ of the convex hull of all columns. The first important contribution of this paper is to show that under \emph{standard assumptions} $k$ equals the \INR of a \emph{subset smoothed data matrix} defined from Data generated from an $\LkP$. The second important contribution of the paper is a polynomial time algorithm for finding $k$ under standard assumptions. An immediate corollary is the first polynomial time algorithm for finding the \emph{inner dimension} in Non-negative matrix factorisation(NMF) with assumptions which are qualitatively different than existing ones such as \emph{Separability}. %An immediate corollary is the first polynomial time algorithm for finding the \emph{inner dimension} in Non-negative matrix factorisation(NMF) with assumptions considerably weaker than \emph{Separability}.} }
Endnote
%0 Conference Paper %T Finding k in Latent $k-$ polytope %A Chiranjib Bhattacharyya %A Ravindran Kannan %A Amit Kumar %B Proceedings of the 38th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2021 %E Marina Meila %E Tong Zhang %F pmlr-v139-bhattacharyya21a %I PMLR %P 894--903 %U https://proceedings.mlr.press/v139/bhattacharyya21a.html %V 139 %X The recently introduced Latent $k-$ Polytope($\LkP$) encompasses several stochastic Mixed Membership models including Topic Models. The problem of finding $k$, the number of extreme points of $\LkP$, is a fundamental challenge and includes several important open problems such as determination of number of components in Ad-mixtures. This paper addresses this challenge by introducing Interpolative Convex Rank(\INR) of a matrix defined as the minimum number of its columns whose convex hull is within Hausdorff distance $\varepsilon$ of the convex hull of all columns. The first important contribution of this paper is to show that under \emph{standard assumptions} $k$ equals the \INR of a \emph{subset smoothed data matrix} defined from Data generated from an $\LkP$. The second important contribution of the paper is a polynomial time algorithm for finding $k$ under standard assumptions. An immediate corollary is the first polynomial time algorithm for finding the \emph{inner dimension} in Non-negative matrix factorisation(NMF) with assumptions which are qualitatively different than existing ones such as \emph{Separability}. %An immediate corollary is the first polynomial time algorithm for finding the \emph{inner dimension} in Non-negative matrix factorisation(NMF) with assumptions considerably weaker than \emph{Separability}.
APA
Bhattacharyya, C., Kannan, R. & Kumar, A.. (2021). Finding k in Latent $k-$ polytope. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:894-903 Available from https://proceedings.mlr.press/v139/bhattacharyya21a.html.

Related Material