Discrete Continuous Optimization Framework for Simultaneous Clustering and Training in Mixture Models

Parth Vipul Sangani; Arjun Shashank Kashettiwar; Pritish Chakraborty; Bhuvan Reddy Gangula; Durga S; Ganesh Ramakrishnan; Rishabh K Iyer; Abir De

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:29950-29970, 2023.

Abstract

We study a new framework of learning mixture models via automatic clustering called PRESTO, wherein we optimize a joint objective function on the model parameters and the partitioning, with each model tailored to perform well on its specific cluster. In contrast to prior work, we do not assume any generative model for the data. We convert our training problem to a joint parameter estimation and subset selection problem, subject to a matroid span constraint. This allows us to reduce our problem to a constrained set function minimization problem, where the underlying objective is monotone and approximately submodular. We then propose a new joint discrete-continuous optimization algorithm that achieves a bounded approximation guarantee for our problem. We show that PRESTO outperforms several alternative methods. Finally, we study PRESTO in the context of resource-efficient deep learning, where we train smaller resource-constrained models on each partition and show that it outperforms existing data partitioning and model pruning/knowledge distillation approaches, which, in contrast to PRESTO, require large initial (teacher) models.

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-sangani23a,
  title = 	 {Discrete Continuous Optimization Framework for Simultaneous Clustering and Training in Mixture Models},
  author =       {Sangani, Parth Vipul and Kashettiwar, Arjun Shashank and Chakraborty, Pritish and Gangula, Bhuvan Reddy and S, Durga and Ramakrishnan, Ganesh and Iyer, Rishabh K and De, Abir},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {29950--29970},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/sangani23a/sangani23a.pdf},
  url = 	 {https://proceedings.mlr.press/v202/sangani23a.html},
  abstract = 	 {We study a new framework of learning mixture models via automatic clustering called PRESTO, wherein we optimize a joint objective function on the model parameters and the partitioning, with each model tailored to perform well on its specific cluster. In contrast to prior work, we do not assume any generative model for the data. We convert our training problem to a joint parameter estimation cum a subset selection problem, subject to a matroid span constraint. This allows us to reduce our problem into a constrained set function minimization problem, where the underlying objective is monotone and approximately submodular. We then propose a new joint discrete-continuous optimization algorithm that achieves a bounded approximation guarantee for our problem. We show that PRESTO outperforms several alternative methods. Finally, we study PRESTO in the context of resource-efficient deep learning, where we train smaller resource-constrained models on each partition and show that it outperforms existing data partitioning and model pruning/knowledge distillation approaches, which in contrast to PRESTO, require large initial (teacher) models.}
}

Endnote

%0 Conference Paper
%T Discrete Continuous Optimization Framework for Simultaneous Clustering and Training in Mixture Models
%A Parth Vipul Sangani
%A Arjun Shashank Kashettiwar
%A Pritish Chakraborty
%A Bhuvan Reddy Gangula
%A Durga S
%A Ganesh Ramakrishnan
%A Rishabh K Iyer
%A Abir De
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-sangani23a
%I PMLR
%P 29950--29970
%U https://proceedings.mlr.press/v202/sangani23a.html
%V 202
%X We study a new framework of learning mixture models via automatic clustering called PRESTO, wherein we optimize a joint objective function on the model parameters and the partitioning, with each model tailored to perform well on its specific cluster. In contrast to prior work, we do not assume any generative model for the data. We convert our training problem to a joint parameter estimation cum a subset selection problem, subject to a matroid span constraint. This allows us to reduce our problem into a constrained set function minimization problem, where the underlying objective is monotone and approximately submodular. We then propose a new joint discrete-continuous optimization algorithm that achieves a bounded approximation guarantee for our problem. We show that PRESTO outperforms several alternative methods. Finally, we study PRESTO in the context of resource-efficient deep learning, where we train smaller resource-constrained models on each partition and show that it outperforms existing data partitioning and model pruning/knowledge distillation approaches, which in contrast to PRESTO, require large initial (teacher) models.

APA


Sangani, P.V., Kashettiwar, A.S., Chakraborty, P., Gangula, B.R., S, D., Ramakrishnan, G., Iyer, R.K. & De, A. (2023). Discrete Continuous Optimization Framework for Simultaneous Clustering and Training in Mixture Models. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:29950-29970. Available from https://proceedings.mlr.press/v202/sangani23a.html.
