Hierarchical partition of unity networks: fast multilevel training

Nathaniel Trask; Amelia Henriksen; Carianne Martinez; Eric Cyr

Proceedings of Mathematical and Scientific Machine Learning, PMLR 190:271-286, 2022.

Abstract

We present a probabilistic mixture of experts framework to perform nonparametric piecewise polynomial approximation without the need for an underlying mesh partitioning space. Deep neural networks traditionally used for classification provide a means of localizing polynomial approximation, and the probabilistic formulation admits a trivially parallelizable expectation maximization (EM) strategy. We then introduce a hierarchical architecture whose EM loss naturally decomposes into coarse and fine scale terms and small decoupled least squares problems. We exploit this hierarchical structure to formulate a V-cycle multigrid-inspired training algorithm. A suite of benchmarks demonstrates the ability of the scheme to: realize, for smooth data, algebraic convergence with respect to the number of partitions and exponential convergence with respect to polynomial order; exactly reproduce piecewise polynomial functions; and, in an application to data-driven semiconductor modeling, accurately treat data spanning several orders of magnitude.
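The idea of localized polynomial fits that decouple into small least squares problems can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation. It assumes fixed Gaussian-softmax partitions in place of a trained deep network, and it uses the partition values directly as least-squares weights rather than EM posterior responsibilities. The function names, the `width` parameter, and the test data are hypothetical choices made for illustration.

```python
import numpy as np

def soft_partitions(x, centers, width=0.5):
    """Softmax-normalised Gaussian bumps; each row sums to one (a partition of unity).

    Assumption: fixed centers stand in for the learned classification network.
    """
    logits = -((x[:, None] - centers[None, :]) ** 2) / (2.0 * width**2)
    logits -= logits.max(axis=1, keepdims=True)  # stabilise the softmax
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def fit_local_polynomials(x, y, phi, degree):
    """Solve one small weighted least-squares problem per partition.

    The problems share no unknowns, so they could be solved in parallel.
    """
    V = np.vander(x, degree + 1, increasing=True)  # columns 1, x, x^2, ...
    coeffs = []
    for i in range(phi.shape[1]):
        sw = np.sqrt(phi[:, i])  # square-root weights for weighted lstsq
        c, *_ = np.linalg.lstsq(sw[:, None] * V, sw * y, rcond=None)
        coeffs.append(c)
    return np.stack(coeffs)  # shape (n_partitions, degree + 1)

def predict(x, centers, coeffs, width=0.5):
    """Blend the local polynomials with the partition weights."""
    phi = soft_partitions(x, centers, width)
    V = np.vander(x, coeffs.shape[1], increasing=True)
    local = V @ coeffs.T  # (n_points, n_partitions)
    return (phi * local).sum(axis=1)

# Hypothetical test data: a quadratic, fitted with four partitions of degree 2.
x = np.linspace(-1.0, 1.0, 200)
y = x**2
centers = np.linspace(-1.0, 1.0, 4)
phi = soft_partitions(x, centers)
coeffs = fit_local_polynomials(x, y, phi, degree=2)
max_err = np.max(np.abs(predict(x, centers, coeffs) - y))
```

Because the target here is itself a polynomial of the fitted degree, every local problem has a zero-residual solution and the blended prediction matches the data to rounding error. In this toy setting that mirrors the polynomial reproduction property mentioned in the abstract.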

Cite this Paper

BibTeX


@InProceedings{pmlr-v190-trask22a,
  title     = {Hierarchical partition of unity networks: fast multilevel training},
  author    = {Trask, Nathaniel and Henriksen, Amelia and Martinez, Carianne and Cyr, Eric},
  booktitle = {Proceedings of Mathematical and Scientific Machine Learning},
  pages     = {271--286},
  year      = {2022},
  editor    = {Dong, Bin and Li, Qianxiao and Wang, Lei and Xu, Zhi-Qin John},
  volume    = {190},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--17 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v190/trask22a/trask22a.pdf},
  url       = {https://proceedings.mlr.press/v190/trask22a.html},
  abstract = 	 {We present a probabilistic mixture of experts framework to perform nonparametric piecewise polynomial approximation without the need for an underlying mesh partitioning space. Deep neural networks traditionally used for classification provide a means of localizing polynomial approximation, and the probabilistic formulation admits a trivially parallelizable expectation maximization (EM) strategy. We then introduce a hierarchical architecture whose EM loss naturally decomposes into coarse and fine scale terms and small decoupled least squares problems. We exploit this hierarchical structure to formulate a V-cycle multigrid-inspired training algorithm. A suite of benchmarks demonstrate the ability of the scheme to: realize for smooth data algebraic convergence with respect to number of partitions, exponential convergence with respect to polynomial order; exactly reproduce piecewise polynomial functions; and demonstrate through an application to data-driven semiconductor modeling the ability to accurately treat data spanning several orders of magnitude.}
}

Endnote

%0 Conference Paper
%T Hierarchical partition of unity networks: fast multilevel training
%A Nathaniel Trask
%A Amelia Henriksen
%A Carianne Martinez
%A Eric Cyr
%B Proceedings of Mathematical and Scientific Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Bin Dong
%E Qianxiao Li
%E Lei Wang
%E Zhi-Qin John Xu
%F pmlr-v190-trask22a
%I PMLR
%P 271--286
%U https://proceedings.mlr.press/v190/trask22a.html
%V 190
%X We present a probabilistic mixture of experts framework to perform nonparametric piecewise polynomial approximation without the need for an underlying mesh partitioning space. Deep neural networks traditionally used for classification provide a means of localizing polynomial approximation, and the probabilistic formulation admits a trivially parallelizable expectation maximization (EM) strategy. We then introduce a hierarchical architecture whose EM loss naturally decomposes into coarse and fine scale terms and small decoupled least squares problems. We exploit this hierarchical structure to formulate a V-cycle multigrid-inspired training algorithm. A suite of benchmarks demonstrate the ability of the scheme to: realize for smooth data algebraic convergence with respect to number of partitions, exponential convergence with respect to polynomial order; exactly reproduce piecewise polynomial functions; and demonstrate through an application to data-driven semiconductor modeling the ability to accurately treat data spanning several orders of magnitude.

APA


Trask, N., Henriksen, A., Martinez, C., & Cyr, E. (2022). Hierarchical partition of unity networks: fast multilevel training. Proceedings of Mathematical and Scientific Machine Learning, in Proceedings of Machine Learning Research 190:271-286. Available from https://proceedings.mlr.press/v190/trask22a.html.
