Hierarchical Mixtures-of-Experts for Generalized Linear Models: Some Results on Denseness and Consistency

Wenxin Jiang; Martin A. Tanner

Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, PMLR R2, 1999.

Abstract

We investigate a class of hierarchical mixtures-of-experts (HME) models where exponential family regression models with generalized linear mean functions of the form $\psi(a+x^T b)$ are mixed. Here $\psi(\cdot)$ is the inverse link function. Suppose the true response $y$ follows an exponential family regression model with mean function belonging to a class of smooth functions of the form $\psi(h(x))$ where $h \in W_{2;K_0}^\infty$ (a Sobolev class over $[0,1]^{s}$). It is shown that the HME mean functions can approximate the true mean function, at a rate of $O(m^{-2/s})$ in $L_p$ norm. Moreover, the HME probability density functions can approximate the true density, at a rate of $O(m^{-2/s})$ in Hellinger distance, and at a rate of $O(m^{-4/s})$ in Kullback-Leibler divergence. These rates can be achieved within the family of HME structures with a tree of binary splits, or within the family of structures with a single layer of experts. Here $s$ is the dimension of the predictor $x$. It is also shown that likelihood-based inference based on HME is consistent in recovering the truth, in the sense that as the sample size $n$ and the number of experts $m$ both increase, the mean square error of the estimated mean response goes to zero. Conditions for such results to hold are stated and discussed.
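
As an illustration of the single-layer structure mentioned in the abstract, the following is a minimal sketch of a mixture-of-experts mean function in which $m$ experts of the form $\psi(a_j + x^T b_j)$ are combined by gating weights. The softmax (multinomial logit) gating, the logistic choice of $\psi$, and all names (`moe_mean`, `gate_v`, `gate_u`) are assumptions made for this example and are not taken from the abstract.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def logistic(t):
    # Assumed inverse link psi; any other inverse link could be passed instead.
    return 1.0 / (1.0 + np.exp(-t))

def moe_mean(x, gate_v, gate_u, a, b, psi=logistic):
    """Mean of a single-layer mixture of m generalized linear experts.

    x: (n, s) predictors; gate_v: (m,), gate_u: (m, s) gating parameters
    (assumed softmax gating); a: (m,), b: (m, s) expert parameters.
    """
    gates = softmax(gate_v + x @ gate_u.T)   # (n, m), each row sums to 1
    experts = psi(a + x @ b.T)               # (n, m), psi(a_j + x^T b_j)
    return (gates * experts).sum(axis=1)     # (n,) gate-weighted mean

# Small example: s = 2 predictors in [0, 1]^s, m = 4 experts, n = 5 points.
rng = np.random.default_rng(0)
s, m, n = 2, 4, 5
x = rng.uniform(0.0, 1.0, size=(n, s))
mu = moe_mean(x, rng.normal(size=m), rng.normal(size=(m, s)),
              rng.normal(size=m), rng.normal(size=(m, s)))
```

With a single expert ($m = 1$) the gate is identically one and the sketch reduces to an ordinary generalized linear mean $\psi(a + x^T b)$.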

Cite this Paper

BibTeX


@InProceedings{pmlr-vR2-jiang99a,
  title = 	 {Hierarchical Mixtures-of-Experts for Generalized Linear Models: Some Results on Denseness and Consistency},
  author =       {Jiang, Wenxin and Tanner, Martin A.},
  booktitle = 	 {Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics},
  year = 	 {1999},
  editor = 	 {Heckerman, David and Whittaker, Joe},
  volume = 	 {R2},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {03--06 Jan},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/r2/jiang99a/jiang99a.pdf},
  url = 	 {https://proceedings.mlr.press/r2/jiang99a.html},
  abstract = 	 {We investigate a class of hierarchical mixtures-of-experts (HME) models where exponential family regression models with generalized linear mean functions of the form $\psi(a+x^T b)$ are mixed. Here $\psi(\cdot)$ is the inverse link function. Suppose the true response $y$ follows an exponential family regression model with mean function belonging to a class of smooth functions of the form $\psi(h(x))$ where $h \in W_{2;K_0}^\infty$ (a Sobolev class over $[0,1]^{s}$). It is shown that the HME mean functions can approximate the true mean function, at a rate of $O(m^{-2/s})$ in $L_p$ norm. Moreover, the HME probability density functions can approximate the true density, at a rate of $O(m^{-2/s})$ in Hellinger distance, and at a rate of $O(m^{-4/s})$ in Kullback-Leibler divergence. These rates can be achieved within the family of HME structures with a tree of binary splits, or within the family of structures with a single layer of experts. Here $s$ is the dimension of the predictor $x$. It is also shown that likelihood-based inference based on HME is consistent in recovering the truth, in the sense that as the sample size $n$ and the number of experts $m$ both increase, the mean square error of the estimated mean response goes to zero. Conditions for such results to hold are stated and discussed.},
  note =         {Reissued by PMLR on 20 August 2020.}
}

Endnote

%0 Conference Paper
%T Hierarchical Mixtures-of-Experts for Generalized Linear Models: Some Results on Denseness and Consistency
%A Wenxin Jiang
%A Martin A. Tanner
%B Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 1999
%E David Heckerman
%E Joe Whittaker
%F pmlr-vR2-jiang99a
%I PMLR
%U https://proceedings.mlr.press/r2/jiang99a.html
%V R2
%X We investigate a class of hierarchical mixtures-of-experts (HME) models where exponential family regression models with generalized linear mean functions of the form $\psi(a+x^T b)$ are mixed. Here $\psi(\cdot)$ is the inverse link function. Suppose the true response $y$ follows an exponential family regression model with mean function belonging to a class of smooth functions of the form $\psi(h(x))$ where $h \in W_{2;K_0}^\infty$ (a Sobolev class over $[0,1]^{s}$). It is shown that the HME mean functions can approximate the true mean function, at a rate of $O(m^{-2/s})$ in $L_p$ norm. Moreover, the HME probability density functions can approximate the true density, at a rate of $O(m^{-2/s})$ in Hellinger distance, and at a rate of $O(m^{-4/s})$ in Kullback-Leibler divergence. These rates can be achieved within the family of HME structures with a tree of binary splits, or within the family of structures with a single layer of experts. Here $s$ is the dimension of the predictor $x$. It is also shown that likelihood-based inference based on HME is consistent in recovering the truth, in the sense that as the sample size $n$ and the number of experts $m$ both increase, the mean square error of the estimated mean response goes to zero. Conditions for such results to hold are stated and discussed.
%Z Reissued by PMLR on 20 August 2020.

APA


Jiang, W. & Tanner, M.A. (1999). Hierarchical Mixtures-of-Experts for Generalized Linear Models: Some Results on Denseness and Consistency. Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R2. Available from https://proceedings.mlr.press/r2/jiang99a.html. Reissued by PMLR on 20 August 2020.
