A New PHO-rmula for Improved Performance of Semi-Structured Networks

David Rügamer
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:29291-29305, 2023.

Abstract

Recent advances to combine structured regression models and deep neural networks for better interpretability, more expressiveness, and statistically valid uncertainty quantification demonstrate the versatility of semi-structured neural networks (SSNs). We show that techniques to properly identify the contributions of the different model components in SSNs, however, lead to suboptimal network estimation, slower convergence, and degenerated or erroneous predictions. In order to solve these problems while preserving favorable model properties, we propose a non-invasive post-hoc orthogonalization (PHO) that guarantees identifiability of model components and provides better estimation and prediction quality. Our theoretical findings are supported by numerical experiments, a benchmark comparison as well as a real-world application to COVID-19 infections.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-rugamer23a,
  title     = {A New {PHO}-rmula for Improved Performance of Semi-Structured Networks},
  author    = {R\"{u}gamer, David},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {29291--29305},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/rugamer23a/rugamer23a.pdf},
  url       = {https://proceedings.mlr.press/v202/rugamer23a.html},
  abstract  = {Recent advances to combine structured regression models and deep neural networks for better interpretability, more expressiveness, and statistically valid uncertainty quantification demonstrate the versatility of semi-structured neural networks (SSNs). We show that techniques to properly identify the contributions of the different model components in SSNs, however, lead to suboptimal network estimation, slower convergence, and degenerated or erroneous predictions. In order to solve these problems while preserving favorable model properties, we propose a non-invasive post-hoc orthogonalization (PHO) that guarantees identifiability of model components and provides better estimation and prediction quality. Our theoretical findings are supported by numerical experiments, a benchmark comparison as well as a real-world application to COVID-19 infections.}
}
Endnote
%0 Conference Paper
%T A New PHO-rmula for Improved Performance of Semi-Structured Networks
%A David Rügamer
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-rugamer23a
%I PMLR
%P 29291--29305
%U https://proceedings.mlr.press/v202/rugamer23a.html
%V 202
%X Recent advances to combine structured regression models and deep neural networks for better interpretability, more expressiveness, and statistically valid uncertainty quantification demonstrate the versatility of semi-structured neural networks (SSNs). We show that techniques to properly identify the contributions of the different model components in SSNs, however, lead to suboptimal network estimation, slower convergence, and degenerated or erroneous predictions. In order to solve these problems while preserving favorable model properties, we propose a non-invasive post-hoc orthogonalization (PHO) that guarantees identifiability of model components and provides better estimation and prediction quality. Our theoretical findings are supported by numerical experiments, a benchmark comparison as well as a real-world application to COVID-19 infections.
APA
Rügamer, D. (2023). A New PHO-rmula for Improved Performance of Semi-Structured Networks. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:29291-29305. Available from https://proceedings.mlr.press/v202/rugamer23a.html.