BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates

Xiaochen Wang, Arash Pakbin, Bobak Mortazavi, Hongyu Zhao, Donald Lee
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:9973-9982, 2020.

Abstract

The proliferation of medical monitoring devices makes it possible to track health vitals at high frequency, enabling the development of dynamic health risk scores that change with the underlying readings. Survival analysis, in particular hazard estimation, is well-suited to analyzing this stream of data to predict disease onset as a function of the time-varying vitals. This paper introduces the software package BoXHED (pronounced ‘box-head’) for nonparametrically estimating hazard functions via gradient boosting. BoXHED 1.0 is a novel tree-based implementation of the generic estimator proposed in Lee et al. (2017), which was designed for handling time-dependent covariates in a fully nonparametric manner. BoXHED is also the first publicly available software implementation for Lee et al. (2017). Applying it to a cardiovascular disease dataset from the Framingham Heart Study reveals novel interaction effects among known risk factors, potentially resolving an open question in clinical literature. BoXHED is available from GitHub: www.github.com/BoXHED.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-wang20o,
  title     = {{B}o{XHED}: Boosted e{X}act Hazard Estimator with Dynamic covariates},
  author    = {Wang, Xiaochen and Pakbin, Arash and Mortazavi, Bobak and Zhao, Hongyu and Lee, Donald},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {9973--9982},
  year      = {2020},
  editor    = {Daumé III, Hal and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/wang20o/wang20o.pdf},
  url       = {https://proceedings.mlr.press/v119/wang20o.html},
  abstract  = {The proliferation of medical monitoring devices makes it possible to track health vitals at high frequency, enabling the development of dynamic health risk scores that change with the underlying readings. Survival analysis, in particular hazard estimation, is well-suited to analyzing this stream of data to predict disease onset as a function of the time-varying vitals. This paper introduces the software package BoXHED (pronounced ‘box-head’) for nonparametrically estimating hazard functions via gradient boosting. BoXHED 1.0 is a novel tree-based implementation of the generic estimator proposed in Lee et al. (2017), which was designed for handling time-dependent covariates in a fully nonparametric manner. BoXHED is also the first publicly available software implementation for Lee et al. (2017). Applying it to a cardiovascular disease dataset from the Framingham Heart Study reveals novel interaction effects among known risk factors, potentially resolving an open question in clinical literature. BoXHED is available from GitHub: www.github.com/BoXHED.}
}
Endnote
%0 Conference Paper
%T BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates
%A Xiaochen Wang
%A Arash Pakbin
%A Bobak Mortazavi
%A Hongyu Zhao
%A Donald Lee
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-wang20o
%I PMLR
%P 9973--9982
%U https://proceedings.mlr.press/v119/wang20o.html
%V 119
%X The proliferation of medical monitoring devices makes it possible to track health vitals at high frequency, enabling the development of dynamic health risk scores that change with the underlying readings. Survival analysis, in particular hazard estimation, is well-suited to analyzing this stream of data to predict disease onset as a function of the time-varying vitals. This paper introduces the software package BoXHED (pronounced ‘box-head’) for nonparametrically estimating hazard functions via gradient boosting. BoXHED 1.0 is a novel tree-based implementation of the generic estimator proposed in Lee et al. (2017), which was designed for handling time-dependent covariates in a fully nonparametric manner. BoXHED is also the first publicly available software implementation for Lee et al. (2017). Applying it to a cardiovascular disease dataset from the Framingham Heart Study reveals novel interaction effects among known risk factors, potentially resolving an open question in clinical literature. BoXHED is available from GitHub: www.github.com/BoXHED.
APA
Wang, X., Pakbin, A., Mortazavi, B., Zhao, H., & Lee, D. (2020). BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:9973-9982. Available from https://proceedings.mlr.press/v119/wang20o.html.
