MULTIPAR: Supervised Irregular Tensor Factorization with Multi-task Learning for Computational Phenotyping
Proceedings of the 3rd Machine Learning for Health Symposium, PMLR 225:498-511, 2023.
Tensor factorization has received increasing interest due to its intrinsic ability to capture latent factors in multi-dimensional data, with many applications including Electronic Health Record (EHR) mining. PARAFAC2 and its variants have been proposed to address irregular tensors where one of the tensor modes is not aligned, e.g., different patients in EHRs may have records of different lengths. PARAFAC2 has been successfully applied to EHRs for extracting meaningful medical concepts (phenotypes). Despite recent advancements, the predictability and interpretability of current models are not satisfactory, which limits their utility for downstream analysis. In this paper, we propose MULTIPAR: a supervised irregular tensor factorization with multi-task learning for computational phenotyping. MULTIPAR is flexible enough to incorporate both static (e.g., in-hospital mortality prediction) and continuous or dynamic (e.g., the need for ventilation) tasks. By supervising the tensor factorization with downstream prediction tasks and leveraging information from multiple related predictive tasks, MULTIPAR can yield not only more meaningful phenotypes but also better predictive performance for downstream tasks. We conduct extensive experiments on two real-world temporal EHR datasets to demonstrate that MULTIPAR is scalable and achieves better tensor fit with more meaningful subgroups and stronger predictive performance compared to existing state-of-the-art methods. The implementation of MULTIPAR is available at https://github.com/yifeiren13/MULTIPAR.
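
To make the idea of a supervised irregular factorization concrete, the following is a minimal sketch and not the released MULTIPAR implementation. It assumes the standard PARAFAC2 form, in which each patient slice X_k (visits by features) is approximated by Q_k H diag(s_k) V^T with Q_k having orthonormal columns, and it adds a single static task as a logistic loss on the patient factors s_k. The function names, the `weight` parameter, the choice of s_k as the predictive feature, and the synthetic data are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

def parafac2_reconstruct(Q_k, H, s_k, V):
    """Rebuild one patient slice under the PARAFAC2 form X_k ~ Q_k H diag(s_k) V^T."""
    return Q_k @ H @ np.diag(s_k) @ V.T

def joint_loss(slices, Qs, H, S, V, labels, w, b, weight=1.0):
    """Reconstruction error plus a weighted logistic loss for one static task.

    Assumption: the patient factors S (one row per patient) act as features
    for the prediction task; `weight` trades off fit against prediction.
    """
    recon = sum(
        np.linalg.norm(X - parafac2_reconstruct(Q, H, s, V)) ** 2
        for X, Q, s in zip(slices, Qs, S)
    )
    logits = S @ w + b
    # Numerically stable binary cross-entropy on logits: log(1 + e^z) - y * z
    bce = np.sum(np.logaddexp(0.0, logits) - labels * logits)
    return recon + weight * bce

# Synthetic example: 4 patients with different numbers of visits (irregular mode).
rng = np.random.default_rng(0)
R, J = 3, 5                      # number of phenotypes, number of features
lengths = [6, 4, 7, 5]           # visits per patient
K = len(lengths)

H = rng.normal(size=(R, R))
V = rng.normal(size=(J, R))                      # feature-by-phenotype factor
S = rng.uniform(0.5, 1.5, size=(K, R))           # patient-by-phenotype weights
# Reduced QR yields a matrix with orthonormal columns for each patient.
Qs = [np.linalg.qr(rng.normal(size=(I, R)))[0] for I in lengths]
slices = [parafac2_reconstruct(Q, H, s, V) for Q, s in zip(Qs, S)]

labels = np.array([0.0, 1.0, 1.0, 0.0])          # hypothetical static outcome
w, b = np.zeros(R), 0.0                          # untrained classifier
loss = joint_loss(slices, Qs, H, S, V, labels, w, b)
```

Because the slices here are generated exactly from the factors, the reconstruction term vanishes and the loss reduces to the prediction term. In an actual fitting procedure both terms would be minimized jointly over the factors and the classifier parameters, and a multi-task variant would add one such prediction term per task.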