- title: 'Machine Learning for Health (ML4H) 2022' volume: 193 URL: https://proceedings.mlr.press/v193/parziale22a.html PDF: https://proceedings.mlr.press/v193/parziale22a/parziale22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-parziale22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shengpu family: Tang - given: Kristen family: Severson - given: Luis family: Oala - given: Adarsh family: Subbaswamy - given: Sayantan family: Kumar - given: Elora family: Schoerverth - given: Stefan family: Hegselmann - given: Helen family: Zhou - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Elena family: Sizikova - given: Girmaw Abebe family: Tadesse - given: Yuyin family: Zhou - given: Taylor family: Killian - given: Haoran family: Zhang - given: Fahad family: Kamran - given: Andrea family: Hobby - given: Mars family: Huang - given: Ahmed family: Alaa - given: Harvineet family: Singh - given: Irene Y. family: Chen - given: Shalmali family: Joshi editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 1-11 id: parziale22a issued: date-parts: - 2022 - 11 - 22 firstpage: 1 lastpage: 11 published: 2022-11-22 00:00:00 +0000 - title: 'Imputation Strategies Under Clinical Presence: Impact on Algorithmic Fairness' abstract: 'Biases have marked medical history, leading to unequal care affecting marginalised groups. The patterns of missingness in observational data often reflect these group discrepancies, but the algorithmic fairness implications of group-specific missingness are not well understood. 
Despite its potential impact, imputation is too often an overlooked preprocessing step. When explicitly considered, attention is placed on overall performance, ignoring how this preprocessing can reinforce group-specific inequities. Our work questions this choice by studying how imputation affects downstream algorithmic fairness. First, we provide a structured view of the relationship between clinical presence mechanisms and group-specific missingness patterns. Then, through simulations and real-world experiments, we demonstrate that the imputation choice influences marginalised group performance and that no imputation strategy consistently reduces disparities. Importantly, our results show that current practices may endanger health equity as similarly performing imputation strategies at the population level can affect marginalised groups differently. Finally, we propose recommendations for mitigating inequities that may stem from a neglected step of the machine learning pipeline.' volume: 193 URL: https://proceedings.mlr.press/v193/jeanselme22a.html PDF: https://proceedings.mlr.press/v193/jeanselme22a/jeanselme22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-jeanselme22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Vincent family: Jeanselme - given: Maria family: De-Arteaga - given: Zhe family: Zhang - given: Jessica family: Barrett - given: Brian family: Tom editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. 
family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 12-34 id: jeanselme22a issued: date-parts: - 2022 - 11 - 22 firstpage: 12 lastpage: 34 published: 2022-11-22 00:00:00 +0000 - title: 'Predicting Treatment Adherence of Tuberculosis Patients at Scale' abstract: 'Tuberculosis (TB), an infectious bacterial disease, is a significant cause of death, especially in low-income countries, with an estimated ten million new cases reported globally in 2020. While TB is treatable, non-adherence to the medication regimen is a significant cause of morbidity and mortality. Thus, proactively identifying patients at risk of dropping off their medication regimen enables corrective measures to mitigate adverse outcomes. Using a proxy measure of extreme non-adherence and a dataset of nearly $700,000$ patients from four states in India, we formulate and solve the machine learning (ML) problem of early prediction of non-adherence based on a custom rank-based metric. We train ML models and evaluate against baselines, achieving a $\sim 100%$ lift over rule-based baselines and $\sim 214%$ over a random classifier, taking into account country-wide large-scale future deployment. We deal with various issues in the process, including data quality, high-cardinality categorical data, low target prevalence, distribution shift, variation across cohorts, algorithmic fairness, and the need for robustness and explainability. Our findings indicate that risk stratification of non-adherent patients is a viable, deployable-at-scale ML solution. As the official AI partner of India’s Central TB Division, we are working on multiple city and state-level pilots with the goal of pan-India deployment.' 
volume: 193 URL: https://proceedings.mlr.press/v193/kulkarni22a.html PDF: https://proceedings.mlr.press/v193/kulkarni22a/kulkarni22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-kulkarni22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Mihir family: Kulkarni - given: Satvik family: Golechha - given: Rishi family: Raj - given: Jithin K. family: Sreedharan - given: Ankit family: Bhardwaj - given: Santanu family: Rathod - given: Bhavin family: Vadera - given: Jayakrishna family: Kurada - given: Sanjay family: Mattoo - given: Rajendra family: Joshi - given: Kirankumar family: Rade - given: Alpan family: Raval editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 35-61 id: kulkarni22a issued: date-parts: - 2022 - 11 - 22 firstpage: 35 lastpage: 61 published: 2022-11-22 00:00:00 +0000 - title: 'Distributionally Robust Survival Analysis: A Novel Fairness Loss Without Demographics' abstract: 'We propose a general approach for training survival analysis models that minimizes a worst-case error across all subpopulations that are large enough (occurring with at least a user-specified minimum probability). This approach uses a training loss function that does not know any demographic information to treat as sensitive. Despite this, we demonstrate that our proposed approach often scores better on recently established fairness metrics (without a significant drop in prediction accuracy) compared to various baselines, including ones which directly use sensitive demographic information in their training loss. 
Our code is available at: https://github.com/discovershu/DRO_COX' volume: 193 URL: https://proceedings.mlr.press/v193/hu22a.html PDF: https://proceedings.mlr.press/v193/hu22a/hu22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-hu22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Shu family: Hu - given: George H. family: Chen editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 62-87 id: hu22a issued: date-parts: - 2022 - 11 - 22 firstpage: 62 lastpage: 87 published: 2022-11-22 00:00:00 +0000 - title: 'mmVAE: multimorbidity clustering using Relaxed Bernoulli $β$-Variational Autoencoders' abstract: 'The prevalence of chronic disease multimorbidity is a significant and increasing challenge for health systems. In many cases, the occurrence of one chronic disease leads to the development of one or more other chronic conditions. This exerts a significant challenge in improving patient outcomes and is a growing challenge globally as average population age increases. Using electronic health record information to identify patterns of co-occurring conditions is seen as an unbiased means of understanding multimorbidity but most studies have adopted off-the-shelf algorithmic techniques that are not tailored for the application. We present a novel bespoke approach for multimorbidity clustering based on a highly customised version of a $\beta$-variational autoencoder. We incorporate the use of minimum entropy clustering to identify sparse, low-dimensional factored representations that link at a feature-level to the observed patient-level multimorbidity profiles. 
We demonstrate how the approach can be used to explore complex structure in population-scale health data sets by examining data from a UK population of nearly 300,000 women in pregnancy suffering from multimorbidity.' volume: 193 URL: https://proceedings.mlr.press/v193/gadd22a.html PDF: https://proceedings.mlr.press/v193/gadd22a/gadd22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-gadd22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Charles family: Gadd - given: Krishnarajah family: Nirantharakumar - given: Christopher family: Yau editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 88-102 id: gadd22a issued: date-parts: - 2022 - 11 - 22 firstpage: 88 lastpage: 102 published: 2022-11-22 00:00:00 +0000 - title: 'Feature Allocation Approach for Multimorbidity Trajectory Modelling' abstract: 'A multimorbidity trajectory charts the time-dependent acquisition of disease conditions in an individual. This is important for understanding and managing patients who have a complex array of multiple chronic conditions, particularly later in life. We construct a novel probabilistic generative model for multimorbidity acquisition within a Bayesian framework of latent feature allocation, which allows an individual’s morbidity profile to be driven by multiple latent factors and allows the modelling of age-dependent multimorbidity trajectories. We demonstrate the utility of our model in applications to both simulated data and disease event data from patient electronic health records. In each setting, we show our model can reconstruct clinically meaningful latent multimorbidity patterns and their age-dependent prevalence trajectories.'
volume: 193 URL: https://proceedings.mlr.press/v193/kim22a.html PDF: https://proceedings.mlr.press/v193/kim22a/kim22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-kim22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Woojung family: Kim - given: Paul A. family: Jenkins - given: Christopher family: Yau editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 103-119 id: kim22a issued: date-parts: - 2022 - 11 - 22 firstpage: 103 lastpage: 119 published: 2022-11-22 00:00:00 +0000 - title: 'Towards Cross-Modal Causal Structure and Representation Learning' abstract: 'Does the SARS-CoV-2 virus cause patients’ chest X-Rays ground-glass opacities? Does an IDH-mutation cause differences in patients’ MRI images? Conventional causal discovery algorithms, although well developed to uncover the cause-effect relationships on structured data, cannot elucidate causal relations between unstructured images and structured scalar variables due to the complexity of the former. In this paper, we consider causal discovery between images and structured (scalar) variables. Specifically, we derive low dimensional image representations to analyze with structured variables. We propose a two-module amortized variational algorithm named Cross-Modal Variational Causal representation and structure Learning (CMCL). CMCL jointly learns identifiable representations given a set of independent structured variables and causal relations via formulating latent representations and structured variables into a directed acyclic graph. Moreover, we further enforce counterfactual invariance/variance onto representations.
We demonstrate that CMCL outperforms other related methods on synthetic datasets and validate causal relations on semi-synthetic datasets by visualization.' volume: 193 URL: https://proceedings.mlr.press/v193/mao22a.html PDF: https://proceedings.mlr.press/v193/mao22a/mao22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-mao22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Haiyi family: Mao - given: Hongfu family: Liu - given: Jason Xiaotian family: Dou - given: Panayiotis V. family: Benos editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 120-140 id: mao22a issued: date-parts: - 2022 - 11 - 22 firstpage: 120 lastpage: 140 published: 2022-11-22 00:00:00 +0000 - title: 'Identifying Heterogeneous Treatment Effects in Multiple Outcomes using Joint Confidence Intervals' abstract: 'Heterogeneous treatment effects (HTEs) are commonly identified during randomized controlled trials (RCTs). Identifying subgroups of patients with similar treatment effects is of high interest in clinical research to advance precision medicine. Often, multiple clinical outcomes are measured during an RCT, each having a potentially heterogeneous effect. Recently there has been high interest in identifying subgroups from HTEs; however, there has been less focus on developing tools in settings where there are multiple outcomes. In this work, we propose a framework for partitioning the covariate space to identify subgroups across multiple outcomes based on the joint CIs. We test our algorithm on synthetic and semi-synthetic data where there are two outcomes, and demonstrate that our algorithm is able to capture the HTE in both outcomes simultaneously.'
volume: 193 URL: https://proceedings.mlr.press/v193/argaw22a.html PDF: https://proceedings.mlr.press/v193/argaw22a/argaw22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-argaw22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Peniel N. family: Argaw - given: Elizabeth family: Healey - given: Isaac S. family: Kohane editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 141-170 id: argaw22a issued: date-parts: - 2022 - 11 - 22 firstpage: 141 lastpage: 170 published: 2022-11-22 00:00:00 +0000 - title: 'Meta-analysis of individualized treatment rules via sign-coherency' abstract: 'Medical treatments tailored to a patient’s baseline characteristics hold the potential of improving patient outcomes while reducing negative side effects. Learning individualized treatment rules (ITRs) often requires aggregation of multiple datasets (sites); however, current ITR methodology does not take between-site heterogeneity into account, which can hurt model generalizability when deploying back to each site. To address this problem, we develop a method for individual-level meta-analysis of ITRs, which jointly learns site-specific ITRs while borrowing information about feature sign-coherency via a scientifically-motivated directionality principle. We also develop an adaptive procedure for model tuning, using information criteria tailored to the ITR learning problem. We study the proposed methods through numerical experiments to understand their performance under different levels of between-site heterogeneity and apply the methodology to estimate ITRs in a large multi-center database of electronic health records.
This work extends several popular methodologies for estimating ITRs (A-learning, weighted learning) to the multiple-sites setting.' volume: 193 URL: https://proceedings.mlr.press/v193/cheng22a.html PDF: https://proceedings.mlr.press/v193/cheng22a/cheng22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-cheng22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Jay Jojo family: Cheng - given: Jared D. family: Huling - given: Guanhua family: Chen editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 171-198 id: cheng22a issued: date-parts: - 2022 - 11 - 22 firstpage: 171 lastpage: 198 published: 2022-11-22 00:00:00 +0000 - title: 'SleepQA: A Health Coaching Dataset on Sleep for Extractive Question Answering' abstract: 'Question Answering (QA) systems can support health coaches in facilitating clients’ lifestyle behavior changes (e.g., in adopting healthy sleep habits). In this paper, we design a domain-specific QA pipeline for sleep coaching. To this end, we release SleepQA, a dataset created from 7,005 passages comprising 4,250 training examples with single annotations and 750 examples with 5-way annotations. We fine-tune different domain-specific BERT models on our dataset and perform extensive automatic and human evaluation of the resulting end-to-end QA pipeline. Comparisons of our pipeline with baseline show improvements in domain-specific natural language processing on real-world questions. We hope that this dataset will lead to wider research interest in this important health domain.'
volume: 193 URL: https://proceedings.mlr.press/v193/bojic22a.html PDF: https://proceedings.mlr.press/v193/bojic22a/bojic22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-bojic22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Iva family: Bojic - given: Qi Chwen family: Ong - given: Megh family: Thakkar - given: Esha family: Kamran - given: Irving Yu Le family: Shua - given: Jaime Rei Ern family: Pang - given: Jessica family: Chen - given: Vaaruni family: Nayak - given: Shafiq family: Joty - given: Josip family: Car editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 199-217 id: bojic22a issued: date-parts: - 2022 - 11 - 22 firstpage: 199 lastpage: 217 published: 2022-11-22 00:00:00 +0000 - title: 'Extend and Explain: Interpreting Very Long Language Models' abstract: 'While Transformer language models (LMs) are state-of-the-art for information extraction, long text introduces computational challenges requiring suboptimal preprocessing steps or alternative model architectures. Sparse attention LMs can represent longer sequences, overcoming performance hurdles. However, it remains unclear how to explain predictions from these models, as not all tokens attend to each other in the self-attention layers, and long sequences pose computational challenges for explainability algorithms when runtime depends on document length. These challenges are severe in the medical context where documents can be very long, and machine learning (ML) models must be auditable and trustworthy. 
We introduce a novel Masked Sampling Procedure (MSP) to identify the text blocks that contribute to a prediction, apply MSP in the context of predicting diagnoses from medical text, and validate our approach with a blind review by two clinicians. Our method identifies $\approx 1.7\times$ more clinically informative text blocks than the previous state-of-the-art, runs up to $100\times$ faster, and is tractable for generating important phrase pairs. MSP is particularly well-suited to long LMs but can be applied to any text classifier. We provide a general implementation here. https://github.com/Optum/long-medical-document-lms' volume: 193 URL: https://proceedings.mlr.press/v193/stremmel22a.html PDF: https://proceedings.mlr.press/v193/stremmel22a/stremmel22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-stremmel22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Joel family: Stremmel - given: Brian L. family: Hill - given: Jeffrey family: Hertzberg - given: Jaime family: Murillo - given: Llewelyn family: Allotey - given: Eran family: Halperin editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 218-258 id: stremmel22a issued: date-parts: - 2022 - 11 - 22 firstpage: 218 lastpage: 258 published: 2022-11-22 00:00:00 +0000 - title: 'Counterfactual and Factual Reasoning over Hypergraphs for Interpretable Clinical Predictions on EHR' abstract: 'Electronic Health Record modeling is crucial for digital medicine. However, existing models ignore higher-order interactions among medical codes and their causal relations towards downstream clinical predictions. 
To address such limitations, we propose a novel framework CACHE, to provide effective and insightful clinical predictions based on hypergraph representation learning and counterfactual and factual reasoning techniques. Experiments on two real EHR datasets show the superior performance of CACHE. Case studies with a domain expert illustrate a preferred capability of CACHE in generating clinically meaningful interpretations towards the correct predictions.' volume: 193 URL: https://proceedings.mlr.press/v193/xu22a.html PDF: https://proceedings.mlr.press/v193/xu22a/xu22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-xu22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Ran family: Xu - given: Yue family: Yu - given: Chao family: Zhang - given: Mohammed K family: Ali - given: Joyce C family: Ho - given: Carl family: Yang editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 259-278 id: xu22a issued: date-parts: - 2022 - 11 - 22 firstpage: 259 lastpage: 278 published: 2022-11-22 00:00:00 +0000 - title: 'Neurodevelopmental Phenotype Prediction: A State-of-the-Art Deep Learning Model' abstract: 'A major challenge in medical image analysis is the automated detection of biomarkers from neuroimaging data. Traditional approaches, often based on image registration, are limited in capturing the high variability of cortical organisation across individuals. Deep learning methods have been shown to be successful in overcoming this difficulty, and some of them have even outperformed medical professionals on certain datasets. 
In this paper, we apply a deep neural network to analyse the cortical surface data of neonates, derived from the publicly available Developing Human Connectome Project (dHCP). Our goal is to identify neurodevelopmental biomarkers and to predict gestational age at birth based on these biomarkers. Using scans of preterm neonates acquired around the term-equivalent age, we were able to investigate the impact of preterm birth on cortical growth and maturation during late gestation. Besides reaching state-of-the-art prediction accuracy, the proposed model has much fewer parameters than the baselines, and its error stays low on both unregistered and registered cortical surfaces.' volume: 193 URL: https://proceedings.mlr.press/v193/unyi22a.html PDF: https://proceedings.mlr.press/v193/unyi22a/unyi22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-unyi22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Dániel family: Unyi - given: Bálint family: Gyires-Tóth editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 279-289 id: unyi22a issued: date-parts: - 2022 - 11 - 22 firstpage: 279 lastpage: 289 published: 2022-11-22 00:00:00 +0000 - title: 'Analysing the effectiveness of a generative model for semi-supervised medical image segmentation' abstract: 'Image segmentation is important in medical imaging, providing valuable, quantitative information for clinical decision-making in diagnosis, therapy, and intervention. The state-of-the-art in automated segmentation remains supervised learning, employing discriminative models such as U-Net. 
However, training these models requires access to large amounts of manually labelled data which is often difficult to obtain in real medical applications. In such settings, semi-supervised learning (SSL) attempts to leverage the abundance of unlabelled data to obtain more robust and reliable models. Recently, generative models have been proposed for semantic segmentation, as they make an attractive choice for SSL. Their ability to capture the joint distribution over input images and output label maps provides a natural way to incorporate information from unlabelled images. This paper analyses whether deep generative models such as the SemanticGAN are truly viable alternatives to tackle challenging medical image segmentation problems. To that end, we thoroughly evaluate the segmentation performance, robustness, and potential subgroup disparities of discriminative and generative segmentation methods when applied to large-scale, publicly available chest X-ray datasets.' volume: 193 URL: https://proceedings.mlr.press/v193/rosnati22a.html PDF: https://proceedings.mlr.press/v193/rosnati22a/rosnati22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-rosnati22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Margherita family: Rosnati - given: Fabio De Sousa family: Ribeiro - given: Miguel family: Monteiro - given: Daniel Coelho prefix: de family: Castro - given: Ben family: Glocker editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. 
family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 290-310 id: rosnati22a issued: date-parts: - 2022 - 11 - 22 firstpage: 290 lastpage: 310 published: 2022-11-22 00:00:00 +0000 - title: 'An Extensive Data Processing Pipeline for MIMIC-IV' abstract: 'An increasing amount of research is being devoted to applying machine learning methods to electronic health record (EHR) data for various clinical purposes. This growing area of research has exposed the challenges of the accessibility of EHRs. MIMIC is a popular, public, and free EHR dataset in a raw format that has been used in numerous studies. The absence of standardized preprocessing steps can be, however, a significant barrier to the wider adoption of this rare resource. Additionally, this absence can reduce the reproducibility of the developed tools and limit the ability to compare the results among similar studies. In this work, we provide a greatly customizable pipeline to extract, clean, and preprocess the data available in the fourth version of the MIMIC dataset (MIMIC-IV). The pipeline also presents an end-to-end wizard-like package supporting predictive model creations and evaluations. The pipeline covers a range of clinical prediction tasks which can be broadly classified into four categories - readmission, length of stay, mortality, and phenotype prediction. The tool is publicly available at https://github.com/healthylaife/MIMIC-IV-Data-Pipeline.' 
volume: 193 URL: https://proceedings.mlr.press/v193/gupta22a.html PDF: https://proceedings.mlr.press/v193/gupta22a/gupta22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-gupta22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Mehak family: Gupta - given: Brennan family: Gallamoza - given: Nicolas family: Cutrona - given: Pranjal family: Dhakal - given: Raphael family: Poulain - given: Rahmatollah family: Beheshti editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 311-325 id: gupta22a issued: date-parts: - 2022 - 11 - 22 firstpage: 311 lastpage: 325 published: 2022-11-22 00:00:00 +0000 - title: 'Predicting attrition patterns from pediatric weight management programs' abstract: 'Obesity is a major public health concern. Multidisciplinary pediatric weight management programs are considered standard treatment for children with obesity who are not able to be successfully managed in the primary care setting. Despite their great potential, high dropout rates (referred to as attrition) are a major hurdle in delivering successful interventions. Predicting attrition patterns can help providers reduce the alarmingly high rates of attrition (up to 80%) by engaging in earlier and more personalized interventions. Previous work has mainly focused on finding static predictors of attrition on smaller datasets and has achieved limited success in effective prediction. In this study, we have collected a five-year comprehensive dataset of 4,550 children from diverse backgrounds receiving treatment at four pediatric weight management programs in the US. 
We then developed a machine learning pipeline to predict (a) the likelihood of attrition, and (b) the change in body-mass index (BMI) percentile of children, at different time points after joining the weight management program. Our pipeline is greatly customized for this problem using advanced machine learning techniques to process longitudinal data, smaller-size data, and interrelated prediction tasks. The proposed method showed strong prediction performance as measured by AUROC scores (average AUROC of 0.77 for predicting attrition, and 0.78 for predicting weight outcomes).' volume: 193 URL: https://proceedings.mlr.press/v193/fayyaz22a.html PDF: https://proceedings.mlr.press/v193/fayyaz22a/fayyaz22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-fayyaz22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Hamed family: Fayyaz - given: Thao-Ly T. family: Phan - given: H. Timothy family: Bunnell - given: Rahmatollah family: Beheshti editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 326-342 id: fayyaz22a issued: date-parts: - 2022 - 11 - 22 firstpage: 326 lastpage: 342 published: 2022-11-22 00:00:00 +0000 - title: 'Automated LOINC Standardization Using Pre-trained Large Language Models' abstract: 'Harmonization of local source concepts to standard clinical terminologies is a prerequisite for multi-center data aggregation and sharing. Challenges in automating the mapping process stem from the idiosyncratic source encoding schemes adopted by different health systems and the lack of large publicly available training data. 
In this study, we aim to develop a scalable and generalizable machine learning tool to facilitate standardizing laboratory observations to the Logical Observation Identifiers Names and Codes (LOINC). Specifically, we leverage the contextual embedding from pre-trained T5 models and propose a two-stage fine-tuning strategy based on contrastive learning to enable learning in a few-shot setting without manual feature engineering. Our method utilizes unlabeled general LOINC ontology and data augmentation to achieve high accuracy on retrieving the most relevant LOINC targets when only a limited amount of labeled data is available. We further show that our model generalizes well to unseen targets. Taken together, our approach shows great potential to reduce manual effort in LOINC standardization and can be easily extended to mapping other terminologies.' volume: 193 URL: https://proceedings.mlr.press/v193/tu22a.html PDF: https://proceedings.mlr.press/v193/tu22a/tu22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-tu22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Tao family: Tu - given: Eric family: Loreaux - given: Emma family: Chesley - given: Adam D. family: Lelkes - given: Paul family: Gamble - given: Mathias family: Bellaiche - given: Martin family: Seneviratne - given: Ming-Jun family: Chen editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. 
family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 343-355 id: tu22a issued: date-parts: - 2022 - 11 - 22 firstpage: 343 lastpage: 355 published: 2022-11-22 00:00:00 +0000 - title: 'An Empirical Study on Activity Recognition in Long Surgical Videos' abstract: 'Activity recognition in surgical videos is a key research area for developing next-generation devices and workflow monitoring systems. Since surgeries are long processes with highly-variable lengths, deep learning models used for surgical videos often consist of a two-stage setup using a backbone and temporal sequence model. In this paper, we investigate many state-of-the-art backbones and temporal models to find architectures that yield the strongest performance for surgical activity recognition. We first benchmark the performance of the models on a large-scale activity recognition dataset containing over 800 surgery videos captured in multiple clinical operating rooms. We further evaluate the models on the two smaller public datasets, the Cholec80 and Cataract-101 datasets, containing only 80 and 101 videos respectively. We empirically found that the Swin-Transformer+BiGRU temporal model yielded strong performance on both datasets. Finally, we investigate the adaptability of the model to new domains by fine-tuning models to a new hospital and experimenting with a recent unsupervised domain adaptation approach.' 
volume: 193 URL: https://proceedings.mlr.press/v193/he22a.html PDF: https://proceedings.mlr.press/v193/he22a/he22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-he22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Zhuohong family: He - given: Ali family: Mottaghi - given: Aidean family: Sharghi - given: Muhammad Abdullah family: Jamal - given: Omid family: Mohareri editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 356-372 id: he22a issued: date-parts: - 2022 - 11 - 22 firstpage: 356 lastpage: 372 published: 2022-11-22 00:00:00 +0000 - title: 'OSLAT: Open Set Label Attention Transformer for Medical Entity Retrieval and Span Extraction' abstract: 'Medical entity span extraction and linking are critical steps for many healthcare NLP tasks. Most existing entity extraction methods either have a fixed vocabulary of medical entities or require span annotations. In this paper, we propose a method for linking an open set of entities that does not require any span annotations. Our method, Open Set Label Attention Transformer (OSLAT), uses the label-attention mechanism to learn candidate-entity contextualized text representations. We find that OSLAT can not only link entities but is also able to implicitly learn spans associated with entities. We evaluate OSLAT on two tasks: (1) span extraction trained without explicit span annotations, and (2) entity linking trained without span-level annotation. We test the generalizability of our method by training two separate models on two datasets with low entity overlap and comparing cross-dataset performance.' 
volume: 193 URL: https://proceedings.mlr.press/v193/li22a.html PDF: https://proceedings.mlr.press/v193/li22a/li22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-li22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Raymond family: Li - given: Ilya family: Valmianski - given: Li family: Deng - given: Xavier family: Amatriain - given: Anitha family: Kannan editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 373-390 id: li22a issued: date-parts: - 2022 - 11 - 22 firstpage: 373 lastpage: 390 published: 2022-11-22 00:00:00 +0000 - title: 'Adapting Pre-trained Vision Transformers from 2D to 3D through Weight Inflation Improves Medical Image Segmentation' abstract: 'Given the prevalence of 3D medical imaging technologies such as MRI and CT that are widely used in diagnosing and treating diverse diseases, 3D segmentation is one of the fundamental tasks of medical image analysis. Recently, Transformer-based models have started to achieve state-of-the-art performances across many vision tasks, through pre-training on large-scale natural image benchmark datasets. While works on medical image analysis have also begun to explore Transformer-based models, there is currently no optimal strategy to effectively leverage pre-trained Transformers, primarily due to the difference in dimensionality between 2D natural images and 3D medical images. Existing solutions either split 3D images into 2D slices and predict each slice independently, thereby losing crucial depth-wise information, or modify the Transformer architecture to support 3D inputs without leveraging pre-trained weights. 
In this work, we use a simple yet effective weight inflation strategy to adapt pre-trained Transformers from 2D to 3D, retaining the benefit of both transfer learning and depth information. We further investigate the effectiveness of transfer from different pre-training sources and objectives. Our approach achieves state-of-the-art performances across a broad range of 3D medical image datasets, and can become a standard strategy easily utilized by all work on Transformer-based models for 3D medical images, to maximize performance.' volume: 193 URL: https://proceedings.mlr.press/v193/zhang22a.html PDF: https://proceedings.mlr.press/v193/zhang22a/zhang22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-zhang22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Yuhui family: Zhang - given: Shih-Cheng family: Huang - given: Zhengping family: Zhou - given: Matthew P. family: Lungren - given: Serena family: Yeung editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 391-404 id: zhang22a issued: date-parts: - 2022 - 11 - 22 firstpage: 391 lastpage: 404 published: 2022-11-22 00:00:00 +0000 - title: 'Hyper-AdaC: Adaptive clustering-based hypergraph representation of whole slide images for survival analysis' abstract: 'The emergence of deep learning in the medical field has popularized the development of models to predict survival outcomes from histopathology images in precision oncology. Graph-based formalism has opened interesting perspectives for generating informative representations, as they can be context-aware and model local and global topological structures in the tumor’s microenvironment. 
However, the critical issue in using graph representations lies in their generalizability. They can suffer from overfitting due to their large sizes or high discrepancies between nodes due to random sampling from WSI. In addition, standard graph formulations are limited to pairwise interactions, which can sometimes fail to represent the reality observed in histopathology and hinder the interpretability of those interactions. In this work, we present Hyper-AdaC, an adaptive clustering-based hypergraph representation to model high-order correlations among different regions of the WSIs while being compact enough to help graph neural networks generalize in the case of survival prediction. We evaluate our approach on $5$ different publicly available cancer datasets. Our method outperforms most state-of-the-art graph-based methods for survival prediction with WSIs, creating a more efficient and robust alternative to other graph representations. Moreover, due to our formulation, attention maps are depicted at different resolutions depending on the tissue characteristics of each WSI. The code is available at: https://github.com/HakimBenkirane/Hyper-adaC.' volume: 193 URL: https://proceedings.mlr.press/v193/benkirane22a.html PDF: https://proceedings.mlr.press/v193/benkirane22a/benkirane22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-benkirane22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Hakim family: Benkirane - given: Maria family: Vakalopoulou - given: Stergios family: Christodoulidis - given: Ingrid-Judith family: Garberis - given: Stefan family: Michiels - given: Paul-Henry family: Cournède editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. 
family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 405-418 id: benkirane22a issued: date-parts: - 2022 - 11 - 22 firstpage: 405 lastpage: 418 published: 2022-11-22 00:00:00 +0000 - title: 'Differentiable programming for functional connectomics' abstract: 'Mapping the functional connectome has the potential to uncover key insights into brain organisation. However, existing workflows for functional connectomics are limited in their adaptability to new data, and principled workflow design is a challenging combinatorial problem. We introduce an analytic paradigm that implements common operations used in functional connectomics as fully differentiable processing blocks. Under this paradigm, workflow configurations exist as reparameterisations of a differentiable functional that interpolates them. The differentiable program that we ultimately envision occupies a niche midway between traditional pipelines and end-to-end neural networks, combining the glass-box tractability and domain knowledge of the former with the amenability to optimisation of the latter. In this preliminary work, we provide a proof of concept for differentiable connectomics, demonstrating the capacity of our processing blocks across three separate problem domains critically important to brain mapping. We also provide a software library to facilitate adoption. Our differentiable framework is competitive with state-of-the-art methods in functional brain parcellation, time series denoising, and covariance modelling. Taken together, our results demonstrate the promise of differentiable programming for functional connectomics.' 
volume: 193 URL: https://proceedings.mlr.press/v193/ciric22a.html PDF: https://proceedings.mlr.press/v193/ciric22a/ciric22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-ciric22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Rastko family: Ciric - given: Armin W. family: Thomas - given: Oscar family: Esteban - given: Russell A. family: Poldrack editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 419-455 id: ciric22a issued: date-parts: - 2022 - 11 - 22 firstpage: 419 lastpage: 455 published: 2022-11-22 00:00:00 +0000 - title: 'Improving Radiology Report Generation Systems by Removing Hallucinated References to Non-existent Priors' abstract: 'Current deep learning models trained to generate radiology reports from chest radiographs are capable of producing clinically accurate, clear, and actionable text that can advance patient care. However, such systems all succumb to the same problem: making hallucinated references to non-existent prior reports. Such hallucinations occur because these models are trained on datasets of real-world patient reports that inherently refer to priors. To this end, we propose two methods to remove references to priors in radiology reports: (1) a GPT-3-based few-shot approach to rewrite medical reports without references to priors; and (2) a BioBERT-based token classification approach to directly remove words referring to priors. We use the aforementioned approaches to modify MIMIC-CXR, a publicly available dataset of chest X-rays and their associated free-text radiology reports; we then retrain CXR-RePaiR, a radiology report generation system, on the adapted MIMIC-CXR dataset. 
We find that our re-trained model—which we call CXR-ReDonE—outperforms previous report generation methods on clinical metrics, achieving an average BERTScore of 0.2351 ($2.57%$ absolute improvement). We expect our approach to be broadly valuable in enabling current radiology report generation systems to be more directly integrated into clinical pipelines. ' volume: 193 URL: https://proceedings.mlr.press/v193/ramesh22a.html PDF: https://proceedings.mlr.press/v193/ramesh22a/ramesh22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-ramesh22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Vignav family: Ramesh - given: Nathan A. family: Chi - given: Pranav family: Rajpurkar editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 456-473 id: ramesh22a issued: date-parts: - 2022 - 11 - 22 firstpage: 456 lastpage: 473 published: 2022-11-22 00:00:00 +0000 - title: 'Improving Sepsis Prediction Model Generalization With Optimal Transport' abstract: 'Sepsis is a deadly condition affecting many patients in the hospital. There have been many efforts to build models that predict the onset of sepsis, but these models tend to perform terribly when validated on external data from different hospitals due to distributional shifts in the data and insufficient samples from sepsis patients. To circumvent the curse from noisy and unbalanced samples, we develop a novel two-step approach for sepsis prediction: given feature-label points from the source domain and feature points from the target domain, to obtain a sepsis predictor that has satisfactory performance at the target domain. 
The proposed algorithm first learns how to transform sample points from the source domain to the target domain, and then applies the distributionally robust optimization (DRO) technique with the Sinkhorn distance and asymmetric cost function to reliably obtain a classifier with satisfactory out-of-sample performance. Connections between our proposed formulation and widely used classification models, i.e., DRO formulation with the Wasserstein distance and regularized logistic regression formulation, are also uncovered. Numerical experiments with synthetic and real datasets demonstrate the competitive performance of the proposed method.' volume: 193 URL: https://proceedings.mlr.press/v193/wang22a.html PDF: https://proceedings.mlr.press/v193/wang22a/wang22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-wang22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Jie family: Wang - given: Ronald family: Moore - given: Yao family: Xie - given: Rishikesan family: Kamaleswaran editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 474-488 id: wang22a issued: date-parts: - 2022 - 11 - 22 firstpage: 474 lastpage: 488 published: 2022-11-22 00:00:00 +0000 - title: 'A Path Towards Clinical Adaptation of Accelerated MRI' abstract: 'Accelerated MRI reconstructs images of clinical anatomies from sparsely sampled signal data to reduce patient scan times. While recent works have leveraged deep learning to accomplish this task, such approaches have often only been explored in simulated environments where there is no signal corruption or resource limitations. 
In this work, we explore augmentations to neural network MRI image reconstructors to enhance their clinical relevancy. Namely, we propose a ConvNet model for detecting sources of image artifacts that achieves a classifier $\mathit{F}_{2}$ score of 79.1%. We also demonstrate that training reconstructors on MR signal data with variable acceleration factors can improve their average performance during a clinical patient scan by up to 2%. We offer a loss function to overcome catastrophic forgetting when models learn to reconstruct MR images of multiple anatomies and orientations. Finally, we propose a method for using simulated phantom data to pre-train reconstructors in situations with limited clinically acquired datasets and compute capabilities. Our results provide a potential path forward for clinical adaptation of accelerated MRI.' volume: 193 URL: https://proceedings.mlr.press/v193/yao22a.html PDF: https://proceedings.mlr.press/v193/yao22a/yao22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-yao22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Michael S. family: Yao - given: Michael S. family: Hansen editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 489-511 id: yao22a issued: date-parts: - 2022 - 11 - 22 firstpage: 489 lastpage: 511 published: 2022-11-22 00:00:00 +0000 - title: 'Machine and Deep Learning Methods for Predicting Immune Checkpoint Blockade Response' abstract: 'Immune checkpoint blockade (ICB) therapy has improved treatment options in various cancer malignancies and holds promise for increasing the overall survival of treated patients. However, only a small proportion of patients benefit from ICB treatment. 
Furthermore, ICB therapy has been known to induce adverse autoimmunity reactions in certain patients. These two reasons motivate the clinical need to identify factors that predict a patient’s response to ICB treatment. In our study, we developed several machine and deep learning-based models to predict response to ICB treatment, using a real-world tabular dataset across sixteen cancer types. We showed that our best model CB16, which is based on gradient boosting, outperforms all-known published results for this task, with sensitivity and specificity scores of 80.6% and 78.8% respectively. Our model also offers insights to clinical interpretability through the use of the SHAP explanation framework, which are consistent with known important predictors. Next, in order to see if deep learning can improve performance, we propose a methodology for the design of deep neural networks that addresses the lack of spatial and temporal structure in tabular data. Our approach is based on a combination of learning ordered representations and ensembling techniques. We show that, for the ICB prediction problem, current SOTA deep-learning architectures such as TabNet and TabTransformer do not perform well while our method achieves good performance. Our method achieves an F1 score 12.4 percentage points beyond that of TabTransformer, and sensitivity and specificity scores of 77.3% and 62.2% respectively. Through our work, we hope to improve the task of predicting ICB response, and contribute towards the creation of high-performance and interpretable AI models for real-world tabular data.' 
volume: 193 URL: https://proceedings.mlr.press/v193/ho22a.html PDF: https://proceedings.mlr.press/v193/ho22a/ho22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-ho22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Danliang family: Ho - given: Mehul family: Motani editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 512-529 id: ho22a issued: date-parts: - 2022 - 11 - 22 firstpage: 512 lastpage: 529 published: 2022-11-22 00:00:00 +0000 - title: 'Deep Kernel Learning with Temporal Gaussian Processes for Clinical Variable Prediction in Alzheimer’s Disease' abstract: 'Longitudinal prediction of Alzheimer’s disease progression is of high importance for early diagnosis and clinical trial design. We propose to predict the longitudinal changes of neuroimaging biomarkers and cognitive scores by leveraging the expressivity of Deep Kernel Learning with single-task Gaussian Processes. The temporal function that describes the progression of the biomarker is learned through a Gaussian Process. By learning these temporal functions we can predict any future value of a clinical variable. We apply our method for extrapolation of neuroimaging biomarkers, SPARE-AD index, and cognitive metric Adas-Cog13, both significant predictors for the pathological and cognitive changes of Alzheimer’s Disease. The method has been validated in two cohorts, ADNI and BLSA, where the results show that the proposed method significantly outperforms baselines and state-of-the-art models in AD progression prediction both on providing point estimates and quantifying uncertainty.' 
volume: 193 URL: https://proceedings.mlr.press/v193/tassopoulou22a.html PDF: https://proceedings.mlr.press/v193/tassopoulou22a/tassopoulou22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-tassopoulou22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Vasiliki family: Tassopoulou - given: Fanyang family: Yu - given: Christos family: Davatzikos editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 539-551 id: tassopoulou22a issued: date-parts: - 2022 - 11 - 22 firstpage: 539 lastpage: 551 published: 2022-11-22 00:00:00 +0000 - title: 'Instability in clinical risk stratification models using deep learning' abstract: 'While it has been well known in the ML community that deep learning models suffer from instability, the consequences for healthcare deployments are under characterised. We study the stability of different model architectures trained on electronic health records, using a set of outpatient prediction tasks as a case study. We show that repeated training runs of the same deep learning model on the same training data can result in significantly different outcomes at a patient level even though global performance metrics remain stable. We propose two stability metrics for measuring the effect of randomness of model training, as well as mitigation strategies for improving model stability. 
' volume: 193 URL: https://proceedings.mlr.press/v193/lopez-martinez22a.html PDF: https://proceedings.mlr.press/v193/lopez-martinez22a/lopez-martinez22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-lopez-martinez22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Daniel family: Lopez-Martinez - given: Alex family: Yakubovich - given: Martin family: Seneviratne - given: Adam D. family: Lelkes - given: Akshit family: Tyagi - given: Jonas family: Kemp - given: Ethan family: Steinberg - given: N. Lance family: Downing - given: Ron C. family: Li - given: Keith E. family: Morse - given: Nigam H. family: Shah - given: Ming-Jun family: Chen editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 552-565 id: lopez-martinez22a issued: date-parts: - 2022 - 11 - 22 firstpage: 552 lastpage: 565 published: 2022-11-22 00:00:00 +0000 - title: 'A for-loop is all you need. For solving the inverse problem in the case of personalized tumor growth modeling' abstract: 'Solving the inverse problem is the key step in evaluating the capacity of a physical model to describe real phenomena. In medical image computing, it aligns with the classical theme of image-based model personalization. Traditionally, a solution to the problem is obtained by performing either sampling or variational inference based methods. Both approaches aim to identify a set of free physical model parameters that results in a simulation best matching an empirical observation. When applied to brain tumor modeling, one of the instances of image-based model personalization in medical image computing, the overarching drawback of the methods is the time complexity of finding such a set. 
In a clinical setting with limited time between imaging and diagnosis or even intervention, this time complexity may prove critical. As the history of quantitative science is the history of compression (Schmidhuber and Fridman, 2018), we align in this paper with the historical tendency and propose a method compressing complex traditional strategies for solving an inverse problem into a simple database query task. We evaluated different ways of performing the database query task, assessing the trade-off between accuracy and execution time. On the exemplary task of brain tumor growth modeling, we prove that the proposed method achieves a speed-up of one order of magnitude compared to existing approaches for solving the inverse problem. The resulting compute time offers critical means for relying on more complex and, hence, realistic models, for integrating image preprocessing and inverse modeling even deeper, or for implementing the current model into a clinical workflow. The code is available at https://github.com/IvanEz/for-loop-tumor.' volume: 193 URL: https://proceedings.mlr.press/v193/ezhov22a.html PDF: https://proceedings.mlr.press/v193/ezhov22a/ezhov22a.pdf edit: https://github.com/mlresearch//v193/edit/gh-pages/_posts/2022-11-22-ezhov22a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of the 2nd Machine Learning for Health symposium' publisher: 'PMLR' author: - given: Ivan family: Ezhov - given: Marcel family: Rosier - given: Lucas family: Zimmer - given: Florian family: Kofler - given: Suprosanna family: Shit - given: Johannes C. family: Paetzold - given: Kevin family: Scibilia - given: Felix family: Steinbauer - given: Leon family: Maechler - given: Katharina family: Franitza - given: Tamaz family: Amiranashvili - given: Martin J. 
family: Menten - given: Marie family: Metz - given: Sailesh family: Conjeti - given: Benedikt family: Wiestler - given: Bjoern family: Menze editor: - given: Antonio family: Parziale - given: Monica family: Agrawal - given: Shalmali family: Joshi - given: Irene Y. family: Chen - given: Shengpu family: Tang - given: Luis family: Oala - given: Adarsh family: Subbaswamy page: 566-577 id: ezhov22a issued: date-parts: - 2022 - 11 - 22 firstpage: 566 lastpage: 577 published: 2022-11-22 00:00:00 +0000