- title: 'Machine Learning for Health (ML4H) 2021' volume: 158 URL: https://proceedings.mlr.press/v158/roy21a.html PDF: https://proceedings.mlr.press/v158/roy21a/roy21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-roy21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 1-12 id: roy21a issued: date-parts: - 2021 - 11 - 28 firstpage: 1 lastpage: 12 published: 2021-11-28 00:00:00 +0000 - title: 'Question Answering for Complex Electronic Health Records Database using Unified Encoder-Decoder Architecture' abstract: 'An intelligent machine that can answer human questions based on electronic health records (EHR-QA) has great practical value, such as supporting clinical decisions, managing hospital administration, and powering medical chatbots. However, previous table-based QA studies focusing on translating natural language questions into table queries (NLQ2SQL) suffer from the unique nature of EHR data: complex and specialized medical terminology increases decoding difficulty. In this paper, we design UniQA, a unified encoder-decoder architecture for EHR-QA where natural language questions are converted to queries such as SQL or SPARQL. We also propose input masking (IM), a simple and effective method to cope with complex medical terms and various typos and better learn the SQL/SPARQL syntax. Combining the unified architecture with an effective auxiliary training objective, UniQA demonstrated a significant performance improvement against the previous state-of-the-art model for MIMICSQL* (14.2% gain), the most complex NLQ2SQL dataset in the EHR domain, and its typo-ridden versions (28.8% gain). In addition, we confirmed consistent results for the graph-based EHR-QA dataset, MIMICSPARQL*.' volume: 158 URL: https://proceedings.mlr.press/v158/bae21a.html PDF: https://proceedings.mlr.press/v158/bae21a/bae21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-bae21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Seongsu family: Bae - given: Daeyoung family: Kim - given: Jiho family: Kim - given: Edward family: Choi editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. 
family: McDermott - given: Emily family: Alsentzer page: 13-25 id: bae21a issued: date-parts: - 2021 - 11 - 28 firstpage: 13 lastpage: 25 published: 2021-11-28 00:00:00 +0000 - title: 'Attention Distillation for Detection Transformers: Application to Real-Time Video Object Detection in Ultrasound' abstract: 'We introduce a method for efficient knowledge distillation of transformer-based object detectors. The proposed “attention distillation” makes use of the self-attention matrices generated within the layers of the state-of-the-art detection transformer (DETR) model. Localization information from the attention maps of a large teacher network is distilled into smaller student networks capable of running at much higher speeds. We further investigate distilling spatio-temporal information captured by 3D detection transformer networks into 2D object detectors that only process single frames. We apply the approach to the clinically important problem of detecting medical instruments in real time from ultrasound video sequences, where inference speed is critical on computationally resource-limited hardware. We observe that, via attention distillation, student networks are able to approach the detection performance of larger teacher networks, while meeting strict computational requirements. Experiments demonstrate notable gains in accuracy and speed compared to detection transformer models trained without attention distillation.' volume: 158 URL: https://proceedings.mlr.press/v158/rubin21a.html PDF: https://proceedings.mlr.press/v158/rubin21a/rubin21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-rubin21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Jonathan family: Rubin - given: Ramon family: Erkamp - given: Ragha Srinivasa family: Naidu - given: Anumod Odungatta family: Thodiyil - given: Alvin family: Chen editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 26-37 id: rubin21a issued: date-parts: - 2021 - 11 - 28 firstpage: 26 lastpage: 37 published: 2021-11-28 00:00:00 +0000 - title: 'Towards Explainable End-to-End Prostate Cancer Relapse Prediction from H&E Images Combining Self-Attention Multiple Instance Learning with a Recurrent Neural Network' abstract: 'Clinical decision support for histopathology image data mainly focuses on strongly supervised annotations, which offers intuitive interpretability, but is bound by expert performance. Here, we propose an explainable cancer relapse prediction network (eCaReNet) and show that end-to-end learning without strong annotations offers state-of-the-art performance while interpretability can be included through an attention mechanism. On the use case of prostate cancer survival prediction, using 14,479 images and only relapse times as annotations, we reach a cumulative dynamic AUC of 0.78 on a validation set, being on par with an expert pathologist (and an AUC of 0.77 on a separate test set). Our model is well-calibrated and outputs survival curves as well as a risk score and group per patient. 
Making use of the attention weights of a multiple instance learning layer, we show that malignant patches have a higher influence on the prediction than benign patches, thus offering an intuitive interpretation of the prediction. Our code is available at www.github.com/imsb-uke/ecarenet.' volume: 158 URL: https://proceedings.mlr.press/v158/dietrich21a.html PDF: https://proceedings.mlr.press/v158/dietrich21a/dietrich21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-dietrich21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Esther family: Dietrich - given: Patrick family: Fuhlert - given: Anne family: Ernst - given: Guido family: Sauter - given: Maximilian family: Lennartz - given: H. Siegfried family: Stiehl - given: Marina family: Zimmermann - given: Stefan family: Bonn editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 38-53 id: dietrich21a issued: date-parts: - 2021 - 11 - 28 firstpage: 38 lastpage: 53 published: 2021-11-28 00:00:00 +0000 - title: 'How Transferable are Self-supervised Features in Medical Image Classification Tasks?' abstract: 'Transfer learning has become a standard practice to mitigate the lack of labeled data in medical classification tasks. Whereas finetuning a downstream task using supervised ImageNet pretrained features is straightforward and extensively investigated in many works, there is little study on the usefulness of self-supervised pretraining. This paper assesses the transferability of the most recent self-supervised ImageNet models, including SimCLR, SwAV, and DINO, on selected medical imaging classification tasks. The chosen tasks cover tumor detection in sentinel axillary lymph node images, diabetic retinopathy classification in fundus images, and multiple pathological condition classification in chest X-ray images. We demonstrate that self-supervised pretrained models yield richer embeddings than their supervised counterparts, benefiting downstream tasks for linear evaluation and finetuning. For example, at a critically small subset of the data with linear evaluation, we see an improvement up to 14.79% in Kappa score in the diabetic retinopathy classification task, 5.4% in AUC in the tumor classification task, 7.03% AUC in the pneumonia detection, and 9.4% in AUC in the detection of pathological conditions in chest X-ray. In addition, we introduce Dynamic Visual Meta-Embedding (DVME) as an end-to-end transfer learning approach that fuses pretrained embeddings from multiple models. We show that the collective representation obtained by DVME leads to a significant improvement in the performance of selected tasks compared to using a single pretrained model approach and can be generalized to any combination of pretrained models.' 
volume: 158 URL: https://proceedings.mlr.press/v158/truong21a.html PDF: https://proceedings.mlr.press/v158/truong21a/truong21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-truong21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Tuan family: Truong - given: Sadegh family: Mohammadi - given: Matthias family: Lenga editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 54-74 id: truong21a issued: date-parts: - 2021 - 11 - 28 firstpage: 54 lastpage: 74 published: 2021-11-28 00:00:00 +0000 - title: 'SmartTriage: A system for personalized patient data capture, documentation generation, and decision support' abstract: 'Symptom checkers have emerged as an important tool for collecting symptoms and diagnosing patients, minimizing the involvement of clinical personnel. We developed a machine-learning-backed system, SmartTriage, which goes beyond conventional symptom checking through a tight bi-directional integration with the electronic medical record (EMR). Conditioned on EMR-derived patient history, our system identifies the patient’s chief complaint from a free-text entry and then asks a series of discrete questions to obtain relevant symptomatology. The patient-specific data are used to predict detailed ICD-10-CM codes as well as medication, laboratory, and imaging orders. Patient responses and clinical decision support (CDS) predictions are then inserted back into the EMR. To train the machine learning components of SmartTriage, we employed novel data sets of over 25 million primary care encounters and 1 million patient free-text reason-for-visit entries. These data sets were used to construct: (1) a long short-term memory (LSTM) based patient history representation, (2) a fine-tuned transformer model for chief complaint extraction, (3) a random forest model for question sequencing, and (4) a feed-forward network for CDS predictions. In total, our system supports 337 patient chief complaints, which together make up >90% of all primary care encounters at Kaiser Permanente.' volume: 158 URL: https://proceedings.mlr.press/v158/valmianski21a.html PDF: https://proceedings.mlr.press/v158/valmianski21a/valmianski21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-valmianski21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Ilya family: Valmianski - given: Nave family: Frost - given: Navdeep family: Sood - given: Yang family: Wang - given: Baodong family: Liu - given: James J. family: Zhu - given: Sunil family: Karumuri - given: Ian M. family: Finn - given: Daniel S. family: Zisook editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. 
family: McDermott - given: Emily family: Alsentzer page: 75-96 id: valmianski21a issued: date-parts: - 2021 - 11 - 28 firstpage: 75 lastpage: 96 published: 2021-11-28 00:00:00 +0000 - title: 'Prognosticating Colorectal Cancer Recurrence using an Interpretable Deep Multi-view Network' abstract: 'Colorectal cancer (CRC) is among the top three most common cancers worldwide, and around 30-50% of patients who have undergone curative-intent surgery will eventually develop recurrence. Early and accurate detection of cancer recurrence is essential to improve the health outcomes of patients. In our study, we propose an explainable multi-view deep neural network capable of extracting and integrating features from heterogeneous healthcare records. Our model takes in inputs from multiple views and comprises: 1) two subnetworks adapted to extract high quality features from time-series and tabular data views, and 2) a network that combines the two outputs and predicts CRC recurrence. Our model achieves an AUROC score of 0.95, and precision, sensitivity and specificity scores of 0.84, 0.82 and 0.96 respectively, outperforming all known published results based on the commonly used CEA prognostic marker, as well as those of most commercially available diagnostic assays. We explain our model’s decision by highlighting important features within both data views that contribute to the outcome, using SHAP with a novel workaround that alleviates assumptions on feature independence. Through our work, we hope to contribute to the adoption of AI in healthcare by creating accurate and interpretable models, leading to better post-operative management of CRC patients.' volume: 158 URL: https://proceedings.mlr.press/v158/ho21a.html PDF: https://proceedings.mlr.press/v158/ho21a/ho21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-ho21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Danliang family: Ho - given: Iain Bee Huat family: Tan - given: Mehul family: Motani editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 97-109 id: ho21a issued: date-parts: - 2021 - 11 - 28 firstpage: 97 lastpage: 109 published: 2021-11-28 00:00:00 +0000 - title: 'MEDCOD: A Medically-Accurate, Emotive, Diverse, and Controllable Dialog System' abstract: 'We present MEDCOD, a Medically-Accurate, Emotive, Diverse, and Controllable Dialog system with a unique approach to the natural language generator module. MEDCOD has been developed and evaluated specifically for the history taking task. It integrates the advantages of a traditional modular approach to incorporate (medical) domain knowledge with modern deep learning techniques to generate flexible, human-like natural language expressions. Two key aspects of MEDCOD’s natural language output are described in detail. First, the generated sentences are emotive and empathetic, similar to how a doctor would communicate to the patient. 
Second, the generated sentence structures and phrasings are varied and diverse while maintaining medical consistency with the desired medical concept (provided by the dialogue manager module of MEDCOD). Experimental results demonstrate the effectiveness of our approach in creating a human-like medical dialogue system. Relevant code is available at https://github.com/curai/curai-research/tree/main/MEDCOD.' volume: 158 URL: https://proceedings.mlr.press/v158/compton21a.html PDF: https://proceedings.mlr.press/v158/compton21a/compton21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-compton21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Rhys family: Compton - given: Ilya family: Valmianski - given: Li family: Deng - given: Costa family: Huang - given: Namit family: Katariya - given: Xavier family: Amatriain - given: Anitha family: Kannan editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 110-129 id: compton21a issued: date-parts: - 2021 - 11 - 28 firstpage: 110 lastpage: 129 published: 2021-11-28 00:00:00 +0000 - title: 'Domain-guided Self-supervision of EEG Data Improves Downstream Classification Performance and Generalizability' abstract: 'This paper presents a domain-guided approach for learning representations of scalp-electroencephalograms (EEGs) without relying on expert annotations. Expert labeling of EEGs has proven to be an unscalable process with low inter-reviewer agreement because of the complex and lengthy nature of EEG recordings. Hence, there is a need for machine learning (ML) approaches that can leverage expert domain knowledge without incurring the cost of labor-intensive annotations. Self-supervised learning (SSL) has shown promise in such settings, although existing SSL efforts on EEG data do not fully exploit EEG domain knowledge. Furthermore, it is unclear to what extent SSL models generalize to unseen tasks and datasets. Here we explore whether SSL tasks derived in a domain-guided fashion can learn generalizable EEG representations. Our contributions are three-fold: 1) we propose novel SSL tasks for EEG based on the spatial similarity of brain activity, underlying behavioral states, and age-related differences; 2) we present evidence that an encoder pretrained using the proposed SSL tasks shows strong predictive performance on multiple downstream classifications; and 3) using two large EEG datasets, we show that our encoder generalizes well to multiple EEG datasets during downstream evaluations.' 
volume: 158 URL: https://proceedings.mlr.press/v158/wagh21a.html PDF: https://proceedings.mlr.press/v158/wagh21a/wagh21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-wagh21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Neeraj family: Wagh - given: Jionghao family: Wei - given: Samarth family: Rawal - given: Brent family: Berry - given: Leland family: Barnard - given: Benjamin family: Brinkmann - given: Gregory family: Worrell - given: David family: Jones - given: Yogatheesan family: Varatharajah editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 130-142 id: wagh21a issued: date-parts: - 2021 - 11 - 28 firstpage: 130 lastpage: 142 published: 2021-11-28 00:00:00 +0000 - title: 'Deconfounding Temporal Autoencoder: Estimating Treatment Effects over Time Using Noisy Proxies' abstract: 'Estimating individualized treatment effects (ITEs) from observational data is crucial for decision-making. In order to obtain unbiased ITE estimates, a common assumption is that all confounders are observed. However, in practice, it is unlikely that we observe these confounders directly. Instead, we often observe noisy measurements of true confounders, which can serve as valid proxies. In this paper, we address the problem of estimating ITE in the longitudinal setting where we observe noisy proxies instead of true confounders. To this end, we develop the Deconfounding Temporal Autoencoder (DTA), a novel method that leverages observed noisy proxies to learn a hidden embedding that reflects the true hidden confounders. In particular, the DTA combines a long short-term memory autoencoder with a causal regularization penalty that renders the potential outcomes and treatment assignment conditionally independent given the learned hidden embedding. Once the hidden embedding is learned via DTA, state-of-the-art outcome models can be used to control for it and obtain unbiased estimates of ITE. Using synthetic and real-world medical data, we demonstrate the effectiveness of our DTA by improving over state-of-the-art benchmarks by a substantial margin.' volume: 158 URL: https://proceedings.mlr.press/v158/kuzmanovic21a.html PDF: https://proceedings.mlr.press/v158/kuzmanovic21a/kuzmanovic21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-kuzmanovic21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Milan family: Kuzmanovic - given: Tobias family: Hatt - given: Stefan family: Feuerriegel editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. 
family: McDermott - given: Emily family: Alsentzer page: 143-155 id: kuzmanovic21a issued: date-parts: - 2021 - 11 - 28 firstpage: 143 lastpage: 155 published: 2021-11-28 00:00:00 +0000 - title: '3KG: Contrastive Learning of 12-Lead Electrocardiograms using Physiologically-Inspired Augmentations' abstract: 'We propose 3KG, a physiologically-inspired contrastive learning approach that generates views using 3D augmentations of the 12-lead electrocardiogram. We evaluate representation quality by fine-tuning a linear layer for the downstream task of 23-class diagnosis on the PhysioNet 2020 challenge training data and find that 3KG achieves a 9.1% increase in mean AUC over the best self-supervised baseline when trained on 1% of labeled data. Our empirical analysis shows that combining spatial and temporal augmentations produces the strongest representations. In addition, we investigate the effect of this physiologically-inspired pretraining on downstream performance on different disease subgroups and find that 3KG makes the greatest gains for conduction and rhythm abnormalities. Our method allows for flexibility in incorporating other self-supervised strategies and highlights the potential for similar modality-specific augmentations for other biomedical signals.' volume: 158 URL: https://proceedings.mlr.press/v158/gopal21a.html PDF: https://proceedings.mlr.press/v158/gopal21a/gopal21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-gopal21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Bryan family: Gopal - given: Ryan family: Han - given: Gautham family: Raghupathi - given: Andrew family: Ng - given: Geoff family: Tison - given: Pranav family: Rajpurkar editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 156-167 id: gopal21a issued: date-parts: - 2021 - 11 - 28 firstpage: 156 lastpage: 167 published: 2021-11-28 00:00:00 +0000 - title: 'Image Classification with Consistent Supporting Evidence' abstract: 'Adoption of machine learning models in healthcare requires end users’ trust in the system. Models that provide additional supportive evidence for their predictions promise to facilitate adoption. We define consistent evidence to be both compatible and sufficient with respect to model predictions. We propose measures of model inconsistency and regularizers that promote more consistent evidence. We demonstrate our ideas in the context of edema severity grading from chest radiographs. We demonstrate empirically that consistent models provide competitive performance while supporting interpretation.' 
volume: 158 URL: https://proceedings.mlr.press/v158/wang21a.html PDF: https://proceedings.mlr.press/v158/wang21a/wang21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-wang21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Peiqi family: Wang - given: Ruizhi family: Liao - given: Daniel family: Moyer - given: Seth family: Berkowitz - given: Steven family: Horng - given: Polina family: Golland editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 168-180 id: wang21a issued: date-parts: - 2021 - 11 - 28 firstpage: 168 lastpage: 180 published: 2021-11-28 00:00:00 +0000 - title: 'Early Exit Ensembles for Uncertainty Quantification' abstract: 'Deep learning is increasingly used for decision-making in health applications. However, commonly used deep learning models are deterministic and are unable to provide any estimate of predictive uncertainty. Quantifying model uncertainty is crucial for reducing the risk of misdiagnosis by informing practitioners of low-confident predictions. To address this issue, we propose early exit ensembles, a novel framework capable of capturing predictive uncertainty via an implicit ensemble of early exits. We evaluate our approach on the task of classification using three state-of-the-art deep learning architectures applied to three medical imaging datasets. Our experiments show that early exit ensembles provide better-calibrated uncertainty compared to Monte Carlo dropout and deep ensembles using just a single forward-pass of the model. Depending on the dataset and baseline, early exit ensembles can improve uncertainty metrics up to 2x, while increasing accuracy by up to 2% over its single model counterpart. Finally, our results suggest that by providing well-calibrated predictive uncertainty for both in- and out-of-distribution inputs, early exit ensembles have the potential to improve trustworthiness of models in high-risk medical decision-making.' volume: 158 URL: https://proceedings.mlr.press/v158/qendro21a.html PDF: https://proceedings.mlr.press/v158/qendro21a/qendro21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-qendro21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Lorena family: Qendro - given: Alexander family: Campbell - given: Pietro family: Lio - given: Cecilia family: Mascolo editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. 
family: McDermott - given: Emily family: Alsentzer page: 181-195 id: qendro21a issued: date-parts: - 2021 - 11 - 28 firstpage: 181 lastpage: 195 published: 2021-11-28 00:00:00 +0000 - title: 'RadBERT-CL: Factually-Aware Contrastive Learning For Radiology Report Classification' abstract: 'Radiology reports are unstructured and contain the imaging findings and corresponding diagnoses transcribed by radiologists, which include clinical facts and negated and/or uncertain statements. Extracting pathologic findings and diagnoses from radiology reports is important for quality control, population health, and monitoring of disease progress. Existing works rely primarily either on rule-based systems or on fine-tuning transformer-based pre-trained models, but do not take factual and uncertain information into consideration and therefore generate false-positive outputs. In this work, we introduce three sedulous augmentation techniques which retain factual and critical information while generating augmentations for contrastive learning. We introduce RadBERT-CL, which fuses this information into BlueBert via a self-supervised contrastive loss. Our experiments on MIMIC-CXR show superior performance of RadBERT-CL on fine-tuning for multi-class, multi-label report classification. We illustrate that when few labeled data are available, RadBERT-CL outperforms conventional SOTA transformers (BERT/BlueBert) by significantly larger margins (6-11%). We also show that the representations learned by RadBERT-CL can capture critical medical information in the latent space.' volume: 158 URL: https://proceedings.mlr.press/v158/jaiswal21a.html PDF: https://proceedings.mlr.press/v158/jaiswal21a/jaiswal21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-jaiswal21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Ajay family: Jaiswal - given: Liyan family: Tang - given: Meheli family: Ghosh - given: Justin F. family: Rousseau - given: Yifan family: Peng - given: Ying family: Ding editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 196-208 id: jaiswal21a issued: date-parts: - 2021 - 11 - 28 firstpage: 196 lastpage: 208 published: 2021-11-28 00:00:00 +0000 - title: 'Retrieval-Based Chest X-Ray Report Generation Using a Pre-trained Contrastive Language-Image Model' abstract: 'We propose CXR-RePaiR: a retrieval-based radiology report generation approach using a pre-trained contrastive language-image model. Our method generates clinically accurate reports on both in-distribution and out-of-distribution data. CXR-RePaiR outperforms or matches prior report generation methods on clinical metrics, achieving an average F$_1$ score of 0.352 ($\Delta$ + 7.98%) on an external radiology dataset (CheXpert). Further, we implement a compression approach to reduce the size of the reference corpus and speed up the runtime of our retrieval method. With compression, our model maintains similar performance while producing reports 70% faster than the best generative model. 
Our approach can be broadly useful in improving the diagnostic performance and generalizability of report generation models and enabling their use in clinical workflows.' volume: 158 URL: https://proceedings.mlr.press/v158/endo21a.html PDF: https://proceedings.mlr.press/v158/endo21a/endo21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-endo21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Mark family: Endo - given: Rayan family: Krishnan - given: Viswesh family: Krishna - given: Andrew Y. family: Ng - given: Pranav family: Rajpurkar editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 209-219 id: endo21a issued: date-parts: - 2021 - 11 - 28 firstpage: 209 lastpage: 219 published: 2021-11-28 00:00:00 +0000 - title: 'Longitudinal patient stratification of electronic health records with flexible adjustment for clinical outcomes' abstract: 'The increase in availability of longitudinal EHR data is leading to improved understanding of diseases and discovery of novel phenotypes. The majority of clustering algorithms focus only on patient trajectories, yet patients with similar trajectories may have different outcomes. Finding subgroups of patients with different trajectories and outcomes can guide future drug development and improve recruitment to clinical trials. We develop a recurrent neural network autoencoder to cluster EHR data using reconstruction, outcome, and clustering losses which can be weighted to find different types of patient clusters. We show our model is able to discover known clusters from both data biases and outcome differences, outperforming baseline models. We demonstrate the model performance on 29,229 diabetes patients, showing it finds clusters of patients with both different trajectories and different outcomes which can be utilized to aid clinical decision making.' volume: 158 URL: https://proceedings.mlr.press/v158/carr21a.html PDF: https://proceedings.mlr.press/v158/carr21a/carr21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-carr21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Oliver family: Carr - given: Avelino family: Javer - given: Patrick family: Rockenschaub - given: Owen family: Parsons - given: Robert family: Durichen editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. 
family: McDermott - given: Emily family: Alsentzer page: 220-238 id: carr21a issued: date-parts: - 2021 - 11 - 28 firstpage: 220 lastpage: 238 published: 2021-11-28 00:00:00 +0000 - title: 'CEHR-BERT: Incorporating temporal information from structured EHR data to improve prediction tasks' abstract: 'Embedding algorithms are increasingly used to represent clinical concepts in healthcare for improving machine learning tasks such as clinical phenotyping and disease prediction. Recent studies have adapted state-of-the-art bidirectional encoder representations from transformers (BERT) architecture to structured electronic health records (EHR) data for the generation of contextualized concept embeddings, yet do not fully incorporate temporal data across multiple clinical domains. Therefore we developed a new BERT adaptation, CEHR-BERT, to incorporate temporal information using a hybrid approach by augmenting the input to BERT using artificial time tokens, incorporating time, age, and concept embeddings, and introducing a new second learning objective for visit type. CEHR-BERT was trained on a subset of clinical data from Columbia University Irving Medical Center-New York Presbyterian Hospital, which includes 2.4M patients, spanning over three decades, and tested using 4-fold evaluation on the following prediction tasks: hospitalization, death, new heart failure (HF) diagnosis, and HF readmission. Our experiments show that CEHR-BERT outperformed existing state-of-the-art clinical BERT adaptations and baseline models across all 4 prediction tasks in both ROC-AUC and PR-AUC. CEHR-BERT also demonstrated strong few-shot learning capability, as our model trained on only 5% of data outperformed comparison models trained on the entire data set. Ablation studies to better understand the contribution of each time component showed incremental gains with every element, suggesting that CEHR-BERT’s incorporation of artificial time tokens, time/age embeddings with concept embeddings, and the addition of the second learning objective represents a promising approach for future BERT-based clinical embeddings.' volume: 158 URL: https://proceedings.mlr.press/v158/pang21a.html PDF: https://proceedings.mlr.press/v158/pang21a/pang21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-pang21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Chao family: Pang - given: Xinzhuo family: Jiang - given: Krishna S. family: Kalluri - given: Matthew family: Spotnitz - given: RuiJun family: Chen - given: Adler family: Perotte - given: Karthik family: Natarajan editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 239-260 id: pang21a issued: date-parts: - 2021 - 11 - 28 firstpage: 239 lastpage: 260 published: 2021-11-28 00:00:00 +0000 - title: 'End-to-End Sequential Sampling and Reconstruction for MRI' abstract: 'Accelerated MRI shortens acquisition time by subsampling in the measurement $\kappa$-space. 
Recovering a high-fidelity anatomical image from subsampled measurements requires close cooperation between two components: (1) a sampler that chooses the subsampling pattern and (2) a reconstructor that recovers images from incomplete measurements. In this paper, we leverage the sequential nature of MRI measurements, and propose a fully differentiable framework that jointly learns a sequential sampling policy simultaneously with a reconstruction strategy. This co-designed framework is able to adapt during acquisition in order to capture the most informative measurements for a particular target. Experimental results on the fastMRI knee dataset demonstrate that the proposed approach successfully utilizes intermediate information during the sampling process to boost reconstruction performance. In particular, our proposed method can outperform the current state-of-the-art learned $\kappa$-space sampling baseline on over 96% of test samples. We also investigate the individual and collective benefits of the sequential sampling and co-design strategies.' volume: 158 URL: https://proceedings.mlr.press/v158/yin21a.html PDF: https://proceedings.mlr.press/v158/yin21a/yin21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-yin21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Tianwei family: Yin - given: Zihui family: Wu - given: He family: Sun - given: Adrian V. family: Dalca - given: Yisong family: Yue - given: Katherine L. family: Bouman editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 261-281 id: yin21a issued: date-parts: - 2021 - 11 - 28 firstpage: 261 lastpage: 281 published: 2021-11-28 00:00:00 +0000 - title: 'G-Net: a Recurrent Network Approach to G-Computation for Counterfactual Prediction Under a Dynamic Treatment Regime' abstract: 'Counterfactual prediction is a fundamental task in decision-making. This paper introduces G-Net, a sequential deep learning framework for counterfactual prediction under dynamic time-varying treatment strategies in complex longitudinal settings. G-Net is based upon g-computation, a causal inference method for estimating effects of general dynamic treatment strategies. Past g-computation implementations have mostly been built using classical regression models. G-Net instead adopts a recurrent neural network framework to capture complex temporal and nonlinear dependencies in the data. To our knowledge, G-Net is the first g-computation based deep sequential modeling framework that provides estimates of treatment effects under \em{dynamic} and \em{time-varying} treatment strategies. We evaluate G-Net using simulated longitudinal data from two sources: CVSim, a mechanistic model of the cardiovascular system, and a pharmacokinetic simulation of tumor growth. G-Net outperforms both classical and state-of-the-art counterfactual prediction models in these settings.' 
volume: 158 URL: https://proceedings.mlr.press/v158/li21a.html PDF: https://proceedings.mlr.press/v158/li21a/li21a.pdf edit: https://github.com/mlresearch//v158/edit/gh-pages/_posts/2021-11-28-li21a.md series: 'Proceedings of Machine Learning Research' container-title: 'Proceedings of Machine Learning for Health' publisher: 'PMLR' author: - given: Rui family: Li - given: Stephanie family: Hu - given: Mingyu family: Lu - given: Yuria family: Utsumi - given: Prithwish family: Chakraborty - given: Daby M. family: Sow - given: Piyush family: Madan - given: Jun family: Li - given: Mohamed family: Ghalwash - given: Zach family: Shahn - given: Li-wei family: Lehman editor: - given: Subhrajit family: Roy - given: Stephen family: Pfohl - given: Emma family: Rocheteau - given: Girmaw Abebe family: Tadesse - given: Luis family: Oala - given: Fabian family: Falck - given: Yuyin family: Zhou - given: Liyue family: Shen - given: Ghada family: Zamzmi - given: Purity family: Mugambi - given: Ayah family: Zirikly - given: Matthew B. A. family: McDermott - given: Emily family: Alsentzer page: 282-299 id: li21a issued: date-parts: - 2021 - 11 - 28 firstpage: 282 lastpage: 299 published: 2021-11-28 00:00:00 +0000