Zero-Shot Clinical Acronym Expansion via Latent Meaning Cells

Griffin Adams, Mert Ketenci, Shreyas Bhave, Adler Perotte, Noémie Elhadad
Proceedings of the Machine Learning for Health NeurIPS Workshop, PMLR 136:12-40, 2020.

Abstract

We introduce Latent Meaning Cells (LMC), a deep latent variable model which learns contextualized representations of words by combining local lexical context and metadata. Metadata can refer to granular context, such as section type, or to more global context, such as unique document ids. Reliance on metadata for contextualized representation learning is apropos in the clinical domain, where text is semi-structured and expresses high variation in topics. We evaluate the LMC model on the task of zero-shot clinical acronym expansion across three datasets. The LMC significantly outperforms a diverse set of baselines at a fraction of the pre-training cost and learns clinically coherent representations. We demonstrate that not only is metadata itself very helpful for the task, but that the LMC inference algorithm provides an additional large benefit.
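
As a rough illustration of the zero-shot setup described in the abstract (not the authors' LMC implementation), the sketch below ranks candidate expansions for a clinical acronym by comparing each against a context vector that mixes local words with a metadata token such as the note section. The vocabulary, the random toy embeddings, and the 0.7/0.3 mixing weight are all assumptions made only for this example; a real system would use representations produced by a pretrained model such as the LMC.

# Illustrative sketch only: zero-shot acronym expansion as ranking
# candidate expansions by similarity to a context representation that
# combines local lexical context with metadata. Embeddings here are
# random toy vectors, not learned LMC parameters.
import numpy as np

rng = np.random.default_rng(0)
DIM = 50

# Hypothetical embedding table for a tiny vocabulary (toy values).
vocab = ["presents", "with", "acute", "chest", "pain", "hpi",
         "patient", "physical", "therapy", "prothrombin", "time"]
embed = {w: rng.normal(size=DIM) for w in vocab}

def rep(tokens):
    """Average the embeddings of known tokens into one vector."""
    vecs = [embed[t] for t in tokens if t in embed]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def expand_acronym(context_tokens, metadata_tokens, candidates):
    """Pick the candidate expansion closest to the context vector.

    The context vector mixes local lexical context with metadata
    (e.g. the section header of the note); the 0.7/0.3 weighting is
    an arbitrary choice for this toy example.
    """
    ctx = 0.7 * rep(context_tokens) + 0.3 * rep(metadata_tokens)
    scored = [(cosine(ctx, rep(c.split())), c) for c in candidates]
    return max(scored)[1]

# Toy usage: disambiguate "pt" in a History of Present Illness section.
context = ["presents", "with", "acute", "chest", "pain"]
metadata = ["hpi"]  # section-type metadata token
candidates = ["patient", "physical therapy", "prothrombin time"]
print(expand_acronym(context, metadata, candidates))

Because the expansion is chosen purely by scoring candidates against the context, no labeled examples of the acronym are needed at training time, which is what makes the setting zero-shot.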

Cite this Paper


BibTeX
@InProceedings{pmlr-v136-adams20a,
  title = {Zero-Shot Clinical Acronym Expansion via Latent Meaning Cells},
  author = {Adams, Griffin and Ketenci, Mert and Bhave, Shreyas and Perotte, Adler and Elhadad, No\'emie},
  booktitle = {Proceedings of the Machine Learning for Health NeurIPS Workshop},
  pages = {12--40},
  year = {2020},
  editor = {Alsentzer, Emily and McDermott, Matthew B. A. and Falck, Fabian and Sarkar, Suproteem K. and Roy, Subhrajit and Hyland, Stephanie L.},
  volume = {136},
  series = {Proceedings of Machine Learning Research},
  month = {11 Dec},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v136/adams20a/adams20a.pdf},
  url = {https://proceedings.mlr.press/v136/adams20a.html},
  abstract = {We introduce Latent Meaning Cells, a deep latent variable model which learns contextualized representations of words by combining local lexical context and metadata. Metadata can refer to granular context, such as section type, or to more global context, such as unique document ids. Reliance on metadata for contextualized representation learning is apropos in the clinical domain where text is semi-structured and expresses high variation in topics. We evaluate the LMC model on the task of zero-shot clinical acronym expansion across three datasets. The LMC significantly outperforms a diverse set of baselines at a fraction of the pre-training cost and learns clinically coherent representations. We demonstrate that not only is metadata itself very helpful for the task, but that the LMC inference algorithm provides an additional large benefit.}
}
Endnote
%0 Conference Paper
%T Zero-Shot Clinical Acronym Expansion via Latent Meaning Cells
%A Griffin Adams
%A Mert Ketenci
%A Shreyas Bhave
%A Adler Perotte
%A Noémie Elhadad
%B Proceedings of the Machine Learning for Health NeurIPS Workshop
%C Proceedings of Machine Learning Research
%D 2020
%E Emily Alsentzer
%E Matthew B. A. McDermott
%E Fabian Falck
%E Suproteem K. Sarkar
%E Subhrajit Roy
%E Stephanie L. Hyland
%F pmlr-v136-adams20a
%I PMLR
%P 12--40
%U https://proceedings.mlr.press/v136/adams20a.html
%V 136
%X We introduce Latent Meaning Cells, a deep latent variable model which learns contextualized representations of words by combining local lexical context and metadata. Metadata can refer to granular context, such as section type, or to more global context, such as unique document ids. Reliance on metadata for contextualized representation learning is apropos in the clinical domain where text is semi-structured and expresses high variation in topics. We evaluate the LMC model on the task of zero-shot clinical acronym expansion across three datasets. The LMC significantly outperforms a diverse set of baselines at a fraction of the pre-training cost and learns clinically coherent representations. We demonstrate that not only is metadata itself very helpful for the task, but that the LMC inference algorithm provides an additional large benefit.
APA
Adams, G., Ketenci, M., Bhave, S., Perotte, A. & Elhadad, N. (2020). Zero-Shot Clinical Acronym Expansion via Latent Meaning Cells. Proceedings of the Machine Learning for Health NeurIPS Workshop, in Proceedings of Machine Learning Research 136:12-40. Available from https://proceedings.mlr.press/v136/adams20a.html.
