Label-Perceptive Adversarial Domain Adaptation for Named Entity Recognition in Traditional Chinese Medicine: Dataset and Approach

Yu Tong
Proceedings of the 17th Asian Conference on Machine Learning, PMLR 304:1118-1133, 2025.

Abstract

Named Entity Recognition (NER) is a crucial task in Traditional Chinese Medicine (TCM). However, the scarcity of NER datasets in TCM significantly hampers model performance in this domain. Domain adaptation is a promising approach to this low-resource issue. Current domain adaptation methods typically leverage large amounts of labeled data from a source domain to bridge the gap between the source and target domains, making the features of the generated target domain data as similar as possible to those of the source domain and thereby enhancing model performance in the target domain. However, existing methods primarily focus on aligning textual features and neglect the importance of label information. In the NER task, labels not only indicate categories but also carry important categorical information. Therefore, this paper proposes a Label-Perceptive Adversarial Domain Adaptation (LPADA) method that integrates label information with textual features, providing additional contextual information for the domain adaptation process and thus enhancing the model’s performance in the TCM domain. Furthermore, we annotate medical case records to construct a dataset, TCMNER2024, and establish a baseline. The TCMNER2024 dataset is available at https://github.com/TCMNER/TCMNER2024. The evaluation demonstrates that our approach significantly outperforms existing methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v304-tong25a,
  title = {Label-Perceptive Adversarial Domain Adaptation for Named Entity Recognition in Traditional Chinese Medicine: Dataset and Approach},
  author = {Tong, Yu},
  booktitle = {Proceedings of the 17th Asian Conference on Machine Learning},
  pages = {1118--1133},
  year = {2025},
  editor = {Lee, Hung-yi and Liu, Tongliang},
  volume = {304},
  series = {Proceedings of Machine Learning Research},
  month = {09--12 Dec},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v304/main/assets/tong25a/tong25a.pdf},
  url = {https://proceedings.mlr.press/v304/tong25a.html},
  abstract = {In the field of Traditional Chinese Medicine (TCM), Named Entity Recognition (NER) is a crucial task. However, the scarcity of NER datasets in TCM significantly hampers the performance of models in this domain. A promising approach to addressing this low-resource issue is through domain adaptation techniques. Current domain adaptation methods typically leverage large amounts of labeled data from a source domain to bridge the gap between the source and target domains, making the features of the generated target domain data as similar as possible to those of the source domain, thereby enhancing model performance in the target domain. However, existing methods primarily focus on aligning textual features and neglect the importance of label information. In the NER task, labels not only indicate categories but also carry important categorical information. Therefore, this paper proposes a Label-Perceptive Adversarial Domain Adaptation (LPADA) method that integrates label information with textual features, providing additional contextual information for the domain adaptation process, thus enhancing the model’s performance in the TCM domain. Furthermore, we annotate medical case records to construct a dataset TCMNER2024 and establish a baseline. TCMNER2024 dataset can be accessed via https://github.com/TCMNER/TCMNER2024. The evaluation demonstrates that our approach significantly outperforms existing methods.}
}
Endnote
%0 Conference Paper
%T Label-Perceptive Adversarial Domain Adaptation for Named Entity Recognition in Traditional Chinese Medicine: Dataset and Approach
%A Yu Tong
%B Proceedings of the 17th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Hung-yi Lee
%E Tongliang Liu
%F pmlr-v304-tong25a
%I PMLR
%P 1118--1133
%U https://proceedings.mlr.press/v304/tong25a.html
%V 304
%X In the field of Traditional Chinese Medicine (TCM), Named Entity Recognition (NER) is a crucial task. However, the scarcity of NER datasets in TCM significantly hampers the performance of models in this domain. A promising approach to addressing this low-resource issue is through domain adaptation techniques. Current domain adaptation methods typically leverage large amounts of labeled data from a source domain to bridge the gap between the source and target domains, making the features of the generated target domain data as similar as possible to those of the source domain, thereby enhancing model performance in the target domain. However, existing methods primarily focus on aligning textual features and neglect the importance of label information. In the NER task, labels not only indicate categories but also carry important categorical information. Therefore, this paper proposes a Label-Perceptive Adversarial Domain Adaptation (LPADA) method that integrates label information with textual features, providing additional contextual information for the domain adaptation process, thus enhancing the model’s performance in the TCM domain. Furthermore, we annotate medical case records to construct a dataset TCMNER2024 and establish a baseline. TCMNER2024 dataset can be accessed via https://github.com/TCMNER/TCMNER2024. The evaluation demonstrates that our approach significantly outperforms existing methods.
APA
Tong, Y. (2025). Label-Perceptive Adversarial Domain Adaptation for Named Entity Recognition in Traditional Chinese Medicine: Dataset and Approach. Proceedings of the 17th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 304:1118-1133. Available from https://proceedings.mlr.press/v304/tong25a.html.
