Research on Part of Speech Enhanced Text Classification Based on Rotation Position Encoding and Hierarchical Features fusion Text Classification Text Classification

Mingjian Li, Huaiyang Li
Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, PMLR 278:67-76, 2025.

Abstract

To address the problems of missing contextual information, incomplete feature representation, and difficult semantic parsing in text classification, this article proposes a text classification framework that combines feature fusion and vocabulary enhancement. First, WoBERT is used to encode the text and obtain dynamic word vectors, integrating vocabulary information into characters to enhance boundary interaction. Second, rotation position encoding is introduced into the character vectors to capture relative distance information between characters and improve the feature embedding. Third, to strengthen feature capture, the D-mixup structure is introduced to cross-fuse the relative distance and CLS information and to progressively extract a deep global representation of the text. Finally, the Multi-Sample Dropout method is used to compute the loss over the multi-level mixed global representations, improving the learning ability of the model. In experiments on the THUCnews and SMP2020 datasets, the proposed model achieved F1 values of 94.72% and 78.38%, respectively, outperforming current research methods. These results indicate that the proposed model effectively improves generalization and robustness, enhances text classification performance, and is easy to implement, providing reference ideas for future research.
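The abstract does not spell out how the rotation position encoding is computed. As a rough illustration only, the sketch below assumes the commonly used rotary formulation, in which each consecutive pair of vector components is rotated by an angle proportional to the token position. The function names `rotate`, `dot`, and `rope_score`, the base of 10000, and the toy vectors are assumptions made for this example and are not taken from the paper.

```python
import math


def rotate(vec, position, base=10000.0):
    """Rotate an even-length vector according to its position.

    Assumed formulation: the pair (vec[i], vec[i + 1]) for even i is rotated
    by the angle position * base ** (-i / d), where d = len(vec).
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2-D rotation of the pair (x, y) by theta.
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def rope_score(q, k, q_pos, k_pos):
    """Similarity score between a query at q_pos and a key at k_pos."""
    return dot(rotate(q, q_pos), rotate(k, k_pos))


if __name__ == "__main__":
    # Hypothetical toy vectors, for illustration only.
    q = [0.3, -1.2, 0.8, 0.5]
    k = [1.0, 0.4, -0.7, 0.9]
    # Both pairs of positions are 3 apart, so the two scores should agree
    # up to floating-point error.
    print(rope_score(q, k, 2, 5))
    print(rope_score(q, k, 10, 13))
```

Under this assumed formulation, the score between two rotated vectors depends only on the offset between their positions, which is one way the "relative distance information between characters" mentioned in the abstract can enter the model.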

Cite this Paper


BibTeX
@InProceedings{pmlr-v278-li25d,
  title = {Research on Part of Speech Enhanced Text Classification Based on Rotation Position Encoding and Hierarchical Features fusion Text Classification Text Classification},
  author = {Li, Mingjian and Li, Huaiyang},
  booktitle = {Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing},
  pages = {67--76},
  year = {2025},
  editor = {Zeng, Nianyin and Pachori, Ram Bilas and Wang, Dongshu},
  volume = {278},
  series = {Proceedings of Machine Learning Research},
  month = {25--27 Apr},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v278/main/assets/li25d/li25d.pdf},
  url = {https://proceedings.mlr.press/v278/li25d.html},
  abstract = {In response to the current problems of missing contextual information, incomplete feature representation, and difficulty in semantic parsing in text classification. This article proposes a text classification framework that combines feature fusion and vocabulary enhancement. Firstly, use WoBERT to encode the text, collect dynamic word vectors, and effectively integrate vocabulary into characters to enhance boundary interaction; Secondly, rotation position encoding is introduced into the character vector to obtain relative distance information between characters and improve feature embedding; Subsequently, to enhance the feature capture capability, the D-mixup structure was introduced to cross fuse the relative distance and CLS information, and continuously extract the global representation of the text in depth; Finally, the Multi Sample Dropout method is used to calculate the loss of multi-level mixed global representations, improving the learning ability of the model. In the experiments on the THUCnews and SMP2020 datasets, the F1 values of the proposed model were 94.72% and 78.38%, respectively, indicating better performance than the current research methods. This indicates that the model proposed in this article can effectively improve generalization and robustness, enhance text classification performance, and is easy to implement, providing reference ideas for future research.}
}
Endnote
%0 Conference Paper
%T Research on Part of Speech Enhanced Text Classification Based on Rotation Position Encoding and Hierarchical Features fusion Text Classification Text Classification
%A Mingjian Li
%A Huaiyang Li
%B Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing
%C Proceedings of Machine Learning Research
%D 2025
%E Nianyin Zeng
%E Ram Bilas Pachori
%E Dongshu Wang
%F pmlr-v278-li25d
%I PMLR
%P 67--76
%U https://proceedings.mlr.press/v278/li25d.html
%V 278
%X In response to the current problems of missing contextual information, incomplete feature representation, and difficulty in semantic parsing in text classification. This article proposes a text classification framework that combines feature fusion and vocabulary enhancement. Firstly, use WoBERT to encode the text, collect dynamic word vectors, and effectively integrate vocabulary into characters to enhance boundary interaction; Secondly, rotation position encoding is introduced into the character vector to obtain relative distance information between characters and improve feature embedding; Subsequently, to enhance the feature capture capability, the D-mixup structure was introduced to cross fuse the relative distance and CLS information, and continuously extract the global representation of the text in depth; Finally, the Multi Sample Dropout method is used to calculate the loss of multi-level mixed global representations, improving the learning ability of the model. In the experiments on the THUCnews and SMP2020 datasets, the F1 values of the proposed model were 94.72% and 78.38%, respectively, indicating better performance than the current research methods. This indicates that the model proposed in this article can effectively improve generalization and robustness, enhance text classification performance, and is easy to implement, providing reference ideas for future research.
APA
Li, M. & Li, H. (2025). Research on Part of Speech Enhanced Text Classification Based on Rotation Position Encoding and Hierarchical Features fusion Text Classification Text Classification. Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, in Proceedings of Machine Learning Research 278:67-76. Available from https://proceedings.mlr.press/v278/li25d.html.
