RedditEM: Unveiling Diachronic Semantic Shifts in Social Network Discourse

Jiajun Zou, Sixing Wu, Jinshuai Yang, Minghu Jiang, Yongfeng Huang
Proceedings of the 16th Asian Conference on Machine Learning, PMLR 260:968-983, 2025.

Abstract

Humans employ words to convey abstract concepts. The evolution of lexical semantics holds significance not only in Natural Language Processing applications but also in the realm of social computing research. However, the scarcity of diachronic word representations persists due to the substantial computational demands, particularly evident in the absence of large-scale and enduring diachronic word embeddings for social network texts. Herein, we introduce RedditEM, a comprehensive collection of diachronic word representations derived from Reddit English comment texts, featuring one word embedding per month spanning from January 2010 to December 2021. To assess the diachronic semantic shifts of words, we employ cosine distance metrics and juxtapose the embeddings’ neighborhoods. Our experimental findings underscore the utility of RedditEM in detecting alterations in word meanings within social networks and advancing social computing endeavors. Researchers interested in accessing this resource are cordially invited to contact us without hesitation.
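The two measurements named in the abstract, cosine distance between a word's embeddings at different times and a comparison of its nearest neighbors, can be illustrated with a minimal sketch. Everything below is an assumption for illustration rather than the authors' implementation: the function names are hypothetical, the two-dimensional vectors are made-up toy data (not taken from RedditEM), and the sketch presumes the embeddings from the two periods already live in a comparable vector space. The neighborhood comparison is shown here as a Jaccard overlap, which is one plausible choice and not necessarily the one used in the paper.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def nearest_neighbors(word, embeddings, k):
    """Return the k words closest to `word` by cosine distance."""
    target = embeddings[word]
    ranked = sorted(
        (cosine_distance(target, vec), other)
        for other, vec in embeddings.items()
        if other != word
    )
    return [other for _, other in ranked[:k]]


def neighborhood_overlap(word, emb_a, emb_b, k):
    """Jaccard overlap of the word's k-nearest-neighbor sets in two periods."""
    a = set(nearest_neighbors(word, emb_a, k))
    b = set(nearest_neighbors(word, emb_b, k))
    return len(a & b) / len(a | b)


# Toy, hand-made vectors for two hypothetical time slices (NOT real data).
emb_early = {
    "corona": [1.0, 0.0],
    "beer": [0.9, 0.1],
    "lime": [0.8, 0.2],
    "virus": [0.0, 1.0],
}
emb_late = {
    "corona": [0.0, 1.0],
    "beer": [0.9, 0.1],
    "lime": [0.8, 0.2],
    "virus": [0.1, 0.9],
}

shift = cosine_distance(emb_early["corona"], emb_late["corona"])
early_nn = nearest_neighbors("corona", emb_early, k=2)
late_nn = nearest_neighbors("corona", emb_late, k=2)
overlap = neighborhood_overlap("corona", emb_early, emb_late, k=2)

print(shift, early_nn, late_nn, overlap)
```

In this toy setup a large cosine distance together with a small neighbor overlap would both point to a change in usage; with real monthly embeddings the same functions could be applied to any pair of months.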

Cite this Paper


BibTeX
@InProceedings{pmlr-v260-zou25a,
  title = {{RedditEM}: {U}nveiling Diachronic Semantic Shifts in Social Network Discourse},
  author = {Zou, Jiajun and Wu, Sixing and Yang, Jinshuai and Jiang, Minghu and Huang, Yongfeng},
  booktitle = {Proceedings of the 16th Asian Conference on Machine Learning},
  pages = {968--983},
  year = {2025},
  editor = {Nguyen, Vu and Lin, Hsuan-Tien},
  volume = {260},
  series = {Proceedings of Machine Learning Research},
  month = {05--08 Dec},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v260/main/assets/zou25a/zou25a.pdf},
  url = {https://proceedings.mlr.press/v260/zou25a.html},
  abstract = {Humans employ words to convey abstract concepts. The evolution of lexical semantics holds significance not only in Natural Language Processing applications but also in the realm of social computing research. However, the scarcity of diachronic word representations persists due to the substantial computational demands, particularly evident in the absence of large-scale and enduring diachronic word embeddings for social network texts. Herein, we introduce RedditEM, a comprehensive collection of diachronic word representations derived from Reddit English comment texts, featuring one word embedding per month spanning from January 2010 to December 2021. To assess the diachronic semantic shifts of words, we employ cosine distance metrics and juxtapose the embeddings’ neighborhoods. Our experimental findings underscore the utility of RedditEM in detecting alterations in word meanings within social networks and advancing social computing endeavors. Researchers interested in accessing this resource are cordially invited to contact us without hesitation.}
}
Endnote
%0 Conference Paper
%T RedditEM: Unveiling Diachronic Semantic Shifts in Social Network Discourse
%A Jiajun Zou
%A Sixing Wu
%A Jinshuai Yang
%A Minghu Jiang
%A Yongfeng Huang
%B Proceedings of the 16th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Vu Nguyen
%E Hsuan-Tien Lin
%F pmlr-v260-zou25a
%I PMLR
%P 968--983
%U https://proceedings.mlr.press/v260/zou25a.html
%V 260
%X Humans employ words to convey abstract concepts. The evolution of lexical semantics holds significance not only in Natural Language Processing applications but also in the realm of social computing research. However, the scarcity of diachronic word representations persists due to the substantial computational demands, particularly evident in the absence of large-scale and enduring diachronic word embeddings for social network texts. Herein, we introduce RedditEM, a comprehensive collection of diachronic word representations derived from Reddit English comment texts, featuring one word embedding per month spanning from January 2010 to December 2021. To assess the diachronic semantic shifts of words, we employ cosine distance metrics and juxtapose the embeddings’ neighborhoods. Our experimental findings underscore the utility of RedditEM in detecting alterations in word meanings within social networks and advancing social computing endeavors. Researchers interested in accessing this resource are cordially invited to contact us without hesitation.
APA
Zou, J., Wu, S., Yang, J., Jiang, M., & Huang, Y. (2025). RedditEM: Unveiling Diachronic Semantic Shifts in Social Network Discourse. Proceedings of the 16th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 260:968-983. Available from https://proceedings.mlr.press/v260/zou25a.html.
