RedditEM: Unveiling Diachronic Semantic Shifts in Social Network Discourse
Proceedings of the 16th Asian Conference on Machine Learning, PMLR 260:968-983, 2025.
Abstract
Humans use words to convey abstract concepts. The evolution of lexical semantics matters not only for Natural Language Processing applications but also for social computing research. However, diachronic word representations remain scarce because of their substantial computational demands; in particular, large-scale, long-spanning diachronic word embeddings for social network texts are lacking. We introduce RedditEM, a collection of diachronic word representations derived from English Reddit comment texts, with one word embedding per month from January 2010 to December 2021. To assess the diachronic semantic shifts of words, we use cosine distance and compare the embeddings’ neighborhoods. Our experiments show that RedditEM is useful for detecting changes in word meaning within social networks and for supporting social computing research. Researchers interested in accessing this resource are invited to contact us.
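The two shift measures named in the abstract, cosine distance and neighborhood comparison, can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' implementation: the function names are hypothetical, the vectors are made-up toy values, Jaccard overlap is one possible way to compare neighborhoods, and the direct cosine distance between two months presumes the monthly embeddings have already been aligned into a shared space.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def nearest_neighbors(word, embedding, k):
    """Return the set of the k words closest to `word` within one month's embedding."""
    target = embedding[word]
    others = [w for w in embedding if w != word]
    others.sort(key=lambda w: cosine_distance(target, embedding[w]))
    return set(others[:k])


def neighborhood_overlap(word, emb_a, emb_b, k):
    """Jaccard overlap of the word's k-nearest-neighbor sets in two months."""
    neighbors_a = nearest_neighbors(word, emb_a, k)
    neighbors_b = nearest_neighbors(word, emb_b, k)
    return len(neighbors_a & neighbors_b) / len(neighbors_a | neighbors_b)


# Toy 2-d vectors for two hypothetical months (made-up values, not RedditEM data).
# Assumption: both months share one coordinate system (already aligned).
month_a = {"target": [1.0, 0.0], "n1": [0.9, 0.1], "n2": [0.0, 1.0], "n3": [0.8, 0.2]}
month_b = {"target": [0.0, 1.0], "n1": [0.9, 0.1], "n2": [0.1, 0.9], "n3": [0.8, 0.2]}

# Larger distance suggests a larger shift between the two months.
shift = cosine_distance(month_a["target"], month_b["target"])

# Lower overlap suggests the word's neighborhood has changed.
overlap = neighborhood_overlap("target", month_a, month_b, k=2)

print(f"cosine distance: {shift:.3f}")
print(f"neighborhood overlap: {overlap:.3f}")
```

The neighborhood comparison only uses distances within each month, so it does not depend on the alignment assumption; the cross-month cosine distance does.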