From Tweets to Model-Based Causal Spans: Noise-Robust Transformers for Social Media Sentiment Analysis in the Age of LLMs

MILOUD MIHOUBI, MERIEM ZERKOUK, Belkacem CHIKHAOUI
Proceedings of the 39th Canadian Conference on Artificial Intelligence, PMLR 318:880-887, 2026.

Abstract

Social media text is short, noisy, and rapidly evolving. Transformer-based sentiment models like BERTweet are brittle under lexical noise and offer limited explainability. We propose the Noise-Robust Causal Transformer (NRCT), which augments BERTweet with a contrastive objective that aligns semantically equivalent but lexically perturbed tweets, and a causal attention head trained to highlight sparse token spans that drive the model’s prediction. On Sentiment140 and TweetEval-Sentiment, NRCT matches clean accuracy, improves macro-F1 under synthetic noise, and produces token rationales that are more faithful than standard attention (higher deletion/insertion AUC). NRCT offers a practical trade-off between accuracy, robustness, and model-based interpretability for social media sentiment analysis.
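
The abstract evaluates rationale faithfulness with deletion/insertion AUC but does not define the metric. As a minimal sketch of one common deletion-curve formulation (not necessarily the authors' exact protocol), tokens are masked in order of decreasing importance and the model's confidence is recorded after each step; the area under that curve is then normalized by the number of steps. Under this formulation a steeper drop, and so a smaller area, indicates a more faithful rationale; some papers instead report the drop in confidence, where higher is better. The names `deletion_curve`, `normalized_auc`, `toy_score`, and the `<mask>` token are hypothetical, and `toy_score` is a stand-in for a real classifier.

```python
from typing import Callable, List, Sequence


def deletion_curve(
    tokens: Sequence[str],
    importances: Sequence[float],
    score_fn: Callable[[List[str]], float],
    mask_token: str = "<mask>",
) -> List[float]:
    """Mask tokens from most to least important, scoring after each step."""
    order = sorted(range(len(tokens)), key=lambda i: importances[i], reverse=True)
    current = list(tokens)
    curve = [score_fn(current)]  # confidence on the unmodified input
    for i in order:
        current[i] = mask_token
        curve.append(score_fn(current))
    return curve


def normalized_auc(curve: Sequence[float]) -> float:
    """Trapezoidal area under the curve, divided by the number of steps."""
    if len(curve) < 2:
        return float(curve[0]) if curve else 0.0
    steps = len(curve) - 1
    area = sum((curve[k] + curve[k + 1]) / 2.0 for k in range(steps))
    return area / steps


def toy_score(tokens: List[str]) -> float:
    """Hypothetical stand-in for a classifier's positive-class probability."""
    positive = {"love", "great"}
    hits = sum(1 for t in tokens if t in positive)
    return min(1.0, 0.2 + 0.4 * hits)


tokens = ["i", "love", "this", "great", "phone"]
# Rationale that ranks the sentiment-bearing words first.
faithful = [0.0, 0.9, 0.1, 0.8, 0.05]
# Rationale that ranks them last.
unfaithful = [0.9, 0.0, 0.8, 0.1, 0.7]

auc_faithful = normalized_auc(deletion_curve(tokens, faithful, toy_score))
auc_unfaithful = normalized_auc(deletion_curve(tokens, unfaithful, toy_score))
print(f"faithful: {auc_faithful:.2f}, unfaithful: {auc_unfaithful:.2f}")
```

In this toy setting the rationale that ranks "love" and "great" first makes the confidence collapse after two deletions, so its deletion area is smaller than that of the rationale that ranks them last. An insertion curve can be built the same way by starting from a fully masked input and restoring tokens in importance order.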

Cite this Paper


BibTeX
@InProceedings{pmlr-v318-mihoubi26a,
  title = {From Tweets to Model-Based Causal Spans: Noise-Robust Transformers for Social Media Sentiment Analysis in the Age of LLMs},
  author = {MIHOUBI, MILOUD and ZERKOUK, MERIEM and CHIKHAOUI, Belkacem},
  booktitle = {Proceedings of the 39th Canadian Conference on Artificial Intelligence},
  pages = {880--887},
  year = {2026},
  editor = {Bouzar-Benlabiod, Lydia and Leung, Carson},
  volume = {318},
  series = {Proceedings of Machine Learning Research},
  month = {25--29 May},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v318/main/assets/mihoubi26a/mihoubi26a.pdf},
  url = {https://proceedings.mlr.press/v318/mihoubi26a.html},
  abstract = {Social media text is short, noisy, and rapidly evolving. Transformer-based sentiment models like BERTweet are brittle under lexical noise and offer limited explainability. We propose the Noise-Robust Causal Transformer (NRCT), which augments BERTweet with a contrastive objective that aligns semantically equivalent but lexically perturbed tweets, and a causal attention head trained to highlight sparse token spans that drive the model’s prediction. On Sentiment140 and TweetEval-Sentiment, NRCT matches clean accuracy, improves macro-F1 under synthetic noise, and produces token rationales that are more faithful than standard attention (higher deletion/insertion AUC). NRCT offers a practical trade-off between accuracy, robustness, and model-based interpretability for social media sentiment analysis.}
}
Endnote
%0 Conference Paper
%T From Tweets to Model-Based Causal Spans: Noise-Robust Transformers for Social Media Sentiment Analysis in the Age of LLMs
%A MILOUD MIHOUBI
%A MERIEM ZERKOUK
%A Belkacem CHIKHAOUI
%B Proceedings of the 39th Canadian Conference on Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2026
%E Lydia Bouzar-Benlabiod
%E Carson Leung
%F pmlr-v318-mihoubi26a
%I PMLR
%P 880--887
%U https://proceedings.mlr.press/v318/mihoubi26a.html
%V 318
%X Social media text is short, noisy, and rapidly evolving. Transformer-based sentiment models like BERTweet are brittle under lexical noise and offer limited explainability. We propose the Noise-Robust Causal Transformer (NRCT), which augments BERTweet with a contrastive objective that aligns semantically equivalent but lexically perturbed tweets, and a causal attention head trained to highlight sparse token spans that drive the model’s prediction. On Sentiment140 and TweetEval-Sentiment, NRCT matches clean accuracy, improves macro-F1 under synthetic noise, and produces token rationales that are more faithful than standard attention (higher deletion/insertion AUC). NRCT offers a practical trade-off between accuracy, robustness, and model-based interpretability for social media sentiment analysis.
APA
MIHOUBI, M., ZERKOUK, M. & CHIKHAOUI, B. (2026). From Tweets to Model-Based Causal Spans: Noise-Robust Transformers for Social Media Sentiment Analysis in the Age of LLMs. Proceedings of the 39th Canadian Conference on Artificial Intelligence, in Proceedings of Machine Learning Research 318:880-887. Available from https://proceedings.mlr.press/v318/mihoubi26a.html.
