From Tweets to Model-Based Causal Spans: Noise-Robust Transformers for Social Media Sentiment Analysis in the Age of LLMs

MILOUD MIHOUBI; MERIEM ZERKOUK; Belkacem CHIKHAOUI

Proceedings of the 39th Canadian Conference on Artificial Intelligence, PMLR 318:880-887, 2026.

Abstract

Social media text is short, noisy, and rapidly evolving. Transformer-based sentiment models like BERTweet are brittle under lexical noise and offer limited explainability. We propose the Noise-Robust Causal Transformer (NRCT), which augments BERTweet with a contrastive objective that aligns semantically equivalent but lexically perturbed tweets, and a causal attention head trained to highlight sparse token spans that drive the model’s prediction. On Sentiment140 and TweetEval-Sentiment, NRCT matches clean accuracy, improves macro-F1 under synthetic noise, and produces token rationales that are more faithful than standard attention (higher deletion/insertion AUC). NRCT offers a practical trade-off between accuracy, robustness, and model-based interpretability for social media sentiment analysis.

Cite this Paper

BibTeX

@InProceedings{pmlr-v318-mihoubi26a,
  title = 	 {From Tweets to Model-Based Causal Spans: Noise-Robust Transformers for Social Media Sentiment Analysis in the Age of LLMs},
  author =       {MIHOUBI, MILOUD and ZERKOUK, MERIEM and CHIKHAOUI, Belkacem},
  booktitle = 	 {Proceedings of the 39th Canadian Conference on Artificial Intelligence},
  pages = 	 {880--887},
  year = 	 {2026},
  editor = 	 {Bouzar-Benlabiod, Lydia and Leung, Carson},
  volume = 	 {318},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--29 May},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v318/main/assets/mihoubi26a/mihoubi26a.pdf},
  url = 	 {https://proceedings.mlr.press/v318/mihoubi26a.html},
  abstract = 	 {Social media text is short, noisy, and rapidly evolving. Transformer-based sentiment models like BERTweet are brittle under lexical noise and offer limited explainability. We propose the Noise-Robust Causal Transformer (NRCT), which augments BERTweet with a contrastive objective that aligns semantically equivalent but lexically perturbed tweets, and a causal attention head trained to highlight sparse token spans that drive the model’s prediction. On Sentiment140 and TweetEval-Sentiment, NRCT matches clean accuracy, improves macro-F1 under synthetic noise, and produces token rationales that are more faithful than standard attention (higher deletion/insertion AUC). NRCT offers a practical trade-off between accuracy, robustness, and model-based interpretability for social media sentiment analysis.}
}

Endnote

%0 Conference Paper
%T From Tweets to Model-Based Causal Spans: Noise-Robust Transformers for Social Media Sentiment Analysis in the Age of LLMs
%A MILOUD MIHOUBI
%A MERIEM ZERKOUK
%A Belkacem CHIKHAOUI
%B Proceedings of the 39th Canadian Conference on Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2026
%E Lydia Bouzar-Benlabiod
%E Carson Leung
%F pmlr-v318-mihoubi26a
%I PMLR
%P 880--887
%U https://proceedings.mlr.press/v318/mihoubi26a.html
%V 318
%X Social media text is short, noisy, and rapidly evolving. Transformer-based sentiment models like BERTweet are brittle under lexical noise and offer limited explainability. We propose the Noise-Robust Causal Transformer (NRCT), which augments BERTweet with a contrastive objective that aligns semantically equivalent but lexically perturbed tweets, and a causal attention head trained to highlight sparse token spans that drive the model’s prediction. On Sentiment140 and TweetEval-Sentiment, NRCT matches clean accuracy, improves macro-F1 under synthetic noise, and produces token rationales that are more faithful than standard attention (higher deletion/insertion AUC). NRCT offers a practical trade-off between accuracy, robustness, and model-based interpretability for social media sentiment analysis.

APA

MIHOUBI, M., ZERKOUK, M. & CHIKHAOUI, B. (2026). From Tweets to Model-Based Causal Spans: Noise-Robust Transformers for Social Media Sentiment Analysis in the Age of LLMs. Proceedings of the 39th Canadian Conference on Artificial Intelligence, in Proceedings of Machine Learning Research 318:880-887. Available from https://proceedings.mlr.press/v318/mihoubi26a.html.
