ToxiSight: Leveraging Moderator Expertise Through Behavioral Measurement in Gaming Toxicity Annotation

Zachary Yang; Vicki Chen; Domenico Tullo; Reihaneh Rabbany

Proceedings of the 39th Canadian Conference on Artificial Intelligence, PMLR 318:650-661, 2026.

Abstract

Content moderation systems commonly treat human annotators as interchangeable label sources, resolving disagreements through majority voting or expert arbitration. We present ToxiSight, an annotation platform that reframes this assumption: rather than extracting consensus, the system supports moderator reasoning by treating hesitation, revision, and disagreement as signals revealing where content is genuinely ambiguous and where taxonomic guidelines fail. ToxiSight integrates gaming-specific contextual widgets with behavioral telemetry, capturing the cognitive processes underlying toxicity validation decisions. Through deployment with 10 professional moderators across 60,000 lines of gaming chat, we demonstrate that behavioral patterns expose systematic category failures invisible to traditional inter-annotator metrics. The Controversial category shows 72% revision rates with fast processing times, indicating immediate recognition of definitional breakdown, while Threats (Life-Threatening) exhibits 75% revisions with slow processing, signaling genuine interpretive complexity. Completion rates improved from 60% to 95%, and moderators reported reduced decision stress when permitted to express uncertainty. This case study demonstrates that trustworthy toxicity detection requires annotation systems designed around the irreducible complexity of human judgment, not against it.

Cite this Paper

BibTeX

@InProceedings{pmlr-v318-yang26a,
  title = 	 {ToxiSight: Leveraging Moderator Expertise Through Behavioral Measurement in Gaming Toxicity Annotation},
  author =       {Yang, Zachary and Chen, Vicki and Tullo, Domenico and Rabbany, Reihaneh},
  booktitle = 	 {Proceedings of the 39th Canadian Conference on Artificial Intelligence},
  pages = 	 {650--661},
  year = 	 {2026},
  editor = 	 {Bouzar-Benlabiod, Lydia and Leung, Carson},
  volume = 	 {318},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--29 May},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v318/main/assets/yang26a/yang26a.pdf},
  url = 	 {https://proceedings.mlr.press/v318/yang26a.html},
  abstract = 	 {Content moderation systems commonly treat human annotators as interchangeable label sources, resolving disagreements through majority voting or expert arbitration. We present ToxiSight, an annotation platform that reframes this assumption: rather than extracting consensus, the system supports moderator reasoning by treating hesitation, revision, and disagreement as signals revealing where content is genuinely ambiguous and where taxonomic guidelines fail. ToxiSight integrates gaming-specific contextual widgets with behavioral telemetry, capturing the cognitive processes underlying toxicity validation decisions. Through deployment with 10 professional moderators across 60,000 lines of gaming chat, we demonstrate that behavioral patterns expose systematic category failures invisible to traditional inter-annotator metrics. The Controversial category shows 72\% revision rates with fast processing times, indicating immediate recognition of definitional breakdown, while Threats (Life-Threatening) exhibits 75\% revisions with slow processing, signaling genuine interpretive complexity. Completion rates improved from 60\% to 95\%, and moderators reported reduced decision stress when permitted to express uncertainty. This case study demonstrates that trustworthy toxicity detection requires annotation systems designed around the irreducible complexity of human judgment, not against it.}
}

Endnote

%0 Conference Paper
%T ToxiSight: Leveraging Moderator Expertise Through Behavioral Measurement in Gaming Toxicity Annotation
%A Zachary Yang
%A Vicki Chen
%A Domenico Tullo
%A Reihaneh Rabbany
%B Proceedings of the 39th Canadian Conference on Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2026
%E Lydia Bouzar-Benlabiod
%E Carson Leung
%F pmlr-v318-yang26a
%I PMLR
%P 650--661
%U https://proceedings.mlr.press/v318/yang26a.html
%V 318
%X Content moderation systems commonly treat human annotators as interchangeable label sources, resolving disagreements through majority voting or expert arbitration. We present ToxiSight, an annotation platform that reframes this assumption: rather than extracting consensus, the system supports moderator reasoning by treating hesitation, revision, and disagreement as signals revealing where content is genuinely ambiguous and where taxonomic guidelines fail. ToxiSight integrates gaming-specific contextual widgets with behavioral telemetry, capturing the cognitive processes underlying toxicity validation decisions. Through deployment with 10 professional moderators across 60,000 lines of gaming chat, we demonstrate that behavioral patterns expose systematic category failures invisible to traditional inter-annotator metrics. The Controversial category shows 72% revision rates with fast processing times, indicating immediate recognition of definitional breakdown, while Threats (Life-Threatening) exhibits 75% revisions with slow processing, signaling genuine interpretive complexity. Completion rates improved from 60% to 95%, and moderators reported reduced decision stress when permitted to express uncertainty. This case study demonstrates that trustworthy toxicity detection requires annotation systems designed around the irreducible complexity of human judgment, not against it.

APA

Yang, Z., Chen, V., Tullo, D. & Rabbany, R. (2026). ToxiSight: Leveraging Moderator Expertise Through Behavioral Measurement in Gaming Toxicity Annotation. Proceedings of the 39th Canadian Conference on Artificial Intelligence, in Proceedings of Machine Learning Research 318:650-661. Available from https://proceedings.mlr.press/v318/yang26a.html.
