Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

Abhimanyu Hans; Avi Schwarzschild; Valeriia Cherepanova; Hamid Kazemi; Aniruddha Saha; Micah Goldblum; Jonas Geiping; Tom Goldstein

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:17519-17537, 2024.

Abstract

Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors. However, we find that a score based on contrasting two closely related language models is highly accurate at separating human-generated and machine-generated text. Based on this mechanism, we propose a novel LLM detector that only requires simple calculations using a pair of pre-trained LLMs. The method, called Binoculars, achieves state-of-the-art accuracy without any training data. It is capable of spotting machine text from a range of modern LLMs without any model-specific modifications. We comprehensively evaluate Binoculars on a number of text sources and in varied situations. Over a wide range of document types, Binoculars detects over 90% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01%, despite not being trained on any ChatGPT data. Code available at https://github.com/ahans30/Binoculars.

Cite this Paper

BibTeX


@InProceedings{pmlr-v235-hans24a,
  title     = {Spotting {LLM}s With Binoculars: Zero-Shot Detection of Machine-Generated Text},
  author    = {Hans, Abhimanyu and Schwarzschild, Avi and Cherepanova, Valeriia and Kazemi, Hamid and Saha, Aniruddha and Goldblum, Micah and Geiping, Jonas and Goldstein, Tom},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {17519--17537},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/hans24a/hans24a.pdf},
  url       = {https://proceedings.mlr.press/v235/hans24a.html},
  abstract  = {Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors. However, we find that a score based on contrasting two closely related language models is highly accurate at separating human-generated and machine-generated text. Based on this mechanism, we propose a novel LLM detector that only requires simple calculations using a pair of pre-trained LLMs. The method, called Binoculars, achieves state-of-the-art accuracy without any training data. It is capable of spotting machine text from a range of modern LLMs without any model-specific modifications. We comprehensively evaluate Binoculars on a number of text sources and in varied situations. Over a wide range of document types, Binoculars detects over 90\% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01\%, despite not being trained on any ChatGPT data. Code available at https://github.com/ahans30/Binoculars.}
}

Endnote

%0 Conference Paper
%T Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text
%A Abhimanyu Hans
%A Avi Schwarzschild
%A Valeriia Cherepanova
%A Hamid Kazemi
%A Aniruddha Saha
%A Micah Goldblum
%A Jonas Geiping
%A Tom Goldstein
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-hans24a
%I PMLR
%P 17519--17537
%U https://proceedings.mlr.press/v235/hans24a.html
%V 235
%X Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors. However, we find that a score based on contrasting two closely related language models is highly accurate at separating human-generated and machine-generated text. Based on this mechanism, we propose a novel LLM detector that only requires simple calculations using a pair of pre-trained LLMs. The method, called Binoculars, achieves state-of-the-art accuracy without any training data. It is capable of spotting machine text from a range of modern LLMs without any model-specific modifications. We comprehensively evaluate Binoculars on a number of text sources and in varied situations. Over a wide range of document types, Binoculars detects over 90% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01%, despite not being trained on any ChatGPT data. Code available at https://github.com/ahans30/Binoculars.

APA


Hans, A., Schwarzschild, A., Cherepanova, V., Kazemi, H., Saha, A., Goldblum, M., Geiping, J., & Goldstein, T. (2024). Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:17519-17537. Available from https://proceedings.mlr.press/v235/hans24a.html.
