TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference

Jack Min Ong, Matthew Di Ferrante, Aaron Pazdera, Ryan Garner, Sami Jaghouar, Manveer Basra, Max Ryabinin, Johannes Hagemann
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:47196-47211, 2025.

Abstract

Large language models (LLMs) have proven to be very capable, but access to frontier models currently relies on inference providers. This introduces trust challenges: how can we be sure that the provider is using the model configuration they claim? We propose TOPLOC, a novel method for verifiable inference that addresses this problem. TOPLOC leverages a compact locality-sensitive hashing mechanism for intermediate activations, which can detect unauthorized modifications to models, prompts, or precision with 100% accuracy, achieving no false positives or negatives in our empirical evaluations. Our approach is robust across diverse hardware configurations, GPU types, and algebraic reorderings, which allows for validation speeds significantly faster than the original inference. By introducing a polynomial encoding scheme, TOPLOC minimizes the memory overhead of the generated proofs by $1000\times$, requiring only 258 bytes of storage per 32 new tokens, compared to the 262 KB requirement of storing the token embeddings directly for Llama 3.1-8B-Instruct. Our method empowers users to verify LLM inference computations efficiently, fostering greater trust and transparency in open ecosystems and laying a foundation for decentralized, verifiable and trustless AI services.
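To make the storage comparison concrete, the quoted figures can be reproduced with a short calculation (a sketch assuming the hidden states are stored at 2 bytes per value, e.g. in bfloat16; the hidden dimension of 4096 is the standard Llama 3.1-8B configuration). Storing the token embeddings for 32 tokens directly requires

$$32 \times 4096 \times 2\ \text{bytes} = 262{,}144\ \text{bytes} \approx 262\ \text{KB},$$

whereas the TOPLOC proof for the same 32 tokens occupies 258 bytes, a reduction of $262{,}144 / 258 \approx 1016\times$, consistent with the roughly $1000\times$ savings stated above.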

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-ong25a,
  title     = {{TOPLOC}: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference},
  author    = {Ong, Jack Min and Ferrante, Matthew Di and Pazdera, Aaron and Garner, Ryan and Jaghouar, Sami and Basra, Manveer and Ryabinin, Max and Hagemann, Johannes},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {47196--47211},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/ong25a/ong25a.pdf},
  url       = {https://proceedings.mlr.press/v267/ong25a.html}
}
Endnote
%0 Conference Paper
%T TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference
%A Jack Min Ong
%A Matthew Di Ferrante
%A Aaron Pazdera
%A Ryan Garner
%A Sami Jaghouar
%A Manveer Basra
%A Max Ryabinin
%A Johannes Hagemann
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-ong25a
%I PMLR
%P 47196--47211
%U https://proceedings.mlr.press/v267/ong25a.html
%V 267
APA
Ong, J.M., Ferrante, M.D., Pazdera, A., Garner, R., Jaghouar, S., Basra, M., Ryabinin, M. & Hagemann, J. (2025). TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:47196-47211. Available from https://proceedings.mlr.press/v267/ong25a.html.