IBCircuit: Towards Holistic Circuit Discovery with Information Bottleneck

Tian Bian, Yifan Niu, Chaohao Yuan, Chengzhi Piao, Bingzhe Wu, Long-Kai Huang, Yu Rong, Tingyang Xu, Hong Cheng, Jia Li
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:4289-4302, 2025.

Abstract

Circuit discovery has recently attracted attention as a potential research direction to explain the non-trivial behaviors of language models. It aims to find the computational subgraphs, also known as circuits, within the model that are responsible for solving specific tasks. However, most existing studies overlook the holistic nature of these circuits and require designing specific corrupted activations for different tasks, which is inaccurate and inefficient. In this work, we propose an end-to-end approach based on the principle of Information Bottleneck, called IBCircuit, to holistically identify informative circuits. In contrast to traditional causal interventions, IBCircuit is an optimization framework for holistic circuit discovery and can be applied to any given task without tediously corrupted activation design. In both the Indirect Object Identification (IOI) and Greater-Than tasks, IBCircuit identifies more faithful and minimal circuits in terms of critical node components and edge components compared to recent related work.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-bian25a,
  title = {{IBC}ircuit: Towards Holistic Circuit Discovery with Information Bottleneck},
  author = {Bian, Tian and Niu, Yifan and Yuan, Chaohao and Piao, Chengzhi and Wu, Bingzhe and Huang, Long-Kai and Rong, Yu and Xu, Tingyang and Cheng, Hong and Li, Jia},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {4289--4302},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/bian25a/bian25a.pdf},
  url = {https://proceedings.mlr.press/v267/bian25a.html},
  abstract = {Circuit discovery has recently attracted attention as a potential research direction to explain the non-trivial behaviors of language models. It aims to find the computational subgraphs, also known as circuits, within the model that are responsible for solving specific tasks. However, most existing studies overlook the holistic nature of these circuits and require designing specific corrupted activations for different tasks, which is inaccurate and inefficient. In this work, we propose an end-to-end approach based on the principle of Information Bottleneck, called IBCircuit, to holistically identify informative circuits. In contrast to traditional causal interventions, IBCircuit is an optimization framework for holistic circuit discovery and can be applied to any given task without tediously corrupted activation design. In both the Indirect Object Identification (IOI) and Greater-Than tasks, IBCircuit identifies more faithful and minimal circuits in terms of critical node components and edge components compared to recent related work.}
}
Endnote
%0 Conference Paper
%T IBCircuit: Towards Holistic Circuit Discovery with Information Bottleneck
%A Tian Bian
%A Yifan Niu
%A Chaohao Yuan
%A Chengzhi Piao
%A Bingzhe Wu
%A Long-Kai Huang
%A Yu Rong
%A Tingyang Xu
%A Hong Cheng
%A Jia Li
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-bian25a
%I PMLR
%P 4289--4302
%U https://proceedings.mlr.press/v267/bian25a.html
%V 267
%X Circuit discovery has recently attracted attention as a potential research direction to explain the non-trivial behaviors of language models. It aims to find the computational subgraphs, also known as circuits, within the model that are responsible for solving specific tasks. However, most existing studies overlook the holistic nature of these circuits and require designing specific corrupted activations for different tasks, which is inaccurate and inefficient. In this work, we propose an end-to-end approach based on the principle of Information Bottleneck, called IBCircuit, to holistically identify informative circuits. In contrast to traditional causal interventions, IBCircuit is an optimization framework for holistic circuit discovery and can be applied to any given task without tediously corrupted activation design. In both the Indirect Object Identification (IOI) and Greater-Than tasks, IBCircuit identifies more faithful and minimal circuits in terms of critical node components and edge components compared to recent related work.
APA
Bian, T., Niu, Y., Yuan, C., Piao, C., Wu, B., Huang, L.-K., Rong, Y., Xu, T., Cheng, H., & Li, J. (2025). IBCircuit: Towards Holistic Circuit Discovery with Information Bottleneck. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:4289-4302. Available from https://proceedings.mlr.press/v267/bian25a.html.

Related Material