The Missing Alignment Link of In-context Learning on Sequences

Harshvardhan Agarwal, Sunita Sarawagi
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:550-565, 2025.

Abstract

Large language models (LLMs) have demonstrated the capability to perform in-context learning (ICL) for completely unseen tasks in classification or language completion. Sequence to sequence (seq2seq) is another popular task category with several applications seeking quick adaptation with ICL. We present a systematic analysis of the ICL capability of LLMs on Seq2Seq tasks using a formal structured language-pair. Our study reveals a critical limitation: except for very short input sequences, ICL fails to achieve consistent learning across all output positions. This exposes a fundamental weakness of modern LLMs — their inability to effectively uncover the alignment between input and output sequences. Consequently, this limitation results in incomplete induction heads, which are essential for in-context learning of new discrete mappings. To address these limitations, we propose ICA-Tune, a method for focused fine-tuning of an LLM using in-context examples. We present a mechanistic evaluation with two accuracy probes to show how input-output alignment emerges in middle layers of an LLM without direct supervision. This alignment leads to an abrupt jump in the completeness of the induction heads in higher layers. We show that, compared to standard fine-tuning, ICA-Tune enables more sample efficient learning and better generalization to OOD instances.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-agarwal25d, title = {The Missing Alignment Link of In-context Learning on Sequences}, author = {Agarwal, Harshvardhan and Sarawagi, Sunita}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {550--565}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/agarwal25d/agarwal25d.pdf}, url = {https://proceedings.mlr.press/v267/agarwal25d.html}, abstract = {Large language models (LLMs) have demonstrated the capability to perform in-context learning (ICL) for completely unseen tasks in classification or language completion. Sequence to sequence (seq2seq) is another popular task category with several applications seeking quick adaptation with ICL. We present a systematic analysis of the ICL capability of LLMs on Seq2Seq tasks using a formal structured language-pair. Our study reveals a critical limitation: except for very short input sequences, ICL fails to achieve consistent learning across all output positions. This exposes a fundamental weakness of modern LLMs — their inability to effectively uncover the alignment between input and output sequences. Consequently, this limitation results in incomplete induction heads, which are essential for in-context learning of new discrete mappings. To address these limitations, we propose ICA-Tune, a method for focused fine-tuning of an LLM using in-context examples. We present a mechanistic evaluation with two accuracy probes to show how input-output alignment emerges in middle layers of an LLM without direct supervision. This alignment leads to an abrupt jump in the completeness of the induction heads in higher layers. We show that, compared to standard fine-tuning, ICA-Tune enables more sample efficient learning and better generalization to OOD instances.} }
Endnote
%0 Conference Paper %T The Missing Alignment Link of In-context Learning on Sequences %A Harshvardhan Agarwal %A Sunita Sarawagi %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-agarwal25d %I PMLR %P 550--565 %U https://proceedings.mlr.press/v267/agarwal25d.html %V 267 %X Large language models (LLMs) have demonstrated the capability to perform in-context learning (ICL) for completely unseen tasks in classification or language completion. Sequence to sequence (seq2seq) is another popular task category with several applications seeking quick adaptation with ICL. We present a systematic analysis of the ICL capability of LLMs on Seq2Seq tasks using a formal structured language-pair. Our study reveals a critical limitation: except for very short input sequences, ICL fails to achieve consistent learning across all output positions. This exposes a fundamental weakness of modern LLMs — their inability to effectively uncover the alignment between input and output sequences. Consequently, this limitation results in incomplete induction heads, which are essential for in-context learning of new discrete mappings. To address these limitations, we propose ICA-Tune, a method for focused fine-tuning of an LLM using in-context examples. We present a mechanistic evaluation with two accuracy probes to show how input-output alignment emerges in middle layers of an LLM without direct supervision. This alignment leads to an abrupt jump in the completeness of the induction heads in higher layers. We show that, compared to standard fine-tuning, ICA-Tune enables more sample efficient learning and better generalization to OOD instances.
APA
Agarwal, H. & Sarawagi, S.. (2025). The Missing Alignment Link of In-context Learning on Sequences. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:550-565 Available from https://proceedings.mlr.press/v267/agarwal25d.html.

Related Material