How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding

Yuchen Li; Yuanzhi Li; Andrej Risteski

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:19689-19729, 2023.

Abstract

While the successes of transformers across many domains are indisputable, accurate understanding of the learning mechanics is still largely lacking. Their capabilities have been probed on benchmarks which include a variety of structured and reasoning tasks—but mathematical understanding is lagging substantially behind. Recent lines of work have begun studying representational aspects of this question: that is, the size/depth/complexity of attention-based networks to perform certain tasks. However, there is no guarantee the learning dynamics will converge to the constructions proposed. In our paper, we provide fine-grained mechanistic understanding of how transformers learn “semantic structure”, understood as capturing co-occurrence structure of words. Precisely, we show, through a combination of mathematical analysis and experiments on Wikipedia data and synthetic data modeled by Latent Dirichlet Allocation (LDA), that the embedding layer and the self-attention layer encode the topical structure. In the former case, this manifests as higher average inner product of embeddings between same-topic words. In the latter, it manifests as higher average pairwise attention between same-topic words. The mathematical results involve several assumptions to make the analysis tractable, which we verify on data, and might be of independent interest as well.

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-li23p,
  title = 	 {How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding},
  author =       {Li, Yuchen and Li, Yuanzhi and Risteski, Andrej},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {19689--19729},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/li23p/li23p.pdf},
  url = 	 {https://proceedings.mlr.press/v202/li23p.html},
  abstract = 	 {While the successes of transformers across many domains are indisputable, accurate understanding of the learning mechanics is still largely lacking. Their capabilities have been probed on benchmarks which include a variety of structured and reasoning tasks—but mathematical understanding is lagging substantially behind. Recent lines of work have begun studying representational aspects of this question: that is, the size/depth/complexity of attention-based networks to perform certain tasks. However, there is no guarantee the learning dynamics will converge to the constructions proposed. In our paper, we provide fine-grained mechanistic understanding of how transformers learn “semantic structure”, understood as capturing co-occurrence structure of words. Precisely, we show, through a combination of mathematical analysis and experiments on Wikipedia data and synthetic data modeled by Latent Dirichlet Allocation (LDA), that the embedding layer and the self-attention layer encode the topical structure. In the former case, this manifests as higher average inner product of embeddings between same-topic words. In the latter, it manifests as higher average pairwise attention between same-topic words. The mathematical results involve several assumptions to make the analysis tractable, which we verify on data, and might be of independent interest as well.}
}

Endnote

%0 Conference Paper
%T How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding
%A Yuchen Li
%A Yuanzhi Li
%A Andrej Risteski
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-li23p
%I PMLR
%P 19689--19729
%U https://proceedings.mlr.press/v202/li23p.html
%V 202
%X While the successes of transformers across many domains are indisputable, accurate understanding of the learning mechanics is still largely lacking. Their capabilities have been probed on benchmarks which include a variety of structured and reasoning tasks—but mathematical understanding is lagging substantially behind. Recent lines of work have begun studying representational aspects of this question: that is, the size/depth/complexity of attention-based networks to perform certain tasks. However, there is no guarantee the learning dynamics will converge to the constructions proposed. In our paper, we provide fine-grained mechanistic understanding of how transformers learn “semantic structure”, understood as capturing co-occurrence structure of words. Precisely, we show, through a combination of mathematical analysis and experiments on Wikipedia data and synthetic data modeled by Latent Dirichlet Allocation (LDA), that the embedding layer and the self-attention layer encode the topical structure. In the former case, this manifests as higher average inner product of embeddings between same-topic words. In the latter, it manifests as higher average pairwise attention between same-topic words. The mathematical results involve several assumptions to make the analysis tractable, which we verify on data, and might be of independent interest as well.

APA


Li, Y., Li, Y. & Risteski, A. (2023). How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:19689-19729. Available from https://proceedings.mlr.press/v202/li23p.html.
