In-Context In-Context Learning with Transformer Neural Processes

Matthew Ashman, Cristiana Diaconu, Adrian Weller, Richard E. Turner
Proceedings of the 6th Symposium on Advances in Approximate Bayesian Inference, PMLR 253:1-29, 2024.

Abstract

Neural processes (NPs) are a powerful family of meta-learning models that seek to approximate the posterior predictive map of the ground-truth stochastic process from which each dataset in a meta-dataset is sampled. There are many cases in which practitioners, besides having access to the dataset of interest, may also have access to other datasets that share similarities with it. In this case, integrating these datasets into the NP can improve predictions. We equip NPs with this functionality and describe this paradigm as in-context in-context learning. Standard NP architectures, such as the convolutional conditional NP (ConvCNP) or the family of transformer neural processes (TNPs), are not capable of in-context in-context learning, as they are only able to condition on a single dataset. We address this shortcoming by developing the in-context in-context learning pseudo-token TNP (ICICL-TNP). The ICICL-TNP builds on the family of PT-TNPs, which utilise pseudo-token-based transformer architectures to sidestep the quadratic computational complexity associated with regular transformer architectures. Importantly, the ICICL-TNP is capable of conditioning on both sets of datapoints and sets of datasets, enabling it to perform in-context in-context learning. We demonstrate the importance of in-context in-context learning and the effectiveness of the ICICL-TNP in a number of experiments.
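
The pseudo-token mechanism the abstract refers to can be illustrated with a minimal sketch. The block below is a hypothetical PyTorch implementation, not the authors' architecture: the module name, dimensions, and overall wiring are illustrative assumptions. The idea it demonstrates is that M learnable pseudo-tokens cross-attend to the N context points, so the cost scales as O(MN) rather than the O(N^2) of full self-attention over the context, and the target inputs then cross-attend to the resulting pseudo-token summary.

```python
# Illustrative sketch of a pseudo-token attention block in the spirit of PT-TNPs.
# All names and sizes are assumptions for demonstration only.
import torch
import torch.nn as nn


class PseudoTokenBlock(nn.Module):
    def __init__(self, dim: int = 64, num_pseudo: int = 16, num_heads: int = 4):
        super().__init__()
        # Learnable pseudo-tokens, shared across datasets.
        self.pseudo = nn.Parameter(torch.randn(num_pseudo, dim))
        self.ctx_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.tgt_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, context: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # context: (batch, N, dim) embedded context points
        # targets: (batch, T, dim) embedded target inputs
        b = context.shape[0]
        pseudo = self.pseudo.unsqueeze(0).expand(b, -1, -1)  # (batch, M, dim)
        # Pseudo-tokens summarise the context: O(M * N) attention.
        summary, _ = self.ctx_attn(pseudo, context, context)
        # Targets read from the compressed summary: O(T * M) attention.
        out, _ = self.tgt_attn(targets, summary, summary)
        return out


if __name__ == "__main__":
    block = PseudoTokenBlock()
    ctx = torch.randn(2, 100, 64)  # 100 context points per dataset
    tgt = torch.randn(2, 30, 64)   # 30 target inputs
    print(block(ctx, tgt).shape)   # torch.Size([2, 30, 64])
```

In this sketch, conditioning on additional related datasets could in principle be handled by feeding further sets of points through the same summarisation step; the specific ICICL-TNP conditioning mechanism is described in the paper itself.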

Cite this Paper


BibTeX
@InProceedings{pmlr-v253-ashman24a,
  title     = {In-Context In-Context Learning with Transformer Neural Processes},
  author    = {Ashman, Matthew and Diaconu, Cristiana and Weller, Adrian and Turner, Richard E.},
  booktitle = {Proceedings of the 6th Symposium on Advances in Approximate Bayesian Inference},
  pages     = {1--29},
  year      = {2024},
  editor    = {Antorán, Javier and Naesseth, Christian A.},
  volume    = {253},
  series    = {Proceedings of Machine Learning Research},
  month     = {21 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v253/main/assets/ashman24a/ashman24a.pdf},
  url       = {https://proceedings.mlr.press/v253/ashman24a.html},
  abstract  = {Neural processes (NPs) are a powerful family of meta-learning models that seek to approximate the posterior predictive map of the ground-truth stochastic process from which each dataset in a meta-dataset is sampled. There are many cases in which practitioners, besides having access to the dataset of interest, may also have access to other datasets that share similarities with it. In this case, integrating these datasets into the NP can improve predictions. We equip NPs with this functionality and describe this paradigm as in-context in-context learning. Standard NP architectures, such as the convolutional conditional NP (ConvCNP) or the family of transformer neural processes (TNPs), are not capable of in-context in-context learning, as they are only able to condition on a single dataset. We address this shortcoming by developing the in-context in-context learning pseudo-token TNP (ICICL-TNP). The ICICL-TNP builds on the family of PT-TNPs, which utilise pseudo-token-based transformer architectures to sidestep the quadratic computational complexity associated with regular transformer architectures. Importantly, the ICICL-TNP is capable of conditioning on both sets of datapoints and sets of datasets, enabling it to perform in-context in-context learning. We demonstrate the importance of in-context in-context learning and the effectiveness of the ICICL-TNP in a number of experiments.}
}
Endnote
%0 Conference Paper
%T In-Context In-Context Learning with Transformer Neural Processes
%A Matthew Ashman
%A Cristiana Diaconu
%A Adrian Weller
%A Richard E. Turner
%B Proceedings of the 6th Symposium on Advances in Approximate Bayesian Inference
%C Proceedings of Machine Learning Research
%D 2024
%E Javier Antorán
%E Christian A. Naesseth
%F pmlr-v253-ashman24a
%I PMLR
%P 1--29
%U https://proceedings.mlr.press/v253/ashman24a.html
%V 253
%X Neural processes (NPs) are a powerful family of meta-learning models that seek to approximate the posterior predictive map of the ground-truth stochastic process from which each dataset in a meta-dataset is sampled. There are many cases in which practitioners, besides having access to the dataset of interest, may also have access to other datasets that share similarities with it. In this case, integrating these datasets into the NP can improve predictions. We equip NPs with this functionality and describe this paradigm as in-context in-context learning. Standard NP architectures, such as the convolutional conditional NP (ConvCNP) or the family of transformer neural processes (TNPs), are not capable of in-context in-context learning, as they are only able to condition on a single dataset. We address this shortcoming by developing the in-context in-context learning pseudo-token TNP (ICICL-TNP). The ICICL-TNP builds on the family of PT-TNPs, which utilise pseudo-token-based transformer architectures to sidestep the quadratic computational complexity associated with regular transformer architectures. Importantly, the ICICL-TNP is capable of conditioning on both sets of datapoints and sets of datasets, enabling it to perform in-context in-context learning. We demonstrate the importance of in-context in-context learning and the effectiveness of the ICICL-TNP in a number of experiments.
APA
Ashman, M., Diaconu, C., Weller, A. & Turner, R. E. (2024). In-Context In-Context Learning with Transformer Neural Processes. Proceedings of the 6th Symposium on Advances in Approximate Bayesian Inference, in Proceedings of Machine Learning Research 253:1-29. Available from https://proceedings.mlr.press/v253/ashman24a.html.