Bimodal Modelling of Source Code and Natural Language

Miltos Allamanis; Daniel Tarlow; Andrew Gordon; Yi Wei

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:2123-2132, 2015.

Abstract

We consider the problem of building probabilistic models that jointly model short natural language utterances and source code snippets. The aim is to bring together recent work on statistical modelling of source code and work on bimodal models of images and natural language. The resulting models are useful for a variety of tasks that involve natural language and source code. We demonstrate their performance on two retrieval tasks: retrieving source code snippets given a natural language query, and retrieving natural language descriptions given a source code query (i.e., source code captioning). The experiments show there to be promise in this direction, and that modelling the structure of source code is helpful towards the retrieval tasks.

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-allamanis15,
  title = 	 {Bimodal Modelling of Source Code and Natural Language},
  author = 	 {Allamanis, Miltos and Tarlow, Daniel and Gordon, Andrew and Wei, Yi},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {2123--2132},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/allamanis15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/allamanis15.html},
  abstract = 	 {We consider the problem of building probabilistic models that jointly model short natural language utterances and source code snippets. The aim is to bring together recent work on statistical modelling of source code and work on bimodal models of images and natural language. The resulting models are useful for a variety of tasks that involve natural language and source code. We demonstrate their performance on two retrieval tasks: retrieving source code snippets given a natural language query, and retrieving natural language descriptions given a source code query (i.e., source code captioning). The experiments show there to be promise in this direction, and that modelling the structure of source code is helpful towards the retrieval tasks.}
}

Endnote

%0 Conference Paper
%T Bimodal Modelling of Source Code and Natural Language
%A Miltos Allamanis
%A Daniel Tarlow
%A Andrew Gordon
%A Yi Wei
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-allamanis15
%I PMLR
%P 2123--2132
%U https://proceedings.mlr.press/v37/allamanis15.html
%V 37
%X We consider the problem of building probabilistic models that jointly model short natural language utterances and source code snippets. The aim is to bring together recent work on statistical modelling of source code and work on bimodal models of images and natural language. The resulting models are useful for a variety of tasks that involve natural language and source code. We demonstrate their performance on two retrieval tasks: retrieving source code snippets given a natural language query, and retrieving natural language descriptions given a source code query (i.e., source code captioning). The experiments show there to be promise in this direction, and that modelling the structure of source code is helpful towards the retrieval tasks.

RIS


TY  - CPAPER
TI  - Bimodal Modelling of Source Code and Natural Language
AU  - Miltos Allamanis
AU  - Daniel Tarlow
AU  - Andrew Gordon
AU  - Yi Wei
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-allamanis15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 2123
EP  - 2132
L1  - http://proceedings.mlr.press/v37/allamanis15.pdf
UR  - https://proceedings.mlr.press/v37/allamanis15.html
AB  - We consider the problem of building probabilistic models that jointly model short natural language utterances and source code snippets. The aim is to bring together recent work on statistical modelling of source code and work on bimodal models of images and natural language. The resulting models are useful for a variety of tasks that involve natural language and source code. We demonstrate their performance on two retrieval tasks: retrieving source code snippets given a natural language query, and retrieving natural language descriptions given a source code query (i.e., source code captioning). The experiments show there to be promise in this direction, and that modelling the structure of source code is helpful towards the retrieval tasks.
ER  -

APA


Allamanis, M., Tarlow, D., Gordon, A., & Wei, Y. (2015). Bimodal Modelling of Source Code and Natural Language. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:2123-2132. Available from https://proceedings.mlr.press/v37/allamanis15.html.
