Ask Me Anything: Dynamic Memory Networks for Natural Language Processing

Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, Richard Socher
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1378-1387, 2016.

Abstract

Most tasks in natural language processing can be cast into question answering (QA) problems over language input. We introduce the dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers. Questions trigger an iterative attention process which allows the model to condition its attention on the inputs and the result of previous iterations. These results are then reasoned over in a hierarchical recurrent sequence model to generate answers. The DMN can be trained end-to-end and obtains state-of-the-art results on several types of tasks and datasets: question answering (Facebook’s bAbI dataset), text classification for sentiment analysis (Stanford Sentiment Treebank) and sequence modeling for part-of-speech tagging (WSJ-PTB). The training for these different tasks relies exclusively on trained word vector representations and input-question-answer triplets.
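The iterative attention over episodic memories is the paper's central mechanism. As a rough illustration of the idea, below is a minimal PyTorch sketch of an episodic memory module: attention gates are computed from each fact, the question, and the previous memory, a gate-modulated GRU sweep forms an episode, and a second GRU updates the memory across passes. The four-part gate feature, the softmax normalization, the layer sizes, and the fixed number of passes are simplifying assumptions for illustration, not the paper's exact configuration (the original gate uses a larger similarity feature set and a sigmoid scoring network).

import torch
import torch.nn as nn


class EpisodicMemory(nn.Module):
    # Illustrative sketch of a DMN-style episodic memory module. The gate
    # feature [f, q, f*q, f*m] and the softmax over facts are simplifying
    # assumptions; the paper's gate uses a richer feature set and a sigmoid.
    def __init__(self, hidden_size, num_passes=3):
        super().__init__()
        self.num_passes = num_passes
        # Two-layer scoring network for the attention gates.
        self.gate = nn.Sequential(
            nn.Linear(4 * hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
        )
        # One GRU summarizes attended facts into an episode; another
        # folds each episode into the memory.
        self.episode_gru = nn.GRUCell(hidden_size, hidden_size)
        self.memory_gru = nn.GRUCell(hidden_size, hidden_size)

    def forward(self, facts, question):
        # facts: (num_facts, hidden_size); question: (hidden_size,)
        memory = question  # memory is initialized to the question vector
        for _ in range(self.num_passes):
            # Attention is conditioned on the question AND the previous
            # memory, which is what lets a later pass retrieve facts that
            # are only transitively relevant to the question.
            features = torch.stack(
                [torch.cat([f, question, f * question, f * memory]) for f in facts]
            )
            gates = torch.softmax(self.gate(features).squeeze(-1), dim=0)
            # Gate-modulated GRU sweep over the facts yields one episode.
            h = torch.zeros_like(memory)
            for g, f in zip(gates, facts):
                h_new = self.episode_gru(f.unsqueeze(0), h.unsqueeze(0)).squeeze(0)
                h = g * h_new + (1 - g) * h
            # The episode updates the memory; after the final pass the
            # memory would be handed to the answer module.
            memory = self.memory_gru(h.unsqueeze(0), memory.unsqueeze(0)).squeeze(0)
        return memory


# Toy usage: five encoded "facts" and one encoded question, all 64-d.
facts = torch.randn(5, 64)
question = torch.randn(64)
final_memory = EpisodicMemory(hidden_size=64)(facts, question)
print(final_memory.shape)  # torch.Size([64])

Running multiple passes is what distinguishes this from single-shot attention: each pass can attend to facts made relevant only by what the previous pass retrieved, which the paper argues is needed for transitive-reasoning tasks such as bAbI.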

Cite this Paper


BibTeX
@InProceedings{pmlr-v48-kumar16,
  title     = {Ask Me Anything: Dynamic Memory Networks for Natural Language Processing},
  author    = {Kumar, Ankit and Irsoy, Ozan and Ondruska, Peter and Iyyer, Mohit and Bradbury, James and Gulrajani, Ishaan and Zhong, Victor and Paulus, Romain and Socher, Richard},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages     = {1378--1387},
  year      = {2016},
  editor    = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume    = {48},
  series    = {Proceedings of Machine Learning Research},
  address   = {New York, New York, USA},
  month     = {20--22 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v48/kumar16.pdf},
  url       = {https://proceedings.mlr.press/v48/kumar16.html},
  abstract  = {Most tasks in natural language processing can be cast into question answering (QA) problems over language input. We introduce the dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers. Questions trigger an iterative attention process which allows the model to condition its attention on the inputs and the result of previous iterations. These results are then reasoned over in a hierarchical recurrent sequence model to generate answers. The DMN can be trained end-to-end and obtains state-of-the-art results on several types of tasks and datasets: question answering (Facebook’s bAbI dataset), text classification for sentiment analysis (Stanford Sentiment Treebank) and sequence modeling for part-of-speech tagging (WSJ-PTB). The training for these different tasks relies exclusively on trained word vector representations and input-question-answer triplets.}
}
Endnote
%0 Conference Paper
%T Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
%A Ankit Kumar
%A Ozan Irsoy
%A Peter Ondruska
%A Mohit Iyyer
%A James Bradbury
%A Ishaan Gulrajani
%A Victor Zhong
%A Romain Paulus
%A Richard Socher
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-kumar16
%I PMLR
%P 1378--1387
%U https://proceedings.mlr.press/v48/kumar16.html
%V 48
%X Most tasks in natural language processing can be cast into question answering (QA) problems over language input. We introduce the dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers. Questions trigger an iterative attention process which allows the model to condition its attention on the inputs and the result of previous iterations. These results are then reasoned over in a hierarchical recurrent sequence model to generate answers. The DMN can be trained end-to-end and obtains state-of-the-art results on several types of tasks and datasets: question answering (Facebook’s bAbI dataset), text classification for sentiment analysis (Stanford Sentiment Treebank) and sequence modeling for part-of-speech tagging (WSJ-PTB). The training for these different tasks relies exclusively on trained word vector representations and input-question-answer triplets.
RIS
TY - CPAPER
TI - Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
AU - Ankit Kumar
AU - Ozan Irsoy
AU - Peter Ondruska
AU - Mohit Iyyer
AU - James Bradbury
AU - Ishaan Gulrajani
AU - Victor Zhong
AU - Romain Paulus
AU - Richard Socher
BT - Proceedings of The 33rd International Conference on Machine Learning
DA - 2016/06/11
ED - Maria Florina Balcan
ED - Kilian Q. Weinberger
ID - pmlr-v48-kumar16
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 48
SP - 1378
EP - 1387
L1 - http://proceedings.mlr.press/v48/kumar16.pdf
UR - https://proceedings.mlr.press/v48/kumar16.html
AB - Most tasks in natural language processing can be cast into question answering (QA) problems over language input. We introduce the dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers. Questions trigger an iterative attention process which allows the model to condition its attention on the inputs and the result of previous iterations. These results are then reasoned over in a hierarchical recurrent sequence model to generate answers. The DMN can be trained end-to-end and obtains state-of-the-art results on several types of tasks and datasets: question answering (Facebook’s bAbI dataset), text classification for sentiment analysis (Stanford Sentiment Treebank) and sequence modeling for part-of-speech tagging (WSJ-PTB). The training for these different tasks relies exclusively on trained word vector representations and input-question-answer triplets.
ER -
APA
Kumar, A., Irsoy, O., Ondruska, P., Iyyer, M., Bradbury, J., Gulrajani, I., Zhong, V., Paulus, R. & Socher, R. (2016). Ask Me Anything: Dynamic Memory Networks for Natural Language Processing. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1378-1387. Available from https://proceedings.mlr.press/v48/kumar16.html.