WebQA: A Multimodal Multihop NeurIPS Challenge

Yingshan Chang, Yonatan Bisk
Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, PMLR 176:232-245, 2022.

Abstract

Scaling the current QA formulation to the open-domain, multi-hop nature of web searches requires fundamental advances in visual representation learning, multimodal reasoning, and language generation. To facilitate research at this intersection, we propose the WebQA challenge, which mirrors the way humans use the web: 1) Ask a question, 2) Choose sources to aggregate, and 3) Produce a fluent language response. Our challenge for the community is to create unified multimodal reasoning models that can answer questions regardless of the source modality, moving us closer to digital assistants that search not only text-based knowledge but also the richer visual trove of information.
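The three steps above (ask a question, choose which text or image sources to aggregate, produce a fluent answer) can be pictured as a small pipeline. The Python sketch below is a minimal, hypothetical illustration only: the Source and WebQAExample field names and the keyword-overlap retrieval are assumptions for exposition, not the WebQA dataset's actual schema or the authors' model.

# A minimal, hypothetical sketch of the three-step pipeline described in the
# abstract (ask a question -> choose sources to aggregate -> produce a fluent
# answer). Field names and the toy keyword-overlap retrieval are illustrative
# assumptions only, not the WebQA dataset schema or the authors' model.
from dataclasses import dataclass
from typing import List


@dataclass
class Source:
    """A candidate knowledge source; may be a text snippet or an image."""
    modality: str   # "text" or "image"
    content: str    # snippet text, or an image caption standing in for pixels


@dataclass
class WebQAExample:
    question: str
    candidates: List[Source]


def select_sources(example: WebQAExample) -> List[Source]:
    """Step 2: decide which candidate sources to aggregate.

    A real system would score each source against the question with a
    multimodal retriever; here we simply keep any source, regardless of
    modality, that shares a content word (length > 3) with the question.
    """
    keywords = {w for w in example.question.lower().split() if len(w) > 3}
    return [s for s in example.candidates
            if keywords & set(s.content.lower().split())]


def generate_answer(question: str, sources: List[Source]) -> str:
    """Step 3: produce a fluent response (stub standing in for a generator)."""
    if not sources:
        return "I could not find relevant sources for that question."
    evidence = "; ".join(s.content for s in sources)
    return f"Aggregating {len(sources)} source(s): {evidence}"


if __name__ == "__main__":
    example = WebQAExample(
        question="What color is the Eiffel Tower at night?",   # Step 1
        candidates=[
            Source("image", "The Eiffel Tower illuminated in golden light at night"),
            Source("text", "The Statue of Liberty stands in New York Harbor"),
        ],
    )
    chosen = select_sources(example)                  # Step 2
    print(generate_answer(example.question, chosen))  # Step 3

The point of the sketch is the unified interface: source selection and answer generation operate over a single candidate pool, with modality carried only as an attribute rather than routed to separate models.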

Cite this Paper


BibTeX
@InProceedings{pmlr-v176-chang22a,
  title     = {WebQA: A Multimodal Multihop NeurIPS Challenge},
  author    = {Chang, Yingshan and Bisk, Yonatan},
  booktitle = {Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track},
  pages     = {232--245},
  year      = {2022},
  editor    = {Kiela, Douwe and Ciccone, Marco and Caputo, Barbara},
  volume    = {176},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--14 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v176/chang22a/chang22a.pdf},
  url       = {https://proceedings.mlr.press/v176/chang22a.html},
  abstract  = {Scaling the current QA formulation to the open-domain and multi-hop nature of web searches requires fundamental advances in visual representation learning, multimodal reasoning and language generation. To facilitate research at this intersection, we propose WebQA challenge that mirrors the way humans use the web: 1) Ask a question, 2) Choose sources to aggregate, and 3) Produce a fluent language response. Our challenge for the community is to create unified multimodal reasoning models that can answer questions regardless of the source modality, moving us closer to digital assistants that search through not only text-based knowledge, but also the richer visual trove of information.}
}
Endnote
%0 Conference Paper
%T WebQA: A Multimodal Multihop NeurIPS Challenge
%A Yingshan Chang
%A Yonatan Bisk
%B Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track
%C Proceedings of Machine Learning Research
%D 2022
%E Douwe Kiela
%E Marco Ciccone
%E Barbara Caputo
%F pmlr-v176-chang22a
%I PMLR
%P 232--245
%U https://proceedings.mlr.press/v176/chang22a.html
%V 176
%X Scaling the current QA formulation to the open-domain and multi-hop nature of web searches requires fundamental advances in visual representation learning, multimodal reasoning and language generation. To facilitate research at this intersection, we propose WebQA challenge that mirrors the way humans use the web: 1) Ask a question, 2) Choose sources to aggregate, and 3) Produce a fluent language response. Our challenge for the community is to create unified multimodal reasoning models that can answer questions regardless of the source modality, moving us closer to digital assistants that search through not only text-based knowledge, but also the richer visual trove of information.
APA
Chang, Y. & Bisk, Y. (2022). WebQA: A Multimodal Multihop NeurIPS Challenge. Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, in Proceedings of Machine Learning Research 176:232-245. Available from https://proceedings.mlr.press/v176/chang22a.html.