MABe22: A Multi-Species Multi-Task Benchmark for Learned Representations of Behavior

Jennifer J. Sun, Markus Marks, Andrew Wesley Ulmer, Dipam Chakraborty, Brian Geuther, Edward Hayes, Heng Jia, Vivek Kumar, Sebastian Oleszko, Zachary Partridge, Milan Peelman, Alice Robie, Catherine E Schretter, Keith Sheppard, Chao Sun, Param Uttarwar, Julian Morgan Wagner, Erik Werner, Joseph Parker, Pietro Perona, Yisong Yue, Kristin Branson, Ann Kennedy
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:32936-32990, 2023.

Abstract

We introduce MABe22, a large-scale, multi-agent video and trajectory benchmark to assess the quality of learned behavior representations. This dataset is collected from a variety of biology experiments, and includes triplets of interacting mice (4.7 million frames of video and pose tracking data, plus 10 million frames of pose-only data), symbiotic beetle-ant interactions (10 million frames of video data), and groups of interacting flies (4.4 million frames of pose tracking data). Accompanying these data, we introduce a panel of real-world downstream analysis tasks to assess the quality of learned representations by evaluating how well they preserve information about the experimental conditions (e.g. strain, time of day, optogenetic stimulation) and animal behavior. We test multiple state-of-the-art self-supervised video and trajectory representation learning methods to demonstrate the use of our benchmark, revealing that methods developed on human action datasets do not fully translate to animal datasets. We hope that our benchmark and dataset encourage a broader exploration of behavior representation learning methods across species and settings.
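To make the evaluation idea concrete, the sketch below shows one standard way to operationalize "how well a frozen representation preserves task information": score per-frame embeddings with a simple linear readout on a downstream label. This is an illustrative sketch only; the array names, shapes, and the binary stimulation label are assumptions, not the benchmark's official data loader or task definitions.

# Minimal linear-probe sketch for scoring a frozen behavior representation.
# All data here is synthetic and stands in for hypothetical MABe22 embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical learned representation: one embedding vector per video frame.
n_frames, embed_dim = 10_000, 128
embeddings = rng.normal(size=(n_frames, embed_dim))

# Hypothetical frame-level label for one downstream task
# (e.g. whether optogenetic stimulation was on in that frame).
labels = rng.integers(0, 2, size=n_frames)

# Frozen embeddings + linear readout: the representation is judged by how much
# task-relevant information a simple linear classifier can recover from it.
X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.3, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("linear-probe F1:", f1_score(y_test, probe.predict(X_test)))

In a setup like this, a higher probe score on held-out frames suggests the representation retains more information about the experimental condition; repeating the readout across the benchmark's panel of tasks gives a multi-task picture of representation quality.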

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-sun23g,
  title     = {{MAB}e22: A Multi-Species Multi-Task Benchmark for Learned Representations of Behavior},
  author    = {Sun, Jennifer J. and Marks, Markus and Ulmer, Andrew Wesley and Chakraborty, Dipam and Geuther, Brian and Hayes, Edward and Jia, Heng and Kumar, Vivek and Oleszko, Sebastian and Partridge, Zachary and Peelman, Milan and Robie, Alice and Schretter, Catherine E and Sheppard, Keith and Sun, Chao and Uttarwar, Param and Wagner, Julian Morgan and Werner, Erik and Parker, Joseph and Perona, Pietro and Yue, Yisong and Branson, Kristin and Kennedy, Ann},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {32936--32990},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/sun23g/sun23g.pdf},
  url       = {https://proceedings.mlr.press/v202/sun23g.html}
}
APA
Sun, J.J., Marks, M., Ulmer, A.W., Chakraborty, D., Geuther, B., Hayes, E., Jia, H., Kumar, V., Oleszko, S., Partridge, Z., Peelman, M., Robie, A., Schretter, C.E., Sheppard, K., Sun, C., Uttarwar, P., Wagner, J.M., Werner, E., Parker, J., Perona, P., Yue, Y., Branson, K. & Kennedy, A. (2023). MABe22: A Multi-Species Multi-Task Benchmark for Learned Representations of Behavior. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:32936-32990. Available from https://proceedings.mlr.press/v202/sun23g.html.