CLActive: Episodic Memories for Rapid Active Learning

Sri Aurobindo Munagala, Sidhant Subramanian, Shyamgopal Karthik, Ameya Prabhu, Anoop Namboodiri
Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR 199:430-440, 2022.

Abstract

Active Learning aims to alleviate labelling costs for large-scale datasets by selecting a subset of data to train on effectively. Deep Active Learning (DAL) techniques typically involve repeatedly training a model for sample acquisition over the entire subset of labelled data available in each round. This can be prohibitively expensive in real-world scenarios with large and constantly growing data. Some work has addressed this – notably, Selection-Via-Proxy (SVP) proposed using a separate, smaller proxy model for acquisition. We explore further optimizations to the standard DAL setup and propose CLActive: an optimization procedure that brings significant speedups by maintaining a constant training time for the selection model across rounds and retaining information from past rounds using Experience Replay. We demonstrate large improvements in total train time compared to the fully-trained baselines and SVP, achieving up to 89$\times$, 7$\times$, and 61$\times$ speedups over the fully-trained baseline at 50% of dataset collection on the CIFAR, ImageNet, and Amazon Review datasets, respectively, with little accuracy loss. We also show that CLActive is robust against catastrophic forgetting in a challenging class-incremental active-learning setting. Overall, we believe CLActive can effectively enable rapid prototyping and deployment of deep AL algorithms in real-world use cases across a variety of settings.

Cite this Paper


BibTeX
@InProceedings{pmlr-v199-munagala22a,
  title     = {CLActive: Episodic Memories for Rapid Active Learning},
  author    = {Munagala, Sri Aurobindo and Subramanian, Sidhant and Karthik, Shyamgopal and Prabhu, Ameya and Namboodiri, Anoop},
  booktitle = {Proceedings of The 1st Conference on Lifelong Learning Agents},
  pages     = {430--440},
  year      = {2022},
  editor    = {Chandar, Sarath and Pascanu, Razvan and Precup, Doina},
  volume    = {199},
  series    = {Proceedings of Machine Learning Research},
  month     = {22--24 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v199/munagala22a/munagala22a.pdf},
  url       = {https://proceedings.mlr.press/v199/munagala22a.html},
  abstract  = {Active Learning aims to solve the problem of alleviating labelling costs for large-scale datasets by selecting a subset of data to effectively train on. Deep Active Learning (DAL) techniques typically involve repeated training of a model for sample acquisition over the entire subset of labelled data available in each round. This can be prohibitively expensive to run in real-world scenarios with large and constantly growing data. Some work has been done to address this – notably, Selection-Via-Proxy (SVP) proposed the use of a separate, smaller proxy model for acquisition. We explore further optimizations to the standard DAL setup and propose CLActive: an optimization procedure that brings significant speedups which maintains a constant training time for the selection model across rounds and retains information from past rounds using Experience Replay. We demonstrate large improvements in total train-time compared to the fully-trained baselines and SVP. We achieve up to 89$\times$, 7$\times$, 61$\times$ speedups over the fully-trained baseline at 50% of dataset collection in CIFAR, Imagenet and Amazon Review datasets, respectively, with little accuracy loss. We also show that CLActive is robust against catastrophic forgetting in a challenging class-incremental active-learning setting. Overall, we believe that CLActive can effectively enable rapid prototyping and deployment of deep AL algorithms in real-world use cases across a variety of settings.}
}
Endnote
%0 Conference Paper
%T CLActive: Episodic Memories for Rapid Active Learning
%A Sri Aurobindo Munagala
%A Sidhant Subramanian
%A Shyamgopal Karthik
%A Ameya Prabhu
%A Anoop Namboodiri
%B Proceedings of The 1st Conference on Lifelong Learning Agents
%C Proceedings of Machine Learning Research
%D 2022
%E Sarath Chandar
%E Razvan Pascanu
%E Doina Precup
%F pmlr-v199-munagala22a
%I PMLR
%P 430--440
%U https://proceedings.mlr.press/v199/munagala22a.html
%V 199
%X Active Learning aims to solve the problem of alleviating labelling costs for large-scale datasets by selecting a subset of data to effectively train on. Deep Active Learning (DAL) techniques typically involve repeated training of a model for sample acquisition over the entire subset of labelled data available in each round. This can be prohibitively expensive to run in real-world scenarios with large and constantly growing data. Some work has been done to address this – notably, Selection-Via-Proxy (SVP) proposed the use of a separate, smaller proxy model for acquisition. We explore further optimizations to the standard DAL setup and propose CLActive: an optimization procedure that brings significant speedups which maintains a constant training time for the selection model across rounds and retains information from past rounds using Experience Replay. We demonstrate large improvements in total train-time compared to the fully-trained baselines and SVP. We achieve up to 89$\times$, 7$\times$, 61$\times$ speedups over the fully-trained baseline at 50% of dataset collection in CIFAR, Imagenet and Amazon Review datasets, respectively, with little accuracy loss. We also show that CLActive is robust against catastrophic forgetting in a challenging class-incremental active-learning setting. Overall, we believe that CLActive can effectively enable rapid prototyping and deployment of deep AL algorithms in real-world use cases across a variety of settings.
APA
Munagala, S.A., Subramanian, S., Karthik, S., Prabhu, A. & Namboodiri, A. (2022). CLActive: Episodic Memories for Rapid Active Learning. Proceedings of The 1st Conference on Lifelong Learning Agents, in Proceedings of Machine Learning Research 199:430-440. Available from https://proceedings.mlr.press/v199/munagala22a.html.
