MOGIC: Metadata-infused Oracle Guidance for Improved Extreme Classification

Suchith Chidananda Prabhu, Bhavyajeet Singh, Anshul Mittal, Siddarth Asokan, Shikhar Mohan, Deepak Saini, Yashoteja Prabhu, Lakshya Kumar, Jian Jiao, Amit S, Niket Tandon, Manish Gupta, Sumeet Agarwal, Manik Varma
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:49709-49736, 2025.

Abstract

Retrieval-augmented classification and generation models benefit from early-stage fusion of high-quality text-based metadata, often called memory, but face high latency and noise sensitivity. In extreme classification (XC), where low latency is crucial, existing methods use late-stage fusion for efficiency and robustness. To enhance accuracy while maintaining low latency, we propose MOGIC, a novel approach to metadata-infused oracle guidance for XC. We train an early-fusion oracle classifier with access to both query-side and label-side ground-truth metadata in textual form and subsequently use it to guide existing memory-based XC disciple models via regularization. The MOGIC algorithm improves precision@1 and propensity-scored precision@1 of XC disciple models by 1-2% on six standard datasets, at no additional inference-time cost. We show that MOGIC can be used in a plug-and-play manner to enhance memory-free XC models such as NGAME or DEXA. Lastly, we demonstrate the robustness of the MOGIC algorithm to missing and noisy metadata. The code is publicly available at https://github.com/suchith720/mogic.
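The two metrics quoted in the abstract, precision@1 and propensity-scored precision@1, can be illustrated with a minimal sketch. It assumes the definitions commonly used in the XC literature: precision@k is the fraction of the k highest-scored labels that are relevant, and the propensity-scored variant weights each hit by the inverse of that label's propensity so that rare labels count more. The function names, toy labels, scores, and propensity values below are hypothetical and are not taken from the MOGIC code base. The propensity-scored value is shown unnormalized, whereas reported numbers are often normalized by the best attainable score.

```python
from typing import Dict, Set


def precision_at_k(scores: Dict[str, float], relevant: Set[str], k: int = 1) -> float:
    """Fraction of the k highest-scored labels that are relevant."""
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1.0 for label in top_k if label in relevant) / k


def psp_at_k(
    scores: Dict[str, float],
    relevant: Set[str],
    propensity: Dict[str, float],
    k: int = 1,
) -> float:
    """Unnormalized propensity-scored precision@k.

    Each relevant label in the top k contributes 1 / propensity[label],
    so hits on rare (low-propensity) labels are rewarded more.
    """
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1.0 / propensity[label] for label in top_k if label in relevant) / k


# Hypothetical toy data: model scores for three labels of one query.
scores = {"a": 0.9, "b": 0.4, "c": 0.7}
relevant = {"b", "c"}                        # ground-truth labels
propensity = {"a": 0.8, "b": 0.1, "c": 0.5}  # assumed label propensities

print(precision_at_k(scores, relevant, k=1))      # top-1 is "a", not relevant
print(precision_at_k(scores, relevant, k=2))      # top-2 is "a", "c"; one hit
print(psp_at_k(scores, relevant, propensity, k=2))
```

In practice both metrics are averaged over all test queries; the sketch scores a single query to keep the arithmetic easy to follow.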

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-prabhu25a,
  title = {{MOGIC}: Metadata-infused Oracle Guidance for Improved Extreme Classification},
  author = {Prabhu, Suchith Chidananda and Singh, Bhavyajeet and Mittal, Anshul and Asokan, Siddarth and Mohan, Shikhar and Saini, Deepak and Prabhu, Yashoteja and Kumar, Lakshya and Jiao, Jian and S, Amit and Tandon, Niket and Gupta, Manish and Agarwal, Sumeet and Varma, Manik},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {49709--49736},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/prabhu25a/prabhu25a.pdf},
  url = {https://proceedings.mlr.press/v267/prabhu25a.html},
  abstract = {Retrieval-augmented classification and generation models benefit from early-stage fusion of high-quality text-based metadata, often called memory, but face high latency and noise sensitivity. In extreme classification (XC), where low latency is crucial, existing methods use late-stage fusion for efficiency and robustness. To enhance accuracy while maintaining low latency, we propose MOGIC, a novel approach to metadata-infused oracle guidance for XC. We train an early-fusion oracle classifier with access to both query-side and label-side ground-truth metadata in textual form and subsequently use it to guide existing memory-based XC disciple models via regularization. The MOGIC algorithm improves precision@1 and propensity-scored precision@1 of XC disciple models by 1-2% on six standard datasets, at no additional inference-time cost. We show that MOGIC can be used in a plug-and-play manner to enhance memory-free XC models such as NGAME or DEXA. Lastly, we demonstrate the robustness of the MOGIC algorithm to missing and noisy metadata. The code is publicly available at https://github.com/suchith720/mogic.}
}
Endnote
%0 Conference Paper
%T MOGIC: Metadata-infused Oracle Guidance for Improved Extreme Classification
%A Suchith Chidananda Prabhu
%A Bhavyajeet Singh
%A Anshul Mittal
%A Siddarth Asokan
%A Shikhar Mohan
%A Deepak Saini
%A Yashoteja Prabhu
%A Lakshya Kumar
%A Jian Jiao
%A Amit S
%A Niket Tandon
%A Manish Gupta
%A Sumeet Agarwal
%A Manik Varma
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-prabhu25a
%I PMLR
%P 49709--49736
%U https://proceedings.mlr.press/v267/prabhu25a.html
%V 267
%X Retrieval-augmented classification and generation models benefit from early-stage fusion of high-quality text-based metadata, often called memory, but face high latency and noise sensitivity. In extreme classification (XC), where low latency is crucial, existing methods use late-stage fusion for efficiency and robustness. To enhance accuracy while maintaining low latency, we propose MOGIC, a novel approach to metadata-infused oracle guidance for XC. We train an early-fusion oracle classifier with access to both query-side and label-side ground-truth metadata in textual form and subsequently use it to guide existing memory-based XC disciple models via regularization. The MOGIC algorithm improves precision@1 and propensity-scored precision@1 of XC disciple models by 1-2% on six standard datasets, at no additional inference-time cost. We show that MOGIC can be used in a plug-and-play manner to enhance memory-free XC models such as NGAME or DEXA. Lastly, we demonstrate the robustness of the MOGIC algorithm to missing and noisy metadata. The code is publicly available at https://github.com/suchith720/mogic.
APA
Prabhu, S. C., Singh, B., Mittal, A., Asokan, S., Mohan, S., Saini, D., Prabhu, Y., Kumar, L., Jiao, J., S, A., Tandon, N., Gupta, M., Agarwal, S., & Varma, M. (2025). MOGIC: Metadata-infused Oracle Guidance for Improved Extreme Classification. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:49709-49736. Available from https://proceedings.mlr.press/v267/prabhu25a.html.
