Concept Forgetting via Label Annealing

Subhodip Panda, Ananda Theertha Suresh, Atri Guha, Prathosh Ap
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:3346-3360, 2025.

Abstract

The effectiveness of current machine learning models relies on their ability to grasp diverse concepts present in datasets. However, biased and noisy data can inadvertently cause these models to learn certain undesired concepts, undermining their ability to generalize and provide utility. Consequently, modifying a trained model to forget these concepts becomes imperative for their responsible deployment. We refer to this problem as *concept forgetting*. Our goal is to develop techniques for forgetting specific undesired concepts from a pre-trained classification model’s predictions. To achieve this goal, we present an algorithm called **L**abel **AN**nealing (**LAN**). This iterative algorithm employs a two-stage method in each iteration. In the first stage, pseudo-labels are assigned to all the samples by annealing or redistributing the original labels based on the predictions of the model in the current iteration. During the second stage, the model is fine-tuned on the pseudo-labeled dataset generated in the first stage. We illustrate the effectiveness of the proposed algorithm across various models and datasets. Our method reduces *concept violation*, a metric that measures how much the model’s predictions still depend on a given concept, by about 85.35% on the MNIST dataset, 73.25% on the CIFAR-10 dataset, and 69.46% on the CelebA dataset while maintaining high model accuracy.
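The abstract only sketches the two-stage loop, so the following is a hedged toy illustration of that structure, not the paper's actual method: `fit_centroids`, `anneal_labels`, and the probability-`mix` annealing rule are all simplified placeholders invented here (the paper's annealing redistributes labels according to its own rule, and the real models are deep classifiers, not nearest-centroid ones).

```python
# Toy sketch of the LAN outer loop: stage 1 assigns pseudo-labels from the
# current model's predictions, stage 2 refits the model on those labels.
# All components here are simplified stand-ins, not the paper's algorithm.
import random

def fit_centroids(X, y):
    """'Fine-tuning' stage stand-in: refit a 1-D nearest-centroid classifier."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        sums[yi] = sums.get(yi, 0.0) + xi
        counts[yi] = counts.get(yi, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}

def predict(model, xi):
    """Return the class whose centroid is nearest to xi."""
    return min(model, key=lambda c: abs(model[c] - xi))

def anneal_labels(model, X, y_orig, mix):
    """Stage 1 stand-in: with probability `mix`, replace an original label
    by the current model's prediction; otherwise keep the original label."""
    return [predict(model, xi) if random.random() < mix else yi
            for xi, yi in zip(X, y_orig)]

def lan(X, y, iterations=3, mix=0.5):
    model = fit_centroids(X, y)                    # pre-trained model
    for _ in range(iterations):
        pseudo = anneal_labels(model, X, y, mix)   # stage 1: pseudo-labels
        model = fit_centroids(X, pseudo)           # stage 2: fine-tune
    return model

random.seed(0)
X = [0.0, 0.1, 0.9, 1.0]
y = [0, 0, 1, 1]
model = lan(X, y)
print(predict(model, 0.05))  # prints 0 on this separable toy data
```

On this separable toy data the pseudo-labels coincide with the originals, so the loop is a fixed point; the interesting behavior in the paper arises when the annealing rule deliberately redistributes labels to break the model's dependence on an undesired concept.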

Cite this Paper


BibTeX
@InProceedings{pmlr-v286-panda25a,
  title     = {Concept Forgetting via Label Annealing},
  author    = {Panda, Subhodip and Suresh, Ananda Theertha and Guha, Atri and Ap, Prathosh},
  booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
  pages     = {3346--3360},
  year      = {2025},
  editor    = {Chiappa, Silvia and Magliacane, Sara},
  volume    = {286},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--25 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/panda25a/panda25a.pdf},
  url       = {https://proceedings.mlr.press/v286/panda25a.html},
  abstract  = {The effectiveness of current machine learning models relies on their ability to grasp diverse concepts present in datasets. However, biased and noisy data can inadvertently cause these models to learn certain undesired concepts, undermining their ability to generalize and provide utility. Consequently, modifying a trained model to forget these concepts becomes imperative for their responsible deployment. We refer to this problem as *concept forgetting*. Our goal is to develop techniques for forgetting specific undesired concepts from a pre-trained classification model’s prediction. To achieve this goal, we present an algorithm called **L**abel **AN**nealing (**LAN**). This iterative algorithm employs a two-stage method for each iteration. In the first stage, pseudo-labels are assigned to all the samples by annealing or redistributing the original labels based on the predictions of the model in the current iteration. During the second stage, the model is fine-tuned on this pseudo-labeled dataset generated from the first stage. We illustrate the effectiveness of the proposed algorithms across various models and datasets. Our method reduces *concept violation*, a metric that measures how much the model forgets specific concepts, by about 85.35% on the MNIST dataset, 73.25% on the CIFAR-10 dataset, and 69.46% on the CelebA dataset while maintaining high model accuracy.}
}
Endnote
%0 Conference Paper
%T Concept Forgetting via Label Annealing
%A Subhodip Panda
%A Ananda Theertha Suresh
%A Atri Guha
%A Prathosh Ap
%B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2025
%E Silvia Chiappa
%E Sara Magliacane
%F pmlr-v286-panda25a
%I PMLR
%P 3346--3360
%U https://proceedings.mlr.press/v286/panda25a.html
%V 286
%X The effectiveness of current machine learning models relies on their ability to grasp diverse concepts present in datasets. However, biased and noisy data can inadvertently cause these models to learn certain undesired concepts, undermining their ability to generalize and provide utility. Consequently, modifying a trained model to forget these concepts becomes imperative for their responsible deployment. We refer to this problem as *concept forgetting*. Our goal is to develop techniques for forgetting specific undesired concepts from a pre-trained classification model’s prediction. To achieve this goal, we present an algorithm called **L**abel **AN**nealing (**LAN**). This iterative algorithm employs a two-stage method for each iteration. In the first stage, pseudo-labels are assigned to all the samples by annealing or redistributing the original labels based on the predictions of the model in the current iteration. During the second stage, the model is fine-tuned on this pseudo-labeled dataset generated from the first stage. We illustrate the effectiveness of the proposed algorithms across various models and datasets. Our method reduces *concept violation*, a metric that measures how much the model forgets specific concepts, by about 85.35% on the MNIST dataset, 73.25% on the CIFAR-10 dataset, and 69.46% on the CelebA dataset while maintaining high model accuracy.
APA
Panda, S., Suresh, A.T., Guha, A. & Ap, P. (2025). Concept Forgetting via Label Annealing. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:3346-3360. Available from https://proceedings.mlr.press/v286/panda25a.html.
