Anchor-Guided Repair: A Defense Mechanism for Enhancing Stability of Compromised Pretrained Language Models Against Low-Precision and Weight Noise Attacks

Abrar Mahir Rohan, Nafiz Khan, Tahsin Tajwar Tanni, Fuad Fardin, Anika Bushra
Proceedings of IndabaX Nigeria 2026: Building Scalable AI That Works: From Research to Deployment in Resource-Constrained Environments, PMLR 319:368-381, 2026.

Abstract

We propose Anchor-Guided Repair, a defense mechanism for stabilising large language models (LLMs) compromised by weight noise injection and low-precision quantisation attacks. The method retrains the attacked model on clean text with an anchor regularisation loss that penalises large parameter deviations from a clean reference model. The combined objective balances language modelling loss and anchoring regularisation. Tested across various quantisation levels and weighted Gaussian noise attack scenarios, Anchor-Guided Repair consistently improves stability and performance relative to attacked models, demonstrating that anchoring can recover reliability even without proprietary training data.
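
The combined objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the anchor term, so this example assumes a squared L2 distance to the clean reference parameters and a single weighting coefficient; the names `anchor_penalty`, `combined_loss`, and `lam` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def anchor_penalty(params: Sequence[float], anchor: Sequence[float]) -> float:
    """Squared L2 distance between current and clean reference parameters.

    Assumed form: the abstract only says the loss penalises large
    deviations from a clean reference model.
    """
    if len(params) != len(anchor):
        raise ValueError("params and anchor must have the same length")
    return sum((p - a) ** 2 for p, a in zip(params, anchor))


def combined_loss(
    lm_loss: float,
    params: Sequence[float],
    anchor: Sequence[float],
    lam: float = 0.1,  # hypothetical weighting coefficient
) -> float:
    """Language modelling loss plus the weighted anchor penalty."""
    return lm_loss + lam * anchor_penalty(params, anchor)
```

Under these assumptions, a larger `lam` keeps the repaired parameters closer to the clean reference, while a smaller `lam` lets the language modelling loss on clean text dominate.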

Cite this Paper


BibTeX
@InProceedings{pmlr-v319-rohan26a,
  title = {Anchor-Guided Repair: A Defense Mechanism for Enhancing Stability of Compromised Pretrained Language Models Against Low-Precision and Weight Noise Attacks},
  author = {Rohan, Abrar Mahir and Khan, Nafiz and Tanni, Tahsin Tajwar and Fardin, Fuad and Bushra, Anika},
  booktitle = {Proceedings of IndabaX Nigeria 2026: Building Scalable AI That Works: From Research to Deployment in Resource-Constrained Environments},
  pages = {368--381},
  year = {2026},
  editor = {Folorunso, Sakinat and Ogundokun, Roseline and Oladipo, Francisca},
  volume = {319},
  series = {Proceedings of Machine Learning Research},
  month = {11--14 May},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v319/main/assets/rohan26a/rohan26a.pdf},
  url = {https://proceedings.mlr.press/v319/rohan26a.html},
  abstract = {We propose Anchor-Guided Repair, a defense mechanism for stabilising large language models (LLMs) compromised by weight noise injection and low-precision quantisation attacks. The method retrains the attacked model on clean text with an anchor regularisation loss that penalises large parameter deviations from a clean reference model. The combined objective balances language modelling loss and anchoring regularisation. Tested across various quantisation levels and weighted Gaussian noise attack scenarios, Anchor-Guided Repair consistently improves stability and performance relative to attacked models, demonstrating that anchoring can recover reliability even without proprietary training data.}
}
Endnote
%0 Conference Paper
%T Anchor-Guided Repair: A Defense Mechanism for Enhancing Stability of Compromised Pretrained Language Models Against Low-Precision and Weight Noise Attacks
%A Abrar Mahir Rohan
%A Nafiz Khan
%A Tahsin Tajwar Tanni
%A Fuad Fardin
%A Anika Bushra
%B Proceedings of IndabaX Nigeria 2026: Building Scalable AI That Works: From Research to Deployment in Resource-Constrained Environments
%C Proceedings of Machine Learning Research
%D 2026
%E Sakinat Folorunso
%E Roseline Ogundokun
%E Francisca Oladipo
%F pmlr-v319-rohan26a
%I PMLR
%P 368--381
%U https://proceedings.mlr.press/v319/rohan26a.html
%V 319
%X We propose Anchor-Guided Repair, a defense mechanism for stabilising large language models (LLMs) compromised by weight noise injection and low-precision quantisation attacks. The method retrains the attacked model on clean text with an anchor regularisation loss that penalises large parameter deviations from a clean reference model. The combined objective balances language modelling loss and anchoring regularisation. Tested across various quantisation levels and weighted Gaussian noise attack scenarios, Anchor-Guided Repair consistently improves stability and performance relative to attacked models, demonstrating that anchoring can recover reliability even without proprietary training data.
APA
Rohan, A.M., Khan, N., Tanni, T.T., Fardin, F. & Bushra, A. (2026). Anchor-Guided Repair: A Defense Mechanism for Enhancing Stability of Compromised Pretrained Language Models Against Low-Precision and Weight Noise Attacks. Proceedings of IndabaX Nigeria 2026: Building Scalable AI That Works: From Research to Deployment in Resource-Constrained Environments, in Proceedings of Machine Learning Research 319:368-381. Available from https://proceedings.mlr.press/v319/rohan26a.html.
