Causal Reinforcement Learning for Labelling Optimization in Cyber Anomaly Detection

Susan Babirye, Gong Yu, Shimadzu Hideyasu, Kyriakopoulos Konstantinos
Proceedings of the 2025 Conference on Applied Machine Learning for Information Security, PMLR 299:110-134, 2025.

Abstract

The application of machine learning (ML) to cyber anomaly detection has attracted significant research attention. However, existing detection systems often face major challenges, including rigid feature discretisation, black-box classification, biased learning from confounded data, and a lack of robustness, which collectively compromise interpretability, fairness, and predictive accuracy. Causal inference offers a robust approach to estimating intervention effects by separating spurious correlations from true cause-effect relationships, which is crucial for reliable decision-making under uncertainty. Reinforcement learning (RL), in turn, enables agents to learn optimal adaptive policies through interaction with dynamic environments. To address the aforementioned challenges, this work proposes a paradigm that leverages an RL framework to embed causal inference in the anomaly detection pipeline. Specifically, an RL agent is trained to optimize binning thresholds for confounded numerical features, guided by a reward function that incorporates both causal effect estimation and predictive accuracy. This approach enables the agent to learn feature discretisation strategies that avoid spurious associations induced by confounders, resulting in thresholds that are both causally aware and statistically effective. The optimized binning policy is then applied to transform the dataset, and a decision tree classifier is trained on the resulting unbiased features. This produces a model that is interpretable, robust to confounding, and sensitive to causal structures. Experimental results show that the proposed approach improves robustness and interpretability in unseen environments. This work highlights the potential of combining causal reasoning with adaptive learning to produce high-performance, transparent, bias-aware cyber defence models with optimal feature discretisation.
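The abstract describes, at a high level, an RL agent that searches over binning thresholds under a reward mixing a causal effect estimate with predictive accuracy, followed by a decision tree trained on the binned features. The sketch below is a minimal, self-contained illustration of that idea only, not the authors' implementation: it assumes synthetic confounded data, a single confounder z adjusted by stratification (backdoor adjustment), an epsilon-greedy bandit in place of a full RL formulation, and an arbitrary trade-off weight alpha in the reward.

# Minimal, illustrative sketch only -- not the authors' implementation.
# Threshold selection is cast as a one-state RL (bandit) problem whose reward
# mixes a backdoor-adjusted causal effect estimate with decision-tree accuracy.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# --- Synthetic confounded data (hypothetical stand-in for network telemetry) ---
n = 4000
z = rng.integers(0, 2, n)                        # confounder, e.g. traffic regime
x = rng.normal(loc=2.0 * z, scale=1.0, size=n)   # numerical feature influenced by z
logits = 1.5 * (x > 1.0).astype(float) + 2.0 * z - 1.5
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)  # anomaly label

candidate_thresholds = np.quantile(x, np.linspace(0.1, 0.9, 17))  # action space

def adjusted_effect(threshold):
    """Backdoor-adjusted effect of the binned feature on y, stratifying on z."""
    t = (x > threshold).astype(int)
    effects, weights = [], []
    for zv in np.unique(z):
        m = z == zv
        if t[m].min() == t[m].max():   # stratum contains only one bin -> skip
            continue
        effects.append(y[m][t[m] == 1].mean() - y[m][t[m] == 0].mean())
        weights.append(m.mean())
    return np.average(effects, weights=weights) if effects else 0.0

def predictive_score(threshold):
    """Cross-validated decision-tree accuracy using the binned feature plus z."""
    feats = np.column_stack([(x > threshold).astype(int), z])
    clf = DecisionTreeClassifier(max_depth=3, random_state=0)
    return cross_val_score(clf, feats, y, cv=3).mean()

def reward(threshold, alpha=0.5):
    # alpha is an assumed trade-off knob between causal and predictive terms.
    return alpha * abs(adjusted_effect(threshold)) + (1 - alpha) * predictive_score(threshold)

# --- Epsilon-greedy bandit over candidate thresholds ---
q = np.zeros(len(candidate_thresholds))
counts = np.zeros(len(candidate_thresholds))
for step in range(200):
    a = rng.integers(len(q)) if rng.random() < 0.2 else int(np.argmax(q))
    r = reward(candidate_thresholds[a])
    counts[a] += 1
    q[a] += (r - q[a]) / counts[a]               # incremental mean update

best_t = candidate_thresholds[int(np.argmax(q))]

# --- Final interpretable model trained on the causally aware binning ---
final_feats = np.column_stack([(x > best_t).astype(int), z])
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(final_feats, y)
print(f"learned threshold={best_t:.2f}, train acc={tree.score(final_feats, y):.3f}")

In this toy setting the stratified effect discourages thresholds whose apparent association with the label is driven by the confounder, while the cross-validated tree accuracy keeps the binning predictively useful; the reward design, action space, and adjustment method used in the paper may differ.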

Cite this Paper


BibTeX
@InProceedings{pmlr-v299-babirye25a,
  title     = {Causal Reinforcement Learning for Labelling Optimization in Cyber Anomaly Detection},
  author    = {Babirye, Susan and Yu, Gong and Hideyasu, Shimadzu and Konstantinos, Kyriakopoulos},
  booktitle = {Proceedings of the 2025 Conference on Applied Machine Learning for Information Security},
  pages     = {110--134},
  year      = {2025},
  editor    = {Raff, Edward and Rudd, Ethan M.},
  volume    = {299},
  series    = {Proceedings of Machine Learning Research},
  month     = {22--24 Oct},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v299/main/assets/babirye25a/babirye25a.pdf},
  url       = {https://proceedings.mlr.press/v299/babirye25a.html}
}
Endnote
%0 Conference Paper
%T Causal Reinforcement Learning for Labelling Optimization in Cyber Anomaly Detection
%A Susan Babirye
%A Gong Yu
%A Shimadzu Hideyasu
%A Kyriakopoulos Konstantinos
%B Proceedings of the 2025 Conference on Applied Machine Learning for Information Security
%C Proceedings of Machine Learning Research
%D 2025
%E Edward Raff
%E Ethan M. Rudd
%F pmlr-v299-babirye25a
%I PMLR
%P 110--134
%U https://proceedings.mlr.press/v299/babirye25a.html
%V 299
APA
Babirye, S., Yu, G., Hideyasu, S. & Konstantinos, K. (2025). Causal Reinforcement Learning for Labelling Optimization in Cyber Anomaly Detection. Proceedings of the 2025 Conference on Applied Machine Learning for Information Security, in Proceedings of Machine Learning Research 299:110-134. Available from https://proceedings.mlr.press/v299/babirye25a.html.
