Disentangling Uncertainties by Learning Compressed Data Representation

Zhiyu An; Zhibo Hou; Wan Du

Disentangling Uncertainties by Learning Compressed Data Representation

Zhiyu An, Zhibo Hou, Wan Du

Proceedings of the 7th Annual Learning for Dynamics \& Control Conference, PMLR 283:1472-1483, 2025.

Abstract

We study aleatoric and epistemic uncertainty estimation in a learned regressive system dynamics model. Disentangling aleatoric uncertainty (the inherent randomness of the system) from epistemic uncertainty (the lack of data) is crucial for downstream tasks such as risk-aware control and reinforcement learning, efficient exploration, and robust policy transfer. While existing approaches like Gaussian Processes, Bayesian networks, and model ensembles are widely adopted, they suffer from either high computational complexity or inaccurate uncertainty estimation. To address these limitations, we propose the Compressed Data Representation Model (CDRM), a framework that learns a neural network encoding of the data distribution and enables direct sampling from the output distribution. Our approach incorporates a novel inference procedure based on Langevin dynamics sampling, allowing CDRM to predict arbitrary output distributions rather than being constrained to a Gaussian prior. Theoretical analysis provides the conditions where CDRM achieves better memory and computational complexity compared to bin-based compression methods. Empirical evaluations show that CDRM demonstrates a superior capability to identify aleatoric and epistemic uncertainties separately, achieving AUROCs of 0.8876 and 0.9981 on a single test set containing a mixture of both uncertainties. Qualitative results further show that CDRM’s capability extends to datasets with multimodal output distributions, a challenging scenario where existing methods consistently fail. Code and supplementary materials are available at \url{https://github.com/ryeii/CDRM}.

Cite this Paper

BibTeX

@InProceedings{pmlr-v283-an25a,
  title = 	 {Disentangling Uncertainties by Learning Compressed Data Representation},
  author =       {An, Zhiyu and Hou, Zhibo and Du, Wan},
  booktitle = 	 {Proceedings of the 7th Annual Learning for Dynamics \& Control Conference},
  pages = 	 {1472--1483},
  year = 	 {2025},
  editor = 	 {Ozay, Necmiye and Balzano, Laura and Panagou, Dimitra and Abate, Alessandro},
  volume = 	 {283},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {04--06 Jun},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v283/main/assets/an25a/an25a.pdf},
  url = 	 {https://proceedings.mlr.press/v283/an25a.html},
  abstract = 	 {We study aleatoric and epistemic uncertainty estimation in a learned regressive system dynamics model. Disentangling aleatoric uncertainty (the inherent randomness of the system) from epistemic uncertainty (the lack of data) is crucial for downstream tasks such as risk-aware control and reinforcement learning, efficient exploration, and robust policy transfer. While existing approaches like Gaussian Processes, Bayesian networks, and model ensembles are widely adopted, they suffer from either high computational complexity or inaccurate uncertainty estimation. To address these limitations, we propose the Compressed Data Representation Model (CDRM), a framework that learns a neural network encoding of the data distribution and enables direct sampling from the output distribution. Our approach incorporates a novel inference procedure based on Langevin dynamics sampling, allowing CDRM to predict arbitrary output distributions rather than being constrained to a Gaussian prior. Theoretical analysis provides the conditions where CDRM achieves better memory and computational complexity compared to bin-based compression methods. Empirical evaluations show that CDRM demonstrates a superior capability to identify aleatoric and epistemic uncertainties separately, achieving AUROCs of 0.8876 and 0.9981 on a single test set containing a mixture of both uncertainties. Qualitative results further show that CDRM’s capability extends to datasets with multimodal output distributions, a challenging scenario where existing methods consistently fail. Code and supplementary materials are available at \url{https://github.com/ryeii/CDRM}.}
}

Endnote

%0 Conference Paper
%T Disentangling Uncertainties by Learning Compressed Data Representation
%A Zhiyu An
%A Zhibo Hou
%A Wan Du
%B Proceedings of the 7th Annual Learning for Dynamics \& Control Conference
%C Proceedings of Machine Learning Research
%D 2025
%E Necmiye Ozay
%E Laura Balzano
%E Dimitra Panagou
%E Alessandro Abate	
%F pmlr-v283-an25a
%I PMLR
%P 1472--1483
%U https://proceedings.mlr.press/v283/an25a.html
%V 283
%X We study aleatoric and epistemic uncertainty estimation in a learned regressive system dynamics model. Disentangling aleatoric uncertainty (the inherent randomness of the system) from epistemic uncertainty (the lack of data) is crucial for downstream tasks such as risk-aware control and reinforcement learning, efficient exploration, and robust policy transfer. While existing approaches like Gaussian Processes, Bayesian networks, and model ensembles are widely adopted, they suffer from either high computational complexity or inaccurate uncertainty estimation. To address these limitations, we propose the Compressed Data Representation Model (CDRM), a framework that learns a neural network encoding of the data distribution and enables direct sampling from the output distribution. Our approach incorporates a novel inference procedure based on Langevin dynamics sampling, allowing CDRM to predict arbitrary output distributions rather than being constrained to a Gaussian prior. Theoretical analysis provides the conditions where CDRM achieves better memory and computational complexity compared to bin-based compression methods. Empirical evaluations show that CDRM demonstrates a superior capability to identify aleatoric and epistemic uncertainties separately, achieving AUROCs of 0.8876 and 0.9981 on a single test set containing a mixture of both uncertainties. Qualitative results further show that CDRM’s capability extends to datasets with multimodal output distributions, a challenging scenario where existing methods consistently fail. Code and supplementary materials are available at \url{https://github.com/ryeii/CDRM}.

APA

An, Z., Hou, Z. & Du, W.. (2025). Disentangling Uncertainties by Learning Compressed Data Representation. Proceedings of the 7th Annual Learning for Dynamics \& Control Conference, in Proceedings of Machine Learning Research 283:1472-1483 Available from https://proceedings.mlr.press/v283/an25a.html.

Related Material

Download PDF