How to Train Your Latent Control Barrier Function: Smooth Safety Filtering Under Hard-to-Model Constraints

Kensuke Nakamura, Arun L Bishop, Steven Man, Aaron M. Johnson, Zachary Manchester, Andrea Bajcsy
Proceedings of The 8th Annual Learning for Dynamics and Control Conference, PMLR 331:145-168, 2026.

Abstract

Latent safety filters extend Hamilton-Jacobi (HJ) reachability to operate on latent state representations and dynamics learned directly from high-dimensional observations, enabling safe visuomotor control under hard-to-model constraints. However, existing methods implement “least-restrictive” filtering that discretely switches between nominal and safety policies, potentially undermining the task performance that makes modern visuomotor policies valuable. While reachability value functions can, in principle, be adapted to be control barrier functions (CBFs) for smooth optimization-based filtering, we theoretically and empirically show that current latent-space learning methods produce fundamentally incompatible value functions. We identify two sources of incompatibility. First, in HJ reachability, failures are encoded via a “margin function” in latent space, whose sign indicates whether or not a latent is in the constraint set. However, representing the margin function as a classifier yields saturated value functions that exhibit discontinuous jumps. We prove that the value function’s Lipschitz constant scales linearly with the margin function’s Lipschitz constant, revealing that smooth CBFs require smooth margins. Second, reinforcement learning (RL) approximations trained solely on safety policy data yield inaccurate value estimates for nominal policy actions, precisely where CBF filtering needs them. We propose the LatentCBF, which addresses both challenges through gradient penalties that lead to smooth margin functions without additional labeling, and a value-training procedure that mixes data from both the nominal and safety policy distributions. Experiments on simulated benchmarks and hardware with a vision-based manipulation policy demonstrate that LatentCBF enables smooth safety filtering while doubling the success rate over prior switching methods.
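To make the contrast between switching and smooth filtering concrete, here is a minimal toy sketch. It is not the paper's method. It assumes a hand-written 1-D system x' = x + u*dt with a known barrier h(x) = 1 - x, whereas the paper learns its value function in a latent space. The names h, switching_filter, cbf_filter, rollout, and max_jump, and the values of eps and gamma, are all illustrative assumptions. The CBF-style filter enforces the discrete-time condition h(x') >= (1 - gamma) * h(x), which for this scalar system has a closed-form solution.

```python
# Toy illustration only: a hand-written barrier on a 1-D integrator,
# not a learned latent value function.

def h(x):
    """Assumed barrier/margin: positive inside the safe set x < 1."""
    return 1.0 - x

def switching_filter(x, u_nom, dt, eps=0.05):
    """Least-restrictive style: keep u_nom unless the next state gets too
    close to the boundary, then switch fully to a fallback action."""
    if h(x + u_nom * dt) > eps:
        return u_nom
    return -1.0  # hypothetical safety action: move away at full speed

def cbf_filter(x, u_nom, dt, gamma=0.2):
    """CBF style: the action closest to u_nom satisfying
    h(x + u*dt) >= (1 - gamma) * h(x), i.e. u <= gamma * h(x) / dt."""
    u_max = gamma * h(x) / dt
    # Clip to the actuator range [-1, 1]; while h(x) > 0 the bound u_max is
    # positive, so the lower clip never conflicts with the constraint.
    return max(-1.0, min(u_nom, u_max))

def rollout(filter_fn, x0=0.0, u_nom=1.0, dt=0.1, steps=40):
    """Apply a filter to a constant nominal command and record the result."""
    xs, us = [x0], []
    x = x0
    for _ in range(steps):
        u = filter_fn(x, u_nom, dt)
        x = x + u * dt
        xs.append(x)
        us.append(u)
    return xs, us

def max_jump(us):
    """Largest change in the applied action between consecutive steps."""
    return max(abs(b - a) for a, b in zip(us, us[1:]))

xs_sw, us_sw = rollout(switching_filter)
xs_cbf, us_cbf = rollout(cbf_filter)
print("switching: max action jump =", max_jump(us_sw))
print("cbf-style: max action jump =", max_jump(us_cbf))
```

In this toy, both filters keep the state inside the safe set. The switching filter chatters between the nominal and fallback actions near the boundary, while the CBF-style filter reduces the nominal action gradually as h(x) shrinks.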

Cite this Paper


BibTeX
@InProceedings{pmlr-v331-nakamura26a,
  title = {How to Train Your Latent Control Barrier Function: Smooth Safety Filtering Under Hard-to-Model Constraints},
  author = {Nakamura, Kensuke and Bishop, Arun L and Man, Steven and Johnson, Aaron M. and Manchester, Zachary and Bajcsy, Andrea},
  booktitle = {Proceedings of The 8th Annual Learning for Dynamics and Control Conference},
  pages = {145--168},
  year = {2026},
  editor = {Sukhatme, Gaurav and Lindemann, Lars and Tu, Stephen and Wierman, Adam and Atanasov, Nikolay},
  volume = {331},
  series = {Proceedings of Machine Learning Research},
  month = {17--19 Jun},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v331/main/assets/nakamura26a/nakamura26a.pdf},
  url = {https://proceedings.mlr.press/v331/nakamura26a.html},
  abstract = {Latent safety filters extend Hamilton-Jacobi (HJ) reachability to operate on latent state representations and dynamics learned directly from high-dimensional observations, enabling safe visuomotor control under hard-to-model constraints. However, existing methods implement “least-restrictive” filtering that discretely switches between nominal and safety policies, potentially undermining the task performance that makes modern visuomotor policies valuable. While reachability value functions can, in principle, be adapted to be control barrier functions (CBFs) for smooth optimization-based filtering, we theoretically and empirically show that current latent-space learning methods produce fundamentally incompatible value functions. We identify two sources of incompatibility. First, in HJ reachability, failures are encoded via a “margin function” in latent space, whose sign indicates whether or not a latent is in the constraint set. However, representing the margin function as a classifier yields saturated value functions that exhibit discontinuous jumps. We prove that the value function’s Lipschitz constant scales linearly with the margin function’s Lipschitz constant, revealing that smooth CBFs require smooth margins. Second, reinforcement learning (RL) approximations trained solely on safety policy data yield inaccurate value estimates for nominal policy actions, precisely where CBF filtering needs them. We propose the LatentCBF, which addresses both challenges through gradient penalties that lead to smooth margin functions without additional labeling, and a value-training procedure that mixes data from both the nominal and safety policy distributions. Experiments on simulated benchmarks and hardware with a vision-based manipulation policy demonstrate that LatentCBF enables smooth safety filtering while doubling the success rate over prior switching methods.}
}
Endnote
%0 Conference Paper
%T How to Train Your Latent Control Barrier Function: Smooth Safety Filtering Under Hard-to-Model Constraints
%A Kensuke Nakamura
%A Arun L Bishop
%A Steven Man
%A Aaron M. Johnson
%A Zachary Manchester
%A Andrea Bajcsy
%B Proceedings of The 8th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2026
%E Gaurav Sukhatme
%E Lars Lindemann
%E Stephen Tu
%E Adam Wierman
%E Nikolay Atanasov
%F pmlr-v331-nakamura26a
%I PMLR
%P 145--168
%U https://proceedings.mlr.press/v331/nakamura26a.html
%V 331
%X Latent safety filters extend Hamilton-Jacobi (HJ) reachability to operate on latent state representations and dynamics learned directly from high-dimensional observations, enabling safe visuomotor control under hard-to-model constraints. However, existing methods implement “least-restrictive” filtering that discretely switches between nominal and safety policies, potentially undermining the task performance that makes modern visuomotor policies valuable. While reachability value functions can, in principle, be adapted to be control barrier functions (CBFs) for smooth optimization-based filtering, we theoretically and empirically show that current latent-space learning methods produce fundamentally incompatible value functions. We identify two sources of incompatibility. First, in HJ reachability, failures are encoded via a “margin function” in latent space, whose sign indicates whether or not a latent is in the constraint set. However, representing the margin function as a classifier yields saturated value functions that exhibit discontinuous jumps. We prove that the value function’s Lipschitz constant scales linearly with the margin function’s Lipschitz constant, revealing that smooth CBFs require smooth margins. Second, reinforcement learning (RL) approximations trained solely on safety policy data yield inaccurate value estimates for nominal policy actions, precisely where CBF filtering needs them. We propose the LatentCBF, which addresses both challenges through gradient penalties that lead to smooth margin functions without additional labeling, and a value-training procedure that mixes data from both the nominal and safety policy distributions. Experiments on simulated benchmarks and hardware with a vision-based manipulation policy demonstrate that LatentCBF enables smooth safety filtering while doubling the success rate over prior switching methods.
APA
Nakamura, K., Bishop, A. L., Man, S., Johnson, A. M., Manchester, Z. & Bajcsy, A. (2026). How to Train Your Latent Control Barrier Function: Smooth Safety Filtering Under Hard-to-Model Constraints. Proceedings of The 8th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 331:145-168. Available from https://proceedings.mlr.press/v331/nakamura26a.html.
