Certified Invertibility in Neural Networks via Mixed-Integer Programming
Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:483-496, 2023.
Neural networks are known to be vulnerable to adversarial attacks: small, imperceptible perturbations that can significantly alter the network’s output. Conversely, there may exist large, meaningful perturbations that do not affect the network’s decision (excessive invariance). We investigate this latter phenomenon in two contexts: (a) discrete-time dynamical system identification, and (b) the calibration of a neural network’s output to that of another network. We examine non-invertibility through the lens of mathematical optimization, where the global solution measures the “safety” of the network predictions by their distance from the non-invertibility boundary. We formulate mixed-integer programs (MIPs) for ReLU networks and $L_p$ norms ($p=1,2,\infty$) that apply to neural network approximators of dynamical systems. We also discuss how our findings can be useful for invertibility certification in transformations between neural networks, e.g. between different levels of network pruning.
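The abstract does not spell out the MIP formulation itself. As a rough illustration of how a ReLU unit is commonly made MIP-representable, the sketch below checks the standard big-M constraints for $y = \max(z, 0)$ with a binary activation indicator. This is the textbook encoding, not necessarily the one used in the paper; the function names and the pre-activation bounds `lower` and `upper` are hypothetical and are assumed to satisfy `lower < 0 < upper`.

```python
def relu(z):
    """Reference ReLU, used only to compare against the encoding."""
    return max(z, 0.0)


def relu_bigm_feasible(z, y, a, lower, upper, tol=1e-9):
    """Check the standard big-M constraints that encode y = max(z, 0).

    Assumes lower <= z <= upper with lower < 0 < upper (hypothetical bounds).
    `a` is the binary indicator: 1 if the unit is active, 0 otherwise.
    """
    return (
        a in (0, 1)
        and y >= -tol                       # y >= 0
        and y >= z - tol                    # y >= z
        and y <= upper * a + tol            # y <= u * a   (forces y = 0 when a = 0)
        and y <= z - lower * (1 - a) + tol  # y <= z - l * (1 - a)   (forces y = z when a = 1)
    )


if __name__ == "__main__":
    lower, upper = -2.0, 3.0
    for z in (-1.5, 0.0, 2.0):
        for a in (0, 1):
            print(z, a, relu_bigm_feasible(z, relu(z), a, lower, upper))
```

For any fixed `z` within the bounds, the only `y` that satisfies all four inequalities for some binary `a` is `max(z, 0)`, which is what lets a solver reason exactly about the piecewise-linear network rather than a relaxation of it.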