Robust Interpretation of Neural Network Models
Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, PMLR R1:255-262, 1997.
Artificial neural networks seem very promising for regression and classification, especially for large covariate spaces. These methods represent a non-linear function as a composition of low-dimensional ridge functions and therefore appear to be less sensitive to the dimensionality of the covariate space. However, because the global minimum is not unique and there may be many local minima, the model revealed by the network is unstable. We introduce a method for interpreting neural network results that uses novel robustification techniques, yielding a robust interpretation of the model employed by the network. Simulated data from known models are used to demonstrate the interpretability results and the effects of different regularization methods on the robustness of the model. Graphical methods are introduced to present the interpretation results. We further demonstrate how interactions between covariates can be revealed. From this study we conclude that the interpretation method works well, but that NN models may sometimes be misinterpreted, especially if the approximations to the true model are less robust.
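As an illustration of the ridge-function representation and of why the minimizing weights are not unique, the following is a minimal sketch and not the authors' method. It assumes a single hidden layer with tanh activations; the function name `ridge_network`, the layer sizes, and the random weights are hypothetical choices made for the example. It shows that permuting hidden units, or flipping the signs of a unit's weights, leaves the fitted function unchanged, so different weight vectors describe the same model.

```python
import numpy as np

def ridge_network(x, W, b, beta):
    """Single-hidden-layer network: sum_j beta_j * tanh(w_j . x + b_j).

    Each hidden unit is a ridge function: it varies only along the
    one-dimensional projection w_j . x of the covariates.
    (tanh is an assumed activation, chosen for illustration.)
    """
    return np.tanh(x @ W.T + b) @ beta

rng = np.random.default_rng(0)
n_hidden, n_inputs = 4, 10           # hypothetical sizes
W = rng.normal(size=(n_hidden, n_inputs))
b = rng.normal(size=n_hidden)
beta = rng.normal(size=n_hidden)
x = rng.normal(size=(5, n_inputs))   # five example covariate vectors

y = ridge_network(x, W, b, beta)

# Symmetry 1: reordering the hidden units does not change the output.
perm = rng.permutation(n_hidden)
y_perm = ridge_network(x, W[perm], b[perm], beta[perm])

# Symmetry 2: tanh is odd, so negating (w_j, b_j, beta_j) for any unit
# leaves that unit's contribution unchanged.
signs = np.array([-1.0, 1.0, -1.0, 1.0])
y_flip = ridge_network(x, W * signs[:, None], b * signs, beta * signs)

same_function = np.allclose(y, y_perm) and np.allclose(y, y_flip)
print("identical outputs under both symmetries:", same_function)
```

These exact symmetries are only the simplest source of non-uniqueness; the local minima mentioned above are a separate issue that this sketch does not address.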