Gaining Free or Low-Cost Interpretability with Interpretable Partial Substitute
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6505-6514, 2019.
Abstract
This work addresses the situation where a black-box model with good predictive performance is chosen over its interpretable competitors, and shows that interpretability is still achievable in this case. Our solution is to find an interpretable substitute on the subset of data where the black-box model is overkill or nearly overkill, while leaving the rest to the black-box. This transparency is obtained at minimal or no cost to predictive performance. Under this framework, we develop a Hybrid Rule Sets (HyRS) model that uses decision rules to capture the subspace of data where the rules are as accurate, or almost as accurate, as the black-box. To train a HyRS, we devise an efficient search algorithm that iteratively finds the optimal model and exploits theoretically grounded strategies to reduce computation. Our framework is agnostic to the black-box during training. Experiments on structured and text data show that HyRS obtains an effective trade-off between transparency and predictive performance.
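The hybrid scheme the abstract describes can be sketched in a few lines: an interpretable rule set covers part of the input space, and any example no rule covers falls through to the black-box. This is a minimal illustration of the idea only, not the paper's implementation; the function and variable names are hypothetical.

```python
def hybrid_predict(x, rules, blackbox_predict):
    """Return (prediction, covered) for one example x.

    rules: list of (condition, label) pairs, where condition is a
           predicate on x; together they define the transparent subspace.
    blackbox_predict: opaque fallback model for uncovered examples.
    """
    for condition, label in rules:
        if condition(x):  # x lies in the rule-covered (transparent) region
            return label, True
    return blackbox_predict(x), False  # defer to the black-box


# Toy usage: one rule handles the "easy" cases; a stand-in black-box
# (here just a threshold) handles everything else.
rules = [(lambda x: x["score"] >= 0.9, 1)]
blackbox = lambda x: int(x["score"] > 0.5)

print(hybrid_predict({"score": 0.95}, rules, blackbox))  # rule fires
print(hybrid_predict({"score": 0.6}, rules, blackbox))   # black-box used
```

Transparency here is the fraction of inputs for which `covered` is true; training HyRS amounts to choosing the rules so that this fraction is large while overall accuracy stays close to the black-box's.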