Fast Estimation of Partial Dependence Functions using Trees

Jinyang Liu, Tessa Steensgaard, Marvin N. Wright, Niklas Pfister, Munir Hiabu
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:39496-39534, 2025.

Abstract

Many existing interpretation methods are based on Partial Dependence (PD) functions that, for a pre-trained machine learning model, capture how a subset of the features affects the predictions by averaging over the remaining features. Notable methods include Shapley additive explanations (SHAP), which compute feature contributions based on a game-theoretic interpretation, and PD plots (i.e., 1-dimensional PD functions), which capture average marginal main effects. Recent work has connected these approaches using a functional decomposition and argues that SHAP values can be misleading, since they merge main and interaction effects into a single local effect. However, a major advantage of SHAP compared to other PD-based interpretations has been the availability of fast estimation techniques, such as TreeSHAP. In this paper, we propose a new tree-based estimator, FastPD, which efficiently estimates arbitrary PD functions. We show that FastPD consistently estimates the desired population quantity, in contrast to path-dependent TreeSHAP, which is inconsistent when features are correlated. For moderately deep trees, FastPD improves the complexity of existing methods from quadratic to linear in the number of observations. By estimating PD functions for arbitrary feature subsets, FastPD can be used to extract PD-based interpretations such as SHAP values, PD plots and higher-order interaction effects.
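For context, the PD function for a feature subset S evaluated at x_S is the expectation E[f(x_S, X_C)] of the model prediction over the complementary features C. The minimal NumPy sketch below (illustrative names, not the paper's FastPD algorithm) shows the generic brute-force estimator that replaces this expectation with an empirical average over a background sample; its O(n) cost per evaluation point is what makes naive PD curves quadratic in the number of observations, the cost FastPD reduces to linear for moderately deep trees.

    import numpy as np

    def pd_estimate(model, X_background, feature_subset, x_S):
        """Brute-force estimate of the PD function PD_S(x_S) = E[f(x_S, X_C)].

        The columns in `feature_subset` are fixed to the values `x_S`,
        the remaining columns keep their observed values, and the model
        predictions are averaged over the background data.
        """
        X_mod = X_background.copy()
        X_mod[:, feature_subset] = x_S          # overwrite the S-columns
        return model.predict(X_mod).mean()      # average over X_C

    # Example: 1-dim PD curve for feature 0 of some fitted model `model`
    # grid = np.quantile(X[:, 0], np.linspace(0.05, 0.95, 20))
    # pd_curve = [pd_estimate(model, X, [0], g) for g in grid]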

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-liu25bm,
  title = {Fast Estimation of Partial Dependence Functions using Trees},
  author = {Liu, Jinyang and Steensgaard, Tessa and Wright, Marvin N. and Pfister, Niklas and Hiabu, Munir},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {39496--39534},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/liu25bm/liu25bm.pdf},
  url = {https://proceedings.mlr.press/v267/liu25bm.html},
  abstract = {Many existing interpretation methods are based on Partial Dependence (PD) functions that, for a pre-trained machine learning model, capture how a subset of the features affects the predictions by averaging over the remaining features. Notable methods include Shapley additive explanations (SHAP) which computes feature contributions based on a game theoretical interpretation and PD plots (i.e., 1-dim PD functions) that capture average marginal main effects. Recent work has connected these approaches using a functional decomposition and argues that SHAP values can be misleading since they merge main and interaction effects into a single local effect. However, a major advantage of SHAP compared to other PD-based interpretations has been the availability of fast estimation techniques, such as TreeSHAP. In this paper, we propose a new tree-based estimator, FastPD, which efficiently estimates arbitrary PD functions. We show that FastPD consistently estimates the desired population quantity – in contrast to path-dependent TreeSHAP which is inconsistent when features are correlated. For moderately deep trees, FastPD improves the complexity of existing methods from quadratic to linear in the number of observations. By estimating PD functions for arbitrary feature subsets, FastPD can be used to extract PD-based interpretations such as SHAP, PD plots and higher-order interaction effects.}
}
Endnote
%0 Conference Paper
%T Fast Estimation of Partial Dependence Functions using Trees
%A Jinyang Liu
%A Tessa Steensgaard
%A Marvin N. Wright
%A Niklas Pfister
%A Munir Hiabu
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-liu25bm
%I PMLR
%P 39496--39534
%U https://proceedings.mlr.press/v267/liu25bm.html
%V 267
%X Many existing interpretation methods are based on Partial Dependence (PD) functions that, for a pre-trained machine learning model, capture how a subset of the features affects the predictions by averaging over the remaining features. Notable methods include Shapley additive explanations (SHAP) which computes feature contributions based on a game theoretical interpretation and PD plots (i.e., 1-dim PD functions) that capture average marginal main effects. Recent work has connected these approaches using a functional decomposition and argues that SHAP values can be misleading since they merge main and interaction effects into a single local effect. However, a major advantage of SHAP compared to other PD-based interpretations has been the availability of fast estimation techniques, such as TreeSHAP. In this paper, we propose a new tree-based estimator, FastPD, which efficiently estimates arbitrary PD functions. We show that FastPD consistently estimates the desired population quantity – in contrast to path-dependent TreeSHAP which is inconsistent when features are correlated. For moderately deep trees, FastPD improves the complexity of existing methods from quadratic to linear in the number of observations. By estimating PD functions for arbitrary feature subsets, FastPD can be used to extract PD-based interpretations such as SHAP, PD plots and higher-order interaction effects.
APA
Liu, J., Steensgaard, T., Wright, M.N., Pfister, N. & Hiabu, M. (2025). Fast Estimation of Partial Dependence Functions using Trees. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:39496-39534. Available from https://proceedings.mlr.press/v267/liu25bm.html.
