FEAT-KD: Learning Concise Representations for Single and Multi-Target Regression via TabNet Knowledge Distillation
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:17378-17391, 2025.
Abstract
In this work, we propose a novel approach that combines the strengths of FEAT and TabNet through knowledge distillation (KD), which we term FEAT-KD. FEAT is an intrinsically interpretable machine learning (ML) algorithm that constructs a weighted linear combination of concisely represented features discovered via genetic programming optimization, a search process that can often be inefficient. FEAT-KD instead leverages TabNet's deep-learning-based optimization and feature selection mechanisms: it finds a weighted linear combination of concisely represented, symbolic features derived from piece-wise distillation of a trained TabNet model. We analyze FEAT-KD on regression tasks from two perspectives: (i) compared to TabNet, FEAT-KD significantly reduces model complexity while retaining competitive predictive performance, effectively converting a black-box deep learning model into a more interpretable white-box representation; and (ii) compared to FEAT, our method consistently achieves higher prediction accuracy, produces more compact models, and reduces the complexity of the learned symbolic expressions. In addition, we demonstrate that FEAT-KD readily supports multi-target regression, in which shared features contribute to the interpretability of the system. Our results suggest that FEAT-KD is a promising direction for interpretable ML, bridging the gap between deep learning's predictive power and the intrinsic transparency of symbolic models.
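For concreteness, the model family described above can be sketched as follows (notation is ours and introduced only for illustration; the symbols $\phi_j$, $\beta_{t,j}$, $m$, and $T$ are assumptions rather than the paper's own notation):

$$\hat{y}_t(\mathbf{x}) \;=\; \beta_{t,0} + \sum_{j=1}^{m} \beta_{t,j}\,\phi_j(\mathbf{x}), \qquad t = 1,\dots,T,$$

where each $\phi_j$ is a concise symbolic feature obtained by piece-wise distillation of the trained TabNet model, the $\beta_{t,j}$ are linear weights, and the same feature set $\{\phi_j\}_{j=1}^{m}$ is shared across all $T$ targets in the multi-target setting (single-target regression is the case $T = 1$).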