Knowledge-Enriched Machine Learning for Tabular Data

Juyong Kim; Chandler Squires; Pradeep Ravikumar

Proceedings of the International Conference on Neuro-symbolic Systems, PMLR 288:260-292, 2025.

Abstract

In this paper, we introduce the general framework of knowledge-enriched machine learning for encoding and leveraging problem-specific deterministic knowledge, such as column descriptions in the tabular setting. We focus on supervised learning problems on tabular data and present a flexible encoding of such deterministic information in the form of concept kernels. We describe meta-algorithms which leverage this encoding and introduce KE-TALENT, a benchmarking suite adapted from TALENT to include concept kernels and metadata for each dataset. Experimental results on kernel-enriched versions of existing algorithms demonstrate improved performance, establishing baselines and grounding future research. Code is publicly available.

Cite this Paper

BibTeX

@InProceedings{pmlr-v288-kim25a,
  title = 	 {Knowledge-Enriched Machine Learning for Tabular Data},
  author =       {Kim, Juyong and Squires, Chandler and Ravikumar, Pradeep},
  booktitle = 	 {Proceedings of the International Conference on Neuro-symbolic Systems},
  pages = 	 {260--292},
  year = 	 {2025},
  editor = 	 {Pappas, George and Ravikumar, Pradeep and Seshia, Sanjit A.},
  volume = 	 {288},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {28--30 May},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v288/main/assets/kim25a/kim25a.pdf},
  url = 	 {https://proceedings.mlr.press/v288/kim25a.html},
  abstract = 	 {In this paper, we introduce the general framework of knowledge-enriched machine learning for encoding and leveraging problem-specific deterministic knowledge, such as column descriptions in the tabular setting. We focus on supervised learning problems on tabular data and present a flexible encoding of such deterministic information in the form of concept kernels. We describe meta-algorithms which leverage this encoding and introduce KE-TALENT, a benchmarking suite adapted from TALENT to include concept kernels and metadata for each dataset. Experimental results on kernel-enriched versions of existing algorithms demonstrate improved performance, establishing baselines and grounding future research. Code is publicly available.}
}

Endnote

%0 Conference Paper
%T Knowledge-Enriched Machine Learning for Tabular Data
%A Juyong Kim
%A Chandler Squires
%A Pradeep Ravikumar
%B Proceedings of the International Conference on Neuro-symbolic Systems
%C Proceedings of Machine Learning Research
%D 2025
%E George Pappas
%E Pradeep Ravikumar
%E Sanjit A. Seshia
%F pmlr-v288-kim25a
%I PMLR
%P 260--292
%U https://proceedings.mlr.press/v288/kim25a.html
%V 288
%X In this paper, we introduce the general framework of knowledge-enriched machine learning for encoding and leveraging problem-specific deterministic knowledge, such as column descriptions in the tabular setting. We focus on supervised learning problems on tabular data and present a flexible encoding of such deterministic information in the form of concept kernels. We describe meta-algorithms which leverage this encoding and introduce KE-TALENT, a benchmarking suite adapted from TALENT to include concept kernels and metadata for each dataset. Experimental results on kernel-enriched versions of existing algorithms demonstrate improved performance, establishing baselines and grounding future research. Code is publicly available.

APA

Kim, J., Squires, C., & Ravikumar, P. (2025). Knowledge-Enriched Machine Learning for Tabular Data. Proceedings of the International Conference on Neuro-symbolic Systems, in Proceedings of Machine Learning Research 288:260-292. Available from https://proceedings.mlr.press/v288/kim25a.html.
