Knowledge-Enriched Machine Learning for Tabular Data

Juyong Kim, Chandler Squires, Pradeep Ravikumar
Proceedings of the International Conference on Neuro-symbolic Systems, PMLR 288:260-292, 2025.

Abstract

In this paper, we introduce the general framework of knowledge-enriched machine learning for encoding and leveraging problem-specific deterministic knowledge, such as column descriptions in the tabular setting. We focus on supervised learning problems on tabular data and present a flexible encoding of such deterministic information in the form of concept kernels. We describe meta-algorithms which leverage this encoding and introduce KE-TALENT, a benchmarking suite adapted from TALENT to include concept kernels and metadata for each dataset. Experimental results on kernel-enriched versions of existing algorithms demonstrate improved performance, establishing baselines and grounding future research. Code is publicly available.

Cite this Paper


BibTeX
@InProceedings{pmlr-v288-kim25a,
  title = {Knowledge-Enriched Machine Learning for Tabular Data},
  author = {Kim, Juyong and Squires, Chandler and Ravikumar, Pradeep},
  booktitle = {Proceedings of the International Conference on Neuro-symbolic Systems},
  pages = {260--292},
  year = {2025},
  editor = {Pappas, George and Ravikumar, Pradeep and Seshia, Sanjit A.},
  volume = {288},
  series = {Proceedings of Machine Learning Research},
  month = {28--30 May},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v288/main/assets/kim25a/kim25a.pdf},
  url = {https://proceedings.mlr.press/v288/kim25a.html},
  abstract = {In this paper, we introduce the general framework of knowledge-enriched machine learning for encoding and leveraging problem-specific deterministic knowledge, such as column descriptions in the tabular setting. We focus on supervised learning problems on tabular data and present a flexible encoding of such deterministic information in the form of concept kernels. We describe meta-algorithms which leverage this encoding and introduce KE-TALENT, a benchmarking suite adapted from TALENT to include concept kernels and metadata for each dataset. Experimental results on kernel-enriched versions of existing algorithms demonstrate improved performance, establishing baselines and grounding future research. Code is publicly available.}
}
Endnote
%0 Conference Paper
%T Knowledge-Enriched Machine Learning for Tabular Data
%A Juyong Kim
%A Chandler Squires
%A Pradeep Ravikumar
%B Proceedings of the International Conference on Neuro-symbolic Systems
%C Proceedings of Machine Learning Research
%D 2025
%E George Pappas
%E Pradeep Ravikumar
%E Sanjit A. Seshia
%F pmlr-v288-kim25a
%I PMLR
%P 260--292
%U https://proceedings.mlr.press/v288/kim25a.html
%V 288
%X In this paper, we introduce the general framework of knowledge-enriched machine learning for encoding and leveraging problem-specific deterministic knowledge, such as column descriptions in the tabular setting. We focus on supervised learning problems on tabular data and present a flexible encoding of such deterministic information in the form of concept kernels. We describe meta-algorithms which leverage this encoding and introduce KE-TALENT, a benchmarking suite adapted from TALENT to include concept kernels and metadata for each dataset. Experimental results on kernel-enriched versions of existing algorithms demonstrate improved performance, establishing baselines and grounding future research. Code is publicly available.
APA
Kim, J., Squires, C., & Ravikumar, P. (2025). Knowledge-Enriched Machine Learning for Tabular Data. Proceedings of the International Conference on Neuro-symbolic Systems, in Proceedings of Machine Learning Research 288:260-292. Available from https://proceedings.mlr.press/v288/kim25a.html.
