Knowledge-Enriched Machine Learning for Tabular Data
Proceedings of the International Conference on Neuro-symbolic Systems, PMLR 288:260-292, 2025.
Abstract
In this paper, we introduce the general framework of knowledge-enriched machine learning for encoding and leveraging problem-specific deterministic knowledge, such as column descriptions in the tabular setting. We focus on supervised learning problems on tabular data and present a flexible encoding of such deterministic information in the form of concept kernels. We describe meta-algorithms which leverage this encoding and introduce KE-TALENT, a benchmarking suite adapted from TALENT to include concept kernels and metadata for each dataset. Experimental results on kernel-enriched versions of existing algorithms demonstrate improved performance, establishing baselines and grounding future research. Code is publicly available.
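To make the idea of encoding column descriptions as a kernel more concrete, the following is a minimal illustrative sketch and not the paper's definition of a concept kernel. It assumes that a kernel over columns can be built from bag-of-words cosine similarity between their textual descriptions, and that this column-level matrix can then weight a bilinear similarity between rows. The function names (`concept_kernel`, `sample_kernel`) and the example descriptions are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokenization into token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def concept_kernel(descriptions):
    """Hypothetical column-by-column similarity matrix built from descriptions.

    This is a Gram matrix of normalized count vectors, so it is symmetric
    and positive semi-definite.
    """
    vecs = [bag_of_words(d) for d in descriptions]
    return [[cosine(u, v) for v in vecs] for u in vecs]


def sample_kernel(x, y, K):
    """Bilinear similarity x^T K y between two rows, weighted by the column kernel."""
    n = len(x)
    return sum(x[i] * K[i][j] * y[j] for i in range(n) for j in range(n))


# Assumed example metadata: one description per column.
descriptions = [
    "systolic blood pressure in mmHg",
    "diastolic blood pressure in mmHg",
    "patient age in years",
]
K = concept_kernel(descriptions)

# Two standardized rows of the table (made-up values for illustration).
row_a = [1.0, 0.5, -0.2]
row_b = [0.8, 0.7, 0.1]
similarity = sample_kernel(row_a, row_b, K)
```

In this sketch the two blood-pressure columns receive a higher similarity than either does with the age column, so rows that agree on related columns are treated as closer. A matrix of `sample_kernel` values over all training rows could in principle be passed to any learner that accepts a precomputed kernel; how the paper's meta-algorithms actually consume concept kernels is not specified in the abstract.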