Understanding the Limits of Deep Tabular Methods with Temporal Shift

Haorun Cai, Han-Jia Ye
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:6366-6386, 2025.

Abstract

Deep tabular models have demonstrated remarkable success on i.i.d. data, excelling in a variety of structured data tasks. However, their performance often deteriorates under temporal distribution shifts, where trends and periodic patterns are present in the evolving data distribution over time. In this paper, we explore the underlying reasons for this failure in capturing temporal dependencies. We begin by investigating the training protocol, revealing a key issue in how the data is split for model training and validation. While existing approaches typically use temporal ordering for splitting, we show that even a random split significantly improves model performance. By reducing both training lag and validation bias to achieve better generalization, our proposed splitting protocol offers substantial improvements across a variety of methods. Furthermore, we analyze how temporal data affects deep tabular representations, uncovering that these models often fail to capture crucial periodic and trend information. To address this gap, we introduce a plug-and-play temporal embedding based on Fourier series expansion to learn and incorporate temporal patterns, offering an adaptive approach to handle temporal shifts. Our experiments demonstrate that this temporal embedding, combined with the improved splitting strategy, provides a more effective and robust framework for learning from temporal tabular data.
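To make the idea of a Fourier-based temporal embedding concrete, here is a minimal, hypothetical sketch (not the authors' implementation): each timestamp is mapped to sine/cosine features at a few harmonics of an assumed base period, producing an embedding that a downstream tabular model can concatenate with its other features to learn periodic structure. The function name, `num_harmonics`, and `period` are illustrative choices, not taken from the paper.

```python
import numpy as np

def fourier_temporal_embedding(t, num_harmonics=4, period=1.0):
    """Map scalar timestamps to Fourier features.

    For each timestamp t, returns
    [sin(2*pi*k*t/period), ..., cos(2*pi*k*t/period), ...] for k = 1..num_harmonics,
    a fixed-size vector a tabular model can weight to capture periodicity.
    """
    t = np.asarray(t, dtype=float).reshape(-1, 1)   # shape (n, 1)
    k = np.arange(1, num_harmonics + 1)             # harmonics 1..K, shape (K,)
    angles = 2.0 * np.pi * k * t / period           # broadcast to (n, K)
    # Concatenate sine and cosine parts -> (n, 2K)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

# Example: embed three timestamps (t measured in units of one period)
emb = fourier_temporal_embedding([0.0, 0.25, 0.5], num_harmonics=2, period=1.0)
print(emb.shape)  # (3, 4)
```

In practice such an embedding would be concatenated with the tabular features before the model's first layer; the paper's version is learned and plug-and-play, whereas this sketch uses fixed frequencies for clarity.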

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-cai25j,
  title = {Understanding the Limits of Deep Tabular Methods with Temporal Shift},
  author = {Cai, Haorun and Ye, Han-Jia},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {6366--6386},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/cai25j/cai25j.pdf},
  url = {https://proceedings.mlr.press/v267/cai25j.html},
  abstract = {Deep tabular models have demonstrated remarkable success on i.i.d. data, excelling in a variety of structured data tasks. However, their performance often deteriorates under temporal distribution shifts, where trends and periodic patterns are present in the evolving data distribution over time. In this paper, we explore the underlying reasons for this failure in capturing temporal dependencies. We begin by investigating the training protocol, revealing a key issue in how the data is split for model training and validation. While existing approaches typically use temporal ordering for splitting, we show that even a random split significantly improves model performance. By reducing both training lag and validation bias to achieve better generalization, our proposed splitting protocol offers substantial improvements across a variety of methods. Furthermore, we analyze how temporal data affects deep tabular representations, uncovering that these models often fail to capture crucial periodic and trend information. To address this gap, we introduce a plug-and-play temporal embedding based on Fourier series expansion to learn and incorporate temporal patterns, offering an adaptive approach to handle temporal shifts. Our experiments demonstrate that this temporal embedding, combined with the improved splitting strategy, provides a more effective and robust framework for learning from temporal tabular data.}
}
Endnote
%0 Conference Paper
%T Understanding the Limits of Deep Tabular Methods with Temporal Shift
%A Haorun Cai
%A Han-Jia Ye
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-cai25j
%I PMLR
%P 6366--6386
%U https://proceedings.mlr.press/v267/cai25j.html
%V 267
%X Deep tabular models have demonstrated remarkable success on i.i.d. data, excelling in a variety of structured data tasks. However, their performance often deteriorates under temporal distribution shifts, where trends and periodic patterns are present in the evolving data distribution over time. In this paper, we explore the underlying reasons for this failure in capturing temporal dependencies. We begin by investigating the training protocol, revealing a key issue in how the data is split for model training and validation. While existing approaches typically use temporal ordering for splitting, we show that even a random split significantly improves model performance. By reducing both training lag and validation bias to achieve better generalization, our proposed splitting protocol offers substantial improvements across a variety of methods. Furthermore, we analyze how temporal data affects deep tabular representations, uncovering that these models often fail to capture crucial periodic and trend information. To address this gap, we introduce a plug-and-play temporal embedding based on Fourier series expansion to learn and incorporate temporal patterns, offering an adaptive approach to handle temporal shifts. Our experiments demonstrate that this temporal embedding, combined with the improved splitting strategy, provides a more effective and robust framework for learning from temporal tabular data.
APA
Cai, H. & Ye, H. (2025). Understanding the Limits of Deep Tabular Methods with Temporal Shift. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:6366-6386. Available from https://proceedings.mlr.press/v267/cai25j.html.
