Benchmarking Time Series Foundation Models on their Accuracy and Energy Consumption

Loïc Guibert, Benjamin Pasquier, Frédéric Montet, Beat Wolf
Proceedings of the Fourth Swiss AI Days, PMLR 309:46-55, 2026.

Abstract

Our study presents a benchmark of ten time-series foundation models to quantify their accuracy–energy trade-off in zero-shot forecasting. Using an in-house and a public dataset (School and MeteoSwiss; univariate and multivariate variants), a fixed sliding-window protocol (context 512, horizon 64), and dual energy instrumentation (external PDU and CodeCarbon), we report sMAPE and NMAE accuracy metrics alongside runtime, energy ($Wh$), and Energy per Billion Parameters. Results show pronounced dataset dependence in accuracy, while efficiency is primarily architecture-driven: Chronos-Bolt achieves consistently low energy and latency, TimesFM attains the best MeteoSwiss accuracy at low energy cost, and Moirai-MoE exhibits substantially higher energy expenditure for comparable errors. This work informs decision-makers, developers, and end-users about the energy requirements of time-series foundation models and highlights the importance of considering energy alongside accuracy when evaluating models for adoption, while encouraging the systematic reporting of accuracy–energy trade-offs.
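The evaluation quantities named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: only the context length (512) and horizon (64) come from the abstract. The window stride, the exact sMAPE and NMAE formulas (several variants exist), and the reading of Energy per Billion Parameters as energy divided by parameter count in billions are assumptions, and all function names are hypothetical.

```python
import numpy as np

CONTEXT = 512  # context length stated in the abstract
HORIZON = 64   # forecast horizon stated in the abstract


def sliding_windows(series, context=CONTEXT, horizon=HORIZON, stride=HORIZON):
    """Yield (context, target) pairs. The stride is an assumption."""
    series = np.asarray(series, dtype=float)
    last_start = len(series) - context - horizon
    for start in range(0, last_start + 1, stride):
        split = start + context
        yield series[start:split], series[split:split + horizon]


def smape(y_true, y_pred):
    """Symmetric MAPE in percent (one common variant, assumed here)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = np.abs(y_true - y_pred)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    # Points where both values are zero contribute zero error.
    ratio = np.divide(diff, denom, out=np.zeros_like(diff), where=denom > 0)
    return 100.0 * ratio.mean()


def nmae(y_true, y_pred):
    """MAE normalised by the mean absolute target (assumed normalisation)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.abs(y_true - y_pred).mean() / np.abs(y_true).mean()


def energy_per_billion_params(energy_wh, n_params):
    """Energy in Wh divided by parameter count in billions (assumed definition)."""
    return energy_wh / (n_params / 1e9)
```

Under these assumptions, a zero-shot model would be called once per context window, and the per-window errors would be averaged before being set against the measured energy.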

Cite this Paper


BibTeX
@InProceedings{pmlr-v309-guibert26a,
  title = {Benchmarking Time Series Foundation Models on their Accuracy and Energy Consumption},
  author = {Guibert, Lo{\"i}c and Pasquier, Benjamin and Montet, Fr{\'e}d{\'e}ric and Wolf, Beat},
  booktitle = {Proceedings of the Fourth Swiss AI Days},
  pages = {46--55},
  year = {2026},
  editor = {Kucharavy, Andrei and Delgado, Pamela and Schürch Todeschini, Valérie and Rumley, Sébastien},
  volume = {309},
  series = {Proceedings of Machine Learning Research},
  month = {23--25 Mar},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v309/main/assets/guibert26a/guibert26a.pdf},
  url = {https://proceedings.mlr.press/v309/guibert26a.html},
  abstract = {Our study presents a benchmark of ten time-series foundation models to quantify their accuracy–energy trade-off in zero-shot forecasting. Using an in-house and a public dataset (School and MeteoSwiss; univariate and multivariate variants), a fixed sliding-window protocol (context 512, horizon 64), and dual energy instrumentation (external PDU and CodeCarbon), we report sMAPE and NMAE accuracy metrics alongside runtime, energy ($Wh$), and Energy per Billion Parameters. Results show pronounced dataset dependence in accuracy, while efficiency is primarily architecture-driven: Chronos-Bolt achieves consistently low energy and latency, TimesFM attains the best MeteoSwiss accuracy at low energy cost, and Moirai-MoE exhibits substantially higher energy expenditure for comparable errors. This work informs decision-makers, developers, and end-users about the energy requirements of time-series foundation models and highlights the importance of considering energy alongside accuracy when evaluating models for adoption, while encouraging the systematic reporting of accuracy–energy trade-offs.}
}
Endnote
%0 Conference Paper
%T Benchmarking Time Series Foundation Models on their Accuracy and Energy Consumption
%A Loïc Guibert
%A Benjamin Pasquier
%A Frédéric Montet
%A Beat Wolf
%B Proceedings of the Fourth Swiss AI Days
%C Proceedings of Machine Learning Research
%D 2026
%E Andrei Kucharavy
%E Pamela Delgado
%E Valérie Schürch Todeschini
%E Sébastien Rumley
%F pmlr-v309-guibert26a
%I PMLR
%P 46--55
%U https://proceedings.mlr.press/v309/guibert26a.html
%V 309
%X Our study presents a benchmark of ten time-series foundation models to quantify their accuracy–energy trade-off in zero-shot forecasting. Using an in-house and a public dataset (School and MeteoSwiss; univariate and multivariate variants), a fixed sliding-window protocol (context 512, horizon 64), and dual energy instrumentation (external PDU and CodeCarbon), we report sMAPE and NMAE accuracy metrics alongside runtime, energy ($Wh$), and Energy per Billion Parameters. Results show pronounced dataset dependence in accuracy, while efficiency is primarily architecture-driven: Chronos-Bolt achieves consistently low energy and latency, TimesFM attains the best MeteoSwiss accuracy at low energy cost, and Moirai-MoE exhibits substantially higher energy expenditure for comparable errors. This work informs decision-makers, developers, and end-users about the energy requirements of time-series foundation models and highlights the importance of considering energy alongside accuracy when evaluating models for adoption, while encouraging the systematic reporting of accuracy–energy trade-offs.
APA
Guibert, L., Pasquier, B., Montet, F., & Wolf, B. (2026). Benchmarking Time Series Foundation Models on their Accuracy and Energy Consumption. Proceedings of the Fourth Swiss AI Days, in Proceedings of Machine Learning Research 309:46-55. Available from https://proceedings.mlr.press/v309/guibert26a.html.
