Benchmarking Time Series Foundation Models on their Accuracy and Energy Consumption

Loïc Guibert; Benjamin Pasquier; Frédéric Montet; Beat Wolf

Proceedings of the Fourth Swiss AI Days, PMLR 309:46-55, 2026.

Abstract

Our study presents a benchmark of ten time-series foundation models to quantify their accuracy–energy trade-off in zero-shot forecasting. Using an in-house and a public dataset (School and MeteoSwiss; univariate and multivariate variants), a fixed sliding-window protocol (context 512, horizon 64), and dual energy instrumentation (external PDU and CodeCarbon), we report sMAPE and NMAE accuracy metrics alongside runtime, energy ($Wh$), and Energy per Billion Parameters. Results show pronounced dataset dependence in accuracy, while efficiency is primarily architecture-driven: Chronos-Bolt achieves consistently low energy and latency, TimesFM attains the best MeteoSwiss accuracy at low energy cost, and Moirai-MoE exhibits substantially higher energy expenditure for comparable errors. This work informs decision-makers, developers, and end-users about the energy requirements of time-series foundation models and highlights the importance of considering energy alongside accuracy when evaluating models for adoption, while encouraging the systematic reporting of accuracy–energy trade-offs.
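The evaluation protocol summarised in the abstract can be illustrated with a minimal sketch. Only the context length (512) and horizon (64) come from the abstract. The window stride, the exact sMAPE and NMAE variants, and the definition of Energy per Billion Parameters as energy divided by the parameter count in billions are assumptions made for illustration, and all function names are hypothetical rather than taken from the paper.

```python
CONTEXT = 512  # context length stated in the abstract
HORIZON = 64   # forecast horizon stated in the abstract


def sliding_windows(series, context=CONTEXT, horizon=HORIZON, stride=HORIZON):
    """Yield (context, target) pairs; the stride is an assumption."""
    last_start = len(series) - context - horizon
    for start in range(0, last_start + 1, stride):
        split = start + context
        yield series[start:split], series[split:split + horizon]


def smape(y_true, y_pred):
    """Symmetric MAPE in percent (one common variant, assumed here)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2.0
        if denom > 0:  # a point where both values are zero counts as zero error
            total += abs(t - p) / denom
    return 100.0 * total / len(y_true)


def nmae(y_true, y_pred):
    """MAE normalised by the mean absolute target (one common variant, assumed here)."""
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    scale = sum(abs(t) for t in y_true) / len(y_true)
    return mae / scale


def energy_per_billion_params(energy_wh, n_params):
    """Energy in Wh divided by parameter count in billions (assumed definition)."""
    return energy_wh / (n_params / 1e9)
```

In such a setup, each model would forecast the 64-step target from the 512-step context of every window, and the error metrics would be averaged over windows while energy is measured over the whole run.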

Cite this Paper

BibTeX

@InProceedings{pmlr-v309-guibert26a,
  title = 	 {Benchmarking Time Series Foundation Models on their Accuracy and Energy Consumption},
  author =       {Guibert, Lo{\"i}c and Pasquier, Benjamin and Montet, Fr{\'e}d{\'e}ric and Wolf, Beat},
  booktitle = 	 {Proceedings of the Fourth Swiss AI Days},
  pages = 	 {46--55},
  year = 	 {2026},
  editor = 	 {Kucharavy, Andrei and Delgado, Pamela and Sch{\"u}rch Todeschini, Val{\'e}rie and Rumley, S{\'e}bastien},
  volume = 	 {309},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--25 Mar},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v309/main/assets/guibert26a/guibert26a.pdf},
  url = 	 {https://proceedings.mlr.press/v309/guibert26a.html},
  abstract = 	 {Our study presents a benchmark of ten time-series foundation models to quantify their accuracy–energy trade-off in zero-shot forecasting. Using an in-house and a public dataset (School and MeteoSwiss; univariate and multivariate variants), a fixed sliding-window protocol (context 512, horizon 64), and dual energy instrumentation (external PDU and CodeCarbon), we report sMAPE and NMAE accuracy metrics alongside runtime, energy ($Wh$), and Energy per Billion Parameters. Results show pronounced dataset dependence in accuracy, while efficiency is primarily architecture-driven: Chronos-Bolt achieves consistently low energy and latency, TimesFM attains the best MeteoSwiss accuracy at low energy cost, and Moirai-MoE exhibits substantially higher energy expenditure for comparable errors. This work informs decision-makers, developers, and end-users about the energy requirements of time-series foundation models and highlights the importance of considering energy alongside accuracy when evaluating models for adoption, while encouraging the systematic reporting of accuracy–energy trade-offs.}
}

Endnote

%0 Conference Paper
%T Benchmarking Time Series Foundation Models on their Accuracy and Energy Consumption
%A Loïc Guibert
%A Benjamin Pasquier
%A Frédéric Montet
%A Beat Wolf
%B Proceedings of the Fourth Swiss AI Days
%C Proceedings of Machine Learning Research
%D 2026
%E Andrei Kucharavy
%E Pamela Delgado
%E Valérie Schürch Todeschini
%E Sébastien Rumley
%F pmlr-v309-guibert26a
%I PMLR
%P 46--55
%U https://proceedings.mlr.press/v309/guibert26a.html
%V 309
%X Our study presents a benchmark of ten time-series foundation models to quantify their accuracy–energy trade-off in zero-shot forecasting. Using an in-house and a public dataset (School and MeteoSwiss; univariate and multivariate variants), a fixed sliding-window protocol (context 512, horizon 64), and dual energy instrumentation (external PDU and CodeCarbon), we report sMAPE and NMAE accuracy metrics alongside runtime, energy ($Wh$), and Energy per Billion Parameters. Results show pronounced dataset dependence in accuracy, while efficiency is primarily architecture-driven: Chronos-Bolt achieves consistently low energy and latency, TimesFM attains the best MeteoSwiss accuracy at low energy cost, and Moirai-MoE exhibits substantially higher energy expenditure for comparable errors. This work informs decision-makers, developers, and end-users about the energy requirements of time-series foundation models and highlights the importance of considering energy alongside accuracy when evaluating models for adoption, while encouraging the systematic reporting of accuracy–energy trade-offs.

APA

Guibert, L., Pasquier, B., Montet, F. & Wolf, B. (2026). Benchmarking Time Series Foundation Models on their Accuracy and Energy Consumption. Proceedings of the Fourth Swiss AI Days, in Proceedings of Machine Learning Research 309:46-55. Available from https://proceedings.mlr.press/v309/guibert26a.html.
