Fast Min-$ε$ Segmented Regression using Constant-Time Segment Merging

Ansgar Lößer, Max Schlecht, Florian Schintke, Joel Witzke, Matthias Weidlich, Björn Scheuermann
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:40312-40327, 2025.

Abstract

Segmented regression is a statistical method that approximates a function $f$ by a piecewise function $\hat{f}$ using noisy data samples. Min-$\epsilon$ approaches aim to reduce the regression function’s mean squared error (MSE) for a given number of $k$ segments. An optimal solution for min-$\epsilon$ segmented regression is found in $\mathcal{O}(n^2)$ time (Bai & Perron, 1998; Yamamoto & Perron, 2013) for $n$ samples. For large datasets, current heuristics improve time complexity to $\mathcal{O}(n\log{n})$ (Acharya et al., 2016) but can result in large errors, especially when exactly $k$ segments are used. We present a method for min-$\epsilon$ segmented regression that combines the scalability of top existing heuristic solutions with a statistical efficiency similar to the optimal solution. This is achieved by using a new method to merge an initial set of segments using precomputed matrices from samples, allowing both merging and error calculation in constant time. Our approach, using the same samples and parameter $k$, produces segments with up to 1,000 times lower MSE compared to Acharya et al. (2016) in about 100 times less runtime on data sets over $10^4$ samples.
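The constant-time merging idea can be illustrated with a minimal sketch. The paper precomputes matrices per segment; for the special case of simple (one-variable) linear fits, the same effect is achievable with scalar sufficient statistics that are additive under merging, so both the merge and the merged segment's squared error cost O(1) regardless of segment length. The `Segment` class and its methods below are illustrative names, not the paper's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Sufficient statistics of the samples covered by one segment."""
    n: int      # number of samples
    sx: float   # sum of x
    sy: float   # sum of y
    sxx: float  # sum of x^2
    sxy: float  # sum of x*y
    syy: float  # sum of y^2

    @classmethod
    def from_points(cls, pts):
        # O(m) once per initial segment of m samples.
        xs = [x for x, _ in pts]
        ys = [y for _, y in pts]
        return cls(
            n=len(pts),
            sx=sum(xs), sy=sum(ys),
            sxx=sum(x * x for x in xs),
            sxy=sum(x * y for x, y in pts),
            syy=sum(y * y for y in ys),
        )

    def merge(self, other):
        # O(1): the statistics are additive, independent of segment sizes.
        return Segment(
            self.n + other.n,
            self.sx + other.sx, self.sy + other.sy,
            self.sxx + other.sxx, self.sxy + other.sxy,
            self.syy + other.syy,
        )

    def sse(self):
        # O(1): residual sum of squares of the best-fit line,
        # computed from centered second moments.
        cxx = self.sxx - self.sx * self.sx / self.n
        cxy = self.sxy - self.sx * self.sy / self.n
        cyy = self.syy - self.sy * self.sy / self.n
        if cxx == 0.0:  # all samples share one x value
            return cyy
        return cyy - cxy * cxy / cxx
```

A greedy merger can then repeatedly combine the adjacent segment pair whose merged `sse()` is smallest until exactly `k` segments remain, evaluating each candidate merge in constant time.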

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-losser25a,
  title     = {Fast Min-$ε$ Segmented Regression using Constant-Time Segment Merging},
  author    = {L\"{o}{\ss}er, Ansgar and Schlecht, Max and Schintke, Florian and Witzke, Joel and Weidlich, Matthias and Scheuermann, Bj\"{o}rn},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {40312--40327},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/losser25a/losser25a.pdf},
  url       = {https://proceedings.mlr.press/v267/losser25a.html},
  abstract  = {Segmented regression is a statistical method that approximates a function $f$ by a piecewise function $\hat{f}$ using noisy data samples. Min-$\epsilon$ approaches aim to reduce the regression function's mean squared error (MSE) for a given number of $k$ segments. An optimal solution for min-$\epsilon$ segmented regression is found in $\mathcal{O}(n^2)$ time (Bai & Perron, 1998; Yamamoto & Perron, 2013) for $n$ samples. For large datasets, current heuristics improve time complexity to $\mathcal{O}(n\log{n})$ (Acharya et al., 2016) but can result in large errors, especially when exactly $k$ segments are used. We present a method for min-$\epsilon$ segmented regression that combines the scalability of top existing heuristic solutions with a statistical efficiency similar to the optimal solution. This is achieved by using a new method to merge an initial set of segments using precomputed matrices from samples, allowing both merging and error calculation in constant time. Our approach, using the same samples and parameter $k$, produces segments with up to 1,000 times lower MSE compared to Acharya et al. (2016) in about 100 times less runtime on data sets over $10^4$ samples.}
}
Endnote
%0 Conference Paper
%T Fast Min-$ε$ Segmented Regression using Constant-Time Segment Merging
%A Ansgar Lößer
%A Max Schlecht
%A Florian Schintke
%A Joel Witzke
%A Matthias Weidlich
%A Björn Scheuermann
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-losser25a
%I PMLR
%P 40312--40327
%U https://proceedings.mlr.press/v267/losser25a.html
%V 267
%X Segmented regression is a statistical method that approximates a function $f$ by a piecewise function $\hat{f}$ using noisy data samples. Min-$\epsilon$ approaches aim to reduce the regression function's mean squared error (MSE) for a given number of $k$ segments. An optimal solution for min-$\epsilon$ segmented regression is found in $\mathcal{O}(n^2)$ time (Bai & Perron, 1998; Yamamoto & Perron, 2013) for $n$ samples. For large datasets, current heuristics improve time complexity to $\mathcal{O}(n\log{n})$ (Acharya et al., 2016) but can result in large errors, especially when exactly $k$ segments are used. We present a method for min-$\epsilon$ segmented regression that combines the scalability of top existing heuristic solutions with a statistical efficiency similar to the optimal solution. This is achieved by using a new method to merge an initial set of segments using precomputed matrices from samples, allowing both merging and error calculation in constant time. Our approach, using the same samples and parameter $k$, produces segments with up to 1,000 times lower MSE compared to Acharya et al. (2016) in about 100 times less runtime on data sets over $10^4$ samples.
APA
Lößer, A., Schlecht, M., Schintke, F., Witzke, J., Weidlich, M. & Scheuermann, B. (2025). Fast Min-$ε$ Segmented Regression using Constant-Time Segment Merging. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:40312-40327. Available from https://proceedings.mlr.press/v267/losser25a.html.