Learning state machines from data streams: A generic strategy and an improved heuristic

Robert Baumgartner, Sicco Verwer
Proceedings of 16th edition of the International Conference on Grammatical Inference, PMLR 217:117-141, 2023.

Abstract

State machines models are models that simulate the behavior of discrete event systems, capable of representing systems such as software systems, network interactions, and control systems, and have been researched extensively. The nature of most learning algorithms however is the assumption that all data be available at the begining of the algorithm, and little research has been done in learning state machines from streaming data. In this paper, we want to close this gap further by presenting a generic method for learning state machines from data streams, as well as a merge heuristic that uses sketches to account for incomplete prefix trees. We implement our approach in an open-source state merging library and compare it with existing methods. We show the effectiveness of our approach with respect to run-time, memory consumption, and quality of results on a well known open dataset.

Cite this Paper


BibTeX
@InProceedings{pmlr-v217-baumgartner23a, title = {Learning state machines from data streams: A generic strategy and an improved heuristic}, author = {Baumgartner, Robert and Verwer, Sicco}, booktitle = {Proceedings of 16th edition of the International Conference on Grammatical Inference}, pages = {117--141}, year = {2023}, editor = {Coste, Fran├žois and Ouardi, Faissal and Rabusseau, Guillaume}, volume = {217}, series = {Proceedings of Machine Learning Research}, month = {10--13 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v217/baumgartner23a/baumgartner23a.pdf}, url = {https://proceedings.mlr.press/v217/baumgartner23a.html}, abstract = {State machines models are models that simulate the behavior of discrete event systems, capable of representing systems such as software systems, network interactions, and control systems, and have been researched extensively. The nature of most learning algorithms however is the assumption that all data be available at the begining of the algorithm, and little research has been done in learning state machines from streaming data. In this paper, we want to close this gap further by presenting a generic method for learning state machines from data streams, as well as a merge heuristic that uses sketches to account for incomplete prefix trees. We implement our approach in an open-source state merging library and compare it with existing methods. We show the effectiveness of our approach with respect to run-time, memory consumption, and quality of results on a well known open dataset.} }
Endnote
%0 Conference Paper %T Learning state machines from data streams: A generic strategy and an improved heuristic %A Robert Baumgartner %A Sicco Verwer %B Proceedings of 16th edition of the International Conference on Grammatical Inference %C Proceedings of Machine Learning Research %D 2023 %E Fran├žois Coste %E Faissal Ouardi %E Guillaume Rabusseau %F pmlr-v217-baumgartner23a %I PMLR %P 117--141 %U https://proceedings.mlr.press/v217/baumgartner23a.html %V 217 %X State machines models are models that simulate the behavior of discrete event systems, capable of representing systems such as software systems, network interactions, and control systems, and have been researched extensively. The nature of most learning algorithms however is the assumption that all data be available at the begining of the algorithm, and little research has been done in learning state machines from streaming data. In this paper, we want to close this gap further by presenting a generic method for learning state machines from data streams, as well as a merge heuristic that uses sketches to account for incomplete prefix trees. We implement our approach in an open-source state merging library and compare it with existing methods. We show the effectiveness of our approach with respect to run-time, memory consumption, and quality of results on a well known open dataset.
APA
Baumgartner, R. & Verwer, S.. (2023). Learning state machines from data streams: A generic strategy and an improved heuristic. Proceedings of 16th edition of the International Conference on Grammatical Inference, in Proceedings of Machine Learning Research 217:117-141 Available from https://proceedings.mlr.press/v217/baumgartner23a.html.

Related Material