Learning Syntactic Monoids from Samples by extending known Algorithms for learning State Machines

Simon Dieck, Sicco Verwer
Proceedings of 16th edition of the International Conference on Grammatical Inference, PMLR 217:59-79, 2023.

Abstract

For the inference of regular languages, most current methods learn a version of deterministic finite automata. Syntactic monoids are an alternative representation of regular languages, which have some advantages over automata. For example, traces can be parsed starting from any index and the star-freeness of the language they represent can be checked in polynomial time. But, to date, there existed no passive learning algorithm for syntactic monoids. In this paper, we prove that known state-merging algorithms for learning deterministic finite automata can be instrumented to learn syntactic monoids instead, by using as the input a special structure proposed in this paper: the interfix-graph. Further, we introduce a method to encode frequencies on the interfix-graph, such that models can also be learned from only positive traces. We implemented this structure and performed experiments with both traditional data and data containing only positive traces. As such this work answers basic theoretical and experimental questions regarding a novel passive learning algorithm for syntactic monoids.

Cite this Paper


BibTeX
@InProceedings{pmlr-v217-dieck23a, title = {Learning Syntactic Monoids from Samples by extending known Algorithms for learning State Machines}, author = {Dieck, Simon and Verwer, Sicco}, booktitle = {Proceedings of 16th edition of the International Conference on Grammatical Inference}, pages = {59--79}, year = {2023}, editor = {Coste, Fran├žois and Ouardi, Faissal and Rabusseau, Guillaume}, volume = {217}, series = {Proceedings of Machine Learning Research}, month = {10--13 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v217/dieck23a/dieck23a.pdf}, url = {https://proceedings.mlr.press/v217/dieck23a.html}, abstract = {For the inference of regular languages, most current methods learn a version of deterministic finite automata. Syntactic monoids are an alternative representation of regular languages, which have some advantages over automata. For example, traces can be parsed starting from any index and the star-freeness of the language they represent can be checked in polynomial time. But, to date, there existed no passive learning algorithm for syntactic monoids. In this paper, we prove that known state-merging algorithms for learning deterministic finite automata can be instrumented to learn syntactic monoids instead, by using as the input a special structure proposed in this paper: the interfix-graph. Further, we introduce a method to encode frequencies on the interfix-graph, such that models can also be learned from only positive traces. We implemented this structure and performed experiments with both traditional data and data containing only positive traces. As such this work answers basic theoretical and experimental questions regarding a novel passive learning algorithm for syntactic monoids.} }
Endnote
%0 Conference Paper %T Learning Syntactic Monoids from Samples by extending known Algorithms for learning State Machines %A Simon Dieck %A Sicco Verwer %B Proceedings of 16th edition of the International Conference on Grammatical Inference %C Proceedings of Machine Learning Research %D 2023 %E Fran├žois Coste %E Faissal Ouardi %E Guillaume Rabusseau %F pmlr-v217-dieck23a %I PMLR %P 59--79 %U https://proceedings.mlr.press/v217/dieck23a.html %V 217 %X For the inference of regular languages, most current methods learn a version of deterministic finite automata. Syntactic monoids are an alternative representation of regular languages, which have some advantages over automata. For example, traces can be parsed starting from any index and the star-freeness of the language they represent can be checked in polynomial time. But, to date, there existed no passive learning algorithm for syntactic monoids. In this paper, we prove that known state-merging algorithms for learning deterministic finite automata can be instrumented to learn syntactic monoids instead, by using as the input a special structure proposed in this paper: the interfix-graph. Further, we introduce a method to encode frequencies on the interfix-graph, such that models can also be learned from only positive traces. We implemented this structure and performed experiments with both traditional data and data containing only positive traces. As such this work answers basic theoretical and experimental questions regarding a novel passive learning algorithm for syntactic monoids.
APA
Dieck, S. & Verwer, S.. (2023). Learning Syntactic Monoids from Samples by extending known Algorithms for learning State Machines. Proceedings of 16th edition of the International Conference on Grammatical Inference, in Proceedings of Machine Learning Research 217:59-79 Available from https://proceedings.mlr.press/v217/dieck23a.html.

Related Material