Proceedings of Machine Learning ResearchProceedings of The 13th International Conference on Grammatical Inference
Held in Delft, The Netherlands on 05-07 October 2016
Published as Volume 57 by the Proceedings of Machine Learning Research on 16 January 2017.
Volume Edited by:
Sicco Verwer
Menno van Zaanen
Rick Smetsers
Series Editors:
Neil D. Lawrence
Mark Reid
http://proceedings.mlr.press/v57/
Sun, 12 Mar 2017 18:18:18 +0000Sun, 12 Mar 2017 18:18:18 +0000Jekyll v3.4.1Sequence Prediction Using Neural Network ClassiersBeing able to guess the next element of a sequence is an important question in many fields. In this paper we present our approaches used in the Sequence Prediction ChallengE (SPiCe), whose goal is to compare the different approaches to that problem on the same datasets. We model sequence prediction as a classification problem and adapt three different neural network models to tackle it. The experimental results show that our neural network based approaches produce better overall performance than the baseline approaches provided in the competition. In the actual competition, we won the second place using these approaches.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/zhao16.html
http://proceedings.mlr.press/v57/zhao16.htmlModel Selection of Sequence Prediction Algorithms by CompressionThis paper describes estimating performance of sequence prediction algorithms and hyperparameters by compressing the training dataset itself with the probablities predicted by the trained model. With such estimation we can automate the selection and tuning process of learning algorithms. Spectral learning algorithm are experimented with.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/xi16.html
http://proceedings.mlr.press/v57/xi16.htmlInternational Conference on Grammatical Inference 2016: PrefaceMon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/verwer16.html
http://proceedings.mlr.press/v57/verwer16.htmlUsing Model Theory for Grammatical Inference: a Case Study from PhonologyThis paper examines the characterization and learning of grammars defined by conjunctions of negative and positive literals (CNPL) where the literals correspond to structures in an enriched model theory of strings. CNPL logic represents an intermediate between conjunctions of negative literals (CNL) and a propositional-style logic, both of which have been well-studied in terms of the language classes they describe. Model-theoretic approaches to formal language theory have traditionally assumed that each position in a string belongs to exactly one unary relation. Using enriched models (which do no satisfy this assumption) presents a new avenue for investigation with potential applications in several fields including linguistics, planning and control, and molecular biology. We demonstrate the value of such structures and CNPL logic with a particular learning problem in phonology.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/strother-garcia16.html
http://proceedings.mlr.press/v57/strother-garcia16.htmlThe Generalized Smallest Grammar ProblemThe Smallest Grammar Problem – the problem of finding the smallest context-free grammar that generates exactly one given sequence – has never been successfully applied to grammatical inference. We investigate the reasons and propose an extended formulation that seeks to minimize non-recursive grammars, instead of straight-line programs. In addition, we provide very efficient algorithms that approximate the minimization problem of this class of grammars. Our empirical evaluation shows that we are able to find smaller models than the current best approximations to the Smallest Grammar Problem on standard benchmarks, and that the inferred rules capture much better the syntactic structure of natural language.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/siyari16.html
http://proceedings.mlr.press/v57/siyari16.htmlPredicting Sequential Data with LSTMs Augmented with Strictly 2-Piecewise Input VectorsRecurrent neural networks such as Long-Short Term Memory (LSTM) are often used to learn from various kinds of time-series data, especially those that involved long-distance dependencies. We introduce a vector representation for the Strictly 2-Piecewise (SP-2) formal languages, which encode certain kinds of long-distance dependencies using subsequences. These vectors are added to the LSTM architecture as an additional input. Through experiments with the problems in the SPiCe dataset, we demonstrate that for certain problems, these vectors slightly—but significantly—improve the top-5 score (normalized discounted cumulative gain) as well as the accuracy as compared to the LSTM architecture without the SP-2 input vector. These results are also compared to an LSTM architecture with an input vector based on bigrams.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/shibata16.html
http://proceedings.mlr.press/v57/shibata16.htmlEvaluation of Machine Learning Methods on SPiCeIn this paper, we introduce methods that we used to solve problems from the sequence prediction competition called SPiCe. We train a model from sequences in train data on each problem, and then predict a next symbol following each sequence in test data. We implement several methods to solve these problems. The experiment results show that XGBoost and neural network approaches have good performance overall.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/sato16.html
http://proceedings.mlr.press/v57/sato16.htmlLearning Deterministic Finite Automata from Infinite AlphabetsWe proposes an algorithm to learn automata infinite alphabets, or at least too large to enumerate. We apply it to define a generic model intended for regression, with transitions constrained by intervals over the alphabet. The algorithm is based on the Red & Blue framework for learning from an input sample. We show two small case studies where the alphabets are respectively the natural and real numbers, and show how nice properties of automata models like interpretability and graphical representation transfer to regression where typical models are hard to interpret.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/pellegrino16.html
http://proceedings.mlr.press/v57/pellegrino16.htmlA Spectral Method that Worked Well in the SPiCe’16 CompetitionWe present methods used in our submission to the Sequence Prediction ChallengE (SPiCe’16). The two methods used to solve the competition tasks were spectral learning and a count based method. Spectral learning led to better results on most of the problems.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/liza16.html
http://proceedings.mlr.press/v57/liza16.htmlFlexible State-Merging for Learning (P)DFAs in PythonWe present a Python package for learning (non-)probabilistic deterministic finite state automata and provide heuristics in the red-blue framework. As our package is built along the API of the popular scikit-learn package, it is easy to use and new learning methods are easy to add. It provides PDFA learning as an additional tool for sequence prediction or classification to data scientists, without the need to understand the algorithm itself but rather the limitations of PDFA as a model. With applications of automata learning in diverse fields such as network traffic analysis, software engineering and biology, a stratified package opens opportunities for practitioners.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/hammerschmidt16.html
http://proceedings.mlr.press/v57/hammerschmidt16.htmlInferring Non-resettable Mealy Machines with $n$ StatesAutomata inference algorithms are used to extract behavioural models of software components. However, when the software system cannot be reset, inference must be done from a single trace. This paper proposes an active learning algorithm that can infer a Mealy model under the assumption that the number of the states of the machine is known and that a characterization set for it is provided. This algorithm improves on a previous paper that used a looser assumption on the number of states. The complexity is polynomial in the number of states of the Mealy machine.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/groz16.html
http://proceedings.mlr.press/v57/groz16.htmlOnline Grammar Compression for Frequent Pattern DiscoveryVarious grammar compression algorithms have been proposed in the last decade. A grammar compression is a restricted CFG deriving the string deterministically. An efficient grammar compression develops a smaller CFG by finding duplicated patterns and removing them. This process is just a frequent pattern discovery by grammatical inference. While we can get any frequent pattern in linear time using a preprocessed string, a huge working space is required for longer patterns, and the whole string must be loaded into the memory preliminarily. We propose an online algorithm approximating this problem within a compressed space. The main contribution is an improvement of the previously best known approximation ratio Ω(\frac1\lg^2m) to Ω(\frac1\lg^*N\lg m) where m is the length of an optimal pattern in a string of length N and \lg^* is the iteration of the logarithm base 2. For a sufficiently large N, \lg^*N is practically constant. The experimental results show that our algorithm extracts nearly optimal patterns and achieves a significant improvement in memory consumption compared to the offline algorithm.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/fukunaga16.html
http://proceedings.mlr.press/v57/fukunaga16.htmlQuery Learning Automata with Helpful LabelsIn the active learning framework, a modified query learning algorithm benefiting by a nontrivial helpful labeling is able to learn automata with a reduced number of queries. In extremis, there exists a helpful labeling allowing the algorithm to learn automata even without counterexamples. We also review the correction queries defining them as particular types of labeling. We introduce minimal corrections, maximal corrections, and random corrections. An experimental approach compares the performance and limitations of various types of queries and corrections. The results show that algorithms using corrections require fewer queries in most of the cases.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/dediu16.html
http://proceedings.mlr.press/v57/dediu16.htmlTesting Distributional Properties of Context-Free GrammarsRecent algorithms for distributional learning of context-free grammars can learn all languages defined by grammars that have certain distributional properties: the finite kernel property (FKP) and the finite context property (FCP). In this paper we present some algorithms for approximately determining whether a given grammar has one of these properties. We then present the results of some experiments that indicate that with randomly generated context-free grammars in Chomsky normal form, which generate infinite languages and are derivationally sparse, nearly all grammars have the finite kernel property, whereas the finite context property is much less common.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/clark16.html
http://proceedings.mlr.press/v57/clark16.htmlLearning Top-Down Tree Transducers with Regular Domain InspectionWe study the problem of how to learn tree transformations on a given regular tree domain from a finite sample of input-output examples. We assume that the target tree transformation can be defined by a deterministic top-down tree transducer with regular domain inspection (DTOPIreg). An RPNI style learning algorithm that solves this problem in polynomial time and with polynomially many examples was presented at Pods’2010, but restricted to the case of path-closed regular domains. In this paper, we show that this restriction can be removed. For this, we present a new normal form for DTOPIreg by extending the Myhill-Nerode theorem for DTOP to regular domain inspections in a nontrivial manner. The RPNI style learning algorithm can also be lifted but becomes more involved too.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/boiret16.html
http://proceedings.mlr.press/v57/boiret16.htmlSimple K-star Categorial Dependency Grammars and their InferenceWe propose a novel subclass in the family of Categorial Dependency Grammars (CDG), based on a syntactic criterion on categorial types associated to words in the lexicon and study its learnability. This proposal relies on a linguistic principle and relates to a former non-constructive condition on iterated dependencies. We show that the projective CDG in this subclass are incrementally learnable in the limit from dependency structures. In contrast to previous proposals, our criterion is both syntactic and does not impose a (rigidity) bound on the number of categorial types associated to a word.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/bechet16.html
http://proceedings.mlr.press/v57/bechet16.htmlResults of the Sequence PredIction ChallengE (SPiCe): a Competition on Learning the Next Symbol in a SequenceThe Sequence PredIction ChallengE (SPiCe) is an on-line competition that took place between March and July 2016. Each of the 15 problems was made of a set of whole sequences as training sample, a validation set of prefixes, and a test set of prefixes. The aim was to submit a ranking of the 5 most probable symbols to be the next symbol of each prefix.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/balle16.html
http://proceedings.mlr.press/v57/balle16.htmlSp2Learn: A Toolbox for the Spectral Learning of Weighted AutomataSp2Learn is a Python toolbox for the spectral learning of weighted automata from a set of strings, licensed under Free BSD. This paper gives the main formal ideas behind the spectral learning algorithm and details the content of the toolbox. Use cases and an experimental section are also provided.Mon, 16 Jan 2017 00:00:00 +0000
http://proceedings.mlr.press/v57/arrivault16.html
http://proceedings.mlr.press/v57/arrivault16.html