Fast State Discovery for HMM Model Selection and Learning

Sajid M. Siddiqi, Geoffrey J. Gordon, Andrew W. Moore
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, PMLR 2:492-499, 2007.

Abstract

Choosing the number of hidden states and their topology (model selection) and estimating model parameters (learning) are important problems for Hidden Markov Models. This paper presents a new state-splitting algorithm that addresses both these problems. The algorithm models more information about the dynamic context of a state during a split, enabling it to discover underlying states more effectively. Compared to previous top-down methods, the algorithm also touches a smaller fraction of the data per split, leading to faster model search and selection. Because of its efficiency and ability to avoid local minima, the state-splitting approach is a good way to learn HMMs even if the desired number of states is known beforehand. We compare our approach to previous work on synthetic data as well as several real-world data sets from the literature, revealing significant improvements in efficiency and test-set likelihoods. We also compare to previous algorithms on a sign-language recognition task, with positive results.
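The abstract describes a top-down search that grows the model by splitting states and keeping only splits that improve a model-selection score. The sketch below is *not* the paper's algorithm; it is a minimal, hypothetical illustration of the generic split-and-score loop on 1-D Gaussian observations, using hard assignments in place of full Baum-Welch re-estimation and an equal-weight mixture likelihood with a BIC-style penalty. It ignores HMM transition structure entirely; all function names are invented for this example.

```python
import math
import random

def log_gauss(x, mu, var):
    """Log-density of one observation under N(mu, var)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

def fit_state(xs, fallback):
    """ML mean/variance for the observations assigned to one state."""
    if not xs:
        return fallback
    mu = sum(xs) / len(xs)
    var = max(sum((x - mu) ** 2 for x in xs) / len(xs), 1e-3)
    return (mu, var)

def hard_assign(data, states):
    """Hard E-step: send each observation to its best-fitting state."""
    buckets = [[] for _ in states]
    for x in data:
        k = max(range(len(states)), key=lambda j: log_gauss(x, *states[j]))
        buckets[k].append(x)
    return buckets

def score(data, states):
    """Equal-weight mixture log-likelihood minus a BIC-style penalty."""
    ll = 0.0
    logw = -math.log(len(states))
    for x in data:
        terms = [logw + log_gauss(x, *s) for s in states]
        m = max(terms)
        ll += m + math.log(sum(math.exp(t - m) for t in terms))
    # 2 free parameters (mean, variance) per state; transitions ignored
    return ll - 0.5 * (2 * len(states)) * math.log(len(data))

def split_search(data, max_states=8):
    """Grow the model one state at a time, keeping only the best
    candidate split and only while the penalized score improves."""
    states = [fit_state(data, (0.0, 1.0))]
    best = score(data, states)
    while len(states) < max_states:
        improved = None
        for k, (mu, var) in enumerate(states):
            sd = math.sqrt(var)
            # split state k into two offset copies, then re-estimate locally
            cand = states[:k] + [(mu - sd, var), (mu + sd, var)] + states[k + 1:]
            buckets = hard_assign(data, cand)
            cand = [fit_state(b, s) for b, s in zip(buckets, cand)]
            s = score(data, cand)
            if s > best:
                improved, best = cand, s
        if improved is None:
            break  # no split improves the score: model selection stops
        states = improved
    return states
```

On data drawn from two well-separated Gaussians, the loop accepts the first split and rejects further ones, so the search recovers two states without being told the state count in advance, which is the model-selection behavior the abstract describes.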

Cite this Paper


BibTeX
@InProceedings{pmlr-v2-siddiqi07a,
  title = {Fast State Discovery for HMM Model Selection and Learning},
  author = {Sajid M. Siddiqi and Geoffrey J. Gordon and Andrew W. Moore},
  booktitle = {Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics},
  pages = {492--499},
  year = {2007},
  editor = {Marina Meila and Xiaotong Shen},
  volume = {2},
  series = {Proceedings of Machine Learning Research},
  address = {San Juan, Puerto Rico},
  month = {21--24 Mar},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v2/siddiqi07a/siddiqi07a.pdf},
  url = {http://proceedings.mlr.press/v2/siddiqi07a.html},
  abstract = {Choosing the number of hidden states and their topology (model selection) and estimating model parameters (learning) are important problems for Hidden Markov Models. This paper presents a new state-splitting algorithm that addresses both these problems. The algorithm models more information about the dynamic context of a state during a split, enabling it to discover underlying states more effectively. Compared to previous top-down methods, the algorithm also touches a smaller fraction of the data per split, leading to faster model search and selection. Because of its efficiency and ability to avoid local minima, the state-splitting approach is a good way to learn HMMs even if the desired number of states is known beforehand. We compare our approach to previous work on synthetic data as well as several real-world data sets from the literature, revealing significant improvements in efficiency and test-set likelihoods. We also compare to previous algorithms on a sign-language recognition task, with positive results.}
}
Endnote
%0 Conference Paper
%T Fast State Discovery for HMM Model Selection and Learning
%A Sajid M. Siddiqi
%A Geoffrey J. Gordon
%A Andrew W. Moore
%B Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2007
%E Marina Meila
%E Xiaotong Shen
%F pmlr-v2-siddiqi07a
%I PMLR
%J Proceedings of Machine Learning Research
%P 492--499
%U http://proceedings.mlr.press
%V 2
%W PMLR
%X Choosing the number of hidden states and their topology (model selection) and estimating model parameters (learning) are important problems for Hidden Markov Models. This paper presents a new state-splitting algorithm that addresses both these problems. The algorithm models more information about the dynamic context of a state during a split, enabling it to discover underlying states more effectively. Compared to previous top-down methods, the algorithm also touches a smaller fraction of the data per split, leading to faster model search and selection. Because of its efficiency and ability to avoid local minima, the state-splitting approach is a good way to learn HMMs even if the desired number of states is known beforehand. We compare our approach to previous work on synthetic data as well as several real-world data sets from the literature, revealing significant improvements in efficiency and test-set likelihoods. We also compare to previous algorithms on a sign-language recognition task, with positive results.
RIS
TY - CPAPER
TI - Fast State Discovery for HMM Model Selection and Learning
AU - Sajid M. Siddiqi
AU - Geoffrey J. Gordon
AU - Andrew W. Moore
BT - Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
PY - 2007/03/11
DA - 2007/03/11
ED - Marina Meila
ED - Xiaotong Shen
ID - pmlr-v2-siddiqi07a
PB - PMLR
SP - 492
DP - PMLR
EP - 499
L1 - http://proceedings.mlr.press/v2/siddiqi07a/siddiqi07a.pdf
UR - http://proceedings.mlr.press/v2/siddiqi07a.html
AB - Choosing the number of hidden states and their topology (model selection) and estimating model parameters (learning) are important problems for Hidden Markov Models. This paper presents a new state-splitting algorithm that addresses both these problems. The algorithm models more information about the dynamic context of a state during a split, enabling it to discover underlying states more effectively. Compared to previous top-down methods, the algorithm also touches a smaller fraction of the data per split, leading to faster model search and selection. Because of its efficiency and ability to avoid local minima, the state-splitting approach is a good way to learn HMMs even if the desired number of states is known beforehand. We compare our approach to previous work on synthetic data as well as several real-world data sets from the literature, revealing significant improvements in efficiency and test-set likelihoods. We also compare to previous algorithms on a sign-language recognition task, with positive results.
ER -
APA
Siddiqi, S.M., Gordon, G.J. & Moore, A.W. (2007). Fast State Discovery for HMM Model Selection and Learning. Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, in PMLR 2:492-499