PeakSeg: constrained optimal segmentation and supervised penalty learning for peak detection in count data
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:324-332, 2015.
Peak detection is a central problem in genomic data analysis, and current algorithms for this task are unsupervised and mostly effective for a single data type and pattern (e.g. H3K4me3 data with a sharp peak pattern). We propose PeakSeg, a new constrained maximum likelihood segmentation model for peak detection with an efficient inference algorithm: constrained dynamic programming. We investigate unsupervised and supervised learning of penalties for the critical model selection problem. We show that the supervised method has state-of-the-art peak detection across all data sets in a benchmark that includes both sharp H3K4me3 and broad H3K36me3 patterns.