Enforcing Constraints in RNA Secondary Structure Predictions: A Post-Processing Framework Based on the Assignment Problem

Geewon Suh, Gyeongjo Hwang, Seokjun Kang, Doojin Baek, Mingeun Kang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:46926-46941, 2024.

Abstract

RNA properties, such as function and stability, are intricately tied to their two-dimensional conformations. This has spurred the development of computational models for predicting RNA secondary structures, leveraging dynamic programming or machine learning (ML) techniques. These structures are governed by specific rules; for example, only Watson-Crick and wobble pairs are allowed, and sequences must not form sharp bends. Recent efforts introduced a systematic approach to post-process the predictions made by ML algorithms, aiming to modify them to respect the constraints. However, we still observe instances violating the requirements, significantly reducing biological relevance. To address this challenge, we present a novel post-processing framework for ML-based predictions on RNA secondary structures, inspired by the assignment problem in integer linear programming. Our algorithm offers a theoretical guarantee, ensuring that the resulting predictions adhere to the fundamental constraints of RNAs. Empirical evidence supports the efficacy of our approach, demonstrating improved predictive performance with no constraint violations, while requiring less running time.
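
As an illustration of the constraints named in the abstract, the following minimal sketch checks a predicted set of base pairs for violations. It is not the authors' post-processing algorithm; it only tests whether a structure is feasible. The function name `find_violations`, the 0-based pair representation, and the `min_loop=3` threshold for a "sharp bend" are assumptions made for this example, and the one-partner-per-base rule is a standard secondary-structure requirement that the abstract does not list explicitly.

```python
# Allowed canonical pairs: Watson-Crick (A-U, G-C) plus wobble (G-U).
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def find_violations(sequence, pairs, min_loop=3):
    """Return a list of (i, j, reason) for every pair that breaks a constraint.

    `pairs` holds 0-based index pairs (i, j). `min_loop` is an assumed
    threshold: at least this many unpaired bases must lie between i and j.
    """
    violations = []
    seen = set()  # positions already used by an earlier pair
    for i, j in pairs:
        i, j = min(i, j), max(i, j)
        if (sequence[i], sequence[j]) not in ALLOWED_PAIRS:
            violations.append((i, j, "non-canonical pair"))
        if j - i - 1 < min_loop:
            violations.append((i, j, "sharp bend"))
        if i in seen or j in seen:
            violations.append((i, j, "base paired more than once"))
        seen.update((i, j))
    return violations
```

For example, `find_violations("GGGAAAUCC", [(0, 8), (1, 7), (2, 6)])` returns an empty list, since every pair is canonical, spans at least three unpaired bases, and uses each position once.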

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-suh24a,
  title = {Enforcing Constraints in {RNA} Secondary Structure Predictions: A Post-Processing Framework Based on the Assignment Problem},
  author = {Suh, Geewon and Hwang, Gyeongjo and Kang, Seokjun and Baek, Doojin and Kang, Mingeun},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {46926--46941},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/suh24a/suh24a.pdf},
  url = {https://proceedings.mlr.press/v235/suh24a.html},
  abstract = {RNA properties, such as function and stability, are intricately tied to their two-dimensional conformations. This has spurred the development of computational models for predicting the RNA secondary structures, leveraging dynamic programming or machine learning (ML) techniques. These structures are governed by specific rules; for example, only Watson-Crick and Wobble pairs are allowed, and sequences must not form sharp bends. Recent efforts introduced a systematic approach to post-process the predictions made by ML algorithms, aiming to modify them to respect the constraints. However, we still observe instances violating the requirements, significantly reducing biological relevance. To address this challenge, we present a novel post-processing framework for ML-based predictions on RNA secondary structures, inspired by the assignment problem in integer linear programming. Our algorithm offers a theoretical guarantee, ensuring that the resulting predictions adhere to the fundamental constraints of RNAs. Empirical evidence supports the efficacy of our approach, demonstrating improved predictive performance with no constraint violation, while requiring less running time.}
}
Endnote
%0 Conference Paper
%T Enforcing Constraints in RNA Secondary Structure Predictions: A Post-Processing Framework Based on the Assignment Problem
%A Geewon Suh
%A Gyeongjo Hwang
%A Seokjun Kang
%A Doojin Baek
%A Mingeun Kang
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-suh24a
%I PMLR
%P 46926--46941
%U https://proceedings.mlr.press/v235/suh24a.html
%V 235
%X RNA properties, such as function and stability, are intricately tied to their two-dimensional conformations. This has spurred the development of computational models for predicting the RNA secondary structures, leveraging dynamic programming or machine learning (ML) techniques. These structures are governed by specific rules; for example, only Watson-Crick and Wobble pairs are allowed, and sequences must not form sharp bends. Recent efforts introduced a systematic approach to post-process the predictions made by ML algorithms, aiming to modify them to respect the constraints. However, we still observe instances violating the requirements, significantly reducing biological relevance. To address this challenge, we present a novel post-processing framework for ML-based predictions on RNA secondary structures, inspired by the assignment problem in integer linear programming. Our algorithm offers a theoretical guarantee, ensuring that the resulting predictions adhere to the fundamental constraints of RNAs. Empirical evidence supports the efficacy of our approach, demonstrating improved predictive performance with no constraint violation, while requiring less running time.
APA
Suh, G., Hwang, G., Kang, S., Baek, D. & Kang, M. (2024). Enforcing Constraints in RNA Secondary Structure Predictions: A Post-Processing Framework Based on the Assignment Problem. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:46926-46941. Available from https://proceedings.mlr.press/v235/suh24a.html.
