Learning Extrapolative Sequence Transformations from Markov Chains

Sophia Hager, Aleem Khan, Andrew Wang, Nicholas Andrews
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:21570-21590, 2025.

Abstract

Most successful applications of deep learning involve similar training and test conditions. However, tasks such as biological sequence design involve searching for sequences that improve desirable properties beyond previously known values, which requires novel hypotheses that extrapolate beyond training data. In these settings, extrapolation may be achieved by using random search methods such as Markov chain Monte Carlo (MCMC), which, given an initial state, sample local transformations to approximate a target density that rewards states with the desired properties. However, even with a well-designed proposal, MCMC may struggle to explore large structured state spaces efficiently. Rather than relying on stochastic search, it would be desirable to have a model that greedily optimizes the properties of interest, successfully extrapolating in as few steps as possible. We propose to learn such a model from the Markov chains resulting from MCMC search. Specifically, our approach uses selected states from Markov chains as a source of training data for an autoregressive model, which is then able to efficiently generate novel sequences that extrapolate along the sequence-level properties of interest. The proposed approach is validated on three problems: protein sequence design, text sentiment control, and text anonymization. We find that the autoregressive model can extrapolate as well or better than MCMC, but with the additional benefits of scalability and significantly higher sample efficiency.
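
The abstract describes a two-stage recipe: run MCMC over local sequence edits toward a target density that rewards the property of interest, then use selected states from the resulting chains as supervision for an autoregressive model that performs the transformation directly. The short Python sketch below illustrates that pipeline on a toy property; the scorer, proposal, temperature, and selection margin are illustrative assumptions, not the authors' implementation.

import math
import random

def property_score(seq: str) -> float:
    # Toy sequence-level property: fraction of 'A' characters.
    # In practice this would be a learned or measured property scorer.
    return seq.count("A") / max(len(seq), 1)

def propose(seq: str, alphabet: str = "ACDE") -> str:
    # Local transformation: substitute one randomly chosen position.
    i = random.randrange(len(seq))
    return seq[:i] + random.choice(alphabet) + seq[i + 1:]

def mcmc_chain(seq: str, steps: int = 500, temperature: float = 0.05):
    # Metropolis-Hastings with a symmetric proposal, targeting a density
    # proportional to exp(score / temperature); returns all visited states.
    chain = [(seq, property_score(seq))]
    for _ in range(steps):
        candidate = propose(seq)
        delta = property_score(candidate) - property_score(seq)
        if math.log(random.random() + 1e-12) < delta / temperature:
            seq = candidate
        chain.append((seq, property_score(seq)))
    return chain

def select_training_pairs(start: str, chain, margin: float = 0.1):
    # Keep chain states that improve the property by at least `margin` over the
    # starting sequence; each (start, improved) pair becomes supervision for an
    # autoregressive editor (e.g., fine-tuning a seq2seq model on such pairs).
    base = property_score(start)
    return [(start, state) for state, score in chain if score >= base + margin]

if __name__ == "__main__":
    random.seed(0)
    start = "CDECDECDEC"
    chain = mcmc_chain(start)
    pairs = select_training_pairs(start, chain)
    best = max(score for _, score in chain)
    print(f"collected {len(pairs)} training pairs; best property score {best:.2f}")

Once trained on such pairs, the autoregressive model replaces the stochastic search at inference time, producing improved sequences in far fewer steps than a fresh MCMC run.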

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-hager25a,
  title     = {Learning Extrapolative Sequence Transformations from {M}arkov Chains},
  author    = {Hager, Sophia and Khan, Aleem and Wang, Andrew and Andrews, Nicholas},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {21570--21590},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/hager25a/hager25a.pdf},
  url       = {https://proceedings.mlr.press/v267/hager25a.html}
}
Endnote
%0 Conference Paper
%T Learning Extrapolative Sequence Transformations from Markov Chains
%A Sophia Hager
%A Aleem Khan
%A Andrew Wang
%A Nicholas Andrews
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-hager25a
%I PMLR
%P 21570--21590
%U https://proceedings.mlr.press/v267/hager25a.html
%V 267
APA
Hager, S., Khan, A., Wang, A. & Andrews, N. (2025). Learning Extrapolative Sequence Transformations from Markov Chains. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:21570-21590. Available from https://proceedings.mlr.press/v267/hager25a.html.
