Score Matching with Missing Data

Josh Givens, Song Liu, Henry Reeve
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:19523-19561, 2025.

Abstract

Score matching is a vital tool for learning the distribution of data with applications across many areas including diffusion processes, energy based modelling, and graphical model estimation. Despite all these applications, little work explores its use when data is incomplete. We address this by adapting score matching (and its major extensions) to work with missing data in a flexible setting where data can be partially missing over any subset of the coordinates. We provide two separate score matching variations for general use, an importance weighting (IW) approach, and a variational approach. We provide finite sample bounds for our IW approach in finite domain settings and show it to have especially strong performance in small sample lower dimensional cases. Complementing this, we show our variational approach to be strongest in more complex high-dimensional settings which we demonstrate on graphical model estimation tasks on both real and simulated data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-givens25a, title = {Score Matching with Missing Data}, author = {Givens, Josh and Liu, Song and Reeve, Henry}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {19523--19561}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/givens25a/givens25a.pdf}, url = {https://proceedings.mlr.press/v267/givens25a.html}, abstract = {Score matching is a vital tool for learning the distribution of data with applications across many areas including diffusion processes, energy based modelling, and graphical model estimation. Despite all these applications, little work explores its use when data is incomplete. We address this by adapting score matching (and its major extensions) to work with missing data in a flexible setting where data can be partially missing over any subset of the coordinates. We provide two separate score matching variations for general use, an importance weighting (IW) approach, and a variational approach. We provide finite sample bounds for our IW approach in finite domain settings and show it to have especially strong performance in small sample lower dimensional cases. Complementing this, we show our variational approach to be strongest in more complex high-dimensional settings which we demonstrate on graphical model estimation tasks on both real and simulated data.} }
Endnote
%0 Conference Paper %T Score Matching with Missing Data %A Josh Givens %A Song Liu %A Henry Reeve %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-givens25a %I PMLR %P 19523--19561 %U https://proceedings.mlr.press/v267/givens25a.html %V 267 %X Score matching is a vital tool for learning the distribution of data with applications across many areas including diffusion processes, energy based modelling, and graphical model estimation. Despite all these applications, little work explores its use when data is incomplete. We address this by adapting score matching (and its major extensions) to work with missing data in a flexible setting where data can be partially missing over any subset of the coordinates. We provide two separate score matching variations for general use, an importance weighting (IW) approach, and a variational approach. We provide finite sample bounds for our IW approach in finite domain settings and show it to have especially strong performance in small sample lower dimensional cases. Complementing this, we show our variational approach to be strongest in more complex high-dimensional settings which we demonstrate on graphical model estimation tasks on both real and simulated data.
APA
Givens, J., Liu, S. & Reeve, H.. (2025). Score Matching with Missing Data. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:19523-19561 Available from https://proceedings.mlr.press/v267/givens25a.html.

Related Material