DeFusion: An Effective Decoupling Fusion Network for Multi-Modal Pregnancy Prediction

Xueqiang Ouyang, Jia Wei, Wenjie Huo, Xiaocong Wang, Rui Li, Jianlong Zhou
Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, PMLR 301:1273-1293, 2026.

Abstract

Temporal embryo images and parental fertility table indicators are both valuable for pregnancy prediction in in vitro fertilization embryo transfer (IVF-ET). However, current machine learning models cannot make full use of the complementary information between the two modalities to improve pregnancy prediction performance. In this paper, we propose a Decoupling Fusion Network called DeFusion to effectively integrate the multi-modal information for IVF-ET pregnancy prediction. Specifically, we propose a decoupling fusion module that decouples the information from the different modalities into related and unrelated information, thereby achieving a more delicate fusion. We also fuse temporal embryo images with a spatial-temporal position encoding, and extract fertility table indicator information with a table transformer. To evaluate the effectiveness of our model, we use a new dataset including 4046 cases collected from Southern Medical University. The experiments show that our model outperforms state-of-the-art methods. Meanwhile, the performance on the eye disease prediction dataset reflects the model's good generalization. Our code and dataset are available at https://github.com/Ou-Young-1999/DFNet.

Cite this Paper


BibTeX
@InProceedings{pmlr-v301-ouyang26a,
  title = {DeFusion: An Effective Decoupling Fusion Network for Multi-Modal Pregnancy Prediction},
  author = {Ouyang, Xueqiang and Wei, Jia and Huo, Wenjie and Wang, Xiaocong and Li, Rui and Zhou, Jianlong},
  booktitle = {Proceedings of The 8th International Conference on Medical Imaging with Deep Learning},
  pages = {1273--1293},
  year = {2026},
  editor = {Tasdizen, Tolga and Elhabian, Shireen and Summers, Ronald and Chen, Chen and Koch, Lisa and Zhuang, Yan},
  volume = {301},
  series = {Proceedings of Machine Learning Research},
  month = {09--11 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v301/main/assets/ouyang26a/ouyang26a.pdf},
  url = {https://proceedings.mlr.press/v301/ouyang26a.html},
  abstract = {Temporal embryo images and parental fertility table indicators are both valuable for pregnancy prediction in in vitro fertilization embryo transfer (IVF-ET). However, current machine learning models cannot make full use of the complementary information between the two modalities to improve pregnancy prediction performance. In this paper, we propose a Decoupling Fusion Network called DeFusion to effectively integrate the multi-modal information for IVF-ET pregnancy prediction. Specifically, we propose a decoupling fusion module that decouples the information from the different modalities into related and unrelated information, thereby achieving a more delicate fusion. And we fuse temporal embryo images with a spatial-temporal position encoding, and extract fertility table indicator information with a table transformer. To evaluate the effectiveness of our model, we use a new dataset including 4046 cases collected from Southern Medical University. The experiments show that our model outperforms state-of-the-art methods. Meanwhile, the performance on the eye disease prediction dataset reflects the model's good generalization. Our code and dataset are available at https://github.com/Ou-Young-1999/DFNet.}
}
Endnote
%0 Conference Paper
%T DeFusion: An Effective Decoupling Fusion Network for Multi-Modal Pregnancy Prediction
%A Xueqiang Ouyang
%A Jia Wei
%A Wenjie Huo
%A Xiaocong Wang
%A Rui Li
%A Jianlong Zhou
%B Proceedings of The 8th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Tolga Tasdizen
%E Shireen Elhabian
%E Ronald Summers
%E Chen Chen
%E Lisa Koch
%E Yan Zhuang
%F pmlr-v301-ouyang26a
%I PMLR
%P 1273--1293
%U https://proceedings.mlr.press/v301/ouyang26a.html
%V 301
%X Temporal embryo images and parental fertility table indicators are both valuable for pregnancy prediction in in vitro fertilization embryo transfer (IVF-ET). However, current machine learning models cannot make full use of the complementary information between the two modalities to improve pregnancy prediction performance. In this paper, we propose a Decoupling Fusion Network called DeFusion to effectively integrate the multi-modal information for IVF-ET pregnancy prediction. Specifically, we propose a decoupling fusion module that decouples the information from the different modalities into related and unrelated information, thereby achieving a more delicate fusion. And we fuse temporal embryo images with a spatial-temporal position encoding, and extract fertility table indicator information with a table transformer. To evaluate the effectiveness of our model, we use a new dataset including 4046 cases collected from Southern Medical University. The experiments show that our model outperforms state-of-the-art methods. Meanwhile, the performance on the eye disease prediction dataset reflects the model's good generalization. Our code and dataset are available at https://github.com/Ou-Young-1999/DFNet.
APA
Ouyang, X., Wei, J., Huo, W., Wang, X., Li, R., & Zhou, J. (2026). DeFusion: An Effective Decoupling Fusion Network for Multi-Modal Pregnancy Prediction. Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 301:1273-1293. Available from https://proceedings.mlr.press/v301/ouyang26a.html.

Related Material