Distributed Parallel Gradient Stacking (DPGS): Solving Whole Slide Image Stacking Challenge in Multi-Instance Learning

Boyuan Wu, Zefeng Wang, Xianwei Lin, Jiachun Xu, Jikai Yu, Zhou Shicheng, Hongda Chen, Lianxin Hu
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:67782-67792, 2025.

Abstract

Whole Slide Image (WSI) analysis is typically framed as a Multiple Instance Learning (MIL) problem, but existing methods struggle with non-stackable data: bags contain varying numbers of instances, so they cannot be batched into a single tensor, which degrades both performance and efficiency. We propose a Distributed Parallel Gradient Stacking (DPGS) framework with Deep Model-Gradient Compression (DMGC) to address this. DPGS enables lossless stacking of MIL data for the first time, while DMGC accelerates distributed training via joint gradient-model compression. Experiments on the Camelyon16 and TCGA-Lung datasets demonstrate up to 31$\times$ faster training, up to a 99.2% reduction in model communication size at convergence, and up to a 9.3% improvement in accuracy over the baseline. To our knowledge, this is the first work to solve the non-stackable data problem in MIL while improving both speed and accuracy.
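A note on the "non-stackable data" problem the abstract refers to (an illustration only, not the paper's method): in WSI-as-MIL, each slide is a bag of patch embeddings, and different slides yield different numbers of patches, so bags cannot be batched with an ordinary tensor stack. The minimal PyTorch sketch below, with hypothetical bag sizes and feature dimension, reproduces the failure and the common padding workaround whose overhead lossless stacking is meant to avoid:

```python
import torch

# Hypothetical bags: three WSIs yielding different numbers of
# 512-dimensional patch embeddings (sizes are illustrative).
bags = [torch.randn(n, 512) for n in (3000, 4500, 1200)]

# Naive batching fails because the first dimensions differ:
# torch.stack(bags)  # RuntimeError: stack expects each tensor to be equal size

# Common workaround: zero-pad every bag to the longest one.
# This yields a (3, 4500, 512) tensor, spending memory and compute
# on padding instances that carry no information.
padded = torch.nn.utils.rnn.pad_sequence(bags, batch_first=True)
print(padded.shape)  # torch.Size([3, 4500, 512])
```

Per the abstract, DPGS instead stacks MIL data losslessly, avoiding this padding overhead; the mechanism itself is described in the paper.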

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wu25ae,
  title     = {Distributed Parallel Gradient {S}tacking ({DPGS}): Solving Whole Slide Image Stacking Challenge in Multi-Instance Learning},
  author    = {Wu, Boyuan and Wang, Zefeng and Lin, Xianwei and Xu, Jiachun and Yu, Jikai and Shicheng, Zhou and Chen, Hongda and Hu, Lianxin},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {67782--67792},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wu25ae/wu25ae.pdf},
  url       = {https://proceedings.mlr.press/v267/wu25ae.html},
  abstract  = {Whole Slide Image (WSI) analysis is framed as a Multiple Instance Learning (MIL) problem, but existing methods struggle with non-stackable data due to inconsistent instance lengths, which degrades performance and efficiency. We propose a Distributed Parallel Gradient Stacking (DPGS) framework with Deep Model-Gradient Compression (DMGC) to address this. DPGS enables lossless MIL data stacking for the first time, while DMGC accelerates distributed training via joint gradient-model compression. Experiments on Camelyon16 and TCGA-Lung datasets demonstrate up to 31$\times$ faster training, up to a 99.2% reduction in model communication size at convergence, and up to a 9.3% improvement in accuracy compared to the baseline. To our knowledge, this is the first work to solve non-stackable data in MIL while improving both speed and accuracy.}
}
Endnote
%0 Conference Paper
%T Distributed Parallel Gradient Stacking (DPGS): Solving Whole Slide Image Stacking Challenge in Multi-Instance Learning
%A Boyuan Wu
%A Zefeng Wang
%A Xianwei Lin
%A Jiachun Xu
%A Jikai Yu
%A Zhou Shicheng
%A Hongda Chen
%A Lianxin Hu
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wu25ae
%I PMLR
%P 67782--67792
%U https://proceedings.mlr.press/v267/wu25ae.html
%V 267
%X Whole Slide Image (WSI) analysis is framed as a Multiple Instance Learning (MIL) problem, but existing methods struggle with non-stackable data due to inconsistent instance lengths, which degrades performance and efficiency. We propose a Distributed Parallel Gradient Stacking (DPGS) framework with Deep Model-Gradient Compression (DMGC) to address this. DPGS enables lossless MIL data stacking for the first time, while DMGC accelerates distributed training via joint gradient-model compression. Experiments on Camelyon16 and TCGA-Lung datasets demonstrate up to 31$\times$ faster training, up to a 99.2% reduction in model communication size at convergence, and up to a 9.3% improvement in accuracy compared to the baseline. To our knowledge, this is the first work to solve non-stackable data in MIL while improving both speed and accuracy.
APA
Wu, B., Wang, Z., Lin, X., Xu, J., Yu, J., Shicheng, Z., Chen, H. & Hu, L. (2025). Distributed Parallel Gradient Stacking (DPGS): Solving Whole Slide Image Stacking Challenge in Multi-Instance Learning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:67782-67792. Available from https://proceedings.mlr.press/v267/wu25ae.html.
