Completing Visual Objects via Bridging Generation and Segmentation

Xiang Li, Yinpeng Chen, Chung-Ching Lin, Hao Chen, Kai Hu, Rita Singh, Bhiksha Raj, Lijuan Wang, Zicheng Liu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:27531-27546, 2024.

Abstract

This paper presents a novel approach to object completion, with the primary goal of reconstructing a complete object from its partially visible components. Our method, named MaskComp, delineates the completion process through iterative stages of generation and segmentation. In each iteration, the object mask is provided as an additional condition to boost image generation, and, in return, the generated images can lead to a more accurate mask by fusing the segmentation of images. We demonstrate that the combination of one generation and one segmentation stage effectively functions as a mask denoiser. Through alternation between the generation and segmentation stages, the partial object mask is progressively refined, providing precise shape guidance and yielding superior object completion results. Our experiments demonstrate the superiority of MaskComp over existing approaches, e.g., ControlNet and Stable Diffusion, establishing it as an effective solution for object completion.
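The alternation described in the abstract can be sketched roughly as follows. This is only a minimal illustration, not the authors' implementation: `generate` and `segment` are hypothetical callables standing in for a mask-conditioned image generator and a segmentation model, `num_iters` and `num_samples` are illustrative parameters, and the per-pixel majority vote in `fuse_masks` is an assumed fusion rule, since the abstract does not say how the segmentations are fused.

```python
from typing import Callable, List, Tuple

Mask = List[List[bool]]    # H x W grid, True where the object is present
Image = List[List[float]]  # H x W grid of intensities (toy stand-in for an image)


def fuse_masks(masks: List[Mask], threshold: float = 0.5) -> Mask:
    """Fuse binary masks by per-pixel voting (an assumed rule, not from the paper)."""
    n = len(masks)
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [sum(m[r][c] for m in masks) / n >= threshold for c in range(width)]
        for r in range(height)
    ]


def iterative_completion(
    partial_image: Image,
    partial_mask: Mask,
    generate: Callable[[Image, Mask], Image],  # hypothetical mask-conditioned generator
    segment: Callable[[Image], Mask],          # hypothetical segmentation model
    num_iters: int = 3,
    num_samples: int = 4,
) -> Tuple[List[Image], Mask]:
    """Alternate generation and segmentation, refining the mask each round."""
    mask = [list(row) for row in partial_mask]
    images: List[Image] = []
    for _ in range(num_iters):
        # Generation stage: sample images conditioned on the current mask.
        images = [generate(partial_image, mask) for _ in range(num_samples)]
        # Segmentation stage: segment each sample and fuse into a refined mask.
        mask = fuse_masks([segment(img) for img in images])
    return images, mask
```

Under these assumptions, one pass through the loop body plays the role the abstract calls a mask denoiser: it takes the current mask and returns a refined one.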

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-li24j,
  title = {Completing Visual Objects via Bridging Generation and Segmentation},
  author = {Li, Xiang and Chen, Yinpeng and Lin, Chung-Ching and Chen, Hao and Hu, Kai and Singh, Rita and Raj, Bhiksha and Wang, Lijuan and Liu, Zicheng},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {27531--27546},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/li24j/li24j.pdf},
  url = {https://proceedings.mlr.press/v235/li24j.html},
  abstract = {This paper presents a novel approach to object completion, with the primary goal of reconstructing a complete object from its partially visible components. Our method, named MaskComp, delineates the completion process through iterative stages of generation and segmentation. In each iteration, the object mask is provided as an additional condition to boost image generation, and, in return, the generated images can lead to a more accurate mask by fusing the segmentation of images. We demonstrate that the combination of one generation and one segmentation stage effectively functions as a mask denoiser. Through alternation between the generation and segmentation stages, the partial object mask is progressively refined, providing precise shape guidance and yielding superior object completion results. Our experiments demonstrate the superiority of MaskComp over existing approaches, e.g., ControlNet and Stable Diffusion, establishing it as an effective solution for object completion.}
}
Endnote
%0 Conference Paper
%T Completing Visual Objects via Bridging Generation and Segmentation
%A Xiang Li
%A Yinpeng Chen
%A Chung-Ching Lin
%A Hao Chen
%A Kai Hu
%A Rita Singh
%A Bhiksha Raj
%A Lijuan Wang
%A Zicheng Liu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-li24j
%I PMLR
%P 27531--27546
%U https://proceedings.mlr.press/v235/li24j.html
%V 235
%X This paper presents a novel approach to object completion, with the primary goal of reconstructing a complete object from its partially visible components. Our method, named MaskComp, delineates the completion process through iterative stages of generation and segmentation. In each iteration, the object mask is provided as an additional condition to boost image generation, and, in return, the generated images can lead to a more accurate mask by fusing the segmentation of images. We demonstrate that the combination of one generation and one segmentation stage effectively functions as a mask denoiser. Through alternation between the generation and segmentation stages, the partial object mask is progressively refined, providing precise shape guidance and yielding superior object completion results. Our experiments demonstrate the superiority of MaskComp over existing approaches, e.g., ControlNet and Stable Diffusion, establishing it as an effective solution for object completion.
APA
Li, X., Chen, Y., Lin, C.-C., Chen, H., Hu, K., Singh, R., Raj, B., Wang, L., & Liu, Z. (2024). Completing Visual Objects via Bridging Generation and Segmentation. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:27531-27546. Available from https://proceedings.mlr.press/v235/li24j.html.
