Spider: A Unified Framework for Context-dependent Concept Segmentation

Xiaoqi Zhao, Youwei Pang, Wei Ji, Baicheng Sheng, Jiaming Zuo, Lihe Zhang, Huchuan Lu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:60906-60926, 2024.

Abstract

Unlike context-independent (CI) concepts such as human, car, and airplane, context-dependent (CD) concepts such as camouflaged objects and medical lesions require a higher level of visual understanding. Although many CD understanding tasks have advanced rapidly within their respective branches, this isolated evolution limits their cross-domain generalization and leads to repetitive technical innovation. Because foreground and background context are strongly coupled in CD tasks, existing methods need to train separate models for their focused domains. This restricts their real-world CD concept understanding on the path towards artificial general intelligence (AGI). We propose Spider, a unified model with a single set of parameters that only needs to be trained once. With the help of the proposed concept filter driven by the image-mask group prompt, Spider is able to understand and distinguish diverse, strongly context-dependent concepts and accurately capture the prompter’s intention. Without bells and whistles, Spider significantly outperforms the state-of-the-art specialized models on 8 different context-dependent segmentation tasks, including 4 natural-scene tasks (salient, camouflaged, and transparent objects and shadow) and 4 medical lesion tasks (COVID-19, polyp, breast, and skin lesions with CT, color colonoscopy, ultrasound, and dermoscopy modalities, respectively). In addition, Spider shows clear advantages in continuous learning. It can complete the training of new tasks by fine-tuning less than 1% of its parameters, with a tolerable performance degradation of less than 5% on all old tasks. The source code will be publicly available at https://github.com/Xiaoqi-Zhao-DLUT/Spider-UniCDSeg.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-zhao24j,
  title = {Spider: A Unified Framework for Context-dependent Concept Segmentation},
  author = {Zhao, Xiaoqi and Pang, Youwei and Ji, Wei and Sheng, Baicheng and Zuo, Jiaming and Zhang, Lihe and Lu, Huchuan},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {60906--60926},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/zhao24j/zhao24j.pdf},
  url = {https://proceedings.mlr.press/v235/zhao24j.html},
  abstract = {Different from the context-independent (CI) concepts such as human, car, and airplane, context-dependent (CD) concepts require higher visual understanding ability, such as camouflaged object and medical lesion. Despite the rapid advance of many CD understanding tasks in respective branches, the isolated evolution leads to their limited cross-domain generalisation and repetitive technique innovation. Since there is a strong coupling relationship between foreground and background context in CD tasks, existing methods require to train separate models in their focused domains. This restricts their real-world CD concept understanding towards artificial general intelligence (AGI). We propose a unified model with a single set of parameters, Spider, which only needs to be trained once. With the help of the proposed concept filter driven by the image-mask group prompt, Spider is able to understand and distinguish diverse strong context-dependent concepts to accurately capture the Prompter’s intention. Without bells and whistles, Spider significantly outperforms the state-of-the-art specialized models in 8 different context-dependent segmentation tasks, including 4 natural scenes (salient, camouflaged, and transparent objects and shadow) and 4 medical lesions (COVID-19, polyp, breast, and skin lesion with color colonoscopy, CT, ultrasound, and dermoscopy modalities). Besides, Spider shows obvious advantages in continuous learning. It can easily complete the training of new tasks by fine-tuning parameters less than 1% and bring a tolerable performance degradation of less than 5% for all old tasks. The source code will be publicly available at https://github.com/Xiaoqi-Zhao-DLUT/Spider-UniCDSeg.}
}
Endnote
%0 Conference Paper
%T Spider: A Unified Framework for Context-dependent Concept Segmentation
%A Xiaoqi Zhao
%A Youwei Pang
%A Wei Ji
%A Baicheng Sheng
%A Jiaming Zuo
%A Lihe Zhang
%A Huchuan Lu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-zhao24j
%I PMLR
%P 60906--60926
%U https://proceedings.mlr.press/v235/zhao24j.html
%V 235
%X Different from the context-independent (CI) concepts such as human, car, and airplane, context-dependent (CD) concepts require higher visual understanding ability, such as camouflaged object and medical lesion. Despite the rapid advance of many CD understanding tasks in respective branches, the isolated evolution leads to their limited cross-domain generalisation and repetitive technique innovation. Since there is a strong coupling relationship between foreground and background context in CD tasks, existing methods require to train separate models in their focused domains. This restricts their real-world CD concept understanding towards artificial general intelligence (AGI). We propose a unified model with a single set of parameters, Spider, which only needs to be trained once. With the help of the proposed concept filter driven by the image-mask group prompt, Spider is able to understand and distinguish diverse strong context-dependent concepts to accurately capture the Prompter’s intention. Without bells and whistles, Spider significantly outperforms the state-of-the-art specialized models in 8 different context-dependent segmentation tasks, including 4 natural scenes (salient, camouflaged, and transparent objects and shadow) and 4 medical lesions (COVID-19, polyp, breast, and skin lesion with color colonoscopy, CT, ultrasound, and dermoscopy modalities). Besides, Spider shows obvious advantages in continuous learning. It can easily complete the training of new tasks by fine-tuning parameters less than 1% and bring a tolerable performance degradation of less than 5% for all old tasks. The source code will be publicly available at https://github.com/Xiaoqi-Zhao-DLUT/Spider-UniCDSeg.
APA
Zhao, X., Pang, Y., Ji, W., Sheng, B., Zuo, J., Zhang, L., & Lu, H. (2024). Spider: A Unified Framework for Context-dependent Concept Segmentation. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:60906-60926. Available from https://proceedings.mlr.press/v235/zhao24j.html.
