Noise-Guided Predicate Representation Extraction and Diffusion-Enhanced Discretization for Scene Graph Generation

Guoqing Zhang; Shichao Kan; Fanghui Zhang; Wanru Xu; Yue Zhang; Yigang Cen

Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:75234-75252, 2025.

Abstract

Scene Graph Generation (SGG) is a fundamental task in visual understanding, aimed at providing more precise local detail comprehension for downstream applications. Existing SGG methods often overlook the diversity of predicate representations and the consistency among similar predicates when dealing with long-tail distributions. As a result, the model’s decision layer fails to effectively capture details from the tail end, leading to biased predictions. To address this, we propose a Noise-Guided Predicate Representation Extraction and Diffusion-Enhanced Discretization (NoDIS) method. On the one hand, expanding the predicate representation space enhances the model’s ability to learn both common and rare predicates, thus reducing prediction bias caused by data scarcity. We propose a conditional diffusion model to reconstruct features and increase the diversity of representations for same-category predicates. On the other hand, independent predicate representations in the decision phase increase the learning complexity of the decision layer, making accurate predictions more challenging. To address this issue, we introduce a discretization mapper that learns consistent representations among similar predicates, reducing the learning difficulty and decision ambiguity in the decision layer. To validate the effectiveness of our method, we integrate NoDIS with various SGG baseline models and conduct experiments on multiple datasets. The results consistently demonstrate superior performance.

Cite this Paper

BibTeX

@InProceedings{pmlr-v267-zhang25ak,
  title = 	 {Noise-Guided Predicate Representation Extraction and Diffusion-Enhanced Discretization for Scene Graph Generation},
  author =       {Zhang, Guoqing and Kan, Shichao and Zhang, Fanghui and Xu, Wanru and Zhang, Yue and Cen, Yigang},
  booktitle = 	 {Proceedings of the 42nd International Conference on Machine Learning},
  pages = 	 {75234--75252},
  year = 	 {2025},
  editor = 	 {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = 	 {267},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--19 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v267/main/assets/zhang25ak/zhang25ak.pdf},
  url = 	 {https://proceedings.mlr.press/v267/zhang25ak.html},
  abstract = 	 {Scene Graph Generation (SGG) is a fundamental task in visual understanding, aimed at providing more precise local detail comprehension for downstream applications. Existing SGG methods often overlook the diversity of predicate representations and the consistency among similar predicates when dealing with long-tail distributions. As a result, the model’s decision layer fails to effectively capture details from the tail end, leading to biased predictions. To address this, we propose a Noise-Guided Predicate Representation Extraction and Diffusion-Enhanced Discretization (NoDIS) method. On the one hand, expanding the predicate representation space enhances the model’s ability to learn both common and rare predicates, thus reducing prediction bias caused by data scarcity. We propose a conditional diffusion model to reconstruct features and increase the diversity of representations for same-category predicates. On the other hand, independent predicate representations in the decision phase increase the learning complexity of the decision layer, making accurate predictions more challenging. To address this issue, we introduce a discretization mapper that learns consistent representations among similar predicates, reducing the learning difficulty and decision ambiguity in the decision layer. To validate the effectiveness of our method, we integrate NoDIS with various SGG baseline models and conduct experiments on multiple datasets. The results consistently demonstrate superior performance.}
}

Endnote

%0 Conference Paper
%T Noise-Guided Predicate Representation Extraction and Diffusion-Enhanced Discretization for Scene Graph Generation
%A Guoqing Zhang
%A Shichao Kan
%A Fanghui Zhang
%A Wanru Xu
%A Yue Zhang
%A Yigang Cen
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-zhang25ak
%I PMLR
%P 75234--75252
%U https://proceedings.mlr.press/v267/zhang25ak.html
%V 267
%X Scene Graph Generation (SGG) is a fundamental task in visual understanding, aimed at providing more precise local detail comprehension for downstream applications. Existing SGG methods often overlook the diversity of predicate representations and the consistency among similar predicates when dealing with long-tail distributions. As a result, the model’s decision layer fails to effectively capture details from the tail end, leading to biased predictions. To address this, we propose a Noise-Guided Predicate Representation Extraction and Diffusion-Enhanced Discretization (NoDIS) method. On the one hand, expanding the predicate representation space enhances the model’s ability to learn both common and rare predicates, thus reducing prediction bias caused by data scarcity. We propose a conditional diffusion model to reconstruct features and increase the diversity of representations for same-category predicates. On the other hand, independent predicate representations in the decision phase increase the learning complexity of the decision layer, making accurate predictions more challenging. To address this issue, we introduce a discretization mapper that learns consistent representations among similar predicates, reducing the learning difficulty and decision ambiguity in the decision layer. To validate the effectiveness of our method, we integrate NoDIS with various SGG baseline models and conduct experiments on multiple datasets. The results consistently demonstrate superior performance.

APA

Zhang, G., Kan, S., Zhang, F., Xu, W., Zhang, Y. & Cen, Y. (2025). Noise-Guided Predicate Representation Extraction and Diffusion-Enhanced Discretization for Scene Graph Generation. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:75234-75252. Available from https://proceedings.mlr.press/v267/zhang25ak.html.
