Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency

Hyeongjin Kim, Sangwon Kim, Dasom Ahn, Jong Taek Lee, Byoung Chul Ko
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:24094-24109, 2024.

Abstract

Scene graph generation (SGG) is an important task in image understanding because it represents the relationships between objects in an image as a graph structure, making it possible to understand the semantic relationships between objects intuitively. Previous SGG studies used message-passing neural networks (MPNNs) to update features, which can effectively reflect information about surrounding objects. However, these studies have failed to reflect the co-occurrence of objects during scene graph generation. In addition, they only addressed the long-tail problem of the training dataset from the perspectives of sampling and learning methods. To address these two problems, we propose CooK, which reflects the Co-occurrence Knowledge between objects, and the learnable term frequency-inverse document frequency (TF-$l$-IDF) to solve the long-tail problem. We applied the proposed model to the SGG benchmark dataset, and the results showed a performance improvement of up to 3.8% compared with existing state-of-the-art models in the SGGen subtask. The results also indicate that the proposed method generalizes, showing uniform performance improvement for all MPNN models.
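
The abstract names two ingredients, object co-occurrence and a TF-IDF-style weighting, without giving formulas. The sketch below is only a generic illustration of those two ideas. It is not the paper's CooK module or its learnable TF-$l$-IDF. The function names, the toy labels, and the use of classical (non-learnable) TF-IDF are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(images):
    """Count how often each unordered pair of object labels shares an image.

    `images` is a list of label lists (hypothetical toy format).
    """
    counts = Counter()
    for labels in images:
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts


def tf_idf(docs):
    """Classical TF-IDF (assumption: not the learnable variant in the paper).

    tf = term count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))
    weights = []
    for doc in docs:
        term_counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights


# Hypothetical toy data: object labels and predicate labels per image.
images = [["person", "horse", "hat"], ["person", "dog"], ["person", "horse"]]
predicates = [["on", "on", "riding"], ["on", "near"], ["on", "wearing"]]

pairs = cooccurrence_counts(images)
weights = tf_idf(predicates)

print(pairs[("horse", "person")])  # pair seen in two images
print(weights[0])                  # frequent "on" is weighted down to zero
```

In this toy setting, a predicate that appears in every image ("on") receives zero weight while rare predicates keep a positive weight, which is the intuition behind using a TF-IDF-style term against a long-tailed label distribution.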

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-kim24n,
  title = {Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency},
  author = {Kim, Hyeongjin and Kim, Sangwon and Ahn, Dasom and Lee, Jong Taek and Ko, Byoung Chul},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {24094--24109},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/kim24n/kim24n.pdf},
  url = {https://proceedings.mlr.press/v235/kim24n.html},
  abstract = {Scene graph generation (SGG) is an important task in image understanding because it represents the relationships between objects in an image as a graph structure, making it possible to understand the semantic relationships between objects intuitively. Previous SGG studies used message-passing neural networks (MPNNs) to update features, which can effectively reflect information about surrounding objects. However, these studies have failed to reflect the co-occurrence of objects during scene graph generation. In addition, they only addressed the long-tail problem of the training dataset from the perspectives of sampling and learning methods. To address these two problems, we propose CooK, which reflects the Co-occurrence Knowledge between objects, and the learnable term frequency-inverse document frequency (TF-$l$-IDF) to solve the long-tail problem. We applied the proposed model to the SGG benchmark dataset, and the results showed a performance improvement of up to 3.8\% compared with existing state-of-the-art models in the SGGen subtask. The results also indicate that the proposed method generalizes, showing uniform performance improvement for all MPNN models.}
}
Endnote
%0 Conference Paper
%T Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency
%A Hyeongjin Kim
%A Sangwon Kim
%A Dasom Ahn
%A Jong Taek Lee
%A Byoung Chul Ko
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-kim24n
%I PMLR
%P 24094--24109
%U https://proceedings.mlr.press/v235/kim24n.html
%V 235
%X Scene graph generation (SGG) is an important task in image understanding because it represents the relationships between objects in an image as a graph structure, making it possible to understand the semantic relationships between objects intuitively. Previous SGG studies used a message-passing neural networks (MPNN) to update features, which can effectively reflect information about surrounding objects. However, these studies have failed to reflect the co-occurrence of objects during SGG generation. In addition, they only addressed the long-tail problem of the training dataset from the perspectives of sampling and learning methods. To address these two problems, we propose CooK, which reflects the Co-occurrence Knowledge between objects, and the learnable term frequency-inverse document frequency (TF-$l$-IDF) to solve the long-tail problem. We applied the proposed model to the SGG benchmark dataset, and the results showed a performance improvement of up to 3.8% compared with existing state-of-the-art models in SGGen subtask. The proposed method exhibits generalization ability from the results obtained, showing uniform performance improvement for all MPNN models.
APA
Kim, H., Kim, S., Ahn, D., Lee, J.T. & Ko, B.C. (2024). Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:24094-24109. Available from https://proceedings.mlr.press/v235/kim24n.html.
