LatentGNN: Learning Efficient Non-local Relations for Visual Recognition

Songyang Zhang; Xuming He; Shipeng Yan

Proceedings of the 36th International Conference on Machine Learning, PMLR 97:7374-7383, 2019.

Abstract

Capturing long-range dependencies in feature representations is crucial for many visual recognition tasks. Despite recent successes of deep convolutional networks, it remains challenging to model non-local context relations between visual features. A promising strategy is to model the feature context by a fully-connected graph neural network (GNN), which augments traditional convolutional features with an estimated non-local context representation. However, most GNN-based approaches require computing a dense graph affinity matrix and hence have difficulty scaling up to complex real-world visual problems. In this work, we propose an efficient yet flexible non-local relation representation based on a novel class of graph neural networks. Our key idea is to introduce a latent space to reduce the complexity of the graph, which allows us to use a low-rank representation for the graph affinity matrix and to achieve linear computational complexity. Extensive experimental evaluations on three major visual recognition tasks show that our method outperforms prior work by a large margin while maintaining a low computation cost.
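
The low-rank idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's architecture; it only shows, under stated assumptions, why factoring an N x N affinity matrix through d latent nodes avoids the quadratic cost. The names `latent_factors` and `low_rank_context`, the projection `W`, and the softmax normalization are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def latent_factors(X, W):
    """Build two (N, d) factors from node features X (N, C).

    W (C, d) is a hypothetical projection onto d latent nodes;
    it is an assumption of this sketch, not taken from the paper.
    """
    scores = X @ W                 # (N, d)
    P = softmax(scores, axis=0)    # each latent node weights the N visible nodes
    Q = softmax(scores, axis=1)    # each visible node weights the d latent nodes
    return P, Q

def low_rank_context(X, P, Q):
    """Compute (Q @ P.T) @ X without ever forming the N x N matrix."""
    Z = P.T @ X                    # visible -> latent, cost O(N * d * C)
    return Q @ Z                   # latent -> visible, cost O(N * d * C)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))       # 50 nodes, 8 channels
W = rng.normal(size=(8, 4))        # 4 latent nodes
P, Q = latent_factors(X, W)
context = low_rank_context(X, P, Q)
```

In this sketch the implicit affinity matrix is `Q @ P.T`, whose rank is at most d, and the cost grows linearly in the number of nodes N for fixed d and C, whereas forming the dense matrix first would cost O(N^2).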

Cite this Paper

BibTeX

@InProceedings{pmlr-v97-zhang19f,
  title = 	 {{L}atent{GNN}: Learning Efficient Non-local Relations for Visual Recognition},
  author =       {Zhang, Songyang and He, Xuming and Yan, Shipeng},
  booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
  pages = 	 {7374--7383},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = 	 {97},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v97/zhang19f/zhang19f.pdf},
  url = 	 {https://proceedings.mlr.press/v97/zhang19f.html},
  abstract = 	 {Capturing long-range dependencies in feature representations is crucial for many visual recognition tasks. Despite recent successes of deep convolutional networks, it remains challenging to model non-local context relations between visual features. A promising strategy is to model the feature context by a fully-connected graph neural network (GNN), which augments traditional convolutional features with an estimated non-local context representation. However, most GNN-based approaches require computing a dense graph affinity matrix and hence have difficulty in scaling up to tackle complex real-world visual problems. In this work, we propose an efficient and yet flexible non-local relation representation based on a novel class of graph neural networks. Our key idea is to introduce a latent space to reduce the complexity of graph, which allows us to use a low-rank representation for the graph affinity matrix and to achieve a linear complexity in computation. Extensive experimental evaluations on three major visual recognition tasks show that our method outperforms the prior works with a large margin while maintaining a low computation cost.}
}

Endnote

%0 Conference Paper
%T LatentGNN: Learning Efficient Non-local Relations for Visual Recognition
%A Songyang Zhang
%A Xuming He
%A Shipeng Yan
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-zhang19f
%I PMLR
%P 7374--7383
%U https://proceedings.mlr.press/v97/zhang19f.html
%V 97
%X Capturing long-range dependencies in feature representations is crucial for many visual recognition tasks. Despite recent successes of deep convolutional networks, it remains challenging to model non-local context relations between visual features. A promising strategy is to model the feature context by a fully-connected graph neural network (GNN), which augments traditional convolutional features with an estimated non-local context representation. However, most GNN-based approaches require computing a dense graph affinity matrix and hence have difficulty in scaling up to tackle complex real-world visual problems. In this work, we propose an efficient and yet flexible non-local relation representation based on a novel class of graph neural networks. Our key idea is to introduce a latent space to reduce the complexity of graph, which allows us to use a low-rank representation for the graph affinity matrix and to achieve a linear complexity in computation. Extensive experimental evaluations on three major visual recognition tasks show that our method outperforms the prior works with a large margin while maintaining a low computation cost.

APA

Zhang, S., He, X. & Yan, S. (2019). LatentGNN: Learning Efficient Non-local Relations for Visual Recognition. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:7374-7383. Available from https://proceedings.mlr.press/v97/zhang19f.html.
