Receptive Fields As Experts in Convolutional Neural Architectures

Dongze Lian, Weihao Yu, Xinchao Wang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:29531-29544, 2024.

Abstract

The size of spatial receptive fields, from the early 3$\times$3 convolutions in VGGNet to the recent 7$\times$7 convolutions in ConvNeXt, has always played a critical role in architecture design. In this paper, we propose a Mixture of Receptive Fields (MoRF) instead of using a single receptive field. MoRF contains the combinations of multiple receptive fields with different sizes, e.g., convolutions with different kernel sizes, which can be regarded as experts. Such an approach serves two functions: one is to select the appropriate receptive field according to the input, and the other is to expand the network capacity. Furthermore, we also introduce two types of routing mechanisms, hard routing and soft routing to automatically select the appropriate receptive field experts. In the inference stage, the selected receptive field experts are merged via re-parameterization to maintain a similar inference speed compared to the single receptive field. To demonstrate the effectiveness of MoRF, we integrate the MoRF concept into multiple architectures, e.g., ResNet and ConvNeXt. Extensive experiments show that our approach outperforms the baselines in image classification, object detection, and segmentation tasks without significantly increasing the inference time.
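The merging step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single channel, stride 1, zero padding, odd square kernels, and routing weights that are already fixed scalars at inference time. All function names (`pad_kernel`, `merge_kernels`, `conv2d_same`, `multi_branch`) are hypothetical. The idea shown is generic structural re-parameterization: because convolution is linear, a weighted sum of branches with different kernel sizes equals one convolution whose kernel is the weighted sum of the zero-padded branch kernels.

```python
def pad_kernel(kernel, size):
    """Zero-pad a square, odd-sized kernel to size x size, keeping it centered."""
    k = len(kernel)
    off = (size - k) // 2
    out = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            out[i + off][j + off] = kernel[i][j]
    return out


def merge_kernels(kernels, weights):
    """Weighted sum of the kernels after padding each to the largest size."""
    size = max(len(k) for k in kernels)
    merged = [[0.0] * size for _ in range(size)]
    for kernel, w in zip(kernels, weights):
        padded = pad_kernel(kernel, size)
        for i in range(size):
            for j in range(size):
                merged[i][j] += w * padded[i][j]
    return merged


def conv2d_same(image, kernel):
    """Single-channel cross-correlation with zero padding (output size = input size)."""
    h, w, k = len(image), len(image[0]), len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy, xx = y + i - r, x + j - r
                    if 0 <= yy < h and 0 <= xx < w:  # outside the image counts as zero
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def multi_branch(image, kernels, weights):
    """Training-style forward pass: run every branch, then sum the weighted outputs."""
    h, w = len(image), len(image[0])
    total = [[0.0] * w for _ in range(h)]
    for kernel, wt in zip(kernels, weights):
        branch = conv2d_same(image, kernel)
        for y in range(h):
            for x in range(w):
                total[y][x] += wt * branch[y][x]
    return total


if __name__ == "__main__":
    # Arbitrary illustrative data: a 6x6 image, a 3x3 and a 7x7 kernel (assumed values).
    image = [[float((y * 7 + x * 3) % 5) for x in range(6)] for y in range(6)]
    k3 = [[i - j + 0.5 for j in range(3)] for i in range(3)]
    k7 = [[float((i * j) % 3 - 1) for j in range(7)] for i in range(7)]
    weights = [0.25, 0.75]  # stand-in for soft routing scores

    slow = multi_branch(image, [k3, k7], weights)
    fast = conv2d_same(image, merge_kernels([k3, k7], weights))
    max_diff = max(abs(a - b) for ra, rb in zip(slow, fast) for a, b in zip(ra, rb))
    print("max abs difference between the two paths:", max_diff)
```

The merged path runs one convolution at the largest kernel size instead of one per branch, which is consistent with the abstract's claim of a similar inference speed to a single receptive field. How the paper handles multiple channels, normalization layers, or input-dependent routing at inference is not covered by this sketch.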

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-lian24b,
  title     = {Receptive Fields As Experts in Convolutional Neural Architectures},
  author    = {Lian, Dongze and Yu, Weihao and Wang, Xinchao},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {29531--29544},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/lian24b/lian24b.pdf},
  url       = {https://proceedings.mlr.press/v235/lian24b.html},
  abstract  = {The size of spatial receptive fields, from the early 3$\times$3 convolutions in VGGNet to the recent 7$\times$7 convolutions in ConvNeXt, has always played a critical role in architecture design. In this paper, we propose a Mixture of Receptive Fields (MoRF) instead of using a single receptive field. MoRF contains the combinations of multiple receptive fields with different sizes, e.g., convolutions with different kernel sizes, which can be regarded as experts. Such an approach serves two functions: one is to select the appropriate receptive field according to the input, and the other is to expand the network capacity. Furthermore, we also introduce two types of routing mechanisms, hard routing and soft routing to automatically select the appropriate receptive field experts. In the inference stage, the selected receptive field experts are merged via re-parameterization to maintain a similar inference speed compared to the single receptive field. To demonstrate the effectiveness of MoRF, we integrate the MoRF concept into multiple architectures, e.g., ResNet and ConvNeXt. Extensive experiments show that our approach outperforms the baselines in image classification, object detection, and segmentation tasks without significantly increasing the inference time.}
}
Endnote
%0 Conference Paper
%T Receptive Fields As Experts in Convolutional Neural Architectures
%A Dongze Lian
%A Weihao Yu
%A Xinchao Wang
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-lian24b
%I PMLR
%P 29531--29544
%U https://proceedings.mlr.press/v235/lian24b.html
%V 235
%X The size of spatial receptive fields, from the early 3$\times$3 convolutions in VGGNet to the recent 7$\times$7 convolutions in ConvNeXt, has always played a critical role in architecture design. In this paper, we propose a Mixture of Receptive Fields (MoRF) instead of using a single receptive field. MoRF contains the combinations of multiple receptive fields with different sizes, e.g., convolutions with different kernel sizes, which can be regarded as experts. Such an approach serves two functions: one is to select the appropriate receptive field according to the input, and the other is to expand the network capacity. Furthermore, we also introduce two types of routing mechanisms, hard routing and soft routing to automatically select the appropriate receptive field experts. In the inference stage, the selected receptive field experts are merged via re-parameterization to maintain a similar inference speed compared to the single receptive field. To demonstrate the effectiveness of MoRF, we integrate the MoRF concept into multiple architectures, e.g., ResNet and ConvNeXt. Extensive experiments show that our approach outperforms the baselines in image classification, object detection, and segmentation tasks without significantly increasing the inference time.
APA
Lian, D., Yu, W., & Wang, X. (2024). Receptive Fields As Experts in Convolutional Neural Architectures. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:29531-29544. Available from https://proceedings.mlr.press/v235/lian24b.html.
