Harmonic-NAS: Hardware-Aware Multimodal Neural Architecture Search on Resource-constrained Devices

Mohamed Imed Eddine Ghebriout, Halima Bouzidi, Smail Niar, Hamza Ouarnoughi
Proceedings of the 15th Asian Conference on Machine Learning, PMLR 222:374-389, 2024.

Abstract

The recent surge of interest surrounding Multimodal Neural Networks (MM-NN) is attributed to their ability to effectively process and integrate multiscale information from diverse data sources. MM-NNs extract and fuse features from multiple modalities using adequate unimodal backbones and specific fusion networks. Although this helps strengthen the multimodal information representation, designing such networks is labor-intensive. It requires tuning the architectural parameters of the unimodal backbones, choosing the fusing point, and selecting the operations for fusion. Furthermore, multimodality AI is emerging as a cutting-edge option in Internet of Things (IoT) systems where inference latency and energy consumption are critical metrics in addition to accuracy. In this paper, we propose \textit{Harmonic-NAS}, a framework for the joint optimization of unimodal backbones and multimodal fusion networks with hardware awareness on resource-constrained devices. \textit{Harmonic-NAS} involves a two-tier optimization approach for the unimodal backbone architectures and fusion strategy and operators. By incorporating the hardware dimension into the optimization, evaluation results on various devices and multimodal datasets have demonstrated the superiority of \textit{Harmonic-NAS} over state-of-the-art approaches achieving up to $\sim$\textbf{10.9\%} accuracy improvement, $\sim$\textbf{1.91x} latency reduction, and $\sim$\textbf{2.14x} energy efficiency gain.

Cite this Paper


BibTeX
@InProceedings{pmlr-v222-ghebriout24a, title = {{Harmonic-NAS}: {H}ardware-Aware Multimodal Neural Architecture Search on Resource-constrained Devices}, author = {Ghebriout, Mohamed Imed Eddine and Bouzidi, Halima and Niar, Smail and Ouarnoughi, Hamza}, booktitle = {Proceedings of the 15th Asian Conference on Machine Learning}, pages = {374--389}, year = {2024}, editor = {Yanıkoğlu, Berrin and Buntine, Wray}, volume = {222}, series = {Proceedings of Machine Learning Research}, month = {11--14 Nov}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v222/ghebriout24a/ghebriout24a.pdf}, url = {https://proceedings.mlr.press/v222/ghebriout24a.html}, abstract = {The recent surge of interest surrounding Multimodal Neural Networks (MM-NN) is attributed to their ability to effectively process and integrate multiscale information from diverse data sources. MM-NNs extract and fuse features from multiple modalities using adequate unimodal backbones and specific fusion networks. Although this helps strengthen the multimodal information representation, designing such networks is labor-intensive. It requires tuning the architectural parameters of the unimodal backbones, choosing the fusing point, and selecting the operations for fusion. Furthermore, multimodality AI is emerging as a cutting-edge option in Internet of Things (IoT) systems where inference latency and energy consumption are critical metrics in addition to accuracy. In this paper, we propose \textit{Harmonic-NAS}, a framework for the joint optimization of unimodal backbones and multimodal fusion networks with hardware awareness on resource-constrained devices. \textit{Harmonic-NAS} involves a two-tier optimization approach for the unimodal backbone architectures and fusion strategy and operators. 
By incorporating the hardware dimension into the optimization, evaluation results on various devices and multimodal datasets have demonstrated the superiority of \textit{Harmonic-NAS} over state-of-the-art approaches achieving up to $\sim$\textbf{10.9\%} accuracy improvement, $\sim$\textbf{1.91x} latency reduction, and $\sim$\textbf{2.14x} energy efficiency gain.} }
Endnote
%0 Conference Paper
%T Harmonic-NAS: Hardware-Aware Multimodal Neural Architecture Search on Resource-constrained Devices
%A Mohamed Imed Eddine Ghebriout
%A Halima Bouzidi
%A Smail Niar
%A Hamza Ouarnoughi
%B Proceedings of the 15th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Berrin Yanıkoğlu
%E Wray Buntine
%F pmlr-v222-ghebriout24a
%I PMLR
%P 374--389
%U https://proceedings.mlr.press/v222/ghebriout24a.html
%V 222
%X The recent surge of interest surrounding Multimodal Neural Networks (MM-NN) is attributed to their ability to effectively process and integrate multiscale information from diverse data sources. MM-NNs extract and fuse features from multiple modalities using adequate unimodal backbones and specific fusion networks. Although this helps strengthen the multimodal information representation, designing such networks is labor-intensive. It requires tuning the architectural parameters of the unimodal backbones, choosing the fusing point, and selecting the operations for fusion. Furthermore, multimodality AI is emerging as a cutting-edge option in Internet of Things (IoT) systems where inference latency and energy consumption are critical metrics in addition to accuracy. In this paper, we propose \textit{Harmonic-NAS}, a framework for the joint optimization of unimodal backbones and multimodal fusion networks with hardware awareness on resource-constrained devices. \textit{Harmonic-NAS} involves a two-tier optimization approach for the unimodal backbone architectures and fusion strategy and operators. By incorporating the hardware dimension into the optimization, evaluation results on various devices and multimodal datasets have demonstrated the superiority of \textit{Harmonic-NAS} over state-of-the-art approaches achieving up to $\sim$\textbf{10.9\%} accuracy improvement, $\sim$\textbf{1.91x} latency reduction, and $\sim$\textbf{2.14x} energy efficiency gain.
APA
Ghebriout, M.I.E., Bouzidi, H., Niar, S. & Ouarnoughi, H. (2024). Harmonic-NAS: Hardware-Aware Multimodal Neural Architecture Search on Resource-constrained Devices. Proceedings of the 15th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 222:374-389. Available from https://proceedings.mlr.press/v222/ghebriout24a.html.
