BlitzMask: Real-Time Instance Segmentation Approach for Mobile Devices
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:1799-1811, 2023.
Abstract
We propose BlitzMask, a fast, low-complexity, anchor-free instance segmentation approach. For the first time, the approach achieves competitive results for real-time inference on mobile devices. The model architecture extends CenterNet with a new lite head. The model contains only layers optimized for inference on mobile devices (e.g., batch normalization, standard convolution, and depthwise convolution) and can be easily embedded into a mobile device. Instance segmentation requires finding an arbitrary (not fixed a priori) number of instance masks. The proposed method predicts the number of instance masks separately for each image using a predicted heatmap. It then decomposes each instance mask over a predicted spanning set, which is an output of the lite head. The model is trained from scratch with a new optimization process and a new loss function. A model with an EfficientNet-Lite B4 backbone and 320x320 input resolution achieves 28.9 mask AP at 29.2 fps on the Samsung S21 GPU and 28.0 mask AP at 39.4 fps on the Samsung S21 DSP. This sets a new speed benchmark for instance segmentation inference on mobile devices.
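The two inference steps named in the abstract (counting instances from a heatmap, then expressing each mask over a spanning set) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes CenterNet-style peak picking (3x3 local maxima above a score threshold), assumes that per-instance coefficients are read from a coefficient map at each peak, and assumes that a mask is a thresholded linear combination of the spanning-set maps. The names `find_peaks`, `assemble_masks`, `coeffs`, and `basis`, as well as the threshold value, are hypothetical.

```python
import numpy as np


def find_peaks(heatmap, threshold=0.5):
    """Return (row, col) positions of 3x3 local maxima above `threshold`.

    Assumption: one peak corresponds to one instance, as in CenterNet-style
    detectors. The abstract only states that the instance count comes from
    a predicted heatmap.
    """
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Nine shifted views of the padded map cover every 3x3 neighbourhood.
    neighbours = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    )
    is_peak = (heatmap >= neighbours.max(axis=0)) & (heatmap > threshold)
    return np.argwhere(is_peak)


def assemble_masks(heatmap, coeffs, basis, threshold=0.5):
    """Build one boolean mask per detected peak.

    heatmap: (H, W) centre scores.
    coeffs:  (K, H, W) hypothetical per-location decomposition coefficients.
    basis:   (K, Hm, Wm) hypothetical spanning set shared by all instances.
    Returns an (N, Hm, Wm) boolean array, where N is the number of peaks.
    """
    peaks = find_peaks(heatmap, threshold)
    if len(peaks) == 0:
        return np.zeros((0,) + basis.shape[1:], dtype=bool)
    # Gather the K coefficients at each peak: shape (N, K).
    per_instance = coeffs[:, peaks[:, 0], peaks[:, 1]].T
    # Linear combination of the spanning-set maps: shape (N, Hm, Wm).
    logits = np.tensordot(per_instance, basis, axes=([1], [0]))
    # A positive logit is equivalent to sigmoid(logit) > 0.5.
    return logits > 0
```

In this sketch the number of output masks is not fixed in advance: it equals the number of heatmap peaks found for the given image, which mirrors the per-image instance count described above.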