(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork

Tianjin Huang, Yong Tao, Meng Fang, Li Shen, Fan Liu, Yulong Pei, Mykola Pechenizkiy, Tianlong Chen
Conference on Parsimony and Learning, PMLR 328:629-643, 2026.

Abstract

Large-scale neural networks have demonstrated remarkable performance in domains such as vision and language processing, although at the cost of massive computational resources. As the compression literature illustrates, structural model pruning is a prominent approach to improving model efficiency, thanks to its acceleration-friendly sparsity patterns. One of the key questions in structural pruning is how to estimate channel significance. In parallel, work on data-centric AI has shown that prompting-based techniques enable impressive generalization of large language models across diverse downstream tasks. In this paper, we investigate an intriguing possibility: leveraging visual prompts to capture channel importance and derive high-quality structural sparsity. To this end, we propose a novel algorithmic framework, PASS. It is a tailored hyper-network that takes both visual prompts and network weight statistics as input and outputs layer-wise channel sparsity in a recurrent manner. This design accounts for the intrinsic channel dependency between layers. Comprehensive experiments across multiple network architectures and six datasets demonstrate the superiority of PASS in locating good structural sparsity. For example, at the same FLOPs level, PASS subnetworks achieve $1\%\sim 3\%$ better accuracy on the Food101 dataset; and at a comparable accuracy of $80\%$, PASS subnetworks obtain $0.35\times$ more speedup than the baselines.
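To make the recurrent hyper-network idea above concrete, here is a minimal PyTorch sketch of one plausible reading of it: a GRU cell consumes a visual-prompt embedding together with per-layer weight statistics and emits per-channel keep probabilities layer by layer, so each layer's sparsity decision is conditioned on the preceding layers. The module names, dimensions, choice of a GRU cell, and the particular statistics are our assumptions for illustration, not the paper's exact design.

```python
# Hypothetical sketch of a recurrent sparsity hyper-network (not the paper's
# actual implementation): a visual-prompt embedding is fused with per-layer
# weight statistics, a GRU cell carries channel-dependency information across
# layers, and a head predicts per-channel keep probabilities.
import torch
import torch.nn as nn

class RecurrentSparsityHyperNet(nn.Module):
    def __init__(self, prompt_dim, stat_dim, hidden_dim, max_channels):
        super().__init__()
        # Fuse the visual-prompt embedding with one layer's weight statistics.
        self.fuse = nn.Linear(prompt_dim + stat_dim, hidden_dim)
        # Recurrence lets later layers condition on earlier sparsity decisions.
        self.cell = nn.GRUCell(hidden_dim, hidden_dim)
        # Head predicts a keep probability per channel (padded to max_channels).
        self.head = nn.Linear(hidden_dim, max_channels)

    def forward(self, prompt_emb, layer_stats):
        h = torch.zeros(prompt_emb.size(0), self.cell.hidden_size,
                        device=prompt_emb.device)
        keep_probs = []
        for stats in layer_stats:  # one statistics vector per prunable layer
            x = torch.relu(self.fuse(torch.cat([prompt_emb, stats], dim=-1)))
            h = self.cell(x, h)
            keep_probs.append(torch.sigmoid(self.head(h)))
        return keep_probs

# Toy usage: 3 prunable layers, batch of 1.
net = RecurrentSparsityHyperNet(prompt_dim=16, stat_dim=8,
                                hidden_dim=32, max_channels=64)
prompt = torch.randn(1, 16)                    # embedded visual prompt
stats = [torch.randn(1, 8) for _ in range(3)]  # e.g. per-layer weight norms
masks = [(p > 0.5).float() for p in net(prompt, stats)]
```

Thresholding the per-channel probabilities yields structural (channel-level) masks; in the actual framework the hyper-network would be trained end-to-end under a FLOPs budget, which this toy sketch omits.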

Cite this Paper

BibTeX
@InProceedings{pmlr-v328-huang26b,
  title     = {(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork},
  author    = {Huang, Tianjin and Tao, Yong and Fang, Meng and Shen, Li and Liu, Fan and Pei, Yulong and Pechenizkiy, Mykola and Chen, Tianlong},
  booktitle = {Conference on Parsimony and Learning},
  pages     = {629--643},
  year      = {2026},
  editor    = {Burkholz, Rebekka and Liu, Shiwei and Ravishankar, Saiprasad and Redman, William and Huang, Wei and Su, Weijie and Zhu, Zhihui},
  volume    = {328},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--26 Mar},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v328/main/assets/huang26b/huang26b.pdf},
  url       = {https://proceedings.mlr.press/v328/huang26b.html}
}
APA
Huang, T., Tao, Y., Fang, M., Shen, L., Liu, F., Pei, Y., Pechenizkiy, M. & Chen, T. (2026). (PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork. Conference on Parsimony and Learning, in Proceedings of Machine Learning Research 328:629-643. Available from https://proceedings.mlr.press/v328/huang26b.html.
