Black-Box Tuning for Language-Model-as-a-Service

Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:20841-20855, 2022.

Abstract

Extremely large pre-trained language models (PTMs) such as GPT-3 are usually released as a service, allowing users to design task-specific prompts to query the PTMs through black-box APIs. In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of the PTMs are usually unavailable. Can we optimize the task prompts by accessing only the model inference APIs? This paper proposes the black-box tuning framework, which optimizes the continuous prompt prepended to the input text via derivative-free optimization. Instead of optimizing in the original high-dimensional prompt space, which is intractable for traditional derivative-free optimization, we perform optimization in a randomly generated subspace, exploiting the low intrinsic dimensionality of large PTMs. Experimental results show that black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompts and GPT-3’s in-context learning, but also surpasses its gradient-based counterparts, i.e., prompt tuning and full model tuning.
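In essence, the PTM stays fixed behind an inference API and only a continuous prompt is searched: a low-dimensional vector z is optimized with a derivative-free optimizer (CMA-ES in the paper) and mapped into the full prompt-embedding space through a fixed random projection A, so each optimization step needs only forward calls to the service. The following is a minimal sketch of that loop, not the authors' released implementation; query_loss is a toy quadratic surrogate standing in for the real black-box inference API, and the prompt length, embedding size, and subspace dimension are illustrative defaults.

import numpy as np
import cma  # off-the-shelf CMA-ES implementation (pip install cma)

# Illustrative sizes: 50 prompt tokens x 1024-dim embeddings, 500-dim subspace.
D = 50 * 1024
d = 500

rng = np.random.default_rng(0)
# Fixed random projection from the d-dim search subspace to the D-dim prompt space.
A = rng.normal(0.0, 1.0 / d, size=(D, d)).astype(np.float32)

# Toy surrogate for the black-box API. In LMaaS this would prepend the continuous
# prompt to the input text, query the inference API on the few labeled samples,
# and return the resulting loss.
target_prompt = A @ rng.normal(size=d).astype(np.float32)
def query_loss(prompt_params: np.ndarray) -> float:
    return float(np.mean((prompt_params - target_prompt) ** 2))

# Derivative-free optimization over the low-dimensional z, never over the
# D-dimensional prompt directly.
es = cma.CMAEvolutionStrategy(d * [0.0], 1.0, {"popsize": 20, "seed": 1, "verbose": -9})
for _ in range(50):
    zs = es.ask()                                      # candidate subspace vectors
    losses = [query_loss(A @ np.asarray(z, dtype=np.float32)) for z in zs]
    es.tell(zs, losses)                                # update the search distribution

best_prompt = A @ np.asarray(es.result.xbest, dtype=np.float32)  # final continuous prompt
print("best surrogate loss:", es.result.fbest)

In the paper's setting the loss would come from the service's predictions on the few labeled samples, so each CMA-ES iteration costs only one batch of forward passes through the API and no gradients.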

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-sun22e,
  title = {Black-Box Tuning for Language-Model-as-a-Service},
  author = {Sun, Tianxiang and Shao, Yunfan and Qian, Hong and Huang, Xuanjing and Qiu, Xipeng},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages = {20841--20855},
  year = {2022},
  editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  month = {17--23 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v162/sun22e/sun22e.pdf},
  url = {https://proceedings.mlr.press/v162/sun22e.html},
  abstract = {Extremely large pre-trained language models (PTMs) such as GPT-3 are usually released as a service. It allows users to design task-specific prompts to query the PTMs through some black-box APIs. In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable. Can we optimize the task prompts by only accessing the model inference APIs? This paper proposes the black-box tuning framework to optimize the continuous prompt prepended to the input text via derivative-free optimization. Instead of optimizing in the original high-dimensional prompt space, which is intractable for traditional derivative-free optimization, we perform optimization in a randomly generated subspace due to the low intrinsic dimensionality of large PTMs. The experimental results show that the black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompt and GPT-3’s in-context learning, but also surpasses the gradient-based counterparts, i.e., prompt tuning and full model tuning.}
}
Endnote
%0 Conference Paper
%T Black-Box Tuning for Language-Model-as-a-Service
%A Tianxiang Sun
%A Yunfan Shao
%A Hong Qian
%A Xuanjing Huang
%A Xipeng Qiu
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-sun22e
%I PMLR
%P 20841--20855
%U https://proceedings.mlr.press/v162/sun22e.html
%V 162
%X Extremely large pre-trained language models (PTMs) such as GPT-3 are usually released as a service. It allows users to design task-specific prompts to query the PTMs through some black-box APIs. In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable. Can we optimize the task prompts by only accessing the model inference APIs? This paper proposes the black-box tuning framework to optimize the continuous prompt prepended to the input text via derivative-free optimization. Instead of optimizing in the original high-dimensional prompt space, which is intractable for traditional derivative-free optimization, we perform optimization in a randomly generated subspace due to the low intrinsic dimensionality of large PTMs. The experimental results show that the black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompt and GPT-3’s in-context learning, but also surpasses the gradient-based counterparts, i.e., prompt tuning and full model tuning.
APA
Sun, T., Shao, Y., Qian, H., Huang, X., & Qiu, X. (2022). Black-Box Tuning for Language-Model-as-a-Service. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:20841-20855. Available from https://proceedings.mlr.press/v162/sun22e.html.
