[edit]
Universal Video Face Restoration Method Based on Vision-Language Model
Proceedings of the 16th Asian Conference on Machine Learning, PMLR 260:367-382, 2025.
Abstract
The video face restoration aims to restore high-quality face video from low-quality face video, but most existing methods typically focus on specific and single degradation scene such as denoising or deblurring. However, the universal video face restoration should restore face video in various degradation scenes. In this paper, we use language prompt which describes the face information including gender, appearance and expression to guide video face restoration. To enhance the applicability, we remove the language prompt by ControlNet and incorporate the human-level knowledge from vision-language models into general networks to improve the video face restoration performance and enable the universal video face restoration. In addition, we construct a degradation dataset, which contains multiple degradations in the same scene and captions which describe the face information. Our extensive experiments show that our approach achieves highly competitive performance in universal video face restoration.