[edit]
Efficient Alignment of Large Language Models via Data Sampling
Proceedings of The 4th NeurIPS Efficient Natural Language and Speech Processing Workshop, PMLR 262:55-72, 2024.
Abstract
Despite the capabilities of Large Language Models (LLMs), the output is not always safe or desirable. Aligning the models to human values is a critical step for the safe adoption of these models. Aligning LLMs employ huge amounts of data, computation, and time. Moreover, curating data with human feedback is expensive and takes time. Recent research depicts the benefit of data engineering in the fine-tuning and pre-training paradigms to bring down such costs. However, alignment differs from the afore-mentioned paradigms and it is unclear if data efficient alignment is feasible. In this work, we first aim to understand how the performance of LLM alignment scales with data. We find out that LLM alignment performance follows an exponential plateau pattern which tapers off post a rapid initial increase. We identify data subsampling as a viable method to reduce resources required for alignment. Further, we propose a methodology for efficient alignment by identifying a small high quality subset thereby reducing the computation and time required by alignment. We evaluate the proposed methodology over multiple datasets and compare the results. We find that the model aligned using our proposed methodology outperforms other sampling methods and performs comparable to the model aligned with the full dataset while using a fraction of the resources.