Current Evaluation Methods are a Bottleneck in Automatic Question Generation
Proceedings of the 2024 AAAI Conference on Artificial Intelligence, PMLR 257:3-8, 2024.
Abstract
This study provides a comprehensive review of frequently used evaluation methods for assessing the quality of automatic question generation (AQG) systems based on computational linguistics techniques and large language models. We discuss the advantages and limitations of each method. Furthermore, we outline the next steps toward the full integration of AQG systems in educational settings to achieve effective personalization and adaptation.