Improving Assessment of Tutoring Practices using Retrieval-Augmented Generation

Zifei F. Han, Jionghao Lin, Ashish Gurung, Danielle R Thomas, Eason Chen, Conrad Borchers, Shivang Gupta, Kenneth R Koedinger
Proceedings of the 2024 AAAI Conference on Artificial Intelligence, PMLR 257:66-76, 2024.

Abstract

One-on-one tutoring is an effective instructional method for enhancing learning, yet its efficacy hinges on tutor competencies. Novice math tutors often prioritize content-specific guidance, neglecting aspects such as social-emotional learning. Social-emotional learning promotes equity and inclusion and nurtures relationships with students, which is crucial for holistic student development. Assessing the competencies of tutors accurately and efficiently can drive the development of tailored tutor training programs. However, evaluating novice tutor ability during real-time tutoring remains challenging as it typically requires experts-in-the-loop. To address this challenge, this study harnesses Generative Pre-trained Transformers (GPT), such as GPT-3.5 and GPT-4, to automatically assess tutors’ ability to use social-emotional tutoring strategies. This study also reports on the financial dimensions and considerations of employing these models in real time and at scale for automated assessment. Four prompting strategies were assessed: two basic Zero-shot prompt strategies, Tree of Thought prompting, and Retrieval-Augmented Generation (RAG) prompting. The results indicate that RAG prompting demonstrated the most accurate performance (assessed by the level of hallucination and correctness in the generated assessment texts) and the lowest financial costs. These findings inform the development of personalized tutor training interventions to enhance the educational effectiveness of tutored learning.

Cite this Paper


BibTeX
@InProceedings{pmlr-v257-han24a,
  title = {Improving Assessment of Tutoring Practices using Retrieval-Augmented Generation},
  author = {Han, Zifei F. and Lin, Jionghao and Gurung, Ashish and Thomas, Danielle R and Chen, Eason and Borchers, Conrad and Gupta, Shivang and Koedinger, Kenneth R},
  booktitle = {Proceedings of the 2024 AAAI Conference on Artificial Intelligence},
  pages = {66--76},
  year = {2024},
  editor = {Ananda, Muktha and Malick, Debshila Basu and Burstein, Jill and Liu, Lydia T. and Liu, Zitao and Sharpnack, James and Wang, Zichao and Wang, Serena},
  volume = {257},
  series = {Proceedings of Machine Learning Research},
  month = {26--27 Feb},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v257/main/assets/han24a/han24a.pdf},
  url = {https://proceedings.mlr.press/v257/han24a.html},
  abstract = {One-on-one tutoring is an effective instructional method for enhancing learning, yet its efficacy hinges on tutor competencies. Novice math tutors often prioritize content-specific guidance, neglecting aspects such as social-emotional learning. Social-emotional learning promotes equity and inclusion and nurtures relationships with students, which is crucial for holistic student development. Assessing the competencies of tutors accurately and efficiently can drive the development of tailored tutor training programs. However, evaluating novice tutor ability during real-time tutoring remains challenging as it typically requires experts-in-the-loop. To address this challenge, this study harnesses Generative Pre-trained Transformers (GPT), such as GPT-3.5 and GPT-4, to automatically assess tutors’ ability to use social-emotional tutoring strategies. This study also reports on the financial dimensions and considerations of employing these models in real time and at scale for automated assessment. Four prompting strategies were assessed: two basic Zero-shot prompt strategies, Tree of Thought prompting, and Retrieval-Augmented Generation (RAG) prompting. The results indicate that RAG prompting demonstrated the most accurate performance (assessed by the level of hallucination and correctness in the generated assessment texts) and the lowest financial costs. These findings inform the development of personalized tutor training interventions to enhance the educational effectiveness of tutored learning.}
}
Endnote
%0 Conference Paper
%T Improving Assessment of Tutoring Practices using Retrieval-Augmented Generation
%A Zifei F. Han
%A Jionghao Lin
%A Ashish Gurung
%A Danielle R Thomas
%A Eason Chen
%A Conrad Borchers
%A Shivang Gupta
%A Kenneth R Koedinger
%B Proceedings of the 2024 AAAI Conference on Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Muktha Ananda
%E Debshila Basu Malick
%E Jill Burstein
%E Lydia T. Liu
%E Zitao Liu
%E James Sharpnack
%E Zichao Wang
%E Serena Wang
%F pmlr-v257-han24a
%I PMLR
%P 66--76
%U https://proceedings.mlr.press/v257/han24a.html
%V 257
%X One-on-one tutoring is an effective instructional method for enhancing learning, yet its efficacy hinges on tutor competencies. Novice math tutors often prioritize content-specific guidance, neglecting aspects such as social-emotional learning. Social-emotional learning promotes equity and inclusion and nurtures relationships with students, which is crucial for holistic student development. Assessing the competencies of tutors accurately and efficiently can drive the development of tailored tutor training programs. However, evaluating novice tutor ability during real-time tutoring remains challenging as it typically requires experts-in-the-loop. To address this challenge, this study harnesses Generative Pre-trained Transformers (GPT), such as GPT-3.5 and GPT-4, to automatically assess tutors’ ability to use social-emotional tutoring strategies. This study also reports on the financial dimensions and considerations of employing these models in real time and at scale for automated assessment. Four prompting strategies were assessed: two basic Zero-shot prompt strategies, Tree of Thought prompting, and Retrieval-Augmented Generation (RAG) prompting. The results indicate that RAG prompting demonstrated the most accurate performance (assessed by the level of hallucination and correctness in the generated assessment texts) and the lowest financial costs. These findings inform the development of personalized tutor training interventions to enhance the educational effectiveness of tutored learning.
APA
Han, Z.F., Lin, J., Gurung, A., Thomas, D.R., Chen, E., Borchers, C., Gupta, S. & Koedinger, K.R. (2024). Improving Assessment of Tutoring Practices using Retrieval-Augmented Generation. Proceedings of the 2024 AAAI Conference on Artificial Intelligence, in Proceedings of Machine Learning Research 257:66-76. Available from https://proceedings.mlr.press/v257/han24a.html.
