Neural machine translation for automated feedback on children’s early-stage writing

Jonas Vestergaard Jensen, Mikkel Jordahn, Michael Riis Andersen
Proceedings of the 5th Northern Lights Deep Learning Conference (NLDL), PMLR 233:104-112, 2024.

Abstract

In this work, we address the problem of assessing and constructing feedback for early-stage writing automatically using machine learning. Early-stage writing is typically vastly different from conventional writing due to phonetic spelling and lack of proper grammar, punctuation, spacing etc. Consequently, early-stage writing is highly non-trivial to analyze using common linguistic metrics. We propose to use sequence-to-sequence models for translating early-stage writing by students into conventional writing, which allows the translated text to be analyzed using linguistic metrics. Furthermore, we propose a novel robust likelihood to mitigate the effect of label noise in the dataset. We investigate the proposed methods using a set of numerical experiments and demonstrate that the conventional text can be predicted with high accuracy.

Cite this Paper


BibTeX
@InProceedings{pmlr-v233-jensen24b,
  title = {Neural machine translation for automated feedback on children’s early-stage writing},
  author = {Jensen, Jonas Vestergaard and Jordahn, Mikkel and Andersen, Michael Riis},
  booktitle = {Proceedings of the 5th Northern Lights Deep Learning Conference ({NLDL})},
  pages = {104--112},
  year = {2024},
  editor = {Lutchyn, Tetiana and Ramírez Rivera, Adín and Ricaud, Benjamin},
  volume = {233},
  series = {Proceedings of Machine Learning Research},
  month = {09--11 Jan},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v233/jensen24b/jensen24b.pdf},
  url = {https://proceedings.mlr.press/v233/jensen24b.html},
  abstract = {In this work, we address the problem of assessing and constructing feedback for early-stage writing automatically using machine learning. Early-stage writing is typically vastly different from conventional writing due to phonetic spelling and lack of proper grammar, punctuation, spacing etc. Consequently, early-stage writing is highly non-trivial to analyze using common linguistic metrics. We propose to use sequence-to-sequence models for translating early-stage writing by students into conventional writing, which allows the translated text to be analyzed using linguistic metrics. Furthermore, we propose a novel robust likelihood to mitigate the effect of label noise in the dataset. We investigate the proposed methods using a set of numerical experiments and demonstrate that the conventional text can be predicted with high accuracy.}
}
Endnote
%0 Conference Paper
%T Neural machine translation for automated feedback on children’s early-stage writing
%A Jonas Vestergaard Jensen
%A Mikkel Jordahn
%A Michael Riis Andersen
%B Proceedings of the 5th Northern Lights Deep Learning Conference (NLDL)
%C Proceedings of Machine Learning Research
%D 2024
%E Tetiana Lutchyn
%E Adín Ramírez Rivera
%E Benjamin Ricaud
%F pmlr-v233-jensen24b
%I PMLR
%P 104-112
%U https://proceedings.mlr.press/v233/jensen24b.html
%V 233
%X In this work, we address the problem of assessing and constructing feedback for early-stage writing automatically using machine learning. Early-stage writing is typically vastly different from conventional writing due to phonetic spelling and lack of proper grammar, punctuation, spacing etc. Consequently, early-stage writing is highly non-trivial to analyze using common linguistic metrics. We propose to use sequence-to-sequence models for translating early-stage writing by students into conventional writing, which allows the translated text to be analyzed using linguistic metrics. Furthermore, we propose a novel robust likelihood to mitigate the effect of label noise in the dataset. We investigate the proposed methods using a set of numerical experiments and demonstrate that the conventional text can be predicted with high accuracy.
APA
Jensen, J. V., Jordahn, M., & Andersen, M. R. (2024). Neural machine translation for automated feedback on children’s early-stage writing. Proceedings of the 5th Northern Lights Deep Learning Conference (NLDL), in Proceedings of Machine Learning Research 233:104-112. Available from https://proceedings.mlr.press/v233/jensen24b.html.
