The NGT200 Dataset: Geometric Multi-View Isolated Sign Recognition

Oline Ranum, David R. Wessels, Gomer Otterspeer, Erik J. Bekkers, Floris Roelofsen, Jari I. Andersen
Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM), PMLR 251:286-302, 2024.

Abstract

Sign Language Processing (SLP) provides a foundation for a more inclusive future in language technology; however, the field faces several significant challenges that must be addressed to achieve practical, real-world applications. This work addresses multi-view isolated sign recognition (MV-ISR), and highlights the essential role of 3D awareness and geometry in SLP systems. We introduce the NGT200 dataset, a novel spatio-temporal multi-view benchmark, establishing MV-ISR as distinct from single-view ISR (SV-ISR). We demonstrate the benefits of synthetic data and propose conditioning sign representations on spatial symmetries inherent in sign language. Leveraging an SE(2) equivariant model improves MV-ISR performance by 8-22 percent over the baseline.

Cite this Paper


BibTeX
@InProceedings{pmlr-v251-ranum24a, title = {The NGT200 Dataset: Geometric Multi-View Isolated Sign Recognition}, author = {Ranum, Oline and Wessels, David R. and Otterspeer, Gomer and Bekkers, Erik J. and Roelofsen, Floris and Andersen, Jari I.}, booktitle = {Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM)}, pages = {286--302}, year = {2024}, editor = {Vadgama, Sharvaree and Bekkers, Erik and Pouplin, Alison and Kaba, Sekou-Oumar and Walters, Robin and Lawrence, Hannah and Emerson, Tegan and Kvinge, Henry and Tomczak, Jakub and Jegelka, Stephanie}, volume = {251}, series = {Proceedings of Machine Learning Research}, month = {29 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v251/main/assets/ranum24a/ranum24a.pdf}, url = {https://proceedings.mlr.press/v251/ranum24a.html}, abstract = {Sign Language Processing (SLP) provides a foundation for a more inclusive future in language technology; however, the field faces several significant challenges that must be addressed to achieve practical, real-world applications. This work addresses multi-view isolated sign recognition (MV-ISR), and highlights the essential role of 3D awareness and geometry in SLP systems. We introduce the NGT200 dataset, a novel spatio-temporal multi-view benchmark, establishing MV-ISR as distinct from single-view ISR (SV-ISR). We demonstrate the benefits of synthetic data and propose conditioning sign representations on spatial symmetries inherent in sign language. Leveraging an SE(2) equivariant model improves MV-ISR performance by 8-22 percent over the baseline.} }
Endnote
%0 Conference Paper %T The NGT200 Dataset: Geometric Multi-View Isolated Sign Recognition %A Oline Ranum %A David R. Wessels %A Gomer Otterspeer %A Erik J. Bekkers %A Floris Roelofsen %A Jari I. Andersen %B Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM) %C Proceedings of Machine Learning Research %D 2024 %E Sharvaree Vadgama %E Erik Bekkers %E Alison Pouplin %E Sekou-Oumar Kaba %E Robin Walters %E Hannah Lawrence %E Tegan Emerson %E Henry Kvinge %E Jakub Tomczak %E Stephanie Jegelka %F pmlr-v251-ranum24a %I PMLR %P 286--302 %U https://proceedings.mlr.press/v251/ranum24a.html %V 251 %X Sign Language Processing (SLP) provides a foundation for a more inclusive future in language technology; however, the field faces several significant challenges that must be addressed to achieve practical, real-world applications. This work addresses multi-view isolated sign recognition (MV-ISR), and highlights the essential role of 3D awareness and geometry in SLP systems. We introduce the NGT200 dataset, a novel spatio-temporal multi-view benchmark, establishing MV-ISR as distinct from single-view ISR (SV-ISR). We demonstrate the benefits of synthetic data and propose conditioning sign representations on spatial symmetries inherent in sign language. Leveraging an SE(2) equivariant model improves MV-ISR performance by 8-22 percent over the baseline.
APA
Ranum, O., Wessels, D.R., Otterspeer, G., Bekkers, E.J., Roelofsen, F. & Andersen, J.I.. (2024). The NGT200 Dataset: Geometric Multi-View Isolated Sign Recognition. Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM), in Proceedings of Machine Learning Research 251:286-302 Available from https://proceedings.mlr.press/v251/ranum24a.html.

Related Material