Classification of Adolescents’ Risky Behavior in Instant Messaging Conversations

Jaromír Plhák, Ondřej Sotolář, Michaela Lebedíková, David Šmahel
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:2390-2404, 2023.

Abstract

Previous research on detecting risky online behavior has been rather scattered, typically identifying single risks in online samples. To our knowledge, this research is the first to present a process of building models that can efficiently detect the following four online risky behaviors: (1) aggression, harassment, hate; (2) mental health; (3) use of alcohol and drugs; and (4) sexting. Furthermore, the corpora in this research are unique because they consist of private instant messaging conversations in the Czech language provided by adolescents. The combination of publicly unavailable and unique data with high-quality annotations of specific psychological phenomena allowed us to achieve precise detection using transformer machine learning models that can handle sequential data and take the context of utterances into account. The impact of the context length and text augmentation on model efficiency is discussed in detail. The final model provides promising results with an acceptable F1 score. Therefore, we believe that the model could be used in various applications, e.g., parental applications, chatbots, or services provided by Internet providers. Future research could investigate the usage of the model in other languages.

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-plhak23a,
  title = {Classification of Adolescents’ Risky Behavior in Instant Messaging Conversations},
  author = {Plh\'ak, Jarom{\'\i}r and Sotol\'a\v{r}, Ond\v{r}ej and Lebed{\'\i}kov\'a, Michaela and \v{S}mahel, David},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages = {2390--2404},
  year = {2023},
  editor = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume = {206},
  series = {Proceedings of Machine Learning Research},
  month = {25--27 Apr},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v206/plhak23a/plhak23a.pdf},
  url = {https://proceedings.mlr.press/v206/plhak23a.html},
  abstract = {Previous research on detecting risky online behavior has been rather scattered, typically identifying single risks in online samples. To our knowledge, this research is the first to present a process of building models that can efficiently detect the following four online risky behaviors: (1) aggression, harassment, hate; (2) mental health; (3) use of alcohol and drugs; and (4) sexting. Furthermore, the corpora in this research are unique because they consist of private instant messaging conversations in the Czech language provided by adolescents. The combination of publicly unavailable and unique data with high-quality annotations of specific psychological phenomena allowed us to achieve precise detection using transformer machine learning models that can handle sequential data and take the context of utterances into account. The impact of the context length and text augmentation on model efficiency is discussed in detail. The final model provides promising results with an acceptable F1 score. Therefore, we believe that the model could be used in various applications, e.g., parental applications, chatbots, or services provided by Internet providers. Future research could investigate the usage of the model in other languages.}
}
Endnote
%0 Conference Paper
%T Classification of Adolescents’ Risky Behavior in Instant Messaging Conversations
%A Jaromír Plhák
%A Ondřej Sotolář
%A Michaela Lebedíková
%A David Šmahel
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-plhak23a
%I PMLR
%P 2390--2404
%U https://proceedings.mlr.press/v206/plhak23a.html
%V 206
%X Previous research on detecting risky online behavior has been rather scattered, typically identifying single risks in online samples. To our knowledge, this research is the first to present a process of building models that can efficiently detect the following four online risky behaviors: (1) aggression, harassment, hate; (2) mental health; (3) use of alcohol and drugs; and (4) sexting. Furthermore, the corpora in this research are unique because they consist of private instant messaging conversations in the Czech language provided by adolescents. The combination of publicly unavailable and unique data with high-quality annotations of specific psychological phenomena allowed us to achieve precise detection using transformer machine learning models that can handle sequential data and take the context of utterances into account. The impact of the context length and text augmentation on model efficiency is discussed in detail. The final model provides promising results with an acceptable F1 score. Therefore, we believe that the model could be used in various applications, e.g., parental applications, chatbots, or services provided by Internet providers. Future research could investigate the usage of the model in other languages.
APA
Plhák, J., Sotolář, O., Lebedíková, M. & Šmahel, D. (2023). Classification of Adolescents’ Risky Behavior in Instant Messaging Conversations. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:2390-2404. Available from https://proceedings.mlr.press/v206/plhak23a.html.
