Forecasting Tennis Player Matches Based on Machine Learning

Bai Rui, Chong Kar Hing, Li Haoyuan, Teh Jia Yew
Proceedings of 2024 International Conference on Machine Learning and Intelligent Computing, PMLR 245:393-403, 2024.

Abstract

This paper aims to highlight the extensive potential of analytics with the use of machine learning to improve sports modelling. We propose a supervised machine learning approach for predicting the flow of points in tennis matches. Using data from the 2023 Wimbledon Gentlemen’s singles matches, we used Grey Relational Analysis and membership functions from fuzzy set theory to extract and rank 7 features strongly tied to a player’s match outcome: in ranked order, the player’s serve status, games won in current set, ranking difference, distance covered, serve speed, previous victory status, and unforced errors. We used these features to build 3 supervised models, K-Nearest Neighbours (KNN), XGBoost and Logistic Regression, and compared their predictive performance. We adopted a train-test split of 300 training sets and 100 testing sets. Using performance metrics such as confusion matrices, ROC curves, F1, Precision, Recall, and Accuracy scores, we constructed a scoring table to rank the implemented models. Our results showed that XGBoost had the best predictive performance, followed by KNN and Logistic Regression. 5-fold cross-validation feature stability and sensitivity analyses suggest that the resulting feature space is robust and stable, with features not easily subject to change in short-term predictions.
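
The Grey Relational Analysis ranking step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the distinguishing coefficient `rho = 0.5` (a commonly used default) are all assumptions made for illustration, and the fuzzy membership functions the paper also uses are not shown.

```python
from typing import Dict, List


def min_max(values: List[float]) -> List[float]:
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def grey_relational_grades(
    reference: List[float],
    features: Dict[str, List[float]],
    rho: float = 0.5,  # distinguishing coefficient (assumed default)
) -> Dict[str, float]:
    """Return the grey relational grade of each feature against the reference."""
    ref = min_max(reference)
    # Absolute deviation of each normalised feature from the normalised reference.
    deltas = {
        name: [abs(r - x) for r, x in zip(ref, min_max(seq))]
        for name, seq in features.items()
    }
    all_deltas = [d for seq in deltas.values() for d in seq]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every feature matches the reference exactly.
        return {name: 1.0 for name in deltas}
    # Grade = mean grey relational coefficient over all observations.
    return {
        name: sum((d_min + rho * d_max) / (d + rho * d_max) for d in seq) / len(seq)
        for name, seq in deltas.items()
    }


# Hypothetical toy data, not taken from the paper: 1 = point won, 0 = point lost.
point_won = [1, 0, 1, 1, 0]
toy_features = {
    "serving": [1, 0, 1, 1, 0],    # moves exactly with the outcome
    "receiving": [0, 1, 0, 0, 1],  # moves exactly against the outcome
}
grades = grey_relational_grades(point_won, toy_features)
ranking = sorted(grades, key=grades.get, reverse=True)
```

Features with a higher grade track the reference sequence more closely, so sorting by grade gives a ranking of the kind the abstract describes.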

Cite this Paper


BibTeX
@InProceedings{pmlr-v245-rui24b,
  title = {Forecasting Tennis Player Matches Based on Machine Learning},
  author = {Rui, Bai and Kar Hing, Chong and Haoyuan, Li and Jia Yew, Teh},
  booktitle = {Proceedings of 2024 International Conference on Machine Learning and Intelligent Computing},
  pages = {393--403},
  year = {2024},
  editor = {Nianyin, Zeng and Pachori, Ram Bilas},
  volume = {245},
  series = {Proceedings of Machine Learning Research},
  month = {26--28 Apr},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v245/main/assets/rui24b/rui24b.pdf},
  url = {https://proceedings.mlr.press/v245/rui24b.html},
  abstract = {This paper aims to highlight the extensive potential of analytics with the use of machine learning to improve sports modelling. We propose a supervised machine learning approach to further extend the optimization of machine learning in predicting the flow of points in tennis matches. Using data sourced from the 2023 Wimbledon Gentlemen’s singles matches, we used Grey Relational Analysis and membership functions from fuzzy set theory to extract and rank 7 features that exhibit impactful ties with the player’s match outcome, which includes player’s serve status, games won in current set, ranking difference, distance covered, serve speed, previous victory status, and unforced error, respectively. We implemented these features to build 3 supervised models and compare their predictive performances, namely K-Nearest Neighbours, XGBoost and Logistic Regression. We adopted a train test split measure of 300 training sets and 100 testing sets. Using performance metrics such as confusion matrices, ROC curves, F1, Precision, Recall, and Accuracy scores, we constructed a scoring table to rank implemented models. Our results demonstrated that XGBoost exhibited the most significant predictive performance, followed by KNN and Logistic Regression. 5-Fold cross-validation feature stability and sensitivity analysis suggests that the feature space created is robust and stable where features are not easily subject to change in short-term predictions.}
}
Endnote
%0 Conference Paper
%T Forecasting Tennis Player Matches Based on Machine Learning
%A Bai Rui
%A Chong Kar Hing
%A Li Haoyuan
%A Teh Jia Yew
%B Proceedings of 2024 International Conference on Machine Learning and Intelligent Computing
%C Proceedings of Machine Learning Research
%D 2024
%E Zeng Nianyin
%E Ram Bilas Pachori
%F pmlr-v245-rui24b
%I PMLR
%P 393--403
%U https://proceedings.mlr.press/v245/rui24b.html
%V 245
%X This paper aims to highlight the extensive potential of analytics with the use of machine learning to improve sports modelling. We propose a supervised machine learning approach to further extend the optimization of machine learning in predicting the flow of points in tennis matches. Using data sourced from the 2023 Wimbledon Gentlemen’s singles matches, we used Grey Relational Analysis and membership functions from fuzzy set theory to extract and rank 7 features that exhibit impactful ties with the player’s match outcome, which includes player’s serve status, games won in current set, ranking difference, distance covered, serve speed, previous victory status, and unforced error, respectively. We implemented these features to build 3 supervised models and compare their predictive performances, namely K-Nearest Neighbours, XGBoost and Logistic Regression. We adopted a train test split measure of 300 training sets and 100 testing sets. Using performance metrics such as confusion matrices, ROC curves, F1, Precision, Recall, and Accuracy scores, we constructed a scoring table to rank implemented models. Our results demonstrated that XGBoost exhibited the most significant predictive performance, followed by KNN and Logistic Regression. 5-Fold cross-validation feature stability and sensitivity analysis suggests that the feature space created is robust and stable where features are not easily subject to change in short-term predictions.
APA
Rui, B., Kar Hing, C., Haoyuan, L., & Jia Yew, T. (2024). Forecasting Tennis Player Matches Based on Machine Learning. Proceedings of 2024 International Conference on Machine Learning and Intelligent Computing, in Proceedings of Machine Learning Research 245:393-403. Available from https://proceedings.mlr.press/v245/rui24b.html.
