A Hybrid XGBoost and Stacked Regression Model with Optimized Feature Selection for Port Throughput Prediction

Ning Ding, Fuyang Zhao, Xiaoyu Wang
Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, PMLR 278:798-807, 2025.

Abstract

Predicting port throughput is important for port operation management. This study selects first-level indicators, including the level of economic, trade, and development vitality in the hinterland, regional transportation capacity in the hinterland, and port infrastructure conditions, together with 13 second-level indicators. XGBoost is used to analyze feature importance and screen out 9 key influencing factors. A combined model based on XGBoost and a stacking algorithm is then constructed, and its parameters are optimized by cross-validation and grid search. In an experiment using Dalian Port as a case study, the XGBoost-stacking model achieves the lowest MAE, MAPE, and RMSE and the highest $R^2$ among the compared models. Its prediction performance is reported as significantly better than that of the other models, which supports the effectiveness of the model for port throughput prediction and offers a new approach to the task.
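
For reference, the four evaluation metrics named in the abstract can be computed from their standard textbook definitions. The following is a minimal Python sketch, not the authors' code: the function name `regression_metrics` and the sample numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MAPE (in percent), RMSE and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; uses the standard definitions only.
    MAPE is undefined if any true value is zero, and R^2 is undefined if all
    true values are identical, so both cases raise ValueError here.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}


if __name__ == "__main__":
    # Made-up throughput figures, purely for demonstration.
    observed = [100.0, 200.0, 300.0, 400.0]
    predicted = [110.0, 190.0, 310.0, 390.0]
    for name, value in regression_metrics(observed, predicted).items():
        print(f"{name}: {value:.4f}")
```

Lower MAE, MAPE, and RMSE and a higher $R^2$ indicate a closer fit, which is the sense in which the abstract ranks the compared models.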

Cite this Paper


BibTeX
@InProceedings{pmlr-v278-ding25a,
  title = {A Hybrid XGBoost and Stacked Regression Model with Optimized Feature Selection for Port Throughput Prediction},
  author = {Ding, Ning and Zhao, Fuyang and Wang, Xiaoyu},
  booktitle = {Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing},
  pages = {798--807},
  year = {2025},
  editor = {Zeng, Nianyin and Pachori, Ram Bilas and Wang, Dongshu},
  volume = {278},
  series = {Proceedings of Machine Learning Research},
  month = {25--27 Apr},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v278/main/assets/ding25a/ding25a.pdf},
  url = {https://proceedings.mlr.press/v278/ding25a.html},
  abstract = {The prediction of port throughput is very important for port operation management. Aiming at the prediction of port throughput, this study selected the level of economic and trade and development vitality in the hinterland, regional transportation capacity in the hinterland, port infrastructure conditions and other first-class indicators and 13 second-class indicators, used xgboost to analyze the importance of characteristics, and screened out 9 key influencing factors. The combined model based on xgboost and stacking algorithm was constructed, and the parameters were optimized by cross validation and grid search method. Taking Dalian port as an example, the experiment shows that when xgboost stacking model is used to predict port throughput, MAE, MAPE and RMSE are the lowest, and $R^2$ is the highest. The prediction performance is significantly better than other models, which verifies the effectiveness and superiority of the model in port throughput prediction, and provides a new method and idea for port throughput prediction.}
}
Endnote
%0 Conference Paper
%T A Hybrid XGBoost and Stacked Regression Model with Optimized Feature Selection for Port Throughput Prediction
%A Ning Ding
%A Fuyang Zhao
%A Xiaoyu Wang
%B Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing
%C Proceedings of Machine Learning Research
%D 2025
%E Nianyin Zeng
%E Ram Bilas Pachori
%E Dongshu Wang
%F pmlr-v278-ding25a
%I PMLR
%P 798--807
%U https://proceedings.mlr.press/v278/ding25a.html
%V 278
%X The prediction of port throughput is very important for port operation management. Aiming at the prediction of port throughput, this study selected the level of economic and trade and development vitality in the hinterland, regional transportation capacity in the hinterland, port infrastructure conditions and other first-class indicators and 13 second-class indicators, used xgboost to analyze the importance of characteristics, and screened out 9 key influencing factors. The combined model based on xgboost and stacking algorithm was constructed, and the parameters were optimized by cross validation and grid search method. Taking Dalian port as an example, the experiment shows that when xgboost stacking model is used to predict port throughput, MAE, MAPE and RMSE are the lowest, and $R^2$ is the highest. The prediction performance is significantly better than other models, which verifies the effectiveness and superiority of the model in port throughput prediction, and provides a new method and idea for port throughput prediction.
APA
Ding, N., Zhao, F. & Wang, X. (2025). A Hybrid XGBoost and Stacked Regression Model with Optimized Feature Selection for Port Throughput Prediction. Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, in Proceedings of Machine Learning Research 278:798-807. Available from https://proceedings.mlr.press/v278/ding25a.html.
