A Hybrid XGBoost and Stacked Regression Model with Optimized Feature Selection for Port Throughput Prediction
Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, PMLR 278:798-807, 2025.
Abstract
Accurate port throughput prediction is important for port operation management. This study defines first-level indicators, including the economic, trade, and development vitality of the hinterland, the regional transportation capacity of the hinterland, and port infrastructure conditions, together with 13 second-level indicators. XGBoost feature-importance analysis is used to screen these down to 9 key influencing factors. A combined model based on XGBoost and a stacking algorithm is then constructed, with its parameters tuned by cross-validation and grid search. In an experiment using Dalian Port as a case study, the XGBoost-stacking model achieves the lowest MAE, MAPE, and RMSE and the highest $R^2$ among the models compared. Its prediction performance is significantly better than that of the other models, which supports the effectiveness of the approach and offers a new method for port throughput prediction.
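For readers unfamiliar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how they are commonly computed. It assumes the standard textbook definitions (MAPE expressed in percent, $R^2$ as one minus the ratio of residual to total sum of squares); the abstract does not state which variants the authors used. The function name `regression_metrics` and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MAPE (percent), RMSE, and R^2 for two equal-length sequences.

    Assumes standard definitions; y_true must contain no zeros (MAPE divides
    by each true value) and must not be constant (R^2 divides by its variance).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average magnitude of the errors.
    mae = sum(abs(e) for e in errors) / n
    # Mean absolute percentage error: average error relative to the true value.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    # Root mean squared error: penalizes large errors more heavily than MAE.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}


if __name__ == "__main__":
    # Made-up throughput values for illustration only (not data from the paper).
    observed = [100.0, 200.0, 400.0]
    predicted = [110.0, 190.0, 400.0]
    for name, value in regression_metrics(observed, predicted).items():
        print(f"{name}: {value:.4f}")
```

Lower MAE, MAPE, and RMSE indicate smaller errors, while an $R^2$ closer to 1 indicates that the predictions explain more of the variance in the observed values, which is why the abstract reports the former as lowest and the latter as highest for the best model.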