Comparing the Prediction Accuracy of Statistical Models and Artificial Neural Networks in Breast Cancer
Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, PMLR R0:87-87, 1995.
Predicting survival is important in cancer because it determines patient therapy, matches patients for clinical trials, and provides information to the patient. For over thirty years, measuring cancer outcome has been based on the TNM Stage model. There are two problems with this model: (1) it is not very accurate (44% accurate for breast cancer), and (2) its accuracy cannot be improved, because predictive variables cannot be added to the model without increasing the model’s complexity to the point where it is no longer useful to the clinician. There are several statistical models that have the potential to replace the existing TNM Stage model. All of these models can integrate new prognostic factors to increase measurement accuracy. But they are not all equally accurate, and they do not all equally meet the criteria for a new prognostic system set by the American Joint Committee on Cancer (Burke HB, Henson DE. Criteria for prognostic factors and for an enhanced prognostic system. Cancer 1993; 72: 3131-5). We compare the most powerful statistical models in terms of their accuracy in predicting five-year breast cancer-specific survival. These models include principal component analysis, classification and regression trees (both pruned and shrunk), stepwise logistic regression, and five types of artificial neural networks.