Framework for a Generic Knowledge Discovery Toolkit
Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, PMLR R0:457-468, 1995.
Abstract
Industrial and commercial firms accumulate vast quantities of data in the course of their day-to-day business. The primary use of this data is to monitor business processes: inventory, maintenance actions, and so on. However, this data contains much valuable information that, if accessible, would enhance the understanding of the processes being monitored and aid in improving their performance. Traditional statistical procedures provide some insight into this data, but they are often misused in non-expert hands. With the rapidly increasing quantity of data, it is no longer cost-effective for trained statisticians to analyze all of it. The number of variables and observations in these datasets is often very large, and the number of candidate statistical models that might be considered is too large to permit systematic manual exploration. In this type of situation, a Knowledge Discovery (KD) tool is the most effective way to explore the data.