Dataset Cataloging Metadata for Machine Learning Applications Research
Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics, PMLR R1:139-146, 1997.
As the field of machine learning (ML) matures, two types of data archives are developing: collections of benchmark data sets used to test the performance of new algorithms, and data stores to which machine learning and data mining algorithms are applied to create scientific or commercial applications. At present, the catalogs of these archives are ad hoc and not tailored to machine learning analysis. This paper considers the cataloging metadata required to support these two types of repositories and discusses the organizational support necessary for archive catalog maintenance.