Random Sets Approach and its Applications

Vladimir Nikulin
Proceedings of the Workshop on the Causation and Prediction Challenge at WCCI 2008, PMLR 3:65-76, 2008.

Abstract

The random sets approach is heuristic in nature and has been inspired by the growing speed of computations. For example, we can consider a large number of classifiers, where each classifier is based on a relatively small, randomly selected subset of features (a random set of features). Using cross-validation, we can rank all random sets according to a selected criterion and use this ranking for further feature selection. Another application of random sets was motivated by large imbalanced data sets, which present a significant problem because the corresponding classifier tends to ignore patterns with smaller representation in the training set. Again, we propose to consider a large number of balanced training subsets in which representatives of both patterns are selected randomly. The above models demonstrated competitive results in two data mining competitions.
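The two ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the nearest-centroid classifier, the interleaved folds, the vote-counting rule for turning the ranking into a feature list, and all function and parameter names (`rank_features_by_random_sets`, `balanced_subsets`, `n_sets`, `set_size`, `top`) are assumptions chosen only to keep the example short and dependency-free.

```python
import random
from collections import Counter


def nearest_centroid_accuracy(train, test, features):
    """Accuracy of a nearest-centroid classifier restricted to `features`.

    This simple classifier is a stand-in (assumption), not the paper's model.
    """
    sums, counts = {}, Counter()
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(features))
        for i, f in enumerate(features):
            acc[i] += x[f]
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        return min(
            centroids,
            key=lambda y: sum((x[f] - c) ** 2 for f, c in zip(features, centroids[y])),
        )

    return sum(predict(x) == y for x, y in test) / len(test)


def cv_score(data, features, folds=3):
    """Mean accuracy over simple interleaved cross-validation folds."""
    scores = []
    for k in range(folds):
        test = data[k::folds]
        train = [row for i, row in enumerate(data) if i % folds != k]
        scores.append(nearest_centroid_accuracy(train, test, features))
    return sum(scores) / folds


def rank_features_by_random_sets(data, n_features, n_sets=200, set_size=2, top=20, seed=0):
    """Score many random feature sets by CV, then rank features by how often
    they occur in the best-scoring sets (the voting rule is an assumption)."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_sets):
        subset = rng.sample(range(n_features), set_size)
        scored.append((cv_score(data, subset), subset))
    scored.sort(key=lambda t: t[0], reverse=True)
    votes = Counter(f for _, subset in scored[:top] for f in subset)
    return [f for f, _ in votes.most_common()]


def balanced_subsets(data, n_subsets=10, seed=0):
    """Yield training subsets with equally many random rows from each class."""
    rng = random.Random(seed)
    by_class = {}
    for row in data:
        by_class.setdefault(row[1], []).append(row)
    size = min(len(rows) for rows in by_class.values())
    for _ in range(n_subsets):
        subset = [r for rows in by_class.values() for r in rng.sample(rows, size)]
        rng.shuffle(subset)
        yield subset
```

Here `data` is assumed to be a list of `(feature_vector, label)` pairs. A classifier could then be trained on each balanced subset and the predictions combined; how the paper combines them is not stated in the abstract, so that step is left out.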

Cite this Paper


BibTeX
@InProceedings{pmlr-v3-nikulin08a,
  title = {Random Sets Approach and its Applications},
  author = {Nikulin, Vladimir},
  booktitle = {Proceedings of the Workshop on the Causation and Prediction Challenge at WCCI 2008},
  pages = {65--76},
  year = {2008},
  editor = {Guyon, Isabelle and Aliferis, Constantin and Cooper, Greg and Elisseeff, André and Pellet, Jean-Philippe and Spirtes, Peter and Statnikov, Alexander},
  volume = {3},
  series = {Proceedings of Machine Learning Research},
  address = {Hong Kong},
  month = {03--04 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v3/nikulin08a/nikulin08a.pdf},
  url = {http://proceedings.mlr.press/v3/nikulin08a.html},
  abstract = {The random sets approach is heuristic in nature and has been inspired by the growing speed of computations. For example, we can consider a large number of classifiers where any single classifier is based on a relatively small subset of randomly selected features or random sets of features. Using cross-validation we can rank all random sets according to the selected criterion, and use this ranking for further feature selection. Another application of random sets was motivated by the huge imbalanced data, which represent significant problem because the corresponding classifier has a tendency to ignore patterns with smaller representation in the training set. Again, we propose to consider a large number of balanced training subsets where representatives from both patterns are selected randomly. The above models demonstrated competitive results in two data mining competitions.}
}
Endnote
%0 Conference Paper
%T Random Sets Approach and its Applications
%A Vladimir Nikulin
%B Proceedings of the Workshop on the Causation and Prediction Challenge at WCCI 2008
%C Proceedings of Machine Learning Research
%D 2008
%E Isabelle Guyon
%E Constantin Aliferis
%E Greg Cooper
%E André Elisseeff
%E Jean-Philippe Pellet
%E Peter Spirtes
%E Alexander Statnikov
%F pmlr-v3-nikulin08a
%I PMLR
%P 65--76
%U http://proceedings.mlr.press/v3/nikulin08a.html
%V 3
%X The random sets approach is heuristic in nature and has been inspired by the growing speed of computations. For example, we can consider a large number of classifiers where any single classifier is based on a relatively small subset of randomly selected features or random sets of features. Using cross-validation we can rank all random sets according to the selected criterion, and use this ranking for further feature selection. Another application of random sets was motivated by the huge imbalanced data, which represent significant problem because the corresponding classifier has a tendency to ignore patterns with smaller representation in the training set. Again, we propose to consider a large number of balanced training subsets where representatives from both patterns are selected randomly. The above models demonstrated competitive results in two data mining competitions.
RIS
TY  - CPAPER
TI  - Random Sets Approach and its Applications
AU  - Vladimir Nikulin
BT  - Proceedings of the Workshop on the Causation and Prediction Challenge at WCCI 2008
DA  - 2008/12/31
ED  - Isabelle Guyon
ED  - Constantin Aliferis
ED  - Greg Cooper
ED  - André Elisseeff
ED  - Jean-Philippe Pellet
ED  - Peter Spirtes
ED  - Alexander Statnikov
ID  - pmlr-v3-nikulin08a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 3
SP  - 65
EP  - 76
L1  - http://proceedings.mlr.press/v3/nikulin08a/nikulin08a.pdf
UR  - http://proceedings.mlr.press/v3/nikulin08a.html
AB  - The random sets approach is heuristic in nature and has been inspired by the growing speed of computations. For example, we can consider a large number of classifiers where any single classifier is based on a relatively small subset of randomly selected features or random sets of features. Using cross-validation we can rank all random sets according to the selected criterion, and use this ranking for further feature selection. Another application of random sets was motivated by the huge imbalanced data, which represent significant problem because the corresponding classifier has a tendency to ignore patterns with smaller representation in the training set. Again, we propose to consider a large number of balanced training subsets where representatives from both patterns are selected randomly. The above models demonstrated competitive results in two data mining competitions.
ER  -
APA
Nikulin, V. (2008). Random Sets Approach and its Applications. Proceedings of the Workshop on the Causation and Prediction Challenge at WCCI 2008, in Proceedings of Machine Learning Research 3:65-76. Available from http://proceedings.mlr.press/v3/nikulin08a.html.
