Feature-Budgeted Random Forest

Feng Nan, Joseph Wang, Venkatesh Saligrama
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1983-1991, 2015.

Abstract

We seek decision rules for \it prediction-time cost reduction, where complete data is available for training, but during prediction-time, each feature can only be acquired for an additional cost. We propose a novel random forest algorithm to minimize prediction error for a user-specified \it average feature acquisition budget. While random forests yield strong generalization performance, they do not explicitly account for feature costs and furthermore require low correlation among trees, which amplifies costs. Our random forest grows trees with low acquisition cost and high strength based on greedy minimax cost-weighted-impurity splits. Theoretically, we establish near-optimal acquisition cost guarantees for our algorithm. Empirically, on a number of benchmark datasets we demonstrate competitive accuracy-cost curves against state-of-the-art prediction-time algorithms.

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-nan15, title = {Feature-Budgeted Random Forest}, author = {Nan, Feng and Wang, Joseph and Saligrama, Venkatesh}, booktitle = {Proceedings of the 32nd International Conference on Machine Learning}, pages = {1983--1991}, year = {2015}, editor = {Bach, Francis and Blei, David}, volume = {37}, series = {Proceedings of Machine Learning Research}, address = {Lille, France}, month = {07--09 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v37/nan15.pdf}, url = {https://proceedings.mlr.press/v37/nan15.html}, abstract = {We seek decision rules for \it prediction-time cost reduction, where complete data is available for training, but during prediction-time, each feature can only be acquired for an additional cost. We propose a novel random forest algorithm to minimize prediction error for a user-specified \it average feature acquisition budget. While random forests yield strong generalization performance, they do not explicitly account for feature costs and furthermore require low correlation among trees, which amplifies costs. Our random forest grows trees with low acquisition cost and high strength based on greedy minimax cost-weighted-impurity splits. Theoretically, we establish near-optimal acquisition cost guarantees for our algorithm. Empirically, on a number of benchmark datasets we demonstrate competitive accuracy-cost curves against state-of-the-art prediction-time algorithms.} }
Endnote
%0 Conference Paper %T Feature-Budgeted Random Forest %A Feng Nan %A Joseph Wang %A Venkatesh Saligrama %B Proceedings of the 32nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2015 %E Francis Bach %E David Blei %F pmlr-v37-nan15 %I PMLR %P 1983--1991 %U https://proceedings.mlr.press/v37/nan15.html %V 37 %X We seek decision rules for \it prediction-time cost reduction, where complete data is available for training, but during prediction-time, each feature can only be acquired for an additional cost. We propose a novel random forest algorithm to minimize prediction error for a user-specified \it average feature acquisition budget. While random forests yield strong generalization performance, they do not explicitly account for feature costs and furthermore require low correlation among trees, which amplifies costs. Our random forest grows trees with low acquisition cost and high strength based on greedy minimax cost-weighted-impurity splits. Theoretically, we establish near-optimal acquisition cost guarantees for our algorithm. Empirically, on a number of benchmark datasets we demonstrate competitive accuracy-cost curves against state-of-the-art prediction-time algorithms.
RIS
TY - CPAPER TI - Feature-Budgeted Random Forest AU - Feng Nan AU - Joseph Wang AU - Venkatesh Saligrama BT - Proceedings of the 32nd International Conference on Machine Learning DA - 2015/06/01 ED - Francis Bach ED - David Blei ID - pmlr-v37-nan15 PB - PMLR DP - Proceedings of Machine Learning Research VL - 37 SP - 1983 EP - 1991 L1 - http://proceedings.mlr.press/v37/nan15.pdf UR - https://proceedings.mlr.press/v37/nan15.html AB - We seek decision rules for \it prediction-time cost reduction, where complete data is available for training, but during prediction-time, each feature can only be acquired for an additional cost. We propose a novel random forest algorithm to minimize prediction error for a user-specified \it average feature acquisition budget. While random forests yield strong generalization performance, they do not explicitly account for feature costs and furthermore require low correlation among trees, which amplifies costs. Our random forest grows trees with low acquisition cost and high strength based on greedy minimax cost-weighted-impurity splits. Theoretically, we establish near-optimal acquisition cost guarantees for our algorithm. Empirically, on a number of benchmark datasets we demonstrate competitive accuracy-cost curves against state-of-the-art prediction-time algorithms. ER -
APA
Nan, F., Wang, J. & Saligrama, V.. (2015). Feature-Budgeted Random Forest. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:1983-1991 Available from https://proceedings.mlr.press/v37/nan15.html.

Related Material