Evasion and Hardening of Tree Ensemble Classifiers

Alex Kantchelian, J. D. Tygar, Anthony Joseph
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2387-2396, 2016.

Abstract

Classifier evasion consists in finding, for a given instance x, the “nearest” instance x’ such that the classifier predictions of x and x’ are different. We present two novel algorithms for systematically computing evasions for tree ensembles such as boosted trees and random forests. Our first algorithm uses a Mixed Integer Linear Program solver and finds the optimal evading instance under an expressive set of constraints. Our second algorithm trades off optimality for speed by using symbolic prediction, a novel algorithm for fast finite differences on tree ensembles. On a digit recognition task, we demonstrate that both gradient boosted trees and random forests are extremely susceptible to evasions. Finally, we harden a boosted tree model without loss of predictive accuracy by augmenting the training set of each boosting round with evading instances, a technique we call adversarial boosting.
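
The abstract packs several ideas into a few sentences; two short sketches below unpack the central ones. First, the evasion problem can be stated formally. The abstract only says “nearest”, so the norm p below is a placeholder, not a choice the abstract fixes:

\[
x^{\ast} \;=\; \operatorname*{arg\,min}_{x'} \; \lVert x' - x \rVert_{p}
\quad \text{subject to} \quad f(x') \neq f(x),
\]

where f is the trained tree-ensemble classifier and x is the instance being attacked. The MILP algorithm finds the optimal such x’ under an expressive set of constraints; symbolic prediction trades that optimality for speed.

Second, a minimal sketch of adversarial boosting, assuming a squared-loss gradient booster with labels in {-1, +1} (the abstract does not specify the booster or loss). The `find_evasion` helper is a crude hill climber standing in for the paper’s MILP and symbolic-prediction attacks; it is illustrative only:

```python
# Hedged sketch of adversarial boosting: each boosting round's training
# set is augmented with evading instances found against the ensemble
# built so far. All concrete choices here (squared loss, tree depth,
# the hill-climbing evasion oracle) are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def score(trees, X, lr):
    """Ensemble score: sum of scaled tree outputs; its sign is the class."""
    s = np.zeros(len(X))
    for t in trees:
        s += lr * t.predict(X)
    return s

def find_evasion(trees, lr, x, step=0.1, max_iter=200, seed=0):
    """Illustrative stand-in for the paper's evasion routines: random
    coordinate moves, kept whenever they push the score toward the
    opposite class. Returns an evading instance or None."""
    rng = np.random.default_rng(seed)
    target = -np.sign(score(trees, x[None, :], lr)[0] or 1.0)
    best = x.copy()
    for _ in range(max_iter):
        cand = best.copy()
        j = rng.integers(len(cand))
        cand[j] += step * rng.choice([-1.0, 1.0])
        if (target * score(trees, cand[None, :], lr)[0]
                > target * score(trees, best[None, :], lr)[0]):
            best = cand
        if np.sign(score(trees, best[None, :], lr)[0]) == target:
            return best  # prediction flipped: an evading instance
    return None

def adversarial_boosting(X, y, n_rounds=20, lr=0.1):
    """Fit a tree to the residuals each round, then augment the training
    set with evading instances labeled as the points they came from."""
    trees, X_aug, y_aug = [], X.copy(), np.asarray(y, dtype=float)
    for _ in range(n_rounds):
        residuals = y_aug - score(trees, X_aug, lr)
        trees.append(DecisionTreeRegressor(max_depth=4).fit(X_aug, residuals))
        evs = [(find_evasion(trees, lr, x), yi) for x, yi in zip(X, y)]
        evs = [(e, yi) for e, yi in evs if e is not None]
        if evs:
            X_aug = np.vstack([X_aug] + [e[None, :] for e, _ in evs])
            y_aug = np.concatenate([y_aug, [yi for _, yi in evs]])
    return trees
```

The point of the loop is that each round’s tree is fit on a training set that already contains evading instances found against the ensemble so far, so later rounds are forced to correct the boundary regions the attack exploits; per the abstract, this hardening comes without loss of predictive accuracy.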

Cite this Paper

BibTeX
@InProceedings{pmlr-v48-kantchelian16,
  title     = {Evasion and Hardening of Tree Ensemble Classifiers},
  author    = {Kantchelian, Alex and Tygar, J. D. and Joseph, Anthony},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages     = {2387--2396},
  year      = {2016},
  editor    = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume    = {48},
  series    = {Proceedings of Machine Learning Research},
  address   = {New York, New York, USA},
  month     = {20--22 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v48/kantchelian16.pdf},
  url       = {https://proceedings.mlr.press/v48/kantchelian16.html},
  abstract  = {Classifier evasion consists in finding for a given instance x the “nearest” instance x’ such that the classifier predictions of x and x’ are different. We present two novel algorithms for systematically computing evasions for tree ensembles such as boosted trees and random forests. Our first algorithm uses a Mixed Integer Linear Program solver and finds the optimal evading instance under an expressive set of constraints. Our second algorithm trades off optimality for speed by using symbolic prediction, a novel algorithm for fast finite differences on tree ensembles. On a digit recognition task, we demonstrate that both gradient boosted trees and random forests are extremely susceptible to evasions. Finally, we harden a boosted tree model without loss of predictive accuracy by augmenting the training set of each boosting round with evading instances, a technique we call adversarial boosting.}
}
Endnote
%0 Conference Paper
%T Evasion and Hardening of Tree Ensemble Classifiers
%A Alex Kantchelian
%A J. D. Tygar
%A Anthony Joseph
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-kantchelian16
%I PMLR
%P 2387--2396
%U https://proceedings.mlr.press/v48/kantchelian16.html
%V 48
%X Classifier evasion consists in finding for a given instance x the “nearest” instance x’ such that the classifier predictions of x and x’ are different. We present two novel algorithms for systematically computing evasions for tree ensembles such as boosted trees and random forests. Our first algorithm uses a Mixed Integer Linear Program solver and finds the optimal evading instance under an expressive set of constraints. Our second algorithm trades off optimality for speed by using symbolic prediction, a novel algorithm for fast finite differences on tree ensembles. On a digit recognition task, we demonstrate that both gradient boosted trees and random forests are extremely susceptible to evasions. Finally, we harden a boosted tree model without loss of predictive accuracy by augmenting the training set of each boosting round with evading instances, a technique we call adversarial boosting.
RIS
TY - CPAPER
TI - Evasion and Hardening of Tree Ensemble Classifiers
AU - Alex Kantchelian
AU - J. D. Tygar
AU - Anthony Joseph
BT - Proceedings of The 33rd International Conference on Machine Learning
DA - 2016/06/11
ED - Maria Florina Balcan
ED - Kilian Q. Weinberger
ID - pmlr-v48-kantchelian16
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 48
SP - 2387
EP - 2396
L1 - http://proceedings.mlr.press/v48/kantchelian16.pdf
UR - https://proceedings.mlr.press/v48/kantchelian16.html
AB - Classifier evasion consists in finding for a given instance x the “nearest” instance x’ such that the classifier predictions of x and x’ are different. We present two novel algorithms for systematically computing evasions for tree ensembles such as boosted trees and random forests. Our first algorithm uses a Mixed Integer Linear Program solver and finds the optimal evading instance under an expressive set of constraints. Our second algorithm trades off optimality for speed by using symbolic prediction, a novel algorithm for fast finite differences on tree ensembles. On a digit recognition task, we demonstrate that both gradient boosted trees and random forests are extremely susceptible to evasions. Finally, we harden a boosted tree model without loss of predictive accuracy by augmenting the training set of each boosting round with evading instances, a technique we call adversarial boosting.
ER -
APA
Kantchelian, A., Tygar, J.D. & Joseph, A. (2016). Evasion and Hardening of Tree Ensemble Classifiers. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:2387-2396. Available from https://proceedings.mlr.press/v48/kantchelian16.html.
