[edit]
Efficient Bandit Combinatorial Optimization Algorithm with Zero-suppressed Binary Decision Diagrams
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:585-594, 2018.
Abstract
We consider bandit combinatorial optimization (BCO) problems. A BCO instance generally has a huge set of all feasible solutions, which we call the action set. To avoid dealing with such huge action sets directly, we propose an algorithm that takes advantage of zero-suppressed binary decision diagrams, which encode action sets as compact graphs. The proposed algorithm achieves either $O(T^{2/3})$ regret with high probability or $O(\sqrt{T})$ expected regret at any $T$-th round. Typically, our algorithm works efficiently for BCO problems defined on networks. Experiments show that our algorithm is applicable to various large BCO instances including adaptive routing problems on real-world networks.