Efficient Bandit Combinatorial Optimization Algorithm with Zerosuppressed Binary Decision Diagrams
[edit]
Proceedings of the TwentyFirst International Conference on Artificial Intelligence and Statistics, PMLR 84:585594, 2018.
Abstract
We consider bandit combinatorial optimization (BCO) problems. A BCO instance generally has a huge set of all feasible solutions, which we call the action set. To avoid dealing with such huge action sets directly, we propose an algorithm that takes advantage of zerosuppressed binary decision diagrams, which encode action sets as compact graphs. The proposed algorithm achieves either $O(T^{2/3})$ regret with high probability or $O(\sqrt{T})$ expected regret at any $T$th round. Typically, our algorithm works efficiently for BCO problems defined on networks. Experiments show that our algorithm is applicable to various large BCO instances including adaptive routing problems on realworld networks.
Related Material


