Divergences and Risks for Multiclass Experiments
Proceedings of the 25th Annual Conference on Learning Theory, PMLR 23:28.1-28.20, 2012.
Csiszár’s $f$-divergence is a way to measure the similarity of two probability distributions. We study the extension of $f$-divergence to more than two distributions to measure their joint similarity. By exploiting classical results from the comparison-of-experiments literature, we prove that the resulting divergence satisfies all the same properties as the traditional binary one. Considering the multidistribution case actually makes the proofs simpler. The key to these results is a formal bridge between these multidistribution $f$-divergences and Bayes risks for multiclass classification problems.
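As a rough illustration of the divergence-risk bridge (not the paper's construction), the sketch below uses the standard binary definition $\sum_x q(x)\, f(p(x)/q(x))$ for discrete distributions and the standard 0-1 loss Bayes risk $1 - \sum_x \max_i \pi_i p_i(x)$. The helper names `f_divergence` and `bayes_risk_01` and the example distributions are assumptions made for illustration. With a uniform prior over two classes, the Bayes risk equals $\tfrac{1}{2}(1 - \mathrm{TV}(p, q))$, where total variation is the $f$-divergence with $f(t) = \tfrac{1}{2}|t - 1|$.

```python
import math

def f_divergence(p, q, f):
    """Binary f-divergence sum_x q(x) * f(p(x)/q(x)); assumes every q(x) > 0."""
    return sum(qx * f(px / qx) for px, qx in zip(p, q))

def bayes_risk_01(dists, prior):
    """0-1 loss Bayes risk: 1 - sum_x max_i prior[i] * dists[i][x]."""
    n_outcomes = len(dists[0])
    return 1.0 - sum(
        max(pi * d[x] for pi, d in zip(prior, dists))
        for x in range(n_outcomes)
    )

# Two common choices of the convex generator f, each with f(1) = 0.
def kl(t):
    return t * math.log(t) if t > 0 else 0.0

def tv(t):
    return 0.5 * abs(t - 1.0)

# Hypothetical discrete distributions over three outcomes.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
r = [0.3, 0.4, 0.3]

tv_pq = f_divergence(p, q, tv)
kl_pq = f_divergence(p, q, kl)

# Binary case: Bayes risk under a uniform prior is determined by TV.
risk_binary = bayes_risk_01([p, q], [0.5, 0.5])

# Multiclass case: the same risk formula applies to three distributions.
risk_three = bayes_risk_01([p, q, r], [1 / 3, 1 / 3, 1 / 3])

print(tv_pq, kl_pq, risk_binary, risk_three)
```

The binary identity checked here is a well-known special case; the paper's multidistribution result generalises this kind of correspondence, and the three-class risk above is only meant to show which quantity sits on the risk side of that bridge.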