On Learning Distributions from their Samples
Proceedings of The 28th Conference on Learning Theory, PMLR 40:1066-1100, 2015.
One of the most natural and important questions in statistical learning is: how well can a distribution be approximated from its samples. Surprisingly, this question has so far been resolved for only one loss, the KL-divergence and even in this case, the estimator used is ad hoc and not well understood. We study distribution approximations for general loss measures. For \ell_2^2 we determine the best approximation possible, for \ell_1 and χ^2 we derive tight bounds on the best approximation, and when the probabilities are bounded away from zero, we resolve the question for all sufficiently smooth loss measures, thereby providing a coherent understanding of the rate at which distributions can be approximated from their samples.