# Learning Functional Distributions with Private Labels

*Proceedings of the 40th International Conference on Machine Learning*, PMLR 202:37728-37744, 2023.

#### Abstract

We study the problem of learning functional distributions in the presence of noise. A functional is a map from the space of features to *distributions* over a set of labels, and is often assumed to belong to a known class of hypotheses $\mathcal{F}$. Features are generated by a general random process and labels are sampled independently from feature-dependent distributions. In privacy-sensitive applications, labels are passed through a noisy kernel. We consider *online learning*, where at each time step, a predictor attempts to predict the *actual* (label) distribution given only the features and *noisy* labels in prior steps. The performance of the predictor is measured by the expected KL-risk that compares the predicted distributions to the underlying truth. We show that the *minimax* expected KL-risk is of order $\tilde{\Theta}(\sqrt{T\log|\mathcal{F}|})$ for finite hypothesis class $\mathcal{F}$ and *any* non-trivial noise level. We then extend this result to general infinite classes via the concept of *stochastic sequential covering* and provide matching lower and upper bounds for a wide range of natural classes.
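
The online protocol described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the paper's algorithm: the three-element hypothesis class, the uniform-resampling noise kernel, the parameter `eta`, and the likelihood-weighted mixture predictor are all hypothetical choices made only to show how features, clean labels, noisy labels, and the cumulative KL-risk relate to each other.

```python
import math
import random


def kl(p, q):
    """KL divergence KL(p || q) between two discrete distributions (natural log)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def noisy_kernel(num_labels, eta):
    """Assumed noise kernel: keep the label w.p. 1 - eta, else resample uniformly."""
    return [[(1 - eta) * (i == j) + eta / num_labels for j in range(num_labels)]
            for i in range(num_labels)]


def push_through(p, kernel):
    """Distribution of the noisy label when the clean label is drawn from p."""
    return [sum(p[i] * kernel[i][j] for i in range(len(p)))
            for j in range(len(kernel[0]))]


def simulate(T=200, eta=0.5, seed=0):
    """Run T rounds and return the cumulative KL-risk of a mixture predictor."""
    rng = random.Random(seed)
    # Hypothetical finite class F: feature x in [0, 1) -> distribution over 2 labels.
    hypotheses = [
        lambda x: [0.8, 0.2],
        lambda x: [0.2, 0.8],
        lambda x: [0.1 + 0.8 * x, 0.9 - 0.8 * x],
    ]
    truth = hypotheses[2]          # the underlying functional (unknown to the predictor)
    kernel = noisy_kernel(2, eta)
    log_w = [0.0] * len(hypotheses)  # log-weights, one per hypothesis
    total_risk = 0.0
    for _ in range(T):
        x = rng.random()
        # Predict the *clean* label distribution as a weighted mixture over F.
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        z = sum(w)
        preds = [f(x) for f in hypotheses]
        p_hat = [sum(wi / z * pr[k] for wi, pr in zip(w, preds)) for k in range(2)]
        # Risk compares the prediction to the true (clean) distribution.
        p_true = truth(x)
        total_risk += kl(p_true, p_hat)
        # Sample a clean label, then reveal only its noisy version.
        y = rng.choices(range(2), weights=p_true)[0]
        y_noisy = rng.choices(range(2), weights=kernel[y])[0]
        # Update each weight by the likelihood of the *noisy* label.
        for i, pr in enumerate(preds):
            log_w[i] += math.log(push_through(pr, kernel)[y_noisy])
    return total_risk


if __name__ == "__main__":
    print(simulate())
```

Note that the predictor never sees the clean label `y`; it only updates on `y_noisy`, while the risk is still measured against the clean distribution `p_true`, which mirrors the gap between observation and evaluation that the abstract emphasizes.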