Understanding Self-Training for Gradual Domain Adaptation

Ananya Kumar, Tengyu Ma, Percy Liang
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:5468-5479, 2020.

Abstract

Machine learning systems must adapt to data distributions that evolve over time, in applications ranging from sensor networks and self-driving car perception modules to brain-machine interfaces. Traditional domain adaptation is only guaranteed to work when the distribution shift is small; empirical methods combine several heuristics for larger shifts but can be dataset specific. To adapt to larger shifts we consider gradual domain adaptation, where the goal is to adapt an initial classifier trained on a source domain given only unlabeled data that shifts gradually in distribution towards a target domain. We prove the first non-vacuous upper bound on the error of self-training with gradual shifts, under settings where directly adapting to the target domain can result in unbounded error. The theoretical analysis leads to algorithmic insights, highlighting that regularization and label sharpening are essential even when we have infinite data. Leveraging the gradual shift structure leads to higher accuracies on a rotating MNIST dataset, a forest Cover Type dataset, and a realistic Portraits dataset.
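The procedure the abstract describes can be illustrated on toy data. The following is a minimal sketch, not the authors' implementation: it assumes a 2-D linear logistic model, two Gaussian classes whose means rotate by 10 degrees per domain, an L2 penalty standing in for regularization, and hard pseudolabels standing in for label sharpening. All function names (`train_logreg`, `gradual_self_train`, `run_demo`) and all constants are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(xs, ys, l2=0.1, lr=0.5, steps=200):
    """Fit an L2-regularized logistic regression (no bias term) by gradient descent."""
    w = [0.0, 0.0]
    n = len(xs)
    for _ in range(steps):
        grad = [l2 * w[0], l2 * w[1]]  # gradient of the L2 penalty
        for (x0, x1), y in zip(xs, ys):
            p = sigmoid(w[0] * x0 + w[1] * x1)
            grad[0] += (p - y) * x0 / n
            grad[1] += (p - y) * x1 / n
        w = [w[0] - lr * grad[0], w[1] - lr * grad[1]]
    return w


def predict(w, x):
    # Hard 0/1 prediction; used both for evaluation and for pseudolabels.
    return 1 if w[0] * x[0] + w[1] * x[1] >= 0 else 0


def accuracy(w, xs, ys):
    return sum(predict(w, x) == y for x, y in zip(xs, ys)) / len(xs)


def gradual_self_train(w, unlabeled_domains, l2=0.1):
    """Self-train through a sequence of unlabeled domains, one at a time."""
    for xs in unlabeled_domains:
        pseudo = [predict(w, x) for x in xs]  # hard labels (label sharpening)
        w = train_logreg(xs, pseudo, l2=l2)   # refit with regularization
    return w


def make_domain(angle, n, rng):
    """Toy domain: class means at +/- 2 * (cos(angle), sin(angle)), noise std 0.5."""
    xs, ys = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        sign = 1.0 if y == 1 else -1.0
        xs.append((sign * 2 * math.cos(angle) + rng.gauss(0, 0.5),
                   sign * 2 * math.sin(angle) + rng.gauss(0, 0.5)))
        ys.append(y)
    return xs, ys


def run_demo(seed=0, n=200, num_steps=12, step_deg=10.0):
    rng = random.Random(seed)
    src_xs, src_ys = make_domain(0.0, n, rng)
    w_src = train_logreg(src_xs, src_ys)  # only the source domain is labeled

    # Unlabeled domains rotating gradually; the last one is the target.
    domains = [make_domain(math.radians(step_deg * k), n, rng)
               for k in range(1, num_steps + 1)]
    tgt_xs, tgt_ys = domains[-1]  # target labels are used for evaluation only

    w_gradual = gradual_self_train(w_src, [xs for xs, _ in domains])
    w_direct = gradual_self_train(w_src, [tgt_xs])  # skip the intermediate domains
    return accuracy(w_gradual, tgt_xs, tgt_ys), accuracy(w_direct, tgt_xs, tgt_ys)


if __name__ == "__main__":
    gradual_acc, direct_acc = run_demo()
    print("gradual self-training:", gradual_acc)
    print("direct self-training: ", direct_acc)
```

In this toy setup the total rotation is 120 degrees, so the source classifier mislabels most target points and self-training directly on the target reinforces those mistakes, while stepping through the intermediate domains keeps each set of pseudolabels nearly correct. The sketch only illustrates the setting and does not reproduce the paper's bounds or experiments.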

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-kumar20c,
  title = {Understanding Self-Training for Gradual Domain Adaptation},
  author = {Kumar, Ananya and Ma, Tengyu and Liang, Percy},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {5468--5479},
  year = {2020},
  editor = {Daumé III, Hal and Singh, Aarti},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/kumar20c/kumar20c.pdf},
  url = {https://proceedings.mlr.press/v119/kumar20c.html},
  abstract = {Machine learning systems must adapt to data distributions that evolve over time, in applications ranging from sensor networks and self-driving car perception modules to brain-machine interfaces. Traditional domain adaptation is only guaranteed to work when the distribution shift is small; empirical methods combine several heuristics for larger shifts but can be dataset specific. To adapt to larger shifts we consider gradual domain adaptation, where the goal is to adapt an initial classifier trained on a source domain given only unlabeled data that shifts gradually in distribution towards a target domain. We prove the first non-vacuous upper bound on the error of self-training with gradual shifts, under settings where directly adapting to the target domain can result in unbounded error. The theoretical analysis leads to algorithmic insights, highlighting that regularization and label sharpening are essential even when we have infinite data. Leveraging the gradual shift structure leads to higher accuracies on a rotating MNIST dataset, a forest Cover Type dataset, and a realistic Portraits dataset.}
}
Endnote
%0 Conference Paper
%T Understanding Self-Training for Gradual Domain Adaptation
%A Ananya Kumar
%A Tengyu Ma
%A Percy Liang
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-kumar20c
%I PMLR
%P 5468--5479
%U https://proceedings.mlr.press/v119/kumar20c.html
%V 119
%X Machine learning systems must adapt to data distributions that evolve over time, in applications ranging from sensor networks and self-driving car perception modules to brain-machine interfaces. Traditional domain adaptation is only guaranteed to work when the distribution shift is small; empirical methods combine several heuristics for larger shifts but can be dataset specific. To adapt to larger shifts we consider gradual domain adaptation, where the goal is to adapt an initial classifier trained on a source domain given only unlabeled data that shifts gradually in distribution towards a target domain. We prove the first non-vacuous upper bound on the error of self-training with gradual shifts, under settings where directly adapting to the target domain can result in unbounded error. The theoretical analysis leads to algorithmic insights, highlighting that regularization and label sharpening are essential even when we have infinite data. Leveraging the gradual shift structure leads to higher accuracies on a rotating MNIST dataset, a forest Cover Type dataset, and a realistic Portraits dataset.
APA
Kumar, A., Ma, T. & Liang, P. (2020). Understanding Self-Training for Gradual Domain Adaptation. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:5468-5479. Available from https://proceedings.mlr.press/v119/kumar20c.html.
