Multi-task Kernel Learning Based on Probabilistic Lipschitzness

Anastasia Pentina, Shai Ben-David
Proceedings of Algorithmic Learning Theory, PMLR 83:682-701, 2018.

Abstract

In multi-task learning, the learner is given data for a set of related learning tasks and aims to improve the overall learning performance by transferring information between them. A typical assumption exploited in this setting is that the tasks share a beneficial representation that can be learned from the joint training data of all tasks. In this way, the training data of each task can be utilized to enhance the learning of other tasks in the set. Probabilistic Lipschitzness (PL) is a parameter that reflects one way in which some data representation can be beneficial for a classification learning task. In this work we propose to achieve multi-task learning by learning a kernel function relative to which each of the tasks in the set has a "high level" of probabilistic Lipschitzness. In order to be able to do that, we need to introduce a new variant of PL, one that allows reliable estimation of its value from finite-size samples. We show that by having access to large amounts of training data in total (possibly the union of training sets for various tasks), the learner can identify a kernel function that would lead to fast learning rates per task when used for Nearest Neighbor classification or in a cluster-based active labeling procedure.
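For intuition, one commonly cited formulation of PL from earlier work (Urner and Ben-David) says, roughly, that a labeling function f is phi-PL with respect to a marginal distribution D and a metric d if, for every lambda > 0, Pr_{x~D}[ there exists y in supp(D) with f(y) != f(x) and d(x, y) <= lambda ] <= phi(lambda). The sketch below is only an illustrative plug-in estimate of this quantity from a finite sample under a kernel-induced metric; it is not the estimator or the variant of PL developed in the paper, and the RBF kernel and bandwidth are arbitrary choices for the toy example. The distance d_K(x, x')^2 = K(x,x) - 2K(x,x') + K(x',x') is the standard metric induced by a kernel K.

# Illustrative sketch, NOT the paper's estimator: a plug-in estimate of an
# empirical PL profile for a labeled sample under a kernel-induced metric.
import numpy as np

def kernel_distance(K):
    # Metric induced by a kernel matrix K: d(i, j)^2 = K_ii - 2 K_ij + K_jj.
    diag = np.diag(K)
    sq = diag[:, None] - 2.0 * K + diag[None, :]
    return np.sqrt(np.maximum(sq, 0.0))  # clamp tiny negatives from rounding

def empirical_pl(D, y, lam):
    # Fraction of sample points that have a differently labeled sample point
    # within distance lam: a finite-sample surrogate for Pr_{x~D}[...] above.
    diff = y[:, None] != y[None, :]
    close = D <= lam
    return np.mean(np.any(diff & close, axis=1))

# Toy usage with a Gaussian (RBF) kernel on synthetic linearly separated data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(int)
sq_norms = np.sum(X**2, axis=1)
K = np.exp(-(sq_norms[:, None] - 2 * X @ X.T + sq_norms[None, :]) / 2.0)
D = kernel_distance(K)
for lam in (0.05, 0.1, 0.2):
    print(lam, empirical_pl(D, y, lam))

A kernel under which this fraction stays small for small lambda, simultaneously for all tasks, is the kind of shared representation the paper seeks, since locally constant labels are what make Nearest Neighbor classification and cluster-based active labeling sample-efficient.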

Cite this Paper


BibTeX
@InProceedings{pmlr-v83-pentina18a,
  title     = {Multi-task {K}ernel {L}earning Based on {P}robabilistic {L}ipschitzness},
  author    = {Pentina, Anastasia and Ben-David, Shai},
  booktitle = {Proceedings of Algorithmic Learning Theory},
  pages     = {682--701},
  year      = {2018},
  editor    = {Janoos, Firdaus and Mohri, Mehryar and Sridharan, Karthik},
  volume    = {83},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--09 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v83/pentina18a/pentina18a.pdf},
  url       = {https://proceedings.mlr.press/v83/pentina18a.html},
  abstract  = {In multi-task learning, the learner is given data for a set of related learning tasks and aims to improve the overall learning performance by transferring information between them. A typical assumption exploited in this setting is that the tasks share a beneficial representation that can be learned from the joint training data of all tasks. In this way, the training data of each task can be utilized to enhance the learning of other tasks in the set. Probabilistic Lipschitzness (PL) is a parameter that reflects one way in which some data representation can be beneficial for a classification learning task. In this work we propose to achieve multi-task learning by learning a kernel function relative to which each of the tasks in the set has a "high level" of probabilistic Lipschitzness. In order to be able to do that, we need to introduce a new variant of PL, one that allows reliable estimation of its value from finite-size samples. We show that by having access to large amounts of training data in total (possibly the union of training sets for various tasks), the learner can identify a kernel function that would lead to fast learning rates per task when used for Nearest Neighbor classification or in a cluster-based active labeling procedure.}
}
Endnote
%0 Conference Paper
%T Multi-task Kernel Learning Based on Probabilistic Lipschitzness
%A Anastasia Pentina
%A Shai Ben-David
%B Proceedings of Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2018
%E Firdaus Janoos
%E Mehryar Mohri
%E Karthik Sridharan
%F pmlr-v83-pentina18a
%I PMLR
%P 682--701
%U https://proceedings.mlr.press/v83/pentina18a.html
%V 83
%X In multi-task learning, the learner is given data for a set of related learning tasks and aims to improve the overall learning performance by transferring information between them. A typical assumption exploited in this setting is that the tasks share a beneficial representation that can be learned from the joint training data of all tasks. In this way, the training data of each task can be utilized to enhance the learning of other tasks in the set. Probabilistic Lipschitzness (PL) is a parameter that reflects one way in which some data representation can be beneficial for a classification learning task. In this work we propose to achieve multi-task learning by learning a kernel function relative to which each of the tasks in the set has a "high level" of probabilistic Lipschitzness. In order to be able to do that, we need to introduce a new variant of PL, one that allows reliable estimation of its value from finite-size samples. We show that by having access to large amounts of training data in total (possibly the union of training sets for various tasks), the learner can identify a kernel function that would lead to fast learning rates per task when used for Nearest Neighbor classification or in a cluster-based active labeling procedure.
APA
Pentina, A. & Ben-David, S. (2018). Multi-task Kernel Learning Based on Probabilistic Lipschitzness. Proceedings of Algorithmic Learning Theory, in Proceedings of Machine Learning Research 83:682-701. Available from https://proceedings.mlr.press/v83/pentina18a.html.
