Sample efficient learning of image-based diagnostic classifiers via probabilistic labels

Roberto Vega, Pouneh Gorji, Zichen Zhang, Xuebin Qin, Abhilash Rakkunedeth, Jeevesh Kapur, Jacob Jaremko, Russell Greiner
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:739-747, 2021.

Abstract

Deep learning approaches often require huge datasets to achieve good generalization. This complicates their use in tasks like image-based medical diagnosis, where the small training datasets are usually insufficient to learn appropriate data representations. For such sensitive tasks it is also important to provide the confidence in the predictions. Here, we propose a way to learn and use probabilistic labels to train accurate and calibrated deep networks from relatively small datasets. We observe gains of up to 22% in the accuracy of models trained with these labels, as compared with traditional approaches, in three classification tasks: diagnosis of hip dysplasia, fatty liver, and glaucoma. The outputs of models trained with probabilistic labels are calibrated, allowing the interpretation of their predictions as proper probabilities. We anticipate this approach will apply to other tasks where few training instances are available and expert knowledge can be encoded as probabilities.

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-vega21a,
  title = {Sample efficient learning of image-based diagnostic classifiers via probabilistic labels},
  author = {Vega, Roberto and Gorji, Pouneh and Zhang, Zichen and Qin, Xuebin and Rakkunedeth, Abhilash and Kapur, Jeevesh and Jaremko, Jacob and Greiner, Russell},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = {739--747},
  year = {2021},
  editor = {Banerjee, Arindam and Fukumizu, Kenji},
  volume = {130},
  series = {Proceedings of Machine Learning Research},
  month = {13--15 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v130/vega21a/vega21a.pdf},
  url = {https://proceedings.mlr.press/v130/vega21a.html},
  abstract = {Deep learning approaches often require huge datasets to achieve good generalization. This complicates its use in tasks like image-based medical diagnosis, where the small training datasets are usually insufficient to learn appropriate data representations. For such sensitive tasks it is also important to provide the confidence in the predictions. Here, we propose a way to learn and use probabilistic labels to train accurate and calibrated deep networks from relatively small datasets. We observe gains of up to 22\% in the accuracy of models trained with these labels, as compared with traditional approaches, in three classification tasks: diagnosis of hip dysplasia, fatty liver, and glaucoma. The outputs of models trained with probabilistic labels are calibrated, allowing the interpretation of its predictions as proper probabilities. We anticipate this approach will apply to other tasks where few training instances are available and expert knowledge can be encoded as probabilities.}
}
Endnote
%0 Conference Paper
%T Sample efficient learning of image-based diagnostic classifiers via probabilistic labels
%A Roberto Vega
%A Pouneh Gorji
%A Zichen Zhang
%A Xuebin Qin
%A Abhilash Rakkunedeth
%A Jeevesh Kapur
%A Jacob Jaremko
%A Russell Greiner
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-vega21a
%I PMLR
%P 739--747
%U https://proceedings.mlr.press/v130/vega21a.html
%V 130
%X Deep learning approaches often require huge datasets to achieve good generalization. This complicates its use in tasks like image-based medical diagnosis, where the small training datasets are usually insufficient to learn appropriate data representations. For such sensitive tasks it is also important to provide the confidence in the predictions. Here, we propose a way to learn and use probabilistic labels to train accurate and calibrated deep networks from relatively small datasets. We observe gains of up to 22% in the accuracy of models trained with these labels, as compared with traditional approaches, in three classification tasks: diagnosis of hip dysplasia, fatty liver, and glaucoma. The outputs of models trained with probabilistic labels are calibrated, allowing the interpretation of its predictions as proper probabilities. We anticipate this approach will apply to other tasks where few training instances are available and expert knowledge can be encoded as probabilities.
APA
Vega, R., Gorji, P., Zhang, Z., Qin, X., Rakkunedeth, A., Kapur, J., Jaremko, J., & Greiner, R. (2021). Sample efficient learning of image-based diagnostic classifiers via probabilistic labels. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:739-747. Available from https://proceedings.mlr.press/v130/vega21a.html.
