Considerations for data acquisition and modeling strategies: Mitosis detection in computational pathology

Zongliang Ji, Philip Rosenfield, Christina Eng, Sarah Bettigole, Danielle C Gibson, Hamid Masoudi, Matthew Hanna, Nicolo Fusi, Kristen A Severson
Medical Imaging with Deep Learning, PMLR 227:1051-1066, 2024.

Abstract

Preparing data for machine learning tasks in health and life science applications requires decisions that affect the cost, model properties and performance. In this work, we study the implication of data collection strategies, focusing on a case study of mitosis detection. Specifically, we investigate the use of expert and crowd-sourced labelers, the impact of aggregated vs single labels, and the framing of the problem as either classification or object detection. Our results demonstrate the value of crowd-sourced labels, importance of uncertainty quantification, and utility of negative samples.

Cite this Paper


BibTeX
@InProceedings{pmlr-v227-ji24b,
  title = {Considerations for data acquisition and modeling strategies: Mitosis detection in computational pathology},
  author = {Ji, Zongliang and Rosenfield, Philip and Eng, Christina and Bettigole, Sarah and Gibson, Danielle C and Masoudi, Hamid and Hanna, Matthew and Fusi, Nicolo and Severson, Kristen A},
  booktitle = {Medical Imaging with Deep Learning},
  pages = {1051--1066},
  year = {2024},
  editor = {Oguz, Ipek and Noble, Jack and Li, Xiaoxiao and Styner, Martin and Baumgartner, Christian and Rusu, Mirabela and Heinmann, Tobias and Kontos, Despina and Landman, Bennett and Dawant, Benoit},
  volume = {227},
  series = {Proceedings of Machine Learning Research},
  month = {10--12 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v227/ji24b/ji24b.pdf},
  url = {https://proceedings.mlr.press/v227/ji24b.html},
  abstract = {Preparing data for machine learning tasks in health and life science applications requires decisions that affect the cost, model properties and performance. In this work, we study the implication of data collection strategies, focusing on a case study of mitosis detection. Specifically, we investigate the use of expert and crowd-sourced labelers, the impact of aggregated vs single labels, and the framing of the problem as either classification or object detection. Our results demonstrate the value of crowd-sourced labels, importance of uncertainty quantification, and utility of negative samples.}
}
Endnote
%0 Conference Paper
%T Considerations for data acquisition and modeling strategies: Mitosis detection in computational pathology
%A Zongliang Ji
%A Philip Rosenfield
%A Christina Eng
%A Sarah Bettigole
%A Danielle C Gibson
%A Hamid Masoudi
%A Matthew Hanna
%A Nicolo Fusi
%A Kristen A Severson
%B Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ipek Oguz
%E Jack Noble
%E Xiaoxiao Li
%E Martin Styner
%E Christian Baumgartner
%E Mirabela Rusu
%E Tobias Heinmann
%E Despina Kontos
%E Bennett Landman
%E Benoit Dawant
%F pmlr-v227-ji24b
%I PMLR
%P 1051--1066
%U https://proceedings.mlr.press/v227/ji24b.html
%V 227
%X Preparing data for machine learning tasks in health and life science applications requires decisions that affect the cost, model properties and performance. In this work, we study the implication of data collection strategies, focusing on a case study of mitosis detection. Specifically, we investigate the use of expert and crowd-sourced labelers, the impact of aggregated vs single labels, and the framing of the problem as either classification or object detection. Our results demonstrate the value of crowd-sourced labels, importance of uncertainty quantification, and utility of negative samples.
APA
Ji, Z., Rosenfield, P., Eng, C., Bettigole, S., Gibson, D.C., Masoudi, H., Hanna, M., Fusi, N. & Severson, K.A. (2024). Considerations for data acquisition and modeling strategies: Mitosis detection in computational pathology. Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 227:1051-1066. Available from https://proceedings.mlr.press/v227/ji24b.html.