Where are the Whales: A Human-in-the-loop Detection Method for Identifying Whales in High-resolution Satellite Imagery

Caleb Robinson, Kimberly T. Goetz, Christin B. Khan, Meredith Sackett, Kathleen Leonard, Rahul M Dodhia, Juan M Lavista Ferres
Proceedings of The TerraBytes ICML Workshop: Towards global datasets and models for Earth Observation, PMLR 292:1-12, 2025.

Abstract

Effective monitoring of whale populations is critical for conservation, but traditional survey methods are expensive and difficult to scale. While prior work has shown that whales can be identified in very high-resolution (VHR) satellite imagery, large-scale automated detection remains challenging due to a lack of annotated imagery, variability in image quality and environmental conditions, and the cost of building robust machine learning pipelines over massive remote sensing archives. We present a semi-automated approach for surfacing possible whale detections in VHR imagery using a statistical anomaly detection method that flags spatial outliers, i.e., “interesting points”. We pair this detector with a web-based labeling interface designed to enable experts to quickly annotate the interesting points. We evaluate our system on three benchmark scenes with known whale annotations and achieve recalls of 90.3% to 96.4%, while reducing the area requiring expert inspection by up to 99.8%, from over 1,000 sq km to less than 2 sq km in some cases. Our method does not rely on labeled training data and offers a scalable first step toward future machine-assisted marine mammal monitoring from space. We have open-sourced the entire pipeline at https://github.com/microsoft/whales.
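As a rough check on the area figures quoted in the abstract, the reduction can be worked out directly. The sketch below is illustrative only: the function name `area_reduction_pct` is hypothetical, and the inputs are the rounded bounds from the abstract ("over 1,000 sq km" and "less than 2 sq km"), not exact per-scene values from the paper.

```python
def area_reduction_pct(total_sq_km: float, remaining_sq_km: float) -> float:
    """Percent of a scene that no longer needs expert inspection.

    Hypothetical helper for illustration; not part of the authors' pipeline.
    """
    if total_sq_km <= 0:
        raise ValueError("total_sq_km must be positive")
    if not 0 <= remaining_sq_km <= total_sq_km:
        raise ValueError("remaining_sq_km must lie between 0 and total_sq_km")
    return 100.0 * (1.0 - remaining_sq_km / total_sq_km)


if __name__ == "__main__":
    # Rounded bounds taken from the abstract, used here as assumed inputs.
    reduction = area_reduction_pct(total_sq_km=1000.0, remaining_sq_km=2.0)
    print(f"{reduction:.1f}%")  # prints "99.8%"
```

With these rounded inputs, inspecting 2 sq km out of 1,000 sq km corresponds to the "up to 99.8%" reduction stated in the abstract.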

Cite this Paper


BibTeX
@InProceedings{pmlr-v292-robinson25a,
  title = {Where are the Whales: A Human-in-the-loop Detection Method for Identifying Whales in High-resolution Satellite Imagery},
  author = {Robinson, Caleb and Goetz, Kimberly T. and Khan, Christin B. and Sackett, Meredith and Leonard, Kathleen and Dodhia, Rahul M and Lavista Ferres, Juan M},
  booktitle = {Proceedings of The TerraBytes {ICML} Workshop: Towards global datasets and models for Earth Observation},
  pages = {1--12},
  year = {2025},
  editor = {Audebert, Nicolas and Azizpour, Hossein and Barrière, Valentin and Castillo Navarro, Javiera and Czerkawski, Mikolaj and Fang, Heng and Francis, Alistair and Marsocci, Valerio and Nascetti, Andrea and Yadav, Ritu},
  volume = {292},
  series = {Proceedings of Machine Learning Research},
  month = {19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v292/main/assets/robinson25a/robinson25a.pdf},
  url = {https://proceedings.mlr.press/v292/robinson25a.html},
  abstract = {Effective monitoring of whale populations is critical for conservation, but traditional survey methods are expensive and difficult to scale. While prior work has shown that whales can be identified in very high-resolution (VHR) satellite imagery, large-scale automated detection remains challenging due to a lack of annotated imagery, variability in image quality and environmental conditions, and the cost of building robust machine learning pipelines over massive remote sensing archives. We present a semi-automated approach for surfacing possible whale detections in VHR imagery using a statistical anomaly detection method that flags spatial outliers, i.e. “interesting points”. We pair this detector with a web-based labeling interface designed to enable experts to quickly annotate the interesting points. We evaluate our system on three benchmark scenes with known whale annotations and achieve recalls of 90.3% to 96.4%, while reducing the area requiring expert inspection by up to 99.8%, from over 1,000 sq km to less than 2 sq km in some cases. Our method does not rely on labeled training data and offers a scalable first step toward future machine-assisted marine mammal monitoring from space. We have open sourced the entire pipeline at \url{https://github.com/microsoft/whales}.}
}
Endnote
%0 Conference Paper
%T Where are the Whales: A Human-in-the-loop Detection Method for Identifying Whales in High-resolution Satellite Imagery
%A Caleb Robinson
%A Kimberly T. Goetz
%A Christin B. Khan
%A Meredith Sackett
%A Kathleen Leonard
%A Rahul M Dodhia
%A Juan M Lavista Ferres
%B Proceedings of The TerraBytes ICML Workshop: Towards global datasets and models for Earth Observation
%C Proceedings of Machine Learning Research
%D 2025
%E Nicolas Audebert
%E Hossein Azizpour
%E Valentin Barrière
%E Javiera Castillo Navarro
%E Mikolaj Czerkawski
%E Heng Fang
%E Alistair Francis
%E Valerio Marsocci
%E Andrea Nascetti
%E Ritu Yadav
%F pmlr-v292-robinson25a
%I PMLR
%P 1-12
%U https://proceedings.mlr.press/v292/robinson25a.html
%V 292
%X Effective monitoring of whale populations is critical for conservation, but traditional survey methods are expensive and difficult to scale. While prior work has shown that whales can be identified in very high-resolution (VHR) satellite imagery, large-scale automated detection remains challenging due to a lack of annotated imagery, variability in image quality and environmental conditions, and the cost of building robust machine learning pipelines over massive remote sensing archives. We present a semi-automated approach for surfacing possible whale detections in VHR imagery using a statistical anomaly detection method that flags spatial outliers, i.e. “interesting points”. We pair this detector with a web-based labeling interface designed to enable experts to quickly annotate the interesting points. We evaluate our system on three benchmark scenes with known whale annotations and achieve recalls of 90.3% to 96.4%, while reducing the area requiring expert inspection by up to 99.8%, from over 1,000 sq km to less than 2 sq km in some cases. Our method does not rely on labeled training data and offers a scalable first step toward future machine-assisted marine mammal monitoring from space. We have open sourced the entire pipeline at https://github.com/microsoft/whales.
APA
Robinson, C., Goetz, K.T., Khan, C.B., Sackett, M., Leonard, K., Dodhia, R.M. & Lavista Ferres, J.M. (2025). Where are the Whales: A Human-in-the-loop Detection Method for Identifying Whales in High-resolution Satellite Imagery. Proceedings of The TerraBytes ICML Workshop: Towards global datasets and models for Earth Observation, in Proceedings of Machine Learning Research 292:1-12. Available from https://proceedings.mlr.press/v292/robinson25a.html.
