Contextual Bandits for Adapting Treatment in a Mouse Model of de Novo Carcinogenesis

Audrey Durand; Charis Achilleos; Demetris Iacovides; Katerina Strati; Georgios D. Mitsis; Joelle Pineau

Proceedings of the 3rd Machine Learning for Healthcare Conference, PMLR 85:67-82, 2018.

Abstract

In this work, we present a case study in which we design effective treatment allocation strategies and validate them using a mouse model of skin cancer. Collecting data for modelling treatment effectiveness in animal models is an expensive and time-consuming process. Moreover, acquiring this information across the full range of disease stages is hard to achieve with a conventional random treatment allocation procedure, as poor treatments cause subject health to deteriorate. We therefore aim to design an adaptive allocation strategy that improves the efficiency of data collection by allocating more samples to exploring promising treatments. We cast this application as a contextual bandit problem and introduce a simple and practical algorithm for exploration-exploitation in this framework. The work builds on a recent class of approaches for non-contextual bandits that relies on subsampling to compare treatment options using an equivalent amount of information. On the technical side, we extend the subsampling strategy to bandits with context by applying subsampling within Gaussian Process regression. On the experimental side, preliminary results using 10 mice with skin tumours suggest that the proposed approach extends the subjects' life duration by more than 50% compared with baseline strategies: no treatment, random treatment allocation, and a constant chemotherapeutic agent. By slowing the tumour growth rate, the adaptive procedure gathers information about treatment effectiveness over a broader range of tumour volumes, which is crucial for eventually deriving sequential pharmacological treatment strategies for cancer.
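The subsampling idea mentioned in the abstract for non-contextual bandits can be illustrated with a small sketch. The abstract does not name a specific method, so the code below assumes a duel in the style of the Best Empirical Sampled Average (BESA) rule for two treatments: the arm with more observations is subsampled down to the size of the other arm, so that both empirical means rest on the same amount of information. The function names, reward means, and noise level are hypothetical, and the Gaussian Process extension to contexts described in the paper is not reproduced here.

```python
import random
from statistics import mean


def subsampled_duel(rewards_a, rewards_b, rng):
    """Pick between two arms using equal-size subsamples (BESA-style assumption).

    Returns 0 to play arm A next, or 1 to play arm B next.
    """
    n = min(len(rewards_a), len(rewards_b))
    if n == 0:
        # At least one arm has never been played: try an unobserved arm first.
        return 0 if len(rewards_a) == 0 else 1
    # Subsample both histories, without replacement, to the smaller size.
    mean_a = mean(rng.sample(rewards_a, n))
    mean_b = mean(rng.sample(rewards_b, n))
    if mean_a != mean_b:
        return 0 if mean_a > mean_b else 1
    # Tie: favour the arm with fewer observations.
    return 0 if len(rewards_a) <= len(rewards_b) else 1


def simulate(horizon=200, seed=0):
    """Run the duel on two hypothetical arms with Gaussian rewards."""
    rng = random.Random(seed)
    true_means = [1.0, 0.0]  # illustrative values, not taken from the paper
    history = [[], []]
    for _ in range(horizon):
        arm = subsampled_duel(history[0], history[1], rng)
        history[arm].append(rng.gauss(true_means[arm], 0.1))
    return [len(h) for h in history]


if __name__ == "__main__":
    print(simulate())
```

Because each comparison uses the same number of observations per arm, a heavily sampled arm gains no advantage from a lower-variance estimate, which is what drives exploration of the less-played arm in this family of rules.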

Cite this Paper

BibTeX


@InProceedings{pmlr-v85-durand18a,
  title = 	 {Contextual Bandits for Adapting Treatment in a Mouse Model of de Novo Carcinogenesis},
  author =       {Durand, Audrey and Achilleos, Charis and Iacovides, Demetris and Strati, Katerina and Mitsis, Georgios D. and Pineau, Joelle},
  booktitle = 	 {Proceedings of the 3rd Machine Learning for Healthcare Conference},
  pages = 	 {67--82},
  year = 	 {2018},
  editor = 	 {Doshi-Velez, Finale and Fackler, Jim and Jung, Ken and Kale, David and Ranganath, Rajesh and Wallace, Byron and Wiens, Jenna},
  volume = 	 {85},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--18 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v85/durand18a/durand18a.pdf},
  url = 	 {https://proceedings.mlr.press/v85/durand18a.html},
  abstract = 	 {In this work, we present a case study in which we design effective treatment allocation strategies and validate them using a mouse model of skin cancer. Collecting data for modelling treatment effectiveness in animal models is an expensive and time-consuming process. Moreover, acquiring this information across the full range of disease stages is hard to achieve with a conventional random treatment allocation procedure, as poor treatments cause subject health to deteriorate. We therefore aim to design an adaptive allocation strategy that improves the efficiency of data collection by allocating more samples to exploring promising treatments. We cast this application as a contextual bandit problem and introduce a simple and practical algorithm for exploration-exploitation in this framework. The work builds on a recent class of approaches for non-contextual bandits that relies on subsampling to compare treatment options using an equivalent amount of information. On the technical side, we extend the subsampling strategy to bandits with context by applying subsampling within Gaussian Process regression. On the experimental side, preliminary results using 10 mice with skin tumours suggest that the proposed approach extends the subjects' life duration by more than 50% compared with baseline strategies: no treatment, random treatment allocation, and a constant chemotherapeutic agent. By slowing the tumour growth rate, the adaptive procedure gathers information about treatment effectiveness over a broader range of tumour volumes, which is crucial for eventually deriving sequential pharmacological treatment strategies for cancer.}
}

Endnote

%0 Conference Paper
%T Contextual Bandits for Adapting Treatment in a Mouse Model of de Novo Carcinogenesis
%A Audrey Durand
%A Charis Achilleos
%A Demetris Iacovides
%A Katerina Strati
%A Georgios D. Mitsis
%A Joelle Pineau
%B Proceedings of the 3rd Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2018
%E Finale Doshi-Velez
%E Jim Fackler
%E Ken Jung
%E David Kale
%E Rajesh Ranganath
%E Byron Wallace
%E Jenna Wiens
%F pmlr-v85-durand18a
%I PMLR
%P 67--82
%U https://proceedings.mlr.press/v85/durand18a.html
%V 85
%X In this work, we present a case study in which we design effective treatment allocation strategies and validate them using a mouse model of skin cancer. Collecting data for modelling treatment effectiveness in animal models is an expensive and time-consuming process. Moreover, acquiring this information across the full range of disease stages is hard to achieve with a conventional random treatment allocation procedure, as poor treatments cause subject health to deteriorate. We therefore aim to design an adaptive allocation strategy that improves the efficiency of data collection by allocating more samples to exploring promising treatments. We cast this application as a contextual bandit problem and introduce a simple and practical algorithm for exploration-exploitation in this framework. The work builds on a recent class of approaches for non-contextual bandits that relies on subsampling to compare treatment options using an equivalent amount of information. On the technical side, we extend the subsampling strategy to bandits with context by applying subsampling within Gaussian Process regression. On the experimental side, preliminary results using 10 mice with skin tumours suggest that the proposed approach extends the subjects' life duration by more than 50% compared with baseline strategies: no treatment, random treatment allocation, and a constant chemotherapeutic agent. By slowing the tumour growth rate, the adaptive procedure gathers information about treatment effectiveness over a broader range of tumour volumes, which is crucial for eventually deriving sequential pharmacological treatment strategies for cancer.

APA


Durand, A., Achilleos, C., Iacovides, D., Strati, K., Mitsis, G. D., & Pineau, J. (2018). Contextual Bandits for Adapting Treatment in a Mouse Model of de Novo Carcinogenesis. Proceedings of the 3rd Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 85:67-82. Available from https://proceedings.mlr.press/v85/durand18a.html.
