KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow

Balamurali Murugesan, Sricharan Vijayarangan, Kaushik Sarveswaran, Keerthi Ram, Mohanasankar Sivaprakasam
Proceedings of the Third Conference on Medical Imaging with Deep Learning, PMLR 121:515-526, 2020.

Abstract

Deep learning networks are being developed for every stage of the MRI workflow and have provided state-of-the-art results. However, this has come at the cost of increased computation and storage requirements. Hence, replacing the networks with compact models at various stages in the MRI workflow can significantly reduce the required storage space and provide considerable speedup. In computer vision, knowledge distillation is a commonly used method for model compression. In our work, we propose a knowledge distillation (KD) framework for image-to-image problems in the MRI workflow in order to develop compact, low-parameter models without a significant drop in performance. We propose a combination of an attention-based feature distillation method and an imitation loss, and demonstrate its effectiveness on the popular MRI reconstruction architecture, DC-CNN. We conduct extensive experiments using Cardiac, Brain, and Knee MRI datasets for 4x, 5x, and 8x acceleration factors. We observe that the student network trained with the assistance of the teacher using our proposed KD framework provides a significant improvement over the student network trained without assistance, across all datasets and acceleration factors. Specifically, for the Knee dataset, the student network achieves a 65% parameter reduction, 2x faster CPU running time, and 1.5x faster GPU running time compared to the teacher. Furthermore, we compare our attention-based feature distillation method with other feature distillation methods. We also conduct an ablation study to understand the significance of attention-based distillation and imitation loss. Finally, we extend our KD framework to MRI super-resolution and show encouraging results.
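The abstract names two loss components without spelling them out, so the following is a minimal, hypothetical PyTorch sketch of how an attention-based feature distillation term (in the spirit of attention transfer) can be combined with an imitation loss for a reconstruction student; the layer pairing, normalization, and the weights alpha and beta are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def attention_map(feat: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Collapse a feature map (B, C, H, W) into a normalized spatial
    attention map (B, H*W) by averaging squared activations over channels."""
    am = feat.pow(2).mean(dim=1).flatten(1)                # (B, H*W)
    return am / (am.norm(p=2, dim=1, keepdim=True) + eps)  # unit L2 norm

def kd_loss(student_feats, teacher_feats, student_out, teacher_out,
            target, alpha: float = 1000.0, beta: float = 1.0):
    """Task loss + attention-based feature distillation + imitation loss.
    alpha and beta are assumed weights, not values from the paper."""
    # Supervised reconstruction loss against the fully sampled target.
    task = F.l1_loss(student_out, target)
    # Attention-based feature distillation: match normalized attention
    # maps at corresponding intermediate layers of teacher and student.
    at = sum(F.mse_loss(attention_map(s), attention_map(t.detach()))
             for s, t in zip(student_feats, teacher_feats))
    # Imitation loss: the student mimics the teacher's reconstruction.
    imit = F.l1_loss(student_out, teacher_out.detach())
    return task + alpha * at + beta * imit

In a training loop, student_feats and teacher_feats would be collected (e.g., via forward hooks) from corresponding intermediate layers of the student and teacher cascades of a DC-CNN-style network; the teacher is frozen and only the student's parameters receive gradients.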

Cite this Paper


BibTeX
@InProceedings{pmlr-v121-murugesan20a,
  title     = {KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow},
  author    = {Murugesan, Balamurali and Vijayarangan, Sricharan and Sarveswaran, Kaushik and Ram, Keerthi and Sivaprakasam, Mohanasankar},
  booktitle = {Proceedings of the Third Conference on Medical Imaging with Deep Learning},
  pages     = {515--526},
  year      = {2020},
  editor    = {Arbel, Tal and Ben Ayed, Ismail and de Bruijne, Marleen and Descoteaux, Maxime and Lombaert, Herve and Pal, Christopher},
  volume    = {121},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--08 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v121/murugesan20a/murugesan20a.pdf},
  url       = {https://proceedings.mlr.press/v121/murugesan20a.html},
  abstract  = {Deep learning networks are being developed for every stage of the MRI workflow and have provided state-of-the-art results. However, this has come at the cost of increased computation and storage requirements. Hence, replacing the networks with compact models at various stages in the MRI workflow can significantly reduce the required storage space and provide considerable speedup. In computer vision, knowledge distillation is a commonly used method for model compression. In our work, we propose a knowledge distillation (KD) framework for image-to-image problems in the MRI workflow in order to develop compact, low-parameter models without a significant drop in performance. We propose a combination of an attention-based feature distillation method and an imitation loss, and demonstrate its effectiveness on the popular MRI reconstruction architecture, DC-CNN. We conduct extensive experiments using Cardiac, Brain, and Knee MRI datasets for 4x, 5x, and 8x acceleration factors. We observe that the student network trained with the assistance of the teacher using our proposed KD framework provides a significant improvement over the student network trained without assistance, across all datasets and acceleration factors. Specifically, for the Knee dataset, the student network achieves a $65\%$ parameter reduction, 2x faster CPU running time, and 1.5x faster GPU running time compared to the teacher. Furthermore, we compare our attention-based feature distillation method with other feature distillation methods. We also conduct an ablation study to understand the significance of attention-based distillation and imitation loss. Finally, we extend our KD framework to MRI super-resolution and show encouraging results.}
}
Endnote
%0 Conference Paper
%T KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
%A Balamurali Murugesan
%A Sricharan Vijayarangan
%A Kaushik Sarveswaran
%A Keerthi Ram
%A Mohanasankar Sivaprakasam
%B Proceedings of the Third Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Tal Arbel
%E Ismail Ben Ayed
%E Marleen de Bruijne
%E Maxime Descoteaux
%E Herve Lombaert
%E Christopher Pal
%F pmlr-v121-murugesan20a
%I PMLR
%P 515--526
%U https://proceedings.mlr.press/v121/murugesan20a.html
%V 121
%X Deep learning networks are being developed for every stage of the MRI workflow and have provided state-of-the-art results. However, this has come at the cost of increased computation and storage requirements. Hence, replacing the networks with compact models at various stages in the MRI workflow can significantly reduce the required storage space and provide considerable speedup. In computer vision, knowledge distillation is a commonly used method for model compression. In our work, we propose a knowledge distillation (KD) framework for image-to-image problems in the MRI workflow in order to develop compact, low-parameter models without a significant drop in performance. We propose a combination of an attention-based feature distillation method and an imitation loss, and demonstrate its effectiveness on the popular MRI reconstruction architecture, DC-CNN. We conduct extensive experiments using Cardiac, Brain, and Knee MRI datasets for 4x, 5x, and 8x acceleration factors. We observe that the student network trained with the assistance of the teacher using our proposed KD framework provides a significant improvement over the student network trained without assistance, across all datasets and acceleration factors. Specifically, for the Knee dataset, the student network achieves a 65% parameter reduction, 2x faster CPU running time, and 1.5x faster GPU running time compared to the teacher. Furthermore, we compare our attention-based feature distillation method with other feature distillation methods. We also conduct an ablation study to understand the significance of attention-based distillation and imitation loss. Finally, we extend our KD framework to MRI super-resolution and show encouraging results.
APA
Murugesan, B., Vijayarangan, S., Sarveswaran, K., Ram, K., & Sivaprakasam, M. (2020). KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow. Proceedings of the Third Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 121:515-526. Available from https://proceedings.mlr.press/v121/murugesan20a.html.