Quantum Policy Gradient Algorithm with Optimized Action Decoding

Nico Meyer; Daniel Scherer; Axel Plinge; Christopher Mutschler; Michael Hartmann

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:24592-24613, 2023.

Abstract

Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose an action decoding procedure for a quantum policy gradient approach. We introduce a quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-meyer23a,
  title = 	 {Quantum Policy Gradient Algorithm with Optimized Action Decoding},
  author =       {Meyer, Nico and Scherer, Daniel and Plinge, Axel and Mutschler, Christopher and Hartmann, Michael},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {24592--24613},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/meyer23a/meyer23a.pdf},
  url = 	 {https://proceedings.mlr.press/v202/meyer23a.html},
  abstract = 	 {Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose an action decoding procedure for a quantum policy gradient approach. We introduce a quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.}
}

Endnote

%0 Conference Paper
%T Quantum Policy Gradient Algorithm with Optimized Action Decoding
%A Nico Meyer
%A Daniel Scherer
%A Axel Plinge
%A Christopher Mutschler
%A Michael Hartmann
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-meyer23a
%I PMLR
%P 24592--24613
%U https://proceedings.mlr.press/v202/meyer23a.html
%V 202
%X Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose an action decoding procedure for a quantum policy gradient approach. We introduce a quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.

APA


Meyer, N., Scherer, D., Plinge, A., Mutschler, C. & Hartmann, M. (2023). Quantum Policy Gradient Algorithm with Optimized Action Decoding. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:24592-24613. Available from https://proceedings.mlr.press/v202/meyer23a.html.
