Quantum Policy Gradient Algorithm with Optimized Action Decoding

Nico Meyer, Daniel Scherer, Axel Plinge, Christopher Mutschler, Michael Hartmann
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:24592-24613, 2023.

Abstract

Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose an action decoding procedure for a quantum policy gradient approach. We introduce a quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-meyer23a,
  title     = {Quantum Policy Gradient Algorithm with Optimized Action Decoding},
  author    = {Meyer, Nico and Scherer, Daniel and Plinge, Axel and Mutschler, Christopher and Hartmann, Michael},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {24592--24613},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/meyer23a/meyer23a.pdf},
  url       = {https://proceedings.mlr.press/v202/meyer23a.html},
  abstract  = {Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose an action decoding procedure for a quantum policy gradient approach. We introduce a quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.}
}
Endnote
%0 Conference Paper
%T Quantum Policy Gradient Algorithm with Optimized Action Decoding
%A Nico Meyer
%A Daniel Scherer
%A Axel Plinge
%A Christopher Mutschler
%A Michael Hartmann
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-meyer23a
%I PMLR
%P 24592--24613
%U https://proceedings.mlr.press/v202/meyer23a.html
%V 202
%X Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose an action decoding procedure for a quantum policy gradient approach. We introduce a quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.
APA
Meyer, N., Scherer, D., Plinge, A., Mutschler, C., & Hartmann, M. (2023). Quantum Policy Gradient Algorithm with Optimized Action Decoding. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:24592-24613. Available from https://proceedings.mlr.press/v202/meyer23a.html.
