Adaptive UAV Inspection of PV Panels Using Goal-Conditioned Reinforcement Learning and Zigzag Coverage Planning

Imen Habibi, Ikbal Msadaa, Khaled Grayaa
DLI 2025 Research Track, PMLR 302:1-17, 2026.

Abstract

Accurate and efficient inspection of photovoltaic (PV) panels is critical for early anomaly detection and energy yield optimization. This study presents an autonomous unmanned aerial vehicle (UAV) inspection framework that leverages Goal-Conditioned Reinforcement Learning (GCRL) for adaptive path tracking. The UAV follows a mathematically defined zigzag trajectory while dynamically responding to disturbances such as wind drift. Instead of following rigid waypoints, the agent is conditioned on successive inspection goals and learns its movement strategy using Proximal Policy Optimization (PPO). The environment incorporates realistic wind noise and UAV momentum, requiring the policy to learn corrective behaviors under uncertainty. Simulation results demonstrate the agent’s ability to achieve robust full-surface coverage, minimize overlap, and maintain trajectory alignment, highlighting the effectiveness of this learning-based inspection strategy.

Keywords: PV inspection, UAV path tracking, Goal-Conditioned Reinforcement Learning, PPO, Zigzag trajectory, Wind robustness, Autonomous drone coverage.
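The abstract does not give the formula for the zigzag trajectory, so the following is only a minimal sketch of one common way to produce such a path: a back-and-forth (boustrophedon) sweep over a rectangular area, returned as an ordered list of goals that a goal-conditioned agent could be given one at a time. The function name `zigzag_waypoints`, its parameters, and the rectangular panel layout are assumptions for illustration and are not taken from the paper.

```python
import math
from typing import List, Tuple


def zigzag_waypoints(
    width: float, height: float, lane_spacing: float
) -> List[Tuple[float, float]]:
    """Return lane endpoints of a back-and-forth sweep over a rectangle.

    Hypothetical helper: the area spans [0, width] x [0, height], lanes run
    along x, and consecutive lanes are lane_spacing apart along y.
    """
    if width <= 0 or height <= 0 or lane_spacing <= 0:
        raise ValueError("width, height and lane_spacing must be positive")

    # Number of lanes that fit, including the one at y = 0.
    # The small epsilon guards against floating-point round-off.
    n_lanes = int(math.floor(height / lane_spacing + 1e-9)) + 1

    waypoints: List[Tuple[float, float]] = []
    for i in range(n_lanes):
        y = i * lane_spacing
        if i % 2 == 0:
            # Even lanes are flown left to right.
            waypoints.append((0.0, y))
            waypoints.append((width, y))
        else:
            # Odd lanes are flown right to left, giving the zigzag shape.
            waypoints.append((width, y))
            waypoints.append((0.0, y))
    return waypoints
```

Each returned point could serve as the current goal in the agent's observation, with the next point substituted once the UAV comes within some tolerance of it. The wind model, momentum dynamics, reward, and PPO training described in the abstract are not represented here.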

Cite this Paper


BibTeX
@InProceedings{pmlr-v302-habibi26a,
  title = {Adaptive UAV Inspection of PV Panels Using Goal-Conditioned Reinforcement Learning and Zigzag Coverage Planning},
  author = {Habibi, Imen and Msadaa, Ikbal and Grayaa, Khaled},
  booktitle = {DLI 2025 Research Track},
  pages = {1--17},
  year = {2026},
  editor = {Haddad, Hatem and Kahira, Albert Njoroge and Bourhim, Sofia and Olatunji, Iyiola Emmanuel and Makhafola, Lesego and Mwase, Christine},
  volume = {302},
  series = {Proceedings of Machine Learning Research},
  month = {17--22 Aug},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v302/main/assets/habibi26a/habibi26a.pdf},
  url = {https://proceedings.mlr.press/v302/habibi26a.html},
  abstract = {Accurate and efficient inspection of photovoltaic (PV) panels is critical for early anomaly detection and energy yield optimization. This study presents an autonomous Unmanned Aerial Vehicles UAV-based inspection framework that leverages Goal-Conditioned Reinforcement Learning (GCRL) for adaptive path tracking. The UAV follows a mathematically defined zigzag trajectory while dynamically responding to disturbances such as wind drift. Instead of rigid waypoint following, the agent is conditioned on successive inspection goals and learns optimal movement strategies using Proximal Policy Optimization (PPO). The environment incorporates realistic wind noise and UAV momentum, requiring the policy to learn corrective behaviors under uncertainty. Simulation results demonstrate the agent’s ability to achieve robust full-surface coverage, minimize overlap, and maintain trajectory alignment, highlighting the effectiveness of this learning-based inspection strategy. Keywords: PV inspection, UAV path tracking, Goal-Conditioned Reinforcement Learning, PPO, Zigzag trajectory, Wind robustness, Autonomous drone coverage.}
}
Endnote
%0 Conference Paper
%T Adaptive UAV Inspection of PV Panels Using Goal-Conditioned Reinforcement Learning and Zigzag Coverage Planning
%A Imen Habibi
%A Ikbal Msadaa
%A Khaled Grayaa
%B DLI 2025 Research Track
%C Proceedings of Machine Learning Research
%D 2026
%E Hatem Haddad
%E Albert Njoroge Kahira
%E Sofia Bourhim
%E Iyiola Emmanuel Olatunji
%E Lesego Makhafola
%E Christine Mwase
%F pmlr-v302-habibi26a
%I PMLR
%P 1--17
%U https://proceedings.mlr.press/v302/habibi26a.html
%V 302
%X Accurate and efficient inspection of photovoltaic (PV) panels is critical for early anomaly detection and energy yield optimization. This study presents an autonomous Unmanned Aerial Vehicles UAV-based inspection framework that leverages Goal-Conditioned Reinforcement Learning (GCRL) for adaptive path tracking. The UAV follows a mathematically defined zigzag trajectory while dynamically responding to disturbances such as wind drift. Instead of rigid waypoint following, the agent is conditioned on successive inspection goals and learns optimal movement strategies using Proximal Policy Optimization (PPO). The environment incorporates realistic wind noise and UAV momentum, requiring the policy to learn corrective behaviors under uncertainty. Simulation results demonstrate the agent’s ability to achieve robust full-surface coverage, minimize overlap, and maintain trajectory alignment, highlighting the effectiveness of this learning-based inspection strategy. Keywords: PV inspection, UAV path tracking, Goal-Conditioned Reinforcement Learning, PPO, Zigzag trajectory, Wind robustness, Autonomous drone coverage.
APA
Habibi, I., Msadaa, I., & Grayaa, K. (2026). Adaptive UAV Inspection of PV Panels Using Goal-Conditioned Reinforcement Learning and Zigzag Coverage Planning. DLI 2025 Research Track, in Proceedings of Machine Learning Research 302:1-17. Available from https://proceedings.mlr.press/v302/habibi26a.html.
