Semantically-Driven Deep Reinforcement Learning for Inspection Path Planning
Grzegorz Malczyk, Mihir Kulkarni, Kostas Alexis
AI summary
Problem
Traditional exploration algorithms inefficiently cover entire environments without prioritizing relevant targets, while existing reinforcement learning methods lack semantic awareness and struggle to transfer to real-world robotic platforms.
Approach
The method trains an end-to-end deep RL policy using only local ego-centric occupancy maps, spatial visit history, and semantic-masked depth images to simultaneously learn collision-free navigation and targeted visual inspection.
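One of the inputs named above, the semantic-masked depth image, can be illustrated with a minimal sketch. It assumes the mask simply keeps depth values at pixels whose segmentation label belongs to a target class and replaces everything else with a background value. The function name, the `target_ids` parameter, and the zero background are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def mask_depth_by_semantics(depth, segmentation, target_ids, background=0.0):
    """Keep depth only where the segmentation label is a class of interest.

    Hypothetical helper: the paper's actual masking scheme may differ.
    """
    depth = np.asarray(depth, dtype=np.float32)
    segmentation = np.asarray(segmentation)
    if depth.shape != segmentation.shape:
        raise ValueError("depth and segmentation must have the same shape")
    # Boolean mask of pixels that belong to any target semantic class.
    keep = np.isin(segmentation, list(target_ids))
    # Non-target pixels are overwritten with the background value.
    return np.where(keep, depth, np.float32(background))


# Toy 2x2 example: class 5 is the object of interest.
depth = np.array([[1.0, 2.0], [3.0, 4.0]])
segmentation = np.array([[0, 5], [5, 1]])
masked = mask_depth_by_semantics(depth, segmentation, target_ids={5})
```

In this toy case only the two pixels labelled 5 retain their depth, so a policy receiving `masked` sees geometry only for the semantic target.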
Key results
- Achieves up to 92.1% semantic surface coverage in simulated environments
- Successfully transfers trained policy to real-world aerial robot flights
- Boosts inspection coverage by 22.3% using spatial visit score maps
- Generalizes across unknown scenes and unseen semantic classes without prior maps
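The spatial visit score map mentioned in the results can be pictured as an ego-centric grid that records how often the robot has already been in each nearby cell. The sketch below is a simplified 2D version under assumed parameters (`grid_size`, `cell_size`, and a plain visit count as the score). The paper's actual representation, dimensionality, and scoring rule may differ.

```python
import numpy as np


def visit_score_map(past_positions, current_position, grid_size=11, cell_size=1.0):
    """Count past positions per cell of a square grid centered on the robot.

    Hypothetical 2D simplification; positions are (x, y) in metres.
    """
    half = grid_size // 2
    grid = np.zeros((grid_size, grid_size), dtype=np.float32)
    past = np.asarray(past_positions, dtype=np.float64).reshape(-1, 2)
    # Express past positions relative to the robot, then round to cell indices.
    rel = past - np.asarray(current_position, dtype=np.float64)
    idx = np.floor(rel / cell_size + 0.5).astype(int) + half
    # Discard positions that fall outside the local neighborhood.
    inside = np.all((idx >= 0) & (idx < grid_size), axis=1)
    # np.add.at accumulates correctly when several positions hit the same cell.
    np.add.at(grid, (idx[inside, 0], idx[inside, 1]), 1.0)
    return grid


# Two visits at the robot's cell, one visit 1 m ahead in x, one far away (ignored).
history = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (100.0, 100.0)]
scores = visit_score_map(history, current_position=(0.0, 0.0), grid_size=5)
```

A map like `scores` gives the policy a short-term memory of where it has been, which is one plausible way to discourage revisiting already inspected regions.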
Why it matters
Enables efficient, targeted autonomous inspection for industrial monitoring and disaster response by focusing computational and navigational resources on objects of substantive interest.
Abstract
This paper introduces a novel semantics-aware inspection planning policy derived through deep reinforcement learning. Reflecting the fact that within autonomous informative path planning missions in unknown environments, it is often only a sparse set of objects of interest that need to be inspected, the method contributes an end-to-end policy that simultaneously performs semantic object visual inspection combined with collision-free navigation. Assuming access only to the instantaneous depth map, the associated segmentation image, the ego-centric local occupancy, and the history of past positions in the robot’s neighborhood, the method demonstrates robust generalizability and successful crossing of the sim2real gap. Beyond simulations and extensive comparison studies, the approach is verified in experimental evaluations onboard a flying robot deployed in novel environments with previously unseen semantics and overall geometric configurations.