Wait, That Feels Familiar: Learning to Extrapolate Human Preferences for Preference-Aligned Path Planning
Haresh Karnan, Elvin Yang, Garrett Warnell, Peter Stone, Joydeep Biswas
Abstract
Autonomous mobility tasks such as last-mile delivery require reasoning about operator-indicated preferences over terrains on which the robot should navigate to ensure both robot safety and mission success. However, coping with out-of-distribution data from novel terrains or appearance changes due to lighting variations remains a fundamental problem in visual terrain-adaptive navigation. Existing solutions either require labor-intensive manual data re-collection and labeling or use hand-coded reward functions that may not align with operator preferences. In this work, we posit that operator preferences for visually novel terrains, which the robot should adhere to, can often be extrapolated from established terrain preferences within the inertial-proprioceptive-tactile domain. Leveraging this insight, we introduce Preference extrApolation for Terrain-awarE Robot Navigation (PATERN), a novel framework for extrapolating operator terrain preferences for visual navigation. PATERN learns to map inertial-proprioceptive-tactile measurements from the robot's observations to a representation space and performs nearest-neighbor search in this space to estimate operator preferences over novel terrains. Through physical robot experiments in outdoor environments, we assess PATERN's capability to extrapolate preferences and generalize to novel terrains and challenging lighting conditions. Compared to baseline approaches, our findings indicate that PATERN robustly generalizes to diverse terrains and varied lighting conditions, while navigating in a preference-aligned manner.
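To make the nearest-neighbor step concrete, the following is a minimal sketch rather than the PATERN implementation. It assumes that a learned encoder (not shown) has already mapped inertial-proprioceptive-tactile measurements to fixed-length embeddings, that each known terrain has a scalar operator preference, and that Euclidean distance is a suitable metric. The function name `extrapolate_preference`, the 2-D embeddings, and the preference scores are all hypothetical and chosen only for illustration.

```python
import numpy as np


def extrapolate_preference(query, known_embeddings, known_preferences):
    """Assign a novel terrain the preference of its nearest known terrain.

    query:             (d,) embedding of the novel terrain (hypothetical encoder output)
    known_embeddings:  (n, d) embeddings of terrains with established preferences
    known_preferences: (n,) operator preference score for each known terrain
    Euclidean distance is an assumption made for this sketch.
    """
    dists = np.linalg.norm(known_embeddings - query, axis=1)
    idx = int(np.argmin(dists))
    return known_preferences[idx], idx


# Hypothetical embeddings for three known terrains and their preference scores.
known_embeddings = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
known_preferences = np.array([1.0, 0.5, 0.1])

# Embedding of a visually novel terrain that "feels" closest to the second known terrain.
query = np.array([0.9, 0.1])
pref, idx = extrapolate_preference(query, known_embeddings, known_preferences)
print(f"nearest known terrain: {idx}, extrapolated preference: {pref}")
```

In this sketch the novel terrain simply inherits the preference of its single closest neighbor in the representation space; whether a distance threshold or multiple neighbors should be used is a design choice that the abstract does not specify.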