Bridging Perception and Planning: Towards End-To-End Planning for Signal Temporal Logic Tasks
Bowen Ye, Junyue Huang, Yang Liu, Xiaozhen Qiao, Xiang Yin
AI summary
Problem
Current Signal Temporal Logic (STL) planning methods rely on pre-defined maps or structured abstractions and keep perception decoupled from planning, which makes them brittle in unstructured real-world environments.
Approach
S-MSP is a differentiable end-to-end transformer that ingests synchronized multi-view camera images and an STL specification to directly output a feasible trajectory, using a structure-aware Mixture-of-Experts to route temporal sub-tasks to specialized experts.
Key results
- First end-to-end baseline for STL-constrained trajectory synthesis from raw multi-view camera observations
- Structure-aware MoE model that decomposes STL formulas into temporally anchored sub-tasks for efficient learning
- High-fidelity Gazebo-based benchmark dataset with synchronized multi-view imagery and annotated STL specifications
- Higher STL satisfaction and trajectory feasibility than single-expert baselines, with no additional planning latency
Why it matters
It enables robust, perception-driven autonomous planning for complex temporal tasks in unstructured environments, advancing end-to-end robotics and safe autonomous systems.
Abstract
We investigate the task and motion planning problem for Signal Temporal Logic (STL) specifications in robotics. Existing STL methods rely on pre-defined maps or mobility representations, which are ineffective in unstructured real-world environments. We propose the Structured-MoE STL Planner (S-MSP), a differentiable framework that maps synchronized multi-view camera observations and an STL specification directly to a feasible trajectory. S-MSP integrates STL constraints within a unified pipeline, trained with a composite loss that combines trajectory reconstruction and STL robustness. A structure-aware Mixture-of-Experts (MoE) model enables horizon-aware specialization by projecting sub-tasks into temporally anchored embeddings. We evaluate S-MSP using a high-fidelity simulation of factory-logistics scenarios with temporally constrained tasks. Experiments show that S-MSP outperforms single-expert baselines in STL satisfaction and trajectory feasibility. A rule-based safety filter at inference improves physical executability without compromising logical correctness, showcasing the practicality of the approach.
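To make the idea of a composite loss concrete, the following is a minimal sketch of how a trajectory reconstruction term might be combined with an STL robustness term. It is an illustration under stated assumptions, not the authors' implementation: the function names, the circular-region predicates, the hinge-style penalty, and the `weight` and `margin` parameters are all hypothetical. It uses the standard quantitative semantics of STL, in which "eventually" takes the maximum and "always" takes the minimum of a predicate's robustness over a time window. A differentiable training pipeline would typically replace the hard max/min with smooth approximations.

```python
import math

def inside_circle(point, center, radius):
    """Predicate robustness: positive when the point lies inside the circle."""
    return radius - math.dist(point, center)

def eventually(traj, a, b, predicate):
    """Robustness of F_[a,b] predicate: best value over steps a..b."""
    return max(predicate(p) for p in traj[a:b + 1])

def always(traj, a, b, predicate):
    """Robustness of G_[a,b] predicate: worst value over steps a..b."""
    return min(predicate(p) for p in traj[a:b + 1])

def composite_loss(pred, target, rho, weight=1.0, margin=0.0):
    """Mean squared reconstruction error plus a hinge penalty on STL robustness.

    Assumption: the penalty is zero once rho reaches `margin`; the paper's
    actual weighting and penalty form are not given in the abstract.
    """
    mse = sum((px - tx) ** 2 + (py - ty) ** 2
              for (px, py), (tx, ty) in zip(pred, target)) / len(pred)
    return mse + weight * max(0.0, margin - rho)

# Hypothetical task: reach the goal within steps 0..3 while always avoiding an obstacle.
traj = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
goal, obstacle = (3.0, 0.0), (1.5, 2.0)

reach = eventually(traj, 0, 3, lambda p: inside_circle(p, goal, 0.5))
avoid = always(traj, 0, 3, lambda p: -inside_circle(p, obstacle, 0.5))
rho = min(reach, avoid)  # conjunction of the two sub-formulas

loss = composite_loss(traj, traj, rho)
print(rho, loss)
```

A positive `rho` means the trajectory satisfies the specification with some margin, so the penalty term vanishes and only the reconstruction error remains; a negative `rho` adds a penalty proportional to the violation.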