A Single-Stage Spectrum-Domain Network for Trajectory Prediction
Beihao Xia, Qinmu Peng, Xinge You
AI summary
Problem
Existing spectrum-domain trajectory prediction methods process low- and high-frequency components independently, missing their complementary dynamics, while two-stage approaches suffer from error propagation and high computational latency.
Approach
S3-Net decomposes observed trajectories into frequency spectra using a Discrete Fourier Transform, then applies a bilinear fusion module to explicitly model cross-interactions between global trends and local variations before decoding future paths.
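The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the hard frequency cutoff, the use of spectral magnitudes as features, the random weight tensor, and the function names `split_bands` and `bilinear_fuse` are all assumptions made for illustration. The sketch only shows how a DFT separates a coordinate sequence into a low-frequency part and a high-frequency part, and how a generic bilinear form combines every pair of low- and high-frequency features.

```python
import cmath
import random

def dft(seq):
    """Naive O(n^2) Discrete Fourier Transform of a sequence."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse DFT matching dft() above."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def split_bands(spec, cutoff):
    """Split a spectrum into low- and high-frequency parts.

    Bins k and n-k carry the same frequency, so they are kept together.
    The hard cutoff is an assumption of this sketch.
    """
    n = len(spec)
    low = [c if min(k, n - k) <= cutoff else 0j for k, c in enumerate(spec)]
    high = [c - l for c, l in zip(spec, low)]
    return low, high

def bilinear_fuse(low_feat, high_feat, weights):
    """Generic bilinear form: out[k] = sum_ij weights[k][i][j] * low[i] * high[j]."""
    return [sum(w_k[i][j] * low_feat[i] * high_feat[j]
                for i in range(len(low_feat))
                for j in range(len(high_feat)))
            for w_k in weights]

# Toy observed x-coordinates: a steady drift plus a small alternating wobble.
xs = [0.5 * t + 0.1 * (-1) ** t for t in range(8)]

spec = dft(xs)
low, high = split_bands(spec, cutoff=1)

# Back in the time domain, the two bands sum to the original sequence.
trend = [v.real for v in idft(low)]
detail = [v.real for v in idft(high)]

# Hypothetical features (spectral magnitudes) and untrained random weights.
low_feat = [abs(c) for c in low]
high_feat = [abs(c) for c in high]
random.seed(0)
weights = [[[random.gauss(0, 0.1) for _ in high_feat] for _ in low_feat]
           for _ in range(4)]
fused = bilinear_fuse(low_feat, high_feat, weights)
```

Because each output of the bilinear form multiplies a low-frequency feature by a high-frequency feature, the fused vector depends on their interaction rather than on either band alone, which is the property the approach relies on. In a trained model the weights would be learned, and the decoder would map the fused representation to future positions.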
Key results
- Up to 16.8%/15.1% ADE/FDE reduction over single-stage spectrum baselines on SDD
- Top-two performance across multiple ETH-UCY subsets
- Compact model size and low inference latency for real-time use
- Bilinear fusion consistently boosts accuracy on ETH-UCY and SDD
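For readers unfamiliar with the metrics, the sketch below uses the standard definitions: ADE (Average Displacement Error) is the mean Euclidean distance between predicted and ground-truth positions over all future steps, and FDE (Final Displacement Error) is that distance at the last step. The helper `relative_reduction` and the toy coordinates are hypothetical and are not taken from the paper's experiments.

```python
import math

def ade(pred, truth):
    """Average Displacement Error: mean Euclidean distance over all steps."""
    dists = [math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in zip(pred, truth)]
    return sum(dists) / len(dists)

def fde(pred, truth):
    """Final Displacement Error: Euclidean distance at the last step."""
    p, q = pred[-1], truth[-1]
    return math.hypot(p[0] - q[0], p[1] - q[1])

def relative_reduction(baseline, ours):
    """Percentage by which `ours` improves on `baseline` (lower is better)."""
    return 100.0 * (baseline - ours) / baseline

# Toy trajectories, made up for illustration only.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

print(ade(pred, truth))               # 1.0
print(fde(pred, truth))               # 2.0
print(relative_reduction(2.0, 1.5))   # 25.0
```

A figure such as "16.8%/15.1% ADE/FDE reduction" is a relative reduction of this kind, computed against a baseline's ADE and FDE respectively.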
Why it matters
S3-Net offers an accurate prediction framework with a compact model and low inference latency, suited to real-time deployment in autonomous driving, robot navigation, and multi-agent systems.
Abstract
Trajectory prediction is a fundamental yet challenging task in intelligent systems. Existing methods are mainly categorized as single-stage time-domain, two-stage time-domain, or two-stage spectrum-domain approaches, while single-stage spectrum-domain methods have been relatively underexplored. In the frequency domain, low-frequency components reflect global motion trends, while high-frequency components capture fine-grained local variations. Most existing spectrum-domain approaches process these components independently, overlooking their intrinsic complementarity. Inspired by the success of bilinear models in explicitly capturing cross-factor interactions, we propose S3-Net, a single-stage spectrum-domain trajectory prediction network with a bilinear fusion module that integrates low- and high-frequency dynamics. This design yields richer spectral representations and enables accurate, socially compliant, and multimodal predictions. Experiments on the ETH-UCY and Stanford Drone Dataset (SDD) benchmarks demonstrate that S3-Net achieves up to 16.8%/15.1% ADE/FDE reduction over spectrum-domain baselines while maintaining a compact model size and low inference latency, making it suitable for real-time scenarios.