Motion Manifold Flow Primitives for Task-Conditioned Trajectory Generation under Complex Task-Motion Dependencies
Yonghyeon Lee, Byeongho Lee, Seungyeon Kim, Frank Park
AI summary
Problem
Existing manifold-based trajectory models rely on shared latent priors and conditional autoencoders, causing them to fail when motion distributions shift drastically across different task parameters like language commands. Additionally, high-dimensional trajectory data and limited demonstration datasets hinder effective learning.
Approach
MMFP separates manifold learning from conditional distribution modeling by training a flow matching model in the latent space of a pre-learned trajectory manifold, allowing it to adaptively capture complex, task-dependent motion shifts.
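The second stage described above, flow matching in a latent space, can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: it assumes a straight-line probability path from Gaussian noise to the latent codes and a plain Euler integrator for sampling. The names `flow_matching_batch`, `flow_matching_loss`, `euler_sample`, and `velocity_fn` are hypothetical. The latent codes `z1` are assumed to come from an encoder trained beforehand on trajectories, which is not shown here.

```python
import numpy as np


def flow_matching_batch(z1, rng):
    """Build regression targets for flow matching on a straight-line path.

    z1: (batch, latent_dim) latent codes, assumed to come from a
        pre-trained trajectory encoder (hypothetical, not shown).
    """
    z0 = rng.standard_normal(z1.shape)       # noise endpoint of the path
    t = rng.uniform(size=(z1.shape[0], 1))   # one time in [0, 1) per sample
    z_t = (1.0 - t) * z0 + t * z1            # point on the straight path
    u_t = z1 - z0                            # velocity of that path
    return z_t, t, u_t


def flow_matching_loss(velocity_fn, z1, cond, rng):
    """Mean squared error between predicted and target velocities.

    velocity_fn(z, t, cond) stands in for a task-conditioned network;
    cond would be, e.g., a text embedding (assumption).
    """
    z_t, t, u_t = flow_matching_batch(z1, rng)
    pred = velocity_fn(z_t, t, cond)
    return float(np.mean((pred - u_t) ** 2))


def euler_sample(velocity_fn, z0, cond, n_steps=50):
    """Integrate dz/dt = velocity_fn(z, t, cond) from t=0 to t=1 with Euler steps."""
    z = z0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = np.full((z.shape[0], 1), k * dt)
        z = z + dt * velocity_fn(z, t, cond)
    return z
```

In a full pipeline, the output of `euler_sample` would then be passed through the manifold's decoder to obtain a trajectory. Because the velocity model only sees low-dimensional latent codes, the manifold can be trained once and reused while the conditional model is fit separately.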
Key results
- Captures complex task-motion dependencies where prior methods fail
- Outperforms diffusion, flow matching, and manifold baselines in accuracy and distribution similarity
- Successfully generates diverse trajectories for many-to-many text-motion mappings
- Demonstrates superior interpolation and generalization over latent diffusion models with limited data
Why it matters
Provides a scalable framework for robots to generate diverse, task-adaptive motions from complex inputs like natural language, advancing imitation learning and autonomous navigation.
Abstract
Effective movement primitives should be capable of encoding and generating a rich repertoire of trajectories conditioned on task-defining parameters such as vision or language inputs. While recent methods based on the motion manifold hypothesis, which assumes that a set of trajectories lies on a lower-dimensional nonlinear subspace, address challenges such as limited dataset size and the high dimensionality of trajectory data, they often struggle to capture complex task-motion dependencies, i.e., when motion distributions shift drastically with task variations. To address this, we introduce Motion Manifold Flow Primitives (MMFP), a framework that decouples the training of the motion manifold from task-conditioned distributions. Specifically, we employ flow matching models, state-of-the-art conditional deep generative models, to learn task-conditioned distributions in the latent coordinate space of the learned motion manifold. Experiments are conducted on language-guided trajectory generation tasks, where many-to-many text-motion correspondences introduce complex task-motion dependencies, highlighting MMFP’s superiority over existing methods.