ICRA 2026

CATALYST: Cognitive-To-Autonomy-Inspired Two-Stage Training Data Generation with Local-System-Aware Selection Technique

Taehoon Kim, Sehoon Oh

AI summary

Physics-guided, two-stage training data generation yields more accurate dynamics regression and reliable feedforward control than conventional sampling methods.

robotic dynamics, training data generation, physics-informed learning, probabilistic local models, trajectory optimization, feedforward control

Problem

Conventional learning-based robotic dynamics modeling relies on random or uniform data sampling, which often misses dynamically critical regions and degrades model performance and generalization.

Approach

The CATALYST framework first identifies optimal local model centers using CAD-derived inertia matrices, then optimizes excitation trajectories to visit these centers while enforcing physical constraints and maximizing informative velocity-acceleration statistics.
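As a rough illustration of the first stage, the sketch below samples a joint angle, evaluates a toy inertia term, and clusters the joint samples [q, M(q)] to obtain candidate local-model centers. Several pieces are assumptions and not taken from the paper: the scalar `toy_inertia` (a planar two-link form a + 2b cos q) stands in for the CAD-derived inertia matrix, the range of motion is assumed to be [-π/2, π/2], and plain k-means is swapped in for the probabilistic local model, so the resulting centers are only an analogue of the optimal centers.

```python
import math

def toy_inertia(q, a=2.0, b=0.5):
    """Hypothetical scalar inertia M(q) = a + 2*b*cos(q).

    Stands in for a CAD-derived inertia matrix; a and b are made-up constants.
    """
    return a + 2.0 * b * math.cos(q)

def sq_dist(p, c):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means on 2-D points with a deterministic initialisation."""
    n = len(points)
    # Initialise the centers from evenly spaced samples of the input list.
    centers = [points[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[j].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = []
        for j, g in enumerate(groups):
            if g:
                new_centers.append((sum(p[0] for p in g) / len(g),
                                    sum(p[1] for p in g) / len(g)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster in place
        if new_centers == centers:
            break
        centers = new_centers
    return centers

# Sample q over the assumed range of motion and build the joint samples [q, M(q)].
n_samples = 200
qs = [-math.pi / 2 + math.pi * i / (n_samples - 1) for i in range(n_samples)]
data = [(q, toy_inertia(q)) for q in qs]

centers = kmeans(data, k=3)
for q_c, m_c in centers:
    print(f"center: q = {q_c:+.3f} rad, M = {m_c:.3f}")
```

Clustering in the joint [q, M] space, not in q alone, is what lets the inertia prior influence where the centers land. A mixture model fitted to the same samples would additionally provide covariances for each local model.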

Key results

Identifies optimal GMM cluster centers using CAD inertia priors
Generates operating-point-centered excitation trajectories satisfying RoM and statistical constraints
Achieves lower torque regression error than the Spread/RoM, ill-centered, Tukey-windowed chirp, and cubic baselines
Delivers more reliable feedforward control performance in simulation

Why it matters

It provides a principled, physics-informed pipeline for training data design that enhances sample efficiency and control accuracy for data-driven robotic dynamics modeling.

Abstract

In conventional learning-based robotic dynamics modeling, physical information is mostly incorporated into the model or loss function, while the design of training data often relies on random sampling or uniform coverage, which can limit performance. To address this gap, this paper proposes the Cognitive-to-Autonomy-inspired Two-stage trAining data generation with Local-sYstem-aware Selection Technique (CATALYST) framework, which generates optimal training data based on physics priors and the modeling structure of the chosen learning model. Stage 1 uses the CAD-derived inertia matrix $M(q)$ to approximate the joint distribution of $[q, M]$ with a Probabilistic Local Model (PLM), thereby identifying the optimal locations for the local model centers ($\mu_k^{\mathrm{opt}}$). Stage 2 then optimizes an Operating-Point-Centered Excitation Trajectory (OPCET). This optimization simultaneously (i) aligns the trajectory with the target operating points ($l_m$), (ii) enforces range-of-motion (RoM) constraints ($l_r$), and (iii) achieves desirable velocity–acceleration statistics (large volume, isotropy, low correlation, captured by $l_s$). We validate the approach in simulation using a 3-DoF yaw–pitch–pitch manipulator, which allows visual demonstration of the process and outcomes. We then analyze the framework step by step. Results show that each stage meets its objective. A PLM trained on data generated by the proposed trajectories outperforms baselines (Spread/RoM, ill-centered, Tukey-windowed chirp, and cubic) in both torque regression and control. Thus, CATALYST yields more accurate regression and more reliable feedforward control than conventional designs.
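The three velocity–acceleration properties named in the abstract (large volume, isotropy, low correlation) can be made concrete with a small sketch. The abstract does not give the actual form of the statistics term, so the definitions below are assumptions chosen for illustration: volume is the determinant of the 2x2 covariance of (velocity, acceleration) samples for one joint, isotropy is the ratio of its smallest to largest eigenvalue, and correlation is the Pearson coefficient. The function name `va_statistics` and the sinusoidal test trajectory are likewise hypothetical.

```python
import math

def va_statistics(vel, acc):
    """Summarise the spread of (velocity, acceleration) samples for one joint.

    Assumes non-degenerate samples (both signals have non-zero variance).
    """
    n = len(vel)
    mv, ma = sum(vel) / n, sum(acc) / n
    # Population covariance entries of the 2x2 matrix [[svv, sva], [sva, saa]].
    svv = sum((v - mv) ** 2 for v in vel) / n
    saa = sum((a - ma) ** 2 for a in acc) / n
    sva = sum((v - mv) * (a - ma) for v, a in zip(vel, acc)) / n
    det = svv * saa - sva ** 2
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_tr = 0.5 * (svv + saa)
    root = math.sqrt(max(half_tr ** 2 - det, 0.0))
    lam_max, lam_min = half_tr + root, half_tr - root
    return {
        "volume": det,                              # larger means wider coverage
        "isotropy": lam_min / lam_max,              # 1.0 means equal spread in all directions
        "correlation": sva / math.sqrt(svv * saa),  # 0.0 means uncorrelated
    }

# Hypothetical trajectory q(t) = A*sin(w*t), sampled uniformly over one full period.
amp, omega, n = 1.0, 2.0, 400
ts = [2.0 * math.pi * i / (n * omega) for i in range(n)]
vel = [amp * omega * math.cos(omega * t) for t in ts]
acc = [-amp * omega ** 2 * math.sin(omega * t) for t in ts]

stats = va_statistics(vel, acc)
print(stats)
```

For this single sinusoid the velocity and acceleration variances are A²ω²/2 and A²ω⁴/2, so the samples are uncorrelated but anisotropic whenever ω differs from 1 (here the isotropy ratio is about 0.25). A trajectory optimizer could trade such terms off against center alignment and range-of-motion penalties.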

Index terms

Data Sets for Robot Learning, Integrated Planning and Learning, Task and Motion Planning