ICRA 2026

Robot Deformable Object Manipulation Via NMPC-Generated Demonstrations in Deep Reinforcement Learning

Haoyuan Wang, Zihao Dong, Tong Zhu, HongLiang Lei, Weizhuang Shi, Zejia Zhang, Wei Luo, Weiwei Wan, Xinxing Chen, Jian Huang

AI summary

FADERL, a lightweight demonstration-enhanced RL framework augmented with NMPC-generated data, significantly boosts learning efficiency and achieves high real-world success rates in fabric manipulation without heavy computational costs.

Deformable object manipulation; Deep reinforcement learning; Demonstration learning; NMPC; Fuzzy systems; Fabric manipulation

Problem

Robotic manipulation of deformable objects suffers from inefficient reinforcement learning and the high cost of collecting human demonstration data, while large vision-language models are too computationally heavy for real-time control.

Approach

The authors propose FADERL, which integrates fuzzy systems, generative adversarial behavior cloning, and conditional policy learning into deep RL, and uses nonlinear model predictive control to synthetically generate high-quality demonstration data at low cost.
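To make the idea of controller-generated demonstrations concrete, here is a minimal sketch, not the authors' implementation. It assumes a toy two-state nonlinear model (a damped point mass with a sine restoring term) in place of fabric dynamics, and it replaces a real nonlinear solver with a brute-force search over constant candidate actions. All names (`step`, `horizon_cost`, `nmpc_action`, `generate_demo`), the cost weights, and the horizon are hypothetical choices made for illustration.

```python
import math

def step(state, action, dt=0.1):
    """One Euler step of a toy nonlinear model (hypothetical, not a fabric model)."""
    pos, vel = state
    acc = action - 0.5 * vel - math.sin(pos)
    return (pos + dt * vel, vel + dt * acc)

def horizon_cost(state, action, goal, horizon):
    """Predicted cost of holding one action over the whole horizon.

    Holding the action constant is a crude simplification assumed here;
    a real NMPC solver would optimize a full action sequence.
    """
    cost = 0.0
    for _ in range(horizon):
        state = step(state, action)
        cost += (state[0] - goal) ** 2 + 0.1 * state[1] ** 2 + 0.01 * action ** 2
    return cost

def nmpc_action(state, goal, horizon=10, a_max=3.0, n_candidates=31):
    """Receding-horizon choice: return the candidate with the lowest predicted cost."""
    candidates = [-a_max + 2 * a_max * i / (n_candidates - 1)
                  for i in range(n_candidates)]
    return min(candidates, key=lambda a: horizon_cost(state, a, goal, horizon))

def generate_demo(start, goal, steps=60):
    """Run the controller in closed loop and log (state, action, next_state)."""
    state, demo = start, []
    for _ in range(steps):
        action = nmpc_action(state, goal)   # re-plan at every step
        next_state = step(state, action)    # apply only the first action
        demo.append((state, action, next_state))
        state = next_state
    return demo

demo = generate_demo(start=(0.0, 0.0), goal=1.0)
print(len(demo), "transitions; final position:", round(demo[-1][2][0], 3))
```

The logged transitions have the same shape as entries in a demonstration buffer for an off-policy RL agent. The point of the sketch is only that a model-based controller can produce such tuples without a human operator.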

Key results

Achieves 2.01× higher global average reward than Rainbow-DDPG and reduces the standard deviation to 45% of the baseline's
NMPC-generated demonstrations match human demonstration performance in simulation
Physical fabric tasks achieve 83.3%, 80.0%, and 96.7% success rates for diagonal folding, central-axis folding, and flattening
Provides a lightweight, task-specific alternative to computationally intensive large-scale vision-language models
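The summary does not say how many physical trials were run per task. As a small arithmetic check, the sketch below searches for the smallest trial count at which all three reported success rates can arise as a whole number of successes rounded to one decimal place. The function names and the search bound `max_n` are hypothetical, and the result is a consistency check, not a statement about the actual experimental protocol.

```python
def consistent_counts(rate_pct, n):
    """Success counts k (out of n trials) whose rate rounds to rate_pct."""
    return [k for k in range(n + 1) if round(100 * k / n, 1) == rate_pct]

def smallest_common_trials(rates, max_n=100):
    """Smallest n for which every reported rate is reachable, or None."""
    for n in range(1, max_n + 1):
        if all(consistent_counts(r, n) for r in rates):
            return n
    return None

rates = [83.3, 80.0, 96.7]  # diagonal folding, central-axis folding, flattening
n = smallest_common_trials(rates)
print(n, [consistent_counts(r, n) for r in rates])
```

Under these assumptions the search returns 30 trials, with 25, 24, and 29 successes respectively. Any multiple of 30 would fit the reported figures equally well.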

Why it matters

Enables efficient, low-cost robotic manipulation of deformable objects for practical applications in manufacturing, medical surgery, and service robotics.

Abstract

In this work, we conducted research on deformable object manipulation by robots based on demonstration-enhanced reinforcement learning (RL). We present FADERL (Fuzzy-Augmented Demonstration-Embedded Reinforcement Learning), a novel framework for robotic manipulation of deformable objects that significantly improves reinforcement learning efficiency through synergistic unification of High-Dimensional Takagi-Sugeno-Kang (HTSK) fuzzy systems, Generative Adversarial Behavior Cloning (GABC), and Conditional Policy Learning (CPL). Compared to the Rainbow-DDPG baseline, FADERL achieves 2.01× higher global average reward and reduces standard deviation to 45% while requiring fewer computational resources. To address the high cost of human demonstration collection, we introduce a Nonlinear Model Predictive Control (NMPC)-based data augmentation method that generates high-quality demonstrations at minimal cost. Simulation results demonstrate that NMPC-generated demonstrations enable FADERL to achieve performance comparable to human demonstrations. Physical experiments on fabric manipulation tasks (diagonal folding, central-axis folding, and flattening) achieve success rates of 83.3%, 80.0%, and 96.7%, respectively, validating our approach's effectiveness in real-world scenarios. Unlike computationally intensive large-model approaches, FADERL provides a lightweight, task-specific solution with efficient adaptability, making it suitable for practical robotic applications in manufacturing, medical surgery, and service robotics.

Note to Practitioners: This study addresses the challenges of efficiency and cost in robotic manipulation of deformable objects by proposing a lightweight algorithmic framework (FADERL) and a low-cost demonstration data augmentation approach based on NMPC. Traditional methods often rely on extensive manual demonstration data or high computational resources. Our solution integrates fuzzy system optimization, reinforcement learning enhancements, and automated demonstration generation to improve success rates in tasks like folding and flattening while reducing data acquisition costs. Experimental results demonstrate robust performance in both simulation and real-world scenarios. The methodology is applicable to industrial applications (e.g., flexible cable assembly), medical surgery (e.g., soft tissue suturing), and household service robots (e.g., clothing organization). Compared to current large-scale models, the proposed algorithm requires fewer computational resources and supports task-specific customization.

Received 7 June 2025; revised 17 September 2025; accepted 26 October 2025. Date of publication 3 November 2025; date of current version 13 November 2025. This article was recommended for publication by Associate Editor C. Zeng and Editor X. Liu upon evaluation of the reviewers' comments. This work was supported in part by Hubei Science and Technology Major Project under Grant 2024BAA007, in part by the National Natural Science Foundation of China under Grant 62333007 and Grant U24A20280, and in part by Hubei Provincial Technology Innovation Program under Grant 2025DJA047. (Haoyuan Wang and Zihao Dong contributed equally to this work.) (Corresponding authors: Xinxing Chen; Jian Huang.)

Haoyuan Wang, Tong Zhu, Hongliang Lei, Weizhuang Shi, Zejia Zhang, Xinxing Chen, and Jian Huang are with Hubei Key Laboratory of Brain-Inspired Intelligent Systems, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology (HUST), Wuhan, Hubei 430074, China, and also with the Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology (HUST), Wuhan, Hubei 430074, China (e-mail: why427@hust.edu.cn; zt1021@hust.edu.cn; leihl@hust.edu.cn; swz@hust.edu.cn; zejiazhang@hust.edu.cn; cxx@hust.edu.cn; huang jan@hust.edu.cn).

Zihao Dong is with China Academy of Aerospace System and Innovation, Beijing 100032, China (e-mail: ddzh130@163.com).

Wei Luo is with the Department of Innovation Center, China Ship Development and Design Center, Wuhan, Hubei 430064, China (e-mail: csddc weiluo@163.com).

Weiwei Wan is with the Graduate School of Engineering Science, The Osaka University, Toyonaka 560-0043, Japan (e-mail: wan@sys.es.osaka-u.ac.jp).

This article has supplementary downloadable material available at https://doi.org/10.1109/TASE.2025.3627775, provided by the authors.

Digital Object Identifier 10.1109/TASE.2025.3627775

Index terms

Deep Learning in Grasping and Manipulation; Reinforcement Learning; Dexterous Manipulation