vla for contact rich manipulation

top 25 · 25 papers

100 relevance

CRAFT: Adapting VLA Models to Contact-Rich Manipulation Via Force-Aware Curriculum Fine-Tuning

Yike Zhang, Yaonan Wang, Xinxin Sun, Kaizhen Huang, Zhiyuan Xu, Ji Junjie, Zhengping Che, Jian Tang, Kangcheng Liu, Jingtao Sun

The paper directly addresses the topic by proposing a framework specifically designed to adapt VLA models for contact-rich manipulation.

100 relevance

Audio-VLA: Adding Contact Audio Perception to Vision-Language-Action Model for Robotic Manipulation

Xiangyi Wei, Haotian Zhang, Xinyi Cao, Siyu Xie, Weifeng Ge, Yang Li, Changbo Wang

The paper directly proposes a Vision-Language-Action (VLA) model specifically designed to enhance contact-rich manipulation by incorporating audio perception.

100 relevance

FD-VLA: Force-Distilled Vision-Language-Action Model for Contact-Rich Manipulation

Ruiteng Zhao, Wenshuo Wang, Yicheng Ma, Xiaocong Li, Francis Tay, Marcelo H Ang Jr, Haiyue Zhu

The paper explicitly focuses on developing a Vision-Language-Action (VLA) model specifically for contact-rich manipulation.

100 relevance

Enhancing VLA Precision in Robotic Manipulation Via FiLM-Based Force/Torque-Vision Integration

Gunhee Nam, Ayoung Hong

The paper directly proposes a method to enhance VLA models specifically for contact-rich robotic manipulation tasks using Force/Torque integration.

100 relevance

Learning End-To-End Dexterous Arm-Hand VLA Policies with Shared Autonomy: DexGrasp AI Copilot for Efficient Teleoperation

Yu Cui, Yujian Zhang, Lina Tao, Yang Li, Xinyu Yi, Zhibin (Alex) Li

The paper directly addresses the development of an end-to-end VLA policy for dexterous arm-hand manipulation using tactile feedback and force-adaptive actions, which is a core example of contact-rich manipulation.

100 relevance

MLA: A Multisensory Language-Action Model for Multimodal Understanding and Forecasting in Robotic Manipulation

Zhuoyang Liu, Jiaming Liu, Jiadong Xu, Nuowei Han, Chenyang Gu, Hao Chen, Kaichen Zhou, Renrui Zhang, Kai Chin Hsieh, Kun Wu, Zhengping Che, Jian Tang, Shanghang Zhang

The paper explicitly focuses on a Multisensory Language-Action model designed specifically to improve complex and contact-rich robotic manipulation by integrating tactile tokens and other sensory modalities.

95 relevance

Dexora: Open-Source VLA for High-DoF Bimanual Dexterity

Hang Zhao, Pengwei Wang, Shanghang Zhang, Guocai Yao, Jianyu Chen, Hongyang Li, Hao Zhao

The paper presents a VLA system specifically designed for high-DoF bimanual dexterity, which is inherently centered on contact-rich manipulation.

95 relevance

NeuroVLA: Surgical Scenario-Aware Learning of Debulking Skills in Endoscopic Robotic Neurosurgery Via Vision-Language-Action Model

Tat Ming Danny Chan, Hongbin Liu, Renzhi Wang, Hongliang Ren

The paper explicitly proposes a Vision-Language-Action (VLA) model for surgical debulking, which inherently involves contact-rich manipulation tasks such as grasping and transferring.

90 relevance

IMPACT: Intelligent Motion Planning with Acceptable Contact Trajectories Via Vision-Language Models

Yiyang Ling, Karan Owalekar, Oluwatobiloba Adesanya, Erdem Bıyık, Daniel Seita

The paper directly addresses contact-rich manipulation by leveraging Vision-Language Models (VLMs) to determine acceptable contacts and guide motion planning.

85 relevance

Rethinking the Practicality of Vision-Language-Action Model: A Comprehensive Benchmark and an Improved Baseline

Wenxuan Song, Jiayi Chen, Xiaoquan Sun, Huashuo Lei, Yikai Qin, Wei Zhao, Pengxiang Ding, Han Zhao, Tongxin Wang, Pengxu Hou, Zhide Zhong, Haodong Yan, Donglin Wang, Jun Ma, Haoang Li

The paper directly addresses Vision-Language-Action (VLA) models for various manipulation tasks across different embodiments, which is central to contact-rich manipulation.

85 relevance

DAM-VLA: A Dynamic Action Model-Based Vision-Language-Action Framework for Robot Manipulation

Xiongfeng Peng, Jiaqian Yu, Dingzhe Li, Yixiang Jin, Lu Xu, Mao Yamin, Chao Zhang, Weiming Li, Sujin Jang, Dongwook Lee, Daehyun Ji

The paper presents a VLA framework that explicitly addresses the need for precise manipulation by separating gross arm movement from fine gripper control, which is fundamental for contact-rich tasks.

85 relevance

OmniVLA: Physically-Grounded Multimodal VLA with Unified Multi-Sensor Perception for Robotic Manipulation

Heyu Guo, Shanmu Wang, Ruichun Ma, Shiqi Jiang, Yasaman Ghasempour, Omid Abari, Baining Guo, Lili Qiu

The paper proposes a VLA model integrating non-RGB sensors (radar, infrared, audio) which are critical for overcoming visual occlusions and perceiving physical interactions typical of contact-rich manipulation.

85 relevance

ManipForce: Force-Guided Policy Learning with Frequency-Aware Representation for Contact-Rich Manipulation

Geonhyup Lee, Youngjin Lee, Kangmin Kim, Seongju Lee, Sangjun Noh, Seunghyeok Back, Kyoobin Lee

The paper directly addresses contact-rich manipulation using a multimodal transformer policy combining vision and force data, although it focuses on Vision-Force-Action rather than incorporating Language.

85 relevance

IVRA: Improving Visual-Token Relations for Robot Action Policy with Training-Free Hint-Based Guidance

Jongwoo Park, Kanchana Ranasinghe, Jinhyeok Jang, Cristina Mata, Yoo Sung Jang, Michael S. Ryoo

The paper focuses on improving the spatial understanding of VLA models for precise robotic manipulation across benchmarks like LIBERO, which is highly relevant to the requirements of contact-rich tasks.

75 relevance

AugVLA-3D: Depth-Driven Feature Augmentation for Vision-Language-Action Models

Zhifeng Rao, Wenlong Chen, Lei Xie, Xia Hua, Dongfu Yin, Zhen Tian, F. Richard Yu

The paper focuses on improving VLA models' spatial understanding via depth augmentation, which is critical for precise robotic manipulation, although it does not explicitly address the tactile or force-feedback aspects typical of contact-rich tasks.

75 relevance

RealMirror: A Comprehensive, Open-Source Vision-Language-Action Platform for Embodied AI

Cong Tai, Zhaoyu Zheng, Haixu Long, Hansheng Wu, Haodong Xiang, Zhengbin Long, Jun Xiong, Rong Shi, Shizhuang Zhang, Gang Qiu, He Wang, Ruifeng Li, Jun Huang, Bin Chang, Shuai Feng, Tao Shen

The paper presents a comprehensive VLA platform and benchmark for humanoid robots, which are central to contact-rich manipulation, though it focuses more on the infrastructure and Sim2Real transfer than specific contact-rich techniques.

75 relevance

Goal-VLA: Image-Generative VLMs as Object-Centric World Models Empowering Zero-shot Robot Manipulation

Haonan Chen, Jingxiang Guo, Bangjun Wang, Tianrui Zhang, Xuchuan Huang, Yiwen Hou, Boren Zheng, Chenrui Tie, Jiajun Deng, Lin Shao

The paper proposes a VLA framework for robotic manipulation using generative VLMs, but it focuses more on high-level goal generation and spatial reasoning than specifically on contact-rich dynamics.

65 relevance

VLA-Reasoner: Empowering Vision-Language-Action Models with Reasoning Via Online Monte Carlo Tree Search

wenkai@e.ntu.edu.sg, ziwei.wang@ntu.edu.sg

The paper focuses on improving VLA performance for long-horizon general robotic manipulation via MCTS and reasoning, but does not explicitly address the specific challenges of contact-rich manipulation such as force control or tactile feedback.

45 relevance

The Better You Learn, the Smarter You Prune: Towards Efficient Vision-Language-Action Models Via Differentiable Token Pruning

Titong Jiang, Xuefeng Jiang, Yuan Ma, Xin Wen, Bailin Li, Kun Zhan, Peng Jia, Yahui Liu, Sheng Sun, Xianpeng Lang

While the paper focuses on improving the efficiency of VLA models used in robotic manipulation, it addresses computational overhead rather than the specific physical or control challenges associated with contact-rich manipulation.

40 relevance

IntentionVLA: Generalizable and Efficient Embodied Intention Reasoning for Human-Robot Interaction

Yandu Chen, Kefan Gu, Yuqing Wen, Yucheng Zhao, Tiancai Wang, Liqiang Nie

While the paper focuses on VLA models for manipulation and HRI, it emphasizes high-level intention reasoning rather than the low-level physics or control challenges specific to contact-rich manipulation.

30 relevance

EveryDayVLA: A Vision-Language-Action Model for Affordable Robotic Manipulation

Samarth Chopra, Alexander McMoil, Benjamin Carnovale, Evan Sokolson, Rajkumar Kubendran, Samuel Dickerson

While the paper discusses Vision-Language-Action (VLA) models for robotic manipulation, it focuses on hardware affordability and general task success rather than the specific challenges of contact-rich manipulation.

30 relevance

ShapeForce: Low-Cost Soft Robotic Wrist for Contact-Rich Manipulation

Jinxuan Zhu, Zihao Yan, Yangyu Xiao, Jingxiang Guo, Chenrui Tie, Xinyi Cao, Yuhang Zheng, Lin Shao

The paper focuses on a hardware sensor for contact-rich manipulation but does not mention or utilize Vision-Language-Action (VLA) models.

20 relevance

Exploiting Vulnerabilities: Universal Adversarial Attacks on Vision-Language-Action Models in Robotics

Songhua Yang, Ziyu Liu, Yuanwei Liu, Xuetao Li, Xuanye Fei, He Huang, Zheng Wang, Miao Li

While the paper discusses Vision-Language-Action (VLA) models in robotics, its focus is on adversarial security attacks rather than the mechanics or implementation of contact-rich manipulation.

10 relevance

Contact Detection and Manipulation with a Shape-Memory Alloy Based Soft Gripper

Louis Plottel, Richard Desatnik, Dinesh K. Patel, Philip LeDuc, Carmel Majidi

While the paper addresses contact-rich manipulation via soft robotics and sensing, it does not involve Vision-Language-Action (VLA) models or AI-driven policy learning.

10 relevance

VSL-Skin: Individually Addressable Phase-Change Voxel Skin for Variable-Stiffness and Virtual Joints Bridging Soft and Rigid Robots

Zihan Oliver Zeng, Jiajun An, Preston Luk, Upinder Kaur

The paper focuses on hardware materials for variable stiffness and morphological control, whereas the topic of interest is high-level Vision-Language-Action (VLA) models.
