Optimized Design and Calibration of a Human-Eye-Sized Active Binocular Vision System Based on Spherical Parallel Mechanism
and Xiaolin Zhang
AI summary
Problem
Miniaturizing active binocular vision systems to human-eye size for humanoid robots is challenging due to complex kinematic error coupling in parallel mechanisms, which makes traditional calibration inaccurate and reliant on large datasets.
Approach
The team engineered a compact spherical parallel manipulator-based monocular vision unit and paired two into a binocular system, then developed a two-branch neural network for kinematic calibration that was further refined into a four-branch fine-tuning model to minimize data needs.
Key results
- Designed a human-eye-sized active binocular vision system (30 mm diameter, 65 mm baseline) and integrated it into a humanoid head.
- Developed a two-branch optimization neural network that reduces rotational prediction error by 16% and translational error by 5% compared to single-branch models.
- Introduced a four-branch fine-tuning strategy that achieves accuracy comparable to a fully trained single-branch network using only 343 data points.
- Demonstrated accurate 3D stereo reconstruction during robot movement on the miniaturized system.
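To give a sense of how small a 343-point calibration set is for a three-DoF device: 343 equals 7 × 7 × 7, so it could be covered by seven evenly spaced settings per axis. The summary does not say how the points were actually sampled, so the grid layout, the function name `joint_grid`, and the ±30 degree range below are illustrative assumptions only.

```python
import itertools

def joint_grid(lo_deg, hi_deg, steps_per_axis=7, n_axes=3):
    """Return every combination of evenly spaced angles over n_axes joints.

    Hypothetical helper: the source does not describe its sampling scheme.
    """
    step = (hi_deg - lo_deg) / (steps_per_axis - 1)
    levels = [lo_deg + i * step for i in range(steps_per_axis)]
    return list(itertools.product(levels, repeat=n_axes))

# Assumed range of -30 to +30 degrees per axis (not taken from the source).
grid = joint_grid(-30.0, 30.0)
print(len(grid))  # 7 ** 3 = 343 joint configurations
```

Under these assumptions each axis is sampled every 10 degrees, which illustrates how coarse the coverage of the workspace is at this data budget.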
Why it matters
This work enables precise, compact visual perception for humanoid robots by providing a scalable, low-data calibration method for miniaturized parallel vision systems.
Abstract
The Active Binocular Vision System (ABVS), resembling the human eye, demonstrates potential for improving visual perception in robotic systems, especially in dynamic and complex environments. In this letter, we present an optimized design of a three-degree-of-freedom (DoF) Active Monocular Vision System (AMVS) based on a Spherical Parallel Manipulator (SPM). By combining two identical AMVS units, we form an ABVS, which has been successfully integrated into a humanoid robotic head. Due to the highly nonlinear kinematics of the SPM and complex error coupling in its multi-link structure, traditional end-to-end neural network training methods are insufficient in accuracy and require large datasets. To address these challenges, we propose a two-branch optimization network that significantly improves calibration accuracy. Furthermore, we introduce a four-branch fine-tuning strategy that enables accurate kinematic models to be obtained with only a small amount of data from new AMVS devices. Experimental results demonstrate that the two-branch optimization network reduces rotational prediction error by 16% and translational error by 5% compared to a single-branch network. Furthermore, the four-branch fine-tuning network achieves comparable accuracy to a fully trained single-branch network using only 343 data points. Finally, our ABVS shows the capability to perform 3D visual tasks, such as stereo reconstruction during movement.
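The abstract does not describe the internals of the two-branch network, so the following is only a generic sketch of one way a kinematic regressor might be split into a rotation branch and a translation branch. Everything here is an assumption made for illustration: the shared trunk, the layer sizes, the three-parameter rotation and translation outputs, the loss weights, and the names `TwoBranchNet` and `two_branch_loss`. Only the three-element motor-angle input follows from the three-DoF design stated above.

```python
import math
import random

def linear(n_in, n_out, rng):
    """Create a randomly initialised dense layer as (weights, biases)."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def apply(layer, x, activate=False):
    """Compute W x + b, optionally followed by tanh."""
    w, b = layer
    y = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return [math.tanh(v) for v in y] if activate else y

class TwoBranchNet:
    """Hypothetical layout: shared trunk, then rotation and translation heads."""

    def __init__(self, n_in=3, hidden=16, seed=0):
        rng = random.Random(seed)
        self.trunk = linear(n_in, hidden, rng)
        self.rot_head = linear(hidden, 3, rng)    # assumed 3 rotation parameters
        self.trans_head = linear(hidden, 3, rng)  # assumed 3 translation parameters

    def forward(self, motor_angles):
        h = apply(self.trunk, motor_angles, activate=True)
        return apply(self.rot_head, h), apply(self.trans_head, h)

def two_branch_loss(pred, target, w_rot=1.0, w_trans=1.0):
    """Weighted sum of per-branch mean squared errors (weights are assumptions)."""
    (rot_p, trans_p), (rot_t, trans_t) = pred, target
    mse = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)
    return w_rot * mse(rot_p, rot_t) + w_trans * mse(trans_p, trans_t)

net = TwoBranchNet()
rot, trans = net.forward([0.1, -0.2, 0.05])  # three motor angles in radians
loss = two_branch_loss((rot, trans), ([0.0, 0.0, 0.0], [0.0, 0.0, 0.0]))
```

Keeping separate heads lets rotational and translational errors, which have different units and scales, be weighted independently. Whether the letter's network is organised this way is not stated in the abstract.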