Learning-Based Joint Control with Hierarchical Reinforcement Learning and On-Device Execution
Satoshi Yagi, Jun Morimoto
AI summary
Problem
Conventional robot control relies on manually tuned PID controllers that degrade under varying loads and require extensive setup, while high-level reinforcement learning struggles to capture fast motor dynamics and generalize across different joints.
Approach
The method trains a fast, low-level neural policy for current control (mapping a target current to PWM commands) and a slower, high-level policy for position control. The low-level policy is quantized and deployed on onboard microcontrollers.
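The summary does not say which quantization scheme is used. As an illustration only, the sketch below assumes symmetric per-tensor int8 quantization, a common choice for microcontroller inference. The function names and example weights are hypothetical and are not taken from the paper.

```python
def quantize_symmetric_int8(weights):
    """Quantize floats to int8 using one shared scale (assumed scheme)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]


# Hypothetical weights of one small layer.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_symmetric_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

With this scheme the per-weight rounding error is bounded by half the scale, which is one way to reason about how much accuracy a quantized low-level policy might lose.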
Key results
- Eliminates manual PID tuning via learned current control
- Outperforms non-hierarchical RL in tracking accuracy and speed
- Enables a single position policy to be shared across multiple joints
- Validates real-time quantized policy execution on microcontrollers
Why it matters
This approach streamlines robot deployment by removing manual tuning bottlenecks and enabling scalable, hardware-efficient control policies that generalize across joints.
Abstract
In typical robot learning, deep reinforcement learning policies are employed in the upper control layer to generate target joint angles for robot motion, while conventional controllers are used in the fast lower control layer to control each joint motor. This paper presents a fully neural network-based hierarchical reinforcement learning approach for real-time robot joint control. The proposed method divides joint control into two layers: a high-frequency current control policy and a low-frequency position control policy. The current control policy drives the motor to follow the target current while learning the dynamic characteristics of the joint. The position control policy generates the target current to achieve a desired joint angle, allowing learning and inference at a slower frequency. By decoupling motor dynamics from position control, our method improves learning performance and enables policy generalization across joints. Experimental results on a three-joint robotic arm demonstrate the effectiveness of the proposed approach, including posture control using a shared position control policy across joints.
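The two-rate structure described in the abstract can be sketched as a nested control loop. This is a minimal illustration under stated assumptions, not the authors' implementation. The loop rates, the proportional stand-ins for the two learned policies, and the toy motor model are all hypothetical, since the abstract gives no frequencies, network architectures, or plant parameters.

```python
# Hypothetical rates; the abstract does not state the actual frequencies.
LOW_LEVEL_HZ = 1000    # current control policy (assumed)
HIGH_LEVEL_HZ = 100    # position control policy (assumed)
STEPS_PER_HIGH = LOW_LEVEL_HZ // HIGH_LEVEL_HZ


def position_policy(target_angle, angle, velocity):
    """Stand-in for the learned position policy: outputs a target current."""
    return 2.0 * (target_angle - angle) - 0.5 * velocity


def current_policy(target_current, current):
    """Stand-in for the learned current policy: outputs a PWM duty in [-1, 1]."""
    duty = target_current - current
    return max(-1.0, min(1.0, duty))


def run(target_angle, n_low_steps):
    """Simulate one joint; return the final angle and the high-level call count."""
    angle = velocity = current = target_current = 0.0
    high_level_calls = 0
    dt = 1.0 / LOW_LEVEL_HZ
    for step in range(n_low_steps):
        # The slow layer updates the target current once every STEPS_PER_HIGH ticks.
        if step % STEPS_PER_HIGH == 0:
            target_current = position_policy(target_angle, angle, velocity)
            high_level_calls += 1
        # The fast layer runs on every tick and tracks the latest target current.
        duty = current_policy(target_current, current)
        # Toy motor model (assumed): first-order current lag, damped joint.
        current += dt * 200.0 * (duty - current)
        velocity += dt * (50.0 * current - 5.0 * velocity)
        angle += dt * velocity
    return angle, high_level_calls


final_angle, calls = run(target_angle=1.0, n_low_steps=1000)
```

Because the position policy only sees angle, velocity, and a target current interface, the same function could in principle be reused for several joints, each with its own current policy, which mirrors the policy sharing the abstract describes.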