MS-rPPG: Multi-Spectral State Space Model for Remote Photoplethysmography in Driver Monitoring Systems
Jiho Choi, Sang Jun Lee
AI summary
Problem
Remote photoplethysmography (rPPG) for driver health monitoring struggles with accuracy and robustness due to uncontrolled illumination changes and frequent head movements in real-world driving scenarios.
Approach
The proposed framework fuses RGB and near-infrared facial videos using a frequency-domain cross-spectral linear modulation module and a multi-spectral Mamba state space model to efficiently capture long-range temporal and cross-channel dependencies.
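The idea of fusing two spectral streams with a frequency-domain cue can be illustrated with a much simpler stand-in than the CSLM module itself. The sketch below is not the authors' method: it scores each stream by the fraction of its spectral power inside a cardiac band and blends the streams with those scores. The function names, the 0.7 to 3.0 Hz band, and the synthetic signals are all assumptions made for illustration.

```python
import cmath
import math


def band_power_ratio(signal, fs, f_lo=0.7, f_hi=3.0):
    """Fraction of non-DC spectral power that falls inside [f_lo, f_hi] Hz.

    The band limits are an assumed cardiac range, not values from the paper.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset
    in_band, total = 0.0, 0.0
    for k in range(1, n // 2 + 1):  # positive-frequency DFT bins
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        total += power
        if f_lo <= k * fs / n <= f_hi:
            in_band += power
    return in_band / total if total > 0 else 0.0


def fuse_by_band_power(rgb, nir, fs):
    """Blend two equally long signals, weighting each by its band-power ratio."""
    q_rgb = band_power_ratio(rgb, fs)
    q_nir = band_power_ratio(nir, fs)
    denom = q_rgb + q_nir
    w_rgb = q_rgb / denom if denom > 0 else 0.5
    fused = [w_rgb * r + (1.0 - w_rgb) * v for r, v in zip(rgb, nir)]
    return fused, w_rgb


# Synthetic example: both streams carry a 1.2 Hz pulse, but the RGB stream is
# also disturbed by a strong 0.2 Hz drift standing in for illumination change.
fs = 30.0
t = [i / fs for i in range(300)]
nir = [math.sin(2 * math.pi * 1.2 * s) for s in t]
rgb = [math.sin(2 * math.pi * 1.2 * s) + 3.0 * math.sin(2 * math.pi * 0.2 * s) for s in t]
fused, w_rgb = fuse_by_band_power(rgb, nir, fs)
```

In this synthetic case the drift pushes most of the RGB power outside the assumed band, so the blend leans on the NIR stream. A learned modulation module would replace the fixed scoring rule with trainable parameters.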
Key results
- Introduced CSLM for adaptive multi-spectral feature fusion based on physiological frequency priors.
- Developed MS-Mamba, a linear-complexity state space model with bidirectional channel scanning for robust temporal modeling.
- Collected MS-Drive, a real-world driving dataset with synchronized RGB, NIR, and ECG data from 50 diverse participants.
- Achieved state-of-the-art heart rate estimation accuracy and robustness on both the MR-NIRP Car and MS-Drive datasets under unconstrained driving conditions.
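The linear-complexity and bidirectional-scanning claims can be made concrete with a toy state space recurrence. This is a minimal sketch under stated assumptions, not MS-Mamba: the parameters a, b, and c are fixed scalars chosen for illustration, whereas Mamba-style models make them input-dependent, and the scanned axis here is a plain Python list that could stand for either time steps or channels.

```python
def ssm_scan(xs, a=0.9, b=0.1, c=1.0):
    """Run h_t = a * h_{t-1} + b * x_t and emit y_t = c * h_t.

    A single pass over the input, so the cost grows linearly with len(xs).
    a, b and c are hypothetical fixed scalars.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def bidirectional_scan(xs, **params):
    """Sum a forward scan and a backward scan so each position sees both sides."""
    forward = ssm_scan(xs, **params)
    backward = ssm_scan(xs[::-1], **params)[::-1]
    return [f + g for f, g in zip(forward, backward)]


# An impulse in the middle of the sequence spreads in both directions.
impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
forward_only = ssm_scan(impulse)
both_ways = bidirectional_scan(impulse)
```

The forward-only output is zero before the impulse, while the bidirectional output is symmetric around it, which is the property a bidirectional scan is meant to provide.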
Why it matters
It enables reliable, contactless driver health monitoring in real-world vehicles, advancing safe and autonomous driving technologies.
Abstract
Remote photoplethysmography (rPPG) is a camera-based technique for measuring physiological signals, particularly cardiac activity. Heart rate, which is crucial for health monitoring, can be estimated from the remotely measured signals. In this study, we investigate a driver health monitoring system based on remote heart rate estimation. Driving environments, however, are uncontrolled settings in which videos are subject to varying illumination conditions and frequent head movements. We introduce MS-rPPG, a multi-spectral framework that combines RGB with near-infrared (NIR) face video to improve rPPG estimation under challenging driving conditions. To combine the complementary features from the two spectral videos, we propose a cross-spectral linear modulation (CSLM) strategy based on frequency-domain analysis. Moreover, we introduce MS-Mamba, a novel state space model designed to model long-range temporal dependencies while jointly capturing cross-channel interactions between multi-spectral features. We collected a real-world dataset called MS-Drive, recorded from 50 participants while they were driving a vehicle. The proposed method was evaluated on the MR-NIRP Car and MS-Drive datasets. The experimental results indicate that MS-rPPG achieves better robustness and heart rate estimation accuracy than previous methods, highlighting its promise for driver health monitoring. The code is available at github.com/ziiho08/MS-rPPG.
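The step from a recovered pulse signal to a heart rate is commonly done by locating the dominant spectral peak. The sketch below shows that generic step only and is not taken from the MS-rPPG code: the function name, the 0.7 to 3.0 Hz search band (42 to 180 bpm), the 0.01 Hz search step, and the synthetic signal are assumptions.

```python
import cmath
import math


def estimate_heart_rate_bpm(signal, fs, f_lo=0.7, f_hi=3.0, step_hz=0.01):
    """Return the strongest frequency of `signal` in [f_lo, f_hi] Hz, in bpm.

    f_lo, f_hi and step_hz are illustrative assumptions.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset

    best_freq, best_power = f_lo, -1.0
    steps = int(round((f_hi - f_lo) / step_hz))
    for k in range(steps + 1):
        freq = f_lo + k * step_hz
        # Fourier transform of the samples evaluated at one candidate frequency.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * freq * i / fs)
            for i, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_freq, best_power = freq, power
    return 60.0 * best_freq


# Synthetic 10-second pulse at 1.2 Hz (72 bpm) sampled at 30 frames per second,
# plus a slow 0.2 Hz drift that lies outside the search band.
fs = 30.0
pulse = [
    math.sin(2 * math.pi * 1.2 * i / fs) + 0.5 * math.sin(2 * math.pi * 0.2 * i / fs)
    for i in range(300)
]
hr = estimate_heart_rate_bpm(pulse, fs)
```

For this synthetic input the estimate should land close to 72 bpm. Restricting the search to a physiological band is what keeps the slow drift from being mistaken for the pulse.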