ICRA 2026

MOTOR: A Multimodal Dataset for Two-Wheeler Rider Behavior Understanding

Varun Paturkar, Shankar Gangisetty, C.V. Jawahar

AI summary

Key figure (auto-extracted from paper)
Combining RGB video, rider eye-gaze, and vehicle telemetry consistently outperforms single-modality baselines for recognizing two-wheeler behaviors and classifying maneuver legality in dense traffic.
Two-wheeler behavior; multimodal dataset; rider gaze; traffic legality; action recognition; intelligent transportation

Problem

Two-wheeler rider behavior is critically underexplored compared to four-wheelers, with existing datasets lacking scale, multimodal data, and annotations for unconventional behaviors and traffic legality in dense, unstructured traffic.

Approach

The authors introduce MOTOR, a large-scale, multi-view, multimodal dataset of synchronized front, rear, and helmet videos, eye-gaze, audio, and telemetry from 16 riders. They benchmark behavior recognition and legality classification using CNN and Transformer backbones extended with multimodal fusion.
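The summary does not say which fusion scheme the benchmark uses. As a rough illustration only, the sketch below assumes the simplest option: concatenating per-modality feature vectors and passing them to a linear classifier. The feature sizes, the random weights, and the names `fuse`, `classify`, and `demo` are all hypothetical; only the three modalities and the count of 12 maneuvers come from the text.

```python
import math
import random

# Hypothetical feature sizes; the summary does not state them.
FEATURE_DIMS = {"rgb": 8, "gaze": 2, "telemetry": 6}
NUM_CLASSES = 12  # 12 riding maneuvers, per the dataset description


def fuse(features, modalities):
    """Concatenate per-modality feature vectors in the given order."""
    fused = []
    for name in modalities:
        vec = features[name]
        if len(vec) != FEATURE_DIMS[name]:
            raise ValueError(f"unexpected feature size for {name}")
        fused.extend(vec)
    return fused


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, bias):
    """Linear layer followed by softmax over the maneuver classes."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


def demo():
    """Run the sketch on random stand-in features (not real MOTOR data)."""
    rng = random.Random(0)
    modalities = ["rgb", "gaze", "telemetry"]
    features = {
        m: [rng.uniform(-1, 1) for _ in range(FEATURE_DIMS[m])]
        for m in modalities
    }
    fused = fuse(features, modalities)
    # Untrained random weights, used only to show the shapes involved.
    weights = [[rng.uniform(-0.1, 0.1) for _ in fused] for _ in range(NUM_CLASSES)]
    bias = [0.0] * NUM_CLASSES
    return fused, classify(fused, weights, bias)
```

Dropping a name from `modalities` yields a single- or two-modality variant of the same pipeline, although the classifier weights would then need a matching input width.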

Key results

  • Introduced MOTOR, the first large-scale multi-view multimodal dataset for two-wheeler behavior in dense traffic.
  • Provided rich annotations covering 12 riding maneuvers (6 conventional, 6 unconventional) with legality labels.
  • Demonstrated that multimodal fusion of video, gaze, and telemetry consistently improves performance for behavior recognition and legality classification.
  • Delivered exhaustive modality ablation and class-wise accuracy analyses across state-of-the-art action recognition backbones.
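
The modality ablation and class-wise accuracy analyses mentioned above can be pictured with a small sketch. It assumes the ablation covers every non-empty combination of RGB, gaze, and telemetry, which the summary does not confirm. The function names and the toy labels in the usage note are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

MODALITIES = ("rgb", "gaze", "telemetry")


def ablation_configs(modalities=MODALITIES):
    """Return every non-empty subset of modalities, smallest subsets first."""
    return [
        combo
        for size in range(1, len(modalities) + 1)
        for combo in combinations(modalities, size)
    ]


def classwise_accuracy(y_true, y_pred):
    """Per-class accuracy: correct predictions divided by samples of that class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}
```

For three modalities, `ablation_configs()` yields seven configurations, from `("rgb",)` alone up to all three together. Calling `classwise_accuracy(["a", "a", "b"], ["a", "b", "b"])` on toy labels gives 0.5 for class `a` and 1.0 for class `b`.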

Why it matters

MOTOR provides a foundational benchmark for developing safety-critical models, legality-aware prediction, and intelligent transportation systems tailored to two-wheeler dynamics in the Global South.

Abstract

Two-wheelers account for a disproportionately high share of road fatalities in the Global South. Research on two-wheeler rider behavior, however, lags far behind four-wheelers, where multimodal datasets have driven major advances in Advanced Driver Assistance Systems (ADAS). To address this gap, we present the MOtorized TwO-wheeler Rider (MOTOR) dataset, the first large-scale, multi-view, multimodal resource dedicated to two-wheelers in dense, unstructured traffic. MOTOR comprises 2,500 sequences (25+ hours of video data) collected from 16 riders and integrates synchronized front, rear, and helmet videos, rider eye-gaze from wearable trackers, on-road audio, and telemetry (GPS, accelerometer, gyroscope). Rich annotations capture traffic context, rider state, 12 riding maneuvers spanning conventional and unconventional behaviors, and legality labels (Legal, Illegal, Unspecified). We benchmark rider behavior recognition and maneuver legality classification using state-of-the-art video action recognition backbones (CNN and Transformer-based), extended with multimodal fusion, and find that combining RGB, gaze, and telemetry consistently yields the best performance. MOTOR thus provides a unique foundation for advancing safety-critical understanding of two-wheeler riding. It offers the research community a benchmark to develop and evaluate models for behavior analysis, legality-aware prediction, and intelligent transportation systems. Dataset and code are available at https://varuniiith.github.io/MOTOR-Dataset/

Index terms

Data Sets for Robotic Vision; Computer Vision for Transportation; Intelligent Transportation Systems
