ICRA 2026

Factorizing Diffusion Policies for Observation Modality Prioritization

Omkar Deepak Patil, Prabin Kumar Rath, Kartikay Milind Pangaonkar, Eric Rosen, Nakul Gopalan

AI summary

Factorizing diffusion policies to explicitly prioritize specific sensory modalities yields significantly higher sample efficiency and robustness to noisy inputs than standard joint conditioning.

Diffusion policies, modality prioritization, robotic skill learning, distribution shift robustness, factorized learning, multimodal conditioning

Problem

Standard diffusion policies jointly condition on all sensory inputs, ignoring their task-dependent relevance and becoming brittle when specific modalities are noisy or shifted.

Approach

FDP splits learning into a base policy trained on prioritized modalities and a residual policy that captures the effect of all other modalities, enabling explicit prioritization orders during training.
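
The base-plus-residual split can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the linear maps stand in for learned noise-prediction networks, proprioception is arbitrarily chosen as the prioritized modality, the two predictions are combined by simple addition, and the diffusion timestep input is omitted. The names `base_eps`, `residual_eps`, and `factorized_eps` are hypothetical and are not taken from the paper or its code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for this sketch.
ACTION_DIM, PROPRIO_DIM, VISION_DIM = 2, 3, 4

# Random linear maps standing in for trained noise-prediction networks.
W_base = rng.normal(size=(ACTION_DIM, ACTION_DIM + PROPRIO_DIM))
W_res = rng.normal(size=(ACTION_DIM, ACTION_DIM + PROPRIO_DIM + VISION_DIM))


def base_eps(noisy_action, proprio):
    """Noise prediction conditioned only on the prioritized modality."""
    return W_base @ np.concatenate([noisy_action, proprio])


def residual_eps(noisy_action, proprio, vision):
    """Correction term conditioned on all modalities."""
    return W_res @ np.concatenate([noisy_action, proprio, vision])


def factorized_eps(noisy_action, proprio, vision):
    """Combined prediction: base term plus residual term (assumed additive)."""
    return base_eps(noisy_action, proprio) + residual_eps(noisy_action, proprio, vision)


# Perturb only the non-prioritized modality and compare the two terms.
a_t = rng.normal(size=ACTION_DIM)
proprio = rng.normal(size=PROPRIO_DIM)
vision_clean = rng.normal(size=VISION_DIM)
vision_shifted = vision_clean + rng.normal(size=VISION_DIM)

total_change = factorized_eps(a_t, proprio, vision_clean) - factorized_eps(a_t, proprio, vision_shifted)
residual_change = residual_eps(a_t, proprio, vision_clean) - residual_eps(a_t, proprio, vision_shifted)
print(np.allclose(total_change, residual_change))
```

In this toy setup, a change to the non-prioritized input reaches the combined prediction only through the residual term, because the base term never sees that input. How the actual method trains and combines the two policies is described in the paper, not here.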

Key results

15% absolute success rate improvement in low-data regimes across four simulated benchmarks
40% higher absolute success rate under distribution shifts such as visual distractors and camera occlusions
Mathematical derivation of a factorized diffusion framework with a novel block-wise residual architecture
Flexible modality prioritization as a tunable hyperparameter without manual weight tuning

Why it matters

Enables safer and more sample-efficient deployment of diffusion-based robot policies in real-world settings where sensory data is unreliable or shifts over time.

Abstract

Diffusion models have been extensively leveraged for learning robot skills from demonstrations. These policies are conditioned on several observational modalities such as proprioception, vision, and tactile sensing. However, observational modalities have varying levels of influence for different tasks, which diffusion policies fail to capture. In this work, we propose ‘Factorized Diffusion Policies’, abbreviated as FDP, a novel policy formulation that enables observational modalities to have differing influence on the action diffusion process by design. This results in learning policies where certain observation modalities can be prioritized over others, such as vision > tactile or proprioception > vision. FDP achieves modality prioritization by factorizing the observational conditioning for the diffusion process, resulting in more performant and robust policies. Our factored approach shows strong performance improvements in low-data regimes, with a 15% absolute improvement in success rate on several simulated benchmarks when compared to a standard diffusion policy that jointly conditions on all input modalities. Moreover, our benchmark and real-world experiments show that factored policies are naturally more robust, with a 40% higher absolute success rate across several visuomotor tasks under distribution shifts such as visual distractors or camera occlusions, where existing diffusion policies fail catastrophically. FDP thus offers a safer and more robust alternative to standard diffusion policies for real-world deployment. Code and videos are available at https://fdp-policy.github.io/fdp-policy/.
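
One standard way to see how observational conditioning can be factorized is through Bayes' rule applied to the conditional action distribution. The notation below is assumed for illustration (a_t for the noisy action, o_p for the prioritized modalities, o_r for the remaining ones), and the paper's own derivation may differ in form and detail.

```latex
\begin{align}
p(a_t \mid o_p, o_r)
  &= \frac{p(a_t \mid o_p)\, p(o_r \mid a_t, o_p)}{p(o_r \mid o_p)} \\
\nabla_{a_t} \log p(a_t \mid o_p, o_r)
  &= \underbrace{\nabla_{a_t} \log p(a_t \mid o_p)}_{\text{base term}}
   + \underbrace{\nabla_{a_t} \log p(o_r \mid a_t, o_p)}_{\text{residual term}}
\end{align}
```

The denominator p(o_r | o_p) does not depend on a_t, so it vanishes under the gradient. The score of the jointly conditioned distribution therefore splits into a term that depends only on the prioritized modalities and a term that accounts for the remaining ones.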

Index terms

Learning from Demonstration, Probability and Statistical Methods, Imitation Learning