ICRA 2026

Viewpoint-Agnostic Manipulation Policies with Strategic Vantage Selection

Sreevishakh Vasudevan Nampoothiri, Som Sagar, Ransalu Senanayake

AI summary

Fine-tuning on a small set of camera viewpoints chosen via Bayesian optimization makes robot manipulation policies more robust to viewpoint shifts than fine-tuning on fixed, grid, or random views, raising diffusion-policy success rates by about 25% with only a handful of fine-tuning steps.

Viewpoint selection, Bayesian optimization, Robot manipulation, Policy fine-tuning, Domain generalization, Vision-based control

Problem

Vision-based manipulation policies trained on a single viewpoint fail when camera angles change during deployment, while naively aggregating data from many random views is costly and destabilizes learning due to excessive visual diversity.

Approach

Vantage treats viewpoint selection as a Bayesian optimization problem, using a Gaussian process surrogate and batched Upper Confidence Bound to iteratively identify and fine-tune on a small set of highly informative camera poses.
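
The loop described above can be sketched in a few lines. This is a minimal illustration, not the Vantage implementation: it assumes a viewpoint is a normalized 2-D point (azimuth, elevation), uses a fixed RBF kernel, and builds each batch by greedily maximizing UCB while shrinking the posterior variance around already-picked points (a variance-only update in the spirit of GP-BUCB; the summary does not say which batching scheme the paper uses). All function names, hyperparameters, and the `toy_score` objective are hypothetical stand-ins.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel between two sets of row vectors."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-3):
    """Posterior mean and std of a zero-mean GP with unit prior variance."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_oq = rbf_kernel(x_obs, x_query)
    solved = np.linalg.solve(k_oo, k_oq)           # K_oo^{-1} K_oq
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_oq * solved, axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def select_batch_ucb(x_obs, y_obs, candidates, batch_size=3, beta=2.0):
    """Greedily pick distinct candidate indices with the highest UCB."""
    offset = y_obs.mean()                          # center targets for the zero-mean GP
    mean, _ = gp_posterior(x_obs, y_obs - offset, candidates)
    mean = mean + offset
    x_cond = x_obs.copy()
    chosen = []
    for _ in range(batch_size):
        # GP variance depends only on input locations, so pending points
        # can be conditioned on before their scores are known.
        _, std = gp_posterior(x_cond, np.zeros(len(x_cond)), candidates)
        ucb = mean + beta * std
        if chosen:
            ucb[chosen] = -np.inf                  # no repeats within a batch
        idx = int(np.argmax(ucb))
        chosen.append(idx)
        x_cond = np.vstack([x_cond, candidates[idx]])
    return chosen

def toy_score(view):
    """Hypothetical objective; a real run would fine-tune and evaluate a policy."""
    azimuth, elevation = view
    return float(np.exp(-((azimuth - 0.7) ** 2 + (elevation - 0.3) ** 2) / 0.05))

def run_selection(rounds=5, batch_size=3, seed=0):
    """Alternate between choosing a batch of viewpoints and scoring them."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(0.0, 1.0, size=(200, 2))
    x_obs = rng.uniform(0.0, 1.0, size=(2, 2))     # two initial viewpoints
    y_obs = np.array([toy_score(v) for v in x_obs])
    for _ in range(rounds):
        batch = select_batch_ucb(x_obs, y_obs, candidates, batch_size)
        new_x = candidates[batch]
        new_y = np.array([toy_score(v) for v in new_x])
        x_obs = np.vstack([x_obs, new_x])
        y_obs = np.concatenate([y_obs, new_y])
    return x_obs, y_obs

if __name__ == "__main__":
    views, scores = run_selection()
    print("best viewpoint found:", views[np.argmax(scores)])
```

The `beta` parameter sets the exploration-exploitation trade-off: larger values favor viewpoints where the surrogate is uncertain, smaller values favor viewpoints the surrogate already predicts to score well.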

Key results

Increases task success rate by ~25% for diffusion policies under viewpoint shifts
Converges to near-optimal viewpoints with only a handful of fine-tuning steps
Provides theoretical guarantees in the form of regret bounds, convergence rates, and robustness to camera placement errors
Consistently outperforms fixed, grid, and random selection strategies across simulated and real-world tasks
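
For readers unfamiliar with the term, the regret mentioned in the results above is usually defined as follows in the Bayesian optimization literature. The block below gives the standard definition and the classical GP-UCB bound as background; the notation ($f$, $x_t$, $\beta_T$, $\gamma_T$) is assumed here, and the paper's own theorem statements may take a different form.

```latex
% Cumulative regret after T queried viewpoints x_1, ..., x_T,
% for an objective f with maximizer x^*:
R_T = \sum_{t=1}^{T} \bigl( f(x^*) - f(x_t) \bigr)

% Classical GP-UCB guarantee (holds with high probability):
R_T = \mathcal{O}\!\left( \sqrt{T \, \beta_T \, \gamma_T} \right)

% \beta_T  : the UCB exploration weight
% \gamma_T : the maximum information gain obtainable from T observations
% If \beta_T \gamma_T grows more slowly than T, then R_T / T \to 0 (no regret).
```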

Why it matters

Enables reliable, data-efficient fine-tuning of vision-guided robot policies for dynamic camera setups without costly brute-force data collection.

Abstract

Since vision-based manipulation policies are typically trained from data gathered from a single viewpoint, their performance drops when the view changes during deployment. Naively aggregating demonstrations from numerous random views is not only costly but also known to destabilize learning, as excessive visual diversity acts as noise. We present Vantage, a viewpoint selection framework to fine-tune any pre-trained policy on a small, strategically chosen set of camera poses to induce viewpoint-agnostic behavior. Instead of relying on costly brute-force search over viewpoints, Vantage formulates camera placement as an information gain optimization problem in a continuous space. This approach balances exploration of novel poses with exploitation of promising ones, while also providing theoretical guarantees about convergence and robustness. Across manipulation tasks and policy families, Vantage consistently improves success under viewpoint shifts compared to fixed, grid, or random data selection strategies with only a handful of fine-tuning steps. Experiments conducted on simulated and real-world setups show that Vantage increases the task success rate by ≈25% for diffusion policies, and yields robust gains in dynamic-camera settings. GitHub: https://github.com/sreevishakhv/Vantage_Public

Index terms

Incremental Learning, Continual Learning, AI-Enabled Robotics