← Back ICRA 2026

PO-GVINS: A Tightly Coupled GNSS-Visual-Inertial Navigation Framework Using Pose-Only Representation

Zhuo Xu, Feng Zhu, Zihang Zhang, Chang Jian, Jiarui Lv, Yuantai Zhang, Xiaohong Zhang

PDF

AI summary

Key figure (auto-extracted from paper)

PO-GVINS eliminates feature linearization errors and state dimension explosion in visual-inertial navigation by using a pose-only formulation tightly coupled with GNSS-RTK for drift-free, high-precision positioning.

Pose-only formulation GNSS-visual-inertial fusion Tightly coupled navigation Real-time kinematics Visual-inertial odometry Filtering-based SLAM

Problem

Visual-inertial navigation systems suffer from accumulated drift, linearization errors from explicit 3D feature modeling, and state dimension explosion that degrades real-time performance. Existing tightly coupled GNSS-VINS frameworks often fail to fully utilize raw GNSS data or inherit these VINS limitations.

Approach

The method reformulates visual observations using a pose-only representation that expresses feature depth via camera poses instead of explicit 3D parameters, then sequentially fuses these measurements with raw GNSS-RTK pseudorange and carrier phase data in a filtering-based framework.

Key results

Pose-only formulation eliminates feature linearization errors and prevents state dimension explosion
Tightly coupled GNSS-RTK integration with continuous integer ambiguity resolution ensures drift-free positioning
PO-VINS outperforms MSCKF and matches optimization-based VINS accuracy while improving real-time performance
PO-GVINS achieves robust, high-precision positioning in challenging GNSS-degraded environments

Why it matters

Provides a computationally efficient and highly accurate navigation solution for autonomous vehicles and robots operating in environments where GNSS signals are intermittently blocked.

Abstract

Accurate and reliable positioning is essential for per- ception, decision-making, and other high-level applications in autonomous driving, autonomous aerial vehicles, and intelligent robotics. Due to the inherent limitations of standalone sensors, in- tegratingheterogeneoussensorswithcomplementarycapabilitiesis an effective approach to achieving this goal. The visual-inertial nav- igation system (VINS) fuses visual cameras and inertial measure- ment units (IMUs) to explore unknown environments. It requires a priori knowledge of 3D features and jointly estimates camera poses andfeaturepositions,whichinevitablyintroducesfeaturelineariza- tion errors. Meanwhile, the dimensionality of the system state in- creases with abundant textures, degrading real-time performance. To eliminate accumulated errors from VINS, frameworks further fuse measurements from the Global Navigation Satellite System (GNSS), but still suffer from similar limitations. To address the aforementioned issues, we propose a filtering-based, tightly coupled GNSS-visual-inertial positioning framework with a pose-only for- mulation applied to VINS, termed PO-GVINS. We first apply the PO formulation to our VINS (PO-VINS). GNSS raw measurements are subsequently incorporated, with integer ambiguities resolved, to achieve accurate and drift-free state estimation. Extensive ex- periments demonstrate that the proposed PO-VINS significantly outperforms the multi-state constraint Kalman filter (MSCKF) and achieves accuracy comparable to that of optimization-based VINS. By further incorporating GNSS measurements, PO-GVINS achieves accurate, drift-free state estimation, making it a robust solution for positioning in challenging environments. Received 16 June 2025; accepted 23 August 2025. Date of publication 5 September 2025; date of current version 11 September 2025. This article was recommended for publication by Associate Editor S. Santra and Editor A. Banerjee upon evaluation of the reviewers’ comments. This work was sup- ported in part by the National Science Fund for Distinguished Young Scholars of China under Grant 42425003, in part by the National Key Research and Development Program of China under Grant 2022YFB3903802, and in part by the National Natural Science Foundation of China under Grant 42374031 and Grant 42388102. (Corresponding author: Feng Zhu.) Zhuo Xu, Zihang Zhang, Chang Jian, Jiarui Lv, and Yuantai Zhang are with the School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China (e-mail: zhuoxu@whu.edu.cn; zihangzhang@whu.edu.cn; 2024202140027@whu.edu.cn; lvjiauri@whu.edu.cn; officialtai@whu.edu.cn). Feng Zhu is with the School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China, and also with Hubei Luojia Laboratory, Wuhan Univer- sity, Wuhan 430079, China (e-mail: fzhu@whu.edu.cn). Xiaohong Zhang is with the School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China, and also with the Chinese Antarctic Center of Surveying and Mapping, Wuhan University, Wuhan 430079, China (e-mail: xhzhang@sgg.whu.edu.cn). Datasets used in this research are available at https://gitee.com/lv-jiarui/ SmartPNT-MSF-Datasets.git. Digital Object Identifier 10.1109/LRA.2025.3606792

Index terms

Localization Visual-Inertial SLAM Autonomous Vehicle Navigation