UltraHiT: A Hierarchical Transformer Architecture for Generalizable Internal Carotid Artery Robotic Ultrasonography
Teng Wang, Haojun Jiang, Yuxuan Wang, Zhenguo Sun, Xiangjie Yan, Xiang LI, Gao Huang
AI summary
Problem
Existing robotic ultrasound systems have not tackled the internal carotid artery (ICA), whose deep location, tortuous course, and significant individual anatomical variations greatly increase scanning complexity.
Approach
The method uses a hierarchical transformer that identifies anatomical variations and dynamically switches between a knowledge-based standard scanning path and a data-driven adaptive corrector to guide the probe.
Key results
- First autonomous ICA longitudinal scanning achieved on a robotic platform
- Novel hierarchical transformer architecture effectively handles anatomical variability
- Collection of the first large-scale ICA scanning dataset with 164 trajectories and 72K samples
- 95% success rate in locating the ICA on unseen individuals, outperforming baselines
Why it matters
Enables reliable, automated ICA ultrasound for clinical cerebrovascular assessment, addressing sonographer shortages and improving diagnostic accessibility.
Abstract
Carotid ultrasound is crucial for the assessment of cerebrovascular health, particularly the internal carotid artery (ICA). While previous research has explored automating carotid ultrasound, none has tackled the challenging ICA. This is primarily due to its deep location, tortuous course, and significant individual variations, which greatly increase scanning complexity. To address this, we propose a Hierarchical Transformer-based decision architecture, namely UltraHiT, that integrates high-level variation assessment with low-level action decision. Our motivation stems from conceptualizing individual vascular structures as morphological variations derived from a standard vascular model. The high-level module identifies variation and switches between two low-level modules: an adaptive corrector for variations, or a standard executor for normal cases. Specifically, both the high-level module and the adaptive corrector are implemented as causal transformers that generate predictions based on the historical scanning sequence. To ensure generalizability, we collected the first large-scale ICA scanning dataset comprising 164 trajectories and 72K samples from 28 subjects of both genders. Based on the above innovations, our approach achieves a 95% success rate in locating the ICA on unseen individuals, outperforming baselines and demonstrating its effectiveness. Project website: https://ultrahit-thu.github.io/UltraHiT/.
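The switching logic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The class name, the callables, the 0.5 threshold, and the tuple-based observation and action types are all hypothetical, and the two learned modules, which the abstract describes as causal transformers, are replaced here by toy functions. The sketch shows only the control flow: at each step the high-level module sees the scanning history so far and routes the decision to either the adaptive corrector or the standard executor.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-ins: a real system would use image features and 6-DoF probe motions.
Observation = Tuple[float, ...]
Action = Tuple[float, ...]


@dataclass
class HierarchicalController:
    """Illustrative two-level decision loop (all names are assumptions)."""

    # High-level module: history so far -> estimated probability of anatomical variation.
    assess_variation: Callable[[Sequence[Observation]], float]
    # Low-level module for variations: history so far -> corrective action.
    adaptive_corrector: Callable[[Sequence[Observation]], Action]
    # Low-level module for normal cases: step index -> action on the standard path.
    standard_executor: Callable[[int], Action]
    threshold: float = 0.5  # assumed switching threshold, not taken from the paper
    history: List[Observation] = field(default_factory=list)

    def step(self, obs: Observation) -> Tuple[str, Action]:
        # Causal by construction: modules only ever see observations up to now.
        self.history.append(obs)
        p_variation = self.assess_variation(self.history)
        if p_variation >= self.threshold:
            return "adaptive_corrector", self.adaptive_corrector(self.history)
        return "standard_executor", self.standard_executor(len(self.history) - 1)


# Toy replacements for the learned modules, for illustration only.
def toy_assess(history: Sequence[Observation]) -> float:
    # Flag a variation when the latest observation deviates from 0 by more than 0.2.
    return 1.0 if abs(history[-1][0]) > 0.2 else 0.0


def toy_corrector(history: Sequence[Observation]) -> Action:
    # Move against the observed deviation.
    return (-history[-1][0],)


def toy_standard(step_index: int) -> Action:
    # Fixed step along the standard scanning path.
    return (0.1,)


controller = HierarchicalController(toy_assess, toy_corrector, toy_standard)
first = controller.step((0.0,))   # no deviation: standard path
second = controller.step((0.5,))  # deviation detected: corrector takes over
```

Keeping the switch in a separate high-level module means the standard path can remain a fixed, knowledge-based routine, while the learned corrector is only consulted when the history suggests the anatomy departs from the standard model.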