ARTEMIS: Active Real-Time Textured Environment Meshing with Interactive Semantics
Yigu Ge, Zhenhuan Ma, Shihao Tang, Yangxi Shi, Xinkai Liang, Hao Fang
AI summary
Problem
Existing 3D reconstruction and SLAM systems trade off geometric precision, semantic richness, and photorealistic texture against one another, and they lack the real-time interactivity that embodied AI agents need.
Approach
The system uses a Semantic Brush methodology that fuses LiDAR, visual, and inertial data with natural language queries through a two-stage pipeline: semantically-guided mesh optimization to preserve sharp boundaries, followed by parallel refinement of texture and semantics using a unified reliability metric.
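To make the idea of a unified reliability metric concrete, the following is a minimal sketch and not the authors' implementation. It assumes each observation of a mesh face carries a color residual and a depth residual, combines them into one Gaussian-style weight, and uses that weight both to blend texture colors and to vote on the semantic label. The function names, the `sigma_color` and `sigma_depth` scales, and the `min_weight` cutoff are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def reliability(color_residual, depth_residual, sigma_color=0.1, sigma_depth=0.05):
    """Hypothetical reliability weight in (0, 1].

    Returns 1.0 when both residuals are zero and decays as either the
    color or the depth disagreement grows. The Gaussian form and the
    sigma values are assumptions, not taken from the paper.
    """
    return math.exp(
        -0.5 * (color_residual / sigma_color) ** 2
        - 0.5 * (depth_residual / sigma_depth) ** 2
    )


def fuse_observations(observations, min_weight=1e-3):
    """Fuse per-face observations into one color and one semantic label.

    Each observation is a dict with keys:
      "color"          -> (r, g, b) tuple of floats
      "label"          -> semantic label string
      "color_residual" -> color inconsistency for this observation
      "depth_residual" -> depth inconsistency for this observation
    Observations whose weight falls below min_weight are discarded, so
    unreliable measurements affect neither texture nor semantics.
    """
    color_sum = [0.0, 0.0, 0.0]
    weight_sum = 0.0
    label_votes = defaultdict(float)

    for obs in observations:
        w = reliability(obs["color_residual"], obs["depth_residual"])
        if w < min_weight:
            continue  # filter out unreliable measurements
        weight_sum += w
        for i in range(3):
            color_sum[i] += w * obs["color"][i]
        label_votes[obs["label"]] += w

    if weight_sum == 0.0:
        return None, None  # no reliable observation for this face

    fused_color = tuple(c / weight_sum for c in color_sum)
    fused_label = max(label_votes, key=label_votes.get)
    return fused_color, fused_label
```

Because a single weight drives both the color blend and the label vote, an observation that is inconsistent in depth is discounted for texture and semantics at once, which is one plausible reading of how a shared metric could filter unreliable measurements.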
Key results
- State-of-the-art geometric accuracy on KITTI benchmarks
- Superior photorealistic texture mapping over point-cloud SLAM
- Real-time semantic highlighting from natural language queries
- Sub-100ms processing time enabling live sensor integration
Why it matters
It advances embodied AI by providing a responsive, semantically rich Living Map that bridges the gap between static 3D reconstruction and interactive, language-driven environmental understanding.
Abstract
To advance 3D reconstruction from static digital replicas towards semantically interactive Living Maps responsive to an agent’s queries, we propose ARTEMIS, a system for Active Real-time Textured Environment Meshing with Interactive Semantics. At its core, our Semantic Brush is a methodology composed of tightly coupled modules for segmentation, constraint, and refinement that operate in a two-stage, coarse-to-fine pipeline. First, its segmentation and constraint modules translate natural language into a semantically aware mesh, enforcing sharp object boundaries with a unified energy function. Next, its refinement module computes a unified reliability metric from color and depth consistency to guide the joint optimization of the texture map and semantic labels. This holistic process inherently filters unreliable measurements, establishing a complete interactive workflow from language input to real-time highlighting on a high-fidelity textured mesh. We evaluated ARTEMIS on public datasets and in real-world scenarios. The results demonstrate state-of-the-art accuracy in mesh reconstruction together with high fidelity in both texture and semantics. Our code will be made publicly available.
* This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant No. 62133002.
1 All authors are with the School of Automation, Beijing Institute of Technology. Yigu Ge (bengay@bit.edu.cn), Shihao Tang (shihaotang@bit.edu.cn).
2 The corresponding author: Hao Fang (fangh@bit.edu.cn).