Breakthrough at ChinaSI 2025: How Stereo Vision Revolutionizes SLAM

From July 18 to 20, 2025, the China Spatial Intelligence Conference (ChinaSI 2025) was held in Shenzhen’s Guangming Cloud Valley. Hosted by Sun Yat-sen University under the China Society for Image and Graphics, the event welcomed nearly 1,000 experts, researchers, and product developers, uniting around topics such as SLAM, spatial intelligence, 3D reconstruction, robot perception, and multimodal sensing.
Across eight featured tracks—including SLAM & 3D Reconstruction and Spatial Intelligence & Robotics—stereo vision emerged as a cornerstone of dense depth perception, real-time localization, and mapping. Live demos and poster sessions showed stereo SLAM at work in robotics, drones, and spatial AI systems.
Why Stereo Vision Is Fundamental to SLAM
- Direct 3D Depth via Disparity
Stereo vision captures synchronized left/right images and computes depth from pixel disparity, mirroring human binocular perception. This avoids the scale ambiguity and drift inherent in monocular SLAM systems, enabling more reliable 3D reconstruction.
- Centimeter-Level Accuracy for Rich Mapping
Advanced matching algorithms such as SGM and PatchMatch produce dense depth outputs with centimeter-level precision. These high-quality depth maps feed SLAM frameworks that build stable, loop-closed point clouds.
- Resilience in Dynamic & Low-Texture Environments
Stereo SLAM is more robust than monocular or RGB-D systems in scenes involving motion or sparse visual features. When combined with semantic segmentation or dynamic-object filtering, drift error is reduced further.
- Intrinsic Scale Estimation for Global Consistency
The known baseline of a stereo camera provides true metric scale without external sensors, simplifying system configuration and preventing scale drift. This intrinsic scaling advantage ensures global consistency across mapping workflows.
- Cost & Deployment Advantages over LiDAR
Stereo cameras are cheaper, lower power, and easier to install than LiDAR, while still offering rich spatial texture and depth data. This makes them well suited to indoor robotics, drones, and mobile SLAM applications.
- Semantic-Level Perception for Obstacle Awareness
Depth maps from stereo feed semantic models that detect obstacle location and size, letting SLAM pipelines combine geometric understanding with obstacle-aware navigation.
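The depth-from-disparity relation behind the first point above can be sketched in a few lines. This is an illustrative example of the standard triangulation formula Z = f·B/d; the camera parameters below are made-up example values, not those of any specific stereo module.

```python
# Sketch of stereo depth-from-disparity, the core relation that gives
# stereo SLAM its metric depth: Z = f * B / d.
# All numeric values here are illustrative assumptions.

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Triangulate metric depth from pixel disparity.

    disparity_px: horizontal pixel offset of a point between left/right views
    focal_px:     focal length expressed in pixels
    baseline_m:   distance between the two camera centers, in meters
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Example: f = 700 px, baseline = 0.06 m (60 mm), disparity = 42 px
z = depth_from_disparity(42.0, 700.0, 0.06)
print(f"depth = {z:.2f} m")  # -> depth = 1.00 m
```

Because the baseline B is known by construction, the recovered depth is in true metric units—this is exactly the intrinsic scale advantage over monocular SLAM noted above.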
ChinaSI 2025 On-site Highlights

The atmosphere at ChinaSI 2025 was palpably energetic—attendees streamed through the entrance, demo stations buzzed with activity, and poster zones were packed. The conference transformed into a tangible showcase of spatial intelligence: tech, people, and vision were all visibly intertwined.
Live Demo & Attendee Engagement
- A queue of professionals snaked along the venue entrances, each exhibitor displaying QR codes for quick passes. Inside, neon-lit booths and AR overlays mingled with algorithm posters, creating a high-tech ambiance. Live demo stations drew crowds of curious spectators as stereo-enabled robots navigated narrow corridors, updating point-cloud maps in real time. Each successful dodge—or mapping refinement—drew applause.
- At the poster display area, university and industry teams stood by large A0 panels, explaining stereo SLAM methods to onlookers. Visitors peppered them with detailed questions: "What's your depth precision?" "How fast can the map build?" and "How stable is drift error during navigation?"
Tech Showcases & Demonstrated Solutions
- In the SLAM & 3D Reconstruction hall, robotic platforms performed stereo SLAM with live mapping. A simulated urban street path let a robot map as it traversed the course, showing the audience how stereo depth underpins navigation and obstacle avoidance.
- The Spatial Intelligence & Robotics section displayed humanoids equipped with Viobot2 modules, autonomously localizing, navigating, and demonstrating SLAM-based tasks in real-time—highlighting potential for service robotics.
- A side Track discussion for CTO-level attendees covered deployment strategies for stereo VIO modules, including cost analysis, SDK support, developer ecosystems, and post-deployment maintenance. The discourse ranged from hardware engineering to business practicality.
Industry Trends & Intellectual Tone
- ChinaSI retains its heritage in academic SLAM while firmly expanding into a broader “spatial intelligence” narrative. Speakers stressed that stereo cameras aren’t just sensors—they are the inputs to temporal AI models that construct dynamic 3D world maps.
- Many startup founders and CTOs in attendance asserted: “Without stable stereo depth perception, we can’t deploy inspection robots, mapping drones, or indoor navigation platforms at scale.”
Media Coverage & Narrative Amplification
- Tech platforms like CSDN, Zhihu, and blog aggregators published high-engagement posts with headlines such as "ChinaSI 2025 Gala: Elite SLAM Ecosystem Arrayed" and "Stereo Vision Live Demo Debuts"—drawing tens of thousands of pageviews.
- Mobile video snippets featured robot demonstrations, including shots of Jetson compute boards connected to stereo modules, Viobot2 hardware insertion, and mapping in progress, amplifying the event's visibility and impact.
Viobot2 Module: Stereo Fish‑Eye + IMU VIO Case Study
Viobot2 (RoboBaton‑Viobot2) from Hessian Matrix is a highly integrated VIO-SLAM module with:
- Stereo Fish-Eye Cameras + 6-DoF IMU + Optional GNSS
Features a 60 mm stereo fish-eye pair with 164.7° FOV, a built-in IMU, and L1+L5 GNSS fusion for indoor-outdoor positioning.
- Embedded Compute & Interfaces
Includes an 8-core CPU (up to 2.4 GHz), a Mali GPU, and a 6 TOPS NPU; weighs ~138 g with ~11 W power consumption; supports USB, Type-C, RJ45, CAN, I2C, and UART.
- High-Frequency Pose & Depth Output
Outputs visual-GNSS fused pose data at 10–200 Hz and depth maps over a 0.5–5 m range, feeding front-end SLAM and mapping workflows.
- Reinforced Mechanical Design
Metal structural reinforcement increases deformation tolerance by 245%, preserving stereo baseline calibration.
Viobot2 integrates with the RoboBaton UI/SDK, supporting structure-from-motion mapping, loop-closure relocalization (over ~500 m² areas), and real-time ROS1/ROS2 navigation workflows.
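The spec's emphasis on preserving baseline calibration can be motivated with a quick back-of-the-envelope sketch: stereo depth error per pixel of disparity error grows roughly as dZ ≈ Z²/(f·B). The 60 mm (0.06 m) baseline below comes from the Viobot2 spec above; the 700 px focal length is an assumed illustrative value, not a published parameter.

```python
# Rough sketch of stereo depth sensitivity: dZ ≈ Z^2 / (f * B) per pixel
# of disparity error. Baseline 0.06 m is taken from the spec above;
# focal_px=700 is an assumed illustrative value.

def depth_error_per_px(depth_m, focal_px=700.0, baseline_m=0.06):
    """Approximate depth error (m) caused by 1 px of disparity error."""
    return depth_m ** 2 / (focal_px * baseline_m)

for z in (0.5, 2.0, 5.0):  # the module's stated 0.5-5 m depth range
    print(f"Z = {z:.1f} m -> ~{depth_error_per_px(z) * 100:.1f} cm per px of error")
```

Under these assumed numbers, the per-pixel depth error grows quadratically with range—from under a centimeter at 0.5 m to tens of centimeters at 5 m—which is why mechanical rigidity that keeps the baseline calibrated translates directly into map quality.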