What Problems Does ZUPT Solve in VIO/VSLAM? Tuning Tips and Practical Fixes
Why is ZUPT Hailed as a "Universal Handbook" in VIO/VSLAM?
ZUPT (zero-velocity update) is considered a "cheat-sheet" solution because it helps VIO (Visual-Inertial Odometry) and VSLAM (Visual Simultaneous Localization and Mapping) systems cope with extreme situations, especially when camera or IMU quality is poor. Typical problems include unknown rolling-shutter exposure times, no MCU (Microcontroller Unit) synchronization, inaccurate intrinsic calibration, and IMU glitches, delays, or severe biases.
Numerous anomalies can occur, and the primary prerequisite for solving them is establishing a strong foundation. Here's a ranked list of the most critical foundational factors in tightly-coupled multi-sensor systems:
- Shutter Type: Global shutters significantly outperform rolling shutters indoors. Outdoors, however, global shutters require more fine-tuning, and rolling shutters regain some of their advantage.
- Time Synchronization (Td): For any tightly-coupled multi-sensor system, the inability to precisely synchronize time will severely degrade system performance. It is crucial to keep the time difference between sensors within 10 milliseconds.
- IMU Data Quality: IMU data glitches must be strictly controlled and convergence delays minimized. These two goals conflict, because adding filters to suppress glitches typically introduces delay. Selecting a high-quality IMU at the source is therefore paramount; never sacrifice performance to save a few tens of yuan.
- Extrinsic Calibration: Extrinsic parameters (e.g., Qbc, Pbc, Gc0) are far more critical than intrinsic parameters, and their calibration process is also more complex.
- Initialization Quality: VINS-MONO has relatively low initialization overhead, while ORB-SLAM's performance is mediocre. Systems like DSO/VI-DSO/DM-VIO, however, have extremely high initialization overhead, and their initialization quality directly impacts performance throughout all subsequent operational stages.
- Photometric Consistency: Photometric information (such as lux and auto-exposure) generally has a moderate impact on feature-point-based systems like VINS/ORB. However, for direct-method systems such as DSO/DM-VIO, it is a decisive factor. Without a deep understanding of DSO's photometric calibration, true direct methods cannot be achieved. Blindly attempting DM-VIO parameter tuning without understanding these fundamentals is undoubtedly a waste of time and effort, as DM-VIO has extremely high overhead and its front-end and back-end parallelization is exceptionally challenging.
- Intrinsic Calibration: The impact of intrinsic parameters is relatively minor. If one cannot even master basic tools like Zhang's calibration or Kalibr, then pursuing computer vision might not be suitable.
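As a rough illustration of the time-synchronization item, the sketch below estimates a camera-IMU offset by cross-correlating two angular-rate magnitude signals and compares it with the 10 ms budget mentioned above. This is a generic offline sanity check, not a procedure taken from any particular VIO system. The function names `estimate_time_offset` and `within_budget` are hypothetical, and both signals are assumed to be already resampled onto a common, uniformly spaced time grid.

```python
import numpy as np

def estimate_time_offset(imu_rate, cam_rate, dt):
    """Estimate how far cam_rate lags imu_rate, in seconds.

    imu_rate: angular-rate magnitude from the gyroscope.
    cam_rate: angular-rate magnitude derived from visual tracking.
    dt:       spacing of the common time grid (assumption: uniform).
    A positive result means the camera stream is delayed.
    """
    a = np.asarray(cam_rate, dtype=float)
    v = np.asarray(imu_rate, dtype=float)
    a = a - a.mean()  # remove the mean so the peak reflects signal shape
    v = v - v.mean()
    corr = np.correlate(a, v, mode="full")
    # In "full" mode, output index 0 corresponds to lag -(len(v) - 1).
    lags = np.arange(-(len(v) - 1), len(a))
    return lags[np.argmax(corr)] * dt

def within_budget(td, budget=0.010):
    """True if |td| is inside the 10 ms budget quoted in the text."""
    return abs(td) <= budget

# Synthetic check: delay a random signal by 4 samples on a 5 ms grid.
rng = np.random.default_rng(0)
imu = rng.standard_normal(500)
cam = np.roll(imu, 4)
td = estimate_time_offset(imu, cam, dt=0.005)
print(f"estimated offset: {td * 1e3:.1f} ms, within budget: {within_budget(td)}")
```

Cross-correlation only resolves the offset to one grid step, so the grid spacing should be well below the budget being checked.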
In practice, if all seven points above are executed precisely, a tightly-coupled system with prior, visual, and IMU constraints will operate well, and the behavior of the covariances and uncertainty propagation will closely match theoretical expectations. Even then, ZUPT may still be necessary in certain situations, though far less often.
Although ZUPT is considered a "universal handbook" and can be applied to almost any extreme situation, I still recommend prioritizing the robust completion of the seven foundational tasks mentioned above rather than over-relying on ZUPT.
Engineering Applications of ZUPT in Extreme Scenarios


Here are specific engineering strategies for ZUPT when dealing with extreme situations in VIO/VSLAM systems:
- Zero-Velocity Drift:
  - Scenario: In the absence of disturbances, a robot that starts rapidly from a standstill disturbs the IMU bias, which in turn corrupts the scale and leads to drift.
  - Solution: Employ Euclidean distance chain statistics as the ZUPT judgment condition. Once zero velocity is detected, quickly transition the VIO system into pure IMU Odometry (IO) mode while maintaining the continuity of the sliding window filter (SWF).
- Zero-Velocity Frontal Disturbance:
  - Scenario: The robot is at zero velocity, but there are numerous moving targets (e.g., people) in front of it.
  - Solution: Same as the first scenario, but in addition, while maintaining SWF continuity, keep updating a static background map once the condition is active.
- Elevator Scenarios (Ascending/Descending):
  - Scenario: The robot is inside an elevator that is moving up or down.
  - Solution: Use changes in the Z-axis acceleration component of the 6-DOF IMU data to trigger the ZUPT condition. While the condition is active, the system runs in IO mode; it switches back to VIO mode when the elevator stops.
- High-Speed Roll/Pitch Angular Motion:
  - Scenario: Due to the four-degree-of-freedom unobservability issue, high-speed, large-angle Roll or Pitch movements are a fatal weakness for VIO. Even if the seven foundational items above are perfectly executed, these problems are difficult to solve completely (DM-VIO achieves good coupling through powerful DSO photometric optimization and high-overhead delayed marginalization).
  - ZUPT Solution: Add separate Euclidean distance chain statistics conditions for the Y-axis and Z-axis angular velocities, so that the system can quickly enter pure IO mode while maintaining SWF continuity. This resolves the issue almost perfectly. Certain extreme cases remain hard, however, such as a high-speed head-up motion immediately followed by static observation of the sky (high-speed large Pitch).
- Visual Blind Spots (VIO/VSLAM "Blindness"):
  - Scenario: The robot enters areas with sparse visual features, such as tunnels, large white walls, or vast grasslands.
  - ZUPT Solution: These situations are almost unsolvable, but ZUPT or delayed marginalization can help the system "limp along" for a while. The method is to establish specific judgment conditions. For example, when the camera keeps facing a white wall, the feature points that survive RANSAC may exhibit a highly coherent linear distribution. When this condition is triggered, the system enters pure IO mode and maintains SWF continuity. Once the feature points begin to disperse, VIO mode is restored. Note that if the blind spot lasts too long, the system will eventually fail. As with delayed marginalization, any strategy has its limits and may ultimately lead to significant deviations.
- Large Frontal Disturbances During Motion:
  - Scenario: A vehicle is moving, but someone is maliciously obstructing or damaging its path.
  - Approach: This situation falls outside the typical application scope of ZUPT. However, one can still try to establish judgment conditions based on the temporal distribution of feature points. In such cases, the Mahalanobis distance of feature-point fluctuations across the image sub-blocks will show anomalies along the time axis. The conditions should be tuned to the vehicle's actual operating situation. Once the condition is active, the system enters pure IO mode and feeds the earliest static image captured after the condition takes effect into the SWF. Overall, this sixth scenario is extremely difficult to handle and should be avoided if possible. If the system is equipped with an NPU (Neural Processing Unit) or a depth camera (D Camera), conditions can be established through machine learning or similar methods. In practical operation, the best strategy is "first yield, second observe, third pass": stop, observe, and then proceed safely.
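The strategies above share one pattern: a judgment condition computed from sensor data, and a switch between VIO and IO modes while that condition holds. The sketch below shows one possible shape for that pattern. It is not the implementation behind any of the systems named here: the windowed Euclidean-norm test is only one plausible reading of "Euclidean distance chain statistics", and the class names, thresholds, window length, and hysteresis counts are all illustrative assumptions. Maintaining SWF continuity is not shown.

```python
from collections import deque
import numpy as np

GRAVITY = 9.81  # m/s^2, assumed local gravity magnitude

class ZuptDetector:
    """Windowed zero-velocity test on raw IMU samples (thresholds are assumptions)."""

    def __init__(self, window=50, acc_tol=0.15, gyro_tol=0.02):
        self.acc_tol = acc_tol    # m/s^2, allowed mean deviation of |acc| from gravity
        self.gyro_tol = gyro_tol  # rad/s, allowed mean |gyro|
        self.buf = deque(maxlen=window)

    def update(self, acc, gyro):
        """Push one sample; return True once a full window looks stationary."""
        acc_dev = abs(np.linalg.norm(acc) - GRAVITY)
        gyro_mag = np.linalg.norm(gyro)
        self.buf.append((acc_dev, gyro_mag))
        if len(self.buf) < self.buf.maxlen:
            return False  # not enough history yet
        mean_acc_dev, mean_gyro = np.array(list(self.buf)).mean(axis=0)
        return bool(mean_acc_dev < self.acc_tol and mean_gyro < self.gyro_tol)

class ModeSwitch:
    """Toggle between 'VIO' and 'IO' with hysteresis so a single
    noisy decision cannot flip the mode."""

    def __init__(self, enter_count=3, exit_count=10):
        self.mode = "VIO"
        self.enter_count = enter_count  # consecutive hits needed to enter IO
        self.exit_count = exit_count    # consecutive misses needed to return to VIO
        self._streak = 0

    def step(self, condition_active):
        want = "IO" if condition_active else "VIO"
        if want == self.mode:
            self._streak = 0
            return self.mode
        self._streak += 1
        need = self.enter_count if want == "IO" else self.exit_count
        if self._streak >= need:
            self.mode = want
            self._streak = 0
        return self.mode

# Usage: feed each IMU sample to the detector, then the decision to the switch.
detector = ZuptDetector(window=5)
switch = ModeSwitch(enter_count=2, exit_count=3)
for _ in range(10):
    stationary = detector.update([0.0, 0.0, 9.81], [0.0, 0.0, 0.0])
    mode = switch.step(stationary)
```

The other scenarios would plug in different conditions (Z-axis acceleration for elevators, per-axis angular rates for fast Roll/Pitch, feature-point distribution for white walls) while reusing the same switch. The asymmetric hysteresis counts reflect a design choice to enter IO mode quickly but return to VIO only after the condition has clearly cleared.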
The analysis above covers some of the most extreme situations encountered in practical VIO and VSLAM deployments. Although ZUPT performs excellently in these scenarios, we still recommend prioritizing the multi-sensor system's fundamental hardware and calibration quality, and resorting to ZUPT only when absolutely necessary.