Visual Odometry (VO) is a technique that uses cameras together with computer vision algorithms to estimate the motion trajectory and displacement of moving platforms such as robots, vehicles, or drones. This article covers the origins of visual odometry, its core principles, a comparison with traditional odometry, and its applications in modern technology.

1、What is Odometry? Origins and Principles of Traditional Odometry

The term “odometry” originally referred to mechanical odometers, devices used to measure the distance traveled by vehicles like cars and bicycles. Traditional odometers work as follows:
- Mechanism: They calculate travel distance from the number of wheel rotations and the wheel circumference. For example, a car’s odometer counts wheel rotations via a gear system to estimate total mileage.
- Characteristics:
  - Advantages: High short-term accuracy, and no dependence on lighting or the visual appearance of the scene.
  - Disadvantages: Prone to wheel slip, mechanical wear, or sensor drift, which can lead to accumulated errors over long-term use.
- Applications: Widely used for recording vehicle mileage or assisting navigation systems.
Traditional odometry relies on physical sensors (e.g., wheel encoders, accelerometers, or gyroscopes) to estimate displacement by directly measuring physical quantities like rotation speed or acceleration.
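The wheel-based calculation described above can be sketched in a few lines. This is a minimal illustration that assumes an ideal wheel with no slip; the function name `wheel_distance`, the encoder resolution, and the wheel diameter are all hypothetical values chosen for the example.

```python
import math

def wheel_distance(ticks: int, ticks_per_rev: int, wheel_diameter_m: float) -> float:
    """Distance travelled by one wheel, assuming it never slips."""
    circumference = math.pi * wheel_diameter_m   # distance covered per revolution
    revolutions = ticks / ticks_per_rev          # encoder ticks -> revolutions
    return revolutions * circumference

# Hypothetical encoder: 1024 ticks per revolution on a 0.65 m wheel.
distance = wheel_distance(ticks=5120, ticks_per_rev=1024, wheel_diameter_m=0.65)
print(f"{distance:.3f} m")
```

Any error in the assumed wheel diameter, or any slip between wheel and ground, scales directly into the reported distance, which is one source of the accumulated error mentioned above.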

2、Definition and Core Technologies of Visual Odometry

Visual Odometry uses cameras to capture environmental images and employs image processing and computer vision techniques (e.g., feature matching, optical flow, or deep learning) to estimate an object’s motion trajectory. Compared to traditional odometry, visual odometry does not require physical contact and offers greater adaptability. Its core technologies include:
- Motion Estimation: Analyzing successive image frames (e.g., changes in feature point positions) to calculate the translation and rotation of the camera or mobile platform. For instance, feature detectors and descriptors such as ORB or SIFT can locate the same scene points in consecutive images; the change in their image positions is then used to infer camera motion.
- Path Reconstruction: Accumulating relative motion between adjacent frames to reconstruct the complete motion trajectory of the camera or platform, similar to the “travel path” recorded by traditional odometers.
- Scene Structure Recovery: Using techniques like triangulation, visual odometry can estimate the 3D spatial structure of objects in the environment, improving localization accuracy.
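To make the motion estimation step concrete, here is a deliberately simplified sketch. It assumes the matched feature points already lie in a common 2D metric plane (for example, a downward-facing camera over a flat floor with known scale), so the motion between two frames reduces to a 2D rotation plus translation that can be solved in closed form by least squares. Real visual odometry pipelines usually work with epipolar geometry or 3D-2D correspondences instead; the function name `estimate_rigid_2d` and the synthetic points are illustrative only.

```python
import math

def estimate_rigid_2d(prev_pts, curr_pts):
    """Least-squares rotation angle and translation that map prev_pts onto curr_pts."""
    n = len(prev_pts)
    # Centroids of both point sets.
    cpx = sum(p[0] for p in prev_pts) / n
    cpy = sum(p[1] for p in prev_pts) / n
    cqx = sum(q[0] for q in curr_pts) / n
    cqy = sum(q[1] for q in curr_pts) / n
    # Accumulate dot and cross products of the centred point pairs.
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(prev_pts, curr_pts):
        ax, ay = px - cpx, py - cpy
        bx, by = qx - cqx, qy - cqy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated prev centroid onto the curr centroid.
    tx = cqx - (c * cpx - s * cpy)
    ty = cqy - (s * cpx + c * cpy)
    return theta, (tx, ty)

# Synthetic matches: rotate by 10 degrees and shift by (0.5, -0.2).
true_theta = math.radians(10.0)
true_t = (0.5, -0.2)
prev_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
curr_pts = [
    (math.cos(true_theta) * x - math.sin(true_theta) * y + true_t[0],
     math.sin(true_theta) * x + math.cos(true_theta) * y + true_t[1])
    for x, y in prev_pts
]
theta, (tx, ty) = estimate_rigid_2d(prev_pts, curr_pts)
print(math.degrees(theta), tx, ty)
```

The result describes how the points appear to move between frames; the camera's own motion is the inverse of that transform. With noisy or mismatched points, a robust wrapper such as RANSAC would normally be placed around an estimator like this.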
Visual Odometry Workflow:
- Image Acquisition: The camera continuously captures environmental images, forming an image sequence.
- Feature Extraction: Key points (e.g., corners) or optical flow features are identified in the images.
- Motion Calculation: Algorithms estimate relative motion (translation and rotation) between frames.
- Trajectory Optimization: Motion data is accumulated and the overall path estimation is optimized to reduce errors.
Visual odometry behaves as if it had only “short-term memory”: it processes information from adjacent frames and does not rely on earlier historical data. This keeps it computationally efficient, but small errors in each frame-to-frame estimate accumulate and degrade long-term accuracy.
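The path reconstruction step, and the way errors accumulate in it, can be illustrated with a small planar example. The sketch below chains relative motions `(dx, dy, dtheta)` into a global 2D pose and compares a perfect estimator with a hypothetical one that has a heading bias of 0.1 degree per frame. The helper names `compose` and `integrate` and all numbers are assumptions made for illustration.

```python
import math

def compose(pose, delta):
    """Apply a relative motion (dx, dy, dtheta), expressed in the current
    body frame, to a global pose (x, y, theta)."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    x += math.cos(theta) * dx - math.sin(theta) * dy
    y += math.sin(theta) * dx + math.cos(theta) * dy
    return (x, y, theta + dtheta)

def integrate(deltas, start=(0.0, 0.0, 0.0)):
    """Chain frame-to-frame motions into a full trajectory."""
    trajectory = [start]
    for delta in deltas:
        trajectory.append(compose(trajectory[-1], delta))
    return trajectory

# Ground truth: 100 steps of 0.1 m straight ahead.
true_deltas = [(0.1, 0.0, 0.0)] * 100
# Hypothetical estimator with a small heading bias of 0.1 degree per frame.
biased_deltas = [(0.1, 0.0, math.radians(0.1))] * 100

true_end = integrate(true_deltas)[-1]
biased_end = integrate(biased_deltas)[-1]
drift = math.hypot(biased_end[0] - true_end[0], biased_end[1] - true_end[1])
print(f"end-point drift after 10 m: {drift:.2f} m")
```

Each individual step is almost correct, yet the end point is off by a large fraction of a metre, because every later step is applied on top of the heading error inherited from the earlier ones.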

3、Visual Odometry Interpretation in “14 Lectures on Visual SLAM: From Theory to Practice”

In “14 Lectures on Visual SLAM: From Theory to Practice,” the author describes Visual Odometry (VO) as follows:
“For now, it is enough to know that VO can estimate camera motion through images between adjacent frames and restore the spatial structure of the scene. It is called ‘odometry’ because, like an actual odometer, it only calculates motion between adjacent moments and has no relation to information from further in the past. In this respect, VO is like a species with only short-term memory.”
Analysis of this Passage:
- Adjacent Frame Motion Estimation: Visual odometry calculates relative motion between successive image frames without relying on older frame data.
- Scene Structure Recovery: By analyzing feature point changes from different viewpoints, triangulation is used to estimate the 3D spatial structure, similar to how humans perceive depth through binocular disparity.
- “Odometry” Analogy: Visual odometry is similar to traditional odometry in that it focuses only on the current and previous moment’s motion, leading to simpler calculations but potential error accumulation.
- Short-Term Memory Characteristic: It processes only adjacent frame information, akin to “short-term memory.” This is computationally efficient, but because long-term historical data is ignored, earlier errors are never corrected and carry forward into every later pose.
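The binocular-disparity analogy has a simple closed form in the special case of a rectified stereo pair: depth equals focal length times baseline divided by disparity. The sketch below shows that relationship only; the function name and the camera parameters (700 px focal length, 0.12 m baseline) are hypothetical. A single moving camera triangulates in a similar way between two viewpoints, but without extra information it recovers structure only up to an unknown scale.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point seen by a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: 700 px focal length, 0.12 m baseline.
depths = {d: depth_from_disparity(700.0, 0.12, d) for d in (84.0, 42.0, 8.4)}
for disparity, depth in depths.items():
    print(f"disparity {disparity:5.1f} px -> depth {depth:.2f} m")
```

Depth is inversely proportional to disparity, so distant points produce very small disparities, and a sub-pixel matching error then translates into a large depth error.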
Pros and Cons Analysis:
- Advantages: Computationally simple, real-time capable, suitable for resource-constrained devices.
- Disadvantages: Accumulated errors lead to trajectory drift, which is why VO is usually combined with the global optimization and loop closure of a full SLAM (Simultaneous Localization and Mapping) system.

4、Comparison of Visual Odometry and Traditional Odometry

Here’s a detailed comparison between visual odometry and traditional odometry:

| Feature | Traditional Odometry | Visual Odometry |
| --- | --- | --- |
| Sensor Type | Wheel encoders, accelerometers, gyroscopes | Cameras |
| Measurement Method | Direct measurement of physical quantities (e.g., rotational speed, acceleration) | Image feature matching, optical flow, deep learning |
| Advantages | High short-term accuracy, unaffected by lighting or scene appearance | No physical contact required, low cost, adaptable to complex environments |
| Disadvantages | Susceptible to wheel slip, mechanical wear, sensor drift | Affected by lighting, sparse texture, potential for accumulated errors |
| Typical Applications | Car odometers, inertial navigation | Robot navigation, autonomous driving, drone positioning |

5、Practical Applications of Visual Odometry

Visual odometry demonstrates strong application value in several fields, including:
- Robot Navigation: Helps robots navigate autonomously in unknown environments, estimating position and trajectory in real-time, widely used in service and industrial robots.
- Autonomous Driving: Assists vehicle localization and path planning, enhancing driving safety when combined with high-precision maps.
- Drone Positioning: Achieves precise positioning and obstacle avoidance in indoor environments or where GPS signals are weak.
- Augmented Reality (AR): Provides real-time spatial localization for AR devices, enhancing the integration of virtual and real experiences.

6、How to Optimize Visual Odometry Performance

To improve the accuracy and robustness of visual odometry, the following measures can be taken:
- Algorithm Optimization: Use robust feature extraction algorithms (e.g., ORB features, as used in the ORB-SLAM system) or deep learning models (e.g., CNN-based feature detectors and matchers) to improve matching accuracy.
- Multi-Sensor Fusion: Combine with IMU (Inertial Measurement Unit) or GPS to compensate for limitations in visual data.
- Environmental Adaptability: Employ robust optical flow methods or multi-view geometry for environments with varying lighting or sparse textures.
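As a rough illustration of the multi-sensor fusion idea, the sketch below combines two independent scalar estimates of the same quantity by weighting each with the inverse of its variance. This is the static, one-dimensional core of the measurement update used in Kalman-style filters, not a complete visual-inertial or GPS fusion pipeline. The function name `fuse` and the numbers (a VO position with variance 0.04 and a GPS position with variance 0.36) are assumptions made for the example.

```python
def fuse(estimate_a: float, var_a: float, estimate_b: float, var_b: float):
    """Inverse-variance weighted average of two independent scalar estimates."""
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    fused = (w_a * estimate_a + w_b * estimate_b) / (w_a + w_b)
    fused_var = 1.0 / (w_a + w_b)   # never larger than either input variance
    return fused, fused_var

# Hypothetical values: VO reports 10.4 m (variance 0.04), GPS reports 10.0 m (variance 0.36).
position, variance = fuse(10.4, 0.04, 10.0, 0.36)
print(f"fused position {position:.2f} m, variance {variance:.3f}")
```

The fused estimate stays close to the more certain source while still being pulled toward the other. Over a long run the VO variance would grow with drift, so a non-drifting source such as GPS gradually gains weight.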
In summary, visual odometry is a motion trajectory estimation technique based on visual information, achieving functions similar to traditional odometry through cameras and computer vision algorithms. It has wide-ranging applications in fields such as robot navigation, autonomous driving, and drone positioning. Although its “short-term memory” characteristic makes it efficient, error accumulation is a challenge that needs to be addressed.
