COMP24011 Introduction to Artificial Intelligence
Lab Exercise 4: Features for Estimating Autonomous Vehicle Poses
I. Preparatory Work
For this exercise, we will use Python 3. As part of your preparation for the exercise, you will
need to:
A. Install a package with Python bindings for OpenCV, a very popular computer vision
framework. You can do this by running the following line in your terminal:
pip3 install opencv-contrib-python
NOTE: Apart from the above package, the other libraries that you will need for this
exercise are already installed on the virtual machine provided to you by the
Department [1]. If you wish to use your own development environment on your own
machine, you will need to install numpy [2] yourself.
B. Download a Python-based monoVO library [3] customised for our purposes.
Decompressing the downloaded archive will create a directory named
monoVO-COMP24011. This directory contains the code described in Section III below,
and will form the basis of your deliverable, which will be submitted via GitLab (see
Section V). Hence you should copy this directory into your local repository.
NOTE: This library is also available on GitHub [4], but it is not advisable to use that
version as it does not have the added functionality specific to this exercise; it also
requires Python 2.
C. Download the KITTI data set [5] that we will be using for this exercise. Decompress it
and take note of the path (e.g., /home/csimage/Downloads/KITTI) as you will need it
later.
II. Introduction to the Problem
In this exercise, you will implement and compare three different strategies for matching visual
features in a series of images captured during the navigation of an autonomous vehicle (AV).
These features are used in estimating poses (i.e., camera trajectories) based on a visual
odometry algorithm. Below are some terms and their definitions to help clarify some concepts
in autonomous robot navigation.
Odometry is the use of sensors to estimate a robot’s change in position relative to a known
position. Visual odometry (VO) is a specific type of odometry where only cameras are used as
sensors, as opposed to using, e.g., global positioning system (GPS) sensors or light detection
and ranging (LIDAR) sensors. It is based on the analysis of a sequence of camera images.
Simultaneous localisation and mapping (SLAM) is a task whereby a robot needs to build a
map of its current environment while at the same time trying to determine its position relative
to that map.
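To make the idea of odometry concrete, the sketch below accumulates a sequence of relative motions into absolute poses, starting from a known pose. It is a simplified 2D illustration only and is not part of the provided monoVO code: the function name integrate_odometry and the (forward, sideways, turn) motion format are assumptions made for this example.

```python
import math


def integrate_odometry(start_pose, relative_motions):
    """Accumulate relative motions into a list of absolute 2D poses.

    start_pose: (x, y, heading_in_radians), the known starting pose.
    relative_motions: iterable of (forward, sideways, turn) tuples, each
        expressed in the robot's own frame at the time of the motion.
    (Hypothetical helper for illustration; not part of the lab code.)
    """
    x, y, theta = start_pose
    trajectory = [(x, y, theta)]
    for forward, sideways, turn in relative_motions:
        # Rotate the local displacement into the world frame, then add it.
        x += forward * math.cos(theta) - sideways * math.sin(theta)
        y += forward * math.sin(theta) + sideways * math.cos(theta)
        # Apply the change in heading after the displacement.
        theta += turn
        trajectory.append((x, y, theta))
    return trajectory


# Drive 1 unit forward, turn left by 90 degrees, then drive 1 unit forward again.
path = integrate_odometry((0.0, 0.0, 0.0),
                          [(1.0, 0.0, math.pi / 2), (1.0, 0.0, 0.0)])
print([(round(px, 2), round(py, 2)) for px, py, _ in path])
```

Because each pose is built on the previous one, an error in any single step carries forward into all later poses, which is why odometry estimates tend to drift over long trajectories.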
[1] https://wiki.cs.manchester.ac…
[2] https://numpy.org/install/
[3] https://online.manchester.ac….
[4] https://github.com/uoip/monoV…
[5] https://www.dropbox.com/s/5rx…
In this exercise, you will explore a monocular (single-camera) VO solution to the 2012 SLAM
Evaluation challenge [6], which made use of the KITTI data set. However, it is worth noting that
VO is limited in that it can only perform trajectory estimation after each pose, and hence
trajectory optimisation is achieved only locally. Global optimisation is achieved through loop
closure: the correction of the trajectory upon revisiting an already encountered location.
The monocular VO solution provided to you is already fully functional. It uses feature tracking
to identify feature correspondences between adjacent images. Specifically, it uses SIFT
(Scale-Invariant Feature Transform) as its feature detector and builds upon classes and
methods that are available in OpenCV.
Your task for this exercise is to implement the following feature matching strategies: (1)
distance thresholding, (2) nearest neighbour, and (3) nearest neighbour distance ratio.
III. Running the Provided Code
A. Without optional command-line arguments
$ python3 test.py -d <dataset_path>
where dataset_path is the path to the KITTI dataset you decompressed as part of the
preparatory work described in Section I above.
This will display the sequence of images/frames (all 4541 images in the KITTI data set)
together with an image where you can see the true trajectory (drawn in green) and the
estimated trajectory (drawn in red).
B. With optional command-line arguments
$ python3 test.py -d <dataset_path> -f <frame_index> -m <matching_algorithm> -t <threshold> -o <output_path>
Running the above command will display the sequence of images, but only up to the specified
frame index. When that frame index is reached, the code will: (1) compute feature matches
between the corresponding image I_k and the image preceding it, I_{k-1}; (2) generate an
output file containing the distance values for the obtained matches; and (3) display the two
images side by side. Further details are provided below.
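As a rough illustration of how the five options above fit together, here is a minimal argparse sketch. It is not the parser used in test.py: only the flag letters, the three matching-algorithm values (see Section IV) and the default output file name come from this handout, while the destination names, types and remaining defaults are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options listed above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Monocular VO lab runner (sketch)")
    parser.add_argument("-d", dest="dataset_path", required=True,
                        help="path to the decompressed KITTI data set")
    parser.add_argument("-f", dest="frame_index", type=int, default=None,
                        help="frame at which feature matching is performed")
    parser.add_argument("-m", dest="matching_algorithm", type=int,
                        choices=(1, 2, 3), default=None,
                        help="1: distance thresholding, 2: NN, 3: NN distance ratio")
    # Assumption: the threshold is parsed as a floating-point number.
    parser.add_argument("-t", dest="threshold", type=float, default=None,
                        help="threshold used by the matching algorithm, if applicable")
    parser.add_argument("-o", dest="output_path", default="feature_matches.txt",
                        help="file that will receive the feature matching output")
    return parser


# Parse an example command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["-d", "/home/csimage/Downloads/KITTI", "-f", "100", "-m", "3", "-t", "0.8"]
)
print(args)
```

Since -o is omitted in the example, output_path falls back to feature_matches.txt, matching the default described in Task 6.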
IV. Tasks
Stub code is provided in the visual_odometry.py file, where you will find a function named
featureMatching. It takes as parameters:
- image_ref (the previous image)
- image_cur (the current image)
- matching_algorithm (the user-specified matching strategy)
- threshold_value (the user-specified threshold value that will be used by
matching_algorithm if applicable)
- output_path (the user-specified path for the file that will contain feature matching
outputs; more on this below)
You will write most of your code within the featureMatching function.
You will find that the first line of code in this function is one that instantiates a SIFT feature
detector.
[6] http://www.cvlibs.net/dataset…
Task 1: Write the line(s) of code that will compute the SIFT descriptors. You can refer to the
OpenCV SIFT Tutorial [7] for information on how to do this in OpenCV. (1 mark)
Task 2: As mentioned above, one of the parameters of the function is matching_algorithm,
which specifies the matching strategy that should be run. For the tasks below, you are allowed
to build upon OpenCV’s BFMatcher class [8]. Where applicable, your solution should make use
of the user-specified threshold value.
A. Distance thresholding. If the value of the parameter is 1, matches should be selected
based on distance thresholding.
Task 3: Write the lines of code that implement feature matching based on distance
thresholding. (5 marks)
B. Nearest neighbour (NN). If the value of the parameter is 2, matches should be
selected on the basis of being the nearest neighbour(s).
Task 4: Write the lines of code that implement feature matching based on nearest
neighbours. (5 marks)
C. Nearest neighbour distance ratio. If the value of the parameter is 3, matches should be
selected based on the NN distance ratio.
Task 5: Write the lines of code that implement feature matching based on the nearest
neighbour distance ratio. (5 marks)
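The difference between the three strategies can be seen on a toy example. The sketch below works on plain Python lists rather than OpenCV objects, and it is not the expected solution: the function names are hypothetical, the descriptors are 2D points instead of real SIFT descriptors, and the way the threshold is applied in the nearest neighbour case (accept the closest candidate only if its distance is below the threshold) is one common interpretation that you should check against the lab requirements.

```python
import math


def pairwise_distances(ref_descriptors, cur_descriptors):
    """Euclidean distance from every reference descriptor to every current one."""
    return [[math.dist(r, c) for c in cur_descriptors] for r in ref_descriptors]


def match_by_threshold(dist_rows, threshold):
    """Keep every candidate whose distance is below the threshold."""
    return [[(j, d) for j, d in enumerate(row) if d < threshold]
            for row in dist_rows]


def match_by_nearest_neighbour(dist_rows, threshold):
    """Keep only the closest candidate, and only if it is below the threshold."""
    matches = []
    for row in dist_rows:
        if not row:
            matches.append([])
            continue
        j, d = min(enumerate(row), key=lambda pair: pair[1])
        matches.append([(j, d)] if d < threshold else [])
    return matches


def match_by_nn_ratio(dist_rows, ratio):
    """Keep the closest candidate only if it is clearly better than the runner-up."""
    matches = []
    for row in dist_rows:
        ranked = sorted(enumerate(row), key=lambda pair: pair[1])
        if len(ranked) < 2:
            matches.append([])  # the ratio needs two candidates
            continue
        (j1, d1), (_, d2) = ranked[0], ranked[1]
        matches.append([(j1, d1)] if d1 < ratio * d2 else [])
    return matches


# Toy 2D "descriptors": feature 0 has two similar candidates, feature 1 has one clear one.
ref = [(0.0, 0.0), (10.0, 0.0)]
cur = [(0.0, 3.0), (0.0, 3.5), (10.0, 1.0)]
dist_rows = pairwise_distances(ref, cur)

print(match_by_threshold(dist_rows, 4.0))
print(match_by_nearest_neighbour(dist_rows, 4.0))
print(match_by_nn_ratio(dist_rows, 0.8))
```

Each function returns, for every reference feature, a list of (candidate_index, distance) pairs. In this example feature 0 keeps both candidates under thresholding and one under nearest neighbour, but none under the distance ratio because its two best candidates are almost equally close. When building on BFMatcher, its match, knnMatch and radiusMatch methods are worth looking at.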
Task 6: Write code that will print the results of feature matching to a text file. Place your
solution inside the printMatchesToFile function provided as part of the stub code. Note that
the path to the output text file is based on the value of the output_path parameter specified
by the user; if none was provided by the user, the stub code will write the output to a file called
feature_matches.txt inside the working directory. (3 marks)
Each line in the generated output text file should follow the format:
<feature_index>: <match1_dist> <match2_dist> … <matchn_dist>
where each distance value is rounded off to two decimal places. Please see below for an
example:
0: 69.52 133.31 182.31
1: 54.84 166.74
2: 151.49
3: 176.01
4: 115.58
5:
6: 92.52
7: 113.57 192.01
8: 98.59 195.33
…
NOTE: There is a single whitespace after the colon, as well as between distance values. If
there are no matches for a given feature, there should still be a line corresponding to that
feature albeit without a list of distance values (see line that starts with 5: above for an
example).
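The line format described above can be sketched with ordinary string formatting. This is an illustration rather than the required printMatchesToFile implementation: the helper names and the list-of-lists input are assumptions, and so is the choice to keep the space after the colon on lines with no matches, which you should confirm using the format-checking script provided with the lab.

```python
import io


def format_match_line(feature_index, distances):
    """Return one output line: '<index>: ' followed by space-separated distances."""
    # Each distance is rounded to two decimal places.
    formatted = " ".join(f"{d:.2f}" for d in distances)
    # Assumption: the single space after the colon is kept even when
    # there are no distances to list.
    return f"{feature_index}: {formatted}"


def write_matches(stream, distances_per_feature):
    """Write one line per feature, including features with no matches."""
    for index, distances in enumerate(distances_per_feature):
        stream.write(format_match_line(index, distances) + "\n")


# Hypothetical distance values for three features; the last has no matches.
example = [[69.516, 133.312, 182.309], [54.84, 166.74], []]
buffer = io.StringIO()  # stands in for the real output file
write_matches(buffer, example)
print(buffer.getvalue(), end="")
```

Writing to an in-memory buffer keeps the sketch self-contained; in practice the same function could be given a file opened at output_path.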
[7] https://docs.opencv.org/maste…
[8] https://opencv24-python-tutor…
A script named test_output_file_format.py is provided to you; you can use this to check if
the format of the file you generated is correct. To do this, run the following:
python3 test_output_file_format.py <file_path>
where file_path specifies the path to the output file whose format you wish to check. If the
format is incorrect, the script will print the message “Please check the file format.”, in which
case you should revise your printMatchesToFile function.
Task 7: Visualise the matches obtained (in Tasks 3 to 5 above) by calling the drawMatches
function, supplying it with your own parameter values (for further guidance on this function,
see comments in the stub code provided to you). (1 mark)
V. Deliverable
Please upload your solution, i.e., your copy of the monoVO-COMP24011 folder that
contains your modifications to the stub code, via GitLab by the specified deadline (18:00
on 10th December 2021).
Specifically, commit your work to the visualSlam branch, using the 24011-lab4-S-SLAM tag.
A sample workflow for doing this is as follows:
- After cloning your GitLab repository for COMP24011, copy your own extracted
monoVO-COMP24011 directory into the cloned repository.
- Switch to the visualSlam branch by issuing the following command:
  git switch visualSlam
- Commit your new files to the local repository:
  git add .
  git commit -m "some comment"
- Apply a tag:
  git tag 24011-lab4-S-SLAM
- Push your files before the deadline:
  git push
  git push --tags
- As per the instructions at
https://wiki.cs.manchester.ac…