Machine Vision Practical Three - Invariants

5%, to be set ~22/11/04 and completed by 6/12/2004.

1. Objective

In the lectures, you have heard about principal axis decomposition, least squares fitting procedures, and geometric invariants. The aim of this practical is to illustrate some of these concepts and to exercise some of the techniques you used in the first practical assignment.

2. Procedure

Carry out each of the steps listed below and, as in the first two practicals, write a brief report containing:

Include any code you write yourself (suitably commented) in an appendix.

3. Exercises

A simple object in the form of an asymmetric, cardboard "flag" is supplied. Use it for the exercises below.

3.1 Image Acquisition

Acquire and store suitable grey-level images of the "flag" supplied. Your images should be taken under appropriate lighting and viewing conditions so that the flag, placed against a bland background, fills a reasonably large proportion of the field of view. You should aim eventually to use three images from a variety of viewpoints, in one of which the camera axis should be approximately normal to the plane of the flag. In one of the others the camera axis should be tilted relative to the plane of the object to produce some distortion of the shape of the image, whilst the third should be taken from a much more oblique viewpoint so as to produce strong perspective foreshortening. Take care to ensure however that you have still formed a well defined image.

Use these images in each of 3.2 -3.5 below. It is probably best to perform all of the operations indicated on one of the images (which may be a test image) first in order to be sure that your implementation works and to minimise the number of images and intermediate results you have to store.

3.2 Edge Detection

Use the Canny edge detector supplied in order to find the edges of an image of the flag. Print the resulting edge image, both the strength and orientation, and save it for the next step, 3.3. Only use your own edge detector if you are confident of its performance.

Comment on the quality of the edge maps obtained and on the ease with which you could choose suitable parameters in the Canny edge detector (scale, ie width of the Gaussian filter, thresholds etc).

3.3 Edge Histogramming

The flag is defined by five line segments. In order to process the images further, you need to find which edge pixels (edgels) belong to which line. There are a number of ways you could do this, including the use of a Hough transform or by edgel linking followed by segmentation and a least squares fitting procedure. In this section and the next, a simple technique based on histogramming the edgel orientations will be developed.

Write the code to construct a histogram of the orientations of the edgels in your image. Orientation should be defined as the angle between the normal to the edgel and the x axis (say). Take care to choose the sense of your normal consistently (eg from bright to dark throughout the image) and to calculate the angle correctly over the whole range of 360 degrees. Choose a suitable resolution for the histogram so that you should be able to distinguish edgels belonging to each of the five line segments defining the object.

Comment on the number of peaks in your histogram, the resolution used, on the ease with which you can in fact distinguish to which line segment edgels belong, and on the presence (or absence) of noise and outliers. Note that if you do not define the sense of the edgel normals consistently, you are likely to find that you cannot distinguish some of the line segments in some of the images. Comment on why this is likely to be so.

Note that you may select the edgels belonging to a particular peak of the histogram by a manual method such as, for example, by looking at a plot of the histogram and choosing appropriate values of the angle either side of the peak. If time allows, though it is not required for credit for this assignment, it is instructive to consider, and perhaps also to implement, how the edgels belonging to each peak might be selected automatically: (a) when it is known there are (or should be) five relevant peaks, and (b) when it is not known how many relevant peaks there are (ie the shape of the object is not known).

3.4 Line Finding

Modify the program you wrote in section 3.3 so that you can record which edgels belong to which peaks in the histogram. Select each of the peaks in turn and fit a straight line to the positions of edgels contained in it. Since errors in the x and y locations of an edgel are similar, use a homogeneous least squares fitting procedure (ie PCA) in which you fit a line of the form

a x + b y + c = 0

and find the direction of the line from a principal axis decomposition of the distribution of edgel locations.

Check that the results obtained are reasonable: (i) by comparison with the peak positions in your histogram and (ii) by comparison with the image data. In both cases be as quantitative as you can in your comparison and comment on the accuracy of the lines fitted. Comment on whether your results are affected by outliers.

3.5 Finding the Vertices

Write a program to calculate the vertices of your image of the flag by intersecting the lines obtained in 3.4. Note that since the equation describes a line of infinite length, you can generate many intersections. Consider whether you can determine which intersections correspond to the vertices of the flag: (i) from the peaks in the histogram used to define the lines, (ii) by a neighbourhood test on the points used to define each line, and implement one of these methods. Comment on the generality of these approaches. If time is short, select the appropriate lines to intersect by hand.

Again, check that the results obtained are reasonable by comparison with your image and comment on the accuracy of the vertices obtained. Note if any of them are particularly inaccurate or uncertain and, if possible, explain why this is so.

3.6 Calculating Invariants

According to the theory presented in the lectures, four points may be used to define two functionally independent invariants of a planar object under affine projection, whilst five points yields two independent invariants under perspective projection.

Under affine projection the invariants may be defined as:

I1 = |m(123)| / |m(134)|
I2 = |m(124)| / |m(234)|

where m(ijk)=(x(i),x(j),x(k)) is a matrix constructed from the homogeneous co-ordinates of image points x(i) = (x(i), y(i), 1)T and | m(ijk) | is the determinant of the matrix m(ijk). Under perspective projection, cross ratios of similar determinants are required:

I1 = (|m(431)| |m(521)|) / (|m(421)| |m(531)|)
I2 = (|m(421)| |m(532)|) / (|m(432)| |m(521)|)

Calculate these invariants using the vertices of your flag images as computed in 3.5 above and by direct measurement from the flag using a ruler. The vertices of the flag were designed to have co-ordinates: (0,2), (22,0), (14,8), (24,18) and (0,16) so you should also calculate the nominal value of the invariants. Comment on the accuracy of your results and on their constancy (or otherwise).

If time allows, consider whether there are any other invariants you could calculate in each case (i) from the lines themselves, (ii) from line intersections other than those corresponding to the vertices. Comment on whether you think any of these in either (i) or (ii) above would provide any new information independent of the invariants calculated from the vertices of the flag.

4. Reporting

As usual, beware of labouring long and hard to produce a pretty report. There is a rapidly diminishing rate of return for such efforts. Evidence that the exercise has been completed and content of the report is much more important than presentation.


Bernard Buxton,
November 2004.