This appendix presents some of the underlying mathematics associated with spline representations and the snake. It is not meant to be an introduction to the subject; rather, it is added for completeness to discuss certain important aspects of the system not addressed elsewhere in this paper. Knowledge of these aspects is not necessary to understand the basic principles of the approach, but would be necessary if one wanted to duplicate the system. A more detailed explanation is given in Drummond (1999). Some papers that address these ideas in much greater detail are, for splines, Terzopoulos (1986), and for snakes, Leymarie and Levine (1993) and Cohen and Cohen (1993).
Splines are piecewise polynomials in which the degree of the polynomial determines the continuity and smoothness of the function approximation. Additional smoothing constraints can be introduced by penalty terms that reduce the size of various differentials. One way, then, to view spline fitting is as the minimization of an energy functional such as Equation 6.
Here, there is an energy associated with the goodness of fit, some
measure of how close the approximating function is to the input
function. This is typically the least squares distance between the
functions. There is an energy associated with the smoothness of the
function. Two very commonly used smoothness controls produce the
membrane and thin plate splines by restricting the first and second
differentials of the function respectively. To fit the spline to the
function, the total energy must be minimized. A necessary condition for
this is an Euler-Lagrange differential equation such as Equation
7. Here, α controls the tension in the spline (the resistance to stretching) and β the stiffness (the resistance to bending). Often the error function will be based on individual data points, and the left-hand side of Equation 7 would then include delta functions.
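As a concrete, purely illustrative example, this fit can be discretized so that the tension and stiffness penalties become first- and second-difference operators; setting the derivative of the total energy to zero then gives a linear system, the discrete counterpart of the Euler-Lagrange condition. The Python sketch below assumes one value per sample point and a dense-matrix formulation, which is not how the piecewise-polynomial splines of the system are actually represented.

```python
import numpy as np

def fit_regularized(d, alpha=0.2, beta=0.0, weights=None):
    """Minimize sum w*(f - d)^2 + alpha*sum(f')^2 + beta*sum(f'')^2,
    a discrete analogue of the membrane (alpha) and thin-plate (beta)
    penalties.  Illustrative sketch only."""
    d = np.asarray(d, dtype=float)
    n = len(d)
    W = np.diag(np.ones(n) if weights is None else np.asarray(weights, dtype=float))
    D1 = np.diff(np.eye(n), 1, axis=0)        # first-difference operator
    D2 = np.diff(np.eye(n), 2, axis=0)        # second-difference operator
    # Setting the energy's derivative to zero gives the linear system below,
    # the discrete counterpart of the Euler-Lagrange equation.
    A = W + alpha * D1.T @ D1 + beta * D2.T @ D2
    return np.linalg.solve(A, W @ d)
```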
In this work, such splines have been used for a number of purposes.
When fitting the snake, measures of the first and second differential
are needed. A two dimensional quadratic spline is fitted to the
discrete representation of the maximum Q-values. An α of 0.2 is used (β is zero) to limit overshoot (Drummond, 1996) and so prevent false edges. Values from an identical spline, except with an α of 2.0, are squared and then divided into the differential values. This normalizes the differentials so that the size of edges does not depend on where
they occur in the function. The same type of spline is used to
produce the bowls associated with the rooms as discussed in Section
3.2.1. Here, α is 1.0 and β is 0.5, giving roughly Gaussian smoothing. The values used to produce this function are weighted: values close to one are given a weight of 200, lower values a weight of 1. This prevents the sides of the bowls from
collapsing under smoothing.
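The two uses just described can be sketched in terms of the fit_regularized function above. The reduction to one dimension, and the tolerance used to decide which values count as "close to one", are assumptions made only for illustration.

```python
import numpy as np

def normalized_gradient(q, h=1.0):
    """Edge measure on the maximum Q-values (one-dimensional sketch; the
    system fits a two-dimensional quadratic spline)."""
    light = fit_regularized(q, alpha=0.2, beta=0.0)  # limits overshoot, avoiding false edges
    heavy = fit_regularized(q, alpha=2.0, beta=0.0)  # heavily smoothed copy used only to normalize
    # Dividing by the squared smoothed values makes the size of an edge
    # independent of where it occurs in the function.
    return np.gradient(light, h) / heavy ** 2

def smooth_bowl(values, atol=0.05):
    """Weighted smoothing used to produce the room 'bowls' (sketch).
    Values close to one get a weight of 200 so the sides of the bowl do
    not collapse under smoothing; atol is an assumed tolerance."""
    values = np.asarray(values, dtype=float)
    weights = np.where(np.isclose(values, 1.0, atol=atol), 200.0, 1.0)
    return fit_regularized(values, alpha=1.0, beta=0.5, weights=weights)
```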
A one dimensional cubic spline is used in locating the doorways. These
are found by steepest descent on the value of the differential along
the body of the snake. This differential contains many local minima
not associated with doorways. These arise either from the inherent
noise in the process or from errors of fit in the snake. The aim is to
remove the ones not associated with doorways by smoothing and
thresholding. This is achieved by first sampling the gradient at
points along the snake. The values are then normalized to lie between
zero and one. The spline has an α of 0.15 (β of 0.0). Here, a weighted least mean squares fit is used. The weighting
function is the inverse square of the values, preventing the spline
from being overwhelmed by large values. Starting points for steepest
descent are changes in the sign of the coefficients of the gradient of
the spline. The initial step size is set to slightly larger than a
knot spacing and then decreased over time. When a local minimum is found, it is rejected if its value exceeds a threshold (of 0.5).
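A sketch of this doorway search, again built on the fit_regularized function above, is given below. The descent schedule (number of iterations and shrink factor) and the use of the sample spacing as a stand-in for the knot spacing are assumptions, not values taken from the implementation.

```python
import numpy as np

def find_doorways(samples, knot_spacing=1.0, threshold=0.5,
                  iterations=50, shrink=0.9):
    """Locate candidate doorways from the gradient sampled along the snake
    (illustrative sketch)."""
    g = np.asarray(samples, dtype=float)
    g = (g - g.min()) / (g.max() - g.min() + 1e-12)     # normalize to [0, 1]
    w = 1.0 / np.maximum(g, 1e-3) ** 2                  # inverse-square weights
    smooth = fit_regularized(g, alpha=0.15, beta=0.0, weights=w)
    slope = np.gradient(smooth)
    idx = np.arange(len(smooth))
    # Starting points: sign changes in the gradient of the smoothed spline.
    starts = np.where(np.diff(np.sign(slope)) != 0)[0]
    doorways = []
    for s in starts:
        x, step = float(s), 1.1 * knot_spacing          # slightly larger than a knot spacing
        for _ in range(iterations):                     # simple steepest descent
            x -= step * np.sign(np.interp(x, idx, slope))
            x = np.clip(x, 0.0, len(smooth) - 1.0)
            step *= shrink                              # decrease the step over time
        if np.interp(x, idx, smooth) <= threshold:      # reject shallow minima
            doorways.append(x)
    return doorways
```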
To represent the snake, the model of the spline must be changed
somewhat. The snake itself is a one dimensional cubic spline. But the
energy minimum that is being sought is in the differential of the
function, subject to other constraints. The dynamics of the
snake are defined by the Euler-Lagrange equation shown in Equation
8.
A coefficient of 512 minimizes changes to the snake's shape as it grows, by penalizing the difference between its second differential and that of the previous time step, scaled by the ratio of their lengths. The initial stiffness β of the snake is 8.0. This is reduced in proportion to the snake's length to give the spline more degrees of freedom. Coefficients of 96 and 96 control the momentum
and the drag on the snake respectively. As in Cohen and Cohen
(1993), a factor is added to the energy associated with the
differential that is in the direction normal to the body of the snake,
as shown in Equation 9. But instead of being a constant, this factor is made variable to produce the mercury model discussed in Section 3.2.1.
The energy minimization process is carried out iteratively, interleaving steps for the x and y directions. The differential of the energy for the x direction is given by Equation 10; a similar equation is used for the y direction.
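A minimal sketch of one such step for the x direction is shown below, assuming the force (the negative of the differential in Equation 10) has already been evaluated at the control points. The explicit integration scheme, the time step, and the function name are illustrative assumptions rather than the actual implementation.

```python
import numpy as np

def snake_step_x(x, vel_x, force_x, dt=1.0, momentum=96.0, drag=96.0):
    """One explicit update of the x coordinates of the snake's control
    points, treating the dynamics as momentum*acceleration + drag*velocity
    = force (a simplified sketch of Equation 8)."""
    accel = (np.asarray(force_x, dtype=float) - drag * vel_x) / momentum
    vel_x = vel_x + dt * accel
    return x + dt * vel_x, vel_x
```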
The snake grows under the forces of the mercury model until it reaches
an approximately stable position, subject only to small oscillations.
It is then converted into a polygon by finding the corners (identified using the normal to the body of the snake). The tension coefficient α is set to zero everywhere. The stiffness coefficient β is set to zero at the corners and to 15 between them. This produces a polygon which is flexible at its vertices.
To detect the features as early as possible in the learning process, as discussed in Section 2.4, the height of the gradient is scaled according to the signal-to-noise ratio. The noise arises from variations in the low-level learning process and the stochastic nature of the task. Both the size of the features and the noise grow with time, and both are somewhat normalized by this scaling process. The idea is to collect uniformly sampled values of the function shown in Equation 10, for both the x and y directions, and find the median of their absolute values. The median is not strongly affected by extreme values and thus largely ignores the size of the features, measuring only the noise of the regions in between.
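A sketch of this noise estimate is shown below; the function name and the division used to apply the scaling are assumptions made for illustration.

```python
import numpy as np

def scale_by_noise(grad_x, grad_y):
    """Scale the gradient height by an estimate of the noise (sketch).
    The median of the absolute sampled values is insensitive to the few
    large feature edges, so it measures the noise of the regions between
    them."""
    samples = np.abs(np.concatenate([np.ravel(grad_x), np.ravel(grad_y)]))
    noise = np.median(samples)
    return grad_x / noise, grad_y / noise
```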