Footnotes
- ...moves.
- Considering global changes also motivated
redistribution of individual observations in ITERATE.
As Nevins [1995] notes in commentary on experimental
comparisons of ITERATE and COBWEB [Fisher, Xu, & Zard, 1992],
even global movement of single
observations typically did not perform as well as local movement of
sets of observations simultaneously, as implemented by COBWEB's
merging and splitting operators.
- ...repository.
- A reduced
mushroom data set was obtained by randomly selecting 1000
observations from the original data set.
- ...orderings.
- A standard deviation of 0.00 indicates that the standard deviation
was non-0, but not observable at the 2nd
decimal place after rounding.
- ...seconds.
- Routines were implemented in
SUN Common Lisp, compiled,
and run on a SUN 3/60.
- ...stabilization.
- Similar timing results occur
in other computational
contexts as well. Consider the relation between insertion sort and Shell
sort. Shell sort's final `pass' of a table is an insertion sort
that is limited to moving table elements between
consecutive table locations at a time. The large efficiency advantage
of Shell sort stems from the fact that previous passes of the table
have moved elements large distances; thus, by the final pass, the table
is nearly sorted.
- ...observations.
- Importantly, SNOB (and AUTOCLASS) assumes probabilistic assignment of observations to clusters.
- ...paper.
- ITERATE uses a measure for redistribution
[Fisher & Langley, 1990] that probably smoothes `cliffs', and it
uses an ISODATA, non-sequential version of redistribution.
- ...earlier.
- Classification
is identical to
sorting except that the observation is not
added to the clustering and statistics at each node encountered
during sorting are not permanently updated to reflect the new observation.
- ...trials.
- The `standard deviations' given in Row 3 are
actually the mean of the standard deviations over the
frontier sizes for individual variables.
- ...construction.
- For purposes of evaluating
the merits of our validation strategy in terms of error rate, we also held
out a separate test set. Having demonstrated the point, however,
we would not require that a separate test set be held out
when using resampling as a validation strategy.
- ...removed,
- The observation
is physically removed, and
variable value statistics at clusters that lie along the
path from root to the observation are decremented.
- ...nodes.
- In fact, cost is not constant across
observations, even those that are classified along exactly the same
path -- the number of variables that one need test depends
on the observation's values along previously examined variables.
- ...gain.
- Jan Hajek
independently pointed out the relationship between the CU measure
and the Gini Index, and made suggestions on when one might
select one or another of the normalizations above.
- ...prediction strategy [...].
- Importantly,
prediction with COBWEB is actually performed using a
probability maximizing strategy -- the most frequent value
of a variable at a cluster is always predicted. Fisher
[1987b] discusses the advantage of constructing
clusters with an implicit probability matching strategy, even in
cases where these clusters will be exploited with a probability
maximizing strategy.
- ...decoupled).
- The MML and Bayesian approaches
of SNOB and AUTOCLASS support probabilistic
assignment of observations to clusters, but the importance of
decoupling and cohesion remains.
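
The footnote above relating insertion sort and Shell sort can be illustrated with a minimal sketch. It is not code from the paper: the function names, the halving gap sequence, and the use of element moves as a stand-in for running time are all assumptions made for illustration. The sketch counts how many element moves plain insertion sort needs on a table, and how many the final (gap 1) pass of Shell sort needs after earlier passes have already moved elements large distances.

```python
import random


def gapped_insertion_pass(table, gap):
    """Run one insertion-sort pass over elements `gap` apart.

    Sorts each gap-separated subsequence in place and returns the
    number of element moves (shifts) performed.
    """
    moves = 0
    for i in range(gap, len(table)):
        item = table[i]
        j = i
        # Shift larger elements `gap` positions to the right.
        while j >= gap and table[j - gap] > item:
            table[j] = table[j - gap]
            j -= gap
            moves += 1
        table[j] = item
    return moves


def insertion_sort(table):
    """Plain insertion sort: a single pass with gap 1. Returns move count."""
    return gapped_insertion_pass(table, 1)


def shell_sort(table):
    """Shell sort with a halving gap sequence (an illustrative choice).

    Returns (moves in the earlier large-gap passes, moves in the final
    gap-1 pass). The final pass is an ordinary insertion sort.
    """
    early_moves = 0
    gap = len(table) // 2
    while gap > 1:
        early_moves += gapped_insertion_pass(table, gap)
        gap //= 2
    final_moves = gapped_insertion_pass(table, 1)
    return early_moves, final_moves


if __name__ == "__main__":
    random.seed(0)
    data = random.sample(range(2000), 2000)

    plain_moves = insertion_sort(list(data))
    early_moves, final_moves = shell_sort(list(data))

    print("insertion sort moves:       ", plain_moves)
    print("Shell sort, earlier passes: ", early_moves)
    print("Shell sort, final pass:     ", final_moves)
```

On a randomly ordered table, the final pass should need far fewer moves than plain insertion sort, because the table it receives is already nearly sorted. This mirrors the footnote's point about timing, with the caveat that move counts are only a rough proxy for running time.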
JAIR, 4
Douglas H. Fisher
Sat Mar 30 11:37:23 CST 1996