The evaluation of the various strategies discussed in this paper reflects two paradigms for validating clusterings. Internal validation is concerned with the merits of the control strategy that searches the space of clusterings, that is, the extent to which the search strategy uncovers clusterings of high quality as measured by the objective function. Internal validation was the focus of Section 3.4. External validation is concerned with determining the utility of a discovered clustering relative to some performance task. We have noted that several authors point to minimization of error rate in pattern completion as a generic performance task that motivates their choice of objective function. External validation was the focus of Section 4.2.
This section explores validation issues more closely. It identifies both error rate and simplicity (or `cost') as necessary external criteria for discriminating clustering utility, suggests a number of alternative objective functions that might be usefully compared using these criteria, and speculates that these external criteria, taken collectively, are reasonable ones for data analysts to use when judging the utility of clusterings.