As suggested above, it appears that the performance of many of the ensemble methods is highly correlated with that of the others. To help identify these consistencies, Table 3 presents the correlation coefficients of the performance of all seven ensemble methods. For each data set, performance is measured as the ensemble error rate divided by the single-classifier error rate. Thus a high correlation (i.e., one near 1.0) suggests that two methods are consistent in the domains in which they have the greatest impact on test-set error reduction.
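As a sketch of how such a table could be computed, the snippet below derives the per-data-set performance ratios and a Pearson correlation between two methods. The error rates shown are illustrative placeholders, not the paper's actual results.

```python
# Sketch of the Table 3 computation: per-data-set error ratios
# (ensemble error / single-classifier error) for two methods,
# then the Pearson correlation between those ratio vectors.
# All numbers below are hypothetical placeholders.

def error_ratio(ensemble_err, single_err):
    """Performance measure: ensemble error rate / single-classifier error rate."""
    return ensemble_err / single_err

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical error rates, one entry per data set.
single   = [0.20, 0.15, 0.30, 0.10, 0.25]  # single classifier
bagging  = [0.16, 0.13, 0.24, 0.09, 0.20]  # Bagging ensemble
boosting = [0.14, 0.12, 0.27, 0.08, 0.22]  # Boosting ensemble

ratios_bag   = [error_ratio(e, s) for e, s in zip(bagging, single)]
ratios_boost = [error_ratio(e, s) for e, s in zip(boosting, single)]

r = pearson(ratios_bag, ratios_boost)  # one entry of a Table-3-style matrix
```

Repeating this for every pair of the seven methods fills in the full correlation matrix.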
Table 3 provides numerous interesting insights. The first is that the neural-network ensemble methods are strongly correlated with one another, as are the decision-tree ensemble methods; however, there is less correlation between any neural-network ensemble method and any decision-tree ensemble method. Not surprisingly, Ada-boosting and Arcing are strongly correlated, even across different component learning algorithms. This suggests that Boosting's effectiveness depends more on the data set than on whether the component learning algorithm is a neural network or a decision tree. Bagging, on the other hand, is not correlated across component learning algorithms. These results are consistent with our later claim that while Boosting is a powerful ensemble method, it is more susceptible to a noisy data set than Bagging.