
3.1 Data Sets

To evaluate the performance of Bagging and Boosting, we obtained a number of data sets from the University of Wisconsin Machine Learning repository as well as the UCI data set repository [Murphy & Aha, 1994]. These data sets were hand-selected such that they (a) came from real-world problems, (b) varied in characteristics, and (c) were deemed useful by previous researchers. Table 1 gives the characteristics of our data sets. They vary along several dimensions, including the type of the features in the data set (i.e., continuous, discrete, or a mix of the two), the number of output classes, and the number of examples. Table 1 also shows the architecture and training parameters used in our neural network experiments.


  
Table 1: Summary of the data sets used in this paper. Shown are the number of examples in the data set; the number of output classes; the number of continuous and discrete input features; the number of input, output, and hidden units used in the neural networks tested; and the number of epochs for which each neural network was trained.
\begin{table}
\begin{tabular*}{\textwidth}{@{\extracolsep{\fill}}\vert l\vert rc...
...ehicle & 846 & 4 & 18 & - & 18 & 4 & 10 & 40 \\ \hline
\end{tabular*}\end{table}
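
As a reading aid for the columns of Table 1, the following minimal Python sketch records the one row that survives in this copy of the table (its name is truncated in the source and appears to be the vehicle data set). The class name `DataSetSummary`, its field names, and the `weight_count` helper are hypothetical and not taken from the paper. The helper assumes a fully connected network with a single hidden layer and ignores bias weights, which this section does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSetSummary:
    """One row of Table 1 (field names are illustrative, not the paper's)."""
    name: str
    examples: int
    classes: int
    continuous_features: int
    discrete_features: int  # 0 where the table shows "-"
    input_units: int
    output_units: int
    hidden_units: int
    epochs: int

    def weight_count(self) -> int:
        # Assumption: one fully connected hidden layer, bias weights excluded.
        return (self.input_units * self.hidden_units
                + self.hidden_units * self.output_units)


# Values copied from the surviving table row; the name is inferred
# from the truncated text "...ehicle".
vehicle = DataSetSummary(
    name="vehicle",
    examples=846,
    classes=4,
    continuous_features=18,
    discrete_features=0,
    input_units=18,
    output_units=4,
    hidden_units=10,
    epochs=40,
)

if __name__ == "__main__":
    print(vehicle.name, vehicle.weight_count())
```

For this row the number of output units equals the number of classes, and the number of input units equals the number of continuous features. Whether that mapping holds for data sets with discrete features cannot be determined from the surviving row.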


David Opitz
1999-08-24