Domains

We evaluated the algorithms on a variety of reasonably large, noise-free training sets from the UCI collection of Machine Learning databases. Because our implementation can only handle two-class problems, we constructed a binary version of the multi-class Shuttle domain by discriminating examples of the majority class from examples of all other classes. In the KRK illegality domain we used a propositional version of the original relational learning problem [46], in which each position is encoded with features that correspond to the truth values of the 18 different meaningful instantiations of the adjacent, equal, and less_than relations in the background knowledge.
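The majority-versus-rest construction described for the Shuttle domain can be sketched as follows. This is only an illustration under stated assumptions: the function name `binarize_by_majority` and the toy label list are hypothetical and are not taken from the original implementation or from the actual Shuttle data.

```python
from collections import Counter


def binarize_by_majority(labels):
    """Relabel a multi-class problem as majority class vs. all other classes.

    Returns the majority label and a list of booleans that is True exactly
    where the original label equals the majority label.
    """
    # most_common(1) yields a single (label, count) pair for the largest class.
    majority, _count = Counter(labels).most_common(1)[0]
    return majority, [label == majority for label in labels]


# Toy class labels (hypothetical, NOT the real Shuttle data).
labels = [1, 1, 4, 1, 5, 1, 4]
majority, binary = binarize_by_majority(labels)
print(majority, binary)
```

Any two-class learner can then be trained on the boolean labels in place of the original class values.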

Domain	Size	`c4.5 -t1` vs. `c4.5`	Redundancy
Mushroom	8,124	98.8 %	46.61 %
KRKN	10,000	91.2 %	46.05 %
KRKP	3,196	112.8 %	43.81 %
KRK (prop.)	10,000	113.8 %	21.88 %
Tic-Tac-Toe	958	258.0 %	4.15 %
Binary Shuttle	43,500	55.3 %	--

Table 2 shows the total number of examples available for each domain and the ratio of the average run-time of C4.5 with windowing (invoked using the parameter setting -t 1) to that of C4.5 without windowing. The last column shows the redundancy of the domain, estimated with Møller's conditional population entropy heuristic (2). Interestingly, there seems to be a negative correlation between the run-time ratio of C4.5's windowing algorithm and this redundancy measure: the lower the estimated redundancy, the less windowing pays off.¹⁰ In general, the results with C4.5 confirm the finding of [59] that little can be gained from windowing for ID3-like learners. The only clear exception is the Shuttle domain, where windowing saves almost half of C4.5's run-time.
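The apparent correlation can be checked directly against the figures in Table 2. The sketch below computes a Pearson correlation coefficient over the five domains for which a redundancy estimate is given; Binary Shuttle is left out because its redundancy entry is missing. The choice of Pearson's coefficient is an assumption made for illustration; the text does not say which notion of correlation is meant, and five data points are far too few for a statistical claim.

```python
import math

# (domain, run-time ratio in %, estimated redundancy in %), copied from Table 2.
# Binary Shuttle is omitted because the table gives no redundancy value for it.
rows = [
    ("Mushroom", 98.8, 46.61),
    ("KRKN", 91.2, 46.05),
    ("KRKP", 112.8, 43.81),
    ("KRK (prop.)", 113.8, 21.88),
    ("Tic-Tac-Toe", 258.0, 4.15),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


ratios = [ratio for _, ratio, _ in rows]
redundancies = [red for _, _, red in rows]
r = pearson(ratios, redundancies)
print(f"Pearson r between run-time ratio and redundancy: {r:.2f}")
```

A negative value of `r` means that domains with lower estimated redundancy tend to have a higher run-time ratio, that is, windowing costs more there.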
