
4.2 Experimental Evaluation of the Heuristics

To compare the qg and qc measures, a TP/FP convex hull was constructed for each of the two measures, and the procedure was repeated for stages A-C. For the qg measure, many subgroups were induced using different values of g, and those lying on the convex hull in the TP/FP space were selected; the resulting convex hulls are shown as thick lines in Figures 16-18. The thin lines show the TP/FP convex hulls obtained in the same way for subgroups induced with the qc measure, for c values between 0.1 and 50.
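The hull-selection step described above can be illustrated with a minimal sketch. It assumes each induced subgroup is summarized as an (FP, TP) pair and that only the upper boundary of the point set is of interest; the function names and the sample points are hypothetical and are not taken from the experiments.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points):
    """Return the subgroups on the upper convex hull in TP/FP space.

    `points` is an iterable of (FP, TP) pairs. For each FP value only the
    subgroup with the largest TP is kept, then a monotone-chain scan from
    left to right removes every point lying on or below the hull.
    """
    best_tp = {}
    for fp, tp in points:
        if fp not in best_tp or tp > best_tp[fp]:
            best_tp[fp] = tp
    pts = sorted(best_tp.items())  # sorted by increasing FP

    hull = []
    for p in pts:
        # Pop the last hull point while it does not make a clockwise turn.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# Hypothetical (FP, TP) pairs for subgroups induced with different parameters.
subgroups = [(0, 10), (0, 5), (2, 13), (5, 20), (4, 15), (10, 24), (8, 21)]
hull = upper_hull(subgroups)
```

For the sample points above, the subgroups (2, 13), (4, 15) and (8, 21) fall below the line segments joining their neighbours and are therefore discarded. A hull built this way for each measure can then be plotted as one curve per measure.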

Figures 16-18 show that, for stages A-C, the two curves agree over most of the TP/FP space, but that for small FP values the qg measure finds subgroups covering more positive examples. This is the result expected from the analysis in the previous section. To make the difference more visible, only the left part of the TP/FP space is shown in these figures.
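As a reminder of how the two measures trade off TP against FP, the following sketch writes out qg = TP/(FP+g) and qc = TP - c*FP and picks the best-scoring subgroup from a fixed candidate list for several parameter values. The candidate list and the helper name best_subgroup are hypothetical. The sketch only shows how the parameters shift the preferred subgroup; it does not reproduce the search procedure used in the experiments.

```python
def q_g(tp, fp, g):
    """Quality measure q_g = TP / (FP + g); g must be positive when FP = 0."""
    return tp / (fp + g)


def q_c(tp, fp, c):
    """Quality measure q_c = TP - c * FP."""
    return tp - c * fp


def best_subgroup(candidates, measure, param):
    """Return the (FP, TP) candidate with the highest quality value."""
    return max(candidates, key=lambda s: measure(s[1], s[0], param))


# Hypothetical candidates given as (FP, TP) pairs.
candidates = [(0, 10), (5, 20), (10, 24)]

# Small g and large c both favour subgroups with few false positives;
# large g and small c favour subgroups covering more positive examples.
picked_by_g = {g: best_subgroup(candidates, q_g, g) for g in (1, 10, 50)}
picked_by_c = {c: best_subgroup(candidates, q_c, c) for c in (0.1, 1, 50)}
```

With these candidates, g = 1 and c = 50 both select the subgroup with FP = 0, while g = 50 and c = 0.1 both select the subgroup with the largest TP.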

\epsfbox{pkdd2002_f1m.eps}
Figure 16: The left part of the TP/FP space presenting the TP/FP convex hulls of subgroups induced using quality measures $q_{g} = TP/(FP+g)$ (thick line) and $q_{c} = TP - c \cdot FP$ (thin line) at data stage A. Labels A1-C1 denote positions of subgroups selected by the medical expert as interesting risk group descriptions.

\epsfbox{pkdd2002_f2m.eps}
Figure 17: The left part of the TP/FP convex hulls representing subgroups induced at data stage B.

\epsfbox{pkdd2002_f3m.eps}
Figure 18: The left part of the TP/FP convex hulls representing subgroups induced at data stage C.

The differences between the TP/FP convex hulls for the qg and qc measures may seem small and insignificant, but they are not. Most interesting subgroups (a claim also supported by patterns A1-C1 selected by the domain expert) have a small false positive rate and so lie in the range in which qg works better. In addition, for subgroups with FP=0 the true positive rate in our examples was about twice as large for subgroups induced with qg as for those induced with qc. Furthermore, note that for stages A and B two of the five subgroups (A2 and C1) lie in the gap between the TP/FP convex hulls. Had the qc measure been used instead of the qg measure in the experiments in the CHD domain, at least subgroup A2 could not have been detected.

