Table 12: The resources consumed while learning in various domains. Each number represents the mean over 100 learning sessions.

| Domain       | Operator applications (mean) | Operator applications (std.) | CPU seconds (mean) | CPU seconds (std.) | Problems (mean) | Problems (std.) |
|--------------|------------------------------|------------------------------|--------------------|--------------------|-----------------|-----------------|
| 24-puzzle    | 859,497                      | 186,823                      | 4842.0             | 2270.0             | 62.8            | 7.2             |
| 10-cannibals | 216,572                      | 88,467                       | 14.2               | 6.0                | 63.9            | 12.0            |
| 10-stones    | 144,303                      | 1,653                        | 8.4                | 1.9                | 52.0            | 0.3             |
| 5-Hanoi      | 377,671                      | 80,968                       | 35.2               | 3.5                | 76.2            | 9.5             |
| Grid         | 1,956,972                    | 1,191,383                    | 58.8               | 35.0               | 185.0           | 60.2            |

Table 13: Statistics of the macro sets generated in various domains. Each number represents the mean over 100 sets.

| Domain       | Total number (mean) | Total number (std.) | Mean length (mean) | Mean length (std.) | Max length (mean) | Max length (std.) |
|--------------|---------------------|---------------------|--------------------|--------------------|-------------------|-------------------|
| 24-puzzle    | 15.32               | 0.79                | 8.68               | 0.24               | 18.00             | 0.0               |
| 10-cannibals | 2.16                | 0.39                | 2.93               | 0.15               | 3.98              | 0.2               |
| 10-stones    | 1.20                | 0.40                | 2.00               | 0.00               | 2.00              | 0.0               |
| 5-Hanoi      | 11.47               | 0.71                | 7.24               | 0.25               | 16.00             | 0.0               |
| Grid         | 17.50               | 3.14                | 8.62               | 0.82               | 16.36             | 1.5               |

Table 14: The performance of MICRO-HILLARY (in operator applications) before and after learning in various domains.

| Domain       | Before learning (mean) | Before learning (std.) | After learning (mean) | After learning (std.) |
|--------------|------------------------|------------------------|-----------------------|-----------------------|
| 24-puzzle    | 711,545                | 134,807.0              | 1540.0                | 57.5                  |
| 10-cannibals | 57                     | 40.5                   | 31.6                  | 0.4                   |
| 10-stones    | 206                    | 84.8                   | 126.6                 | 8.1                   |
| 5-Hanoi      | 10,993                 | 14,623.3               | 156.0                 | 9.4                   |
| Grid         | 1,529                  | 1,313                  | 369.0                 | 35.0                  |

We have applied MICRO-HILLARY to the other domains specified in Section 4.1.
Tables 12, 13 and 14 show the mean results for 100 learning sessions.
MICRO-HILLARY reached quiescence in all the domains. The 10-stones
and 10-cannibals domains are very simple: one or two macros were
sufficient to reach quiescence (Table 13 reports means of 1.20 and 2.16
macros, respectively). Note that we used the same
quiescence parameter, 50 problems, for all the domains.
After solving each problem,
MICRO-HILLARY increases by 100 the length of the random sequence
used for generating a training problem.
Therefore, MICRO-HILLARY spends about 125,000 operator applications
over the 50 quiescence problems just to make sure that there is
nothing new to learn.
In the simple domains, this accounts for most of the resources
used by MICRO-HILLARY.
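
The 125,000 figure can be reproduced with a short calculation. The sketch below is a reconstruction, not the authors' accounting: it assumes that the cost of each quiescence problem is proportional to the length of its random sequence, and that the first sequence has length 0 or 100 (the text states neither). The names `quiescence_cost`, `lower`, `upper` and `midpoint` are introduced here for illustration only.

```python
# Assumption: cost of a quiescence problem ~ length of its random sequence.
QUIESCENCE_PROBLEMS = 50   # quiescence parameter stated in the text
LENGTH_INCREMENT = 100     # sequence growth after each solved problem


def quiescence_cost(first_length: int) -> int:
    """Sum of the random-sequence lengths over the 50 quiescence problems."""
    return sum(first_length + LENGTH_INCREMENT * k
               for k in range(QUIESCENCE_PROBLEMS))


# Two plausible starting points (assumed, not given in the text).
lower = quiescence_cost(0)                   # lengths 0, 100, ..., 4900
upper = quiescence_cost(LENGTH_INCREMENT)    # lengths 100, 200, ..., 5000
midpoint = (lower + upper) // 2

# Share of the total learning cost (operator applications, Table 12).
TOTAL_OPERATOR_APPLICATIONS = {"10-stones": 144_303, "10-cannibals": 216_572}
share = {domain: midpoint / total
         for domain, total in TOTAL_OPERATOR_APPLICATIONS.items()}

print(lower, upper, midpoint)
print(share)
```

Under these assumptions the two starting points bracket the quoted value, and the quiescence check alone covers more than half of the mean learning cost in both simple domains.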
It is interesting to look at the macros learned in the grid domain.
Most of the macros are built from S, W and N moves,
where S stands for south, W for west, N for north,
and S and N are equal in number. Such macros are used to make
detours around walls that block the search.
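
Why equal numbers of S and N moves produce a detour can be seen from the net displacement of such a macro. The following is a minimal sketch under stated assumptions: the coordinate convention (x grows eastward, y grows northward), the helper names, and the example macro `SSWNN` are all hypothetical and are not taken from the learned macro sets.

```python
from typing import Tuple

# Assumed unit displacement of each grid move (x east, y north).
MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}


def net_displacement(macro: str) -> Tuple[int, int]:
    """Total (dx, dy) obtained by applying every move in the macro."""
    dx = sum(MOVES[move][0] for move in macro)
    dy = sum(MOVES[move][1] for move in macro)
    return dx, dy


def is_balanced(macro: str) -> bool:
    """True when the macro has as many S moves as N moves."""
    return macro.count("S") == macro.count("N")


# Illustrative macro only: step south, pass the wall westward, step back north.
example = "SSWNN"
print(is_balanced(example), net_displacement(example))
```

Because the S and N moves cancel, the macro ends in the row where it started and the only net progress is westward, which is what a detour around a blocking wall requires.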
MICRO-HILLARY improved problem-solving performance
in every domain (Table 14). The most notable improvement is
in the 24-puzzle domain, where the mean number of operator applications
drops from 711,545 before learning to 1,540 after learning,
a factor of about 462.
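
The improvement factor for each domain can be derived from the means in Table 14. The sketch below simply divides the mean before learning by the mean after learning; note that this is a ratio of means, which is an assumption about how the 24-puzzle factor was computed, and it is not the same as a mean of per-problem ratios. The names `TABLE_14_MEANS` and `speedup` are introduced here for illustration.

```python
# Mean operator applications (before learning, after learning), from Table 14.
TABLE_14_MEANS = {
    "24-puzzle": (711_545, 1540.0),
    "10-cannibals": (57, 31.6),
    "10-stones": (206, 126.6),
    "5-Hanoi": (10_993, 156.0),
    "Grid": (1_529, 369.0),
}

# Ratio of means: how many times fewer operator applications after learning.
speedup = {domain: before / after
           for domain, (before, after) in TABLE_14_MEANS.items()}

for domain, factor in sorted(speedup.items(), key=lambda item: -item[1]):
    print(f"{domain:12s} {factor:8.1f}x")
```

The ratios confirm that every domain improves and that the 24-puzzle shows the largest gain by a wide margin, with 5-Hanoi second.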
Shaul Markovitch
1998-07-21