Thesis

Note: all files are gzipped Postscript and formatted for double-sided printing.

The complete dissertation in one file:

  • thesis.ps.gz (216 pages)
  • Or, each section in its own file:
    
     LEARNING EVALUATION FUNCTIONS FOR GLOBAL OPTIMIZATION
    
     CONTENTS AND ABSTRACT
    
     CHAPTER 1  Introduction
                1.1  Motivation: Learning Evaluation Functions
                1.2  The Promise of Reinforcement Learning
                1.3  Outline of the Dissertation
    
     CHAPTER 2  Learning Evaluation Functions for Sequential Decision Making
                2.1  Value Function Approximation (VFA)
                2.2  VFA in Deterministic Domains: ``Grow-Support''
                2.3  VFA in Acyclic Domains: ``ROUT''
                2.4  Discussion
    
     CHAPTER 3  Learning Evaluation Functions for Global Optimization
                3.1  Introduction
                3.2  The ``STAGE'' Algorithm
                3.3  Illustrative Examples
                3.4  Theoretical and Computational Issues
    
     CHAPTER 4  STAGE: Empirical Results
                4.1  Experimental Methodology
                4.2  Bin-packing
                4.3  VLSI Channel Routing
                4.4  Bayes Network Learning
                4.5  Radiotherapy Treatment Planning
                4.6  Cartogram Design
                4.7  Boolean Satisfiability
                4.8  Boggle Board Setup
                4.9  Discussion
    
     CHAPTER 5  STAGE: Analysis
                5.1  Explaining STAGE's Success
                5.2  Empirical Studies of Parameter Choices
                5.3  Discussion
    
     CHAPTER 6  STAGE: Extensions
                6.1  Least-Squares TD(lambda)
                6.2  Transfer
                6.3  Discussion
    
     CHAPTER 7  Related Work
                7.1  Adaptive Multi-Restart Techniques
                7.2  Reinforcement Learning for Optimization
                7.3  Rollouts and Learning for AI Search
                7.4  Genetic Algorithms
                7.5  Discussion
    
     CHAPTER 8  Conclusions
                8.1  Contributions
                8.2  Future Directions
                8.3  Concluding Remarks
    
    APPENDIX A  Proofs
                A.1  The Best-So-Far Procedure Is Markovian
                A.2  Least-Squares TD(1) Is Equivalent to Linear Regression
    
    APPENDIX B  Simulated Annealing
                B.1  Annealing Schedules
                B.2  The ``Modified Lam'' Schedule
                B.3  Experiments
    
    APPENDIX C  Implementation Details of Problem Instances
                C.1  Bin-packing
                C.2  VLSI Channel Routing
                C.3  Bayes Network Learning
                C.4  Radiotherapy Treatment Planning
                C.5  Cartogram Design
                C.6  Boolean Satisfiability
    
    REFERENCES
    

    Justin Boyan
    Last modified: Mon Aug 17 15:29:04 EDT 1998