FOGA-92: FOUNDATIONS OF GENETIC ALGORITHMS
Vail, CO, July 24-29, 1992

------------------------------

"Genetic Algorithms Are NOT Function Optimizers"

Kenneth A. De Jong
Computer Science Department, George Mason University, Fairfax, VA 22030, USA
kdejong@aic.gmu.edu

Abstract: Genetic Algorithms (GAs) have received a great deal of attention regarding their potential as optimization techniques for complex functions. The level of interest and success in this area has led to a number of improvements to GA-based function optimizers and a good deal of progress in characterizing the kinds of functions that are easy/hard for GAs to optimize. With all this activity, there has been a natural tendency to equate GAs with function optimization. However, the motivating context of Holland's initial GA work was the design and implementation of robust adaptive systems. In this paper we argue that a proper understanding of GAs in this broader adaptive-systems context is a necessary prerequisite for understanding their potential application to any problem domain. We then use these insights to better understand the strengths and limitations of GAs as function optimizers.

------------------------------

"Modeling simple genetic algorithms"

Michael Vose (vose@cs.utk.edu)

Abstract: Two models of the simple genetic algorithm are reviewed, extended, and unified. The result incorporates both short-term (transient) and long-term (asymptotic) GA behavior. This leads to a geometric interpretation of genetic search which explains population trajectories.

------------------------------

"Remapping Hyperspace During Genetic Search: Canonical Delta Folding"

Keith Mathias and Darrell Whitley
Department of Computer Science, Colorado State University, Fort Collins, Colorado 80523
mathiask, whitley@cs.colostate.edu

Abstract: Delta coding, an iterative strategy for genetic search, has been empirically shown to improve genetic search performance for some optimization problems. However, detailed analyses reveal that the remapping of hyperspace produced by delta coding is rudimentary. The underlying mechanisms of delta coding are examined and a more robust remapping method is introduced. Canonical delta folding provides a canonical ordering of the strings in the search space and then employs a powerful folding mechanism for remapping hyperspace. The behavior of canonical delta folding for remapping deceptive functions is examined using a set of exact equations for modeling the computational behavior of a simple genetic algorithm.

------------------------------

"Population Diversity in an Immune System Model: Implications for Genetic Search"

Robert E. Smith
Dept. of Engineering Mechanics, The University of Alabama, P.O. Box 870278, Tuscaloosa, AL 35487

Stephanie Forrest
Dept. of Computer Science, University of New Mexico, Albuquerque, NM 87131
forrest@unmvax.cs.unm.edu

Alan S. Perelson
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545
asp@receptor.lanl.gov

Abstract: In typical applications, genetic algorithms (GAs) process populations of potential problem solutions to evolve a single population member that specifies an "optimized" solution. The majority of GA analysis has focused on these optimization applications. In other applications (notably "learning classifier systems" and certain connectionist learning systems), a GA searches for a population of cooperative structures that jointly perform a computational task. This paper presents an analysis of this type of GA problem. The analysis considers a simplified genetics-based machine learning system: a model of an immune system. In this model, a GA must discover a set of pattern-matching "antibodies" that effectively match a set of "antigen" patterns. Analysis shows how a GA can automatically evolve and sustain a diverse, cooperative population. The cooperation emerges as a natural part of the antigen-antibody matching procedure. This emergent effect is shown to be similar to fitness sharing, an explicit technique for multi-modal GA optimization. The results imply that procedures like those in the immune system model could promote diverse, cooperative populations without the explicit, global calculations required by fitness sharing.
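As a point of reference for the comparison drawn above, the sketch below contrasts explicit fitness sharing with a simplified best-match credit scheme of the kind an antigen-antibody model suggests. It is our illustration only: the matching rule, the sharing radius, and the credit scheme are assumptions, not details taken from the paper.

    import random

    def hamming_match(antibody, antigen):
        """Match score: number of positions where antibody and antigen agree
        (an illustrative matching rule)."""
        return sum(a == g for a, g in zip(antibody, antigen))

    def shared_fitness(pop, raw_fitness, sigma_share=3.0):
        """Explicit fitness sharing: divide raw fitness by a niche count based
        on pairwise Hamming distance (the explicit, global calculation the
        abstract refers to)."""
        shared = []
        for i, x in enumerate(pop):
            niche = 0.0
            for y in pop:
                d = sum(a != b for a, b in zip(x, y))
                if d < sigma_share:
                    niche += 1.0 - d / sigma_share
            shared.append(raw_fitness[i] / niche)
        return shared

    def immune_style_credit(antibodies, antigens, trials=200):
        """Implicit niching: repeatedly present a random antigen and credit only
        the best-matching antibody (a simplified reading of the antigen-antibody
        procedure; the paper's scoring details may differ)."""
        credit = [0.0] * len(antibodies)
        for _ in range(trials):
            antigen = random.choice(antigens)
            scores = [hamming_match(ab, antigen) for ab in antibodies]
            credit[scores.index(max(scores))] += max(scores)
        return credit

Because each antigen rewards only its best-matching antibody, antibodies that specialize on different antigens largely stop competing with one another, which is one way to see why the emergent effect resembles explicit sharing.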
------------------------------

"Genetic Set Recombination"

Nicholas J. Radcliffe (njr@epcc.ed.ac.uk)
Edinburgh Parallel Computing Centre, University of Edinburgh
James Clerk Maxwell Building, The King's Buildings, Mayfield Road, Edinburgh EH9 3JZ, Scotland

Abstract: The application of genetic algorithms to optimisation problems for which the solution is a set or multiset (bag) is considered. A previous extension of schema analysis, known as forma analysis, is further developed and used to construct principled representations and operators for problems in this class. The extensions to forma analysis include the introduction of genes whose values cannot be assigned independently and a method for mediating between desirable but sometimes incompatible properties of recombination operators.

------------------------------

"REAL-CODED GENETIC ALGORITHMS AND INTERVAL-SCHEMATA"

Larry J. Eshelman and J. David Schaffer (lje@philabs.philips.com, ds1@philabs.philips.com)
Philips Laboratories, North American Philips Corporation, 345 Scarborough Road, Briarcliff Manor, New York 10510

Abstract: In this paper we introduce interval-schemata as a tool for analyzing real-coded genetic algorithms (GAs). We show how interval-schemata are analogous to Holland's symbol-schemata and provide a key to understanding the implicit parallelism of real-valued GAs. We also show how they support the intuition that real-coded GAs should have an advantage over binary coded GAs in exploiting local continuities in function optimization. On the basis of our analysis we predict some failure modes for real-coded GAs using several different crossover operators and present some experimental results that support these predictions. We also introduce a crossover operator for real-coded GAs that is able to avoid some of these failure modes.
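To make the interval-schema intuition concrete, here is a minimal sketch of a blend-style crossover for real-coded chromosomes, in which each child gene is sampled from an interval defined by the two parent genes. The alpha widening factor, the clipping to variable bounds, and the function name are our illustrative choices, not the operator specification from the paper.

    import random

    def blend_crossover(parent1, parent2, alpha=0.5, bounds=None):
        """Sample each child gene uniformly from the interval spanned by the
        parent genes, optionally widened by a factor alpha (illustrative
        parameterization of an interval-based crossover)."""
        child = []
        for i, (a, b) in enumerate(zip(parent1, parent2)):
            lo, hi = min(a, b), max(a, b)
            spread = alpha * (hi - lo)
            gene = random.uniform(lo - spread, hi + spread)
            if bounds is not None:  # keep each gene inside its variable's range
                gene = max(bounds[i][0], min(bounds[i][1], gene))
            child.append(gene)
        return child

    # Example: recombine two 3-dimensional real-valued parents.
    print(blend_crossover([0.1, 2.0, -1.5], [0.4, 1.0, -0.5],
                          bounds=[(0.0, 1.0), (0.0, 5.0), (-2.0, 0.0)]))

With alpha set to 0 the offspring stay inside the interval spanned by the parents; a positive alpha trades that bias for wider exploration.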
------------------------------

"Analyzing Deception in Trap Functions"

Kalyanmoy Deb and David E. Goldberg (deb@gal1.ge.uiuc.edu and goldberg@vmd.cso.uiuc.edu)

Abstract: A flat-population schema analysis is performed to find conditions for full deception in trap functions. It is found that the necessary and sufficient condition for an l-bit fully deceptive trap function is that all order l-1 schemata are misleading, and it is observed that the trap functions commonly used in a number of test suites are not fully deceptive. Further analysis suggests that in a fully deceptive trap function, the locally optimal function value may be as low as 50% of the globally optimal function value. In this context, the limiting ratio of the locally and globally optimal function values is calculated for a number of fully deceptive functions currently in use. The analysis indicates that trap functions allow more flexibility in designing a deceptive function. It is also found that the proportion of fully deceptive functions in the family of trap functions is only O(log(l)/l), and that more than half of the trap functions are fully easy functions.
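For readers who have not met them, trap functions are usually defined over the unitation (the number of ones) of an l-bit string, with one slope climbing to a local optimum at the all-zeros string and another climbing to the global optimum at the all-ones string. The sketch below uses one common parameterization; the symbols a, b, and z and their default values are our choices, not necessarily those used in the paper.

    def trap(bits, a=0.8, b=1.0, z=None):
        """Trap function of unitation: value a at the all-zeros local optimum,
        value b at the all-ones global optimum, with the slope change at z ones
        (illustrative parameterization)."""
        l = len(bits)
        if z is None:
            z = l - 1                     # common choice: valley just below the global optimum
        u = sum(bits)                     # unitation: number of ones in the string
        if u <= z:
            return a * (z - u) / z        # slope climbing toward the all-zeros local optimum
        return b * (u - z) / (l - z)      # slope climbing toward the all-ones global optimum

    # The all-zeros string scores a, the all-ones string scores b, and for a
    # close to b most low-order schema averages favor the all-zeros side,
    # which is the kind of misleading information analyzed in the paper.
    print(trap([0] * 10), trap([1] * 10), trap([1] * 5 + [0] * 5))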
------------------------------

"Hierarchical automatic function definition in genetic programming"

John R. Koza
Computer Science Department, Stanford University, Stanford, CA 94305 USA
E-MAIL: Koza@Sunburn.Stanford.Edu
PHONE: 415-941-0336, FAX: 415-941-9430

Abstract: A key goal in machine learning and artificial intelligence is to automatically and dynamically decompose problems into simpler problems. This paper describes two extensions to genetic programming, called "automatic" function definition and "hierarchical automatic" function definition, wherein functions that might be useful in solving a problem are automatically defined on-the-fly during a run. The automatically defined functions are defined in terms of dummy variables so that they can be repeatedly called from the "main" result-producing part of the program with different instantiations of the dummy variables. In the "hierarchical" version of automatic function definition, automatically defined functions may call other automatically defined functions, thus creating a hierarchy of dependencies among the automatically defined functions. Thus, a problem can be decomposed into smaller problems in the form of a hierarchy of calls to automatically defined functions. The even-11-parity problem was solved using hierarchical automatic function definition.

------------------------------

"Recombination Distributions for Genetic Algorithms"

Lashon B. Booker
The MITRE Corporation, 7525 Colshire Drive, McLean, VA 22102-3481
booker@mitre.org

Abstract: Though genetic algorithms are loosely based on the principles of genetic variation and natural selection, the theory of mathematical genetics has not played a large role in most analyses of genetic algorithms. This paper reviews some well-known results in mathematical genetics that use probability distributions to characterize the effects of recombination on multiple loci in the absence of selection. The relevance of this characterization to genetic algorithm research is illustrated by using it to quantify certain inductive biases associated with crossover operators. The potential significance of this work for the theory of genetic algorithms is discussed.

------------------------------

"Deception Considered Harmful"

John J. Grefenstette
Navy Center for Applied Research in Artificial Intelligence, Code 5514, Naval Research Laboratory, Washington, DC 20375-5000
E-mail: GREF@AIC.NRL.NAVY.MIL

Abstract: A central problem in the theory of genetic algorithms is the characterization of problems that are difficult for GAs to optimize. Many attempts to characterize such problems focus on the notion of "Deception", defined in terms of the static average fitness of competing schemas. This article examines the Static Building Block Hypothesis (SBBH), the underlying assumption used to define Deception. Exploiting contradictions between the SBBH and the Schema Theorem, we show that Deception is neither necessary nor sufficient for problems to be difficult for GAs. This article argues that the characterization of hard problems must take into account the basic features of genetic algorithms, especially their dynamic, biased sampling strategy.

------------------------------

"Towards a Stronger Building-Blocks Hypothesis: Effects of Relative Building-Block Fitness on GA Performance"

Stephanie Forrest
Dept. of Computer Science, University of New Mexico, Albuquerque, NM 87131
forrest@unmvax.cs.unm.edu

Melanie Mitchell
AI Laboratory, University of Michigan, Ann Arbor, MI 48109
mm@santafe.edu

Abstract: The building-blocks hypothesis states that the GA works well when short, low-order, highly fit schemas recombine to form even more highly fit higher-order schemas. The ability to produce fitter and fitter partial solutions by combining building blocks is believed to be a primary source of the GA's search power, but the GA research community currently lacks precise and quantitative descriptions of how schema processing actually takes place during the typical evolution of a GA search. Another open problem is to characterize in more detail the types of fitness landscapes for which crossover will be an effective operator. In this paper we first describe a class of fitness landscapes (the "Royal Road" functions) that we have designed to investigate these questions. We then present some unexpected experimental results concerning the GA's performance on simple instances of these landscapes, in which we vary the strength of reinforcement from "stepping stones": fit intermediate-order schemas obtained by recombining fit low-order schemas.
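A Royal-Road-style function rewards a string for each predefined schema (contiguous block of bits) it fully contains. The minimal sketch below conveys the idea; the block size, string length, and coefficients are illustrative choices rather than the exact settings of the functions studied in the paper.

    def royal_road(bits, block_size=8):
        """Royal-Road-style fitness: the string is divided into contiguous
        blocks, and each block adds its length to the fitness only when every
        bit in it is set (illustrative coefficients)."""
        fitness = 0
        for start in range(0, len(bits), block_size):
            block = bits[start:start + block_size]
            if all(block):
                fitness += len(block)
        return fitness

    # A 64-bit string with two completed 8-bit blocks scores 16.
    x = [1] * 8 + [0] * 8 + [1] * 8 + [0] * 40
    print(royal_road(x))

Variants add rewards for intermediate-order schemas formed from combinations of completed blocks; varying how strongly those "stepping stones" are reinforced is the manipulation the abstract refers to.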
------------------------------

"Accounting for Noise in the Sizing of Populations"

David E. Goldberg, Kalyanmoy Deb, and James H. Clark
(Goldberg) goldberg@vmd.cso.uiuc.edu, (Deb) deb@gal1.ge.uiuc.edu

Abstract: This paper considers the effect of noise on the quality of convergence of genetic algorithms (GAs). A population-sizing equation is derived to ensure that minimum signal-to-noise ratios are favorable to the discrimination of the best building blocks required to solve a problem of bounded deception. In five test problems of varying degrees of nonlinearity, nonuniform scaling, and nondeterminism, the sizing relation proves to be a conservative predictor of average correct convergence. These results suggest how the sizing equation may be viewed as a coarse delineation of a boundary between two distinct types of GA behavior. Besides discussing a number of extensions of this work, the paper discusses how these results may one day lead to rigorous proofs of convergence for recombinative GAs operating on problems of bounded deception.

------------------------------

"SYNTACTIC ANALYSIS OF CONVERGENCE IN GENETIC ALGORITHMS"

Sushil J. Louis and Gregory J. E. Rawlins
louis@cs.indiana.edu, rawlins@cs.indiana.edu
Dept. of Computer Science, Lindley Hall 215, Indiana University, Bloomington, IN 47405

Abstract: We use the average Hamming distance of a population as a syntactic metric to obtain probabilistic bounds on the time convergence of genetic algorithms. Analysis of a flat function provides worst-case time complexity for static functions and gives a theoretical basis to the problem of premature convergence. We suggest simple changes that mitigate this problem and help fight deception. Further, employing linearly computable syntactic information, we can provide upper limits on the time beyond which progress is unlikely on an arbitrary function. Preliminary results support our analysis.

------------------------------

"Generation Gaps Revisited"

Kenneth A. De Jong and Jayshree Sarma
George Mason University, Fairfax, VA 22030, USA
kdejong@aic.gmu.edu, jsarma@cs.gmu.edu

Abstract: There has been a lot of recent interest in so-called "steady state" genetic algorithms (GAs) which, among other things, replace only a few individuals (typically 1 or 2) each generation from a fixed-size population of size N. Understanding the advantages and/or disadvantages of replacing only a fraction of the population each generation (rather than the entire population) was a goal of some of the earliest GA research. In spite of considerable progress in our understanding of GAs since then, the pros and cons of overlapping generations remain a somewhat cloudy issue. However, recent theoretical and empirical results provide the background for a much clearer understanding of this issue. In this paper we review, combine, and extend these results in a way that significantly sharpens our insight.
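To make the contrast with generational replacement concrete, here is a minimal sketch of one step of a steady-state GA, in which a single offspring replaces the current worst member of a fixed-size population (a generation gap of 1/N). The tournament selection, one-point crossover, and bit-flip mutation are illustrative choices, not the configurations analyzed in the paper.

    import random

    def tournament_select(pop, fitness, k=2):
        """Return the index of the best of k randomly chosen individuals."""
        contenders = random.sample(range(len(pop)), k)
        return max(contenders, key=lambda i: fitness[i])

    def steady_state_step(pop, fitness, evaluate, p_mut=0.01):
        """One steady-state step: breed a single offspring and insert it in
        place of the current worst individual, leaving the rest untouched."""
        i = tournament_select(pop, fitness)
        j = tournament_select(pop, fitness)
        cut = random.randrange(1, len(pop[i]))                  # one-point crossover
        child = pop[i][:cut] + pop[j][cut:]
        child = [b ^ (random.random() < p_mut) for b in child]  # bit-flip mutation
        worst = min(range(len(pop)), key=lambda n: fitness[n])
        pop[worst], fitness[worst] = child, evaluate(child)

    # Example: maximize the number of ones in a 20-bit string with N = 30.
    random.seed(0)
    pop = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
    fit = [sum(ind) for ind in pop]
    for _ in range(500):
        steady_state_step(pop, fit, evaluate=sum)
    print(max(fit))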
------------------------------

"Crossover or Mutation?"

William M. Spears
AI Center, Code 5510, Naval Research Laboratory, Washington, D.C. 20375
spears@aic.nrl.navy.mil

Abstract: Genetic algorithms rely principally on two genetic operators: crossover and mutation. Although there exists a large body of conventional wisdom concerning the roles of crossover and mutation, these roles have not been captured in a theoretical fashion. For example, it has never been theoretically shown that mutation is in some sense "less powerful" than crossover, or vice versa. This paper provides some answers to these questions by theoretically demonstrating that there are some important characteristics of each operator that are not captured by the other.

------------------------------

"Learning Boolean Functions with Genetic Algorithms: A PAC Analysis"

Johannes P. Ros
Computer Science Department, University of Pittsburgh, Pittsburgh, PA 15260
ros@cs.pitt.edu

Abstract: This paper presents a genetic algorithm for learning the classes of Boolean conjuncts and disjuncts (which include the sets of kCNF and kDNF formulas) in the context of the distribution-free learning model (the "probably approximately correct" or PAC model). A PAC learner produces with high probability an accurate representation for an unknown concept from training examples that are chosen at random from a fixed but arbitrary probability distribution. The analysis of the genetic PAC learner is complete: For any reasonable crossover operator and any confidence and accuracy level, the result provides the number of generations and the size of the population needed to accomplish this learning task.

------------------------------

"Is the genetic algorithm a cooperative learner?"

Helen Cobb
Navy Center for Applied Research in Artificial Intelligence, Code 5514, Naval Research Laboratory, Washington, DC 20375-5000
E-mail: cobb@AIC.NRL.Navy.Mil

Abstract: This paper begins to explore an analogy between the usual competitive learning metaphor presented in the genetic algorithm (GA) literature and the cooperative learning metaphor discussed by Clearwater et al. In a blackboard cooperative learning paradigm, agents share partial results with one another through a common blackboard. By occasionally accessing the blackboard for a partial solution, an agent can dramatically increase its speed in finding the overall solution to a problem. The study of Clearwater et al. shows that the resulting speed distribution among the agents is lognormal. The GA can also be described in terms of an analogous cooperative learning paradigm. Unlike the blackboard learner, the GA shares information by copying and recombining the solutions of the agents. This slower propagation of information is necessary because the GA cannot directly evaluate parts of a solution or "partial solutions." The extent to which the GA is cooperative also depends on the choice of heuristics used to modify the canonical GA. The few test cases presented in this paper suggest that the GA may at times yield an approximately lognormal distribution or a mixture of lognormal distributions. While the results look promising, more analysis of the algorithm's overall process is required.

------------------------------

"Simulated Crossover in Genetic Algorithms"

Gilbert Syswerda
BBN Laboratories, BBN Systems and Technologies, 10 Moulton Street, Cambridge, MA 02138

Abstract: A new class of crossover operator, simulated crossover, is presented. Simulated crossover treats the population of a genetic algorithm as a conditional variable to a probability density function that predicts the likelihood of generating samples in the problem space. A specific version of simulated crossover, bit-based simulated crossover, is explored. Its ability to perform schema recombination and reproduction is compared analytically against other crossover operators, and its performance is checked on several test problems.

------------------------------

"An Executable Model of a Simple Genetic Algorithm"

Darrell Whitley (whitley@CS.ColoState.EDU)

Abstract: A set of executable equations is defined which models the ideal behavior of a simple genetic algorithm. The equations assume an infinitely large population and require the enumeration of all points in the search space. When implemented as a computer program, the equations can be used to study problems of up to approximately 15 bits. These equations represent an extension of earlier work by Bridges and Goldberg. At the same time, these equations are a special case of a model introduced by Vose and Liepins. The various models are reviewed and the predictive behavior of the executable equations is examined. Then the executable equations are extended by quantifying the behavior of a reduced surrogate operator and a uniform crossover operator. In addition, these equations are used to study the computational behavior of a parallel island model implementation of the simple genetic algorithm.
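The flavor of these infinite-population equations can be seen in the selection-only case sketched below, where the expected proportion of each string after fitness-proportionate selection is its current proportion weighted by relative fitness. The full executable models add crossover and mutation terms over all pairs of strings, which is what limits them in practice to problems of roughly 15 bits; the code is our simplified illustration, not the equations from the paper.

    def proportional_selection_update(proportions, fitness):
        """Infinite-population expectation under fitness-proportionate selection:
        p_i(t+1) = p_i(t) * f_i / (average population fitness)."""
        avg = sum(p * f for p, f in zip(proportions, fitness))
        return [p * f / avg for p, f in zip(proportions, fitness)]

    # Example: a fully enumerated 2-bit search space (strings 00, 01, 10, 11).
    fitness = [1.0, 2.0, 2.0, 4.0]
    p = [0.25, 0.25, 0.25, 0.25]
    for generation in range(10):
        p = proportional_selection_update(p, fitness)
    print(p)  # the proportion mass concentrates on the fittest string, 11

------------------------------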