Date: Tue, 24 Jul 90 09:08:45 EDT
From: booker@AIC.NRL.Navy.Mil
Subject: Workshop abstract

The Classifier System Concept Description Language

Lashon B. Booker

Legitimate concerns have been raised about the expressive adequacy of the classifier language. This paper shows that many of those concerns stem from the inadequacies of the binary encodings typically used with classifier systems, not the classifier language per se. In particular, we describe some straightforward binary encodings for attribute-based instance spaces. These encodings give classifier systems the ability to represent ordinal and nominal attributes as expressively as most symbolic machine learning systems, without sacrificing the building blocks required by the genetic algorithm.

-----------------------------------
Date: Wed, 1 Aug 90 19:56:18 EDT
From: Gilbert Syswerda
Subject: Workshop Abstract

A Study of Reproduction in Generational and Steady State Genetic Algorithms

Two techniques of population control are currently used in the field of genetic algorithms: generational and steady state. It has become apparent that the two techniques are actually quite different, and that steady state in general performs much better than generational. In this paper, I study the performance of each with regard to reproduction.

-----------------------------------------
Date: Mon, 30 Jul 90 07:41:34 EDT
From: ds1@philabs.Philips.Com (Dave Schaffer)
Subject: workshop abstract

Spurious Correlations and Premature Convergence in Genetic Algorithms

J. David Schaffer, Larry J. Eshelman, Daniel Offutt
Philips Laboratories, North American Philips Corporation,
345 Scarborough Road, Briarcliff Manor, New York 10510

ABSTRACT

What distinguishes genetic algorithms (GAs) from other search methods is their inherent exploitive sampling ability known as implicit parallelism. We argue, however, that this exploitive behavior makes GAs sensitive to spurious correlations between schemata that contribute to performance and schemata that are parasitic. If not combatted, this can lead to premature convergence. Among crossover operators, some are more disruptive than others, and traditional arguments have held that less disruption is better for implicit parallelism. To explore this issue we examine the behavior of two crossover operators, two-point and uniform crossover, on a problem contrived to contain specific spurious correlations. The more disruptive operator, uniform crossover, is more effective at combatting the spurious correlations, at the expense of greater disruption of the effective schemata. Elitist selection procedures are shown to ameliorate this somewhat, suggesting that research into ways of dealing with the effects of the inevitable sampling errors may lead to generally more robust algorithms.

------------------------------------
Date: Tue, 7 Aug 90 09:03:12 EDT
From: lje@philabs.Philips.Com (Larry Eshelman)
Subject: Abstract, GA Workshop

The following is a brief abstract of the paper I presented at the IU GA Workshop.  --Larry Eshelman

ABSTRACT

This paper describes and analyzes CHC, a nontraditional genetic algorithm which combines a conservative selection strategy that always preserves the best individuals found so far with a radical (highly disruptive) recombination operator that produces offspring that are maximally different from both parents. It is argued that the traditional reasons for preferring a recombination operator with a low probability of disrupting schemata do not hold when such a conservative selection strategy is used, and that, on the contrary, at least some highly disruptive crossover operators provide more effective search. Empirical evidence is provided to support these claims.
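Read concretely, a recombination operator that makes offspring maximally different from both parents can exchange exactly half of the bit positions at which the parents differ (the operator the CHC literature calls HUX works this way). A minimal Python sketch of that idea, together with the pooled preserve-the-best survival step, assuming bit-list individuals; function names are ours, not the paper's:

    import random

    def cross_half_the_differences(p1, p2):
        # Positions where the parents disagree; swapping exactly half of
        # them yields offspring maximally different from both parents.
        diff = [i for i in range(len(p1)) if p1[i] != p2[i]]
        swap = set(random.sample(diff, len(diff) // 2))
        c1 = [p2[i] if i in swap else b for i, b in enumerate(p1)]
        c2 = [p1[i] if i in swap else b for i, b in enumerate(p2)]
        return c1, c2

    def conservative_selection(parents, children, fitness):
        # Keep the best individuals found so far, parents and children pooled.
        pooled = sorted(parents + children, key=fitness, reverse=True)
        return pooled[:len(parents)]

Each child then differs from either parent in half of the disagreeing positions, which is as far from both as any recombinant can be.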
-----------------------------------------
Date: Wed, 15 Aug 90 14:45:02 EDT
From: gref@AIC.NRL.Navy.Mil
Subject: FOGA-90 Abstract

Weakest Conditions for Implicit Parallelism

John Grefenstette

Many interesting varieties of genetic algorithms have been designed and implemented in the last fifteen years. One way to improve our understanding of genetic algorithms is to identify properties that are invariant across these seemingly different versions. This talk focuses on invariances among GAs that differ along two dimensions: (1) the way the user-defined objective function is mapped to a fitness measure, and (2) the way the fitness measure is used to assign offspring to parents. A GA is called admissible if it meets what seem to be the weakest reasonable requirements along these dimensions. It is shown that any admissible GA exhibits a form of implicit parallelism.

-----------------------------------------
Date: Thu, 16 Aug 90 12:38:44 EDT
From: dejong@AIC.NRL.Navy.Mil
Subject: abstract

An Analysis of Multi-point Crossover

K. De Jong & Wm. Spears

At the FOGA Workshop, we presented some theoretical results on n-point and uniform crossover. This analysis extended the work from De Jong's thesis, which dealt with the disruption of n-point crossover on 2nd order schemata. We have made various extensions to this theory:

1) An analysis of the disruption of n-point crossover on kth order schemata.
2) The computation of tighter bounds on the disruption caused by n-point crossover, by examining the cases in which both parents are members of the schema (i.e., the values of the defining bits are the same).
3) An analysis of the disruption caused by uniform crossover on kth order schemata.

Disruption of schemata is important for understanding the effects of crossover when populations are diverse (typically early in the evolutionary process). If a population becomes quite homogeneous, another factor becomes important: the probability that the offspring produced by crossover will differ from their parents in some way (thus generating a new sample). These probability computations turn out to be the same as the above disruption computations. In other words, the operators that are more disruptive are also more likely to create new individuals from parents with nearly identical genetic material. These theoretical results appear to explain many of the conflicting experimental results concerning the advantages and disadvantages of 1-point, 2-point, and uniform crossover, and they suggest an interplay with population size. With a small population, more disruptive crossover operators such as uniform or n-point (n > 2) yield better results, because they help overcome the limited carrying capacity of smaller populations and the tendency toward homogeneity. However, with larger populations, less disruptive crossover operators (2-point) work better, as suggested by Holland's original analysis.
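The disruption analysis is easy to check numerically. For an order-k schema whose defining bits all differ between the parents, uniform crossover with exchange probability 1/2 keeps the schema intact in one of the two offspring with probability 2(1/2)^k — so it disrupts an order-3 schema about 75% of the time regardless of defining length — while the disruption caused by n-point crossover depends strongly on how the defining bits are spread. A small Monte Carlo sketch (ours, not the authors' closed-form analysis):

    import random

    def n_point_sources(length, n):
        # Which parent (0 or 1) supplies each position under n-point crossover.
        cuts = sorted(random.sample(range(1, length), n))
        src, parent, prev = [], 0, 0
        for cut in cuts + [length]:
            src += [parent] * (cut - prev)
            parent, prev = 1 - parent, cut
        return src

    def uniform_sources(length):
        # Each position drawn independently from either parent (p = 1/2).
        return [random.randrange(2) for _ in range(length)]

    def disruption_rate(length, defining, sources, trials=100_000):
        # Fraction of crossovers in which no offspring inherits all the
        # defining positions from the parent carrying the schema (counts a
        # cross as disruptive even if the other parent happens to match).
        broken = 0
        for _ in range(trials):
            src = sources(length)
            broken += len({src[i] for i in defining}) > 1
        return broken / trials

    k3 = [0, 10, 19]   # an order-3 schema spread across a 20-bit string
    print(disruption_rate(20, k3, uniform_sources))                   # ~ 0.75
    print(disruption_rate(20, k3, lambda L: n_point_sources(L, 2)))   # lower

With a homogeneous population the same counts read as the chance of producing offspring that differ from their parents, which is the equivalence the abstract points out.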
-------------------------------------
Date: Fri, 17 Aug 90 09:23:32 CDT
From: Robert Elliott Smith
Subject: Variable Default Hierarchy Separation in a Classifier System

Variable Default Hierarchy Separation in a Classifier System

Robert E. Smith and David E. Goldberg
The University of Alabama, Department of Engineering Mechanics, Tuscaloosa, Alabama

Abstract

Traditionally, some form of auction-based, specificity-biased allocation of credit (AOC) and conflict resolution (CR) scheme has been used to encourage default hierarchy formation in the LCS. These AOC/CR schemes have two major limitations. The first is that they lead to a fixed steady-state difference between the bids of rules. This fixed separation is dictated by pre-specified LCS parameters and the given specificity difference between rules. Since unknown environmental factors (e.g., noise, the repeated application of taxes, and reward delays) dictate what amount of separation is adequate for a given environment, these parameters cannot be specified for general utility. The second limitation is that specificity may not reflect the correct priority of a rule in a default hierarchy. Classifiers can be placed into working default hierarchies in priority positions that are not related to their specificity. It is true that default hierarchies can be formed for any given environment such that specificity-based prioritization is possible. However, if the system is only able to exploit this restricted class of default hierarchies, the rule discovery burden on the GA is increased. Under this restriction, a system can only thoroughly evaluate rules in the context of sets that can be prioritized based on specificity, thus limiting the GA's effectiveness. Additionally, the class of effective rule sets that can be discovered by the GA is narrowed.

To allow for more robust default hierarchy formation in the LCS, this study suggests a modified CR scheme that associates two separate measures with each classifier. The first measure is a reward estimate that is updated in the same manner as strength in the traditional LCS scheme. The second measure is a priority factor that biases CR. The priority factor is updated using a necessity auction. A necessity auction is a more realistic simulation of a real auction, in which the winning classifier need only pay out the bid of its nearest competitor. Under the modified scheme, bids are based on reward estimates, but conflicts are resolved based on the product of the reward estimate and the priority factor. In terms of economic analogy, the priority factor can be thought of as a cash reserve, or as some other net-profit-based influence on the auction procedure. Note that this scheme has the benefit of yielding a more clear-cut classifier fitness measure for the GA. Since the reward estimate in this scheme is not used to measure a classifier's priority (as happens with strength in the traditional scheme), it can more adequately reflect a classifier's fitness in the current rule set.

Analysis and experimentation show that separate priority factors and the necessity auction induce variable bid separation for multi-level default hierarchies. This separation adapts to fit the characteristics of the environment and leads to consistent preference of the exception over the default.

Robert Elliott Smith
Department of Engineering Mechanics
The University of Alabama
P. O. Box 870278
Tuscaloosa, Alabama 35487
<> rob@galab2.mh.ua.edu <> (205) 348-4661
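A schematic Python rendering of the modified scheme may help fix ideas. The second-price payment and the product-based conflict resolution follow the description above; the particular priority-update rule is a placeholder assumption, not the paper's:

    class Classifier:
        def __init__(self, bid0=10.0):
            self.reward_estimate = bid0   # updated like strength in the usual LCS
            self.priority = 1.0           # cash-reserve-like factor biasing CR

    def necessity_auction(matched, rate=0.1):
        # Bids are reward estimates, but the conflict is resolved on the
        # product reward_estimate * priority.
        ranked = sorted(matched, key=lambda c: c.reward_estimate * c.priority,
                        reverse=True)
        winner = ranked[0]
        # The winner pays out only its nearest competitor's bid.
        payment = ranked[1].reward_estimate if len(ranked) > 1 else 0.0
        surplus = winner.reward_estimate - payment
        winner.reward_estimate -= payment
        # Hypothetical priority update: retained surplus feeds the reserve.
        winner.priority += rate * surplus
        return winner

Because the payment is set by the competition rather than by the winner's own bid, the steady-state separation between a default and its exception is no longer fixed by pre-specified parameters, which is the point of the scheme.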
-------------------------------------------
Date: Wed, 22 Aug 90 15:37:32 EDT
From: jima@starbase.MITRE.ORG (Jim Antonisse)
Subject: A Grammar-Based Genetic Algorithm

A Grammar-Based Genetic Algorithm

H. James Antonisse
antonisse@ai.mitre.org
The MITRE Corporation W418, 7525 Colshire Drive, McLean, VA 22102
(703) 883-7887

One major obstacle to a widened role for the genetic algorithm (GA) has been its foundation on unstructured problem representations. Although unstructured representations form the basis of the theory underlying the GA, such representations contrast sharply with most work in the artificial intelligence community. The presentation proposed a general reformulation of the genetic algorithm that makes it appropriate to any representation that can be cast in a formal grammar. First, a reconstruction of the source of the GA's power was presented. The reconstruction clarified the three biases inherent in the GA approach: the statistical bias, the inductive bias, and the proximity bias. The problem that confronts the GA in structured domains was illustrated, and a solution was presented that is appropriate for any domain defined by a grammar. The approach results in a new crossover operator that folds the grammar into the operator in a completely general way. Fragments of legal expressions that are crossed using this operator are guaranteed to lead to new expressions that are legal in the grammar.

-------------------------------------------
Date: Wed, 22 Aug 90 18:15:06 -0300
From: yuval@wisdom.weizmann.ac.il (Yuval Davidor)
Subject: Epistasis Variance

Epistasis Variance - Suitability of a Representation to a Genetic Algorithm

Yuval Davidor
Department of Applied Mathematics and Computer Science, Weizmann Institute of Science, Rehovot 76100, Israel.
yuval@wisdom.weizmann.ac.il

The choice of a representation is the most important constraint imposed on a GA. The discussion on epistasis variance attempts to suggest a new viewpoint on the interaction between a representation and a GA, and a criterion with which the suitability of a particular representation for a GA can be assessed. The paper promotes the following dogma: 1) GAs only 'see' the representation, and therefore all nonlinearities embedded in the application (objective function, population, and representation structure) are folded into the representation. 2) GAs will work well if one can estimate fairly well the value of a complete genotype by knowing the values of its alleles.

The second point implies nonlinear interactions among the representation parameters, and that if these nonlinearities are moderate, then the given representation is suitable for a GA (no statement is offered at present as to the optimal amount of nonlinearity). Biologists call these interactions epistasis, from the Greek for "masking". Either too much or too little epistasis (nonlinear interaction) detracts from the relative efficiency of a GA. It is suggested that measures to qualify the suitability of a representation to a GA search can be developed from the concept of epistasis, by attempting to decompose genotypes' fitnesses into allelic values. Measuring the variance of the discrepancy between the original and reconstructed fitnesses indicates the amount of epistasis and hence the suitability of the representation to a GA. Furthermore, quantifying epistasis in this way offers a powerful mechanism to `spy' on other central issues, such as premature convergence and optimal population size.
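The proposed measure can be operationalized roughly as follows: estimate the average excess fitness of each allele at each locus, rebuild every genotype's fitness as the population mean plus the sum of its allelic excesses, and take the variance of the reconstruction error. A Python sketch under those assumptions (the paper's exact estimator may differ):

    from statistics import fmean, pvariance

    def epistasis_variance(population):
        # population: list of (genotype, fitness), genotypes as '0'/'1' strings.
        base = fmean(f for _, f in population)
        length = len(population[0][0])
        # Average excess fitness carried by each allele value at each locus.
        excess = {}
        for locus in range(length):
            for allele in "01":
                fits = [f for g, f in population if g[locus] == allele]
                excess[locus, allele] = fmean(fits) - base if fits else 0.0
        # Rebuild each fitness linearly from allelic values; the variance
        # of the residual is the epistasis measure.
        residuals = [f - (base + sum(excess[i, g[i]] for i in range(length)))
                     for g, f in population]
        return pvariance(residuals)

    ones = lambda g: g.count("1")   # a linear, epistasis-free fitness
    pop = [(format(x, "03b"), ones(format(x, "03b"))) for x in range(8)]
    print(epistasis_variance(pop))  # 0.0: the linear fit is exact

A linear (counting-ones) function reconstructs exactly, giving zero epistasis variance; a strongly interacting function such as parity leaves large residuals.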
--------------------------------------
Date: Sat, 25 Aug 90 18:21:42 CDT
From: Deb
Subject: Comparative Analysis of Selection Schemes Used in GAs

A Comparative Analysis of Selection Schemes Used in Genetic Algorithms

David E. Goldberg and Kalyanmoy Deb

This paper considers a number of selection schemes commonly used in modern genetic algorithms. Specifically, proportionate reproduction, ranking selection, tournament selection, and Genitor (or `steady state') selection are compared on the basis of solutions to deterministic difference or differential equations, which are verified through computer simulations. The analysis provides convenient approximate or exact solutions, as well as useful convergence time and growth ratio estimates. The paper recommends practical application of the analyses and suggests a number of paths for more detailed analytical investigation of selection techniques.

------------------------------------
Date: Mon, 27 Aug 90 16:40 EST
From: "Barry R. Fox" <0003415313@mcimail.com>

An Investigation of Genetic Operators for Sequencing Problems

Producing offspring for sequencing problems requires different recombination operators than those used on traditional bit-string problems. Little work has been done to explain how reordering operators recognize and preserve good "building blocks" for sequences. We describe a model for looking at sequences which reveals the predecessor/successor relationship between any two elements. This model may be used to compare two sequences, recognize the relations they have in common, and then generate offspring which preserve some or all of these relations.

Mary Beth McMahon and Barry Fox
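One concrete reading of this model, in Python: represent each sequence by its set of predecessor/successor pairs, intersect the parents' sets, and build offspring by a randomized topological sort of the shared relations. This sketch preserves all common relations (the paper also allows preserving only some) and assumes the sequences are permutations of distinct elements:

    import random

    def precedes(seq):
        # The set of (a, b) pairs such that a appears before b in seq.
        return {(seq[i], seq[j])
                for i in range(len(seq)) for j in range(i + 1, len(seq))}

    def common_relations(s1, s2):
        return precedes(s1) & precedes(s2)

    def offspring(s1, s2):
        # Random sequence preserving every predecessor/successor relation
        # the two parents share (a randomized topological sort; the shared
        # relations of two total orders always form a partial order).
        common = common_relations(s1, s2)
        remaining, child = set(s1), []
        while remaining:
            ready = [x for x in remaining
                     if not any((y, x) in common for y in remaining)]
            pick = random.choice(ready)
            child.append(pick)
            remaining.remove(pick)
        return child

    s1 = ["a", "b", "c", "d", "e"]
    s2 = ["a", "c", "b", "e", "d"]
    print(offspring(s1, s2))   # e.g. ['a', 'c', 'b', 'd', 'e']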
-------------------------------------
Date: Mon, 27 Aug 90 09:46:02 EDT
From: Rick_Riolo@um.cc.umich.edu

Lookahead Planning and Latent Learning in Classifier Systems

Rick L. Riolo
University of Michigan, Ann Arbor, MI, 48109, USA
rick_riolo@um.cc.umich.edu

Classifier systems (CSs) have been used to simulate and describe the behavior of adaptive organisms, animats, and robots. However, CS implementations to date have all been reactive systems, which use simple S-R rules and which base their learning algorithms on trial-and-error reinforcement techniques similar to the Hullian Law of Effect. While these systems have exhibited interesting behavior and good adaptive capacity, they cannot do other types of learning which require having explicit internal models of the external world, e.g., using complex plans as humans do, or doing ``latent learning'' of the type observed in rats. This paper describes a CS that is able to learn and use internal models, both to greatly decrease the time needed to learn general sequential decision tasks and to enable the system to exhibit latent learning. (This is the abstract from a paper that will appear in Proceedings of the Conference on Simulating Animal Behavior (SAB 90), Paris, Sept 1990; that paper includes some of the results presented at the FOGA workshop.)

-------------------------------------
Date: Thu, 30 Aug 90 18:17:37 EDT
From: Heinz.Muehlenbein@K.GP.CS.CMU.EDU

Parallel Genetic Algorithms and Combinatorial Optimization

Heinz Muhlenbein

Parallel genetic algorithms (PGAs) incorporate two major modifications compared to the standard genetic algorithm. First, selection for mating is distributed: individuals live in a 2-D world, and each individual selects a mate independently, within its own neighborhood. Second, each individual may improve its fitness during its lifetime, e.g., by local hill climbing. The PGA is totally asynchronous and runs with maximal efficiency on MIMD parallel computers; this is achieved by making the individuals active agents, rather than passive structures acted upon as in the standard genetic algorithm.

We will show the power of the PGA on three combinatorial problems: the traveling salesman problem, the autocorrelation problem, and the graph partitioning problem. In all these examples, the PGA has found solutions to very large problem instances that are comparable to, or even better than, the best solutions found by other heuristics; the evidence is experimental rather than theoretical. In the talk we will also outline why and when the PGA is successful. First, success depends on the genetic representation of the combinatorial problem. Second, a suitable crossover operator and an efficient local hill climbing method are important. Abstractly, a PGA is a parallel search with information exchange between the individuals. If we represent the combinatorial problem as a fitness landscape over a configuration space, we see that a PGA tries to jump from two local minima to a third, still better local minimum by using the crossover operator. This jump is (probabilistically) successful if the fitness landscape has a certain correlation; we will show this correlation for the traveling salesman problem by a configuration space analysis. The PGA implicitly exploits this correlation. The PGA is a very simple algorithm and can be applied to many combinatorial problems with great success. Nevertheless, its analysis is very complex and has implications for biology and physics as well.
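The PGA's structure is easy to sketch serially, even though the real algorithm runs each individual as an asynchronous process on a MIMD machine. In the Python sketch below (ours; it assumes fitness is maximized and uses the 4-cell torus neighborhood), each individual takes its fittest neighbor as mate, recombines, improves the child by hill climbing, and is replaced only if the child is better:

    def neighbors(i, j, n):
        # 4-neighborhood on an n x n torus.
        return [((i - 1) % n, j), ((i + 1) % n, j),
                (i, (j - 1) % n), (i, (j + 1) % n)]

    def pga_step(grid, fitness, crossover, hill_climb):
        # One sweep of a parallel GA, serialized for illustration: every
        # individual independently picks the fittest neighbor as mate,
        # recombines, improves the offspring during its "lifetime" by
        # local hill climbing, and is replaced if the offspring is better.
        n = len(grid)
        for i in range(n):
            for j in range(n):
                mi, mj = max(neighbors(i, j, n),
                             key=lambda p: fitness(grid[p[0]][p[1]]))
                child = hill_climb(crossover(grid[i][j], grid[mi][mj]))
                if fitness(child) > fitness(grid[i][j]):
                    grid[i][j] = child

Because each cell consults only its local neighborhood, the loop body carries no global state and the sweep parallelizes directly, which is what gives the PGA its efficiency on MIMD hardware.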