===============================================================================
cthomp-thesis
Inductive Learning for Abductive Diagnosis

A new system for learning by induction, called \lab\/, is presented. \lab\/ (Learning for ABduction) learns abductive rules based on a set of training examples. Our goal is to find a small knowledge base which, when used abductively, diagnoses the training examples correctly, in addition to generalizing well to unseen examples. This is in contrast to past systems, which inductively learn rules that are then used deductively. Abduction is particularly well suited to diagnosis, in which we are given a set of symptoms (manifestations) and we want our output to be a set of disorders that explain why the manifestations are present. Each training example is associated with potentially multiple categories, instead of the single category assumed by typical learning systems. Building the knowledge base requires a choice among multiple possibilities, and the number of possibilities grows exponentially with the number of training examples. One method of choosing the best knowledge base is described and implemented. The final system is experimentally evaluated, using data from the domain of diagnosing brain damage due to stroke. It is compared to other learning systems and a knowledge base produced by an expert. The results are promising: the learned rule base is simpler than both the expert knowledge base and the rules learned by one of the other systems, and it predicts which areas are damaged more accurately than all of the other systems as well as the expert knowledge base.
===============================================================================
baffes-proposal
Learning to Model Students: Using Theory Refinement to Detect Misconceptions
Paul T. Baffes

A new student modeling system called ASSERT is described which uses domain-independent learning algorithms to model unique student errors and to automatically construct bug libraries. ASSERT consists of two learning phases. The first is an application of theory refinement techniques for constructing student models from a correct theory of the domain being tutored. The second learning cycle automatically constructs the bug library by extracting common refinements from multiple student models, which are then used to bias future modeling efforts. Initial experimental data will be presented which suggests that ASSERT is a more effective modeling system than other induction techniques previously explored for student modeling, and that the automatic bug library construction significantly enhances subsequent modeling efforts.
===============================================================================
zelle-proposal
Learning Search-Control Heuristics for Logic Programs: Applications to Speedup Learning and Language Acquisition
John M. Zelle

This paper presents a general framework, learning search-control heuristics for logic programs, which can be used to improve both the efficiency and accuracy of knowledge-based systems expressed as definite-clause logic programs. The approach combines techniques of explanation-based learning and recent advances in inductive logic programming to learn clause-selection heuristics that guide program execution. Two specific applications of this framework are detailed: dynamic optimization of Prolog programs (improving efficiency) and natural language acquisition (improving accuracy).
In the area of program optimization, a prototype system, DOLPHIN, is able to transform some intractable specifications into polynomial-time algorithms, and outperforms competing approaches in several benchmark speedup domains. A prototype language acquisition system, CHILL, is also described. It is capable of automatically acquiring semantic grammars, which uniformly incorporate syntactic and semantic constraints to parse sentences into case-role representations. Initial experiments show that this approach is able to construct accurate parsers which generalize well to novel sentences and significantly outperform previous approaches to learning case-role mapping based on connectionist techniques. Planned extensions of the general framework and the specific applications, as well as plans for further evaluation, are also discussed.
===============================================================================
speedup-ijcai93
Combining FOIL and EBG to Speed-up Logic Programs
John M. Zelle and Raymond J. Mooney
Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, Chambery, France, 1993 (IJCAI-93).

This paper presents an algorithm that combines traditional EBL techniques and recent developments in inductive logic programming to learn effective clause selection rules for Prolog programs. When these control rules are incorporated into the original program, significant speed-up may be achieved. The algorithm is shown to be an improvement over competing EBL approaches in several domains. Additionally, the algorithm is capable of automatically transforming some intractable algorithms into ones that run in polynomial time.
===============================================================================
neither-ijcai93
Symbolic Revision of Theories with M-of-N Rules
Paul T. Baffes and Raymond J. Mooney
Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, Chambery, France, 1993 (IJCAI-93).

This paper presents a major revision of the EITHER propositional theory refinement system. Two issues are discussed. First, we show how run-time efficiency can be greatly improved by changing from an exhaustive scheme for computing repairs to an iterative greedy method. Second, we show how to extend EITHER to refine M-of-N rules. The resulting algorithm, NEITHER (New EITHER), is more than an order of magnitude faster and produces significantly more accurate results with theories that fit the M-of-N format. To demonstrate the advantages of NEITHER, we present preliminary experimental results comparing it to EITHER and various other systems on refining the DNA promoter domain theory.
===============================================================================
chill-aaai93
Learning Semantic Grammars with Constructive Inductive Logic Programming
John M. Zelle and Raymond J. Mooney
Proceedings of the Eleventh National Conference of the American Association for Artificial Intelligence (AAAI-93).

Automating the construction of semantic grammars is a difficult and interesting problem for machine learning. This paper shows how the semantic-grammar acquisition problem can be viewed as the learning of search-control heuristics in a logic program. Appropriate control rules are learned using a new first-order induction algorithm that automatically invents useful syntactic and semantic categories. Empirical results show that the learned parsers generalize well to novel sentences and out-perform previous approaches based on connectionist techniques.
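A rough sense of what learning clause-selection control rules means in the abstracts above can be given with a small sketch. The Python fragment below is an invented illustration, not the DOLPHIN or CHILL implementation: a tiny propositional backward chainer tries the clauses for each goal in the order given by a selection heuristic, and the table of scores stands in for control rules that a learner might induce from successful proof traces. All clause names, facts, and scores are made up for the example.

    # Invented sketch of clause-selection control, not the actual DOLPHIN/CHILL code.
    # A tiny propositional backward chainer; the bodies for each goal are tried in
    # the order given by a selection heuristic, and clause expansions are counted.

    CLAUSES = {                       # head -> list of alternative bodies (made up)
        "goal": [["c"], ["a", "b"]],  # goal :- c.   goal :- a, b.
        "a":    [["d"], ["e"]],
        "b":    [["f"]],
        "c":    [["unprovable"]],
    }
    FACTS = {"d", "f"}

    # Scores a control-rule learner might have induced from successful proof traces
    # (hand-coded here purely for illustration): prefer the second clause for "goal".
    LEARNED_SCORE = {("goal", 0): 0.0, ("goal", 1): 1.0}

    def written_order(goal, bodies):
        return list(bodies)           # try clauses in the order they were written

    def learned_order(goal, bodies):
        ranked = sorted(enumerate(bodies),
                        key=lambda p: -LEARNED_SCORE.get((goal, p[0]), 0.5))
        return [body for _, body in ranked]

    def prove(literal, order, stats):
        """Return True if `literal` is derivable; count clause expansions in stats."""
        stats["expansions"] += 1
        if literal in FACTS:
            return True
        for body in order(literal, CLAUSES.get(literal, [])):
            if all(prove(lit, order, stats) for lit in body):
                return True
        return False

    for name, order in [("written order", written_order), ("learned order", learned_order)]:
        stats = {"expansions": 0}
        print(name, prove("goal", order, stats), stats)
        # The written order explores the dead-end clause first; the learned
        # heuristic avoids it, so the proof needs fewer expansions.

Run as written, the naive order expands seven goals while the learned order expands five; this is only the flavor of the effect, since the papers report their speedups on much larger logic programs.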
===============================================================================
conn-sci
Combining Connectionist and Symbolic Learning to Refine Certainty-Factor Rule-Bases
J. Jeffrey Mahoney and Raymond J. Mooney
To appear in Connection Science journal's special issue on Architectures for Integrating Neural and Symbolic Processing.

This paper describes Rapture --- a system for revising probabilistic knowledge bases that combines connectionist and symbolic learning methods. Rapture uses a modified version of backpropagation to refine the certainty factors of a Mycin-style rule base and it uses ID3's information gain heuristic to add new rules. Results on refining three actual expert knowledge bases demonstrate that this combined approach generally performs better than previous methods.
===============================================================================
forte
Refinement of First-Order Horn-Clause Domain Theories
Bradley L. Richards and Raymond J. Mooney
Submitted for journal publication.

Knowledge acquisition is a difficult and time-consuming task, and as error-prone as any human activity. The task of automatically improving an existing knowledge base using learning methods is addressed by a new class of systems performing {\it theory refinement}. Until recently, such systems were limited to propositional theories. This paper presents a system, {\sc Forte} (First-Order Revision of Theories from Examples), for refining first-order Horn-clause theories. Moving to a first-order representation opens many new problem areas, such as logic program debugging and qualitative modelling, that are beyond the reach of propositional systems. {\sc Forte} uses a hill-climbing approach to revise theories. It identifies possible errors in the theory and calls on a library of operators to develop possible revisions. The best revision is implemented, and the process repeats until no further revisions are possible. Operators are drawn from a variety of sources, including propositional theory refinement, first-order induction, and inverse resolution. {\sc Forte} has been tested in several domains including logic programming and qualitative modelling.
===============================================================================
cnf
Encouraging Experimental Results on Learning CNF
Raymond J. Mooney
To appear in Machine Learning.

This paper presents results comparing three inductive learning systems using different representations for concepts, namely: CNF formulae, DNF formulae, and decision trees. The CNF learner performs surprisingly well. Results on five natural data sets show that it frequently trains faster and produces more accurate and simpler concepts. The probable explanation for its superior performance is that the other systems are more susceptible to the replication problem.
===============================================================================
brace-proposal
Belief Revision in the Context of Abductive Explanation
Siddarth Subramanian
Appears as Technical Report AI92-179, Artificial Intelligence Lab, University of Texas at Austin, March 1991.

This proposal presents an approach to explanation that incorporates the paradigms of belief revision and abduction. We present an algorithm that combines these techniques and a system called {\sc Brace} that is a preliminary implementation of this algorithm. We show the applicability of the {\sc Brace} approach to a wide range of domains including scientific discovery, device diagnosis and plan recognition.
Finally, we describe our proposals for a new implementation, new application domains for our system and extensions to this approach.
===============================================================================
accel-journal
A First-Order Horn-Clause Abductive System and Its Use in Plan Recognition and Diagnosis
Hwee Tou Ng and Raymond J. Mooney
Submitted for journal publication.

A diverse set of intelligent activities, including natural language understanding and diagnosis, requires the ability to construct explanations for observed phenomena. In this paper, we view explanation as {\em abduction}, where an abductive explanation is a consistent set of assumptions which, together with background knowledge, logically entails a set of observations. We have successfully built a domain-independent system, {\sc Accel}, in which knowledge about a variety of domains is uniformly encoded in first-order Horn-clause axioms. A general-purpose abduction algorithm, {\sc AAA}, efficiently constructs explanations in the various domains by caching partial explanations to avoid redundant work. Empirical results show that caching of partial explanations can achieve more than an order of magnitude speedup in run time. We have applied our abductive system to two general tasks: plan recognition in text understanding, and diagnosis of medical diseases, logic circuits, and dynamic systems. The results indicate that {\sc Accel} is a general-purpose system capable of plan recognition and diagnosis, yet efficient enough to be of practical utility.
===============================================================================
kr92
Abductive Plan Recognition and Diagnosis: A Comprehensive Empirical Evaluation
Hwee Tou Ng and Raymond J. Mooney
Appears in the Proceedings of the Third International Conference on Principles of Knowledge Representation and Reasoning, pp. 499-508, Cambridge, MA, October 1992.

While it has been realized for quite some time within AI that abduction is a general model of explanation for a variety of tasks, there have been no empirical investigations into the practical feasibility of a general, logic-based abductive approach to explanation. In this paper we present extensive empirical results on applying a general abductive system, {\sc Accel}, to moderately complex problems in plan recognition and diagnosis. In plan recognition, {\sc Accel} has been tested on 50 short narrative texts, inferring characters' plans from actions described in a text. In medical diagnosis, {\sc Accel} has diagnosed 50 real-world patient cases involving brain damage due to stroke (previously addressed by set-covering methods). {\sc Accel} also uses abduction to accomplish model-based diagnosis of logic circuits (a full adder) and continuous dynamic systems (a temperature controller and the water balance system of the human kidney). The results indicate that general purpose abduction is an effective and efficient mechanism for solving problems in plan recognition and diagnosis.
===============================================================================
aaai92-misq
Automatic Abduction of Qualitative Models
Bradley L. Richards, Ina Kraan, and Benjamin J. Kuipers
Appears in the Proceedings of the Tenth National Conference on Artificial Intelligence, San Jose, CA, July 1992.

We describe a method of automatically abducing qualitative models from descriptions of behaviors. We generate, from either quantitative or qualitative data, models in the form of qualitative differential equations suitable for use by QSIM.
Constraints are generated and filtered both by comparison with the input behaviors and by dimensional analysis. If the user provides complete information on the input behaviors and the dimensions of the input variables, the resulting model is unique, maximally constrained, and guaranteed to reproduce the input behaviors. If the user provides incomplete information, our method will still generate a model which reproduces the input behaviors, but the model may no longer be unique. Incompleteness can take several forms: missing dimensions, values of variables, or entire variables.
===============================================================================
aaai92-relpath
Learning Relations by Pathfinding
Bradley L. Richards and Raymond J. Mooney
Appears in the Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 50-55, San Jose, CA, July 1992.

First-order learning systems (e.g., FOIL, FOCL, FORTE) generally rely on hill-climbing heuristics in order to avoid the combinatorial explosion inherent in learning first-order concepts. However, hill-climbing leaves these systems vulnerable to local maxima and local plateaus. We present a method, called relational pathfinding, which has proven highly effective in escaping local maxima and crossing local plateaus. We present our algorithm and provide learning results in two domains: family relationships and qualitative model building.
===============================================================================
mlw92-speedup
Speeding-up Logic Programs by Combining EBG and FOIL
John M. Zelle and Raymond J. Mooney
Appears in the Proceedings of the 1992 Machine Learning Workshop on Knowledge Compilation and Speedup Learning, Aberdeen, Scotland, July 1992.

This paper presents an algorithm that combines traditional EBL techniques and recent developments in inductive logic programming to learn effective clause selection rules for Prolog programs. When these control rules are incorporated into the original program, significant speed-up may be achieved. The algorithm not only produces EBL-like speed-up of problem solvers, but is also capable of automatically transforming some intractable algorithms into ones that run in polynomial time.
===============================================================================
mlw92-integrated
Combining Symbolic and Neural Learning to Revise Probabilistic Theories
J. Jeffrey Mahoney & Raymond J. Mooney
Appears in the Proceedings of the 1992 Machine Learning Workshop on Integrated Learning in Real Domains, Aberdeen, Scotland, July 1992.

This paper describes {\sc Rapture} --- a system for revising probabilistic theories that combines symbolic and neural-network learning methods. {\sc Rapture} uses a modified version of backpropagation to refine the certainty factors of a Mycin-style rule-base and it uses ID3's information gain heuristic to add new rules. Results on two real-world domains demonstrate that this combined approach performs as well as or better than previous methods.
===============================================================================
ijcnn-92
Growing Layers of Perceptrons: Introducing the Extentron Algorithm
Paul T. Baffes and John M. Zelle
Appears in the Proceedings of the 1992 International Joint Conference on Neural Networks, Baltimore, Maryland, June 7-11.

The ideas presented here are based on two observations of perceptrons: (1) when the perceptron learning algorithm cycles among hyperplanes, the hyperplanes may be compared to select one that gives a best {\em split\/} of the examples, and (2) it is always possible for the perceptron to build a hyperplane that separates {\em at least one\/} example from all the rest. We describe the Extentron, which grows multi-layer networks capable of distinguishing NON-LINEARLY-SEPARABLE data using the simple perceptron rule for linear threshold units. The resulting algorithm is simple, very fast, scales well to large problems, retains the convergence properties of the perceptron, and can be completely specified using only two parameters. Results are presented comparing the Extentron to other neural network paradigms and to symbolic learning systems.
===============================================================================
cogsci-92
Using Theory Revision to Model Students and Acquire Stereotypical Errors
Paul T. Baffes and Raymond J. Mooney
Appears in the Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society, pp. 617-622, Bloomington, IN, July 1992.

Student modeling has been identified as an important component of the long-term development of Intelligent Computer-Aided Instruction (ICAI) systems. Two basic approaches have evolved to model student misconceptions. One uses a static, predefined library of user bugs which contains the misconceptions modeled by the system. The other uses induction to learn student misconceptions from scratch. Here, we present a third approach that uses a machine learning technique called theory revision. Using theory revision allows the system to automatically construct a bug library for use in modeling while retaining the flexibility to address novel errors.
===============================================================================
threvpac
A Preliminary PAC Analysis of Theory Revision
Raymond J. Mooney
March, 1992
To appear in Computational Learning Theory and Natural Learning Systems, Vol. 3, T. Petsche, S. Judd, and S. Hanson, Eds., MIT Press.

This paper presents a preliminary analysis of the sample complexity of theory revision within the framework of PAC (Probably Approximately Correct) learnability theory. By formalizing the notion that the initial theory is ``close'' to the correct theory, we show that the sample complexity of an optimal propositional Horn-clause theory revision algorithm is $O((\ln(1/\delta) + d \ln(s_0 + d + n))/\epsilon)$, where $d$ is the {\em syntactic distance} between the initial and correct theories, $s_0$ is the size of the initial theory, $n$ is the number of observable features, and $\epsilon$ and $\delta$ are the standard PAC error and probability bounds. The paper also discusses the problems raised by the computational complexity of theory revision.
===============================================================================
ilp-92
Automated Debugging of Logic Programs via Theory Revision
Raymond J. Mooney & Bradley L. Richards
February, 1992
Appears in the Proceedings of the Second International Workshop on Inductive Logic Programming, Tokyo, Japan, June 1992.

This paper presents results on using a theory revision system to automatically debug logic programs. {\sc Forte} is a recently developed system for revising function-free Horn-clause theories.
Given a theory and a set of training examples, it performs a hill-climbing search in an attempt to minimally modify the theory to correctly classify all of the examples. {\sc Forte} makes use of methods from propositional theory revision, Horn-clause induction ({\sc Foil}), and inverse resolution. The system has been successfully used to debug logic programs written by undergraduate students for a programming languages course.
===============================================================================
spring-symp-92
Batch versus Incremental Theory Refinement
Raymond J. Mooney
Appears in the Proceedings of the AAAI Spring Symposium on Knowledge Assimilation, Stanford, CA, March 1992.

Most existing theory refinement systems are not incremental. However, any theory refinement system whose input and output theories are compatible can be used to incrementally assimilate data into an evolving theory. This is done by continually feeding its revised theory back in as its input theory. An incremental batch approach, in which the system assimilates a batch of examples at each step, seems most appropriate for existing theory revision systems. Experimental results with the {\sc Either} theory refinement system demonstrate that this approach frequently increases efficiency without significantly decreasing the accuracy or the simplicity of the resulting theory. However, if the system produces bad initial changes to the theory based on only a small amount of data, these bad revisions can ``snowball'' and result in an overall decrease in performance.
===============================================================================
ml4-chapter
A Multistrategy Approach to Theory Refinement
Raymond J. Mooney & Dirk Ourston
January, 1992
Revised version of a paper appearing in the Proceedings of the International Workshop on Multistrategy Learning, pp. 207-214, Harpers Ferry, West Virginia, November, 1991. To appear in Volume 4 of the Machine Learning book series from Morgan Kaufmann, edited by R. S. Michalski & G. Tecuci.

This chapter describes a multistrategy system that employs independent modules for deductive, abductive, and inductive reasoning to revise an arbitrarily incorrect propositional Horn-clause domain theory to fit a set of preclassified training instances. By combining such diverse methods, {\sc Either} is able to handle a wider range of imperfect theories than other theory revision systems while guaranteeing that the revised theory will be consistent with the training data. {\sc Either} has successfully revised two actual expert theories, one in molecular biology and one in plant pathology. The results confirm the hypothesis that using a multistrategy system to learn from both theory and data gives better results than using either theory or data alone.
===============================================================================
category-chapter
Integrating Theory and Data in Category Learning
Raymond J. Mooney
December, 1991
Revised version of a paper presented at the Interfaces Conference on Categorization and Category Learning by Humans and Machines, October 11-12, 1991, Texas Tech University, Lubbock, TX. To appear in: Acquisition, Representation, and Processing of Categories and Concepts: The Contribution of Exemplars and Theories, edited by G. Nakamura & R. Taraban, Academic Press.

Recent results in both machine learning and cognitive psychology demonstrate that effective category learning involves an integration of theory and data.
First, theories can bias induction, affecting what category definitions are extracted from a set of examples. Second, conflicting data can cause theories to be revised. Third, theories can alter the representation of data through feature formation. This chapter reviews two machine learning systems that attempt to integrate theory and data in one or more of these ways. IOU uses a domain theory to acquire part of a concept definition and to focus induction on the unexplained aspects of the data. {\sc Either} uses data to revise an imperfect theory and uses theory to add abstract features to the data. Recent psychological experiments reveal that these machine learning systems exhibit several important aspects of human category learning. Specifically, IOU has been used to successfully model some recent experimental results on the effect of functional knowledge on category learning.
===============================================================================
either-aij
Theory Refinement Combining Analytical and Empirical Methods
Dirk Ourston and Raymond J. Mooney
November, 1991.
To appear in Artificial Intelligence.

This article describes a comprehensive approach to automatic theory revision. Given an imperfect theory, the approach combines explanation attempts for incorrectly classified examples in order to identify the failing portions of the theory. For each theory fault, correlated subsets of the examples are used to inductively generate a correction. Because the corrections are {\em focused}, they tend to preserve the structure of the original theory. Because the system starts with an approximate domain theory, in general fewer training examples are required to attain a given level of performance (classification accuracy) compared to a purely empirical system. The approach applies to classification systems employing a propositional Horn-clause theory. The system has been tested in a variety of application domains, and results are presented for problems in the domains of molecular biology and plant disease diagnosis.
===============================================================================
iou-mlj
Induction Over the Unexplained: Using Overly-General Domain Theories to Aid Concept Learning
Raymond J. Mooney
November, 1991
To appear in the Machine Learning Journal.

This paper describes and evaluates an approach to combining empirical and explanation-based learning called Induction Over the Unexplained (IOU). IOU is intended for learning concepts that can be partially explained by an overly-general domain theory. An eclectic evaluation of the method is presented which includes results from all three major approaches: empirical, theoretical, and psychological. Empirical results show that IOU is effective at refining overly-general domain theories and that it learns more accurate concepts from fewer examples than a purely empirical approach. The application of theoretical results from PAC learnability theory explains why IOU requires fewer examples. IOU is also shown to be able to model psychological data demonstrating the effect of background knowledge on human learning.
===============================================================================
aaai91
An Efficient First-Order Horn-Clause Abduction System Based on the ATMS
Hwee Tou Ng and Raymond J. Mooney
Appears in the Proceedings of the Ninth National Conference on Artificial Intelligence, pages 494-499, Anaheim, CA, July 1991.

This paper presents an algorithm for first-order Horn-clause abduction that uses an ATMS to avoid redundant computation. This algorithm is either more efficient or more general than any other previous abduction algorithm. Since computing all minimal abductive explanations is intractable, we also present a heuristic version of the algorithm that uses beam search to compute a subset of the simplest explanations. We present empirical results on a broad range of abduction problems from text understanding, plan recognition, and device diagnosis which demonstrate that our algorithm is at least an order of magnitude faster than an alternative abduction algorithm that does not use an ATMS.
===============================================================================
ml91-td
Improving Shared Rules in Multiple Category Domain Theories
Dirk Ourston and Raymond J. Mooney
Appears in the Proceedings of the Eighth International Machine Learning Workshop, pp. 534-538, Evanston, IL, June 1991.

This paper presents an approach to improving the classification performance of a multiple category theory by correcting intermediate rules which are shared among the categories. Using this technique, the performance of a theory in one category can be improved through training in an entirely different category. Examples of the technique are presented and experimental results are given.
===============================================================================
ml91-ci
Constructive Induction in Theory Refinement
Raymond J. Mooney and Dirk Ourston
Appears in the Proceedings of the Eighth International Machine Learning Workshop, pp. 178-182, Evanston, IL, June 1991.

This paper presents constructive induction techniques recently added to the EITHER theory refinement system. These additions allow EITHER to handle arbitrary gaps at the ``top,'' ``middle,'' and/or ``bottom'' of an incomplete domain theory. {\it Intermediate concept utilization} employs existing rules in the theory to derive higher-level features for use in induction. {\it Intermediate concept creation} employs inverse resolution to introduce new intermediate concepts in order to fill gaps in a theory that span multiple levels. These revisions allow EITHER to make use of imperfect domain theories in the ways typical of previous work in both constructive induction and theory refinement. As a result, EITHER is able to handle a wider range of theory imperfections than does any other existing theory refinement system.
===============================================================================
either-noise
Theory Refinement with Noisy Data
Raymond J. Mooney and Dirk Ourston
Appears as Technical Report AI 91-153, Artificial Intelligence Lab, University of Texas at Austin, March 1991.

This paper presents a method for revising an approximate domain theory based on noisy data. The basic idea is to avoid making changes to the theory that account for only a small amount of data. This method is implemented in the {\small EITHER} propositional Horn-clause theory revision system. The paper presents empirical results on artificially corrupted data to show that this method successfully prevents over-fitting. In other words, when the data is noisy, performance on novel test data is considerably better than when the theory is revised to completely fit the data. When the data is not noisy, noise processing causes no significant degradation in performance. Finally, noise processing increases efficiency and decreases the complexity of the resulting theory.
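The noise-handling idea just described, avoiding revisions that account for only a small amount of data, can be pictured with a short sketch. The following Python fragment is an invented illustration rather than the EITHER code: a candidate revision survives only if it covers more than a threshold fraction of the currently misclassified examples, and the survivors are returned best-supported first. The revision names, example counts, and threshold are all made up.

    # Invented sketch of noise-tolerant revision filtering (not the EITHER code):
    # keep a candidate revision only if it accounts for more than `threshold` of
    # the examples the current theory misclassifies.

    def filter_revisions(candidates, misclassified, threshold=0.05):
        """candidates: iterable of (revision, covered) pairs, where `covered` is the
        set of example ids the revision would fix; misclassified: set of example ids
        the current theory gets wrong.  Returns surviving revisions, best first."""
        kept = []
        for revision, covered in candidates:
            support = len(covered & misclassified) / max(1, len(misclassified))
            if support > threshold:
                kept.append((support, revision))
        kept.sort(reverse=True)
        return [revision for _, revision in kept]

    # Made-up example: one revision explains 40% of the errors, another only 2%.
    errors = set(range(50))
    candidates = [("add-antecedent", set(range(20))), ("delete-rule", {3})]
    print(filter_revisions(candidates, errors, threshold=0.05))  # ['add-antecedent']

Thresholding on support is just one simple way to realize the idea; the abstract does not specify the exact criterion the system uses.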
===============================================================================
aaai90
On the Role of Coherence in Abductive Explanation
Hwee Tou Ng and Raymond J. Mooney
Appears in the Proceedings of the Eighth National Conference on Artificial Intelligence, pages 337-342, Boston, MA, 1990.

Abduction is an important inference process underlying much of intelligent human activity, including text understanding, plan recognition, disease diagnosis, and physical device diagnosis. In this paper, we describe some problems encountered using abduction to understand text, and present some solutions to overcome these problems. The solutions we propose center on the use of a different criterion, called {\em explanatory coherence}, as the primary measure to evaluate the quality of an explanation. In addition, explanatory coherence plays an important role in the construction of explanations, both in determining the appropriate level of specificity of a preferred explanation, and in guiding the heuristic search to efficiently compute explanations of sufficiently high quality.
===============================================================================
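To make the notion of preferring coherent explanations concrete, here is a small, invented scoring sketch in Python; it is not the explanatory-coherence metric defined in the aaai90 paper above, only an illustration of ranking candidate abductive explanations by how well their assumptions tie the observations together rather than by simplicity alone. The observation and assumption names are made up.

    # Invented illustration of a coherence-style score for ranking candidate
    # abductive explanations (not the metric defined in the paper above).
    # An explanation maps each observation to the assumption(s) used to explain it;
    # the score rewards assumptions that connect several observations and
    # penalizes the total number of assumptions.

    def coherence_score(explanation):
        """explanation: dict mapping observation -> set of assumptions explaining it."""
        assumptions = set().union(*explanation.values())
        obs = list(explanation)
        # Count pairs of distinct observations that share at least one assumption.
        shared = sum(
            1
            for i, o1 in enumerate(obs)
            for o2 in obs[i + 1:]
            if explanation[o1] & explanation[o2]
        )
        return shared - len(assumptions)

    # Two candidate explanations for the same three observations (made-up data):
    scattered = {"obs1": {"a1"}, "obs2": {"a2"}, "obs3": {"a3"}}
    connected = {"obs1": {"plan"}, "obs2": {"plan"}, "obs3": {"plan", "a4"}}

    best = max([scattered, connected], key=coherence_score)
    print(coherence_score(scattered), coherence_score(connected), best is connected)

Under this toy score, the explanation that attributes all three observations to a single underlying plan wins over the one that posits an unrelated assumption per observation, which is the qualitative behavior the abstract describes.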