Inductive logic programming [24] is situated at the intersection of machine learning or data mining on the one hand, and logic programming on the other. It shares with the former fields the goal of finding patterns in data, patterns that can be used to build predictive models or to gain insight into the data. With logic programming it shares the use of clausal first-order logic as a representation language for both data and hypotheses. In the remainder of this text we will use some basic notions from logic programming, such as literals, conjunctive queries, and variable substitutions. We will use Prolog notation throughout the paper. For an introduction to Prolog and logic programming, see [10].
Inductive logic programming can be used for many different purposes, and the problem statements found in ILP papers consequently vary. In this article we consider the so-called learning from interpretations setting [16,14]. It has been argued elsewhere that this setting, while slightly less powerful than the standard ILP setting (it has problems with, e.g., learning recursive predicates), is sufficient for most practical purposes and scales up better [6].
We formulate the learning task in such a way that it covers a number of different problem statements. More specifically, we consider the problem of determining, for a set of conjunctive queries, for which instantiations of certain variables each query succeeds. These variables are called key variables, and a grounding substitution for them is called a key instantiation. The intuition is that an example in the learning task is uniquely identified by a single key instantiation.
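As a small illustration (the parent/2 predicate and the constants are invented for this sketch), consider the conjunctive query

?- parent(X, Z), parent(Z, Y).

with key variables X and Y. The grounding substitution {X/jef, Y/an} is then one key instantiation; it identifies the example that asks whether jef is a grandparent of an (Z remains existentially quantified).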
The link with ILP systems that learn clauses is then as follows.
The search performed by an ILP system is directed by regularly evaluating candidate clauses. Let us denote such a candidate clause by $h(\bar{X}) \leftarrow b(\bar{X}, \bar{Y})$, where $\bar{X}$ represents a vector of variables appearing in the head of the clause and $\bar{Y}$ represents additional variables that occur in the body. We assume that the head is a single literal and that a list of examples is given, where each example is of the form $h(\bar{X})\theta$ with $\theta$ a substitution that grounds $h(\bar{X})$. Examples may be labelled (e.g., as positive or negative), but this is not essential in our setting. While an example can be represented as a fact $h(\bar{X})\theta$ when learning definite Horn clauses, we can also consider it just a tuple $\bar{X}\theta$. Both notations will be used in this paper.
Intuitively, when positive and negative examples are given, one wants to find a clause that covers as many positive examples as possible, while covering few or no negatives. Whether a single example $h(\bar{X})\theta$ is covered by the clause can be determined by running the query $\leftarrow b(\bar{X}, \bar{Y})\theta$. In other words, evaluating a clause boils down to running a number of queries consisting of the body of the clause. For simplicity of notation, we will often denote a conjunctive query $\leftarrow b(\bar{X}, \bar{Y})$ by just the conjunction (without the $\leftarrow$ symbol).
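As a minimal sketch of this coverage test (the clause $h(X) \leftarrow b(X, Y)$, the facts and the constants are all invented for illustration):

b(a, c).
b(d, e).

% The example h(a) is covered iff the body query succeeds with X bound to a:
?- X = a, b(X, Y).
% succeeds (with Y = c), so h(a) is covered; for X = f the query would fail.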
In some less typical ILP settings, the ILP algorithm does not search for Horn clauses but rather for general clauses, e.g., CLAUDIEN [15], or for frequent patterns that can be expressed as conjunctive queries, e.g., WARMR [18]. These settings can be handled by our approach as well: all that is needed is a mapping from hypotheses to queries that allows these hypotheses to be evaluated. Such a mapping is defined by [15] for CLAUDIEN; for WARMR it is trivial.
Given a set of queries $S$ and a set of examples $E$, the main task is to determine which queries $Q \in S$ cover which examples $e \in E$. We formalise this using the notion of a result set: the result set of a query $Q$ in a deductive database $D$, given a key $K$ (a tuple of variables occurring in $Q$), is $rs(Q, D, K) = \{K\theta \mid Q\theta \text{ succeeds in } D\}$. Similar to the learning from interpretations setting defined in [14], the problem setting can now be stated as: given a set of conjunctive queries $S$, a deductive database $D$ and a key $K$, find for each query $Q \in S$ the result set $rs(Q, D, K)$.
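Operationally, a result set can be computed along the following lines. This is a minimal sketch, not the implementation proposed in this paper: the helper name result_set/3 is invented, the goal Q is assumed to share its variables with the key tuple K, and the database D is assumed to be loaded.

result_set(Q, K, RS) :-
    findall(K, Q, L),   % one copy of the key tuple per solution of Q
    sort(L, RS).        % sort/2 removes duplicates: only existence of success matters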
Consider, for example, the following two clauses:

grandfather(X,Y) :- parent(X,Z), parent(Z,Y), male(X).
grandfather(X,Y) :- parent(X,Z), parent(Z,Y), female(X).
Examples are of the form grandfather($x$, $y$) where $x$ and $y$ are constants; hence each example is uniquely identified by a ground substitution of the tuple $(X, Y)$. So in the above problem setting the set of Prolog queries $S$ consists of

?- parent(X,Z), parent(Z,Y), male(X).
?- parent(X,Z), parent(Z,Y), female(X).

and the key $K$ equals $(X, Y)$. Given a query $Q \in S$, finding all tuples $(x, y)$ for which $(x, y) \in rs(Q, D, K)$ (with $rs$ the result set as defined above) is equivalent to finding which of the grandfather($x$, $y$) facts in the example set are predicted by the clause grandfather(X,Y) :- $Q$.
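To make this concrete, here is a toy database (all facts invented for illustration) together with the result set of the first query, computed with the result_set/3 sketch from above:

parent(jef, paul).
parent(paul, an).
male(jef).
female(rita).

?- result_set((parent(X,Z), parent(Z,Y), male(X)), (X,Y), RS).
% RS = [(jef,an)]: the grandfather facts predicted by the first clause.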
The generality of our problem setting follows from the fact that once it is known which queries succeed for which examples, the statistics and heuristics that typical ILP systems use can be readily obtained. For instance, the frequency of a query is simply the cardinality of its result set, and the number of positive (or negative) examples covered by a clause is the number of positive (or negative) example keys that appear in the result set of its body.
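As a sketch of the latter statistic (coverage/4 is an invented helper; intersection/3 is assumed to be available, as in SWI-Prolog's library(lists), and the positive example keys are assumed to be given as a list):

coverage(Body, Key, PosKeys, N) :-
    result_set(Body, Key, RS),          % key instantiations for which the body succeeds
    intersection(RS, PosKeys, Covered),
    length(Covered, N).                 % number of positive examples covered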
The result sets of all queries can even be computed by a single Prolog program in which each clause body is tagged with a query identifier. After transforming the grandfather/2 clauses into

grandfather((X,Y),I) :- parent(X,Z), parent(Z,Y), male(X), I = 1.
grandfather((X,Y),I) :- parent(X,Z), parent(Z,Y), female(X), I = 2.

the result sets can clearly be computed by collecting, for each grounding substitution under which grandfather((X,Y),I) succeeds, the key instantiation (X,Y) together with the query identifier I.
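With the toy facts from the earlier sketch loaded, this collection step could look as follows (setof/3 also removes duplicate solutions):

?- setof((X,Y)-I, grandfather((X,Y),I), RS).
% RS = [(jef,an)-1]: the first query covers (jef,an), the second covers nothing.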
In practice, it is natural to compute the result set using a double loop: one over the examples and one over the queries, and one can choose which of the two is the outer loop. Both the ``examples in outer loop'' and the ``queries in outer loop'' strategies have been used in data mining systems; in the context of decision trees, see for instance [25] and [22]. We shall see further on that the redundancy removal approach we propose uses the ``examples in outer loop'' strategy. In both approaches, however, given a query and a key instantiation, we are interested only in whether the query succeeds for that key instantiation. This implies that after a particular query has succeeded on an example, its execution can be stopped.
In other words: computing the result set defined above boils down to evaluating each query on each example, where we are only interested in the existence of success for each such evaluation. Computing more than one solution for one query on one example is unnecessary.
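The following sketch combines both observations for the ``examples in outer loop'' strategy; example/1, query/3 and the pair representation are assumed encodings, not the actual data structures used later in this paper. once/1 stops each body query after its first solution:

example((jef, an)).
example((rita, an)).

query(1, (X,Y), (parent(X,Z), parent(Z,Y), male(X))).
query(2, (X,Y), (parent(X,Z), parent(Z,Y), female(X))).

result_pairs(Pairs) :-
    findall(I-E,
            ( example(E),          % outer loop: examples
              query(I, E, Body),   % inner loop: queries, with the key bound to E
              once(Body)           % stop after the first success
            ),
            Pairs).

% With the toy database above: ?- result_pairs(P). yields P = [1-(jef,an)].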