Once zero pronouns have been detected, they are resolved in
the subsequent anaphora-resolution module (explained in the
following subsection). This module uses an algorithm that combines
different kinds of knowledge by distinguishing between
constraints and preferences [Ferrández et al., 1999; Palomar et al., 2001].
The set of constraints and preferences presents two basic
differences between zero-pronoun and pronominal anaphora
resolution:
Zero-pronoun resolution has the constraint of agreement
only in person and number, whereas pronominal anaphora resolution
also requires gender agreement.
Two new preferences are used to resolve zero pronouns: (a) preference is given to
candidates in the same sentence as the anaphor that have also been the solution of a zero pronoun in that
sentence, and (b) when the zero pronoun has gender
information, preference is given to candidates that agree in gender.
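The distinction between constraints and preferences can be illustrated with a minimal Python sketch. It is not the AGIR implementation: the `Mention` fields, the rule that a preference narrows the candidate pool only when at least one candidate satisfies it, and the final proximity tie-breaker are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Mention:
    text: str
    person: int
    number: str                      # "sg" or "pl"
    gender: Optional[str]            # "m", "f", or None when unknown
    sentence: int                    # index of the sentence containing the mention
    position: int                    # token offset, used only as a tie-breaker (assumption)
    solved_zero_here: bool = False   # already the solution of a zero pronoun in its sentence


Preference = Callable[[Mention, Mention], bool]


def passes_constraints(anaphor: Mention, cand: Mention, is_zero_pronoun: bool) -> bool:
    # Person and number agreement are required for both kinds of anaphor.
    if cand.person != anaphor.person or cand.number != anaphor.number:
        return False
    # Gender is a hard constraint only for overt (pronominal) anaphors.
    if not is_zero_pronoun and cand.gender != anaphor.gender:
        return False
    return True


def same_sentence_zero_solution(anaphor: Mention, cand: Mention) -> bool:
    # Preference (a): same sentence, and already the solution of a zero pronoun there.
    return cand.sentence == anaphor.sentence and cand.solved_zero_here


def gender_agreement(anaphor: Mention, cand: Mention) -> bool:
    # Preference (b): applies only when the zero pronoun carries gender information.
    return anaphor.gender is None or cand.gender == anaphor.gender


def resolve(anaphor: Mention, candidates: List[Mention],
            preferences: List[Preference], is_zero_pronoun: bool = True) -> Optional[Mention]:
    pool = [c for c in candidates if passes_constraints(anaphor, c, is_zero_pronoun)]
    for prefer in preferences:
        narrowed = [c for c in pool if prefer(anaphor, c)]
        if narrowed:                 # a preference never empties the pool (assumption)
            pool = narrowed
    # Hypothetical tie-breaker: the candidate closest to the anaphor.
    return max(pool, key=lambda c: c.position, default=None)


candidates = [
    Mention("Juan", 3, "sg", "m", sentence=1, position=3),
    Mention("María", 3, "sg", "f", sentence=1, position=5),
    Mention("los niños", 3, "pl", "m", sentence=2, position=15),
]
zero = Mention("Ø", 3, "sg", "f", sentence=2, position=20)
result = resolve(zero, candidates, [same_sentence_zero_solution, gender_agreement])
```

In this toy input, the plural candidate is removed by the number constraint, preference (a) matches nothing and is skipped, and preference (b) selects the feminine candidate. Calling `resolve` with `is_zero_pronoun=False` turns gender into a constraint instead.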
To obtain the order of preferences that produces the best
performance in zero-pronoun resolution, we used the training phase
to identify the importance of each kind of knowledge. To do this, we
analyzed the antecedent of each pronoun in the training
corpora and identified its configurational
characteristics with reference to the pronoun (e.g.,
whether the antecedent was a proper noun, whether it was
an indefinite NP, or whether it occupied
the same position with reference to the verb as the anaphor,
before or after). Subsequently, we constructed a
table that showed how often each configurational characteristic
was valid for the solution of a particular pronoun (e.g.,
the solution of a zero pronoun was a proper noun 63% of the time;
for a reflexive pronoun, it was a proper noun 53% of the time).
In this way, we were able to define the different patterns of Spanish
pronoun resolution and apply them in order to obtain the
evaluation results that are presented in this paper. The order of
importance was determined by first sorting the preferences
according to the percentage of each configurational
characteristic; that is, preferences with higher percentages were
considered more important than those with lower percentages. After
several experiments on the training corpora, an optimal order for
each type of anaphora was obtained. Since the texts processed in this
phase come from different genres and different authors, we consider
that the final set of preferences and their order
of application can be used with confidence on any Spanish text.
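The initial ordering step, sorting characteristics by how often they hold for the correct antecedent, can be sketched as follows. The function name, the characteristic labels, and the four toy training instances are invented for illustration and are not data from the corpora; the subsequent experimental refinement of the order is not modeled.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def rank_characteristics(training_antecedents: Iterable[Set[str]]) -> List[Tuple[str, float]]:
    """Return (characteristic, percentage) pairs, highest percentage first.

    Each element of `training_antecedents` is the set of configurational
    characteristics that held for the correct antecedent of one pronoun.
    """
    instances = list(training_antecedents)
    counts = Counter()
    for traits in instances:
        counts.update(set(traits))   # count each characteristic once per instance
    total = len(instances)
    table = {name: 100.0 * n / total for name, n in counts.items()}
    # Higher percentage means more important; ties are broken alphabetically.
    return sorted(table.items(), key=lambda item: (-item[1], item[0]))


# Toy data (hypothetical): characteristics of the correct antecedent of four pronouns.
toy = [
    {"proper_noun", "same_position_wrt_verb"},
    {"proper_noun"},
    {"indefinite_np"},
    {"proper_noun", "same_position_wrt_verb"},
]
ranking = rank_characteristics(toy)
```

A separate table of this kind would be built for each type of pronoun, since the text reports different percentages for, say, zero pronouns and reflexive pronouns.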
After the training, we conducted a blind test over the entire
test corpus; the results are shown in Table 3.
Note that out of 3,126 verbs in these corpora, 1,348
(Table 2) have zero pronouns in
the third person, and these are the ones to be resolved. Table 3
presents a classification
of these third-person zero pronouns
into three categories:
Cataphoric. This category comprises those zero
pronouns whose antecedents, that is, the clause subjects, come after
the verb. For instance, in the Spanish sentence
Ø Compró [un niño] en el
supermercado ([A boy] bought in the
supermarket), the subject, un niño (a boy), appears
after the verb, compró (bought). Such
verbs are quite common in Spanish (P = 53.1%, 716 out of 1,348), as can be seen in
Table 3, and they represent
one of the main difficulties in resolving anaphora in
Spanish: sentence structure is more flexible than in
English. These are intonationally marked sentences, in which
the subject does not occupy its usual position,
that is, before the verb. Cataphoric zero pronouns will not be
resolved in AGIR, since semantic information is needed to
discard the other candidate antecedents and to give preference to those
that appear within the same sentence and clause after the verb.
For example, the sentence Ø Compró un regalo en el
supermercado ([He] bought a present in the
supermarket) has the same syntactic structure as the previous
sentence, i.e., verb, NP, and PP, and the object function of the
NP can only be distinguished from the subject function by means of semantic
knowledge.
Exophoric. This category consists of those zero pronouns
whose antecedents do not appear linguistically in the text (they
refer to items in the external world rather than things referred
to in the text). Exophoric zero pronouns will not be resolved by
the system.
Anaphoric. This category comprises those zero pronouns
whose antecedents are found before the verb. These
pronouns will be resolved by our system.
Table 3 shows the numbers of
cataphoric, exophoric, and anaphoric zero pronouns for each corpus.
For anaphoric pronouns, it also gives the number of pronouns
correctly solved and the resulting precision, P (the number of
pronouns correctly solved divided by the number of solved
pronouns). For example, in the LEXESP corpus, there
are 640 cataphoric, 28 exophoric, and 559 anaphoric zero pronouns.
Of these anaphoric pronouns, 455 were correctly
solved, giving a precision of 81.4%.
Discussion. In zero-pronoun resolution, the following
results have been obtained: LEXESP corpus, P = 81.4%;
BB corpus, P = 81.1%. For the combined corpora, the overall
precision for this task was 81.4% (485 out of 596).
The overall recall, R (the number of pronouns
correctly solved divided by the number of real pronouns),
was 79.1% (485 out of 613).
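The reported percentages follow directly from the counts given in the text. As a quick arithmetic check, the short Python sketch below recomputes them; the helper name `percent` is an assumption, and only the counts quoted above are used.

```python
def percent(correct: int, total: int) -> float:
    """Ratio of correctly solved pronouns to a reference total, as a percentage."""
    return round(100.0 * correct / total, 1)


# Precision: correctly solved / solved pronouns.
lexesp_precision = percent(455, 559)    # LEXESP corpus
overall_precision = percent(485, 596)   # combined corpora

# Recall: correctly solved / real pronouns.
overall_recall = percent(485, 613)      # combined corpora

print(lexesp_precision, overall_precision, overall_recall)  # 81.4 81.4 79.1
```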
From these results, we draw the following conclusions:
There are no meaningful differences between the results
obtained from each corpus.
Errors in the zero-pronoun-resolution stage stem from
different causes: (a) exceptions in the application of preferences
that lead to the selection of an incorrect antecedent as the
solution of the zero pronoun (64% of all errors), (b) the lack of
semantic information (32.4% of all errors), and (c) mistakes in
the POS tagging (3.6% of all errors).
Since the results reported in other works were obtained for
a different language (English), different texts, and different sorts of knowledge
(e.g., Hobbs and Lappin fully parse the text), direct comparisons
are not possible. Therefore, to make such a
comparison, we have implemented some of these approaches in
SUPAR, adapting
them to partial parsing and Spanish texts. Although these
approaches were not proposed for zero pronouns and the comparison
is therefore not fully fair, we have implemented them because that is the
only way to compare our proposal directly with some well-known
anaphora-resolution algorithms.
We have also compared our system with the typical baseline of
proximity preference (i.e., the antecedent that appears closest
to the anaphor is chosen from among those that satisfy the
constraints, namely morphological agreement and syntactic
conditions), with the baseline
presented by Hobbs [Hobbs, 1978], and with Lappin & Leass'
method [Lappin & Leass, 1994]. Moreover, we compared our
proposal with the centering approach by implementing functional centering [Strube & Hahn, 1999]. The
precisions obtained with these different approaches and with AGIR are
shown in Table 4. As
can be seen, the precision obtained with AGIR is higher
than those obtained with the other proposals.
Table 4: Zero-pronoun resolution in Spanish, comparison of AGIR with other approaches
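For reference, the proximity-preference baseline mentioned in the comparison can be sketched in a few lines of Python. This is a simplified illustration, not the implementation evaluated in Table 4: the `Candidate` type and its fields are hypothetical, only person and number agreement are checked, the syntactic conditions are omitted, and candidates are assumed to precede the anaphor.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    text: str
    person: int
    number: str      # "sg" or "pl"
    position: int    # token offset in the text


def proximity_baseline(person: int, number: str, anaphor_position: int,
                       candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Pick the closest preceding candidate that agrees in person and number."""
    compatible = [
        c for c in candidates
        if c.person == person and c.number == number and c.position < anaphor_position
    ]
    # The largest offset below the anaphor is the nearest candidate.
    return max(compatible, key=lambda c: c.position, default=None)


nps = [
    Candidate("María", 3, "sg", 2),
    Candidate("los libros", 3, "pl", 6),
    Candidate("el profesor", 3, "sg", 9),
]
chosen = proximity_baseline(3, "sg", 14, nps)
```

Here the baseline returns the nearest agreeing noun phrase regardless of any other evidence, which is what makes it a useful lower bound for a preference-based system.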