A detailed evaluation of the algorithm for anaphora resolution in
Spanish is given by Palomar et al.
[Palomar, M., et al., 2001]. In this paper, we present the
results of evaluating this task in AGIR over a different
portion of the LEXESP corpus. Furthermore, non-anaphoric
complement pronouns, that is, complement pronouns that appear next
to the preceding indirect object when it has been moved from its
theoretical place after the verb (A Pedro le vi ayer, "I saw Pedro yesterday"), were not resolved,
because this kind of pronoun does not appear in the English
translation. For these reasons, the results of the two works
differ slightly.
After the training phase, the algorithm was evaluated over the
test corpus. In this evaluation, only lexical, morphological, and
syntactic information was used. Table 5 shows the results of this
evaluation.
Table 5: Anaphora resolution in Spanish, evaluation phase

          Comp  P(%)   Ref  P(%)  PPnotPP  P(%)  PPinPP  P(%)  Total  P(%) Total
LEXESP      98  82.6   105  92.4       71  70.4      46  76.1    320        82.2
Table 5 shows the occurrences of
personal pronouns in the LEXESP corpus. The pronoun types
are: Comp (complement personal pronouns), Ref
(reflexive pronouns), PPnotPP (personal pronouns not
included in a prepositional phrase), and PPinPP (personal
pronouns included in a prepositional phrase). For each type, the
precision obtained, P (the number of pronouns correctly resolved
divided by the number of pronouns resolved), is shown. The last two
columns give the total number of personal pronouns and the overall
precision.
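As a consistency check on Table 5, the overall precision can be recomputed from the per-type rows. The sketch below is illustrative only: it assumes that each count in the table is the denominator of the corresponding precision, and it infers the number of correct resolutions per type by rounding, since those per-type counts are not reported. The names TABLE_5, estimated_correct, and overall_precision are hypothetical.

```python
# Rows of Table 5: (pronoun type, number of pronouns, precision in %).
TABLE_5 = [
    ("Comp", 98, 82.6),
    ("Ref", 105, 92.4),
    ("PPnotPP", 71, 70.4),
    ("PPinPP", 46, 76.1),
]


def estimated_correct(count, precision_pct):
    """Infer the number of correct resolutions from a rounded percentage.

    This is an assumption: the per-type correct counts are not reported.
    """
    return round(count * precision_pct / 100)


def overall_precision(rows):
    """Return (total pronouns, estimated correct, overall precision in %)."""
    total = sum(count for _, count, _ in rows)
    correct = sum(estimated_correct(count, pct) for _, count, pct in rows)
    return total, correct, 100 * correct / total


total, correct, precision_pct = overall_precision(TABLE_5)
print(total, correct, round(precision_pct, 1))  # prints: 320 263 82.2
```

Under these assumptions, the recomputed total matches the last two columns of the table.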
Discussion. In pronominal anaphora resolution in
Spanish, we obtained a precision of 82.2% (263 out of 320). The
recall, R (the number of pronouns correctly resolved divided by
the number of real pronouns), was 79% (263 out of
333).
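The two measures can be reproduced directly from the counts quoted above. The following is a minimal sketch; the function and variable names are illustrative and are not part of AGIR.

```python
def precision(correct, resolved):
    """Pronouns correctly resolved divided by pronouns resolved."""
    return correct / resolved


def recall(correct, real):
    """Pronouns correctly resolved divided by pronouns actually present."""
    return correct / real


CORRECT = 263   # pronouns correctly resolved
RESOLVED = 320  # pronouns for which the system proposed an antecedent
REAL = 333      # pronouns actually present in the test corpus

p = precision(CORRECT, RESOLVED)
r = recall(CORRECT, REAL)
print(f"P = {p:.1%}, R = {r:.1%}")  # prints: P = 82.2%, R = 79.0%
```

The difference between the two denominators (333 versus 320) corresponds to the pronouns for which no resolution was produced.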
After analyzing the results, we drew the following
conclusions:
In the resolution of reflexive pronouns, a high precision
(92.4%) was obtained. This high percentage arises
because the antecedent of these pronouns is usually the NP
closest to the pronoun and lies in the same sentence. Therefore,
few errors are produced after preferences are applied.
In analyzing the errors for the remaining pronouns, the
complexity of the LEXESP corpus itself must be taken into account. It
consists of several narrative documents, sometimes in a very
complex style, with long sentences (24.6 words
per sentence on average). This implies a large number of candidates per
anaphor after constraints are applied (16.6 on average).
Errors had three causes:
exceptions in the application of preferences (66.7% of
all errors),
a lack of semantic information (29.8%), and
mistakes in POS tagging (3.5%).
We compared our proposal with the approaches previously
presented in the evaluation of zero-pronoun resolution. As
shown in Table 6, the
precision obtained using AGIR is higher than that of the other proposals.
Table 6: Anaphora resolution in Spanish, comparison of AGIR with other approaches