Elliptical Zero-Subject Constructions (Zero Pronouns)
Spanish allows the pronominal subject of a sentence to be omitted.
These omitted pronouns are usually called zero pronouns. Whereas in
other languages (e.g., Japanese) zero pronouns may appear in either
subject or object position, in Spanish texts they appear only in
subject position.
In MT systems, the correct detection and resolution of zero
pronouns in the source language is crucial if these pronouns are
compulsory in the target language. The following example shows a
Spanish sentence that contains a zero pronoun (marked Ø) and its
English translation with the equivalent compulsory pronoun.
- (S) [Ese hombre] era un boxeador profesional. Ø Perdió
únicamente dos combates.
- (E) [That man] was a professional boxer. He
only lost two fights.
Zero pronouns can also occur in English, although less frequently:
they are usually found in coordinated sentences, where the zero
pronoun usually refers to the subject of the sentence. Zero pronouns
have already been studied in other languages, such as Japanese (with
a resolution percentage of 78% in the work of [Okumura & Tamura, 1996]),
but they had not been studied in Spanish texts before
[Ferrández & Peral, 2000] presented the first algorithm for Spanish
zero-pronoun resolution. In order to translate Spanish zero pronouns
into English, they must first be located in the text (ellipsis
detection) and then resolved (anaphora resolution)
[Peral & Ferrández, 2000b]:
- Zero-pronoun detection. In order to detect zero pronouns,
sentences are first divided into clauses, since the subject can
only appear among the constituents of its own clause. Then, for
each clause whose verb is neither imperative nor impersonal, a
noun phrase (NP) or pronoun that agrees in person and number with
the verb is sought among the clause constituents on the left-hand
side of the verb. If none is found, the subject is taken to have
been omitted.
- Zero-pronoun resolution. After the zero pronoun has been
detected, our computational system inserts a pronoun in the
position from which it was omitted. The inserted pronoun is then
detected and resolved by the subsequent anaphora-resolution
module. Its person and number are obtained from the clause verb.
In Spanish, its gender can sometimes be obtained from the object
when the verb is copulative: in these cases, the subject must
agree in gender and number with its object whenever the object
can have either a masculine or a feminine linguistic form.
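The two steps above can be illustrated with a minimal Python sketch.
It is not the system's actual implementation: the Constituent and
ZeroPronoun classes, the feature names (person, number, gender, mood,
impersonal, copulative, gender_variable) and the function names are
all hypothetical, and the clause is assumed to arrive already
segmented and morphologically tagged. Inserting the pronoun
immediately before the verb is likewise an assumption made for
illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Constituent:
    """Hypothetical tagged clause constituent (all field names are assumptions)."""
    kind: str                        # "NP", "PRON", "VERB", "ADJ", "ADV", ...
    text: str
    person: Optional[int] = None     # 1, 2 or 3
    number: Optional[str] = None     # "sg" or "pl"
    gender: Optional[str] = None     # "m", "f" or None if unknown
    mood: Optional[str] = None       # verbs only, e.g. "imperative"
    impersonal: bool = False         # verbs only
    copulative: bool = False         # verbs only
    gender_variable: bool = False    # word has both masculine and feminine forms


@dataclass
class ZeroPronoun:
    person: Optional[int]
    number: Optional[str]
    gender: Optional[str]


def _verb_index(clause: List[Constituent]) -> Optional[int]:
    return next((i for i, c in enumerate(clause) if c.kind == "VERB"), None)


def detect_zero_pronoun(clause: List[Constituent]) -> Optional[ZeroPronoun]:
    """Return the omitted subject's features, or None if no zero pronoun."""
    idx = _verb_index(clause)
    if idx is None:
        return None
    verb = clause[idx]
    # Imperative and impersonal verbs are excluded from the search.
    if verb.mood == "imperative" or verb.impersonal:
        return None
    # Look left of the verb for an NP or pronoun agreeing in person and number.
    for c in clause[:idx]:
        if (c.kind in ("NP", "PRON")
                and c.person == verb.person and c.number == verb.number):
            return None  # explicit subject found
    # Person and number come from the verb; gender only from a copulative
    # verb's object, and only if that object can vary in gender.
    gender = None
    if verb.copulative:
        for c in clause[idx + 1:]:
            if c.gender_variable and c.gender and c.number == verb.number:
                gender = c.gender
                break
    return ZeroPronoun(verb.person, verb.number, gender)


def insert_zero_pronoun(clause: List[Constituent]) -> List[Constituent]:
    """Return a copy of the clause with a placeholder pronoun before the verb."""
    zp = detect_zero_pronoun(clause)
    if zp is None:
        return list(clause)
    idx = _verb_index(clause)
    pron = Constituent("PRON", "Ø", zp.person, zp.number, zp.gender)
    return clause[:idx] + [pron] + clause[idx:]


# Second clause of the example: "Ø Perdió únicamente dos combates."
clause = [
    Constituent("VERB", "Perdió", person=3, number="sg"),
    Constituent("ADV", "únicamente"),
    Constituent("NP", "dos combates", person=3, number="pl", gender="m"),
]
print(detect_zero_pronoun(clause))
```

For the clause above, the sketch reports a third-person singular zero
pronoun with unknown gender, which is then left for the
anaphora-resolution module to resolve against candidates such as
[Ese hombre].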
Jesus Peral
2002-12-13