The feature set used to train the system is described in Figure 9 and is based on the features described by Ng1996 and Escudero2000ecml. These features represent words, collocations, and POS tags in the local context. Both ``collapsed'' and ``non-collapsed'' functions are used.
Each item in Figure 9 groups several sets of features. Most of them depend on the nearest words (e.g., comprises all possible features defined by the words occurring in each sample at positions , , , , , relative to the ambiguous word). Types denoted with capital letters are based on the ``collapsed'' function form; that is, these features simply recognize an attribute belonging to the training data.
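As an illustration of features that depend on the nearest words, the following minimal sketch extracts the word form and POS tag found at a few offsets relative to the ambiguous word. The function name, the feature-naming scheme, and the default offsets are assumptions made for this example only; they are not the feature definitions of Figure 9.

```python
from typing import Dict, Sequence


def local_context_features(
    tokens: Sequence[str],
    pos_tags: Sequence[str],
    target_index: int,
    offsets: Sequence[int] = (-2, -1, 1, 2),  # hypothetical window, not from the paper
) -> Dict[str, str]:
    """Return word and POS features at the given offsets from the target word."""
    features: Dict[str, str] = {}
    for offset in offsets:
        i = target_index + offset
        # Skip offsets that fall outside the sentence boundaries.
        if 0 <= i < len(tokens):
            features[f"w{offset:+d}"] = tokens[i].lower()
            features[f"p{offset:+d}"] = pos_tags[i]
    return features


if __name__ == "__main__":
    tokens = ["The", "interest", "rate", "rose"]
    tags = ["DT", "NN", "NN", "VBD"]
    # The ambiguous word "interest" is at index 1.
    print(local_context_features(tokens, tags, target_index=1, offsets=(-1, 1)))
```

Each (offset, value) pair becomes one binary feature, so a single item of this kind expands into as many features as there are distinct words or tags observed at that position in the training samples.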
Keyword features (m) are inspired by the work of Ng1996. Nouns are filtered using frequency information for the nouns that co-occur with a particular sense. For example, given a set of 100 examples of interest#4, if the noun bank occurs 10 or more times at any position, a feature is defined for it.
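The noun filtering step can be sketched as a simple frequency count per sense. This is only a sketch under stated assumptions: the function name, the input representation (a sense label paired with the list of nouns in the example), and the use of an absolute count threshold of 10 are taken from the example above or invented for illustration, and the actual selection criterion may differ.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def select_keyword_nouns(
    examples: Iterable[Tuple[str, List[str]]],
    sense: str,
    min_count: int = 10,  # assumed threshold, following the 10-occurrence example
) -> Set[str]:
    """Return the nouns that co-occur with `sense` at least `min_count` times."""
    counts: Counter = Counter()
    for example_sense, nouns in examples:
        if example_sense == sense:
            # Every occurrence counts, regardless of its position in the example.
            counts.update(nouns)
    return {noun for noun, count in counts.items() if count >= min_count}


if __name__ == "__main__":
    # Toy data: 100 examples of interest#4, 10 of which contain the noun "bank".
    data = [("interest#4", ["bank", "rate"])] * 10 + [("interest#4", ["loan"])] * 90
    print(sorted(select_keyword_nouns(data, "interest#4")))
```

Each selected noun then yields one keyword feature for the corresponding sense.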
Moreover, new features have also been defined using other grammatical properties: relationship features () that refer to the grammatical relationship of the ambiguous word (subject, object, complement, ...) and dependency features ( and ) that extract the word related to the ambiguous one through the dependency parse tree.