This experiment was designed to evaluate whether integrating a corpus-based system into a knowledge-based one helps improve the word-sense disambiguation of nouns.
Therefore, ME can help SM by labelling some of the nouns in the context of the target word, that is, by reducing the number of possible senses of those context nouns. In effect, this reduces the search space of the SM method. This ensures that the sense chosen for the target word is the one most closely related to the noun senses labelled by ME.
In this case, we used the nouns from the English lexical-sample task of SENSEVAL-2. ME helps SM by labelling some words from the context of the target word. These words were sense-tagged using the SemCor collection as the learning corpus. We performed a three-fold cross-validation for all nouns with 10 or more occurrences, and selected those nouns that ME disambiguated with high precision, that is, with an accuracy of 90% or more. The classifiers for these nouns were then used to disambiguate the testing data. The total number of different noun classifiers (noun) activated for each target word across the testing corpus is shown in Table 15.
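The selection step described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the data layout (a mapping from each noun to a list of `(features, sense)` pairs), and the fold assignment are hypothetical, and the learner shown is a most-frequent-sense stand-in rather than a maximum entropy classifier. Only the three folds, the minimum of 10 occurrences, and the 90% threshold come from the text.

```python
from collections import Counter

MIN_OCCURRENCES = 10   # nouns with fewer examples are not evaluated (from the text)
MIN_ACCURACY = 0.90    # accuracy required to keep a noun's classifier (from the text)
N_FOLDS = 3            # three-fold cross-validation (from the text)


def most_frequent_sense_trainer(train_examples):
    """Stand-in learner (NOT maximum entropy): always predicts the
    most common sense seen in the training examples."""
    top_sense = Counter(sense for _, sense in train_examples).most_common(1)[0][0]
    return lambda features: top_sense


def cross_validated_accuracy(examples, trainer, n_folds=N_FOLDS):
    """Accuracy over all examples, each predicted by a model trained
    on the other folds. Fold membership is assumed to be index modulo n_folds."""
    correct = 0
    for fold in range(n_folds):
        test = examples[fold::n_folds]
        train = [ex for i, ex in enumerate(examples) if i % n_folds != fold]
        predict = trainer(train)
        correct += sum(predict(features) == sense for features, sense in test)
    return correct / len(examples)


def select_reliable_nouns(corpus, trainer):
    """Return {noun: accuracy} for nouns that pass both thresholds."""
    selected = {}
    for noun, examples in corpus.items():
        if len(examples) < MIN_OCCURRENCES:
            continue
        accuracy = cross_validated_accuracy(examples, trainer)
        if accuracy >= MIN_ACCURACY:
            selected[noun] = accuracy
    return selected
```

In an actual replication, `most_frequent_sense_trainer` would be replaced by a function that trains an ME classifier on SemCor examples and returns its prediction function.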
Next, SM was applied using all of its heuristics to disambiguate the target words of the testing data, but with the advantage of knowing the senses of some of the nouns that formed the context of these target words.
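To make the effect of fixing context senses concrete, the sketch below narrows the candidate senses of context nouns using ME labels and compares the number of sense combinations before and after. The function names, the dictionary layout, and the sense identifiers are hypothetical, and the SM method itself is not implemented here. The sketch shows only the kind of reduced input SM would receive.

```python
from math import prod


def restrict_context_senses(candidates, me_labels):
    """Return a copy of `candidates` in which every noun labelled by ME
    keeps only that sense; unlabelled nouns keep all their senses."""
    restricted = {}
    for noun, senses in candidates.items():
        label = me_labels.get(noun)
        # Accept the ME label only if it is one of the noun's known senses.
        restricted[noun] = [label] if label in senses else list(senses)
    return restricted


def search_space_size(candidates):
    """Number of possible sense combinations over all nouns."""
    return prod(len(senses) for senses in candidates.values())


# Made-up example: one target noun and two context nouns.
candidates = {
    "bank": ["bank#1", "bank#2", "bank#3"],   # target word, left ambiguous
    "money": ["money#1", "money#2"],          # context noun labelled by ME
    "loan": ["loan#1", "loan#2"],             # context noun without a classifier
}
me_labels = {"money": "money#1"}

restricted = restrict_context_senses(candidates, me_labels)
print(search_space_size(candidates), search_space_size(restricted))  # prints "12 6"
```

Only nouns with a selected classifier are narrowed, so the reduction depends on how many such nouns occur in the context of each target word.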
Table 15 shows the precision and recall obtained when SM is applied with and without first applying ME, that is, with and without fixing the senses of the nouns that form the context. A very small but consistent improvement was obtained across the complete test set (3.56% precision and 3.61% recall). Although the improvement is small, this experiment shows empirically that a corpus-based method such as maximum entropy can be integrated to help a knowledge-based system such as the specification marks method.
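For reference, precision and recall in this kind of evaluation are usually computed as in the sketch below, assuming the common convention that precision is taken over the instances the system answered and recall over all instances. The function name and data layout are hypothetical, and the sketch does not reproduce the figures in Table 15.

```python
def precision_recall(gold, predicted):
    """gold: instance id -> correct sense.
    predicted: instance id -> chosen sense (missing or None means unanswered)."""
    attempted = {i: s for i, s in predicted.items() if s is not None and i in gold}
    correct = sum(1 for i, s in attempted.items() if gold[i] == s)
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall
```

Running the same function on the SM output with and without the ME labels gives the two sets of scores that such a comparison needs.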