This algorithm selects the noun phrases (NPs) that occur before an anaphor and places them in a weighted list. Each time an NP appears in a new turn (frequency), its weight is increased; each time it does not appear in a new turn (infrequency), its weight is decreased. The dialogue topic is then determined by salience, i.e., by finding the NP with the heaviest weight (high frequency within a short distance) occurring before the anaphor. To compute this weight, the algorithm uses two coefficients, one for frequency (Cf) and one for infrequency (Ci).
This automatic topic detection method has one advantage over other methods: rather than returning a single topic, it returns a list of topic candidates ordered by salience. This matters for our anaphora resolution system because, if the highest-ranked candidate does not fulfill the relevant constraints, the next-highest candidate can be tested.
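The fallback over ranked candidates can be illustrated with a minimal sketch. The function name `resolve_with_fallback` and the predicate `satisfies_constraints` are hypothetical names introduced here for illustration; the text does not specify which constraints are checked, so the predicate is left to the caller.

```python
from typing import Callable, Optional, Sequence


def resolve_with_fallback(
    ranked_candidates: Sequence[str],
    satisfies_constraints: Callable[[str], bool],
) -> Optional[str]:
    """Return the most salient candidate that passes the constraints.

    `ranked_candidates` is assumed to be ordered from most to least salient.
    `satisfies_constraints` is a hypothetical stand-in for whatever checks
    the anaphora resolution system applies to a candidate.
    """
    for candidate in ranked_candidates:
        if satisfies_constraints(candidate):
            return candidate
    # No candidate in the list fulfilled the constraints.
    return None
```

With a single-topic method, a rejected topic would leave the resolver with nothing to try; here the loop simply moves on to the next entry in the list.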
Initially, Cf and Ci were assigned values of 10 units and 1 unit, respectively. These values were determined experimentally, and further study could yield more precise ones.
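The weighting scheme can be sketched as follows. The text does not give the exact update formula, so this sketch assumes a simple additive rule: Cf is added to an NP's weight for each turn in which it appears, and Ci is subtracted for each later turn in which it does not. The function name `rank_topic_candidates` and the input layout (one collection of NPs per turn, oldest first, all preceding the anaphor) are likewise assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

CF = 10  # initial value given in the text for Cf
CI = 1   # initial value given in the text for Ci


def rank_topic_candidates(
    turns: Iterable[Iterable[str]], cf: int = CF, ci: int = CI
) -> List[Tuple[str, int]]:
    """Rank NPs by weight, heaviest first.

    Assumed update rule (not specified in the text): +cf when an NP
    appears in a turn, -ci when a previously seen NP is absent from it.
    """
    weights: Dict[str, int] = {}
    for turn_nps in turns:
        present = set(turn_nps)
        # Penalize NPs seen in earlier turns but missing from this one.
        for np_ in weights:
            if np_ not in present:
                weights[np_] -= ci
        # Reward every NP that appears in this turn.
        for np_ in present:
            weights[np_] = weights.get(np_, 0) + cf
    # Sort by descending weight; ties are broken alphabetically.
    return sorted(weights.items(), key=lambda item: (-item[1], item[0]))
```

For example, given the turns `[["the ticket", "the train"], ["the ticket"], ["the ticket", "the price"]]`, "the ticket" appears in all three turns and ends with weight 30, "the price" appears once and has weight 10, and "the train" appears once and is then absent twice, ending with weight 8.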