Wilson et al, AAAI 2004

From ScribbleWiki: Analysis of Social Media

This page maintained by: Mahesh Joshi

Just how Mad are you? Finding Strong and Weak Opinion Clauses

Theresa Wilson, Janyce Wiebe, Rebecca Hwa


Summary

This paper presents results that delve deeper into the sentiment analysis task in two respects:

First, it explores sentiment analysis at the sub-sentence level, specifically at the level of each individual clause in a sentence
Second, it deals with the task of predicting the strength of opinions, rather than making simple binary (yes/no) judgments
- Strength is annotated at 4 levels - high, medium, low and neutral, where neutral is absence of opinion

The dataset used is the Multi-perspective Question Answering (MPQA) corpus of opinion annotations, in which annotators judged and marked up expressions in context, producing fine-grained sentiment annotations.

The key contribution of the paper is a strength-based grouping of syntactic and other subjectivity features, which improves accuracy significantly over the conventional features used in sentiment analysis. The syntactic features are combined with a variety of previously used subjectivity clues, drawn from manually developed or bootstrapped lists.

The syntactic features are initially extracted in a dependency parse representation, which is essentially a set of triples, where each triple contains a head word (usually with its Part-of-Speech tag), its modifier word (with its Part-of-Speech tag) and the relationship between them. Using the dependency parses, five classes of syntactic features are formed for each word in the parse:

root(w, t): whether word w with POS tag t is the root of the dependency tree (essentially the main verb)
leaf(w, t): whether word w with POS tag t is a leaf in the dependency tree, and therefore has no modifiers
node(w, t): whether word w with POS tag t is some node in the dependency tree (essentially the word-POS tag pair forms a feature)
bilex(w, t, r, w_c, t_c): whether word w with POS tag t is modified by word w_c with POS tag t_c with the relationship between them being r
allkids(w, t, r₁, w₁, t₁, ..., r_n, w_n, t_n): whether word w with POS tag t has n children w₁ through w_n with POS tags t₁ through t_n and relationships r₁ through r_n.
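
The five feature classes above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function name `syntactic_features`, the tuple layout of the triples, the relation labels (`subj`, `mod`, `obj`, `det`) and the example sentence are all assumptions made for the example. It also assumes each (word, POS tag) pair occurs once per sentence; a real implementation would key nodes by token position.

```python
from collections import defaultdict

def syntactic_features(triples):
    """Build root/leaf/node/bilex/allkids features from dependency triples.

    Each triple is ((head_word, head_tag), relation, (mod_word, mod_tag)).
    Assumption: a (word, tag) pair identifies a unique node in the sentence.
    """
    children = defaultdict(list)   # head -> [(relation, modifier), ...]
    heads, mods = set(), set()
    for head, rel, mod in triples:
        children[head].append((rel, mod))
        heads.add(head)
        mods.add(mod)

    feats = set()
    for w, t in heads | mods:
        feats.add(("node", w, t))
        if (w, t) not in mods:          # never a modifier -> root of the tree
            feats.add(("root", w, t))
        kids = children.get((w, t), [])
        if not kids:                    # no modifiers -> leaf
            feats.add(("leaf", w, t))
            continue
        for r, (wc, tc) in kids:        # one bilex feature per child
            feats.add(("bilex", w, t, r, wc, tc))
        # allkids: the head together with all of its children, in input order
        flat = tuple(x for r, (wc, tc) in kids for x in (r, wc, tc))
        feats.add(("allkids", w, t) + flat)
    return feats

# Hypothetical parse of "She strongly condemned the decision"
triples = [
    (("condemned", "VBD"), "subj", ("She", "PRP")),
    (("condemned", "VBD"), "mod", ("strongly", "RB")),
    (("condemned", "VBD"), "obj", ("decision", "NN")),
    (("decision", "NN"), "det", ("the", "DT")),
]
feats = syntactic_features(triples)
```

Here `condemned` yields the root feature, `She`, `strongly` and `the` yield leaf features, and `decision` yields both a bilex and an allkids feature for its single child.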

These features or clues are evaluated for their usefulness using a development set and only a certain subset is selected.

The selected features are further filtered and organized into sets according to how indicative they are of a particular opinion strength level. This strength-based filtering and organization of features yielded the best results in all the experiments presented in the paper.
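
One way to picture the strength-based organization is the sketch below. It is an assumption-laden illustration, not the procedure from the paper: it assigns each clue to the strength level it co-occurs with most often in development data, drops clues seen fewer than a hypothetical `min_count` times, and then represents a clause by how many clues from each set it contains. The function names, the threshold and the toy observations are all invented for the example.

```python
from collections import Counter, defaultdict

# The four annotated strength levels, ordered from weakest to strongest
LEVELS = ("neutral", "low", "medium", "high")

def group_by_strength(observations, min_count=2):
    """Assign each clue to the strength level it is most often seen with.

    observations: iterable of (clue, level) pairs from development data.
    min_count: hypothetical filter that drops rarely observed clues.
    """
    counts = defaultdict(Counter)
    for clue, level in observations:
        counts[clue][level] += 1

    sets = {level: set() for level in LEVELS}
    for clue, c in counts.items():
        if sum(c.values()) < min_count:
            continue
        # max() keeps the first maximum, so ties go to the weaker level
        best = max(LEVELS, key=lambda lv: c[lv])
        sets[best].add(clue)
    return sets

def clause_features(clause_clues, sets):
    """Count how many clues in a clause fall into each strength set."""
    return {lv: sum(1 for c in clause_clues if c in sets[lv]) for lv in LEVELS}

# Toy development observations (invented for illustration)
observations = [
    ("outrageous", "high"), ("outrageous", "high"), ("outrageous", "medium"),
    ("said", "neutral"), ("said", "neutral"),
    ("concern", "low"), ("concern", "medium"), ("concern", "low"),
    ("rare", "high"),
]
sets = group_by_strength(observations)
features = clause_features(["outrageous", "concern", "said", "rare"], sets)
```

Grouping clues this way turns a large, sparse clue vocabulary into a handful of counts per clause, which is one plausible reason such an organization would help a classifier.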

The paper reports detailed classification results at different levels of nesting, from the sentence level down to deeply nested clauses, and for each of the feature sets and classifiers used.

Overall, this is a careful and thorough analysis of the sentiment strength classification task at the clause level.

Retrieved from "http://socialmedia.scribblewiki.com/Wilson_et_al%2C_AAAI_2004"
