Qualified vs. Bald Claims Dataset - NAACL 2009 - Arora, Joshi & Rosé

Download qbclaims-v1.0.tar.gz. Please refer to the README below for details (the README is also included in the compressed tarball).

README

0. Introduction

This is the README file accompanying the dataset (version 1.0) used by Arora et al. [1], in which each statement or "comment" in a product review is annotated as either a qualified claim or a bald claim. Please see the accompanying annotation manual for a detailed explanation of each type of claim, along with supporting examples.

The review comments in our dataset are a subset of the review comments in the product review dataset released by Hu and Liu [2].

1. Files

The data is made available as a CSV (comma-separated values) file, Qualified-Vs-Bald-Claims.csv. The first column of the CSV file is the class label: POS represents bald (unqualified) claims and NEG represents qualified claims. The second column is the text of the review comment (as it appears in the dataset released by Hu and Liu [2]), enclosed in double quotes. Any double quote appearing in the text itself is escaped by another double quote immediately preceding it.
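
The format described above can be read with a standard CSV parser. The following is a minimal Python sketch, not part of the original release. It assumes the file has no header row and exactly two columns per row; the function name read_claims and the sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Label mapping as described in this README:
# POS = bald (unqualified) claim, NEG = qualified claim.
LABEL_NAMES = {"POS": "bald", "NEG": "qualified"}


def read_claims(lines):
    """Yield (claim_type, text) pairs from an iterable of CSV lines.

    Hypothetical helper; assumes no header row and two columns per row.
    """
    # csv.reader's defaults (quotechar='"', doublequote=True) already treat
    # a doubled quote inside a quoted field as one literal double quote.
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        label, text = row[0], row[1]
        yield LABEL_NAMES[label], text


# Made-up rows for illustration only; they do not come from the dataset.
sample = io.StringIO(
    'POS,"this camera is great"\n'
    'NEG,"the ""auto"" mode works well, but only in daylight"\n'
)
claims = list(read_claims(sample))

# For the real file, something like the following should work
# (the file encoding is not stated in this README, so adjust as needed):
# with open("Qualified-Vs-Bald-Claims.csv", newline="") as f:
#     claims = list(read_claims(f))
```

Opening the file with newline="" follows the csv module's recommendation, so that any line breaks inside quoted fields are passed through to the parser unchanged.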

2. Et Cetera

If you use this data for your research, please cite the following paper (same as [1]):

Shilpa Arora, Mahesh Joshi and Carolyn Rosé. 2009. Identifying Types of Claims in Online Customer Reviews. To appear in Proceedings of NAACL-2009 (Short Papers).

If you would also like to cite the paper that introduced the customer reviews data that we used, please cite the following paper (same as [2]):

Minqing Hu and Bing Liu. 2004. Mining and Summarizing Customer Reviews. In Proc. of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.

If you have any questions or corrections, please contact Shilpa Arora (shilpaa cs cmu edu --- insert an AT and two DOTs at the right places to get the email address) or Mahesh Joshi (maheshj cs cmu edu --- you know the drill).


References

[1] Shilpa Arora, Mahesh Joshi and Carolyn Rosé. 2009. Identifying Types of Claims in Online Customer Reviews. To appear in Proceedings of NAACL-2009 (Short Papers).

[2] Minqing Hu and Bing Liu. 2004. Mining and Summarizing Customer Reviews. In Proc. of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.