Political Blog Corpora
This page provides links to political blog corpora for use in academic research. These data were collected by Tae Yano, William Cohen, and Noah Smith at the Language Technologies Institute and Machine Learning Department at Carnegie Mellon University.
Download
Dataset 1 (browsable directory) and readme: data from five American political blogs during 2007–2008 (released May 29, 2009)
Further Reading
If you publish work that uses the data above, please cite this paper:
Predicting Response to Political Blog Posts with Topic Models
Tae Yano, William W. Cohen, and Noah A. Smith
NAACL-HLT 2009, Boulder, CO, May–June 2009
Acknowledgments
This research project was supported by a gift from Microsoft Research and by NSF grant IIS-0836431.