Geo-tagged Microblog Corpus
This page provides a link to a dataset containing a sample of geo-tagged microblog data, for use in academic research. These data were collected by Brendan O'Connor, Jacob Eisenstein, and Noah A. Smith.
Download
Version 2010-10-12
README.txt
GeoText.2010-10-12.tgz (58M)
Changelog
2010-10-12: added location prediction evaluation scripts.
2010-10-06: initial release.
Further Reading
The dataset is described in the following paper. Please consider citing it if appropriate. Thanks!
A Latent Variable Model for Geographic Lexical Variation
Jacob Eisenstein, Brendan O'Connor, Noah A. Smith, and Eric P. Xing
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Cambridge, MA, 2010.