Two text learning data sets
The tarred and gzipped data directory for the twenty newsgroups data.
The page describing the university data sets in more detail.