This dataset was used in the blog post here.

It's derived from the SEC 10K dataset of Kogan, Levin, Routledge, Sagi, and Smith (2009): www.ark.cs.cmu.edu/10K/
Name                          Last modified       Size
HEADER.html                   2009-09-07 22:19    245
sec_10k_pruned_triples.tgz    2009-09-07 11:56    133M
