Huge Site Ranking Dataset Donated to the Common Crawl Foundation
Greg Lindahl writes "blekko is donating search engine ranking data for 140 million domains and 22 billion URLs to the Common Crawl Foundation. Common Crawl is a non-profit dedicated to making the greatest (yet messiest) dataset of our time, the web, available to everyone, including tinkerers, hackers, activists, and new companies. blekko's ranking data will initially be used to improve the quality of Common Crawl's 8-billion-webpage public crawl of the web, and eventually will be directly available to the public."
Re:How does Common Crawl compare w/ Internet Archi (Score:3, Funny)
How does this project compare with the Internet Archive [archive.org]?
commoncrawl.org will be available on archive.org a lot longer than it will be available on commoncrawl.org.