Please create an account to participate in the Slashdot moderation system

 



Forgot your password?
typodupeerror

Slashdot videos: Now with more Slashdot!

  • View

  • Discuss

  • Share

We've improved Slashdot's video section; now you can view our video interviews, product close-ups and site visits with all the usual Slashdot options to comment, share, etc. No more walled garden! It's a work in progress -- we hope you'll check it out (Learn more about the recent updates).

×
Open Source

+ - Common Crawl - now everyone can be Google ->

Submitted by
mikejuk
mikejuk writes "If you have ever thought that you could do a better job than Google but were intimidated by the hardware needed to build a web index, then the Common Crawl Foundation has a solution for you. It has indexed 5 billion web pages, placed the results on Amazon EC2/S3 and invites you to make use of it for free. All you have to do is setup your own Amazon EC2 Hadoop cluster and pay for the time you use it — accessing the data is free. This idea is to open up the whole area of web search to experiment and innovation. So if you want to challenge Google now you can't use the excuse that you can't afford it."
Link to Original Source
This discussion was created for logged-in users only, but now has been archived. No new comments can be posted.

Common Crawl - now everyone can be Google

Comments Filter:

Nothing is more admirable than the fortitude with which millionaires tolerate the disadvantages of their wealth. -- Nero Wolfe

Working...