First time accepted submitter vigna writes "The Laboratory for Web Algorithmics of the Università degli studi di Milano together with the Data and Web Science Group of the University of Mannheim have put together the first entirely open ranking of more than 100 million sites of the Web. The ranking is based on classic and easily explainable centrality measures applied to a host graph, and it is entirely open — all data and all software used is publicly available. Just in case you wonder, the number one site is YouTube, the second Wikipedia, and the third Twitter."
They are using the Common Crawl data (first released in November 2011). Sites are ranked using harmonic centrality, with raw indegree centrality, Katz's index, and PageRank provided for comparison. More information about the web graph is available in a pre-print paper that will be presented at the World Wide Web Conference in April.
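For readers unfamiliar with the measures mentioned, here is a minimal sketch of harmonic centrality and indegree on a tiny, made-up host graph. It assumes the usual definition of harmonic centrality, namely the sum of 1/d(y, x) over all hosts y that can reach host x, where d is the shortest-path distance in links. The host names and the function names are hypothetical and for illustration only; this is not the code used for the ranking, and a graph with more than 100 million hosts would need something far more scalable than one breadth-first search per node.

```python
from collections import deque


def harmonic_centrality(graph):
    """Return {host: sum of 1/d(y, host) over all hosts y that can reach it}.

    graph maps each host to the hosts it links to.
    """
    # Collect every host, including ones that only appear as link targets.
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    # Build the reversed graph so a BFS from a host walks its incoming links.
    reverse = {node: set() for node in nodes}
    for source, targets in graph.items():
        for target in targets:
            reverse[target].add(source)

    scores = {}
    for host in nodes:
        distance = {host: 0}
        queue = deque([host])
        total = 0.0
        while queue:
            current = queue.popleft()
            for predecessor in reverse[current]:
                if predecessor not in distance:
                    distance[predecessor] = distance[current] + 1
                    total += 1.0 / distance[predecessor]
                    queue.append(predecessor)
        scores[host] = total
    return scores


def indegree(graph):
    """Return {host: number of distinct hosts linking to it}."""
    counts = {host: 0 for host in graph}
    for source, targets in graph.items():
        for target in set(targets):
            counts[target] = counts.get(target, 0) + 1
    return counts


# Hypothetical toy host graph: two hosts link to a hub, a third links to one of them.
toy_graph = {
    "a.example": ["hub.example"],
    "b.example": ["hub.example"],
    "c.example": ["a.example"],
    "hub.example": [],
}

harmonic_scores = harmonic_centrality(toy_graph)
indegree_scores = indegree(toy_graph)
ranking = sorted(harmonic_scores, key=harmonic_scores.get, reverse=True)
```

In this toy graph, indegree only sees the two direct links into `hub.example`, while harmonic centrality also gives it partial credit for `c.example`, which reaches it in two hops.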