Google

Google Sheds Light On 'Dark Web' With PDF Search 78

CWmike writes "Google this week took another step in its effort to shed light on the so-called Dark Web, announcing that its search engine can now search scanned documents in a PDF. In April, Google announced that it was looking for ways for its search engine to index HTML forms such as drop-down boxes or select menus that otherwise couldn't be found or indexed." An announcement is available at the official Google blog, and it contains some demonstration searches.
The Internet

Opera Develops Search Engine For Web Developers 31

nk497 writes "The Metadata Analysis and Mining Application (MAMA) doesn't index content like a standard search engine, but looks at markup, style, scripting and the technology behind pages. Based on those existing MAMA-ed pages, 80.4 per cent of sites use cascading style sheets (CSS), while the average web page has 47 markup errors and 16,400 characters. Should you want to know which country is using the AJAX component XMLHttpRequest the most, MAMA can tell you that it's Norway, with 10.2 per cent of the data set." Additional coverage is available at Computerworld, and a deeper explanation is up at Opera's Dev site.
Google

Was the Yahoo-Google Deal a Ploy To Weaken Yahoo? 82

JagsLive writes with a link to a BetaNews story about a US Senator who is questioning whether the deal between Yahoo and Google was brokered with less than honorable intentions on Google's part. The advertising deal came under scrutiny from the Department of Justice recently for potential antitrust violations. The deal has now been delayed in order to allow investigators more time for evaluation. Meanwhile, rumors are circulating that Yahoo will cut as much as 20% of its workforce after an internal memo from CEO Jerry Yang called for "discipline" and said the company was "getting fit" for the long term. For its part, Google has launched a site endorsing the deal and attempting to smooth the way for its approval by providing facts and positive reactions from experts.
Communications

Princeton Researchers Say Feds Need Data Standard 49

dcblogs writes "The federal government's data-sharing efforts are a mess, and if Barack Obama really wants a useful 'Google for government,' he would have to set the government's vast amount of data free by exposing it and ensuring it complies with standards. Once that happens, commercial sites, aggregators, bloggers and everyone else will be able to access it, use it and transform it, argue a group of Princeton researchers (follow Download link for full PDF)."
Google

Google URL Index Hits 1 Trillion 249

mytrip points out news that Google's index of unique URLs has reached a milestone: one trillion. Google's blog provides some more information, noting, "The first Google index in 1998 already had 26 million pages, and by 2000 the Google index reached the one billion mark. Over the last eight years, we've seen a lot of big numbers about how much content is really out there. To keep up with this volume of information, our systems have come a long way since the first set of web data Google processed to answer queries. Back then, we did everything in batches: one workstation could compute the PageRank graph on 26 million pages in a couple of hours, and that set of pages would be used as Google's index for a fixed period of time. Today, Google downloads the web continuously, collecting updated page information and re-processing the entire web-link graph several times per day."
Security

UK Mobile Operator O2 Leaks MMS Photos 154

Anonymous Hero writes "UK Mobile Operator O2 allows its customers to send Multimedia Messaging Service (MMS) photos to email recipients by way of a web interface. The URLs published by the MMS-to-email application are not authenticated, so a simple Google search reveals hundreds, if not thousands of private photos." Reader ttul points out similar coverage of this issue at InformationWeek.

Search Engines' Reward Programs 83

Carl Bialik from WSJ writes "Search engines are dangling rewards and cash prizes to attract customers to their sites, the Wall Street Journal reports. MSN is offering free nights at the Four Seasons and other goodies to people who search for one of roughly a thousand terms on a rotating list. Yahoo's GoodSearch donates a penny to charity for each search. And Blingo hawks giveaways including iPods. But, the WSJ reports, 'There are strings attached to some of the reward programs. Some require users to register personal information like a name or email.'"
Google

Search Engine For Coders to Launch 149

karvind writes "According to Wired, 'Krugle' is set to launch next month. The search engine indexes programming code and documentation from open-source repositories like SourceForge, and includes corporate sites for programmers like the Sun Developer Network. The index will contain between 3 and 5 terabytes of code by the time the engine launches in March. According to the article, Krugle also contains intelligence to help it parse code and to differentiate programming languages, so a PHP developer could search for a website-registration system written in PHP simply by typing 'PHP registration system.'" Update: 02/17 21:04 GMT by Z : Summary edited for accuracy.

Microsoft Hopes Prizes Will Attract New Searchers 195

BertieBaggio writes "Remember the long-running e-mail hoax that had Bill Gates testing an 'e-mail tracing program' and offering to pay recipients big bucks if they passed his test e-mail along to all their friends? Well, the offer is true, sort of. Microsoft wants you to use its search engine, and it's got $1 million worth of prizes up for grabs for those who nibble at the offer. Following Yahoo's recent consideration of offering prizes to searchers, is this another tactic to lure users away from Google with candy and other shiny things?"
The Internet

Yahoo! Releases New Search Tool 146

rcrc writes "Yahoo! Research Labs has recently released a new search tool that lets users choose, via a slider bar, whether they are looking for informational sites or shopping sites. This tool is currently in beta and more information can be found in the FAQ." From the article: "With the slider in the middle position, only the default Yahoo! Search sort is used. When the slider is at either end, only the secondary commercial/non-commercial sort is used. But when the slider is anywhere in between, Yahoo! Mindset presents a blend of the two sorting systems."
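
The blending behavior quoted above can be illustrated with a small sketch. This is not Yahoo's implementation: the linear blend, the slider range of -1 (informational) to +1 (shopping), the per-result "default" and "commercial" scores in [0, 1], and the names blend_score and rank are all assumptions made for illustration.

```python
def blend_score(default_score: float, commercial_score: float, slider: float) -> float:
    """Blend two ranking scores according to a slider position.

    Hypothetical model: slider is in [-1, 1], where 0 is the middle,
    +1 is the shopping end and -1 is the informational end. Both
    scores are assumed to lie in [0, 1].
    """
    if not -1.0 <= slider <= 1.0:
        raise ValueError("slider must be between -1 and 1")
    # Distance from the middle decides how much the secondary sort counts.
    weight = abs(slider)
    # On the informational side, prefer results that look less commercial.
    secondary = commercial_score if slider >= 0 else 1.0 - commercial_score
    # Middle: default only. Either end: secondary only. Otherwise: a mix.
    return (1.0 - weight) * default_score + weight * secondary


def rank(results: list[dict], slider: float) -> list[str]:
    """Return result titles ordered from best to worst blended score."""
    ordered = sorted(
        results,
        key=lambda r: blend_score(r["default"], r["commercial"], slider),
        reverse=True,
    )
    return [r["title"] for r in ordered]


# Made-up example data, not real search results.
results = [
    {"title": "review", "default": 0.9, "commercial": 0.2},
    {"title": "store", "default": 0.5, "commercial": 0.9},
    {"title": "wiki", "default": 0.7, "commercial": 0.0},
]

print(rank(results, 0.0))   # default ordering only
print(rank(results, 1.0))   # commercial ordering only
print(rank(results, 0.5))   # a blend of the two
```

A linear mix is only one way to get the described behavior; any weighting that is zero in the middle and one at the ends would match the quoted description equally well.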
The Internet

Feds Fund Anti-Terrorism Search Engine 278

Ben writes "The FAA and researchers at the University at Buffalo are developing an anti-terrorism search engine that will hunt for 'hidden' information -- like how to take down an airliner -- that can be puzzled together by grabbing bits and pieces from unrelated documents. Eventually, they say, the technique can be commercialized to improve search results on more mundane matters."
Microsoft

Apple and MS Battle For Desktop Search Supremacy 707

markmcb writes "As Microsoft and Apple go back and forth about who came up with what idea first, it's been hard to tell who the real innovators are. Michael Gartenberg and Jim Allchin of Microsoft give some fair opinions on the current desktop search battle. While they do give credit to Apple's iTunes for search inspiration and to Apple being first out of the box in the OS race, they both imply that Microsoft will provide more robust features with the release of Longhorn."
