Google Now Searches JavaScript

Posted by timothy on Saturday May 26, 2012 @04:19AM from the watch-for-the-scriptview-vans dept.

mikejuk writes "Google has been improving the way that its Googlebot searches dynamic web pages for some time, but it seems to be attracting added interest just at the moment. In the past Google has encouraged developers to avoid using JavaScript to deliver content or links to content because of the difficulty of indexing dynamic content. Over time, however, the Googlebot has incorporated ways of searching content that is provided via JavaScript. Now it seems that it has got so good at the task that Google is asking us to allow the Googlebot to scan the JavaScript used by our sites. Working with JavaScript means that the Googlebot has to actually download and run the scripts, and this is more complicated than you might think. This has led to speculation about whether it might be possible to include JavaScript on a site that could use the Google cloud to compute something. For example, imagine that you set up a JavaScript program to compute the first n digits of Pi, or a Bitcoin miner, and had the result formed into a custom URL, which the Googlebot would then try to access as part of its crawl. By looking at, say, the query part of the URL in the log you might be able to get back a useful result."

Really? (Score:5, Insightful)

by Anonymous Coward writes: on Saturday May 26, 2012 @04:25AM (#40119115)

Googlebot will have a very quick timeout on scripts and probably won't be more powerful than a standard home computer. How would that be useful for calculating digits of pi or bitcoin mining? It would take far longer than doing it the conventional way.

- Incremental and/or parallel computing? (Score:5, Interesting)
  
  by SlovakWakko ( 1025878 ) writes: on Saturday May 26, 2012 @04:31AM (#40119141)
  
  You can always cut the whole process into smaller steps, each providing a URL that will initiate the next step. Or you can provide several URLs and have the Google cloud compute a problem for you in parallel...
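A minimal sketch of the "cut it into smaller steps" idea from the comment above, assuming the job can be described as an index range. The helper name `makeWorkUnitUrls`, the `/work` path and the `start`/`end` parameter names are all hypothetical; nothing here is taken from how Googlebot actually schedules anything.

```javascript
// Hypothetical helper: split the index range [0, total) into fixed-size
// work units and give each unit its own URL. The "/work" path and the
// "start"/"end" parameter names are assumptions made for this sketch.
function makeWorkUnitUrls(baseUrl, total, chunkSize) {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  const urls = [];
  for (let start = 0; start < total; start += chunkSize) {
    const end = Math.min(start + chunkSize, total); // last unit may be shorter
    const url = new URL("/work", baseUrl);
    url.searchParams.set("start", String(start));
    url.searchParams.set("end", String(end));
    urls.push(url.toString());
  }
  return urls;
}
```

Listing several of these URLs on one page would correspond to the "in parallel" variant; serving one at a time corresponds to the incremental variant.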
  
  - Re:Incremental and/or parallel computing? (Score:5, Funny)
    
    by Anonymous Coward writes: on Saturday May 26, 2012 @04:35AM (#40119159)
    
    I already do this using a system of CNAMEs in a .xxx domain.
    
  - - Re: (Score:1)
      
      by Anonymous Coward writes:
      
      The same reason why 72 hours of video is uploaded to YouTube every minute.
    - Re: (Score:2)
      
      by Dwonis ( 52652 ) writes:
      
      I realise that the kind of idiots who like Bitcoins will be the same fools who drool over Google, and that these same monkeys won't see any problem with providing an algorithm which generates a secret to a third party for execution,
      
      Bitcoin mining doesn't involve any secret information.
      I'm not sure why you're slagging "idiots who like Bitcoins" so much, either. Sure, Bitcoin has attracted some cranks, anarchists, people who don't trust government-issued money, and speculators who will say all manner of things in attempts to influence the price of Bitcoins (both up and down), but have you actually looked at the crypto and the system of incentives built into the Bitcoin system? It's brilliant, and it's basically the micropayment system
  - - Re: (Score:1)
      
      by rtfa-troll ( 1340807 ) writes:
      
      What if the URL triggers, for example, a Slashdot posting, and then you use another external JavaScript interpreter to gather all the results? Sort of map-reduce. Incredibly inefficient, but you don't have to pay, so who cares? Even better if some XSS or similar attack on a web site can be used to parcel out the work.
      It seems to me, though, that there's no reason to limit this to Googlebot; any JavaScript interpreter will do. I'd be surprised if nobody from the blackhat community has this up and running for
  - Re: (Score:1, Interesting)
    
    by Zero__Kelvin ( 151819 ) writes:
    
    Even if this is possible, you would certainly be violating Google's guidelines and have your site blacklisted from Googlebot pretty quickly. Furthermore, you could be charged with theft of services.
    - - Re: (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        Intent.
        
        Re: (Score:2)
        
        by dreamchaser ( 49529 ) writes:
        
        Intent.
        Prove it. Seriously. You wouldn't be able to.
    - Re:Incremental and/or parallel computing? (Score:5, Interesting)
      
      by ThatsMyNick ( 2004126 ) writes: on Saturday May 26, 2012 @06:01AM (#40119417)
      
      Anyone wanting to do this would be doing it on a dedicated website. They won't care about the domain or IP address being blacklisted from Google. And good luck with the theft of service charge; they never asked Google to index them. They did not even agree to any terms of service from Google. As I said, good luck.
      
      - Re:Incremental and/or parallel computing? (Score:5, Informative)
        
        by truedfx ( 802492 ) writes: on Saturday May 26, 2012 @06:40AM (#40119521)
        
        No, that's not what opting in means. Opting in means you're asking Google to visit your site. Opting out means you're asking Google not to visit your site. When you're not asking for anything, merely hoping, you're neither opting in nor opting out.
        
        
        Stop trying to teach what you don't understand (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        "Opting in means you're asking Google to visit your site. "
        Right. That is exactly what I said. The standard for the internet is well defined. You should read about it [wikipedia.org]. If you make a web page available to the internet without a password, captcha or firewall, etc. you are making it available to all. You have already purposely accepted the condition ahead of time. This is opting in [thefreedictionary.com]. The robots.txt allows you to opt-out instead. If you opt in by placing it on the internet available to web crawlers and
        
        Re: (Score:2)
        
        by truedfx ( 802492 ) writes:
        
        No, that isn't what you said. Allowing Google to access your site and asking Google to access your site are two different things. By neither opting in nor opting out, you're allowing Google to access your site, because the default is to allow it and you haven't told Google otherwise, but that's not opting in. Hint: what does opt mean? Who has chosen that the default is to allow anyone to visit your site? If it isn't you, then you didn't opt.
        
        Re: (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        You are confusing e-mail with the internet, which it turns out isn't a series of tubes by the way. In an e-mail scenario opting in involves a specific request to receive information. In a web publishing scenario you are opting in to having your published data and information read by all.
        "Who has chosen that the default is to allow anyone to visit your site? If it isn't you, then you didn't opt."
        Why do you keep reiterating my point for me and then saying I didn't make my point? If you don't create a mecha
        
        Re: (Score:1)
        
        by bingoUV ( 1066850 ) writes:
        
        If you don't create a mechanism to keep Google out (e.g. robots.txt) then - by your own admission - you have opted to allow Googlebot to read what you publish to the world.
        Allowing Google to do something does not mean asking Google to do it. Allowing does not involve "service".
        
        Re: (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        That's great. Now show me where I said it does mean asking them to do it. You are confusing the e-mail version of opt-in (opting to have others send to you) with the web version of opt-in, which is opting to have others view your content. When you don't block access to your site you are opting to have your site crawled. Any web content designer who doesn't know this is incompetent.
        
        Re: (Score:1)
        
        by bingoUV ( 1066850 ) writes:
        
        You said "theft of service". If you had read the second sentence, it said allowing does not involve service.
        For example, you "allowed" me to pray for you. I pray for a fee. You are hereby charged with theft of service.
        
        Re: (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        You might want to go back and read the whole thread and study it. As I quite explicitly stated, theft of service doesn't come in until you write your code to run on and leverage Google's systems. Obviously, allowing their bot to crawl your site is not theft of service.
        
        Re: (Score:1)
        
        by bingoUV ( 1066850 ) writes:
        
        I have "studied" it quite well. So you are saying you cannot be charged with theft of service until you arrange with god to benefit from my prayers.
        Google is doing what it wants to do. It doesn't become theft of service just because someone benefits from it.
        
        Re: (Score:2)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        "So you are saying you cannot be charged with theft of service until you arrange with god to benefit from my prayers."
        No. I am saying that you shouldn't try to make analogies, because you suck at it.
        "I have "studied" it quite well. .... It doesn't become theft of service just because someone benefits from it."
        Also, don't waste your time studying things if your definition of 'quite well' results in the level of complete misunderstanding you have managed to achieve. Just accept that you aren't smart enough
        
        Re: (Score:1)
        
        by bingoUV ( 1066850 ) writes:
        
        Please try to be funny when you troll.
        
        Re: (Score:2)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        Please try to learn what the term troll means. More importantly, please don't reply back with your interpretation of what you have "learned" when you do.
        
        Re: (Score:1)
        
        by bingoUV ( 1066850 ) writes:
        
        You are demonstrating the act of trolling quite well. Only thing left is for the observer to know the name of this internet behavior. For an experienced internet user like me, it was very simple, thank you. Your posts can go into textbooks to illustrate trolling to help people less well informed than me. Thanks for community service.
        
        Re: (Score:2)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        I'm truly glad I could help with your book, which I have no doubt will be published once you pay the fee. Coincidentally, I'm writing a book on clueless morons, so I really value the examples you have provided for me as well! It isn't often that a Slashdot disagreement turns into this kind of a win-win situation!
        
        Re: (Score:1)
        
        by bingoUV ( 1066850 ) writes:
        
        Sorry, I am not writing the book I mentioned. I just hoped someone would. Though you can add your above post as an illustration in your own book, as I hadn't mentioned I intend to write any book like mentioned but you concluded it anyway. You are quite the person to write a book on "clueless morons". Being one yourself is quite a help, I am sure.
        
        Re: (Score:2)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        "Sorry, I am not writing the book I mentioned."
        That's OK. It's probably for the best. I've seen your writing.
        "I just hoped someone would. "
        It is always good to have hopes and dreams, even if they are phenomenally unrealistic. For example, I hope you get a clue someday.
        "Though you can add your above post as an illustration in your own book, as I hadn't mentioned I intend to write any book like mentioned but you concluded it anyway."
        You really are a dim bulb there, Sherlock. Have a nice life in fantasy lan
        
        Re: (Score:1)
        
        by bingoUV ( 1066850 ) writes:
        
        You really are a dim bulb there
        Even though it was you that drew the wrong conclusions?
        Anyway, don't worry. This is the best you can come up with, at the moment, but next year you are sure to think of a witty reply. Keep trying.
        
        Re: (Score:2)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        "Even though it was you that drew the wrong conclusions?"
        I didn't draw the wrong conclusion. I was making the point that the only way a book will be published that uses my post as an example of trolling is if you write one yourself, and the only way any book you write would be published is if you pay someone to publish it. Alas, you are too dim to figure these things out, so:
        
        PLONK [netlingo.com]
        
        Re: (Score:1)
        
        by bingoUV ( 1066850 ) writes:
        
        the only way a book will be published that uses my post as an example of trolling is if you write one yourself
        Unsubstantiated
        the only way any book you write would be published is if you pay someone to publish it
        Ditto. Also, a "book" need not be "paid published" to be called a book in these days of e-books.
        Anyway, I was just making fun of your stupidity. Alas, the same quality of yours makes you unable to understand it.
        
        Re: (Score:2)
        
        by Dark$ide ( 732508 ) writes:
        
        What happens when Google chooses to ignore my carefully crafted robots.txt?
        If they then download my JavaScript experiment and run it at their cost, that's their problem.
        When I can trust crawlers to not ignore my robots.txt I'll stop using fail2ban on my apache logs.
        
        Re: (Score:1)
        
        by postbigbang ( 761081 ) writes:
        
        There is no reason to believe, as the research is scant at best, that Google even respects a robots.txt file. They are a vacuum hose attached to an analytic engine, easily metaphorized to Stephen King's Langoliers.
        
        Here's your sign (Score:2)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        "There is no reason to believe, as the research is scant at best, that Google even respects a robots.txt file [google.com].
        From the preceding link: "Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it's current for your site so that you don't accidentally block the Googlebot crawler. Visit http://code.google.com/web/controlcrawlindex/docs/faq.html [google.com] to learn how to instruct robots when they visit your site. You can test your robots.txt file
        
        Re: (Score:2)
        
        by amRadioHed ( 463061 ) writes:
        
        Research is scant? It's ridiculously easy for anyone with a webserver to verify if Google respects robots.txt.
        
        Re: (Score:2)
        
        by Dwonis ( 52652 ) writes:
        
        Guilt under what section of what law, specifically?
        
        Re: (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        You must be from another country. In the US, they have a smorgasbord of them from which they can choose now. But in this case I was thinking theft of service as I already stated quite explicitly.
      - Re: (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        "Anyone wanting to do this would be doing it on a dedicate website. They wont care about the domain or IP address being blacklisted from Google."
        So you are saying that someone would go through all the trouble of registering the domain, creating the code, and getting (or waiting for) Google to index it, then wouldn't care that Google would cease to execute the actual code before the desired results are obtained? Re-read what I wrote. I merely said it would be blacklisted quickly. I didn't say that it woul
        
        Re: (Score:2)
        
        by ThatsMyNick ( 2004126 ) writes:
        
        So far blacklisting has worked pretty well for Google. Google has used it well to punish black hat SEO techniques.
        In this case though, if I don't care about my page rank, I would simply create tons of long domain names for pennies (+ ICANN fees). I would use a few at a time and wouldn't care if Google blacklisted a few at a time (I would be storing partial results, just like one of the parent posts mentioned, and the takeover should be seamless). It doesn't take a lot to recoup your domain name fees if your task is
        
        Re: (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        It doesn't take a lot to recoup your domain name fees if your task is purely computational.
        Dedicated hardware is cheap, and designing software costs a lot of money and time. What you are proposing would be ridiculously convoluted and costly, even disregarding the legal ramifications. We software engineers often talk about using the right tool for the right job. Your outlandish proposal ignores numerous sound engineering principles, not the least of which is adhering to this simple maxim.
        
        Re: (Score:2)
        
        by ThatsMyNick ( 2004126 ) writes:
        
        Maybe not. But if someone wanted to do it just for the heck of it, it can be done. It may not scale very well; otherwise I don't see any issues with it.
        
        Re: (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        As I was trying to explain, there is a huge difference between the problems you can see with it and the actual long list of problems that any moderately competent software engineer could quickly point out.
        
        Re: (Score:3)
        
        by ThatsMyNick ( 2004126 ) writes:
        
        I think you missed the "just for the heck of it". I understand my approach is not the practical one, and any sane person would just use their resources to do what little can be done and implement it on their own hardware. But that doesn't mean it cannot be done in a no-loss way. Say I want to calculate the last 100 digits of Graham's number: it can be split into multiple calculations, and a sub-result calculation can take less than a second (which is what I assume Google will limit the runtime to). The bandwid
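To make the "split into multiple calculations" point concrete, here is a hedged sketch. It assumes the commonly described iteration for the trailing digits of a tall power tower of 3s (which is what Graham's number is): start with x = 3 and repeat x = 3^x mod 10^d, d times. Whether this matches what the commenter had in mind is an assumption; the function names are made up for illustration.

```javascript
// Modular exponentiation by repeated squaring, using BigInt throughout.
function modPow(base, exponent, modulus) {
  let result = 1n;
  base %= modulus;
  while (exponent > 0n) {
    if (exponent & 1n) result = (result * base) % modulus;
    base = (base * base) % modulus;
    exponent >>= 1n;
  }
  return result;
}

// Assumed iteration: x = 3, then x = 3^x mod 10^d, repeated d times (d >= 1).
// Each pass is one short step whose output feeds the next, so each pass
// could in principle be handed out as a separate work unit.
function towerOfThreesLastDigits(d) {
  const modulus = 10n ** BigInt(d);
  let x = 3n;
  for (let step = 0; step < d; step++) {
    x = modPow(3n, x, modulus);
  }
  return x;
}
```

For example, `towerOfThreesLastDigits(3)` goes 3, 27, 987, 387, so it returns `387n`. Every step is a single modular exponentiation, which is cheap even for d = 100.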
        
        Re: (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        Let's start with the simplest problem. You plan on having Googlebot load and run your client side code. Great. Now how do you plan to get Googlebot to feed you the result?
        
        Re:Incremental and/or parallel computing? (Score:4, Insightful)
        
        by ThatsMyNick ( 2004126 ) writes: on Sunday May 27, 2012 @06:17AM (#40127275)
        
        Your JS would generate HTML on the client side. Just generate a link that your server can understand. Googlebot, doing what it does, will try to load this URL. When it does, the server stores this result and generates a new problem for Googlebot to solve. This is the basis for the article and the entire comment thread.
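The round trip described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `/result` path, the `unit` and `value` parameter names and both function names are hypothetical, and whether any crawler would actually follow such a link is exactly what the thread is arguing about.

```javascript
// Page side: turn a computed result into a link a crawler might follow.
// "/result", "unit" and "value" are hypothetical names for this sketch.
function buildResultLink(baseUrl, unitId, value) {
  const url = new URL("/result", baseUrl);
  url.searchParams.set("unit", String(unitId));
  url.searchParams.set("value", String(value));
  // Escape "&" so the URL is valid inside an HTML attribute.
  const href = url.toString().replace(/&/g, "&amp;");
  return `<a href="${href}">next</a>`;
}

// Server side: recover the result from a requested URL, for example one
// read back out of the access log. Returns null for unrelated requests.
function parseResultRequest(requestUrl) {
  const url = new URL(requestUrl);
  if (url.pathname !== "/result") return null;
  const unit = url.searchParams.get("unit");
  const value = url.searchParams.get("value");
  if (unit === null || value === null) return null;
  return { unit, value };
}
```

In a browser the returned markup would be written into the page (for instance via `innerHTML`); the server would then answer the `/result` request with a page containing the next problem.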
        
        
        Re: (Score:1)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        "Your JS would generate HTML on the client side."
        Like I said, you are making assumptions about Googlebot. You seem to think that they have no idea how to sanitize an input and will just execute whatever you send them byte for byte. That's not going to happen.
        
        Re: (Score:3)
        
        by ThatsMyNick ( 2004126 ) writes:
        
        Er, they are looking for JS that generates HTML (so this is not an assumption). The purpose of Googlebot is to index. If they run the JS and don't even index the results, it makes no sense.
        
        Would you mind specifically mentioning what assumption I am making? And there is no way to sanitize JS (JS is a Turing-complete language; there is no way, at least as far as present-day research goes, to sanitize it in any reasonable way).
        
        Re: (Score:2)
        
        by ThatsMyNick ( 2004126 ) writes:
        
        Sorry about the typos, I guess I need to get some sleep.
        
        Re: (Score:2)
        
        by ThatsMyNick ( 2004126 ) writes:
        
        I feel honored to have been considered a Google employee. Well, not really. Is there something wrong with my point, that it sounds Fanboish or Employeeish?
- Re: (Score:1)
  
  by multicoregeneral ( 2618207 ) writes:
  
  Depends how often they hit your site. Google has been known to check sites pretty regularly.
- Re: (Score:2)
  
  by Sloppy ( 14984 ) writes:
  
  Wait a minute, are you suggesting that having spiders run my javascript x86 emulator which runs jruby scripts which mines bitcoins, isn't practical?
Simply another example (Score:1)

by Anonymous Coward writes:

why having other parties fetch your arbitrary code and execute it is such a wonderful idea.
- Re:Simply another example (Score:5, Funny)
  
  by Zero__Kelvin ( 151819 ) writes: on Saturday May 26, 2012 @05:49AM (#40119377) Homepage
  
  Well, I think the bigger problem is that you are writing arbitrary code.
  
A much more likely application (Score:5, Interesting)

by maxwell demon ( 590494 ) writes: on Saturday May 26, 2012 @05:07AM (#40119259) Journal

Send Google JavaScript which generates different results for Google than for normal visitors, in order to rank up the site.

- - Re: (Score:1)
    
    by Anonymous Coward writes:
    
    That's an interesting idea and much more insidious than mine, which was to simply send nothing to Google and fuck 'em.
    Not allow your site to be indexed by Google? Yeah, that'd really fuck Google up good, wouldn't it?
- Re:A much more likely application (Score:5, Funny)
  
  by aaronb1138 ( 2035478 ) writes: on Saturday May 26, 2012 @05:18AM (#40119287)
  
  What is this method you have written, "sudo_mod_me_up?"
  
  - Re: (Score:2)
    
    by The Mighty Buzzard ( 878441 ) writes:
    
    Wait, GoogleBot gets mod points now? This explains soooo much.
    - Re: (Score:1)
      
      by multicoregeneral ( 2618207 ) writes:
      
      Well, there was this http://tech.slashdot.org/story/11/11/30/1356218/google-throws--under-bus-to-snag-patent [slashdot.org] Remember that one? Good times.
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  You don't need JavaScript for that. A lot of servers serve different HTML to Google than to us. It's especially noticeable when searching for a rare term; Google will show you results that appear to contain the term, but without relevant context (only mystifying unrelated terms) and when you open it the page turns out to have some completely different subject.
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  I noticed this in a PHP attack script earlier this year. It installs a script pointing to a Russian malware domain, but only inserts it in the page if the user agent is not GoogleBot or a few other spiders. It also checked for some Google ip ranges. Surely Google must be combating this by doing some stealth spidering, otherwise SEO and malware providers will game them if they stick to their classic robot rules.
- Re: (Score:2)
  
  by slapyslapslap ( 995769 ) writes:
  
  This is already being done, but in reverse. Google doesn't like it much either. Get caught, and you are de-listed.
  - Re: (Score:1)
    
    by maxwell demon ( 590494 ) writes:
    
    The point is, with Google executing JavaScript you could make it less obvious, by just having the JavaScript depend on some difference between the Google and the Browser JavaScript execution (maybe timings of certain rendering operations).
    Also, it might be used through XSS, to have competitors delisted.
- Re: (Score:1)
  
  by multicoregeneral ( 2618207 ) writes:
  
  Or one that generates useful looking links to other sites you own (on different servers and subnets, of course).
- Re: (Score:2)
  
  by squidinkcalligraphy ( 558677 ) writes:
  
  I would be surprised if the Googlebot didn't try everything to appear to the server like a normal user browser. Even better would be to crawl a site while in disguise, then again while not disguised. Differences would affect the site's ranking negatively.
  - Re: (Score:1)
    
    by maxwell demon ( 590494 ) writes:
    
    Serving different content based on IP or self-identification is possible even without JavaScript. However, if the detection makes use of peculiar behavior of the JavaScript implementation (and the JavaScript implementation will have to have some differences, or else it won't find content which is initially hidden but unhidden by a user interaction), just fetching from a different UI or with a different browser/spider identification doesn't work.
    And BTW, the spider will certainly expose itself from the very
- Re: (Score:3)
  
  by mwvdlee ( 775178 ) writes:
  
  By "gracefully degrading" do you mean "if (useragent == 'googlebot') { random-spamwords(); paywalled-content(); links-to-every-parsable-uri(); }"?
- - Re: (Score:2)
    
    by TheLink ( 130905 ) writes:
    
    Would it be possible to filter out sites like this? I personally don't want to find sites like these in my Google search results.
    - Re: (Score:2)
      
      by dave420 ( 699308 ) writes:
      
      Google is not there to represent your own idea of what the internet is. Sites like that will become more and more common, whether you like it or not.
I noticed this already some time ago. (Score:1)

by jimbauwens ( 1648531 ) writes:

When I was looking at the page previews (in google) of my JavaScript network scanner, I noticed it listed some IP's, indicating that it was running the script. Just google "http://bwns.be/jim/scanning_printing/detect_range.html" and look at the preview. (Also, most of those IP's probably exist, as my script indicates it is sure about them).
- Re: (Score:2)
  
  by C18H27NO3 ( 1282172 ) writes:
  
  You typoed your url. You have detect_range.html which is actually detect-range.html
  - Re: (Score:1)
    
    by jimbauwens ( 1648531 ) writes:
    
    Oh :P Thanks :)
  - Re: (Score:3, Funny)
    
    by RoccamOccam ( 953524 ) writes:
    
    Also, the dry cleaning that you dropped off on Thursday is ready for pick-up and your driver's license expires in three months.
    Sincerely,
    The Slashdot Citizens Brigade
- Re: (Score:3)
  
  by marcosdumay ( 620877 ) writes:
  
  Now that you mention it: the preview Google shows of one of my sites has all the CSS applied, including some that is applied by JavaScript after the page load.
so much for (Score:5, Insightful)

by Anonymous Coward writes: on Saturday May 26, 2012 @06:24AM (#40119477)

using javascript to hide or obfuscate email addresses to help protect them from spammers, scammers and bots.
thanks fer nuttin, google.
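For readers who haven't seen the trick being mourned here, a minimal sketch, assuming the simplest variant: the address is stored in pieces and only joined at run time. The function name and the example address are made up.

```javascript
// Assemble an address from pieces so the complete string never appears
// in the page source. String.fromCharCode(64) is the "@" character.
function assembleEmail(user, domainParts) {
  return user + String.fromCharCode(64) + domainParts.join(".");
}

// In a page this would typically be written into the DOM, for example:
//   element.textContent = assembleEmail("someone", ["example", "org"]);
```

This only hides the address from tools that read the raw source. Anything that actually executes the script, as the article says Googlebot now does, ends up with the assembled address.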

- Re: (Score:3)
  
  by VortexCortex ( 1117377 ) writes:
  
  robots.txt
  - Re: (Score:3)
    
    by MattskEE ( 925706 ) writes:
    
    Do you think spammers scraping the web for email addresses respect robots.txt?
- Re: (Score:3)
  
  by John Bokma ( 834313 ) writes:
  
  Uhm, years ago one could already do that using SpiderMonkey and some Perl. It's what I used to report nasty redirects in Blogspot/Blogger to Google (thousands and thousands). It took me some time, but Google did see the light and the problem was resolved.
  Why do people keep thinking that spammers are retards? If it can be abused, it will be. And spammers/cybercriminals are among the first to do so.
- Re: (Score:1)
  
  by goaxcap ( 2648385 ) writes:
  
  Use images or Flash to display the email address.
Evaluate JavaScript on the client (Score:1)

by Anonymous Coward writes:

Now that Google controls the client, the search engine and the analytics, it should not be too difficult for them to see how traffic is flowing between sites. Pages need not even be physically linked for Google to see a connection. E.g. reading an article on the BBC may cause people to search for a company. With people signing into Chrome, Google must have some very rich logs.
Google has been doing this for quite some time (Score:2, Interesting)

by Anonymous Coward writes:

Although maybe not quite in the same context. Google used to display javascript-munged email addresses in their search results until some of the larger sites involved, such as Rootsweb, complained.
GET vs POST (Score:1)

by Anonymous Coward writes:

I really hope website developers and web application developers know the difference between GET and POST requests.
Else, this could turn ugly.
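A rough sketch of the distinction being hinted at, assuming a toy in-memory handler rather than any real framework: GET requests only read, and anything that changes state is refused unless it arrives as POST. The paths and the function name are hypothetical.

```javascript
// Hypothetical request handler: only POST may change state, so a crawler
// that follows links (GET requests) cannot create articles by accident.
function handleRequest(method, path, articles) {
  if (path === "/articles" && method === "GET") {
    return { status: 200, body: articles.slice() }; // read-only copy
  }
  if (path === "/articles/new") {
    if (method !== "POST") {
      // 405 Method Not Allowed, advertising the accepted method.
      return { status: 405, headers: { Allow: "POST" } };
    }
    articles.push("untitled");
    return { status: 201 };
  }
  return { status: 404 };
}
```

With this shape, a bot requesting `/articles/new` via a plain link gets a 405 and the article list is left untouched.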
- Re: (Score:2)
  
  by physburn ( 1095481 ) writes:
  
  I've often programmed "write new article" or "add item" actions as GET links, and also as JavaScript actions, which would mean Google is going to be spamming forums and databases. What's the robots.txt command to prevent Googlebot from running the JavaScript on a page?
  - Re: (Score:2)
    
    by xOneca ( 1271886 ) writes:
    
    Maybe put JavaScript functions in a separate file and use robots.txt to deny bots access to it.
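As an illustration of that suggestion, here is a deliberately simplified check of a path against robots.txt rules. It is a sketch under assumptions: only `User-agent: *` groups and plain `Disallow:` prefixes are handled (no `Allow`, no wildcards, no stacked user-agent lines), and the `/js/` directory is a made-up example. It only models what a well-behaved crawler would do.

```javascript
// Simplified robots.txt check: handles only "User-agent: *" groups and
// plain "Disallow:" path prefixes. Real parsers support much more.
function isDisallowed(robotsTxt, path) {
  let inStarGroup = false;
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*/, "").trim(); // strip comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      inStarGroup = value === "*";
    } else if (field === "disallow" && inStarGroup && value !== "") {
      if (path.startsWith(value)) return true;
    }
  }
  return false;
}

// Hypothetical robots.txt that keeps crawlers away from a script directory.
const exampleRobotsTxt = "User-agent: *\nDisallow: /js/\n";
```

Here `isDisallowed(exampleRobotsTxt, "/js/actions.js")` is true while `/index.html` stays crawlable. Whether keeping the script file out of reach is enough to stop a crawler from triggering the actions is an assumption, not something the thread confirms.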
Google adding potential security holes in its bot? (Score:1)

by Kergan ( 780543 ) writes:

I can already picture hackers drooling at the idea of turning Google's cloud into the ultimate zombie network.
Chrome (Score:3)

by The MAZZTer ( 911996 ) writes: <megazzt&gmail,com> on Saturday May 26, 2012 @09:46AM (#40120435) Homepage

If you check out some of the thumbnails, it looks like Googlebot is using a customized version of Chrome now. You can see it blocking plugins.

I for one welcome the Javascript spamming. (Score:1)

by multicoregeneral ( 2618207 ) writes:

It's inevitable. Someone will figure out a way to abuse the system that google hasn't thought to make contingencies for yet. I'm on the fence as to whether this is a good idea. I just hope they know what they're doing.
- Re: (Score:2)
  
  by dave420 ( 699308 ) writes:
  
  Yeah, it's true - Google clearly knows nothing about searching the internet. ;)
  - Re: (Score:2)
    
    by multicoregeneral ( 2618207 ) writes:
    
    Dave, every time they make a change like this, they get hammered. They made some big changes the release before "panda" and the site was useless for almost a year.
They don't need to run the scripts (Score:3)

by Hentes ( 2461350 ) writes: on Saturday May 26, 2012 @01:59PM (#40122065)

You don't need to actually run the scripts, most of the time it's enough to just scrape the strings and links out of them.
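A naive sketch of that "just scrape the strings" approach, assuming a regular expression over quoted string literals is good enough. It is not a JavaScript parser: comments containing quotes, template literals and URLs built by concatenation will all defeat it. The function name is made up.

```javascript
// Pull quoted string literals out of script source without executing it,
// and keep the ones that look like absolute URLs or site-relative paths.
function extractUrlsFromScript(source) {
  const stringLiteral = /"((?:[^"\\\n]|\\.)*)"|'((?:[^'\\\n]|\\.)*)'/g;
  const urls = [];
  for (const match of source.matchAll(stringLiteral)) {
    const text = match[1] !== undefined ? match[1] : match[2];
    if (/^(https?:\/\/|\/)\S+$/.test(text)) urls.push(text);
  }
  return urls;
}
```

This finds links that are written out literally, which covers a lot of simple scripts, but it misses anything computed at run time, such as the result URLs discussed earlier in the thread.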

WTF? (Score:2)

by Johann Lau ( 1040920 ) writes:

Oh yeah, fuck accessibility. Fuck the web in general. "It's better for everybody". That's literally all you need to know. "Just go ahead and remove that from your robots.txt".
I'm not saying there may not be good reasons (e.g. having the CSS and Javascript actually makes it possible to detect invisible text and whatnot, without that search engines may not even have a chance), but I really would appreciate some good reasoning, not being talked to like a fucking 5 year old.
Or hey, how about adding that "of cou
Spammers! (Score:4, Informative)

by xenobyte ( 446878 ) writes: on Sunday May 27, 2012 @03:06AM (#40126669)

They've been testing this for a while - We've already had the first complaints against someone spamming an email that only exists in exactly one place: Online as the result of some (trivial) javascript. Turned out that if you Googled the page, the result snapshot included the javascript generated email... In other words - it's already there and this will effectively kill javascript as a way of hiding functioning mailto links. Okay it would be fairly simple to add a condition based on the User Agent as GoogleBot is easily identified but it will make things a bit more complicated for the average user.
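The "condition based on the User Agent" could look something like the sketch below. It assumes the crawler announces itself with "Googlebot" in its user-agent string, which a client is free to fake or omit, so this is a courtesy filter at best. Both function names and the placeholder text are hypothetical.

```javascript
// Returns true if the user-agent string claims to be Googlebot.
// This is self-identification only and is trivially spoofed.
function claimsToBeGooglebot(userAgent) {
  return /googlebot/i.test(userAgent || "");
}

// Emit the mailto link for ordinary visitors, a placeholder otherwise.
function emailMarkupFor(userAgent, address) {
  return claimsToBeGooglebot(userAgent)
    ? "[address hidden]"
    : `<a href="mailto:${address}">${address}</a>`;
}
```

In a page script the first argument would typically be `navigator.userAgent`. Note that this keeps the address out of a search snapshot only; a harvester running its own script engine under a browser-like user agent would still see it.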

- Re: (Score:2)
  
  by dave420 ( 699308 ) writes:
  
  There are command-line WebKit-based parsers out there, which allow you to process any URL or file as a browser would, and take either a screenshot of the page or access the DOM or whatever. They're not new.
