While the digital paparazzi were lined up waiting to snap photos of the Lulzboat crew getting vanned, some of us focused on how this collection of low-tech script kiddies was able to knock over Sony, AT&T, the CIA, Arizona's DPS and numerous other sites and make off with highly confidential content again and again.
It turns out that they had an accomplice, Google. Now before the good townspeople grab their torches and pitchforks and beat a hasty path to Mountain View, let it be known that Google's part in these massive hacks isn't actually Google's fault.
Or perhaps it can be, if the public wants to blame them anyway and ask why this information is sitting on Google for the taking in the first place. But that's not really the issue at all.
The blame, in my opinion, lies once again with the administrators of the sites that were attacked. Google merely indexed the available booty for the lulzers and others and left the cardboard box on the curb where it could be picked up by anyone who drove by.
After all, page crawls weren't considered privileged information - they're all part of the "public internet" available to anyone who drops by.
How could this be? How could Google allow these kids to troll the internet and easily locate SQLi vulnerabilities or remote logins, passwords or even entire databases for the taking without any real effort at all? Simple.
A little thing known as SEO, sitemaps and the little spiders that go bump in the night. Let's look at the problem, along with a few specifics since the bad guys have been doing this for years and years and it's not a secret at all. Then I will explain what site admins can do to see to it that this information is not left at the curb any longer.
Copy and paste the following into a Google search in a new window. I'll wait:
filetype:sql hotmail gmail password
You can try the above and substitute other filetypes and keywords as well.
You might even see some major security companies and governments turn up in there. For extra credit, use the "site:your website url here" and see what comes up on yours!
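To see how little effort this takes, here is a minimal sketch that composes a few classic "dork" queries scoped to a single site. The dork patterns below are well-known public examples of the genre, not the specific list from the search above, and the domain is a placeholder:

```python
# Compose a few well-known Google "dork" queries for a given target domain.
# The patterns below are common public examples, not an exhaustive list.

def build_dorks(domain):
    """Return search strings that scope classic dorks to one site."""
    patterns = [
        'filetype:sql "INSERT INTO" password',  # exposed database dumps
        'inurl:admin intitle:login',            # forgotten admin panels
        'filetype:log username password',       # leaked log files
    ]
    return ['site:{} {}'.format(domain, p) for p in patterns]

for query in build_dorks("example.com"):
    print(query)
```

Paste any one of those strings into Google with your own domain and you are doing exactly what the kiddies do.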
THIS is what the script kiddies do when they do their Google drive-bys. The victims of LulzSec and others fell because of such simple Google searches, and they're made even easier when you have a target URL in mind to play "anybody home?"
As long as Google has it in its indexes and you know the keywords to search for, it certainly isn't "nuclear brain science" to find an injectable site.
There are plenty of tools to automate the attacks on the database behind the site once you know how to POST or GET to it. I've seen apologists claim "we don't use MySQL."
Rest assured that there are exploit GUIs readily available for PostgreSQL, MSSQL and Oracle as well as lesser and older databases. If it's there, and they can find it, and they can talk to it, and you're not properly filtering what can get to it, your site could very well be the next breaking news story.
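The "proper filtering" fix is not exotic. Here's a minimal sketch, using an in-memory SQLite database with made-up table names, of why string-spliced SQL falls over and parameter binding doesn't:

```python
import sqlite3

# Demonstration: string-concatenated SQL vs. parameterized queries.
# The table, rows and attacker string are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

attacker_input = "' OR '1'='1"

# Vulnerable: the input is spliced straight into the SQL text,
# so the injected OR clause matches every row in the table.
vulnerable = "SELECT name FROM users WHERE name = '%s'" % attacker_input
leaked = conn.execute(vulnerable).fetchall()

# Safe: the driver binds the value as data, never as SQL,
# so the injection string is just a weird name that matches nothing.
safe = conn.execute("SELECT name FROM users WHERE name = ?",
                    (attacker_input,)).fetchall()

print(leaked)  # [('alice',)] -- every row leaks
print(safe)    # []           -- no match
```

Every serious database driver supports this kind of placeholder binding; there is no excuse for building query strings out of user input.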
Search results on Google come from two primary methods. The first is web crawlers, which may or may not respect the "robots.txt" file in your website's root. Most webmasters are well aware of the rules for "robots" but can't always be expected to be aware of what dynamic web pages could contain from other parts of their site's backend.
Therefore, some dynamic content might end up not being in the "robots.txt" file to be skipped in the first place. It is essential that those responsible for web sites ensure that the golden rule of "if you don't want people to see it, don't put it on the site in the first place" is properly enforced.
Some more "l33t" hackers have written their OWN webcrawlers, and you cannot count on these critters to obey your "robots.txt" file at all. Google usually does, but don't count on it EVER. There are plenty more spiders in that basement, and Google is but one of them.
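The point is worth seeing in code: robots.txt is a polite request, not an access control. A minimal sketch, using Python's standard robotparser and a hypothetical policy with made-up paths:

```python
from urllib.robotparser import RobotFileParser

# A robots.txt asking crawlers to skip two directories.
# The paths here are hypothetical examples.
rules = """\
User-agent: *
Disallow: /backups/
Disallow: /admin/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# A well-behaved crawler checks before fetching:
print(rp.can_fetch("Googlebot", "/backups/users.sql"))  # False
print(rp.can_fetch("Googlebot", "/index.html"))         # True

# A hostile crawler simply never calls can_fetch() at all.
# Nothing about robots.txt actually stops the request.
```

If a file must not be crawled, robots.txt is the wrong tool; take it off the public site or put it behind authentication.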
The biggest risk of all, though, is SEO ("Search Engine Optimization" for my pointy-haired readers). It involves creating sitemaps using either Google's own sitemapping tool or a third-party SEO tool, which will truly map everything it can find and then wrap it all up into a nice little XML file that the webmaster uploads to all the search engines.
Incredibly, a lot of not-so-experienced webmasters will run the SEO tool and never look at the final output file before sending it! If the XML indexes your databases or scripts, they're all part of your sitemap, ready for lulzing. PLEASE check your sitemap before sending it.
Yes, there WILL be a test. And it will go down on your "permanent record." Some useful reading on how your databases can be hit can be found here:
Google even has some nifty tools with which you can test your injectability quotient:
Bottom line: If you don't want pirates on your poopdeck, remember the golden rule. If it's ON your website, it's there for the picking. Do NOT toss your company's wallet on the sidewalk and expect it to be there intact the following morning.
Know what's on your website, know what's being indexed and be certain that anything you don't want anybody else to own isn't there in the first place. Kinda depressing to even have to say any of this. May the lulz be your own, and not some idiot children with no leet in them at all.
About the author: Kevin McAleavey is the architect of the KNOS secure operating system ( http://www.knosproject.com ) and has been in antimalware research and security product development since 1996.