I Know Who You Hacked Last Summer - Attribution 101

Thursday, August 18, 2011

J. Oquendo


I Know Who You Hacked Last Summer - Attribution Validation 101

After see-sawing on the concept, I'd like to bring forth my “attribution validation” theory. I figured I would throw the concept out to the public in order to generate discussion.

The gist of my “attribution validation” theory goes along the lines of a “passive offensive,” a mechanism to identify any party involved with a compromise of sensitive data.

How is this accomplished, you ask? In similar fashion to an airplane's flight recorder, otherwise known as a “black box,” or perhaps even LoJack for documents.

Black boxes “are designed to emit a locator beacon for up to 30 days” [1]. What if we could have the same type of technology implanted inside of a file?

My theory is, if data is compromised and exfiltrated, upon opening the file, a beacon would trigger a connection to a server simply notifying the server “here I am, being read from this location.”
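As a rough illustration of the server side of that idea, here is a minimal sketch of a collector that records each "here I am" call. It assumes the beacon is nothing more than an HTTP GET carrying a document identifier; the `/beacon` path, the `id` parameter, and the names `SIGHTINGS`, `start_collector`, and `simulate_open` are all hypothetical and are not taken from the demo mentioned later in the post.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# In-memory log of beacon sightings (hypothetical; a real collector would persist these).
SIGHTINGS = []


class BeaconHandler(BaseHTTPRequestHandler):
    """Records which token phoned home and the address the call came from."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        token = query.get("id", ["unknown"])[0]
        SIGHTINGS.append({
            "token": token,
            "source_ip": self.client_address[0],
            "user_agent": self.headers.get("User-Agent", ""),
        })
        # 204 No Content: acknowledge the request without returning a body.
        self.send_response(204)
        self.end_headers()

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def start_collector(host="127.0.0.1", port=0):
    """Start the collector on a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), BeaconHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def simulate_open(server, token):
    """Stand-in for a document being opened: issue the beacon request."""
    port = server.server_address[1]
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", f"/beacon?id={token}")
    status = conn.getresponse().status
    conn.close()
    return status
```

Calling `start_collector()` and then `simulate_open(server, "doc-001")` leaves one entry in `SIGHTINGS`. Note that the recorded `source_ip` is only the address that connected to the collector, so a reader behind a proxy or relay would show up as that relay.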

In matters of attribution, this could go a long way, further than the currently accepted practice of relying solely on the origin of the attack.



In current events, the media, security companies, and what is slowly becoming the “Cybersecurity Industrial Complex” label almost all attacks as “Advanced Persistent Threats,” almost always originating from one source: China. The mechanism for identifying the attackers is always the same: “the attacker came from the following IP address.”

I use the term "always" because no one is sharing information regarding these attacks. Everyone seems to think that the data is gold. In either event, not one company has offered an iota of information on how they are attributing a source of attack outside of the IP address realm.

Often in an attack, let's say "a really advanced attack," we should use some deductive reasoning when analyzing what occurred. We can be sure that no one is going to place a target on themselves and attack another machine from their originating location. To do so would be suicide - “here I am at this location attacking you!” It makes very little sense.

As an attacker, if I needed to gain entrance into a machine, it would be beneficial for me to do so from another country. In fact, I would outright find an open wireless network, compromise a host abroad and launch my attack from that host. Many attackers are fully aware of this and many attackers choose to compromise machines in China for this reason alone.

The running joke is “the buck stops in China” and validation of that statement comes by way of recent news reports [2]: “China claimed on Tuesday that it was hit by nearly 500,000 cyberattacks last year, according to the South China Morning Post.”

Before continuing, let me be clear: I DO NOT believe that China is simply a victim, nor do I believe that there aren't any threats coming from China. On the contrary, my thoughts are, if the attackers really ARE coming from China, let's call them out on this. The issue that arises is how to do so factually, without relying solely on the IP address from which the attack originated.

We can assume that if a government sponsors a hacking program whose sole goal is to exfiltrate data, someone is going to need to analyze that data. These would be the targets that I would want to identify. “Where does my data go? Who is reading the data?”

There is nothing stopping any government from seeking “hired guns” whose sole purpose is to compromise machines and exfiltrate data. While the hired guns are certainly the danger, I am willing to bet that nary a hired gun is going to use a static, identifiable connection that could be traced back to them.

It is similar to putting a target on one's back and saying: “Here I am, I just compromised you from here!” For a company or government to sponsor a program, it would also be a waste of money. I also believe that whoever is compromising the data is not the party sifting through that data.

In my “attribution validation” framework, my data would all have beacons capable of connecting back for the sake of identifying where they are being read geographically. This could potentially yield “who is actually reading and where are they.” Certainly the parties reading and parsing through exfiltrated data would benefit more than the attackers.

Theory and practice are two different beasts though. Certainly there would be a lot of “line noise” that could inject many false positives and false negatives into the equation and those would need to be sorted.

In order to counter that line noise, the documents would be what I call “loaded cookies” or as others have called them, honeytokens. I would give an attacker what they wanted. Only it would be on my own terms.
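One way to cut down on part of that line noise is to make each honeytoken verifiable, so the collector can discard beacon calls that carry made-up identifiers. The sketch below is an assumption about how that could look, not a description of the framework itself: it signs a document ID with an HMAC, and `SECRET_KEY`, `issue_token`, and `verify_token` are hypothetical names.

```python
import hashlib
import hmac
import secrets

# Hypothetical signing key; it would live only on the collector, never in a document.
SECRET_KEY = secrets.token_bytes(32)


def issue_token(document_id: str, key: bytes = SECRET_KEY) -> str:
    """Build a per-document honeytoken of the form '<document_id>.<nonce>.<tag>'."""
    nonce = secrets.token_hex(8)  # makes every issued token unique
    payload = f"{document_id}.{nonce}"
    tag = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()[:32]
    return f"{payload}.{tag}"


def verify_token(token: str, key: bytes = SECRET_KEY):
    """Return the document ID if the token was issued with this key, else None."""
    try:
        # Split from the right so document IDs containing dots still parse.
        document_id, nonce, tag = token.rsplit(".", 2)
    except ValueError:
        return None
    expected = hmac.new(
        key, f"{document_id}.{nonce}".encode(), hashlib.sha256
    ).hexdigest()[:32]
    return document_id if hmac.compare_digest(tag, expected) else None
```

A collector would call `verify_token` on every incoming identifier and drop anything that returns `None`. This only filters forged identifiers; a genuine token copied into some other document would still verify, so that source of false positives would need separate handling.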

In either event, I will be posting a video demonstrating Attribution Validation shortly, complete with demonstrated beacons and tracking. Stay tuned.

[1] http://en.wikipedia.org/wiki/Flight_data_recorder

[2] http://www.ibtimes.com/articles/195431/20110810/cyber-attack-hacking-hacker-united-states-india-china.htm

Emmett Jorgensen Cool concept! Look forward to seeing the video.
Wally Paperson GOOD TITLE! :)
Ian Sneyd The only issue I see is that the "honeytokens" would soon be found and reverse engineered, possibly added to freely available documents on sites used by millions, rendering your system useless pretty quickly...
J. Oquendo Ian - anything can be reversed. For the sake of the document, I kept things low key. With enough craftiness, I can almost guarantee you, there would be no afterthought to even being tracked. I anticipate finalizing the audio this weekend so hopefully I can have the video up this morning.

My concept is similar to what happens when we walk into most department stores. Many of them have anti-theft devices on their wares. Those devices will trigger an alarm when someone tries to walk out with an item whose tag a clerk has not removed. For the video demo I am doing, I make the alarm overt; however, I can obfuscate it enough so that it is highly "covert."

Look at the banking world, even: in a bank, a teller will have a dye pack they will hand to a robber. When the robber walks out, the dye pack explodes. The dye then stains the robber, making him identifiable to authorities. My concept of a homing application is no different. I chose to place a digital dye pack or anti-theft application inside of my documents. This enables me to see where in the world my data is being read.

I cannot imagine ANY instance where a robber would cry foul over an exploding dye pack, let alone a company complaining they'd been had by the data they stole.
Lucian Andrei Very interesting principle. If I understood correctly, you want the stolen data to send you a message: "here I am".
The problem, as I see it, is that the thief could read the data in an isolated environment, without an internet connection. In this case you’ll have no alarm.

To make an analogy, it makes no sense that a burglar alarm sounds inside the thief's house. No one hears it.

It would be better if you put some kind of detection mechanism inside the data, a mechanism that will make noise when the data leaves the company. I know that there are a lot of detection mechanisms (IDS, SIEM....) but I doubt that many companies know where all the frakels are.
If you have the business people use this kind of mechanism to mark data, it will be easier for us to detect the bad guys.

Like in a shop, the security guard doesn't know what is inside the consumer's bag, but if the alarm sounds it takes action.

What do you think?
J. Oquendo Lucian, because most of the work is theory, I may also look to apply for patents so I will keep mum on a lot.

Let's think about the so called APT. We have an attacker, they compromise a machine and the likelihood of that data staying on that machine is low. My theory is as follows using ASCII art:

1) Attacker ---> compromise --> target
2) Attacker <--- steals data <-- target
3) Analysts ---> connects to ---> Attacker
4) Analysts <--- retrieves data <--- Attacker

In order to complete 3 & 4 there likely needs to be a connection in my connection. This is because:

3) Analysts (in country A) ---> connects to ---> (country B) Attacker
4) Analysts (in country A) <--- retrieves data <--- (country B) Attacker

To think someone would be insane enough to pull this off is absurd:

1) Attacker (hey look at me coming from this static IP address you can ID a mile away!) ---> compromise --> target

This is what almost EVERYONE is yapping about as the APT. Forget the fact that:

Me: (any Internet cafe, wifi point, etc in Country C) --> compromise --> Host in Country B ---> Now I can attack from this machine

All analysts are crying foul over is:

Host in Country B (attacker) ---> compromised ---> Victim (country A)


1) Me (Country C) ---> compromises ---> Victim (Country B) ---> compromises ---> Victim (Country A)
2) Victim (Country B) ---> compromises ---> Victim (Country A)
3) Victim ---> incident responders ---> OMG Country B compromised us!!!
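The chain above can be put into a small model to show why IP-based attribution stops one hop short. This is only an illustrative sketch; `Hop`, `observed_source`, and the sample chain are hypothetical names built from the countries in the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    label: str
    country: str


def observed_source(chain):
    """Return the hop the final victim actually sees in its logs.

    The chain is ordered from true origin to final victim, so the victim
    only ever observes the hop that connected to it directly.
    """
    if len(chain) < 2:
        raise ValueError("need at least an origin and a victim")
    return chain[-2]


# The scenario from the diagram: C compromises B, then attacks A from B.
chain = [
    Hop("me, at an Internet cafe", "C"),
    Hop("compromised host", "B"),
    Hop("final victim", "A"),
]

seen = observed_source(chain)
print(f"Incident responders see Country {seen.country}; "
      f"the true origin is Country {chain[0].country}")
```

However many intermediate hosts are added to the chain, `observed_source` keeps returning the last one before the victim, which is the only thing the source IP address can establish.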

I am almost positive no one on the machine in Country B is even looking at that data. This task will be handed to someone else. There is no feasible mechanism for a "techie/hacker" to understand what is or isn't TRULY valuable from the data lifted from Country C. Makes zero sense no matter how you cut it. Think about that. Here we have someone who is rather skillful at computing. His arena is security; he is certainly not going to be figuring out what data inside of a Fortune X company is worth any money on an economic espionage scale or a military scale.
Lucian Andrei You are right, but my question was: what if the bad guy, once he leaves the internet cafe, never again puts the data on an internet-connected device?

3) Analysts (in country A) ---> connects to ---> (country B) Attacker
4) Analysts (in country A) <--- retrieves data <--- (country B) Attacker