I Know Who You Hacked Last Summer - Attribution Validation 101
After see-sawing with the concept, I'd like to bring forth my “attribution validation” theory. I figured I would throw it out to the public in order to generate discussion.
The gist of my “attribution validation” theory goes along the lines of a “passive offensive,” a mechanism to identify any party involved with a compromise of sensitive data.
How is this accomplished, you ask? In similar fashion to an airplane's flight recorder, otherwise known as a “black box.” Or perhaps even LoJack for documents.
Black boxes “are designed to emit a locator beacon for up to 30 days.” What if we could have the same type of technology implanted inside of a file?
My theory is, if data is compromised and exfiltrated, upon opening the file, a beacon would trigger a connection to a server simply notifying the server “here I am, being read from this location.”
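To make the "here I am" idea concrete, here is a minimal sketch of the receiving side only. It assumes the beacon shows up as a plain HTTP GET carrying a document identifier in an `id` query parameter; that format, the names `record_beacon`, `BeaconHandler`, and `serve`, and the port are all my own placeholders, and how a given file format would actually trigger the request is left open.

```python
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

BEACON_LOG = []  # in-memory only; a real deployment would persist this


def record_beacon(path, client_ip, user_agent, now=None):
    """Turn one beacon request into a log record, or None if no id is present."""
    query = parse_qs(urlparse(path).query)
    token = query.get("id", [None])[0]
    if token is None:
        return None
    record = {
        "token": token,            # which document phoned home (assumed format)
        "client_ip": client_ip,    # where the connection appeared to come from
        "user_agent": user_agent,  # whatever the reading software reported
        "seen_at": (now or datetime.now(timezone.utc)).isoformat(),
    }
    BEACON_LOG.append(record)
    return record


class BeaconHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        record_beacon(self.path, self.client_address[0],
                      self.headers.get("User-Agent", ""))
        # Answer every request identically so the response reveals nothing.
        self.send_response(204)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep stderr quiet; BEACON_LOG holds what we need


def serve(host="0.0.0.0", port=8080):
    """Blocks forever; call only when you actually want to listen."""
    HTTPServer((host, port), BeaconHandler).serve_forever()
```

Calling `serve()` starts the listener, and each hit lands in `BEACON_LOG` as a small dictionary. Keep in mind that the logged address is only the last hop the server saw, which may itself be a proxy.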
In matters of attribution, this could go a long way, further than the currently accepted practice of relying solely on the origin of the attack.
In current coverage, the media, security companies, and what is slowly becoming the “Cybersecurity Industrial Complex” label almost all attacks “Advanced Persistent Threats,” almost always originating from one source, China. The mechanism for identifying the attackers is always the same: “the attacker came from the following IP address.”
I use the term "always" because no one is sharing information regarding these attacks. Everyone seems to think that the data is gold. In any event, not one company has offered an iota of information on how they are attributing a source of attack outside of the IP address realm.
Often in an attack, let's say "a really advanced attack," we should use some deductive reasoning when analyzing what occurred. We can be sure that no one is going to place a target on themselves and attack another machine from their originating location. To do so would be suicide - “here I am at this location attacking you!” It makes very little sense.
As an attacker, if I needed to gain entrance into a machine, it would be beneficial for me to do so from another country. In fact, I would simply find an open wireless network, compromise a host abroad, and launch my attack from that host. Many attackers are fully aware of this, and many choose to compromise machines in China for this reason alone.
The running joke is “the buck stops in China,” and validation of that statement comes by way of recent news reports: “China claimed on Tuesday that it was hit by nearly 500,000 cyberattacks last year, according to the South China Morning Post.”
Before continuing, let me be clear: I DO NOT believe that China is simply a victim, nor do I believe that there aren't any threats coming from China. On the contrary, my thought is that if the attackers really ARE coming from China, let's call them out on it. The issue is how to do so factually, without relying solely on the IP address from which the attack originated.
We can assume that if a government sponsors a hacking program whose sole goal is to exfiltrate data, someone is going to need to analyze that data. These would be the targets that I would want to identify. “Where does my data go? Who is reading the data?”
There is nothing stopping any government from seeking “hired guns” whose sole purpose is to compromise machines and exfiltrate data. While the hired guns are certainly the danger, I am willing to bet that nary a hired gun is going to use a static, identifiable connection that could be traced back to them.
It is similar to putting a target on one's back and saying: “Here I am, I just compromised you from here!” For a company or government sponsoring such a program, it would also be a waste of money. I also believe that whoever is compromising the data is not the party sifting through that data.
In my “attribution validation” framework, all of my data would carry beacons capable of connecting back to identify where they are being read geographically. This could potentially reveal who is actually reading the data and where they are. Certainly the parties reading and parsing through exfiltrated data stand to benefit more from it than the attackers do.
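As a rough sketch of the "where are they" step, the snippet below groups beacon hits by region for each document. The lookup table is a placeholder of my own: it uses documentation-only address ranges with made-up labels, where a real setup would use a proper GeoIP dataset. The names `REGION_TABLE`, `region_for`, and `readers_by_region` are likewise hypothetical.

```python
import ipaddress
from collections import Counter

# Placeholder table: documentation-only ranges with invented labels.
# A real deployment would load this from a GeoIP dataset instead.
REGION_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "Region A"),
    (ipaddress.ip_network("198.51.100.0/24"), "Region B"),
    (ipaddress.ip_network("203.0.113.0/24"), "Region C"),
]


def region_for(ip):
    """Map an IP address string to a region label, or 'unknown'."""
    addr = ipaddress.ip_address(ip)
    for network, label in REGION_TABLE:
        if addr in network:
            return label
    return "unknown"


def readers_by_region(hits):
    """hits: iterable of (token, client_ip) pairs.

    Returns {token: Counter({region: number_of_hits})}.
    """
    summary = {}
    for token, ip in hits:
        summary.setdefault(token, Counter())[region_for(ip)] += 1
    return summary
```

A document that is repeatedly opened from one region, long after the intrusion itself, says more about who is reading it than the single address the attack came from.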
Theory and practice are two different beasts, though. Certainly there would be a lot of “line noise” that could inject many false positives and false negatives into the equation, and those would need to be sorted out.
In order to counter that line noise, the documents would be what I call “loaded cookies” or, as others have called them, honeytokens. I would give an attacker what they wanted. Only it would be on my own terms.
In any event, I will be posting a video demonstrating Attribution Validation shortly, complete with demonstrated beacons and tracking. Stay tuned.
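One way to picture how honeytokens cut down on noise: give every decoy copy its own signed identifier, so a hit either matches something I planted or gets thrown out. This is only a sketch under my own assumptions; the token format, the HMAC signing, and the names `mint_token`, `classify_hit`, and `PLANTED` are illustrative, not a description of any existing tool.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = secrets.token_bytes(32)  # hypothetical per-deployment key
PLANTED = {}  # token -> where that decoy copy was planted


def _sign(nonce):
    """Short HMAC tag binding a nonce to this deployment's key."""
    return hmac.new(SECRET_KEY, nonce.encode(), hashlib.sha256).hexdigest()[:16]


def mint_token(decoy_name, location):
    """Create a unique, signed token for one decoy copy and remember where it went."""
    nonce = secrets.token_hex(8)
    token = f"{nonce}.{_sign(nonce)}"
    PLANTED[token] = {"decoy": decoy_name, "location": location}
    return token


def classify_hit(token):
    """Return the planted record for a genuine token, or None for noise."""
    nonce, _, sig = token.partition(".")
    if not hmac.compare_digest(sig.encode(), _sign(nonce).encode()):
        return None  # forged or mangled token: treat as line noise
    return PLANTED.get(token)
```

Because each copy carries a different token, a genuine hit also tells me which decoy was taken and from where it was lifted, while random scans and guessed identifiers fall out as `None`.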