An Ounce of Big Data is Worth a Pound of Defense

Tuesday, June 17, 2014

Ali Golshan


In security circles, the concept of what constitutes prevention has shifted from what it meant as recently as five years ago, driven by the growing complexity of malware and the increasing maturity and sophistication of cybercriminals. As more funding and resources have become available for targeted attacks, enterprise security teams need to rethink what prevention means to their organization. If they’re smart, they will leverage big data to do it.

Today’s Cybercrime Landscape:

There is a thriving marketplace for malware: sophisticated threats with high success rates are prone to duplication and replication. Malware-as-a-service has grown substantially, whether offered by the creators of popular malware families such as Zeus or built as one-off attacks for a higher price.

My company is contacted on a weekly basis by a variety of individuals, mostly operating underground, offering zero-day vulnerabilities. They explicitly mention that they can provide proof that these exploits could infiltrate most banking systems because of the standard card-processing models used on the back-end databases.

In addition to “guns for hire” we are also witnessing unprecedented growth and investment in cyber capabilities by nation states. This is an area where, depending on the nation's interests and goals, we are seeing cyber investments overtake traditional, physical defense investments. In recent months we have seen a substantial increase in activity from certain groups in China and Iran and from the SEA (Syrian Electronic Army), taking direct aim at very specific assets and interests. However, some of the most sophisticated groups, based on resources, funding, and pure technical capabilities, are groups out of Russia.

Whether stemming from organized crime groups with relationships to the government or from purely state-based attacks, the techniques that constitute the overall tradecraft are some of the most advanced and covert we have seen to date. Today’s bad actors prefer operating under the radar, unlike the operators behind the majority of Chinese-based attacks, who cared little about discovery or identification.

While the industry widely acknowledges and accepts the shifts in the root cause and source of attacks, we still need to understand and integrate these soft data points into our overall detection and prevention methodology. As necessary as that is, it is easier said than done, in great part because attackers have traditionally been a few steps ahead of their targets.

Generally, enterprises struggle to break through the obfuscation and armoring methods of their attackers, which limits their ability to understand the attackers’ deeper intent and target. Leveraging vulnerabilities to exploit applications and users has become common practice; however, normalizing data from vectors such as network traffic, static analysis, behavioral indicators and reputation data sources is a very difficult process.

This is where the use of big data can add immense value.

Data, Data, and More Data:

Consider how much data is associated with a single executable malware object:

First we have to take into account the network data we can collect, starting with the originating IP: did the object come from a legitimate source, indicating that the source was infected, or was it delivered through dark IPs or known malicious infrastructure?

Next we consider what hops it took: did it produce command-and-control traffic, what domains does it reach out to, what are the patterns, and can we detect entropy in the traffic if it uses a self-signed, hijacked or expired certificate?

Then we can move to determining what sort of data is available from static analysis: are there indicators pointing to specific IPs, assets, users or services? Are there morphic algorithms within the object causing multiple variations? Is there obfuscation and encryption? Is the payload included, or does it look like a framework to receive a future payload?

Finally, we have to move to behavioral analysis and first detonate the object. That assumes there is no armoring against emulation, virtualization or dedicated sandbox architectures, and no sleep loops or environmental and topological armoring, such as trace route checks, designed to ensure the payload only detonates in a particular environment and/or in a specific location.
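
To make the four sources above a little more concrete, here is a minimal Python sketch of what a single normalized record might look like. Everything in it is an assumption made for illustration: the `ObjectRecord` fields, the `net:`/`static:`/`behavior:` feature prefixes, and the 7.2 bits-per-byte entropy cutoff are hypothetical and do not come from any particular product. The only standard piece is the Shannon entropy formula, which is commonly used as a rough hint that content is packed or encrypted.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (ranges from 0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


@dataclass
class ObjectRecord:
    """One malware object, with observations from each analysis vector.

    All field names are hypothetical; a real pipeline would carry far more.
    """
    sha256: str
    # Network observations
    source_ip: str = ""
    contacted_domains: set = field(default_factory=set)
    cert_status: str = "unknown"  # e.g. "self_signed", "expired"
    # Static analysis observations
    embedded_indicators: set = field(default_factory=set)
    payload_entropy: float = 0.0
    # Behavioral observations from detonation
    behaviors: set = field(default_factory=set)

    def features(self, entropy_threshold: float = 7.2) -> set:
        """Flatten the record into a set of comparable string features.

        The default threshold is an illustrative assumption, not a standard.
        """
        feats = set()
        if self.source_ip:
            feats.add(f"net:src:{self.source_ip}")
        feats.update(f"net:domain:{d.lower()}" for d in self.contacted_domains)
        if self.cert_status != "unknown":
            feats.add(f"net:cert:{self.cert_status}")
        feats.update(f"static:ioc:{i}" for i in self.embedded_indicators)
        if self.payload_entropy >= entropy_threshold:
            feats.add("static:high_entropy")
        feats.update(f"behavior:{b}" for b in self.behaviors)
        return feats
```

The point of flattening everything into one set of strings is simply that network, static and behavioral observations end up in a common shape, so they can be compared across objects regardless of which tool produced them.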

This sounds like a lot of information to keep track of, right?

Now imagine correlating thousands of objects producing the same data and requiring the same depth of analysis. More importantly, imagine trying to find relationships in all this metadata to determine whether they are part of the same family or come from the same adversary, as well as what the attacker’s intent is and what their real target is.
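
As a rough illustration of that correlation problem, the sketch below groups objects whose feature sets overlap, using Jaccard similarity and a small union-find. This is one simple technique chosen for the example, not a description of how any specific product works; the function names, the sample feature strings and the 0.5 threshold are all hypothetical. The pairwise comparison is also quadratic, so at real scale it would need indexing or a proper big data pipeline.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_by_similarity(objects: dict, threshold: float = 0.5) -> list:
    """Group object names whose feature sets are similar enough.

    `objects` maps an object name to its set of feature strings.
    Objects are linked transitively, so A~B and B~C puts A, B and C
    in one candidate family even if A and C share little directly.
    """
    parent = {name: name for name in objects}

    def find(x):
        # Walk to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(objects, 2):
        if jaccard(objects[a], objects[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in objects:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical sample data for illustration only.
samples = {
    "obj_a": {"net:domain:c2.example.test", "behavior:sleep_loop", "static:high_entropy"},
    "obj_b": {"net:domain:c2.example.test", "behavior:sleep_loop", "static:ioc:10.0.0.5"},
    "obj_c": {"net:domain:other.example.test", "behavior:keylogging"},
}
families = group_by_similarity(samples)
```

With this sample data, `obj_a` and `obj_b` share two of four distinct features and land in the same group, while `obj_c` stays on its own. A grouping like this only suggests a common family or adversary; an analyst still has to confirm it.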

Why are these questions important to answer?

Well, in May 2014 the DOJ set a precedent by indicting five members of a Chinese military hacking team. The formal identification and recognition of individuals associated with such attacks brings long-term implications, such as being unable to travel to certain countries, limited future job prospects, and being in the crosshairs of most intelligence and government organizations.

This is a huge game changer because, until now, active defense by the private sector could (in my view) have very large and negative ramifications: you never really know who’s on the other end, or what their motives and resources are.

However, with this decision, identification of adversaries, as well as their tradecraft and overall intent, will be part of prevention and response efforts. If private sector companies have better tools to ID their attackers, they can work with the appropriate agency to build very strong cases against the attackers, which gives those companies new, legitimate options going forward.

Therefore, it’s more important than ever to ensure detection doesn’t end with a single alert or incident. Advanced threat technologies need to integrate with the mesh of the existing infrastructure, leveraging the context and enforcement already in place. They should make use of big data tools such as machine learning to combine features into feature sets that indicate the larger cause and effect of an attack (aka tradecraft), or smart and thick data solutions to determine where to look and what to look for rather than just looking at every piece of content produced.

The proper application of big data enables security analysts to correlate the right information, to ‘zoom out’ and see the whole galaxy rather than one bright-shining star. In other words, it will give you the whole story, rather than the one chapter your adversary wants you to read.

About the Author: Ali Golshan founded Cyphort in 2010 and leads the company’s research and technical direction. The original architect behind the core Cyphort technologies, Ali has more than 15 years of forensics, security analytics, and security architecture experience. He has advised numerous Fortune 100 companies such as Microsoft, PwC, Google, as well as government intelligence agencies and the defense industry. Working with these high-profile entities, Ali resolved matters related to cyber-espionage, targeted attacks, and research in the fields of machine learning and big data analytics.  Ali started at the age of 17 as an ethical hacker.
