To DLP or not to DLP - Data Leakage/Loss Prevention

Wednesday, January 19, 2011

kapil assudani


Data Loss/Leak Prevention is no longer a buzzword, but part of an actual mature product market with major vendors in the game.

And with that, it's hard for enterprises to ignore the vendor pitches, or to avoid making a conscious decision on whether "to DLP or not to DLP".

The goal of this article is not to go into the details of what DLP technology is - that is well documented elsewhere - but to help enterprises first determine the DLP problem space, and then make conscious business decisions when investing in the DLP solution space.

The first and foremost thing is to answer the question: What problem space are we talking about when we talk about data leakage? The data leakage problem can be defined as any unauthorized access of data due to an improper implementation or inadequacy of a technology, process or policy.

The “unauthorized access” described above can be the result of malicious, intentional or inadvertent data leakage, or of a bad business or technology process, and it can involve an internal or an external user.

Next, the second question to answer is: what part of the problem space defined above does the DLP product market solve? In the above definition of data leakage, DLP solutions are designed to prevent unauthorized access of data due to the inadequacy or improper implementation of a process or a policy, but not of a technology. They are not designed to address data leakage issues resulting from external attacks.

Hence DLP systems primarily help enforce “acceptable use” policies and processes for an enterprise. They are not designed to solve the part of the data leakage problem space that is related to technology, a.k.a. the information security aspect. So it is not the information security side of data leakage that a DLP solution is trying to solve.

For example, DLP solutions will not prevent a technology-related information security vulnerability or attack such as SQL injection.

So far we have figured out what a data leakage problem space comprises, and also that the data leakage problem space is partially addressed by data leakage prevention solutions which relate primarily to "enforcing acceptable use policies and processes".

Hence DLP solutions help mitigate the following risks:

  • Insecure business processes. For example, use of FTP for transporting PHI data.
  • Accidental data disclosure by employees. For example, an employee sending an unencrypted email containing PHI data.
  • Intentional data leakage by employees. For example, disgruntled employees stealing data or an employee leaving the company with sensitive data.
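To make the first two risks above a little more concrete, here is a minimal sketch in Python of the kind of rule a DLP system enforces for data in motion. It is not how any particular DLP product works: the function name, the channel labels and the SSN-like pattern standing in for PHI are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for PHI: a US SSN-like identifier such as 123-45-6789.
# Real products rely on far richer detection (dictionaries, fingerprints, checksums).
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Channels treated as unencrypted in this illustration (an assumption).
UNENCRYPTED_CHANNELS = {"ftp", "smtp-plain"}


def violates_policy(channel: str, body: str) -> bool:
    """Return True if sensitive-looking data leaves over an unencrypted channel."""
    has_sensitive_data = SSN_LIKE.search(body) is not None
    return has_sensitive_data and channel in UNENCRYPTED_CHANNELS


if __name__ == "__main__":
    print(violates_policy("ftp", "Patient record, SSN 123-45-6789"))
    print(violates_policy("smtp-tls", "Patient record, SSN 123-45-6789"))
```

Note that the rule only encodes a policy the enterprise has already defined (which data is sensitive, which channels are acceptable). The tool enforces the policy; it does not invent it.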

Now please understand even this problem space is not solved comprehensively by DLP solutions. For example, an employee can still take a picture of sensitive data and leak it.

So we can mature our definition of DLP solutions further: they are systems that aid the enforcement of acceptable use policies and processes, with certain limitations. Please understand, the idea is not to say DLP solutions solve a tiny problem, but to understand what problem space they actually cover and address.

The third question that comes to mind is: where is our enterprise in this data leakage problem space? Surprisingly, one will notice that data leakage prevention is already a part of one's enterprise security strategy from the technology perspective, in the form of deployed firewalls, encryption solutions, IDS, LDAP, etc.

From the policies and process perspective, the enterprise already has established processes for how to request access to a corporate system, and policies for what is appropriate to browse on a corporate PC or laptop. The missing part is the enforcement of these policies and processes, and that is where DLP solutions come in, for the various states of data: data at rest, data in motion, and data at an endpoint.

Next, getting to the real question: does my enterprise need to invest in a DLP solution? This is a million-dollar question that requires a comprehensive evaluation specific to the current state of the enterprise's security technology investments and, of course, the type of data the enterprise processes or stores.

For all practical purposes we will assume the enterprise processes or stores PHI or credit card data, and that there are multiple drivers (regulatory, brand protection, intellectual property protection) that demand addressing the data leakage problem space.

The first and foremost thing to understand is that data leakage prevention should be, and implicitly is, a part of an enterprise security strategy, and one should not just take a buy-a-tool approach to solving the problem.

This stems from the fact that the data leakage problem space is much bigger and comprises both information security issues and issues related to enforcement of acceptable use policies for detecting insecure processes.

Once defined and prioritized in the strategy, the enterprise builds new or streamlines current policies and processes before investing in a technology solution.

A current assessment of the enterprise security infrastructure would also reveal that the enterprise has already invested in multi-point DLP or DLP-like solutions in the form of hard-disk encryption, portable storage device encryption, security event monitoring, email gateway monitoring, etc.

Any of these point solutions may already be implemented, or may be a feature of security infrastructure that the enterprise has already invested in and only needs to turn on.

Now, in my opinion, there are some important prerequisites to meet before even considering an investment in a DLP solution, as listed below:

1) Enterprise Data Classification: If your data is not classified, you will not be able to use a vendor DLP solution in an optimized way, and it will be more like buying a white elephant that you have to maintain for minimal benefit. Basically, if you cannot answer the question "where is my sensitive data?", you first need to work on a data classification effort for your enterprise. You cannot just buy a DLP solution and point it everywhere; that would be an administrative nightmare. DLP vendors often say the tool will do your data classification. That is partially true for data in motion only, and not otherwise. For example, a DLP tool will be clueless about unstructured information, such as an application logical architecture document, unless you define what specific kinds of it are sensitive and where they are stored.

2) Streamline or Implement Processes and Policies in support of data leakage prevention: If your corporate policies or user provisioning processes are not properly defined, a DLP solution is going to be of no use, because at the end of the day the monitoring and detection rules for the tool are going to be configured by the enterprise. If there are no policies or clearly defined processes related to data leakage prevention, the tool will hardly be of any use. It would not be possible to integrate it with, let's say, an enterprise identity management solution.

3) Perform a gap assessment on current security infrastructure that already implicitly supports DLP or can be leveraged to support DLP: This will help the enterprise understand whether it needs to buy a point solution or the entire DLP suite. The outcome of the gap assessment should be cost savings.
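The gap assessment in the third prerequisite can be thought of as a simple set comparison: what controls do we want for each state of data, and which of them are already deployed? The sketch below illustrates that idea in Python. The control names and the mapping to data states are illustrative assumptions, not a recommended baseline.

```python
# Controls the enterprise wants for each data state (illustrative assumptions).
REQUIRED_CONTROLS = {
    "data at rest": {"hard-disk encryption", "content discovery"},
    "data in motion": {"email gateway monitoring", "network content inspection"},
    "data at endpoint": {"portable storage encryption", "endpoint content control"},
}


def find_gaps(required, deployed):
    """Return, per data state, the required controls that are not yet deployed."""
    gaps = {}
    for state, controls in required.items():
        missing = controls - deployed.get(state, set())
        if missing:
            gaps[state] = missing
    return gaps


if __name__ == "__main__":
    # Hypothetical inventory of what is already in place.
    deployed = {
        "data at rest": {"hard-disk encryption"},
        "data in motion": {"email gateway monitoring", "network content inspection"},
    }
    for state, missing in find_gaps(REQUIRED_CONTROLS, deployed).items():
        print(state, "->", sorted(missing))
```

If the gaps cluster in one data state, a point solution may be enough; if they are spread across all three, that is an argument for evaluating a full suite.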

Data classification efforts can be very easy for a small enterprise and a beast for a large enterprise. Similarly, implementing a DLP solution is easy and effective for a small enterprise compared with a medium or large enterprise.

Larger enterprises should always use a phased approach and also account for the extra manpower required to continuously configure, monitor and tune the DLP solution. This will reduce false positives and false negatives, which are usually the biggest problem enterprises report after implementing a DLP solution. It's not the tool that is the problem here; it's the preparation and implementation shortcomings that result in such outcomes.
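Tuning is easier to manage when false positives and false negatives are actually measured for each phase. As a rough sketch, assuming the team has a manually reviewed sample of events where each one is marked as flagged or not by the tool and as truly sensitive or not by a reviewer, the two rates can be computed like this. The function name and the input shape are hypothetical.

```python
def rule_error_rates(results):
    """Compute error rates from (flagged, actually_sensitive) pairs of booleans."""
    tp = fp = fn = tn = 0
    for flagged, sensitive in results:
        if flagged and sensitive:
            tp += 1          # correctly flagged
        elif flagged and not sensitive:
            fp += 1          # false positive: harmless event flagged
        elif not flagged and sensitive:
            fn += 1          # false negative: sensitive event missed
        else:
            tn += 1          # correctly ignored
    # Guard against empty categories to avoid division by zero.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return {"false_positive_rate": fpr, "false_negative_rate": fnr}


if __name__ == "__main__":
    reviewed_sample = [(True, True), (True, False), (False, True), (False, False)]
    print(rule_error_rates(reviewed_sample))
```

Tracking these two numbers per rule, before and after each tuning round, shows whether the extra manpower is paying off.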

Also, it's easy to get blown away by some of the features of DLP solutions, like endpoint protection at the desktop (for example, controlling copy-paste functions for certain kinds of data), pattern matching, etc.

But the practicality of these features in an actual business environment must be evaluated. Some of the features could result in serious business interruptions in the case of no data classification or a rules misconfiguration. 
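Pattern matching is a good example of a feature whose practicality depends on how the rule is written. A rule that flags every 16-digit number as a credit card will bury the team in false positives (order numbers, tracking IDs and so on). The sketch below, which is my own illustration and not any vendor's implementation, adds the well-known Luhn checksum as a second check so that only plausible card numbers are flagged. The regular expression and function names are assumptions.

```python
import re

# Naive candidate pattern: 16 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){15}\d\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return candidate matches in `text` that also pass the Luhn check."""
    return [m.group() for m in CARD_CANDIDATE.finditer(text) if luhn_valid(m.group())]


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test card number.
    print(find_card_numbers("Card 4111 1111 1111 1111 on file"))
    print(find_card_numbers("Order 1234567890123456 shipped"))
```

Even with the checksum, a rule like this still needs data classification behind it: it tells you a number looks like a card number, not whether that file or message is supposed to contain one.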

So in a nutshell, DLP solutions address only a subset of data leakage issues and only help enforce “acceptable use” policies and processes, with a number of limitations. They do not prevent information security related data leakage issues, like external malicious attackers stealing data by exploiting OWASP Top 10 kinds of vulnerabilities.

Also, in order for them to work effectively with encrypted data, any impacts - however minimal - of integrating the tool with encryption technologies should be accounted for when making a decision.

The task of vendor selection should only start once the listed prerequisites have been met, in order to make a smart investment and leverage the amazing capabilities of DLP solutions.

Uzi Yair: Good article, but again I have to take issue with the assumption that Enterprise Data Classification is a requirement before implementing DLP. This is because not all DLP systems are alike. For example, if you look at Network DLP, the definition is a system that performs Data Classification in real time for any outbound transmission and enforces a pre-defined policy. In that case, the system will identify PII without having to pre-classify anything. Some folks keep repeating what they see from other analysts or vendors' marketing garbage. It is time to look at New Generation DLP systems.
Uzi Yair
GTB Technologies, Inc.
kapil assudani: Thanks Uzi. You are correct about network DLPs; that is why I wrote in the data classification section that classification is not required only for data in motion since, as you pointed out, network DLPs discover PII. What we have to understand is that this is the only case where data classification is not required, but the vendors claim this in general, and that can be misleading for an enterprise that does not recognize the need for data classification.
