Data classification is a topic that can quickly polarize a crowd. One side believes there is absolutely no way to make the classification of data and the requisite protection work (probably the same group that doesn't believe in security awareness and training for employees). The other side believes in data classification because they have made it work within their own environments, usually because their businesses require it. The difficulty in choosing a side lies in the fact that both are correct.
In the average corporate network, data classification is extremely difficult, if not impossible. Data sprawls across unkempt network shares, desktops, and mobile devices, which makes it difficult for IT to identify and secure. When classification is left in the hands of users, most organizations make their schemes so complicated that users don't know how to label the information they're responsible for.
The opposite is true of organizations that are part of or related to the Department of Defense, and of medical and pharmaceutical companies that have very stringent data classification and handling procedures. There, data classification is part of the corporate culture. It is part of employees' indoctrination into the company and is required in their daily work lives. The classifications are also well defined, so there is little confusion as to whether something should be considered sensitive.
For classification efforts to work, there needs to be a small set of categories into which data can be classified. Any more than a handful and users are likely to become confused or frustrated and misclassify something. Those classifications need to be based on the value of the data and the risk associated with the data falling into the wrong hands, being destroyed, or losing its integrity. Simple guidelines need to be established so that employees can easily recognize how something should be handled when they encounter it or when they are creating new data.
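As a rough illustration of how small such a scheme can be, the Python sketch below assumes three labels and rates each piece of data on the three risks named above: disclosure, destruction, and loss of integrity. The label names, the three-point impact scale, and the rule of taking the worst rating are all assumptions made for this example, not a prescribed scheme.

```python
from enum import IntEnum


class Impact(IntEnum):
    """Hypothetical three-point scale for how bad an event would be."""
    LOW = 0
    MODERATE = 1
    HIGH = 2


class Label(IntEnum):
    """Hypothetical handful of categories; a real scheme may differ."""
    PUBLIC = 0
    INTERNAL = 1
    SENSITIVE = 2


def classify(disclosure: Impact, destruction: Impact, tampering: Impact) -> Label:
    """Pick a label from the worst of the three impact ratings.

    Assumption: the highest single rating decides the label, so one
    serious risk is enough to make the data sensitive.
    """
    worst = max(disclosure, destruction, tampering)
    return Label(int(worst))


# Example: a customer list that would be damaging to leak but easy to restore.
customer_list = classify(Impact.HIGH, Impact.LOW, Impact.MODERATE)
```

With only three labels and one rule, an employee can work out the answer without a lookup table, which is the point of keeping the category count low.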
Don’t classify everything
Classification programs fail when management and the implementers get stuck in a "classify everything" mindset. Attempts to seek out all data and classify it from the start can quickly become time-consuming and futile, depending on the level of data sprawl. It's easier to start with the core business processes and workflows to see where classification can occur. Sometimes it needs to happen at a macro level, where entire systems are designated as sensitive, instead of at the level of individual files and databases. This may mean that tighter, more granular controls need to be implemented on file shares or entire servers to provide an adequate level of protection.
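One way to picture macro-level classification is a lookup in which a file inherits its label from the share or system it lives on, rather than carrying a label of its own. The sketch below is only an illustration: the share paths, the label names, and the default label are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical mapping of whole shares to labels; no per-file labels exist.
SYSTEM_LABELS = {
    PurePosixPath("/shares/finance"): "sensitive",
    PurePosixPath("/shares/public"): "public",
}

# Assumed fallback for anything outside a designated share.
DEFAULT_LABEL = "internal"


def label_for(path: str) -> str:
    """Return the label of the closest designated parent share."""
    p = PurePosixPath(path)
    # Check the path itself, then each parent, from most to least specific.
    for candidate in (p, *p.parents):
        if candidate in SYSTEM_LABELS:
            return SYSTEM_LABELS[candidate]
    return DEFAULT_LABEL


label = label_for("/shares/finance/2024/q3.xlsx")
```

Nobody has to inspect or tag the spreadsheet itself; designating the finance share as sensitive covers everything beneath it.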
With things like email, however, it's easier to have the user classify the message when he or she creates it. Depending on the solution, the user can check a box or include a specific keyword in the subject or body of the message to trigger automatic encryption or prevent the content from being forwarded outside of the company. Automated classification systems can also label emails as sensitive based on their content, but they are more prone to error if the keywords are not well maintained.
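The two approaches can be sketched side by side. In the Python below, the subject-line tags, the action names, and the content patterns are all hypothetical; real mail gateways have their own rule syntax, and this only shows the shape of the logic.

```python
import re

# Hypothetical tags a user might type in the subject line (manual classification).
SUBJECT_TAGS = {
    "[secure]": "encrypt",
    "[internal]": "block_external_forward",
}

# Hypothetical content patterns for automated classification. A list like
# this has to be maintained, or it will miss data and raise false alarms.
AUTO_PATTERNS = [
    re.compile(r"\bproject bluebird\b", re.IGNORECASE),  # made-up codename
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                # SSN-like number
]


def actions_for(subject: str, body: str) -> set:
    """Return the set of handling actions a message should trigger."""
    actions = set()
    lowered = subject.lower()
    for tag, action in SUBJECT_TAGS.items():
        if tag in lowered:
            actions.add(action)
    if any(pattern.search(body) for pattern in AUTO_PATTERNS):
        actions.add("flag_sensitive")
    return actions


manual = actions_for("[SECURE] Q3 numbers", "See attached.")
automatic = actions_for("Fwd: form", "Her SSN is 123-45-6789.")
```

The manual path depends on the sender remembering the tag, while the automated path depends on someone keeping `AUTO_PATTERNS` current, which is where the errors mentioned above tend to creep in.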
Similarly, solutions exist that integrate with users' workflows as they create and modify Microsoft Office documents. The documents can be labeled based on the defined classifications. Those labels are then used by controls on the file and email servers to ensure that only authorized users can access the documents.
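A label-driven access check of this kind can be pictured as follows. This is a minimal sketch, not any product's API: the labels, the group names, and the choice to deny access to unlabeled or unknown documents are assumptions for the example.

```python
from typing import Dict, Optional, Set

# Hypothetical mapping from a document's label to the groups allowed to open it.
# None means the label carries no restriction.
ALLOWED_GROUPS: Dict[str, Optional[Set[str]]] = {
    "public": None,
    "internal": {"employees"},
    "sensitive": {"finance", "legal"},
}


def can_access(user_groups: Set[str], label: str) -> bool:
    """Decide access from the document's label and the user's groups."""
    if label not in ALLOWED_GROUPS:
        # Assumption: an unknown or missing label is denied by default.
        return False
    allowed = ALLOWED_GROUPS[label]
    if allowed is None:
        return True
    # Access is granted if the user belongs to at least one allowed group.
    return bool(user_groups & allowed)


finance_ok = can_access({"employees", "finance"}, "sensitive")
```

Because the decision reads only the label, the server never needs to inspect document content; it trusts that the label applied at creation time is correct, which is why the labeling step has to be easy for users to get right.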
User training is critical
Even with automated and manual solutions available for data classification, how is it that some organizations have successfully implemented a classification program when so many others have failed miserably? It’s because they focus on user training and awareness from the very beginning. Employees are involved early on in determining classification schemes and guidelines that make sense to them. Focus groups are put together from different areas of the enterprise to see how well users interpret the proposed classifications and to ensure that there is no confusion about how to classify the documents and emails they create.
Once the classifications have been developed, technical solutions need to be tested to find the best fit. Of the organizations I've talked to, most have found that a mix of automated and manual techniques works best, but it depends on what technologies are currently in place (e.g., Exchange and Outlook), how employees generate and work with information that needs to be classified (e.g., in Microsoft Office and SharePoint), and how well the products integrate with those workflows. Groups of users need to be selected to try the products that make the shortlist, to determine ease of use and clarity on how to label things within the classification scheme, and to ensure that the product does not hinder productivity.
If your organization is looking into developing a data classification program, it's not a decision to take lightly, as it involves much, much more than simply buying a product and dropping it in place. Users need to be involved from the beginning to ensure that classification schemes and guidelines are straightforward and easy to understand. Automated tools need to be tested to make sure they can identify and locate the types of data that are important to your organization. And manual classification solutions need to be put into the users' hands early, to make sure they are usable and do not hinder productivity.