People once thought that if it were possible to know every detail about the physical universe, the future would be completely predictable. The idea that our limited knowledge prevents us from fully understanding the world, and thus being able to fix it, has been around as long as civilization. As summed up in a cliché both empty and profound: knowledge is power. Not intelligence, not any of the creative aspects of the mind, but raw data itself.
1 Breach, 198 Million People
In 2017, this belief most clearly manifests in predictive analytics and other fields of big data analysis. On June 19, the Cyber Risk Team at UpGuard uncovered an unsecured Amazon S3 storage bucket with the voter data of 198 million Americans. That's more than half the people in the country. The privacy implications of the data are self-evident, but more important is what the data reveals about how analytic techniques are used to define people as targets for any type of persuasive intrusion.
The other important thing to note about the voter data leak is the size of the data set: 198 million individuals represented across more than a terabyte of highly researched data. The size of the RNC data set makes it incredibly valuable. Data sets of that size and depth are why analytics companies exist and how they make money. Organizations from candy manufacturers to political parties want to use data analytics to reach more people with more accurate messaging. This is because these organizations are invested parties that make money from selling candy, or that gain power from winning elections. Their ability to persuade people to their side determines their success. The type of data discovered in the RNC leak is designed specifically to enable that persuasion.
If this model seems familiar, that's because social media and tech companies use these techniques in advertising. They provide a platform that people want to use, and then take the information provided by their customers to advertise back at them in a voice they're more likely to heed. In the case of the RNC, the data was used to determine what to say to different people in order to get the most votes.
Email phishing is one of the most successful cyber attacks being employed now. The most sophisticated of these are spearphishing attacks, such as the one against Hillary Clinton campaign chairman John Podesta that allowed hackers to access his email account. Spearphishing relies on information gathering to determine how to trick the target into clicking on a malicious link or attachment. Now imagine that spearphishers could use the advanced data of a leaked political strategy to craft their emails. It would make an already dangerous threat much more effective.
The Information Economy
The lesson here is that your information matters. Companies trade on your information daily, shipping huge data sets to third parties through various types of infrastructure, some more secure than others. The information economy is booming — every vector of capturing data is being utilized or soon will be. The Internet of Things promises to make life easier through interconnection, but it also adds devices that capture metrics on your daily life, reporting them to the manufacturer. We've seen this in everything from devices like Vizio TVs to apps like Facebook, where the line between lawful data analysis and privacy invasion is blurry.
Ultimately, the information economy has two faces. The customer-facing side is about ease of use, shareability, and all the other go-to descriptions for apps and gadgets that tie back to the Internet. These aspects allow the customer to receive a personalized experience. The other face is the business side, where devices, websites, and apps collect metrics, analyze customer habits, and predict behavior.
Who Controls the Past Controls the Future
Analytic techniques wouldn't be any good if they weren't predictive. Knowledge is power because knowing about the past enables better decisions in the present to achieve desired outcomes in the future. At the RNC, insights were not only drawn from the data set but also written back into it: several fields, including race and religion, were modeled, meaning they were predicted as probable rather than recorded as fact. Further analysis can incorporate this modeled data and churn out even more predictive data.
Future innovations will provide more functionality for people, but they will also bring attendant risks and place more personal information in the hands of private companies. Data sets will grow larger and analytic capacity will increase, trying to reach the goal of perfectly knowing every individual through a highly scrutinized matrix of information in order to customize offers down to every man, woman, and child.
This is why information matters and why data breaches matter. The companies that handle information know how valuable it is — that's how they make money. The companies that outsource analytics know how valuable it is — that's why they pay millions for it. But when it comes to the day-to-day IT operations that gather, move, store, manipulate, and copy that data, it's often treated as if it had no value at all. And when valuable data falls into the wrong hands, it becomes a big problem for the organization and its clients.
It would be nice if we could flip a switch that would solve this problem, but it's a situation created from the sum total of data, processes, and assets within each digital organization and the vendors it employs. In a complex ecosystem of constant change, the daily work of IT operations is grueling and often thankless, especially at the largest scale.
Protecting a large organization and the people whose information it holds comes down to standardizing and improving this day-to-day work and continuously testing it to make sure it's right. It requires all business leaders to treat IT as they would any other critical piece of the company: by integrating it strategically and seriously accounting for its risks. If data-driven analysis is going to guide business for the foreseeable future, those making money from using it must do what is necessary to prevent that data from being exposed.
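As one small illustration of what continuous testing could look like for the kind of exposure described above, the sketch below scans a storage bucket policy for statements that grant access to everyone. It is a minimal sketch, not UpGuard's method or a complete audit. It assumes the policy is available as a JSON document in the standard AWS policy format, it ignores any Condition block, and it does not cover bucket ACLs or account-level settings. The function name and the example policy, including the bucket name and account ID, are hypothetical.

```python
import json


def public_statements(policy_json):
    """Return the Allow statements in a policy whose principal is everyone ("*").

    Assumption: policy_json is a bucket policy in the standard AWS JSON format.
    Condition blocks are ignored, so a flagged statement may still be
    restricted in practice and should be reviewed by a person.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    # The policy format allows a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]

    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if isinstance(principal, dict):
            # Principal may be written as {"AWS": "*"} or {"AWS": ["*", ...]}.
            values = principal.get("AWS", [])
            if isinstance(values, str):
                values = [values]
            is_public = "*" in values
        else:
            is_public = principal == "*"
        if is_public:
            flagged.append(stmt)
    return flagged


# Hypothetical policy: one world-readable statement, one limited to an account.
EXAMPLE_POLICY = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    },
    {
      "Sid": "TeamAccess",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
"""

if __name__ == "__main__":
    for stmt in public_statements(EXAMPLE_POLICY):
        print("Publicly accessible statement:", stmt.get("Sid", "(no Sid)"))
```

A check like this is only useful if it runs routinely against every bucket an organization and its vendors control, with flagged statements reviewed by someone who knows whether the exposure is intended.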