Breaking the Code: The Role of Visualization in Security Research

In today’s interconnected, data-rich IT environments, passive inspection of information is not enough.

mean that the company is organized in three offices or countries. Second, "data dust" is present throughout the image. One interpretation is that some senders are trying to reach nonexistent or obsolete addresses that are not connected to anything else, as spam typically does. Finally, certain nodes are clearly connected in groups, which suggests some sort of hierarchy in the communication (for example, managers, help desks, or mailing lists).

Visualization becomes more complex as the size of the domain grows. In this era of big data, databases holding millions or even billions of entries from security devices are increasingly common. But complexity can be kept manageable through entity grouping, sampling, or parallelization.

With entity grouping, researchers create nodes that represent groups of entities rather than individual entities, such as team nodes instead of employee nodes. Perhaps more importantly, the level of detail, or drill-down depth, can vary according to the type of security data being used (firewall logs, traffic files, and so on). This gives access to the whole model without having to load all of its constituent information up front.
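
As a rough illustration of entity grouping, the sketch below collapses employee-to-employee email edges into team-to-team edges, so a graph would need one node per team rather than one per employee. The sample data, the `team_of` lookup, and the `group_edges` helper are all hypothetical and stand in for whatever a real data set provides.

```python
from collections import Counter

# Hypothetical sample data: (sender, recipient) pairs and a team lookup.
emails = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("dave", "alice"),
    ("carol", "dave"),
]
team_of = {
    "alice": "sales",
    "bob": "sales",
    "carol": "support",
    "dave": "support",
}


def group_edges(edges, group_of):
    """Collapse entity-level edges into weighted group-level edges."""
    weights = Counter()
    for src, dst in edges:
        # Each email adds one unit of weight to the edge between the two groups.
        weights[(group_of[src], group_of[dst])] += 1
    return dict(weights)


team_edges = group_edges(emails, team_of)
print(team_edges)
# {('sales', 'sales'): 1, ('sales', 'support'): 2,
#  ('support', 'sales'): 1, ('support', 'support'): 1}
```

Keeping the original `emails` list alongside the grouped edges is one way to support drill-down: the team-level view is drawn first, and the employee-level edges for a single team are loaded only when that node is expanded.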

Sampling is another way to limit the size of the data set without losing sight of the big picture: the designer works with a random or focused subset of the data. Using the previous example, the designer could drop a random half (or some other fraction) of the employee emails. The results would then need to be interpreted with the understanding that sampling was used.
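
Dropping a random half of the emails could look something like the following minimal sketch. It assumes the emails are available as a list of (sender, recipient) pairs. The `sample_edges` helper and the generated addresses are hypothetical, and uniform random sampling is only one option; a focused subset would use a filter instead.

```python
import random


def sample_edges(edges, fraction, seed=None):
    """Return a uniform random subset holding roughly `fraction` of the edges."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    edges = list(edges)
    # A dedicated generator keeps the sample reproducible for a given seed.
    rng = random.Random(seed)
    # Keep at least one edge when the input is non-empty.
    k = min(len(edges), max(1, round(len(edges) * fraction)))
    return rng.sample(edges, k)


# Hypothetical data: ten emails between made-up addresses.
emails = [(f"user{i}@example.com", f"user{i + 1}@example.com") for i in range(10)]
half = sample_edges(emails, 0.5, seed=42)
print(len(half), "of", len(emails), "emails kept")
```

Recording the fraction and seed alongside the resulting image makes it easier to interpret the picture later with the sampling in mind, and to reproduce it.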

