Network security and asset management products have to be able to identify what operating systems are currently running in the organization. With this information, IT and security departments gain greater visibility and control over their networks.
Device and operating system information is trivial to collect if the device has a software agent installed. However, agents are not an option for some operating systems, especially those installed on embedded and Internet of Things devices. This is where passive OS fingerprinting comes in handy, since passive methods don't require installing software on devices and work for most operating systems. Passive OS fingerprinting involves matching identifying patterns in a host's network traffic and classifying the host accordingly. Data from several network layers can be used for OS fingerprinting: MAC addresses, TCP/IP parameters, HTTP User-Agent strings, and DHCP requests.
The Open Systems Interconnection (OSI) model groups network protocols into layers, including the application, transport, network, and data link layers. As a rule of thumb, protocols at the lower levels of the stack, such as MAC and IP, provide better reliability but lower granularity than those at the upper levels, and vice versa. Let's start at the bottom of the stack and move up.
Start at the MAC Address
The media access control (MAC) address is a physical identifier assigned to a device's network interface during manufacturing. This 12-digit hexadecimal number is commonly written as six pairs of digits separated by hyphens or colons. The leftmost six digits are the manufacturer's unique identifier, and the rightmost six are the serial number of the device's network interface card. For example, the leftmost six digits of a MacBook laptop's MAC address could be 88:66:5a, a prefix assigned to Apple.
By looking at the manufacturer's unique identifier, network administrators can infer the type of device running in the network and, in some cases, even the operating system.
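The lookup described above can be sketched in a few lines of Python. This is only an illustration: the table name OUI_VENDORS and both function names are hypothetical, and the table holds just the one Apple prefix mentioned in the text, where a real tool would load a full vendor registry.

```python
import re
from typing import Optional

# Hypothetical, deliberately tiny prefix table. Only the Apple entry comes
# from the article; a real tool would load a complete vendor registry.
OUI_VENDORS = {
    "88665A": "Apple",
}


def normalize_mac(mac: str) -> str:
    """Remove common separators and return the 12 hex digits in upper case."""
    digits = re.sub(r"[:\-.]", "", mac.strip())
    if not re.fullmatch(r"[0-9A-Fa-f]{12}", digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits.upper()


def lookup_vendor(mac: str) -> Optional[str]:
    """Return the vendor for the leftmost six hex digits, or None if unknown."""
    prefix = normalize_mac(mac)[:6]
    return OUI_VENDORS.get(prefix)
```

Calling lookup_vendor("88:66:5a:12:34:56") returns "Apple", while an address whose prefix is missing from the table returns None. The vendor alone usually narrows down the device type rather than the exact OS.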
TCP/IP Tells Tales
The TCP/IP stack is a more granular source of information. Most operating systems set distinctive default values in the IP and TCP headers, making it possible to identify the OS by looking at these values. Some of the most common parameters used in OS fingerprinting are the initial time to live (TTL), the TCP window size, the "Don't Fragment" flag, and the TCP options (values and order). For example, if a device sends a packet with the "Don't Fragment" flag set and a TTL of 64 in the IP header, plus a window size of 65535 and a specific sequence of TCP options (02, 01, 03, 01, 01, 08, 04, 00) in the TCP header, that's enough to identify it as running macOS.
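A minimal sketch of this kind of signature matching follows. The class and function names are hypothetical, the signature table contains only the macOS example from the text, and the list of candidate initial TTL values (32, 64, 128, 255) is an assumption based on commonly seen defaults. Because each router hop decrements the TTL, the observed value is rounded up to the nearest candidate before matching.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class TcpIpSignature:
    initial_ttl: int
    window_size: int
    dont_fragment: bool
    tcp_options: Tuple[int, ...]  # option kinds, in the order they appear


# Only the macOS example from the article; a real table would be far larger.
# Option kinds: 2=MSS, 1=NOP, 3=window scale, 8=timestamps,
# 4=SACK permitted, 0=end of option list.
SIGNATURES = {
    TcpIpSignature(64, 65535, True, (2, 1, 3, 1, 1, 8, 4, 0)): "macOS",
}

# Assumed set of common initial TTL values (not taken from the article).
COMMON_INITIAL_TTLS = (32, 64, 128, 255)


def guess_initial_ttl(observed_ttl: int) -> int:
    """Round an observed TTL up to the nearest assumed initial value."""
    for initial in COMMON_INITIAL_TTLS:
        if 0 < observed_ttl <= initial:
            return initial
    raise ValueError(f"TTL out of range: {observed_ttl}")


def classify(observed_ttl: int, window_size: int, dont_fragment: bool,
             tcp_options: Sequence[int]) -> Optional[str]:
    """Return the OS whose signature matches exactly, or None."""
    signature = TcpIpSignature(
        guess_initial_ttl(observed_ttl),
        window_size,
        dont_fragment,
        tuple(tcp_options),
    )
    return SIGNATURES.get(signature)
```

With this table, classify(57, 65535, True, [2, 1, 3, 1, 1, 8, 4, 0]) still returns "macOS" even though the packet crossed seven hops. Exact matching is the simplest design; a production matcher would likely tolerate partial matches.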
HTTP Has a Wealth of Info
Several protocols in the application layer are also useful for identifying the device's operating system type and even the exact version or distribution. However, some of the fields involved can be configured by the user and are therefore less reliable.
The most common application protocol used in OS fingerprinting is HTTP, via the header's User-Agent field. The field, added by the application (such as a Web browser), includes information such as the application, operating system, and underlying device of the client. For example, the HTTP request to the server could contain a User-Agent field that identifies the client as a Firefox browser running on a Windows 7 OS: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0.
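As a rough illustration, the platform token can be pulled out of the first parenthesized group of the User-Agent string. The function name and the NT_VERSIONS table are hypothetical. Only the "Windows NT 6.1" to Windows 7 mapping comes from the example above; the other entries are added here from Microsoft's published version numbering, and the sketch handles Windows tokens only.

```python
import re
from typing import Optional

# "6.1" -> Windows 7 is from the article; the other rows are additions
# based on Microsoft's published NT version numbers.
NT_VERSIONS = {
    "5.1": "Windows XP",
    "6.0": "Windows Vista",
    "6.1": "Windows 7",
    "6.2": "Windows 8",
    "6.3": "Windows 8.1",
    "10.0": "Windows 10 or later",
}


def os_from_user_agent(user_agent: str) -> Optional[str]:
    """Return a Windows release name from a User-Agent string, or None."""
    comment = re.search(r"\(([^)]*)\)", user_agent)
    if comment is None:
        return None
    # The platform details are separated by semicolons inside the parentheses.
    for token in (part.strip() for part in comment.group(1).split(";")):
        match = re.fullmatch(r"Windows NT (\d+\.\d+)", token)
        if match:
            version = match.group(1)
            return NT_VERSIONS.get(version, f"Windows NT {version}")
    return None
```

Applied to the Firefox string above, the function returns "Windows 7". Because the client controls this header, the result is best treated as a hint rather than proof.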
The User-Agent field is not the only indicator that HTTP traffic contains. Most operating systems have a built-in connectivity test that automatically runs when the device connects to a public network. For example, Network Connectivity Status Indicator (NCSI), an Internet connection awareness protocol in Microsoft's Windows operating systems, consists of a sequence of specifically crafted DNS and HTTP requests and responses that indicate whether the host is located behind a captive portal or a proxy server.
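One way to use these connectivity tests is to match the host name of an observed HTTP request against known probe hosts. The sketch below is an assumption-heavy illustration: the function name and table are hypothetical, and the host names are commonly reported probe endpoints that do not appear in this article, so they should be verified against vendor documentation before being relied on.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumed probe hosts (not from the article); verify before relying on them.
PROBE_HOSTS = {
    "www.msftconnecttest.com": "Windows",
    "www.msftncsi.com": "Windows",
    "captive.apple.com": "Apple (macOS or iOS)",
    "connectivitycheck.gstatic.com": "Android",
}


def os_from_probe_url(url: str) -> Optional[str]:
    """Map the host of an observed connectivity-check URL to an OS family."""
    host = urlsplit(url).hostname  # lower-cased by urlsplit, None if absent
    if host is None:
        return None
    return PROBE_HOSTS.get(host)
```

For example, a request to http://www.msftconnecttest.com/connecttest.txt would map to "Windows" under these assumptions, while ordinary browsing traffic returns None.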
DHCP Identifies Hosts
DHCP, used to assign IP addresses over the network, provides several indicators that can be used to identify the host. A DHCP exchange consists of four steps: discovery, offer, request, and acknowledge (DORA). For example, a Windows host that broadcasts DHCP messages over the local network and receives replies from the DHCP server will send the distinctive vendor class identifier "MSFT 5.0", and its parameter request list will contain a sequence of values common to Windows hosts.
Combined with the order of the DHCP options themselves, these indicators are sufficient to identify the host as running Windows.
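The three DHCP indicators can be extracted with a short parser, sketched here under a few assumptions: the input is the raw options field of a client message (the bytes after the DHCP magic cookie), the function names are hypothetical, and the Windows check looks only at the "MSFT" prefix of the vendor class identifier. Option codes 55 (parameter request list) and 60 (vendor class identifier) follow the standard DHCP option numbering.

```python
from typing import Dict, List, Tuple

PAD, END = 0, 255
OPT_PARAM_REQUEST_LIST = 55
OPT_VENDOR_CLASS_ID = 60


def parse_dhcp_options(raw: bytes) -> List[Tuple[int, bytes]]:
    """Return (code, value) pairs in the order they appear on the wire."""
    options = []
    i = 0
    while i < len(raw):
        code = raw[i]
        if code == END:
            break
        if code == PAD:  # single-byte padding, no length field
            i += 1
            continue
        if i + 1 >= len(raw):
            raise ValueError("truncated option header")
        length = raw[i + 1]
        value = raw[i + 2:i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated option value")
        options.append((code, value))
        i += 2 + length
    return options


def dhcp_fingerprint(raw: bytes) -> Dict[str, object]:
    """Collect the option order, vendor class, and parameter request list."""
    options = parse_dhcp_options(raw)
    by_code = dict(options)
    vendor = by_code.get(OPT_VENDOR_CLASS_ID, b"")
    return {
        "option_order": [code for code, _ in options],
        "vendor_class": vendor.decode("ascii", errors="replace"),
        "param_request_list": list(by_code.get(OPT_PARAM_REQUEST_LIST, b"")),
    }


def looks_like_windows(fingerprint: Dict[str, object]) -> bool:
    """Simplified check: vendor class identifier starts with 'MSFT'."""
    return str(fingerprint["vendor_class"]).startswith("MSFT")
```

A fuller matcher would also compare option_order and param_request_list against known per-OS sequences, which are left out here because the article does not list them.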
While some protocols provide better accuracy than others, there is no "silver bullet" for the task of OS identification, and the different options provide different types of information. Rather than fingerprinting based on a single protocol, you might consider a multiprotocol approach, such as combining the HTTP User-Agent with lower-level TCP options.
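A multiprotocol approach can be as simple as a weighted vote across the individual guesses. The sketch below is one possible design, not an established method: the source names and weights are hypothetical and chosen only to show the mechanics.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical weights; real values would need tuning against labeled traffic.
SOURCE_WEIGHTS: Dict[str, float] = {
    "mac_oui": 1.0,
    "tcp_ip": 2.0,
    "dhcp": 2.0,
    "http_user_agent": 1.5,
}


def combine(observations: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Pick the OS with the highest total weight across (source, os) guesses."""
    scores: Dict[str, float] = defaultdict(float)
    for source, os_name in observations:
        weight = SOURCE_WEIGHTS.get(source)
        if weight is None:
            continue  # ignore guesses from sources we have no weight for
        scores[os_name] += weight
    if not scores:
        return None
    return max(scores, key=lambda name: scores[name])
```

Here a spoofed User-Agent claiming Linux would be outvoted by TCP/IP and DHCP guesses that both say Windows, which is the main benefit of combining layers.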