Dark Reading is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Perimeter

11/6/2008
06:15 PM
Robert Graham
Robert Graham
Commentary
50%
50%

Bending Skein Code

Few of the submissions to NIST's hash standard contest have been optimized for desktop/server processors. One, though, known as Skein, seems to have considered this. It is designed specifically to run well on Intel Core 2 processors -- without sacrificing speed on other processors or security.

Few of the submissions to NIST's hash standard contest have been optimized for desktop/server processors. One, though, known as Skein, seems to have considered this. It is designed specifically to run well on Intel Core 2 processors -- without sacrificing speed on other processors or security.I write fast code. People ask me how I do it. I tell them:

"Do not try to bend the spoon; that's impossible. Instead, only try to realize the truth. There is no spoon. Then you'll see that it is not the spoon that bends. It is only yourself."

Computers have the ability to run many times faster than they do in practice. That is the because the problems they are trying to solve don't completely fit with the hardware -- they aren't "bent" to conform to hardware quirks.

A good example is crypto algorithms. Common crypto algorithms consist of logical operations that transmogrify bits within data. A modern computer processor, like the Intel Core 2, can execute three logic instructions per clock cycle (meaning a 3-GHz processor can do 9 billion logic operations per second).

Unfortunately, the logic a crypto algorithm wants to do doesn't quite fit on the processor. The older hash algorithm, SHA-1, is a good example. It only used about a quarter of the potential of the Core 2 processor. It can only do about 1.5 logic operations per clock, and each operation operates on 32 bits rather than the full 64-bit potential of the processor.

First standardized around 1993, SHA-1 is now considered weak. NIST, the U.S. National Institute of Standards and Technology, is holding a contest to create a successor. This is similar to the contest it held for the AES symmetric-encryption standard, but this one will come up with a new hash standard tentatively called "SHA-3."

Roughly 10 people/groups have submitted algorithms. These submissions make some accomodation to the Core 2 processor. They operate in "little-endian" mode (a quirk of the Intel-like processors that reads some bytes in reverse order). They also allow a large file to be broken into chunks to split the work across multiple processors.

However, virtually all of the contest submissions share the performance problem mentioned above. The logic they use won't optimally fit within the constraints of a Intel Core 2 processor. Most will perform as bad or worse than the existing SHA-1 algorithm.

One exception to this is Skein, created by several well-known cryptographers and noted pundit Bruce Schneier. It was designed specifically to exploit all three of the Core 2 execution units and to run at a full 64-bits. This gives it roughly four to 10 times the logic density of competing submissions.

This is what I meant by the Matrix quote above. They didn't bend the spoon; they bent the crypto algorithm. They moved the logic operations around in a way that wouldn't weaken the crypto, but would strengthen its speed on the Intel Core 2.

In their paper (PDF), the authors of Skein express surprise that a custom silicon ASIC implementation is not any faster than the software implementation. They shouldn't be surprised. Every time you can redefine a problem to run optimally in software, you will reach the same speeds you get with optimized ASIC hardware. The reason software has a reputation of being slow is because people don't redefine the original problem. This is the sort of thing I did with intrusion detection/prevention. I redefined how I did detection in order to optimize how fast it ran. (The trick, by the way, is to use state machines so that you don't have to reassemble at low layers of the stack, just reorder fragments.) This prejudice about software has long frusterated me. I saw customers test my solution, find it was faster, but STILL buy the ASIC product because "everyone knows hardware is faster."

By the way, logic density should be considered a security risk. Hacking tools like John the Ripper crack hashed passwords by guessing more than one password at a time. While you can only use a quarter of the CPU resources to hash a single password, you can hash four passwords simultaneously using all of the resources. This gives the hacker a four-to-one advantage. It's a small advantage in the grand scheme of things, to be sure, but an advantage worth considering nonetheless.

I look forward to playing with assembly language versions of the NIST submissions. I'm particularly interested in SSE instructions that use 128-bit registers. The Core 2 only executes two 128-bit instructions per clock vs. the three 64-bit instructions used in Skein, so SSE is unlikely to help Skein much. It might help other algorithms, though.

Robert Graham is CEO of Errata Security. Special to Dark Reading

Comment  | 
Print  | 
More Insights
Comments
Newest First  |  Oldest First  |  Threaded View
Why Cyber-Risk Is a C-Suite Issue
Marc Wilczek, Digital Strategist & CIO Advisor,  11/12/2019
DevSecOps: The Answer to the Cloud Security Skills Gap
Lamont Orange, Chief Information Security Officer at Netskope,  11/15/2019
Unreasonable Security Best Practices vs. Good Risk Management
Jack Freund, Director, Risk Science at RiskLens,  11/13/2019
Register for Dark Reading Newsletters
White Papers
Video
Cartoon Contest
Current Issue
Navigating the Deluge of Security Data
In this Tech Digest, Dark Reading shares the experiences of some top security practitioners as they navigate volumes of security data. We examine some examples of how enterprises can cull this data to find the clues they need.
Flash Poll
Rethinking Enterprise Data Defense
Rethinking Enterprise Data Defense
Frustrated with recurring intrusions and breaches, cybersecurity professionals are questioning some of the industrys conventional wisdom. Heres a look at what theyre thinking about.
Twitter Feed
Dark Reading - Bug Report
Bug Report
Enterprise Vulnerabilities
From DHS/US-CERT's National Vulnerability Database
CVE-2019-19040
PUBLISHED: 2019-11-17
KairosDB through 1.2.2 has XSS in view.html because of showErrorMessage in js/graph.js, as demonstrated by view.html?q= with a '"sampling":{"value":"<script>' substring.
CVE-2019-19041
PUBLISHED: 2019-11-17
An issue was discovered in Xorux Lpar2RRD 6.11 and Stor2RRD 2.61, as distributed in Xorux 2.41. They do not correctly verify the integrity of an upgrade package before processing it. As a result, official upgrade packages can be modified to inject an arbitrary Bash script that will be executed by th...
CVE-2019-19012
PUBLISHED: 2019-11-17
An integer overflow in the search_in_range function in regexec.c in Oniguruma 6.x before 6.9.4_rc2 leads to an out-of-bounds read, in which the offset of this read is under the control of an attacker. (This only affects the 32-bit compiled version). Remote attackers can cause a denial-of-service or ...
CVE-2019-19022
PUBLISHED: 2019-11-17
iTerm2 through 3.3.6 has potentially insufficient documentation about the presence of search history in com.googlecode.iterm2.plist, which might allow remote attackers to obtain sensitive information, as demonstrated by searching for the NoSyncSearchHistory string in .plist files within public Git r...
CVE-2019-19035
PUBLISHED: 2019-11-17
jhead 3.03 is affected by: heap-based buffer over-read. The impact is: Denial of service. The component is: ReadJpegSections and process_SOFn in jpgfile.c. The attack vector is: Open a specially crafted JPEG file.