Latency actually has several levels. The first level is the time it takes the media to position and be ready to respond to an I/O request. With a hard drive, this latency was the end of the discussion, as the milliseconds it took for a hard drive to get in position overshadowed any other latency in the storage communication chain.
Solid-state drives (SSDs), which are flash inside a hard drive container, all but eliminated that device latency. With no moving parts and no platters to rotate, an SSD needs almost no time to get in position. But that exposed other areas of latency in the storage protocol stack. For example, the time it takes for the I/O to work its way through the overhead of SCSI became noticeable.
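To make that shift concrete, here is a minimal sketch that treats an I/O request as a chain of stages and reports each stage's share of the total. The stage names and the microsecond figures are hypothetical placeholders chosen only to illustrate the argument; they are not measurements of any real device.

```python
def latency_shares(stages):
    """Return each stage's fraction of the total latency for one I/O path."""
    total = sum(stages.values())
    if total <= 0:
        raise ValueError("total latency must be positive")
    return {name: t / total for name, t in stages.items()}


# Hypothetical, illustrative values in microseconds; NOT measured data.
hdd_path = {"media": 5000.0, "protocol_stack": 50.0}
ssd_path = {"media": 100.0, "protocol_stack": 50.0}

for label, path in (("hard drive", hdd_path), ("SSD", ssd_path)):
    shares = latency_shares(path)
    print(label, {name: f"{share:.0%}" for name, share in shares.items()})
```

The protocol overhead is the same in both paths. Only its share changes: once the media stage shrinks, the stack stops being a rounding error and becomes a visible part of every request.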
This led to the introduction and rapid adoption of PCIe-based flash. Most of these cards eliminated the storage protocol stack altogether. The communication was direct to the application or operating system over the PCIe bus. Eliminating the storage protocol stack meant that special drivers had to be created for the various operating systems that a data center might have.
Vendors then introduced API sets that would allow users to write directly to the PCIe flash card from within the application for a further reduction in latency, not only avoiding the storage I/O stack but also avoiding the operating system itself.
What PCIe began to show, though, was that there was another level of latency: the PCIe bus itself. The PCIe bus routes through a motherboard-based fabric that allows multiple cards to share PCIe bandwidth. Some higher-end servers have multiple PCIe hubs in order to route data better. But even a high-end server, when burdened with lots of I/O, can become PCIe bottlenecked, and that adds latency. (Remember that PCIe supports more than storage I/O.)
The next tier in latency elimination is to eliminate the PCIe bus altogether. Several vendors are introducing memory-bus-based flash storage. These flash storage devices come in a memory DIMM form factor and can act as storage with a device driver, similar to a PCIe SSD. Even more interesting, with a tweak to the server's system BIOS, they can act as main memory for the server. Imagine 400 GB of "RAM" via flash on a single DIMM. Using the memory bus provides even more I/O channels and greater bandwidth; it was, after all, designed to support DRAM.
In both implementation modes, the use cases are very interesting. The ability to create very dense servers with terabytes (if not petabytes) of flash capacity in a 1U system changes the data center design game quite a bit.
There is no single perfect solution: SSD, PCIe flash and memory-bus flash each have their ideal use cases, and many data centers may have a mixture of all three. Designing applications for extremely high performance with almost zero latency is now a reality. But the use of these technologies is not limited to the performance fringe. For example, imagine using them to design a single physical server that supports more than 10,000 desktops. We have the processing power available to us, and the latency of storage is no longer the roadblock it always was.