The first area to examine is how much data is being accessed on a moment-by-moment basis. As you may have noticed from the discussion in our last entry, there is now an onramp or cloud gateway for almost every data type, ranging from backups to primary block storage. The moment-by-moment change rate, combined with the data type, determines how large the local gateway cache needs to be and how often data must be recalled from the cloud. The total size of the data set is mostly irrelevant, other than the per-GB cost to store it, and that cost should be relatively static. What delays an application is the movement of data from the cloud back to your local cache. The more often data can be served from the local cache, whether through smart caching algorithms or simply a large cache, the better. Also, several cloud storage providers charge extra for transferring data out of the cloud back to local storage, which can lead to a surprise on your bill. Since most onramps and gateways give you a choice of provider, it makes sense to know what the hidden extras are for each provider.
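To make the trade-off concrete, here is a minimal sketch of a monthly cost model driven by change rate and cache hit ratio. All of the function names, rates, and prices below are illustrative assumptions for this article, not any provider's actual pricing or API.

```python
def estimate_gateway_costs(change_rate_gb_per_day, cache_hit_ratio,
                           egress_cost_per_gb, stored_gb,
                           storage_cost_per_gb_month):
    """Rough monthly cost model for a cloud storage gateway.

    All inputs are assumed, illustrative figures. The point of the
    model: at-rest capacity cost is relatively static, while egress
    cost scales with how often the local cache misses.
    """
    # Data that misses the local cache must be recalled from the
    # cloud, and many providers bill that recall per GB.
    recalled_gb = change_rate_gb_per_day * 30 * (1 - cache_hit_ratio)
    egress_cost = recalled_gb * egress_cost_per_gb
    # Capacity cost depends only on total data stored.
    storage_cost = stored_gb * storage_cost_per_gb_month
    return {"recalled_gb": recalled_gb,
            "egress_cost": egress_cost,
            "storage_cost": storage_cost,
            "total": egress_cost + storage_cost}

# Example with assumed numbers: 50 GB/day of active data, 90% of
# reads served from cache, $0.09/GB egress, 10 TB stored at
# $0.023/GB-month.
costs = estimate_gateway_costs(50, 0.90, 0.09, 10_000, 0.023)
```

Running the same model with a 99% hit ratio instead of 90% cuts the recalled volume, and thus the egress charge, by a factor of ten, which is why cache sizing matters more than total data set size.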
The impact of restoring data from the cloud, and its potential extra cost, is one of the reasons that backup and archive data have been such popular use cases. The transfer is almost always one way: upload. Also, most large recoveries can be satisfied from the local cache and don't need the data stored in the cloud; the backup copy in the cloud mostly serves as a long-term retention area. As you move to using cloud storage for primary data, the transfer issues become a bit more thorny. The easiest use case to deal with is the file share. Most files on a file server are active for only a few days and then become dormant. This is an ideal use case for cloud storage: let the older files migrate to the cloud. Even if they do need to be recalled from cloud storage later, typically only a single user is impacted by the delay in access, and a single file access is relatively fast.
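The "dormant after a few days" policy described above can be sketched as a simple age-based scan. This is only an illustration of the selection logic; the 30-day threshold and the function name are assumptions, and real gateways apply their own migration policies internally.

```python
import os
import time

def files_to_tier(root, dormant_days=30):
    """Return paths of files not modified in `dormant_days` days.

    These are the candidates an age-based policy would migrate to
    cloud storage, leaving recently active files on local disk.
    The threshold is an assumed example value.
    """
    cutoff = time.time() - dormant_days * 86400
    dormant = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Compare last-modified time against the cutoff.
            if os.path.getmtime(path) < cutoff:
                dormant.append(path)
    return dormant
```

Because only dormant files move, a later recall typically fetches one file for one user, which keeps the delay and the egress charge small.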
Databases are a bit trickier. Here, look for applications where only a small portion of the data set is accessed on a regular basis. Microsoft SharePoint is a good example of a "ready for cloud now" data set, as are some mail systems that store attachments and messages as discrete files. In the near future, don't rule out busy transaction-oriented databases. As the developers of these platforms embrace the availability of cloud storage, they can build in ways to automatically segment and tier sections of data so that they can be stored on different storage types, and the cloud could be one of those types.
The second common decision point is the initial load of data into the cloud. How do you get it all there? That will be the focus of our next entry.
George Crump is lead analyst of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. Find Storage Switzerland's disclosure statement here.