2/6/2010
04:08 PM
Adrian Lane
Commentary

Amazon's SimpleDB Not Your Typical Database

Several cloud providers offer databases specifically designed for cloud deployment. Amazon's SimpleDB, while technically a database, deviates from what most of us recognize as a database platform. Although SimpleDB is still in prerelease beta format, developers have begun designing applications for it.



SimpleDB, as its name implies, is a very simple data repository. It's designed to provide storage and retrieval services with minimal complexity. The core operations are put (insert), select, batch put (bulk upload), and delete. Data can be divided into domains, much like database schemas, but the similarities end there.
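To make the four core operations concrete, here is a minimal in-memory sketch in Python. It is a toy model, not Amazon's API: the class `ToyDomain` and its method names are hypothetical, chosen only to mirror the put, batch put, select, and delete operations described above.

```python
class ToyDomain:
    """Hypothetical in-memory stand-in for one SimpleDB-style domain."""

    def __init__(self, name):
        self.name = name
        self.items = {}  # item name -> {attribute name: value}

    def put(self, item_name, attributes):
        # Insert a new item or add attributes to an existing one.
        # No schema is consulted.
        self.items.setdefault(item_name, {}).update(attributes)

    def batch_put(self, batch):
        # Bulk upload: a mapping of item name -> attributes.
        for item_name, attributes in batch.items():
            self.put(item_name, attributes)

    def select(self, **criteria):
        # Return the names of items whose attributes match every criterion.
        return sorted(
            name for name, attrs in self.items.items()
            if all(attrs.get(k) == v for k, v in criteria.items())
        )

    def delete(self, item_name):
        # Remove an item; deleting a missing item is a no-op.
        self.items.pop(item_name, None)


photos = ToyDomain("photos")
photos.put("img001", {"owner": "alice", "format": "jpeg"})
photos.batch_put({
    "img002": {"owner": "bob", "format": "png"},
    "img003": {"owner": "alice", "caption": "sunset"},
})
alice_items = photos.select(owner="alice")
photos.delete("img002")
```

Note that the three items carry different attribute sets, and nothing had to be declared before the first put.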

This cloud database service is not for transactional systems where data accuracy and integrity are mandatory.

The fundamental difference with SimpleDB is that it lacks a schema that defines structure and data types, which relational databases require. The lack of the logical schema and table constructs drastically changes operations.

First, you do not need to define data before you insert it, so you can choose data types dynamically. Second, there are no primary or foreign key requirements to force referential integrity relationships because, quite simply, the concepts of columns and keys do not exist. Data elements are not evaluated for conformity prior to insertion. Third, stored data is automatically indexed, but in a simple, Google-esque manner that does not require the overhead associated with relational indices.
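The combination of schemaless insertion and automatic indexing can be sketched as an inverted index keyed on attribute/value pairs. This is only an illustration under assumptions: the names `items`, `index`, `put`, and `lookup` are hypothetical, and nothing here describes how Amazon actually implements its index.

```python
from collections import defaultdict

items = {}                 # item name -> {attribute: value}
index = defaultdict(set)   # (attribute, value) -> names of items holding that pair


def put(item_name, attributes):
    # No declared columns or types: any attribute/value pair is accepted as-is.
    # (Values must be hashable in this toy, since they form part of an index key.)
    stored = items.setdefault(item_name, {})
    for attr, value in attributes.items():
        if attr in stored:
            # Overwriting a value: drop the stale index entry first.
            index[(attr, stored[attr])].discard(item_name)
        stored[attr] = value
        index[(attr, value)].add(item_name)  # every stored pair is indexed


def lookup(attr, value):
    # A single dictionary probe; no table scan and no join.
    return sorted(index.get((attr, value), set()))


put("song1", {"title": "Blue", "year": 1999})
put("photo7", {"owner": "alice", "width": 1024})
put("song1", {"year": 2000})  # change a value without altering any schema
```

Dissimilar items share one store, and each lookup goes straight to the matching item names.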

Finally, SimpleDB relies on indexed flat files. Rather than following the model of using a predefined block structure, adhering to a database vendor proprietary format, SimpleDB writes data out to file without imposing a rigid structure. Data need not be clustered physically in the same location on disk to boost performance.

When you strip away all of the relational database management overhead, insertions and queries are much faster. Insertion performance is very fast because there is no processing to perform data integrity, conformity, and consistency checks. Queries by nature are not complex and use a direct indexing system, which quickly locates data that matches query attributes. There are no issues with joining tables together, and much like a Google search, unstructured and dissimilar types of information are quickly located.

Flexibility is also a big advantage. With no need to worry about data types, your application can store different data types without knowing what you will be storing in advance. Applications built on this platform can offer dynamic storage capabilities, and changes to the application or data types do not require restructuring the database.

But these advantages in speed, cost of ownership, and programmatic simplicity do have a downside. SimpleDB does not offer transactional consistency support. Database "state" is not guaranteed, and operational processes endemic to most relational databases, like "two-phase commit" to ensure your data was actually stored, are not present. That means if something goes wrong during the insertion or deletion process, then data you meant to store may not be available.
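One way an application might compensate is to read back what it wrote and retry on a mismatch. The sketch below is an assumption-laden illustration, not a SimpleDB feature: `FlakyStore` and `put_and_verify` are hypothetical names, and the store simulates lost writes purely for demonstration. A read-back check reduces the chance of silently losing data but does not provide transactional guarantees.

```python
class FlakyStore:
    """Hypothetical store that silently drops its first `failures` writes."""

    def __init__(self, failures):
        self.failures = failures
        self.data = {}

    def put(self, item_name, attributes):
        if self.failures > 0:
            self.failures -= 1   # simulate a write that never lands
            return
        self.data[item_name] = dict(attributes)

    def get(self, item_name):
        return self.data.get(item_name)


def put_and_verify(store, item_name, attributes, attempts=3):
    # Write, then read back; retry until the stored copy matches
    # or the attempts run out.
    for _ in range(attempts):
        store.put(item_name, attributes)
        if store.get(item_name) == attributes:
            return True
    return False


recovered = put_and_verify(FlakyStore(failures=2), "order1", {"total": 10})
gave_up = put_and_verify(FlakyStore(failures=5), "order1", {"total": 10})
```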

Checks that validate the type, integrity, or range of the data are not available. Underlying cloud storage is cheap and readily available, but multitenant (shared) in nature, which may not meet regulatory confidentiality and security requirements.
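Because the store performs no type or range validation, any such checks have to live in the application, before the data is written. A minimal sketch follows; the function `validate_reading`, the attribute names, and the temperature limits are all hypothetical and chosen only for illustration.

```python
def validate_reading(attributes):
    """Return a list of problems; an empty list means the item may be stored."""
    problems = []

    temp = attributes.get("temperature_c")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(temp, bool) or not isinstance(temp, (int, float)):
        problems.append("temperature_c must be a number")
    elif not -90 <= temp <= 60:
        problems.append("temperature_c out of range")

    if not isinstance(attributes.get("sensor_id"), str):
        problems.append("sensor_id must be a string")

    return problems


good = validate_reading({"sensor_id": "s1", "temperature_c": 21.5})
bad = validate_reading({"sensor_id": 7, "temperature_c": 400})
```

The application would store an item only when the returned list is empty.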

Amazon SimpleDB is in many ways more like a file system you dump your data into than a traditional relational database platform. As its name implies, it was designed for simplicity. By stripping functions down to the very basic elements of insertion, look-up, and deletion, there is not a lot of overhead to slow things down.

Just keep in mind that this will be very good for file- and photo-sharing applications, but SimpleDB is not suitable for transactional systems.

Adrian Lane is an analyst/CTO with Securosis LLC, an independent security consulting practice. Special to Dark Reading. Adrian Lane is a Security Strategist and brings over 25 years of industry experience to the Securosis team, much of it at the executive level. Adrian specializes in database security, data security, and secure software development. With experience at Ingres, Oracle, and ...
