According to Werner Vogels's blog post entitled "Amazon EBS - Elastic Block Store has launched", it seems that my friends at Amazon have plugged a gaping hole in their cloud computing platform story. Werner writes:
Back in the days when we made the architectural decision to virtualize the internal Amazon infrastructure one of the first steps we took was a deep analysis of the way that storage was used by the internal Amazon services. We had to make sure that the infrastructure storage solutions we were going to develop would be highly effective for developers by addressing the most common patterns first. That analysis led us to three top patterns:
- Key-Value storage. The majority of the Amazon storage patterns were based on primary key access leading to single value or object. This pattern led to the development of Amazon S3.
- Simple Structured Data storage. A second large category of storage patterns were satisfied by access to simple query interface into structured datasets. Fast indexing allows high-speed lookups over large dataset. This pattern led to the development of Amazon SimpleDB. A common pattern we see is that secondary keys to objects stored in Amazon S3 are stored in SimpleDB, where lookups result in sets of S3 (primary) keys.
- Block storage. The remaining bucket holds a variety of storage patterns ranging from special file systems such as ZFS to applications managing their own block storage (e.g. cache servers) to relational databases. This category is served by Amazon EBS which provides the fundamental building block for implementing a variety of storage patterns.
What I like about Werner's post is that it shows that Amazon had a clear vision and strategy around providing hosted cloud services and has been steadily executing on it.
S3 handles what I've typically heard described as "blob storage". A typical Web application has media files and other resources (images, CSS stylesheets, scripts, video files, etc.) that are simply accessed by name or path. However, a lot of these resources also have metadata (e.g. a video file on YouTube has metadata about its rating, who uploaded it, number of views, etc.) which needs to be stored as well. This need for queryable, schematized storage is where SimpleDB comes in. EC2 provides a virtual server for computation, complete with a local file system whose contents are lost if the virtual server goes down for any reason. With SimpleDB and S3 you have the building blocks for a large class of "Web 2.0" style applications when you throw in the computational capabilities provided by EC2.
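To make that division of labor concrete, here is a minimal sketch in Python. It does not call the real S3 or SimpleDB APIs; `BlobStore` and `MetadataIndex` are hypothetical in-memory stand-ins, and the video keys and attribute names are made up for illustration. The point is only the shape of the pattern: metadata queries return blob keys, and the blobs themselves are fetched by name.

```python
class BlobStore:
    """Hypothetical in-memory stand-in for an S3-style blob store."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


class MetadataIndex:
    """Hypothetical in-memory stand-in for a SimpleDB-style attribute store."""

    def __init__(self):
        self._items = {}  # blob key -> dict of attributes

    def put_attributes(self, key, attributes):
        self._items.setdefault(key, {}).update(attributes)

    def query(self, **criteria):
        """Return the sorted blob keys whose attributes match every criterion."""
        return sorted(
            key
            for key, attrs in self._items.items()
            if all(attrs.get(name) == value for name, value in criteria.items())
        )


blobs = BlobStore()
index = MetadataIndex()

# Each resource is stored once as a blob, with its metadata stored separately.
blobs.put("videos/cat.flv", b"cat video bytes")
index.put_attributes("videos/cat.flv", {"uploader": "alice", "rating": 5})
blobs.put("videos/dog.flv", b"dog video bytes")
index.put_attributes("videos/dog.flv", {"uploader": "bob", "rating": 5})
blobs.put("videos/fish.flv", b"fish video bytes")
index.put_attributes("videos/fish.flv", {"uploader": "alice", "rating": 3})

# A metadata query yields blob keys, which are then fetched by name.
alice_keys = index.query(uploader="alice")
alice_videos = [blobs.get(key) for key in alice_keys]
```

Neither stand-in gives you joins, transactions, or a file system, which is exactly the gap the next paragraph is about.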
However, neither S3 nor SimpleDB provides a solution for a developer who simply wants the typical LAMP or WISC developer experience of building a database-driven Web application, or for applications with custom storage needs that don't fit neatly into the buckets of blob storage or schematized storage. Without access to a persistent file system, developers on Amazon's cloud computing platform have had to come up with sophisticated solutions involving manually backing data up from EC2 to S3 to get the desired experience.
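The workaround looks roughly like the following sketch. This is an illustration under stated assumptions, not anyone's actual backup tooling: a plain dict named `durable_store` stands in for an S3 bucket, temporary directories stand in for the local disks of two EC2 instances, and `backup_directory`, `restore_directory`, and the `backups/app` prefix are hypothetical names.

```python
import tempfile
from pathlib import Path


def backup_directory(local_dir, durable_store, prefix):
    """Copy every file under local_dir into durable_store, keyed by prefix plus relative path."""
    saved = []
    for path in sorted(Path(local_dir).rglob("*")):
        if path.is_file():
            key = prefix + "/" + path.relative_to(local_dir).as_posix()
            durable_store[key] = path.read_bytes()
            saved.append(key)
    return saved


def restore_directory(local_dir, durable_store, prefix):
    """Recreate under local_dir every file whose key in durable_store starts with prefix."""
    for key, data in durable_store.items():
        if key.startswith(prefix + "/"):
            target = Path(local_dir) / key[len(prefix) + 1:]
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(data)


durable_store = {}  # hypothetical stand-in for an S3 bucket

# First instance: write data locally, then back it up before the instance goes away.
with tempfile.TemporaryDirectory() as first_instance:
    (Path(first_instance) / "db").mkdir()
    (Path(first_instance) / "db" / "table.dat").write_bytes(b"rows")
    saved_keys = backup_directory(first_instance, durable_store, "backups/app")

# Replacement instance: its local disk starts empty, so restore from the backup.
with tempfile.TemporaryDirectory() as second_instance:
    restore_directory(second_instance, durable_store, "backups/app")
    restored = (Path(second_instance) / "db" / "table.dat").read_bytes()
```

Anything written after the last backup is gone when the instance dies, and the application has to be built around that window. That is a lot of ceremony compared to just writing to a disk that sticks around.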
EBS is the final piece of the puzzle that had prevented Amazon's cloud computing platform from being comparable to traditional hosting solutions. With EBS, Amazon is now superior to most traditional hosting solutions from a developer usability perspective as well as on cost. Google App Engine now looks like a plaything in comparison. In fact, you could build GAE on top of Amazon's cloud computing platform now that EBS has solved the persistent custom storage problem. It will be interesting to see if higher-level cloud computing platforms such as App Engine start getting built on top of Amazon's cloud computing platform. Simply porting GAE wholesale would be an interesting academic exercise and a fun hacking project.