There used to be a time when all you needed to build a Web site was a relational database, a web server and some sort of application server or web framework (the terminology depends on whether you are enterprisey or Web 2.0) that acted as a thin layer to translate requests into database queries. As we've grown as an industry, we've realized that many more tools are needed to build a successful large scale modern web site, from message queues for performing time consuming tasks asynchronously to high availability cloud management tools that ensure your site keeps running in the face of failure.
Leonard Lin has a must-bookmark blog post entitled Infrastructure for Modern Web Sites where he discusses the underlying platform components you'll commonly see powering a large scale web site today. The list is included below:
I’ve split this into two sections. The first I call “below the line,” which are more system level (some things straddle the line):
- API Metering
- Backups & Snapshots
- Counters
- Cloud/Cluster Management Tools
- Instrumentation/Monitoring (Ganglia, Nagios)
- Failover
- Node addition/removal and hashing
- Autoscaling for cloud resources
- CSRF/XSS Protection
- Data Retention/Archival
- Deployment Tools
- Multiple Devs, Staging, Prod
- Data model upgrades
- Rolling deployments
- Multiple versions (selective beta)
- Bucket Testing
- Rollbacks
- CDN Management
- Distributed File Storage
- Distributed Log storage, analysis
- Graphing
- HTTP Caching
- Input/Output Filtering
- Memory Caching
- Non-relational Key Stores
- Rate Limiting
- Relational Storage
- Queues
- Rate Limiting
- Real-time messaging (XMPP)
- Search
- Ranging
- Geo
- Sharding
- Smart Caching
- dirty-table management
Leonard Lin's list comes from his experience working at Yahoo!, but it is consistent with what I've seen at Windows Live and with what I've gathered from publications on the platforms behind other large scale web services like Facebook, eBay and Google. This isn't to say you need everything on the above list to build a successful web site, but there are limits to how much a service can scale or the functionality it can provide without implementing almost every item on that list.
This brings me to Google App Engine (GAE), which is billed as a way for developers to build web applications that are easy to build, maintain and scale. The problem I had with GAE when I took an initial look at the service is that although it handles some of the tough items from the above list, such as high availability cloud management tools, deployment tools and database sharding, it was missing core functionality like message queues. These oversights made it impossible to build large classes of Web applications such as search engines or email services. They also made it impossible to build asynchronous workflows that reduce request latency and so make the site feel more responsive from an end user's perspective.
So it was with some interest that I read Joe Gregorio's post on the Google App Engine blog entitled A roadmap update!, where he updates the product's roadmap with the following announcement:
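To make the latency point concrete, here is a minimal sketch of the kind of asynchronous workflow I mean: the request handler puts the slow work on a queue and returns immediately, and a background worker does the work later. This is plain Python using only the standard library, not the GAE API; the names handle_signup and send_welcome_email are made up for illustration, and a real site would want a durable queue running outside the web process.

```python
import queue
import threading
import time

# Hypothetical in-process task queue; a production site would use a
# durable, out-of-process queue service instead.
task_queue = queue.Queue()
sent = []  # records completed work so the effect is observable


def send_welcome_email(address):
    """Stand-in for a slow operation (hypothetical helper)."""
    time.sleep(0.05)  # simulate slow I/O
    sent.append(address)


def worker():
    """Pull tasks off the queue and run them outside the request path."""
    while True:
        task = task_queue.get()
        if task is None:  # sentinel value: shut the worker down
            task_queue.task_done()
            break
        func, args = task
        try:
            func(*args)
        finally:
            task_queue.task_done()


def handle_signup(address):
    """Request handler: enqueue the slow work and return right away."""
    task_queue.put((send_welcome_email, (address,)))
    return "202 Accepted"


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

status = handle_signup("user@example.com")  # returns without waiting for the email
task_queue.join()      # wait for background work (for demonstration only)
task_queue.put(None)   # ask the worker to stop
worker_thread.join()
```

The handler's response time no longer includes the cost of the slow task, which is the whole point. Without some queueing primitive from the platform, every bit of that work has to happen while the user waits.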
The App Engine team has been plugging away and we're excited about some pretty big announcements in the near future. In the meantime, we decided to refresh our App Engine roadmap for the next six months with some of the great new APIs in our pipeline:
- Support for running scheduled tasks
- Task queues for performing background processing
- Ability to receive and process incoming email
- Support for sending and receiving XMPP (Jabber) messages
As always, keep in mind that development schedules are notoriously difficult to predict, and release dates may change as work progresses. We'll do our best to update this roadmap as our engineers continue development and keep you abreast of any changes!
The ability to receive email and the ability to perform background processing tasks are key features you'd need in any modern site. I've been wanting to try out GAE as a way to keep my Python skills fresh, but have balked at the lack of background tasks and message queues, which artificially limits my creativity. Once these announced features are done, I may have to take back my comments about GAE being useful only for toy applications.
Maybe next time C|Net asks Is Google App Engine successful? the answer will be "Yes". Or at least the possibility that the answer will be "Yes" will go up a few orders of magnitude since it definitely doesn't seem to be successful today.