The folks at 37 Signals have an insightful blog post titled "Don’t scale: 99.999% uptime is for Wal-Mart", which states:
Jeremy Wright purports a common misconception about new companies doing business online: That you need 99.999% uptime or you’re toast. Not so. Basecamp doesn’t have that. I think our uptime is more like 98% or 99%. Guess what, we’re still here!
Wright correctly states that those final last percent are incredibly expensive. To go from 98% to 99% can cost thousands of dollars. To go from 99% to 99.9% tens of thousands more. Now contrast that with the value. What kind of service are you providing? Does the world end if you’re down for 30 minutes?
If you’re Wal-Mart and your credit card processing pipeline stops for 30 minutes during prime time, yes, the world does end. Someone might very well be fired. The business loses millions of dollars. Wal-Mart gets in the news and loses millions more on the goodwill account.
Now what if Delicious, Feedster, or Technorati goes down for 30 minutes? How big is the inconvenience of not being able to get to your tagged bookmarks or do yet another ego-search with Feedster or Technorati for 30 minutes? Not that high. The world does not come to an end. Nobody gets fired.
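To put the percentages in that quote in perspective, here is a quick back-of-the-envelope sketch that converts an uptime percentage into allowed downtime per year. The function name and the 365-day year are my own assumptions for illustration; nothing here comes from the 37 Signals post.

```python
# Rough conversion from an uptime percentage to yearly downtime.
# Assumption: a 365-day year (8760 hours), ignoring leap years.
HOURS_PER_YEAR = 365 * 24


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Return the hours of downtime per year allowed at the given uptime."""
    return HOURS_PER_YEAR * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    for pct in (98.0, 99.0, 99.9, 99.999):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% uptime allows about {hours:.2f} hours "
              f"({hours * 60:.0f} minutes) of downtime per year")
```

By this arithmetic, 98% uptime leaves room for roughly 175 hours of downtime a year (about a week), 99% for about 88 hours, 99.9% for under 9 hours, and 99.999% for only around 5 minutes.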
Scalability issues are probably the most difficult to anticipate and mitigate when building a web application. When we first shipped MSN Spaces last year, I assumed that we'd be lucky if we became as big as LiveJournal. I never expected that we'd grow to be three times as big and three times as active within a year. We've had our growing pains, and it has definitely been surprising at times to find out which parts of the service get the most use and thus need the most optimization.
The fact is that everyone has scalability issues. No one can deal with their service going from zero to a few million users without revisiting almost every aspect of their design and architecture. Even the much-vaunted Google has had these problems: just look at the reviews of Google Reader that called it excruciatingly slow, or the complaints that Google Analytics was so slow as to be unusable.
If you are a startup, don't waste your time and money worrying about what happens when you have millions of users. Premature optimization is the root of all evil, and in certain cases it will make you more conservative than you should be when designing features. Remember, even the big guys deal with scalability issues.