In my previous post, I talked about some of the issues I saw with the idea of doing away with operations teams and merging their responsibilities into the development team's tasks [as practised at companies like Amazon]. Justin Rudd, a developer at Amazon, gives his first-hand perspective on this practice in his blog post entitled Expanding on the Pain, where he writes:
Since I am a current employee of Amazon in the software development area, I probably shouldn’t be saying this, but…
...
First a few clarifications - there is no dedicated operations team
for Amazon as a whole that is correct. But each team is allowed to
staff as they see fit. There are teams within Amazon that have support
teams that do handle quite a bit of the day to day load. And their
systems tend to be more “smooth” because this is what that team does -
keep the system up and running and automate keeping the system up and
running so they can sleep at night.
There are also teams dedicated to networking, box failures, etc. So
don’t think that developers have to figure out networking issues all
the time (although there are sometimes where networking doesn’t see a
problem but it is affecting a service).
Now for those teams that do not have a support team (and I am on one
of them), at 3 in the morning you tend to do the quickest thing
possible to get the problem rectified. Do you get creative? After
being in bed for 3 hours (if you’re lucky) and having a VP yell at you
on the phone that this issue is THE most important issue there
is or having someone yell at you that they are going to send staff
home, how creative do you think you can be? Let me tell you, not that
creative. You’re going to solve the problem, make the VP happy (or get
the factory staff back to work), and go back to bed with a little post
it note to look for root cause of the problem in the AM.
Now 1 of 2 things happens. If you have a support team, you let them
know about what happened, you explain the symptoms that you saw, how
you fixed it, etc. They take your seed of an idea, plant it, nurture
it, and grow it.
If you don’t have a support team and you are lucky, in the morning there won’t be another THE most
important thing to work on and you can look at the problem with some
sleep and some creativity. But the reality is - a lot of teams don’t
have that luxury. So what happens? You end up cronning your solution
which may be to bounce your applications every 6 hours or run a perl
script that updates a field at just the right place in the database,
etc.
We all have every intention of fixing it, but remember that VP that
was screaming about how this issue had to be fixed? Well now that it
isn’t an issue anymore and it’s off his radar screen, he has new
features that he wants pushed into your code. And those new features
are much more important than you fixing the issue from the night before
because the VP really doesn’t care if you get sleep or not at night.
Justin's account jibes with accounts I've heard [second hand] from ex-Amazon developers about what it means to live without an operations team. Although it sounds good on paper to have the developers who write the code also be responsible when that code has issues on the live site, in practice it leads to developers burning the candle at both ends. Remember, division of labor exists for a reason.