Sporadically delivered thoughts on Continuous Delivery

Breaking Into Automated Infrastructure Management

| Comments

Automated management of infrastructure is vital for delivering highly effective IT services. But although there are plenty of tools available to help implement automation, it’s still common to see operations teams manually installing and managing their servers, which leads to a high-maintenance infrastructure, which soaks up the team’s time on firefighting and other reactive tasks.

Doing it by hand

I’ve met many smart and skilled systems administrators in this situation. These folks know automation can make their life easier, but they can’t afford to take time away from turning cranks, greasing wheels, and unjamming the gears to keep their infrastructure puffing along in order to focus on improving their situation.

I’m convinced this is largely due to habit. Even though these teams understand that automation would be useful to them, when the pressure is on (and the pressure is always on), they roll up their sleeves, ssh into the servers and knock them into shape, because that’s the fastest way to get stuff done. Manual infrastructure management is what they’re used to. I find that most of these teams haven’t had personal experience of well-automated infrastructures, and don’t tend to believe it’s something they can realistically implement for their own operations.

Sysadmins who have worked in teams with mature, comprehensive automation, on the other hand, can’t go back. Sure, they might log into a box to diagnose and fix something that needs fixing right now, but they can’t relax until they’ve baked the fix into their automated configuration, and made sure that their monitoring will alert them ahead of time if the problem happens again.

Breaking out of manual infrastructure management and setting up an effective automation regime is difficult. Although there are loads of tools out there to make it work, it helps to understand good strategies for implementing them. I recommend looking over the material on the site. It hasn’t been updated in a few years, so doesn’t take much of the advances since then into account, including virtualization, cloud, and newer tools like Chef and Puppet, but there is still rich material there.

Another must-read which more up to date is Web Operations by John Allspaw, Jesse Robbins, and a bunch of other smart peeps.

I’m also planning to share a few of the practices I’ve seen and used for automation in upcoming posts.