Sporadically delivered thoughts on Continuous Delivery

ITIL Can Suck, but Shouldn't

The pattern of my career over the past five years or so has involved moving newish, smallish internet/software companies to a post-startup hosting infrastructure. My past three companies were all small companies that developed and hosted internet applications, either for clients (in one case) or their own products. My role in each case was to move things to a more mature infrastructure, with configuration management, monitoring, directory services, and the other pieces needed to be able to manage a growing sprawl of servers and applications.

My focus has been much more on the technical than on the people processes for running and supporting the infrastructures. In my current job, the team I’ve brought in and the infrastructure we’ve built have reached a decent level, although there’s certainly plenty more to do technically. But looking at what we’ve done and what I want to do next, I’ve realized that rather than moving on and doing the same thing at another company, the more interesting challenge will be to take things to the next level.

The next level for me is going beyond the technical to focus on the people and processes. The technology infrastructure is going to grow in size and sophistication, including spreading to multiple data centres globally, but the technical challenges seem like more of the same to me. The challenges that seem newer and more intriguing to me personally are more along the lines of how the hell we’re going to organize and coordinate people doing development, infrastructure, and support in three, four, or more countries.

So I went on a course in ITIL version 3. Yikes. ITIL is basically a blueprint for organizing a huge IT operation with lots of bureaucratic processes, forms, and signoffs that will make it nearly impossible to get anything done, and ensure that responsibilities are divided so that nobody who is doing anything productive sees the big picture.

I don’t think it has to be this way. I actually did find the course useful, although not as useful as it could have been given that most of the people on it were more interested in ticking off the certification than getting ideas on how to improve the organisations they work at. There were some pretty interesting people there, some of whom were obviously interested in fixing real problems “back home”. If the course had been more of a workshop where we shared war stories and ideas, it would have rocked.

A lot of the concepts in ITIL are useful. I think it’s more a matter of using your head when applying them, making sure to adapt the ideas in ways that fit your needs and objectives. It’s very easy to see how organisations, especially large ones, take the ITIL material and use it to build horribly inefficient IT structures. I’ve worked with companies that use ITIL this way, and the course shed light on how they got this way.

The biggest problem with ITIL is that it’s presented with clearly defined “phases” of strategizing, designing, deploying (“transitioning”), operating, and improving IT services. This is an invitation to a waterfall model, where (as in at least one organization I know of) each phase can even be run by a completely different team of people.

So one group designs the service, hands it off to another that rolls it out (tests and installs it), and then hands off to a completely separate team that supports it. In the organisation I’ve encountered, the operations team hasn’t got the vaguest clue about the service.

Of course the transition process involves “knowledge transfer” where the people who set up the service train the support team, but anybody who’s done this stuff in the real world should know better.

Knowledge that is transferred in a handover process is never, ever, ever going to be learned as well as knowledge that comes from actually being involved throughout the whole process. Having some hands-off manager (ahem) overseeing things all the way through doesn’t cut it. The people who will actually be diving into runtime problems with an application need to have gotten their hands dirty trying to install the application, and even to have pitched into the meetings where the details of how the application should be integrated into the infrastructure are worked out.

Otherwise, you’re going to end up in the situation of my nameless organisation, one which is actually often held up as an exemplar of ITIL. They host an application on their servers, installed by the transition team, and their support team had training on how to log into the server and investigate problems with it. But when users call up with problems, the support people, who probably support dozens of applications, have forgotten all of this. They call up the software vendor - who have no access to the servers.

Can you imagine how incompetent your organisation looks when it’s clear that your support people have no idea that the application they support is run by their own company?

But I do think it’s possible to take many of the ideas of ITIL and apply them in a more agile manner. A bit of Googling shows I’m not the only one who thinks so, but that there doesn’t seem to have been much work done on the idea, at least publicly. It’s certainly something that would take a bit of thought and work.

My first thought, clearly, is that an agile IT services process would have to embrace the lean management principle of empowerment by having the “workers” (for lack of a better word) involved throughout the process.

I’ve also thought that the kanban approach to agile is particularly suited to a sysadmin team, since it does away with the iterations/release cycle in favor of a queue of tasks that people pull from when they find they’ve got spare capacity.
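To make the pull idea concrete, here is a rough Python sketch of how such a queue might behave. Everything in it is my own illustration rather than any real tool: the `Admin` and `KanbanQueue` names, the per-person work-in-progress limit, and the sample tasks are all assumptions.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Admin:
    """A hypothetical team member with an assumed work-in-progress limit."""
    name: str
    wip_limit: int = 2
    in_progress: list = field(default_factory=list)

    def has_capacity(self) -> bool:
        # Spare capacity means fewer tasks in hand than the WIP limit allows.
        return len(self.in_progress) < self.wip_limit


class KanbanQueue:
    """A single shared queue of tasks; nobody is assigned work up front."""

    def __init__(self, tasks):
        self.backlog = deque(tasks)

    def pull(self, admin: Admin):
        """Let an admin take the next task, but only if they have capacity.

        Returns the task pulled, or None if the queue is empty or the
        admin is already at their limit.
        """
        if not self.backlog or not admin.has_capacity():
            return None
        task = self.backlog.popleft()
        admin.in_progress.append(task)
        return task

    def finish(self, admin: Admin, task) -> None:
        """Mark a task as done, freeing capacity for the next pull."""
        admin.in_progress.remove(task)


if __name__ == "__main__":
    # Illustrative tasks only.
    queue = KanbanQueue(["patch web servers", "add disk alerts", "rotate logs"])
    alice = Admin("alice")

    queue.pull(alice)                        # takes the first task
    queue.pull(alice)                        # takes the second task
    blocked = queue.pull(alice)              # at the limit, so nothing is pulled
    queue.finish(alice, "patch web servers")
    resumed = queue.pull(alice)              # capacity freed, pulls the third task
    print(blocked, resumed)
```

The point of the sketch is that there is no iteration boundary anywhere: work only moves when somebody has room for it.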

Anyway, I’m looking forward to thinking this stuff through and trying out ideas over the next year. Although I’m going to be far less hands-on technically, my focus does need to involve a thorough understanding of the technical aspects of what we’re doing, so I don’t think I’m going to become a total suit.