lean

Iterations considered harmful

The iteration is a cornerstone of agile development. It provides a heartbeat for the team and its stakeholders, and a structure for various routine activities that help keep development work aligned with what the customer needs. However, the way many teams run their iterations creates serious pitfalls which can keep them from delivering software as effectively as they could.

The orthodox approach to the iteration is to treat it as a timebox for delivering a batch of stories, which is the approach most Scrum teams take with sprints (the Scrum term for an iteration). In recent years many teams have scrapped this approach, either using iterations more as a checkpoint, as many ThoughtWorks teams do, or scrapping them entirely with Kanban and Lean software development.

For the purpose of this post, I will refer to these two general approaches to running iterations as the "orthodox" or "timeboxed-batch" iteration model on the one hand, and the "continuous development" model on the other hand. Although orthodox iterations have value, certainly over more old-school waterfall project management approaches, continuous development approaches which do away with timeboxing and avoid managing stories in batches allow teams to more effectively deliver higher quality software.

Orthodox iteration model (or "timeboxed-batch" model): Each iteration works on a fixed batch of stories, all of which must be started and finished within a single iteration.
Continuous development model: Stories are developed in a continuous flow, avoiding the need to stop development in order to consolidate a build containing only fully complete stories.

The anatomy of the orthodox iteration

In the classically run iteration or sprint, the Product Owner (PO) and team choose a set of stories that the team commits to deliver by the end of the iteration. All of the stories in this batch should be sufficiently prepared before the iteration begins. The level and type of preparation varies between teams, but usually includes a level of analysis including the definition of acceptance criteria. This analysis should have been reviewed by the PO and developers to ensure there is a common understanding of the story. The PO should understand what they can expect to have when the story is implemented, and the technical team should have enough of an understanding of the story to estimate it and identify potential risks.

The iteration begins with an iteration kickoff meeting (IKO) where the team reviews the stories and confirms their confidence that they can deliver the stories within the iteration. The developers then choose stories to work on, discussing each story with the PO, Business Analyst (BA), and/or Quality Analyst (QA) as appropriate, then breaking it down into implementation tasks. Implementation takes place with continual reviews, or even pairing with these other non-developers, helping to keep implementation on track, and minimizing the amount of rework needed when the story is provisionally finished and goes into testing.

The QA and BA/PO then test and review each story as its implementation is completed. This is in addition to the automated tests which have been written and are run repeatedly, following TDD and CI practices. Only once the story is signed off do the developers move on to another one of the stories in the iteration's committed batch.

As the end of the iteration approaches, developers and QAs should be wrapping up the last stories and preparing a releasable build for the showcase, which is typically held on the final day of the iteration. In the showcase, the team demonstrates the functionality of the completed stories to the PO and other stakeholders, and the stories are signed off. The team holds a retrospective to consider how they can work better, then on the next working day they hold the IKO to start the following iteration.

When the iteration ends the team has a complete, fully tested and releasable build of the application, regardless of whether the software actually will go into production at this point.

The start and end dates of the iteration are firmly fixed. If there are stories (or defect fixes) which aren't quite ready at the end of the iteration, the iteration end date is never slipped. Instead, the story is not counted as completed, so must be carried over to the next iteration.

The benefits of the orthodox iteration

This style of iteration offers many benefits over traditional waterfall methodologies. A short, rigid cycle for producing completely tested and releasable code forces discipline on the team, keeping the code in a near-releasable state throughout the project, and avoiding the temptation to leave work (e.g. testing) for "later", building up unmanageable burdens of work, stress, and defects to be dealt with under the pressure of the final release phase.

The timeboxed iteration also forces the team to learn how to define stories of a manageable size. If stories are routinely too big to complete in one iteration, this is a clear sign that the team needs to improve the way it defines and prepares stories.

This demonstrates another benefit of the iteration, which is frequent feedback. The team is able to evaluate not only the quality of their code and its relevance to the business by getting feedback quickly, they are also able to evaluate how effectively they are working, and try out ideas for improving continually throughout the project.

Fundamental problems with the orthodox iteration

The timeboxed-batch approach to iterations has value, particularly for teams inexperienced with agile. However, it has fundamental problems. At core, this approach is waterfall written small, with many of the same flaws, albeit with a small enough cycle that issues can be dealt with more quickly than with a full waterfall project.

To understand why this is so, let's flesh out the idealized anatomy of the iteration from above with some of the things which often happen in practice.

  • At the start of the iteration, no development is taking place, because everyone is working on preparing the new batch of stories. The BA's are extremely busy now, because they have a full working set of stories to hand over to developers (i.e. however many stories the team can work on at once, that's how many stories the BA's must hand over all at the same time). The QA's are less busy now, although they may be helping the BA's out, and planning their testing for the iteration's stories.
  • Actually, I lied. Development is taking place, and QA's are extremely busy. Testing and bugfixing stories left over from the previous iteration is still going on. See points below to understand why. As a result, preparation and starting of the stories for this iteration is sluggish because of work carried over from the previous iteration. This actually isn't necessarily bad, since it helps to stagger the story preparation work, preventing the BA's from becoming a bottleneck. However, depending on whether carryover work was factored into the number of stories chosen for the current iteration (business pressures often make this difficult), this may mean the team is already at risk of failing to meet its commitment.
  • Once the previous iteration's work has settled down, QA's have little to do until the end of the iteration approaches, at which point they come under enormous pressure. Developers are humping to get their stories done in time, leaving QA's with a pile of stories to be tested in time for the showcase. Any defects they find increase this pressure even more, with very little time to get the fixes in and then re-tested (maybe needing even more fixing and re-testing!)
  • If developers complete a story with a bit of time left in the iteration, they aren't able to start new stories because the stories for the following iteration won't be ready to work on until the IKO.
  • In the end, some stories don't get fully tested during the iteration. They may be tested in the following iteration, after having already been signed off as "complete" by the unsuspecting Product Owner. If so, developers need to be pulled away to fix the defects found, or else the defects are added to a backlog to be fixed "later" (also known as "probably never"). Other developed code is left completely untested or under-tested, with the vague hope that any defects will be found in later testing phases, or that maybe there aren't any important bugs anyway. In fact, these defects will be found, but not at a time more convenient to the team.
  • If any serious issues are raised by stakeholders during the showcase there is not time to fix them until the next iteration, which means it will take the full length of an iteration before a truly releasable build is created.

The root problem of the orthodox iteration

At the end of the day, the orthodox iteration suffers from two problems which are inherent in its very definition: it organizes work into batches, and it enforces a timebox.

Batching work is the antithesis to flow. The Lean approach to working aims to maximize the flow of work for the members of a team, which in software development translates to getting stories flowing easily through creation, analysis, implementation, validation, and release. When a developer finishes one story and it is signed off, she should have another story ready for her to pick up and start on. This shouldn't need to wait on an arbitrary ceremony, and certainly shouldn't have to wait for everyone else on the team to finish their stories and get them all signed off.

The batching focus of orthodox iterations doesn't only cause developers to block, it also turns BA's, QA's, and the PO into bottlenecks. As described above, the start and end of the iteration each put a full working set of stories in the same state at once, all needing the same activity carried out on them at once.

Imagine an assembly line which starts up to assemble twenty cars, then stops while they are all inspected at once. Only once all of the cars are inspected and their defects fixed does the line start up again to begin assembling another twenty cars.
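To put rough numbers on the assembly line analogy, here is a minimal Python sketch. The function names, the single inspector, and the one-time-unit durations for assembly and inspection are illustrative assumptions, not measurements from any real team.

```python
def batch_line(cars, assemble=1, inspect=1):
    """Assemble every car first, then stop the line and inspect them all."""
    assembly_done = cars * assemble  # the line stops here for inspection
    # Car i is only finished once every car before it has been inspected too.
    return [assembly_done + (i + 1) * inspect for i in range(cars)]


def flow_line(cars, assemble=1, inspect=1):
    """Inspect each car as soon as it leaves assembly (one inspector)."""
    finish_times = []
    inspector_free = 0
    for i in range(cars):
        assembled_at = (i + 1) * assemble
        start = max(assembled_at, inspector_free)  # wait for car or inspector
        inspector_free = start + inspect
        finish_times.append(inspector_free)
    return finish_times


batch = batch_line(20)
flow = flow_line(20)
print("first car finished:", batch[0], "vs", flow[0])    # 21 vs 2
print("last car finished:", batch[-1], "vs", flow[-1])   # 40 vs 21
```

Under these assumptions the first inspected car comes off the batched line at time 21 rather than time 2, and the whole run of twenty takes 40 time units rather than 21. The same shape applies to stories waiting for testing at the end of an iteration.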

Timeboxing is also a source of problems for iterations. The main problem is that the arbitrary deadline creates pressure to get stories "over the line" so they can count towards the velocity for the iteration. Unless management is enlightened (or uninterested) enough to avoid focusing on fluctuations of velocity from iteration to iteration (and even the most enlightened managers I've worked with do get worked up over velocity), this leads to the temptation to rush and cut corners, or to play games with stories.

Rushing obviously endangers the quality of the code, which almost certainly leads to delays down the line when the defects surface. Playing games, such as closing unfinished stories and opening defects to complete the work, or counting some points towards an unfinished story, undermines the team's ability to measure and predict its work honestly. These bad habits will catch up one way or another.

Expecting code to be complete at the end of the iteration, fully tested, fixed, and ready for deployment, is unrealistic unless the iteration is structured with significant padding at the end. This padding must come after all reviews, including the stakeholder showcase, to allow time to make corrections, unless those reviews are mere rubber stamp sessions, with no genuine feedback permitted. This then means the team will be underutilized during the padding time. Otherwise, if there is so much rework done during this period that the entire team is fully engaged, then the risk of introducing new defects is too high to be confident in stable code by the end.

The alternative is to break the strict timeboxed-batched iteration model by interleaving work on the next iteration with the cleanup work from the previous iteration. This turns out to not be such a bad idea, and leads to evolving away from the timeboxed-batch iteration model towards the continuous development model.

The continuous development model

The continuous development model may be purely iteration-less, e.g. Kanban, or it may still retain the iteration as a period for measuring progress and for scheduling activities such as showcases. Once development is underway stories are prepared, developed, and tested using a "pull" approach, being worked on as team members become available, so that stories are constantly flowing, and everyone is constantly working on the highest value work available at the moment. This requires some different approaches to managing work flow than are used with other approaches. For more information, look into Kanban and Lean software development.

Since joining ThoughtWorks a year or so ago, I've found that although no two projects there run in exactly the same way, there is a strong tendency to work in a way which looks a lot like Kanban, while retaining a one- or two-week iteration. Iterations are used to report progress (including velocity), and to schedule showcases and other regular meetings, but stories are not moved through the process in batches. Teams don't start and stop work as a whole other than at the start and end of a release. If the showcase is two days away, nothing stops a developer pair from starting on a new story knowing full well it will be incomplete when the codebase is demoed to the stakeholders, and possibly even deployed to later stage environments.

Although we do make projections and aim to have certain stories done by the next showcase, the team doesn't promise to deliver a specific batch of stories. If it makes sense, stories can be dropped, added, or swapped as needed. This gives the business more flexibility to adapt their vision of the software as it is developed. It also reduces the pressure to mark a given story as "done" by a hard deadline, since there is no disruption from letting work carry on over the end of an iteration.

I've seen a Scrum team become ornery and rebellious when a PO made a habit of asking to swap stories after a sprint had started, even though work hadn't been started on the particular stories involved. This was made worse because bugfixes were scheduled into sprints alongside stories, meaning that any serious defect found in production completely disrupted the team.

Another factor that aggravated the situation was that the stories for each sprint were agreed before the end of the previous iteration. So if the showcase raised ideas for improvements to the functionality completed in iteration N, new stories could only be started in iteration N + 2 at the soonest. This hardly created a situation where the PO or the business felt the development team was responsive to business needs.

Also see Oren Teich's post Go slow to go fast, which points out the problems with deadlines, and that iterations are simply a shorter deadline.

Challenges and rewards of continuous development

There are certainly challenges in moving to continuous development over the timeboxed-batch model. There is more risk of stories dragging on across multiple iterations. This can be mitigated by monitoring cycle time and keeping things visible, so that the team can discuss the issue and make changes to their processes if it becomes a problem.
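As a rough illustration of what monitoring cycle time could look like, here is a small Python sketch. The story names, dates, the tuple layout, and the ten-day limit are all hypothetical; a real team would pull this data from whatever tracking tool it uses.

```python
from datetime import date


def cycle_times(stories):
    """Days from start to sign-off for each finished story."""
    return {name: (done - started).days
            for name, started, done in stories if done is not None}


def dragging(stories, today, limit_days):
    """Unfinished stories that have been in progress longer than limit_days."""
    return [name for name, started, done in stories
            if done is None and (today - started).days > limit_days]


# Each story is (name, date started, date signed off or None). Made-up data.
stories = [
    ("login form", date(2024, 1, 1), date(2024, 1, 4)),
    ("search page", date(2024, 1, 2), date(2024, 1, 9)),
    ("export to CSV", date(2024, 1, 3), None),  # still in progress
]

print(cycle_times(stories))
print(dragging(stories, today=date(2024, 1, 15), limit_days=10))
```

Putting the list of dragging stories somewhere visible, such as the team board or the standup, is what makes the number useful. The calculation itself is trivial.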

For teams which are new to agile and still struggle to create appropriately sized stories, the timeboxed model may be more helpful to build the discipline and experience needed before being able to move to a continuous model. However, for experienced teams, timeboxing and batching stories simply has too many negative effects.

Continuous development, with a looser approach to iterations, maximizes the productivity of the team, avoids pitfalls that put quality at risk, and offers the business and the team more flexibility.

Successful software delivery in spite of Evil IT

In my previous post, I glibly said that SLAs represent waste that an organization has identified and formalized. Reader Kenfin commented on my post, rightly calling me out to provide alternatives.

If you believe that SLA's 'formalise waste' this way how would you approach my situation where communications are beyond poor (atrocious) and the org structure is silo'd and no one is accountable for their work?

Kenfin's example illustrates my point quite well - the organization's structure is an obstacle to effective delivery. Since he's not in a position to fix this problem, he's turned to SLA's as a way to manage the problem. They won't make the issues go away, but they may give him a handle to manage them, and importantly, make them more predictable.

Turtle on a keyboard, like slow IT people. It's a metaphor.

But it's a fair question: what can someone in Kenfin's shoes do in the face of an IT organization which is inherently not aligned to providing the services he needs to deliver software to his users effectively?

A common strategy, and one that I've helped teams inside these kinds of organizations do, is to completely bypass the existing IT organization. The goal is to put control of everything that the product team needs in order to deliver into its hands, rather than leaving it at the mercy of a group (or multiple groups) who have other priorities.

Outsource it!

One way to do this is outsourcing, finding another company that specializes in the functions that the IT group would provide, whether this means development, integration, hosting, or something else. This works best if the project is not seen as core to the business, so that it avoids fear of entrusting sensitive data or business critical functions to outsiders. It also helps if the project needs skills that can't be found in-house.

My friends at Cognifide have built their business on this, building technically complex content-focused websites for corporate clients, delivering far more quickly, and with greater expertise, than most corporate IT organizations can manage. This is also the premise that Software as a Service (SaaS) is based on. By choosing Salesforce for CRM, a company completely bypasses the massive IT project that would be required to implement an off-the-shelf, self-hosted CRM package (integration with other applications aside).

There are pitfalls to outsourcing to bypass IT. Many outsourcers are no more responsive than an in-house IT department, using SLAs and change control processes to make their workload, risks, and profitability more manageable.

Do it yourself!

The strategy I've most often been involved with myself (although I didn't really think of it this way at the time) is product departments building their own IT capabilities. Again, this is about having control of the services and resources the group needs in order to deliver to its own customers.

The typical pattern is an "online" (or often, "digital") department of a company where online was originally on the fringes of the main business, but has in recent years grown into a major channel for sales, customer service, or even delivery of products (for example in publishing).

The online team leverages their growing importance, as well as the specialized needs they have compared with typical corporate IT customers, to get approval from top management to create their own "digital operations team" or similar. This team may outsource elements of infrastructure, such as hosting (with IaaS cloud providers as an increasingly appealing option), but they are able to respond immediately to the needs of the online product group, because a) they don't have to juggle requests from other departments and teams, and b) they report directly to the manager of that group.

But what if I have to use the crappy IT guys who don't care about my project?

Those strategies are not feasible for every team. I've certainly had to support projects where we had no alternative but to struggle along with unresponsive IT. In these cases, SLAs may well have to do, even though they represent waste and inefficiency.

There are a few other things you might at least try out in these cases. Your goal is still to have the resources you need in order to get things done at your disposal as far as possible. So see if you can identify those services which are especially critical, and particularly those which are likely to change frequently, and see if you can get some dedicated resource assigned to your project. You want someone who will sit with your team, be incentivized by the success of your project, and who has the skills, authority, and system privileges to carry out the tasks you need.

If full time secondment of people to your team is not quite feasible due to budget, lack of available resource, etc., see if you can at least get commitments of time from the right people. Can someone come to daily standups? Weekly meetings? Regular release management meetings? Ask for as much as possible to start with, then see what you can get.

Also, maybe you can hire someone into your own team with qualifications and background that will help them effectively liaise with difficult IT teams. Your own DBA, security consultant, etc. can engage with the IT groups using their own language, couching things in terms that address their concerns. They may be able to take certain tasks off the IT group's plates, which ends up giving you the ability to get things done more quickly, while at the same time making IT grateful that their workload is lighter.

But the right thing to do is ...

These are all ways to work around the core problems. The best solution is of course for the organization to restructure itself in a way that aligns its resources with its goals. Most companies, especially large ones, insist on organizing themselves in ways that are self-defeating. It's a shame that many people who work in large companies accept this as normal, often even as desirable.

Grouping everyone with a given function into a single group forces them to focus on juggling the competing needs of many stakeholders and managing their own risks (especially the risk of getting blamed when projects fail). They will inevitably favor the abstract principles of their own technical practices over what is most effective in making the business succeed. Much better to group people into units that have complete ownership for delivering business value, and find ways to connect staff of a given function with each other so they can develop their skills and working practices.

Unfortunately most of us are rarely in a position to influence this, so I hope that my suggestions will be helpful to some people in making things a little less painful.

Definition of an SLA

SLA: Waste that an organization has identified in a critical business process and decided to formalize rather than eliminate.

Lean and Agile blogrolls

Here are some blogs I'm currently following about Lean processes. Some are focused on software development, some cover lean processes more broadly.

These are blogs focused on Agile software development as well as Lean:

These are pulled dynamically from my Google Reader account, so will change whenever I change my subscriptions.

Nice description of what Lean development is all about

I'm becoming a big fan of Lean Software Development, a particular strand of Agile Development methodologies, although we are a mostly-Scrum shop. I find Corey Ladas's description a very tidy explanation of the key philosophies of Lean:
