SLA: Waste that an organization has identified in a critical business process and decided to formalize rather than eliminate.
A development team’s infrastructure - development and QA environments, CI servers, SCM servers, etc. - is indisputably business critical, but is rarely given the kind of monitoring attention that production environments get. This is a missed opportunity, not only to ensure the continuity of development work, but also to gain valuable insight.
Reasons to monitor your application in every environment it’s deployed to:
1 Keep your development team moving
This is the obvious one. You need to know before you run out of disk space, RAM, etc.
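A minimal sketch of that kind of early warning, in Python. The function names, the 10% threshold, and the "/" path are assumptions made up for illustration, not part of any particular monitoring tool. In practice you would more likely feed these numbers into whatever monitoring system you already run in production.

```python
import shutil


def is_low_on_space(total_bytes, free_bytes, min_free_fraction=0.10):
    """Return True when the free share of a volume is below the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return (free_bytes / total_bytes) < min_free_fraction


def check_volume(path, min_free_fraction=0.10):
    """Check the volume holding `path` (hypothetical helper for a CI box)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return is_low_on_space(usage.total, usage.free, min_free_fraction)


if __name__ == "__main__":
    # "/" is an assumed path; point this at the build workspace volume.
    if check_volume("/"):
        print("WARNING: build volume is below 10% free space")
```

Keeping the threshold logic separate from the call that reads the disk makes the rule easy to test without needing a nearly full disk.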
2 Optimize your CI / deployment pipeline
Do you know what the limiting factors are on the time it takes your automated tests to run? The shorter you make your dev/test/fix feedback loop, the more productive your team will be, so why not analyze and optimize it as you would any other key software system? If checkout from SCM takes 20% of the time to test results, what can you do to reduce it? Are your unit tests constrained by CPU, RAM, or disk I/O?
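One way to start answering those questions is to record how long each stage of a CI run takes and look at each stage's share of the total. The sketch below is Python, with made-up stage names and timings. How you collect the durations depends on your CI server and is not shown here.

```python
def stage_shares(durations):
    """Map each stage name to its percentage of total pipeline time."""
    total = sum(durations.values())
    if total <= 0:
        raise ValueError("total duration must be positive")
    return {name: round(100.0 * secs / total, 1)
            for name, secs in durations.items()}


def slowest_first(durations):
    """Stage names ordered from most to least time consuming."""
    return sorted(durations, key=durations.get, reverse=True)


if __name__ == "__main__":
    # Made-up timings in seconds for a single CI run.
    run = {"checkout": 60, "compile": 90, "unit tests": 150}
    shares = stage_shares(run)
    for stage in slowest_first(run):
        print(f"{stage}: {shares[stage]}%")
```

With these illustrative numbers, checkout accounts for 20% of the run, which is the kind of figure that tells you whether it is worth optimizing.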
3 Understand your applications
We’re conditioned to think that measuring performance and resource consumption is only useful in an environment that mirrors our production hardware. But if we build up an awareness of how our application uses memory and other resources as a part of every execution and every environment, we’ll have a deep and intuitive understanding of what makes it tick.
4 Develop and test your monitoring
By having monitoring running against applications while they are still in development, you will find ways to improve how you monitor (“let’s measure the activity on our queues”), catch holes in your monitoring (“why didn’t our monitoring tell us when the dev server queue broker went down?”), and test changes to your monitoring.
Once you put monitoring in place in development and testing and make a habit of using it, it becomes a ubiquitous and indispensable part of your team’s working processes, similar to the shift to using CI well.
Don’t leave it as a low priority task, something to get around to at some point after you get around to setting up a perfect performance testing environment. Put it at the center of your team’s toolset for understanding your work.
A common pattern in software development teams is to have a person who owns the build system. This may be a deliberate decision, or it may evolve organically as a particular team member gravitates towards dealing with the build scripts, automated testing and deployment, etc. While it’s normal for some team members to have a deeper understanding of these things than others, it’s not a good idea for the knowledge and responsibility for the build to become overly concentrated in one person.
The build system should be looked at as a module or component of the software application or platform being developed, so the same philosophy of code ownership applies to it.
If a single person owns the build system, everyone else becomes dependent on them to fix issues with it, and to extend it to meet new needs. There is also a risk, especially for projects which are big enough that maintaining the build system becomes a full time job, that a bit of a siloed mentality can develop.
If developers have a poor understanding of how their software is built and deployed, their software is likely to be difficult and costly to deploy. On the flip side, if build and test tools are implemented and maintained entirely by people who don’t develop or test the software, those tools aren’t likely to make life as easy as it could be for those who do.
In the past few months I’ve taken on a role which is largely focused on this area, and have been helping a development team get their build and delivery system in place. Pairing with developers to implement aspects of the system has worked well, as has letting them take on the setup of particular areas of the build and test tooling. This follows what Martin Fowler calls “Weak Code Ownership”, allowing everyone to take part in working on the build and test system.
Special attention is needed for stages of the path to production as they get further from the developer’s workstation. Developers are keen to optimize their local build and deployment, but can often be fuzzy on what happens when things are deployed in server environments. This is exacerbated when the platforms are different (e.g. developers working on Windows, code deployed on Linux).
Even without platform differences, developers understandably focus on the needs of their own local build over those of production system deployment. This is natural when server deployment is not a part of their daily world. So the best way to compensate for this is to keep developers involved in implementing and maintaining server deployment.
Driving the implementation of the build and deployment system according to the needs of business stories has also been useful. So rather than setting up tooling to test parts of the system that haven’t been developed yet, wait until the design of the code to be tested starts to be understood, and the code itself has actually started being developed. This helps ensure the tooling closely fits the testing and deployment needs, and avoids waste and re-work.
In the first post in this series, I explained why I think Maven is a good idea. Most projects need pretty much the same thing from a build system, but using Ant normally results in a complex, non-standard build system which becomes a headache to maintain.
In theory, Maven should be a better way to run a build. By offering a standardised build out of the box, it should massively reduce the setup and learning curve for new joiners to the development team, let you take advantage of a healthy ecosystem of plugins that can simply be dropped into your build, and save loads of setup and maintenance hassle.
Although it goes pretty far towards the first two of these advantages, in my second post I described how Maven’s configuration is too complex even for simple things.
I note that the Maven site doesn’t currently mention “convention over configuration”, although I’m sure it used to in the past, and there are plenty of references to it around the web. The Wikipedia entry for convention over configuration lists Maven as a poster-child, and Sonatype, the commercial company supporting Maven, names a chapter of their reference book after the concept.
But it’s a load of bollocks.
My final point (for this series, anyway) is on flexibility. What you normally get in exchange for configuration complexity is flexibility. This is certainly the case with Ant; the complexity which makes every Ant-based build system a painfully unique snowflake buys you the capability to do damn near anything you want with it.
But Maven’s complexity does not buy us flexibility.
My team wants to divide up its testing into multiple phases, following the “Agile testing pyramid” concept as mentioned by Mike Cohn.
So we’d like to have four layers to our pyramid: unit tests running first; database integration tests running next; web service tests third; and only if all of these pass do we run web UI tests. These test groups run in order of increasing heaviness, so we get feedback on the simple stuff quickly.
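The ordering described above amounts to a fail-fast sequence: run the cheapest group first, and stop as soon as one group fails. Here is a small Python sketch of that idea. It is not Maven or Ant configuration, and the phase names and stand-in callables are placeholders for whatever actually launches each test group.

```python
def run_phases(phases):
    """Run (name, callable) pairs in order, stopping at the first failure.

    Each callable returns True on success. Returns (name, passed) results
    for the phases that actually ran.
    """
    results = []
    for name, run in phases:
        passed = bool(run())
        results.append((name, passed))
        if not passed:
            break  # heavier phases after a failure are skipped
    return results


if __name__ == "__main__":
    # Stand-in callables; real ones would invoke the test runner.
    phases = [
        ("unit", lambda: True),
        ("database integration", lambda: True),
        ("web service", lambda: False),
        ("web UI", lambda: True),
    ]
    for name, passed in run_phases(phases):
        print(f"{name}: {'passed' if passed else 'FAILED'}")
```

In this made-up run the web service phase fails, so the web UI tests, the slowest group, never start.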
Maven supports two levels of testing, unit tests and integration tests. The failsafe plugin which provides integration testing support seems fairly new, and is actually pretty good if you only need one phase of integration testing. It lets you configure setup and teardown activities, so you can fire up an app server before running tests, and make sure it gets shut down afterwards.
If we could get failsafe to run three times during the build, each time carrying out different setup and teardown activities, and running different groups of tests, my team would be fairly happy with Maven.
It is possible to use build profiles to set up different integration tests in this way, but to get them to run, you need to run the build three times, and each time the preceding steps will be re-run - compiling, unit tests, packaging, etc. So it’s kind of nasty, brutish, and too long.
The right way to achieve what we’re after is probably to customise the build lifecycle, or create a new build lifecycle. Either way, it involves creating custom plugins, or extensions, or both. I’ve taken a stab at working out how, but after burning a couple of evenings without getting anywhere, I’ve shelved it.
I have no doubt it can be done, but it’s just easier to do it in Ant and move on to other tasks. And that’s pretty much the bottom line for me. I still like the idea of Maven, I have hopes it will continue to improve (it’s a thousand times more usable than it was a few years ago), and maybe even go through a shift to embrace convention over configuration for real.
In the meantime, I’m likely to reach for Maven when I need something quickly for a project that seems likely to fit its paradigm, but for heavier projects (which covers most of my client work), Ant is the pragmatic choice.
In my previous post I explained why Maven is a good concept for Java project builds. In this post I’ll delve into a key area where it falls down, the overcomplexity of its configuration.
In brief, we have a proliferation of home-brewed build systems created in many different ways using Ant, all of which do much the same thing. Since the vast majority of Java projects have very similar build requirements, an off the shelf build system should be able to do the job.
More to the point, a standard build system using convention over configuration should be simple to set up and maintain. This in turn helps the development team’s velocity, since less time is spent fiddling with (or worse, fighting against) the build system, so more time can be spent delivering value for customers.
Maven fulfils this if your build needs are ridiculously simple. Slap your source code into folders for application code and test code, name your tests right, and put in a very small pom.xml file. Run “mvn install” and your code is compiled, unit tested, and ready to deploy.
Things get ugly pretty quickly though, as soon as you need to make even a small tweak to the basic build.
For example, let’s look at what’s involved in changing the version of Java you want your project built to, for language compatibility and/or for compatibility with the runtime environment your project will be used in. Maven 3.0.x uses Java 1.5 by default, even if you have Java 1.6 installed, which is sensible enough.
So we have two parameters that need changing, language version for the source, and target version for generated classes. Here’s what we have to add to our simple pom.xml to achieve this:
<project>
  [...]
  <build>
    [...]
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.3.2</version>
        <configuration>
          <source>1.4</source>
          <target>1.4</target>
        </configuration>
      </plugin>
    </plugins>
    [...]
  </build>
  [...]
</project>
So the first thing that strikes you is, holy spaghetti, that’s a lot of scaffolding for two parameters. Someone wading through a pom file trying to understand our build in order to modify or fix it will have a hard time working out the meat of this snippet, and it’s not clear that it’s all only there for those two parameters.
Much worse, the extra scaffolding isn’t harmless. Before we had this, compilation was done by the maven-compiler-plugin, using the latest available version for the version of Maven we’re using. Now this out of the box default has been surfaced in the pom file. Our project is locked into knowing about, and managing, the plugin and its version. If we don’t specify the version, Maven 3.x prints a scary warning suggesting that in the future it will no longer tolerate our careless disregard for managing the nuts and bolts of its innards, so we should expect our build to break horribly.
As soon as you set any configuration option on a built-in component of Maven, you are forced to take responsibility for that component that you wouldn’t have otherwise, and clutter your configuration with stuff that would otherwise be assumed by Maven.
Now suppose you want to run integration tests on your code: heavier weight tests that run in a separate phase after unit testing. The testing model we’re following is to have lightweight unit tests that run fast for quick feedback on basic errors, then heavier tests which probably need a container like Jetty, and perhaps a database, to test a bit more deeply. It’s often referred to as integration testing, meaning you’re testing the components of a single application integrated together.
(However, the naming leads to confusion since integration testing can also refer to testing integration with external applications, web services, etc.)
So Maven supports running a separate set of tests in an integration test phase that runs after the unit tests, using the failsafe plugin. How do we make use of this? Firstly, we write JUnit test case classes, and name them ITMyClass, and the “IT” beginning tells Maven to run them in this phase.
So does Maven just find and run these classes if they exist? Of course not, you have to add some configuration to your pom.xml so it knows. Fair enough, Maven shouldn’t waste our precious build time searching all of our test classes for ones named with “IT” if we aren’t using integration tests.
So, it should be a pretty simple configuration setting to tell Maven we’re using the integration test phase. Right?
<project>
  [...]
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-failsafe-plugin</artifactId>
        <version>2.7.2</version>
        <executions>
          <execution>
            <id>integration-test</id>
            <goals>
              <goal>integration-test</goal>
            </goals>
          </execution>
          <execution>
            <id>verify</id>
            <goals>
              <goal>verify</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  [...]
</project>
(From the failsafe plugin docs.)
Holy scaffolding! All we’ve really told Maven with all that configuration is to include the “failsafe plugin”, and wire it into the build lifecycle to run at the integration-test phase. But again, we’ve locked the plugin details, including the version number, into our project’s pom file.
In this case, we’ve also had to explicitly (and verbosely) specify the lifecycle phase. So this is a plugin whose whole purpose in life is to be used in the integration phase of Maven’s build lifecycle, but we have to take it by the hand and tell it exactly how and where to wire itself into the lifecycle, even though we’re following its default use case.
That isn’t convention over configuration! That isn’t even Mexico!
Oh, wait, this doesn’t run our integration tests with a servlet container. Let’s plug in Jetty. It’s actually pretty easy, just copy, paste and tweak the example in this documentation from Codehaus (the second XML snippet on that page).
There is even a bit more meat in the configuration, which is fair enough, since there are a few things you might like to configure when running a servlet container. But you’ve still got loads more configuration, much of which is calling out details of Maven’s execution lifecycle that are, when it comes down to it, the default for 90% of the people using this.
So you can see how your Maven project’s pom.xml file quickly becomes packed with unneeded crap, which increases the barrier to understanding and modifying the build. Once you develop a pretty good understanding of Maven’s internal structures and models you become adept at working out which parts of the pom.xml are scaffolding, and which you actually care about.
But most developers on a team won’t invest the time to master Maven, and shouldn’t need to. They have business value to deliver. This is the wall I’m hitting supporting a high performance development team, because there is a low tolerance for spending time on things other than building the application features customers need.
So is Ant any better than this? As I said in my previous post, Ant builds typically become over-complicated. But Maven projects not only suffer from complexity, they also suffer from Maven’s inflexibility. Normally you’d consider trading off some flexibility for simplicity, as long as the end result was higher productivity.
So my next Maven post will focus on a specific area where Maven’s inflexibility is hurting my team’s productivity. I don’t currently have more than these three posts in mind for this series, but then my team is still only in iteration 0.
Feedback on these posts: I’m more than happy to be told I’m wrong, particularly if there are easier, simpler, and better ways to implement the examples I’ve described. Since I’ve given up on blog comments (for the time being at least), the best way to give feedback is to mention me (@kief) in a Tweet, with a link to your response.
I’ve been building Java software using Ant for over 10 years. I’ve been giving Maven a try every few years since it first came out, and going back to Ant pretty quickly each time, until last year. Early last year I used Maven on a few smallish - pretty much one-person - projects, and found it to be pretty useful, so I decided it may be ready for prime time.
Recently I’ve begun working on my first build system for a ThoughtWorks project, which happens to be using Maven, and although I’ve been enthusiastic about the challenge, the weaknesses of the tool are showing pretty clearly.
The idea of Maven is a good one - it’s a standardised build system that, off the shelf, does pretty much what you need for typical Java projects, using convention over configuration. Ant, on the other hand, is not a build system, it’s a tool that you can use to create your own build system. It’s almost a DSL for build systems, but far more flexible.
The thing is, at least 95% of Java projects have pretty much the same set of requirements from a build system. Get my dependencies. Compile my source. Run my unit tests. Run some analysis. Package my artefact. Run integration tests. Deploy. Build multiple modules into a single aggregated artefact. That’s pretty much it.
So everyone who uses Ant is using it to do pretty much the same thing, but they create wildly different build systems to do it. As a result, new team members have a serious learning curve, and often require days or even weeks to get set up. Adding tools such as a new code analysis report is not a simple drop-in, and the build system’s complexity tends to grow over time.
A typical project’s Ant-based build system is as complex, and as time consuming to maintain, as any major component of the software it builds. It accumulates technical debt and adds drag on the project’s velocity.
So rather than every team having its own home-brewed special snowflake build system, Maven aims to be a build system that uses convention over configuration to build any Java project that follows a common pattern. It’s become popular enough that other tools can just be dropped in, and a developer joining a Maven-based project could be up and running in less than a day, in ideal cases in under an hour.
Ideally, by using Maven on our project we eliminate the hassle of maintaining a complex build system, because Maven simply does what we need, by default.
From Tom DeMarco’s article Software Engineering: An Idea Whose Time Has Come and Gone? [PDF]:
You say to your team leads, “I have a finish date in mind, and I’m not even going to share it with you. When I come in one day and tell you the project will end in one week, you have to be ready to package up and deliver what you’ve got as the final product. Your job is to go about the project incrementally, adding pieces to the whole in the order of their relative value, and doing integration and documentation and acceptance testing incrementally as you go.”
DeMarco isn’t recommending specific methodologies like Agile, but this is a pretty good business oriented description of Continuous Delivery without continuous (production) deployment.
The difference between a typical development team and a high performing team is that the typical team tries to deliver quickly by working in ways which turn out to add “drag” into its workload.
The harder the typical team works, the more baggage they pile onto themselves, and the harder it becomes for them to deliver. They spend more and more time on maintenance - bugfixing, working around production issues, working around their workarounds, etc. - and they find that even simple change requests and features are unreasonably complex, time consuming, and risky. The team is always busy, and endlessly discusses the refactorings and improvements that would make everything easier, if only they had enough time.
The truth is, even if the code brownies came in one night and completely refactored the typical team’s code so that it was sparkling clean and completely bug-free, within 6 - 12 months they would find themselves back up to their necks in technical debt. This situation is the inevitable product of the team’s working habits.
The high performing development team routinely uses practices like TDD, pair programming, Clean Code, and similar. The specific practices may vary, but generally speaking, there are many things that almost all developers know they probably ought to be doing, but generally don’t.
Most teams find these practices impractical. They may try them out, but they find them awkward to do, and it takes longer to get work done using them. The pressure to deliver (this is a business, after all) forces them to be pragmatic, so they refocus on getting functional code out the door. They dismiss these nice-sounding practices as a luxury, fine for consultants, coaches, and book authors, but impractical in the Real World.
The high performing team, on the other hand, has been doing these things long enough that they have become a matter of habit. It doesn’t slow them down the way it does a team which is just learning.
It’s important to note that very few of these practices necessarily make a team churn out code faster. Rather, they reduce the amount of drag produced along with the code, so more developer time is spent writing new code rather than coping with existing code.
The code is more thoroughly tested (it’s fully retested on every build) so fewer bugs make it into production, which means the team spends less time dealing with production issues. Their applications handle and communicate errors more clearly, so issues that do come up are rapidly identified and fixed. The codebase is structured and written more cleanly, so making changes and adding features is simpler and faster.
The difficulty is that getting from A to B - from a typical team uncomfortable using TDD and the rest, to a high performing team which does them without thinking - requires practice, time, and work that most teams aren’t given.
Why do many operations teams prefer to deploy applications manually, even where automated deployment tools exist?
I’ve been a part of operations teams which deploy manually, but always felt this was wrong, and struggled to get the time and resources to implement automated deployment. But I’ve encountered teams which actually insist on manual deployment, even when the development team has provided automated deployment tools for them.
The usual explanation is that manually deploying an application is the only way to know exactly what changes are being made. On the face of it, belief in the reliability and auditability of humans over scripts is silly.
But this comes back to my previous point that Devops is a confidence game. Ops won’t easily trust a black box deployment tool given to them by developers until they’ve had loads of experience with it.
Even if the deployment tool is written in a script that the Ops folks could theoretically read and understand if they took the time, it’s a non-trivial thing. Sysadmins don’t have loads of time to trawl through some weird developer-oriented scripting language (“what the hell is Ant?”) and test the hell out of a deployment script that they may only use every few months or so.
Where I have seen deployment tools used in ops is where ops have been involved in creating the tool. Ideally, ops should pair with developers to design, write, and test the deployment tool. If this isn’t practical, the deployment tool should be developed using agile principles, with the Ops team as the product owner. Given that ops people are technical, this needs to go even deeper than a normal product owner relationship; they should be involved in the technical design and even code review.
If the ops team is intimately involved in the specification, design, and implementation of the deployment tool, then they will have the confidence that they understand the tool thoroughly enough to use it in environments where failure may mean getting out of bed at 3 AM to fix it.
Command to debug chef recipes on a Vagrant VM that uses chef-solo for provisioning:
sudo chef-solo -c /tmp/vagrant-chef/solo.rb -j /tmp/vagrant-chef/dna.json -l debug
Or, the proper way:
(Thanks to Mitchell Hashimoto for pointing that out)