Testing Times

Just another force on the design?

Testing has often been a concern of mine, whatever its flavour: unit, integration, functional, performance and so on. Naturally, part of my concern is how much of the system is tested and, thus, to what extent the architecture is proven. However, I'm also interested in how much the desire to test influences the design adopted. I've certainly heard the following:

  • We should use [dependency injection framework] so we can unit test
  • It's an interface so we can create a fake/mock/stub version of it if we need to
  • Can you write a unit test for it as it's not covered?

These types of comments can represent some significant design decisions.

The choice to use a DI framework over, say, EJB (OK, EJB3 has some DI support) to support out-of-container testing is going to influence your implementation greatly. While you could indeed mix the two, that's not simply getting you the best of both worlds but all of both worlds. Regardless, your architecture has been skewed by your desire to test in a particular way.
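To make that skew concrete, here's a minimal sketch (hypothetical names throughout) of the style a DI framework nudges you towards: the collaborator arrives through the constructor, so a plain JUnit test can hand in a stub with no container to start.

import java.math.BigDecimal;

// The service declares its dependency rather than looking it up in a container.
interface PriceSource {
	BigDecimal latestPrice(String ticker);
}

public class PricingService {

	private final PriceSource source;

	public PricingService(PriceSource source) {
		this.source = source;
	}

	public BigDecimal valueOf(String ticker, long quantity) {
		// Simple logic, but now exercisable against an in-memory PriceSource.
		return source.latestPrice(ticker).multiply(BigDecimal.valueOf(quantity));
	}
}

A test can construct PricingService around a one-line stub, whereas an EJB 2-style lookup would have needed the container, or a heavyweight fake of it, just to build the object.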

The introduction of interfaces, indirection and inversion of control to support testing is also somewhat self-fulfilling: the code's complexity has arguably been increased in order to make it easier to test.

The addition of a unit test to improve coverage typically fails to question what behaviours actually need to be locked in. Testing a class is also not a truly black-box approach in my experience. Access modifiers get changed, encapsulation distorted, otherwise fixed interactions get inverted. Is this the tail wagging the dog?

Implementations often get duplicated in test cases in the hope of answering the question, "does it do what it's supposed to do?" In many cases this simply compounds misunderstandings for practically no gain; the test is simply ensuring the compiler interprets two similar bits of code the same!
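Here's a hedged illustration of that duplication (the code is hypothetical, JUnit 4 style): the "expected" value is derived with exactly the same expression as the implementation, so both are right or wrong together.

import java.math.BigDecimal;

import org.junit.Test;

import static org.junit.Assert.assertEquals;

public class InterestCalculatorTest {

	// The (hypothetical) code under test.
	static BigDecimal addInterest(BigDecimal balance, BigDecimal rate) {
		return balance.add(balance.multiply(rate));
	}

	@Test
	public void addsInterestToBalance() {
		BigDecimal balance = new BigDecimal("100.00");
		BigDecimal rate = new BigDecimal("0.05");

		// Mirrors the implementation instead of asserting a known figure
		// such as new BigDecimal("105.0000").
		BigDecimal expected = balance.add(balance.multiply(rate));

		assertEquals(expected, addInterest(balance, rate));
	}
}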

Granted, some of this is simply bad testing. Nevertheless, it's quite common, and the architect has to design and legislate for the real world. There are other ways to improve coverage that are lower maintenance and more reusable than the introduction of a new test case, and the architect is in a position to introduce them and to use coverage reports as a force for improving the design, not just the tests.

Here's a somewhat contrived example from a project I've recently been working on:

import java.math.BigDecimal;

public class SomeBean {

	private String id;
	private BigDecimal amount;

	public String getId() {
		return id;
	}

	public void setId(String id) {
		this.id = id;
	}

	public BigDecimal getAmount() {
		return amount;
	}

	public void setAmount(BigDecimal amount) {
		this.amount = amount;
	}

	@Override
	public int hashCode() {
		final int prime = 31;
		int result = 1;
		result = prime * result + 
			((amount == null) ? 0 : amount.hashCode());
		result = prime * result + 
			((id == null) ? 0 : id.hashCode());
		return result;
	}

	@Override
	public boolean equals(Object obj) {
		if (this == obj) {
			return true;
		}
		if (obj == null) {
			return false;
		}
		if (getClass() != obj.getClass()) {
			return false;
		}
		SomeBean other = (SomeBean) obj;
		if (amount == null) {
			if (other.amount != null) {
				return false;
			}
		} else if (!amount.equals(other.amount)) {
			return false;
		}
		if (id == null) {
			if (other.id != null) {
				return false;
			}
		} else if (!id.equals(other.id)) {
			return false;
		}
		return true;
	}

}

It's pretty standard fare, with generated hashCode and equals methods.

Taken in isolation, it would seem clear how to go about testing this simple bean: populate a few SomeBean instances with various combinations of properties, check they contain what you expect, and compare their hash values and equality. This class has 22 branches to test (in addition to the getters and setters), though.
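A minimal JUnit 4 sketch of that approach (the test and helper names are mine) might look like this; note how few of the 22 branches each assertion actually exercises.

import java.math.BigDecimal;

import org.junit.Test;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;

public class SomeBeanTest {

	@Test
	public void equalWhenIdAndAmountMatch() {
		assertEquals(bean("42", "10.00"), bean("42", "10.00"));
		assertEquals(bean("42", "10.00").hashCode(),
				bean("42", "10.00").hashCode());
	}

	@Test
	public void notEqualWhenAmountDiffers() {
		assertFalse(bean("42", "10.00").equals(bean("42", "20.00")));
	}

	@Test
	public void notEqualWhenIdIsNull() {
		assertFalse(bean(null, "10.00").equals(bean("42", "10.00")));
	}

	// Builds a populated instance; null-friendly so the null branches can be hit.
	private static SomeBean bean(String id, String amount) {
		SomeBean b = new SomeBean();
		b.setId(id);
		b.setAmount(amount == null ? null : new BigDecimal(amount));
		return b;
	}
}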

The tests are telling you how the code gets used and how complex the methods are (cyclomatic complexity is essentially the effort required to achieve coverage). Is the code telling you to write more tests, or are the tests telling you to write less code? Personally, I'd opt for the latter, all things being equal.

Some simple improvements to the code using static methods could get your branches down to 10 (and a chance to use a static import if you're so minded).

	@Override
	public int hashCode() {
		return makeHashCode(id, amount);
	}

	@Override
	public boolean equals(Object obj) {
		if (this == obj) {
			return true;
		}
		if (obj == null) {
			return false;
		}
		if (getClass() != obj.getClass()) {
			return false;
		}
		SomeBean other = (SomeBean) obj;
		if (!equalOrNull(id, other.id)) {
			return false;
		}
		if (!equalOrNull(amount, other.amount)) {
			return false;
		}
		return true;
	}
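The static helpers aren't shown above, so here's one plausible, null-safe implementation (a sketch; the class name and bodies are my assumption). Note that the branches haven't vanished: they've moved into a utility that is written and tested once, rather than once per bean.

public final class ObjectUtils {

	private ObjectUtils() {
	}

	// Null-safe hash over any number of properties, mirroring the
	// prime-and-accumulate style of the generated version.
	public static int makeHashCode(Object... values) {
		final int prime = 31;
		int result = 1;
		for (Object value : values) {
			result = prime * result + (value == null ? 0 : value.hashCode());
		}
		return result;
	}

	// True when both references are null, or both are non-null and equal.
	public static boolean equalOrNull(Object a, Object b) {
		return a == null ? b == null : a.equals(b);
	}
}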

Using introspection you can get the branches down to something that will actually scale as the number of properties grows, namely zero!

	@Override
	public int hashCode() {
		return beanHashCode(this);
	}

	@Override
	public boolean equals(Object obj) {
		return beanEquals(this, obj);
	}
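Again the introspecting versions aren't shown, but the sketch below (class name and details are my assumption) gives the idea using field reflection; Apache Commons Lang's EqualsBuilder.reflectionEquals and HashCodeBuilder.reflectionHashCode provide the same thing off the shelf.

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public final class BeanUtils {

	private BeanUtils() {
	}

	public static int beanHashCode(Object bean) {
		final int prime = 31;
		int result = 1;
		for (Field field : bean.getClass().getDeclaredFields()) {
			if (Modifier.isStatic(field.getModifiers())) {
				continue; // ignore constants and the like
			}
			field.setAccessible(true);
			Object value = get(field, bean);
			result = prime * result + (value == null ? 0 : value.hashCode());
		}
		return result;
	}

	public static boolean beanEquals(Object bean, Object obj) {
		if (bean == obj) {
			return true;
		}
		if (obj == null || bean.getClass() != obj.getClass()) {
			return false;
		}
		for (Field field : bean.getClass().getDeclaredFields()) {
			if (Modifier.isStatic(field.getModifiers())) {
				continue;
			}
			field.setAccessible(true);
			Object a = get(field, bean);
			Object b = get(field, obj);
			if (a == null ? b != null : !a.equals(b)) {
				return false;
			}
		}
		return true;
	}

	private static Object get(Field field, Object target) {
		try {
			return field.get(target);
		} catch (IllegalAccessException e) {
			throw new IllegalStateException(e);
		}
	}
}

Being based on getDeclaredFields, the sketch ignores inherited state and pays a reflective call per property, which is fine for flat beans like SomeBean but worth bearing in mind.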

Admittedly this approach isn't suitable for everything. However, it's not the end of the quest for improved "coverage".

Merely running unit tests and inspecting the coverage only serves to expose where you have untested code. Assessing coverage by functional testing leads you in a different direction: untested code is (possibly) extraneous code.

In the example above, SomeBean is an entity representing a row in a database. It's never used as a key in a map, and its notion of equality is arguably based purely on its identifier, not its values. Functional test coverage might tell me that equals and hashCode are never used.

Unit test coverage might show me that a check for equality is desired for other reasons (such as asserting that a service call returns the desired instances of SomeBean). However, this is the desire for testing polluting the design again. The notion of equality required for a test case may well be different to that belonging to the business domain. Does the business domain care whether the scale of the BigDecimal property matches between instances? In reality, some of our unit tests cared about the scale whereas some of the integration tests didn't, demonstrating that the notion of equality belonged to the test domain, not the business domain.
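Concretely, BigDecimal is the awkward case here: its equals is scale-sensitive while compareTo is not, so two pieces of test code can legitimately disagree about whether two instances are "equal".

import java.math.BigDecimal;

public class ScaleDemo {

	public static void main(String[] args) {
		BigDecimal a = new BigDecimal("1.0");
		BigDecimal b = new BigDecimal("1.00");
		System.out.println(a.equals(b));          // false: same value, different scale
		System.out.println(a.compareTo(b) == 0);  // true: numerically equal
	}
}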

In the example above (which was actually a bean with around 50 properties), functional test coverage showed that the only code relying on equals and hashCode was the unit tests. These methods were therefore replaced with introspecting versions: a pragmatic, well-founded trade-off between supporting unit testing and having less code to test!

It seems to me that there's a sweet spot for testing, somewhere between individual methods and entire systems. Unit, system and functional tests can all contribute to coverage in different ways and help you to keep the forces on your design in balance. Unfortunately, unit testing tends to get more than its fair share of attention in my experience, and the architect ought to keep it in check.

About the author

Kevin has been working with Java for 10 years, from defence research through dot com to investment banking. Currently he works at JPMorgan developing front-office trading solutions.

While getting on well with server-side Java, Kevin's also a keen Swing developer (and possibly a masochist).

E-mail : kevin.seal at codingthearchitecture.com


Re: Testing Times

If coverage levels are too high, I get suspicious. A couple of loops and some introspection and you can guarantee 100%.

Re: Testing Times

Presumably that's the reason you need to review the tests too! You could run up everything and assert nothing, doing little more than a smoke test.

Using introspection to automate the testing of (numerous) trivial elements is one way to stop the seemingly unavoidable pursuit of coverage from getting bogged down in low-value nonsense.
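Something along these lines, say (a sketch, all names mine): round-trip every readable and writable property reflectively, and the trivial accessors stop dominating the report.

import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.math.BigDecimal;

public class TrivialPropertyExerciser {

	// Sets then gets every read/write property, failing if a value
	// doesn't survive the round trip.
	public static void exercise(Object bean) throws Exception {
		BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
		for (PropertyDescriptor property : info.getPropertyDescriptors()) {
			if (property.getReadMethod() == null
					|| property.getWriteMethod() == null) {
				continue; // only exercise full read/write pairs
			}
			Object sample = sampleValue(property.getPropertyType());
			if (sample == null) {
				continue; // no sample registered for this type
			}
			property.getWriteMethod().invoke(bean, sample);
			Object readBack = property.getReadMethod().invoke(bean);
			if (!sample.equals(readBack)) {
				throw new AssertionError(property.getName()
						+ " did not round-trip");
			}
		}
	}

	private static Object sampleValue(Class<?> type) {
		if (type == String.class) {
			return "sample";
		}
		if (type == BigDecimal.class) {
			return BigDecimal.ONE;
		}
		return null; // extend as required
	}
}

A single exercise(new SomeBean()) call then covers all four accessors above, leaving the report to highlight code that actually warrants a hand-written test.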

Re: Testing Times

As you say, chasing 100% unit test coverage after the fact will frequently skew the architecture. However, if unit testing is seen as a development tool, a guide both for new development and later refactoring, rather than a measure of the system's fitness for production, then the architecture should tend to develop to meet the system's functional needs. If the up-front decisions on software style enhance the code's quality, by reducing co-dependency for instance, then the benefits to unit testing are a convenient by-product. IMO unit testing is about building developer confidence rather than end-user confidence. Even so, test-influenced design will creep in to take account of, for instance, poorly designed 3rd-party APIs or "impossible to test" situations. So while 100% coverage should not be a target, I imagine ruling out any design changes to accommodate a unit test will reduce a developer's ability to sign off code, whether new or refactored.

Re: Testing Times

Building developer confidence sounds like a bit of a placebo to me: confidence in what? That they've not broken a test that simply reinforces internal behaviour? Or that such a breakage isn't going to manifest as a functional or non-functional defect? I want developers to be concerned with the latter, to see the end that their software is a means to. Of course this doesn't imply functional, system or unit testing, but a holistic, pragmatic test strategy.

You're right that good design and good unit testing practices have a lot in common and, ultimately, who cares what drives a good design? However, I've yet to encounter a development team that displayed much more than an understanding of the syntax of unit testing, let alone what constitutes good practice! The developers get confidence in their code, but I don't even have confidence in their tests. They rate their code AAA on the basis of a measure that's junk: how topical.

As for 100% coverage, this is again where architects have a role to play. Why is 100% coverage not possible? Do you functionally test everything? Can you automate it? Why have a coverage report that mixes the important with the unimportant anyway? The coverage report is telling you various things, one of which is that the coverage report might need changing.

Re: Testing Times

It sometimes seems to me that only a QA manager, who signs off a production release, can maintain the quality of what we do. Unit tests then really do become a means for developers to ensure that their code matches their interpretation (the placebo, as you put it!) of the unit's contract and not much else (as well as a check that they haven't broken other developers' potentially flawed interpretations). I suppose the architect's job then becomes a matter of overseeing the code's quality (code and unit test reviews etc) and ensuring that what developers are doing matches the QA manager's expectations! Is that what automated functional tests achieve?

Re: Testing Times

Automated functional/system/integration tests are probably much closer to checking the functional and non-functional quality of the software than unit tests and code reviews are, at least insofar as the traditional (i.e. non-development) stakeholders are concerned. And I guess it's those stakeholders that a QA manager is probably trying to please first.

Of course, this is not to say that the development team aren't stakeholders or that the stability of the code isn't a concern!

