Unit and integration are ambiguous names for tests

I've blogged about architecturally-aligned testing before, which essentially says that our automated tests should be aligned with the constructs that we use to describe a software system. I want to expand upon this topic and clarify a few things, but first...


In a nutshell, I think the terms "unit test" and "integration test" are ambiguous and open to interpretation. We all tend to have our own definition, but I've seen those definitions vary wildly, which again highlights that our industry often lacks a shared vocabulary. What is a "unit"? How big is a "unit"? And what does an "integration test" test the integration of? Are we integrating components? Or are we integrating our system with the database? The Wikipedia page for unit testing talks about individual classes or methods ... and that's good, so why don't we use more precise terminology instead that explicitly specifies the scope of the test?

Align the names of the tests with the things they are testing

I won't pretend to have all of the answers here, but aligning the names of the tests with the things they are testing seems to be a good starting point. And if you have a nicely defined way of describing your software system, the names for those tests fall into place really easily. Again, this comes back to creating that shared vocabulary - a ubiquitous language for describing the structures at different levels of abstraction within a software system.

Here are some example test definitions for a software system written in Java or C#, based upon my C4 model (a software system is made up of containers, containers contain components, and components are made up of classes).

Class tests
- What? Tests focused on individual classes, often by mocking out dependencies. These are typically referred to as “unit” tests.
- Test scope? A single class - one method at a time, with multiple test cases to test each path through the method if needed.
- Mocking, stubbing, etc? Yes; classes that represent service providers (time, random numbers, key generators, non-deterministic behaviour, etc) and interactions with “external” services via inter-process communication (e.g. databases, messaging systems, file systems, other infrastructure services, etc ... the adapters in a “ports and adapters” architecture).
- Lines of production code tested per test case? Tens.
- Speed? Very fast running; test cases typically execute against resources in memory.

Component tests
- What? Tests focused on components/services through their public interface. These are typically referred to as “integration” tests and include interactions with “external” services (e.g. databases, file systems, etc). See also ComponentTest on Martin Fowler's bliki.
- Test scope? A component or service - one operation at a time, with multiple test cases to test each path through the operation if needed.
- Mocking, stubbing, etc? Yes; other components.
- Lines of production code tested per test case? Hundreds.
- Speed? Slightly slower; component tests may incur network hops to span containers (e.g. JVM to database).

System tests
- What? UI, API, functional and acceptance tests (“end-to-end” tests; from the outside-in). See also BroadStackTest on Martin Fowler's bliki.
- Test scope? A single scenario, use case, user story, feature, etc across a whole software system.
- Mocking, stubbing, etc? Not usually, although perhaps other systems if necessary.
- Lines of production code tested per test case? Thousands.
- Speed? Slower still; a single test case may require an execution path through multiple containers (e.g. web browser to web server to database).
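To make the "class tests" row concrete, here is a minimal sketch in plain Java. All of the names (OrderPricer, the hour-of-day supplier, the discount rule) are hypothetical, invented purely for illustration: a class with a non-deterministic service provider (the clock) has that dependency stubbed, so each path through the method can be tested deterministically and entirely in memory.

```java
import java.util.function.Supplier;

// Class under test: depends on a "service provider" for the current hour,
// which is non-deterministic in production and therefore stubbed in tests.
class OrderPricer {
    private final Supplier<Integer> hourOfDay;

    OrderPricer(Supplier<Integer> hourOfDay) {
        this.hourOfDay = hourOfDay;
    }

    // Hypothetical rule: 10% "happy hour" discount between 17:00 and 18:59.
    double price(double basePrice) {
        int hour = hourOfDay.get();
        return (hour >= 17 && hour <= 18) ? basePrice * 0.9 : basePrice;
    }
}

// A "class test": one class, one method, one test case per path.
class OrderPricerTest {
    public static void main(String[] args) {
        // Stub the clock so both branches can be exercised deterministically.
        OrderPricer duringHappyHour = new OrderPricer(() -> 17);
        OrderPricer outsideHappyHour = new OrderPricer(() -> 9);

        if (Math.abs(duringHappyHour.price(100.0) - 90.0) > 1e-9)
            throw new AssertionError("discount path failed");
        if (Math.abs(outsideHappyHour.price(100.0) - 100.0) > 1e-9)
            throw new AssertionError("full-price path failed");
        System.out.println("OrderPricerTest passed");
    }
}
```

In a real codebase the stub would typically be a mocking library rather than a hand-rolled lambda, but the scope is the same: a single class, one method at a time, no out-of-process resources.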


This definition of the tests doesn't say anything about TDD, when you write your tests or how many tests you should write. That's all interesting, but it's independent of the topic we're discussing here. So then, do *you* have a good clear definition of what "unit testing" means? Do your colleagues at work agree? Can you say the same for "integration testing"? What do you think about aligning the names of the tests with the things they are testing? Thoughts?

About the author

Simon is an independent consultant specializing in software architecture, and the author of Software Architecture for Developers (a developer-friendly guide to software architecture, technical leadership and the balance with agility). He’s also the creator of the C4 software architecture model and the founder of Structurizr, which is a collection of open source and commercial tooling to help software teams visualise, document and explore their software architecture.

You can find Simon on Twitter at @simonbrown ... see simonbrown.je for information about his speaking schedule, videos from past conferences and software architecture training.

Re: Unit and integration are ambiguous names for tests

I agree that a shared vocabulary is missing. This is one precise way to talk about tests by focusing on what is being tested. My primary method is to talk about tests in terms of their purpose:

- Unit Tests - they certify that code works the way you expect it to work. I also like to call them specs - for how the code is expected to work.
- Functional Tests - they certify that the product works the way you expect it to work. Testing is done through public interfaces (UI, API) and from the user's perspective. If you use BDD then you can also look at BDD scenarios as specs.
- Non-functional Tests - they certify that software is operating the way you expect it to operate. Things like performance, resource utilization, etc. go here.

Overall, the purpose of testing is to certify software to be released to production according to whatever certification criteria are deemed appropriate by the team (they will legitimately vary from product to product). We automate all aspects of software certification and delivery (no manual testing whatsoever). What I have noticed is that, when all testing is automated, with new products the split between tests is about 90/10: 90 for unit tests, 10 for functional/non-functional. Another thing that I have noticed is that most functional testing happens at system level; only basic testing is done at component level.

Re: Unit and integration are ambiguous names for tests

Traditional "unit tests" in Extreme Programming would not be limited to single classes. Note that limiting to single classes couples the tests to the current design.

Re: Unit and integration are ambiguous names for tests

Jason, I consider myself versed in XP, but your comment interests me because it puzzles me. How does one write unit tests which aren't coupled to the public parts of an implementation? (Like public class, method, and function names)

Unit tests aren't just for classes

I like the goals here, but I'm uncomfortable with the name "class tests", because fully half the code I work with isn't in classes. This is especially true if a project explicitly uses a functional paradigm even for part of its code (e.g. functional core, imperative shell).
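The point about a functional core can be sketched even in Java, where the "unit" under test is a pure static function rather than a stateful class (all names here are hypothetical). There is nothing to mock, because the output is fully determined by the input:

```java
import java.util.List;

// Functional core: a pure function with no dependencies and no state.
final class Pricing {
    private Pricing() {}

    // Hypothetical rule: sum the line items, then apply a tax rate.
    static double totalWithTax(List<Double> lineItems, double taxRate) {
        double subtotal = lineItems.stream().mapToDouble(Double::doubleValue).sum();
        return subtotal * (1 + taxRate);
    }
}

// The test names the function, not a class; "function test" might be a
// better-aligned name than "class test" for code structured this way.
class TotalWithTaxTest {
    public static void main(String[] args) {
        double total = Pricing.totalWithTax(List.of(10.0, 20.0), 0.5);
        if (Math.abs(total - 45.0) > 1e-9)
            throw new AssertionError("expected 45.0, got " + total);
        System.out.println("TotalWithTaxTest passed");
    }
}
```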

Unit tests aren't just for classes

Sure, that makes sense Jonathan ... the structural elements that make up your software are different to the example I used in the post (system - containers - components - classes), so your tests should be named differently too. The same would be said if you were building systems using JavaScript (there are no classes) or traditional database technologies (e.g. functions and stored procedures). For me, this is about reaching an alignment between the structural elements of our software systems and the tests that exercise them.

Re: Unit and integration are ambiguous names for tests

Thanks for this post. I like that you are aligning the tests to your architectural model of C4. I particularly like the idea of aligning the integration tests with the Component level since that is the next public interface level. I noticed that the Container level was not mentioned. Is that because you consider them to be included in the System level? Maybe Container level would be more appropriate since this level of testing would include a website (system), a web service (api), or even a stored procedure at the database container level.

Re: Unit and integration are ambiguous names for tests

Thanks Todd. Yes, that's basically what this is about: finding the various levels of public interface. You can certainly think of a web API as being the public interface for a container, but if I look at my own API tests, they also tend to involve the database too. And this is why I didn't include the container level - the majority of these wider-reaching tests don't tend to execute code in just a single container. This obviously isn't always the case, but it seems easier to simply label all of that stuff as "system tests". I can see how you would easily define "container tests" if they were a useful thing to explicitly label though.

Where do focused integration tests go?

An important part IMHO of test architecture is the adapter tests (in ports and adapters) or the infrastructure tests (in DDD). Some people will call them "focused integration tests". Basically, tests covering tens of lines of our code plus all the lines of code we integrate with for a specific need. They're important because they form the contract that I can use to confidently mock these things for all other tests. Problem: I don't see where they fit in your classification. Also, I believe Kent Beck originally meant by "unit test" a test that is unitary with respect to other tests - basically, tests that can be run by themselves or in any order. Usually those would be at near class level, but the important thing is that they'd be independent.
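To illustrate what such a focused integration test looks like, here is a sketch in plain Java (all names are hypothetical, and a temp directory stands in for the real infrastructure). The adapter is exercised against the actual file system, which pins down the behaviour that in-memory stubs of the port must then reproduce in class tests elsewhere:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// The port: the interface the rest of the system depends on.
interface DocumentStore {
    void save(String id, String content) throws IOException;
    String load(String id) throws IOException;
}

// The adapter under test: integrates with the real file system.
class FileDocumentStore implements DocumentStore {
    private final Path directory;

    FileDocumentStore(Path directory) {
        this.directory = directory;
    }

    public void save(String id, String content) throws IOException {
        Files.writeString(directory.resolve(id + ".txt"), content);
    }

    public String load(String id) throws IOException {
        return Files.readString(directory.resolve(id + ".txt"));
    }
}

// A "focused integration test": tens of lines of our code, plus the real
// infrastructure it integrates with - no mocks on the infrastructure side.
class FileDocumentStoreTest {
    public static void main(String[] args) throws IOException {
        Path tempDir = Files.createTempDirectory("store-test");
        DocumentStore store = new FileDocumentStore(tempDir);

        store.save("doc1", "hello");
        if (!"hello".equals(store.load("doc1")))
            throw new AssertionError("round-trip through the file system failed");
        System.out.println("FileDocumentStoreTest passed");
    }
}
```

Under the post's classification this sits closest to a component test (it crosses a container boundary to real infrastructure), which is perhaps the commenter's point: the scheme may need an explicit name for adapter-level tests.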
