Mapping software architecture to code

Bringing software architecture back into the domain of the development team

One of the things I'm currently doing with a number of software teams is teaching them how to draw pictures. As an industry we've got really good at visualising the way that we work using things like Kanban boards and story walls, but we've forgotten how to visualise the software that we're building. In a nutshell, many teams are trying to move fast but they struggle to create a shared vision that the whole team can work from, which ultimately slows them down. And few people use UML nowadays, which just exacerbates the problem. I've written an article about this and it's due for publication soon (here's the link) plus it's covered in my Software Architecture for Developers ebook and in a number of talks that I'm doing around Europe (ITARC, IASA UK, Mix-IT) and the US (SATURN) during April. Here are the slides from Agile software architecture sketches - NoUML! that I presented a few weeks ago in Dublin.

The TL;DR version

The TL;DR version of this post is simply this ... if you're building monolithic software systems but think of them as being made up of a number of smaller components, ensure that your codebase reflects this. Consider organising your code by component (rather than by layer or feature) to make the mapping between software architecture and code explicit. If it's hard to explain the structure of your software system, change it.

Decomposition into components

For the purpose of this post, let's assume visualising a software system isn't a problem and that you're sketching some ideas related to the software architecture for a new system you've been tasked to build. An important aspect of "just enough" software architecture is to understand how the significant elements of a software system fit together. For me, this means going down to the level of components, services or modules. It's worth stressing this isn't about understanding low-level implementation details, it's about performing an initial level of decomposition. The Wikipedia page for Component based development has a good summary, but essentially a component might be something like a risk calculator, audit logger, report generator, data importer, etc. The simplest way to think about a component is that it's a set of related behaviours behind an interface, which may be implemented using one or more collaborating classes. Good components share a number of characteristics with good classes. They should have high cohesion, low coupling, a well-defined public interface, good encapsulation, etc.
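To make "a set of related behaviours behind an interface" a little more concrete, here is a minimal Java sketch of a risk calculator component. All of the names (RiskCalculator, Trade, ExposureRule, RiskComponents) and the exposure formula are invented for illustration; they are assumptions rather than anything taken from a real system.

```java
import java.util.Arrays;
import java.util.List;

// The component's interface: the only thing callers should depend on.
interface RiskCalculator {
    double totalExposure(List<Trade> trades);
}

// Hypothetical value type that appears in the component's interface.
final class Trade {
    final String counterparty;
    final double notional;
    final double riskWeight;

    Trade(String counterparty, double notional, double riskWeight) {
        this.counterparty = counterparty;
        this.notional = notional;
        this.riskWeight = riskWeight;
    }
}

// Internal collaborator: an implementation detail of the component.
final class ExposureRule {
    double exposureOf(Trade trade) {
        // Assumed formula, purely for illustration.
        return trade.notional * trade.riskWeight;
    }
}

// Implementation built from one or more collaborating classes.
final class DefaultRiskCalculator implements RiskCalculator {
    private final ExposureRule rule = new ExposureRule();

    @Override
    public double totalExposure(List<Trade> trades) {
        double total = 0.0;
        for (Trade trade : trades) {
            total += rule.exposureOf(trade);
        }
        return total;
    }
}

// Single entry point, so callers never name the implementation class.
final class RiskComponents {
    private RiskComponents() {
    }

    static RiskCalculator riskCalculator() {
        return new DefaultRiskCalculator();
    }
}

class RiskCalculatorDemo {
    public static void main(String[] args) {
        RiskCalculator calculator = RiskComponents.riskCalculator();
        List<Trade> trades = Arrays.asList(
                new Trade("Counterparty A", 1000.0, 0.5),
                new Trade("Counterparty B", 2000.0, 0.25));
        System.out.println(calculator.totalExposure(trades));
    }
}
```

In a real codebase the interface would typically be public and the other classes package scoped, so that the compiler helps to enforce the encapsulation. Everything is kept in one file here only so that the sketch is self-contained.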

There are a number of benefits to thinking about a software system in terms of components, but essentially it allows us to think and talk about the software as a small number of high-level abstractions rather than the hundreds and thousands of individual classes that make up most enterprise systems. The photo below shows a typical component diagram produced during the training classes we run. Groups are asked to design a simple financial risk system that needs to pull in some data, perform some calculations and generate an Excel report as the output.

A sketch of components

This sketch includes the major components you would expect to see for a system that is importing data, performing risk calculations and generating a report. These components provide us with a framework for partitioning the behaviour within the boundary of our system and it should be relatively easy to trace the major use cases/user stories across them. This is a really useful starting point for the software development process and can help to create a shared vision that the team can work towards. But it's also very dangerous at the same time. Without technology choices (or options), this diagram looks like the sort of thing an ivory tower architect might produce and it can seem very "conceptual" for many people with a technical background.

Talk about components, write classes

People generally understand the benefit of thinking about software as higher level building blocks and you'll often hear people talking in terms of components when they're having architecture discussions. This often isn't reflected in the codebase though. Take a look at your own codebase. Can you clearly see components? More often than not, when you open up a codebase it reflects some other structure because of the way the code is organised. Mark Needham has a great post called Coding: Packaging by vertical slice that talks about one approach to code organisation and a Google search for "package by feature vs package by layer" will throw up lots of other discussions on the same topic. The architectural view of a software system and the code are often very different. This is sometimes why you'll see people ignore architecture diagrams (or documentation) and say "the code is the only single point of truth".

Auto-generating architecture diagrams

To change tack slightly, I was in Dublin a few weeks ago and I met Chris Chedgey, who is part of the inspiration behind this post. Chris is the co-founder of a company called Headway Software and they have a product called Structure101. You should take a look if you've not seen it before, they have some cool stuff in the pipeline. I won't do their product any justice by trying to summarise what it does, but one of its many features is to visualise and understand an existing codebase.

When I teach people how to visualise their software systems, we create a number of simple NoUML sketches at different levels of abstraction. These are the context, containers and components diagrams. This context, containers and components approach is basically just a tree structure. A system is made up of containers (e.g. a web server, application server, database, etc), each of which is further made up of components. You can see some example diagrams on Flickr and in my book.
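The tree structure described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the Element class and the example names (borrowed from the financial risk exercise) are hypothetical and don't correspond to any particular tool.

```java
import java.util.ArrayList;
import java.util.List;

// One node in the system -> containers -> components tree.
final class Element {
    final String kind; // "System", "Container" or "Component"
    final String name;
    final List<Element> children = new ArrayList<Element>();

    Element(String kind, String name) {
        this.kind = kind;
        this.name = name;
    }

    // Adds a child and returns this element so that calls can be chained.
    Element add(Element child) {
        children.add(child);
        return this;
    }

    // Renders the tree as indented text, one element per line.
    String render() {
        StringBuilder out = new StringBuilder();
        render(out, 0);
        return out.toString();
    }

    private void render(StringBuilder out, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(kind).append(": ").append(name).append('\n');
        for (Element child : children) {
            child.render(out, depth + 1);
        }
    }
}

class ArchitectureTreeDemo {
    // Hypothetical decomposition of the financial risk system.
    static Element exampleSystem() {
        Element applicationServer = new Element("Container", "Application server")
                .add(new Element("Component", "Data importer"))
                .add(new Element("Component", "Risk calculator"))
                .add(new Element("Component", "Report generator"));
        Element database = new Element("Container", "Database");
        return new Element("System", "Financial risk system")
                .add(applicationServer)
                .add(database);
    }

    public static void main(String[] args) {
        System.out.print(exampleSystem().render());
    }
}
```

Each level of the tree corresponds to one of the diagrams: the root is the context, its children are the containers, and their children are the components.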

Given this is really just a tree structure, it should be fairly straightforward to auto-generate these diagrams from an existing codebase. And perhaps there is a tool out there that can do this, but I've never seen one that has worked really well. Microsoft Visual Studio can generate some layer diagrams but I've never met anybody that really raves about the architecture diagram support. Most tools generate diagrams showing dependencies between packages or classes but they don't tend to show components. And what's a component anyway? Is any class that implements an interface a component? If you're using inversion of control, perhaps everything that you inject is a component?

There are a number of reasons why auto-generating such diagrams is tricky but, once we start coding, much of the semantics associated with "containers" (runtime environments, process boundaries, etc) and "components" becomes lost in the sea of classes that make up the typical codebase. Many developers break their systems up into a number of projects within their IDEs to represent reusable libraries and deployable units but external tools often don't have access to this information if they are solely working from a bunch of JAR files or DLLs (for example). In essence, the information related to the abstract structural elements isn't adequately represented within a codebase. If you take a look at most codebases, I'm fairly sure that you could come up with a set of rules as to what defines a component but perhaps it would be easier to simply make these concepts explicit. Some techniques already exist to do this (e.g. Architecture Description Languages) but I've never seen them used in the corporate world.

Packaging by component

To bring this discussion back to code, the organisation of the codebase can really help or hinder here. Organising a codebase by layer makes it easy to see the overall structure of the software but there are trade-offs. For example, you need to delve inside multiple layers (e.g. packages, namespaces, etc) in order to make a change to a feature or user story. Also, many codebases end up looking eerily similar given the fairly standard approach to layering within enterprise systems. Uncle Bob Martin says that if you're looking at a codebase, it should scream something about the business domain. Organising your code by feature rather than by layer gives you this, but again there are trade-offs. A variation I've been experimenting with is organising code explicitly by component. The following screenshot shows an example of this in the codebase for my website (a content aggregator and portal for Jersey's digital sector). This screenshot only shows the core components; there's a separate Spring MVC project and the controllers use the components illustrated here.

Packaging by component

This is similar to packaging by feature, but it's more akin to the "micro services" that Mark Needham talks about in his blog post. Each sub-package of je.techtribes.component houses a separate component, complete with its own internal layering and Spring configuration. As far as possible, all of the internals are package scoped. You could potentially pull each component out and put it in its own project or source code repository to be versioned separately. This approach will likely seem familiar to you if you're building something that has a very explicit loosely coupled architecture such as a distributed messaging system made up of loosely coupled components. I'm fairly confident that most people are still building something more monolithic in nature though, despite thinking about their system in terms of components. I've certainly packaged *parts* of monolithic codebases using a similar approach in the past but it's tended to be fairly ad hoc. Let's be honest, organising code into packages isn't something that gets a lot of brain-time, particularly given the refactoring tools that we have at our disposal. Organising code by component lets you explicitly reflect the concept of "a component" from the architecture into the codebase. If your software architecture diagram screams something about your business domain (and it should), this will be reflected in your codebase too.
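One consequence of giving each component its own sub-package is that a tool can recover the component a class belongs to from its fully qualified name alone. The following Java sketch shows that idea under some assumptions: the base package comes from the text above, but the class names and the ComponentsByConvention helper are hypothetical and not taken from the actual codebase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ComponentsByConvention {
    static final String BASE_PACKAGE = "je.techtribes.component";

    private ComponentsByConvention() {
    }

    // Maps component name (the first sub-package under BASE_PACKAGE) to the
    // classes inside it. Classes outside BASE_PACKAGE are ignored.
    static Map<String, List<String>> group(List<String> classNames) {
        Map<String, List<String>> components = new TreeMap<String, List<String>>();
        String prefix = BASE_PACKAGE + ".";
        for (String className : classNames) {
            if (!className.startsWith(prefix)) {
                continue; // e.g. web controllers living in a separate project
            }
            String remainder = className.substring(prefix.length());
            int dot = remainder.indexOf('.');
            if (dot < 0) {
                continue; // sits directly in the base package, not in a component
            }
            String component = remainder.substring(0, dot);
            List<String> members = components.get(component);
            if (members == null) {
                members = new ArrayList<String>();
                components.put(component, members);
            }
            members.add(className);
        }
        return components;
    }

    public static void main(String[] args) {
        // Hypothetical class names, for illustration only.
        List<String> classNames = Arrays.asList(
                "je.techtribes.component.tweet.TweetComponent",
                "je.techtribes.component.tweet.dao.TweetDao",
                "je.techtribes.component.newsfeed.NewsFeedComponent",
                "je.techtribes.web.HomeController");
        for (Map.Entry<String, List<String>> entry : group(classNames).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Note that the internal layering (the dao sub-package in this made-up example) stays inside the component and doesn't show up as a component in its own right.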

The structural elements of software

We could create a convention here to say that all sub-packages of je.techtribes.component are components, but it would be much easier to explicitly mark components using metadata. In Java, we could use annotations to do this, attributes in .NET, etc. If we used the same approach for other structural elements of software (e.g. services, layers, containers, etc), tool vendors could use this metadata to generate meaningful and *simple* architecture diagrams automatically. Plus, they could also use this structural information to generate dependency diagrams that focus on components rather than classes. I've started experimenting with annotations as a way to do this and I've created a GitHub repo to store whatever I come up with.
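As a rough idea of what marking components with metadata could look like in Java, here is a minimal sketch. The @ArchitecturalComponent annotation, the example interfaces and the ComponentScanner class are all hypothetical names made up for this illustration; this is not the contents of the repo mentioned above, just one way the idea might be expressed.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker annotation. RUNTIME retention keeps it visible to
// tools that inspect compiled classes via reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ArchitecturalComponent {
    String description() default "";
}

@ArchitecturalComponent(description = "Writes audit events")
interface AuditLogger {
    void log(String event);
}

@ArchitecturalComponent(description = "Generates the Excel report")
interface ReportGenerator {
    String generate();
}

// An ordinary helper class: deliberately not marked as a component.
final class TextHelper {
}

final class ComponentScanner {
    private ComponentScanner() {
    }

    // Returns "Name: description" for every candidate carrying the marker.
    static List<String> describe(Class<?>... candidates) {
        List<String> found = new ArrayList<String>();
        for (Class<?> candidate : candidates) {
            ArchitecturalComponent marker = candidate.getAnnotation(ArchitecturalComponent.class);
            if (marker != null) {
                found.add(candidate.getSimpleName() + ": " + marker.description());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        for (String line : describe(AuditLogger.class, TextHelper.class, ReportGenerator.class)) {
            System.out.println(line);
        }
    }
}
```

The sketch takes the candidate types as arguments to stay self-contained. A real tool would need some way of finding them, such as scanning the classpath, which is left out here.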

The major caveat to all of this is that designing a software system based around components isn't "the only way". It's a nice approach to think about software systems that are more monolithic in nature and it's a great stepping stone to designing loosely coupled architectures. But it isn't a silver bullet. Regardless of how you design software, I do hope this post has got you thinking about the mapping between software architecture and how it's reflected in the code.

Software architecture and coding are often seen as mutually exclusive disciplines and there's often very little mapping from the architecture into the code and back again. Effectively and efficiently visualising a software architecture can help to create a good shared vision within the team, which can help it go faster. Having a simple and explicit mapping from the architecture to the code can help even further, particularly when you start looking at collaborative design and collective code ownership. Furthermore, it helps bring software architecture firmly back into the domain of the development team, which is ultimately where it belongs.

About the author

Simon is an independent consultant specializing in software architecture, and the author of Software Architecture for Developers (a developer-friendly guide to software architecture, technical leadership and the balance with agility). He’s also the creator of the C4 software architecture model and the founder of Structurizr, which is a collection of open source and commercial tooling to help software teams visualise, document and explore their software architecture.

You can find Simon on Twitter at @simonbrown ... see for information about his speaking schedule, videos from past conferences and software architecture training.

Re: Mapping software architecture to code

I agree that splitting large developments into collaborating components is often a good thing, but it certainly has its dark side.

Initial development of the individual modules can be faster this way, but at the expense of extra API negotiations, loss of shared understanding, duplication of code and of effort, reduced ability to refactor across module boundaries etc.

In short, attempting to split a project by component is potentially just as misguided as attempting to split it by layers. A holistic approach would admit that components without shared code and shared understanding tend toward wasteful silos, and layers without direct business purpose tend toward bloated API swamps.

Re: Mapping software architecture to code

Agreed, although components can still share code. But that's another story altogether!

Re: Mapping software architecture to code

Frank, I think there is a misconception lurking in the dark. If your team is small enough that everybody can work on the same code base, the structuring can simply happen by putting stuff in the 'right' package. No extra interface, no extra separation. If and only if your application grows would one consider splitting it into separate projects / maven modules or whatever your tool names those thingies. A clean package structure will give you a head start on that.

Re: Mapping software architecture to code

Hi Simon, cheers! It's a really nice essay and I can't wait for the source code. It looks very organized ;)

Re: Mapping software architecture to code

Hi Simon, I absolutely agree. Packages are important and components are more important than layers. Also, Structure101 is a great tool. But when I last used it, I found it hard to understand what refactorings are necessary/possible to get the structure I really wanted for my packages. I therefore created Degraph. It is still at an early release and will probably never be as polished as commercial products, but it provides some unique visualizations, and it's free. I'd appreciate feedback if you or your readers decide to give it a try.

Re: Mapping software architecture to code

Great, you're trying to bring lightweight design to teams. However, what I'm missing is how the impedance mismatch between agility and code is addressed. All (?) the drawings just show technical views of software. Essentially it's about "component" dependencies. The nesting is deep. Logic is distributed across all "layers". But where are the user stories? Or any increment for that matter. They seem to be dissolved like sugar in a cup of coffee. It sounds as if once developers get their hands on requirements they get implemented - but it becomes hard to find the implementation. That makes understanding software difficult. And it makes software hard to change, I'd say.

Re: Mapping software architecture to code

Good point. What my blog entry doesn't show is that there's a layer of web controllers sitting over the top, which orchestrate calls across the components to deliver features to the user.

Aside from having a good suite of outside-in tests, how do teams currently do this? One controller per user story? One per use case?

Re: Mapping software architecture to code

Hm... "a layer of controllers" doesn't help with the deep nesting of logic "components". It just isolates them from some view layer. Asking "how many controllers per user story" to me seems a violation of the principle of Separation of Concerns. User stories belong to the business domain. But controllers are part of a design pattern, they belong to the technical domain. So why should there be a specific mapping between user stories and controllers? If a user story is covered by just one view, it can be represented by the view's one controller. But if a user story spans several views... it would need to be represented by several controllers, too, I'd say. (My premise: there is a 1:1 relationship between views and controllers.)

That's why user stories to me are not the starting point of agile design. They have to be broken up further. More tangible to users as well as developers are dialogs. So to me the question always is: What dialogs are part of a user story? (If you like, think "view" instead of "dialog".) There is an n:m relationship between user stories and dialogs, I'd say. And the next step is to decompose dialogs into interactions. An interaction is some sort of processing triggered by an event and transforming input into output. There is an n:m relationship between dialogs and interactions. Dialogs are naturally implemented as classes. And interactions are naturally implemented as functions.

So with this kind of straightforward decomposition a set of hooks is designed onto which tests can be hung. The user story that way does not relate to a specific controller, but to a set of functions. And these functions are drawn together in tests.

Thinking in terms of classes (or controllers) during design to me is a kind of premature optimization. Much energy can be spent on choosing candidate classes and distributing functions across them. That's all technical detail not helping the agile cause, I'd say. Classes (interfaces) are an abstraction. If you have lots of stuff that somehow belongs together (high cohesion), then you might put it in the same class. But first you have to find the heaps of stuff that you can check for such patterns. Classes and components (as containers for classes) don't come first - at least in my little design world ;-) They come last. No class has ever given a user any value. Only functions have.

So that's what agile design should lead to: from coarse grained requirements to finer grained ones (e.g. user stories) and then to even finer grained ones: dialogs and interactions. Because interactions are the root of all functionality we have to design. Get rid of starting with patterns like "layered architecture" or "MVC architecture". That's secondary. Start with what users really care about: value provided through increments. And increments are functions. Just my 2c.
