Structurizr - create web-based software architecture diagrams using code

Application Architecture and Ransomware

Protecting data from Cryptolocker and variants

Ransomware and Cryptolocker

Ransomware is an increasing threat to many organisations - I recently had a conversation with a (non-IT) friend whose employer had been affected, which is why I’m writing this. These are attacks where a system or data are made inaccessible until a ransom is paid. This form of extortion actually dates back to the 1980s but recent variants, such as Crytolocker, are very dangerous and destructive on modern networks.

Often the initial infection is via a phishing email that contains a link to a website, that if clicked, will download the malware. This will scan all files that the user has access to and starts encrypting them. Once the files are encrypted the user will be sent a message telling them of the infection and offering to decrypt in return for payment (usually in bitcoins). Of course the user has no guarantee that their files will be decrypted even if the ransom is paid.

Applications and Processes

If an individual's machine is infected then they might lose all their personal documents. If they are using remote drives and shares, which have multiple users, then the infection may also lock other people's files. If a user has access to a large number of files across an organisation then this could be devastating.

These are all files that a person has access to. This includes any files used by applications along with documents etc. Therefore if a developer or operational user becomes infected then the systems files they have access to can be affected. It’s very common for technical employees to have access to the files of production servers in order to make issue resolution easy. For example; log files, configuration files, data exports/imports etc.

If the technical users have write access to a mapped drive on a production server then it is trivial for the malware to encrypt these files. This may take down the service (if runtime files are affected) or even destroy the data making the service impossible to run even after a reinstall. Remember that your databases will ultimately have their data stored in files on a disk somewhere.

If people with elevated privileges are infected, you can lose entire systems as well as that person's individual files.

Preventative Actions

I won't give advice here on Endpoint Protection (antiviruses etc.) as that out-of-scope for this blog but there are many data related actions you should consider with respect to your applications.

Audit

Many of you will be reading this and thinking "well we don't allow access as you've described here" but technical staff will setup systems to make their jobs easier. Has your organisation ever performed a data audit and classification? Do you know what files, shares and sections of your network each user has access to? If you haven't then I'd strongly advise you do so - you may be surprised at what you find. There are many commercial and free tools to assist you in doing this.

Restrict user access

You should define your users, what groups they are in and what data they have access to. This is good practice anyway (for reasons of privacy, data loss prevention etc) but if you reduce the total number of files accessible than any infection will have less effect.

File Permissions

If someone really needs access to files do they require write access? Log files and configuration files are a perfect example. A user shouldn't be writing to a log file and if they want to change some configuration then they should go through your normal release process rather than hacking it in manually. If you can't release configuration quickly enough, then your release process may be your real issue...

Don't share users between people and applications

A person shouldn't be using an account used by an application and the applications shouldn't be using personal accounts. Again you may claim this isn't happening but technical users often take shortcuts like this to release quickly (or get around approval processes). A good audit should pick up on this.

Don't use the same user for all applications (or use root!)

It's tempting (for ease of management) to create a single account and get all applications to run as this account. If this account is compromised then all data for all applications are vulnerable. Use specific accounts for applications to reduce lateral movement between systems.

Don't give administration permissions to interactive accounts

If a login account is used to run a web browser or email then it should have restricted permissions. Likewise any administrative account should not be able to run a web browser or email. Separate the concerns!

Analyse your Backup Policy

How do you backup your data? If you are using online backups, that are accessible to an infected user, then all your backups may get corrupted too! Maybe you should consider using WORM (write once read many) technology or at least use separate processes to move and permission backups appropriately once they have been taken.

Some malware may be stealthy and stay on your system for a long time before making itself known. Therefore incremental backups can be corrupted far back in time. Make sure you regularly test your restoration processes too.

Conclusion

It's important to remember that your data is the most important part of your application and valuable to your organisation. If something has value then nefarious parties can seek to take advantage of this. It's hard to stop some attacks but you can minimise the damage if you are attacked.

The architecture of a system should take into account where data is stored, how it is permissioned and who/what has access to it. It's very easy to become obsessed with the latest design patterns but basic data management is important and shouldn't be forgotten.

Introducing Structurizr Express

Create a single software architecture diagram using text

I rolled out a new feature to Structurizr at the weekend called Structurizr Express, which is basically a way to create software architecture diagrams using text. Although the core concept behind Structurizr is to create a software architecture model using code, there are times when you simply want a quick diagram, perhaps for a presentation, pre-sales proposal, etc. Structurizr Express will let you do just that - quickly create a single software architecture diagram using a textual definition. Much like tools such as PlantUML, yUML, WebSequenceDiagrams, etc.

Structurizr Express

Despite the name, this is all still based around the C4 model although it only targets one diagram at a time. The three types of diagrams currently supported are System Context, Container and Component diagrams. Structurizr Express is available to use now and the help page provides a description and examples of the syntax. I hope you find it useful.

Agile software architecture documentation

Lightweight documentation that describes what you can't get from the code

"We value working software over comprehensive documentation" is what the manifesto for agile software development says. I know it's now a cliche, but the typical misinterpretation of these few words is "don't write documentation". Of course, that's not actually what the manifesto says and "no documentation" certainly wasn't the intent. To be honest, I think many software teams never produced or liked producing any documentation anyway, and they're now simply using the manifesto as a way to justify their approach. What's done is done, and we must move on.

One of the most common questions I get asked is how to produce "agile documentation", specifically with regards to documenting how a software system works. I've met many people who have tried the traditional "software architecture document" approach and struggled with it for a number of reasons, irrespective of whether the implementation was a Microsoft Word document or a wiki like Atlassian Confluence. My simple advice is to think of such documentation as being supplementary to the code, describing what you can't get from the code alone.

Readers of my Software Architecture for Developers ebook will know that I propose something akin to a travel guidebook. Imagine you arrive in a new city. Without any maps or a sense of direction, you'll end up just walking up and down every street trying to find something you recognise or something of interest. You can certainly have conversations with the people who you meet, but that will get tiring really quickly. If I was a new joiner on an existing software development team, what I'd personally like is something that I can sit down and read over a coffee, perhaps for an hour or so, that will give me a really good starting point to jump into and start exploring the code.

The software guidebook

Although the content of this document will vary from team to team (after all, that's the whole point of being agile), I propose the following section headings as a starting point.

  1. Context
  2. Functional Overview
  3. Quality Attributes
  4. Constraints
  5. Principles
  6. Software Architecture
  7. Code
  8. Data
  9. Infrastructure Architecture
  10. Deployment
  11. Development Environment
  12. Operation and Support
  13. Decision Log

The definitions of these sections are included in my ebook and they're now available to read for free on the Structurizr website (see the hyperlinks above). This is because the next big feature that I'm rolling out on Structurizr is the ability to add lightweight supplementary documentation into the existing software architecture model. The teams I work with seem to really like the guidebook approach, and some even restructure the content on their wiki to match the section headings above. Others don't have a wiki though, and are stuck using tools like Microsoft Word. There's nothing inherently wrong with using Microsoft Word, of course, in the same way that using Microsoft Visio to create software architecture diagrams is okay. But it's 2016 and we should be able to do better.

Documentation in Structurizr

The basic premise of the documentation support in Structurizr is to create one Markdown file per guidebook section and to link that with an appropriate element in the software architecture model, embedding software architecture diagrams where necessary. If you're interested to see what this looks like, I've pushed an initial release and there is some documentation for the techtribes.je and the Financial Risk System that I use in my workshops. The Java code and Markdown looks like this.

Even if you're not using Structurizr, I hope that this blog post and publishing the definitions of the sections I typically include in my software architecture documentation will help you create better documentation to complement your code. Remember, this is all about lightweight documentation that describes what you can't get from the code and only documenting something if it adds value.

Layers, hexagons, features and components

This blog post is a follow-up to the discussions I've had with people after my recent Modular Monoliths talks. I've been enthusiastically told that the "ports & adapters" (hexagonal) architectural style is "vastly", "radically" and "hugely" different to a traditional layered architecture. I remain unconvinced, hence this blog post, which has a Java spin, but I'm also interested in how the concepts map to other programming languages. I'm also interested in exploring how we can better structure our code to prevent applications becoming big balls of mud. Layers are not the only option.

Setting the scene

Imagine you're building a simple web application where users interact with a web page and information is stored in a database. The UML class diagrams that follow illustrate some of the typical ways that the source code elements might be organised.

Some approaches to organising code in a simple Java web app

Let's first list out the types in the leftmost diagram:

  • CustomerController: A web controller, something like a Spring MVC controller, which adapts requests from the web.
  • CustomerService: An interface that defines the "business logic" related to customers, sometimes referred to in DDD terms as a "domain service". This may or may not be needed, depending on the complexity of the domain.
  • CustomerServiceImpl: The implementation of the above service.
  • CustomerDao: An interface that defines how customer information will be persisted.
  • JdbcCustomerDao: An implementation of the above data access object.

I'll talk about the use of interfaces later, but let's assume we're going to use interfaces for the purposes of dependency injection, substitution, testing, etc. Now let's look at the four UML class diagrams, from left to right.

  1. Layers: This is what a typical layered architecture looks like. Code is sliced horizontally into layers, which are used as a way to group similar types of things. In a "strict layered architecture", layers should only depend on lower layers. In Java, layers are typically implemented as packages. As you can see from the diagram, all layer (inter-package) dependencies point downwards.
  2. Hexagonal (ports & adapters): Thomas Pierrain has a great blog post that describes the hexagonal architecture, as does Alistair Cockburn of course. The essence is that the application is broken up into two regions: inside and outside. The inside region contains all of the domain concepts, whereas the outside region contains the interactions with the outside world (UIs, databases, third-party integrations, etc). One rule is that the outside depends on the inside; never the other way around. From a static perspective, you can see that the JdbcCustomerRepository depends on the domain package. Particularly when coupled with DDD, another rule is that everything on the inside is expressed in the ubiquitous language, so you'll see terms like "Repository" rather than "Data Access Object".
  3. Feature packages: This is a vertical slicing, based upon related features, business concepts or aggregate roots. In typical Java implementations, all of the types are placed into a single package, which is named to reflect the concept that is being grouped. Mark Needham has a blog post about this, and the discussion comments are definitely worth reading.
  4. Components: This is what I refer to as "package by component". It's similar to packaging by feature, with the exception that the application (the UI) is separate from the component. The goal is to bundle all of the functionality related to a single component into a single Java package. It's akin to taking a service-centric view of an application, which is something we're seeing with microservice architectures.

How different are these architectural styles?

On the face of it, these do all look like different ways to organise code and, therefore, different architectural styles. This starts to unravel very quickly once you start looking at code examples though. Take a look at the following example implementations of the ports & adapters style.

Spot anything? Yes, the interface (port) and implementation class (adapter) are both public. Most of the code examples I've found on the web have liberal usage of the public access modifier. And the same is true for examples of layered architectures. Marking all types as public means you're not taking advantage of the facilities that Java provides with regards to encapsulation. In some cases there's nothing preventing somebody writing some code to instantiate the concrete repository implementation, violating the architecture style. Coaching, discipline, code reviews and automated architecture violation checks in the build pipeline would catch this, assuming you have them. My experience suggests otherwise, especially when budgets and deadlines start to become tight. If left unchecked, this is what can turn a codebase into a big ball of mud.

Organisation vs encapsulation

Looking at this another way, when you make all types in your application public, the packages are simply an organisation mechanism (a grouping, like folders) rather than being used for encapsulation. Since public types can be used from anywhere in a codebase, you can effectively ignore the packages. The net result is that if you ignore the packages (because they don't provide any means of encapsulation and hiding), a ports & adapters architecture is really just a layered architecture with some different naming. In fact, if all types are public, all four options presented before are exactly the same.

Approaches without packages

Conceptually ports & adapters is different from a traditional layered architecture, but syntactically it's really the same, especially if all types are marked as public. It's a well implemented n-layer architecture, where n is the number of layers through a slice of the application (e.g. 3; web-domain-database).

Utilising Java's access modifiers

The way Java types are placed into packages can actually make a huge difference to how accessible (or inaccessible) those types can be when Java's access modifiers are applied appropriately. Ignoring the controllers ... if I bring the packages back and mark (by fading) those types where the access modifier can be made more restrictive, the picture becomes pretty interesting.

Access modifiers made more restrictive

The use of Java's access modifiers does provide a degree of differentiation between a layered architecture and a ports & adapters architecture, but I still wouldn't say they are "vastly" different. Bundling the types into a smaller number of packages (options 3 & 4) allows for something a little more radical. Since there are fewer inter-package dependencies, you can start to restrict the access modifiers. Java does allow interfaces to be marked as package protected (the default modifier) although if you do this you'll notice that the methods must still be marked as public. Having public methods on a type that's inaccessible outside of the package is a little odd, but it's not the end of the world.

With option 3, "vertical slicing", you can take this to the extreme and make all types package protected. The caveat here is that no other code (e.g. web controllers) outside of the package will be able to easily reuse functionality provided by the CustomerService. This is not good or bad, it's just a trade-off of the approach. I don't often see interfaces being marked as package protected, but you can use this to your advantage with frameworks like Spring. Here's an example from Oliver Gierke that does just this (the implementation is created by the framework). Actually, Oliver's blog post titled Whoops! Where did my architecture go, which is about reducing the number of public types in a codebase, is a recommended read.

I'm not keen on how the presentation tier (CustomerController) is coupled in option 3, so I tend to use option 4. Re-introducing an inter-package dependency forces you to make the CustomerComponent interface public again, but I like this because it provides a single API into the functionality contained within the package. This means I can easily reuse that functionality across other web controllers, other UIs, APIs, etc. Provided you're not cheating and using reflection, the smaller number of public types results in a smaller number of possible dependencies. Options 3 & 4 don't allow callers to go behind the service, directly to the DAO. Again, I like this because it provides an additional degree of encapsulation and modularity. The architecture rules are also simpler and easier to enforce, because the compiler can do some of this work for you. This echoes the very same design principles and approach to modularity that you'll find in a modern microservices architecture: a remotable service interface with a private implementation. This is no coincidence. Caveats apply (e.g. don't have all of your components share a single database schema) but a well-structured modular monolith will be easier to transform into a microservices architecture.

Testing

In the spirit of YAGNI, you might realise that some of those package protected DAO interfaces in options 3 and 4 aren't really necessary because there is only a single implementation. This post isn't about testing, so I'm just going to point you to Unit and integration are ambiguous names for tests. As I mention in my "Modular Monoliths" talk though, I think there's an interesting relationship between the architecture, the organisation of the code and the tests. I would like to see a much more architecturally-aligned approach to testing.

Conclusions?

I've had the same discussion about layers vs ports & adapters with a number of different people and opinions differ wildly as to how different the two approaches really are. A Google search will reveal the same thing, with numerous blog posts and questions on Stack Overflow about the topic. In my mind, a well implemented layered architecture isn't that different to a hexagonal architecture. They are certainly conceptually different but this isn't necessarily apparent from the typical implementations that I see. And that raises another interesting question: is there a canonical ports & adapters example out there? Of course, module systems (OSGi, Java 9, etc) change the landscape because they allow us to differentiate between public and published types. I wonder how this will affect the code we write and, in particular, whether it will allow us to build more modular monoliths. Feel free to leave a comment or tweet me @simonbrown with any thoughts.

Codifying the rules used to organise your code

Another way to extract architectural information from a codebase

Regular readers will already know about Structurizr - a set of open source libraries to create a software architecture model as code, plus a SaaS product to visualise those models. Having created and helped create a number of models with Structurizr now, I've noticed an interesting side-effect. In the absence of architectural information being present in the code, the power of using something like Structurizr to define a software architecture model using code is in extracting information algorithmically, by codifying the rules that you've ultimately used to structure your codebase.

Let me give you an example. Imagine you're building a web-MVC web application in Java, C#, etc and you have a tens or hundreds of controller classes, each of which uses a number of other components to implement some functionality. Drawing a single diagram to visualise the static structure of the entire web application is a bad idea because it shows too much information. A better approach is to create one view per vertical slice, where there could be one vertical slice per web controller. This results in smaller, simpler diagrams like this.

A component diagram based upon a web controller

So far so good, and this is relatively easy to do using static analysis techniques. But you'll notice this diagram includes an "Authenticated User", which isn't part of the code itself. This raises the question of how the user ends up getting included on the diagram. There are a number of options:

  • Manually add the correct type of user on a case by case basis.
  • Match the controller's URL routing to the permitted user role (this is likely specified in configuration somewhere) and use this information to choose the appropriate type of user for each controller.
  • Add machine-readable metadata (e.g. Java Annotations, C# Attributes) to specify the user type (there are some Structurizr annotations I've created to help with this).
  • Codify the rules you've used to organise the controllers in your codebase.

The ability to codify the rules you've used to organise the controllers in your codebase obviously depends on how much thought you've put into doing this. For example, did you dump all of these controller classes into a single package or namespace without giving it much thought at all? Or perhaps you took Martin Fowler's advice and modularised further, creating one package/namespace per functional area or aggregate root, for example. Another possibility is that you grouped controllers together based upon whether unauthenticated users, authenticated users or other software systems are using them. Organising your code well provides you with another angle to extract architectural information, because you can codify rules such as, "the Anonymous User uses all controllers in the com.mycompany.mywebapp.unsecured package/namespace".

With hindsight this is fairly obvious, but we often don't put enough thought into how we organise our code, possibly because we perceive that it doesn't actually matter that much and modern IDEs provide powerful ways to navigate large and/or complex codebases. Trying to codify the rules used to organise a codebase certainly gets you thinking, and often refactoring too.

Structurizr for .NET

A C# implementation of the Structurizr for Java client library

The initial version of Structurizr was targeted at the Java ecosystem (see "Structurizr for Java"), for no other reason than it's what I'm most familiar with. Although this works for a good portion of the organisations that I visit when doing training/consulting, an equally sized portion use the Microsoft stack. For this reason, I've put together Structurizr for .NET, which is more or less a direct port of the Java version, with some automatically generated code from Swagger used as a starting point. It's by no means "feature complete" yet, especially since none of the component finder code (the part that extracts components automatically from a codebase) is present, but there's enough to create some basic diagrams. Here's some example code that creates a software model for the "Financial Risk System" case study that I use in my workshops.

It creates the following Context, Container and Component diagrams.

If you want to take a look or try it out, the source code can be found on GitHub and there's an initial version of the package on NuGet. Have fun!

DevNexus 2016 in Atlanta, GA

Talks and a 1-day software architecture sketching workshop

I'm pleased to say I'll be in the United States next month for the DevNexus 2016 conference that is taking place in Atlanta, GA. In addition to a number of talks about software architecture, I'll also be running my popular "The Art of Visualising Software Architecture" workshop. Related to the (free) book with the same name, this hands-on workshop is about improving communication and specifically software architecture diagrams. We'll talk about UML and some anti-patterns of "boxes and lines" diagrams, but the real focus is on my "C4 model" and some lightweight techniques for communicating software architecture. The agenda and slides for this 1-day workshop are available online. I hope you'll be able to join me.

Agile Hong Kong Meetup - 15th Jan 2016

The Art of Visualising Software Architecture

Happy new year and I wish you all the best for 2016. My first trip of the year starts next week and I'll be doing some work in Shenzhen, China. As a result, I'll also be in Hong Kong on January 15th, presenting "The Art of Visualising Software Architecture" at a meetup organised by Agile Hong Kong. You can register on the Meetup page. See you there!

p.s. If anybody would like a private, in-house 1-day software architecture sketching workshop on the 15th, please drop me a note.

Cyber Security 2015

A constantly changing, complex quality attribute gaining more attention

Last week I gave a presentation titled "2015 - A CyberSecurity Year" to the London Java Community's Open Conference. I like to present at the LJC's Open Conference on whatever topic has occupied the majority of my time in the previous year. This is partly because it's always advisable to "present on what you know" but also as a cathartic exercise to vent my frustrations! The slides can be found here but won't make much sense without the following context.

This year a huge amount of my time has been spent on Cybersecurity concerns as 2015 was the year these issues were forced into everyone's mind. The threats have been increasing for several years but many high profile (and often salacious) events mean that the press, and therefore the public, have realised this is a serious issue. The first part of my presentation described what had happened over the last few years to cause the current situation.

Once the press and the public have a concern, the politicians will pick up on it and this means... laws, regulation, 'guidelines' and consequences. The second part of the presentation discussed the large range of regulators and regulations that have been (and are still being) created. These can be complex, incomplete and sometimes contradictory. I only touched the surface of what is happening. Interestingly a few of the people watching had similar concerns and experiences but many seemed unaware of even the most basic provisions of the Data Protection Act - I suspect this could be a HUGE business risk in the next few years.

These concerns have led to an increase in actual and planned expenditure (including large announcements from governments) but many in the group expressed doubts on how effectively they would be spent.

Lots of money means... lots of companies offering products and services! We spent a while discussing some of these and again, there was concern about their maturity and effectiveness.

So to bring this back on topic! Security (whether at application, system or data level) is a highly complex quality attribute. It is also constantly changing. A good architecture will take the current security concerns into account and provide foundations for not only providing this now but also for solving future issues. You not only have to address the threat but also do this is a legally compliant way. It is possible to be secure but still in breach of the law.

It is also a concern throughout the system and cannot be considered in isolation. If you are writing an application you need to think about all the services you rely upon, the sources and destination of your data and most importantly the people using it. Your developers, system administrators, database administrators and operational teams need to communicate with each other on these issues.

Good luck, I'm sure it'll all be different by the end of 2016!

Magpie Talkshow Episode 6 - Simon Brown (Channel Islands Edition)

While at Devoxx Poland earlier this year, Sam Newman interviewed me for his new podcast, The Magpie Talkshow. We chat about software architecture diagrams, my C4 model, UML, writing books and Jersey. Enjoy!

Thanks Sam! :-)