We had an interesting discussion about simplifying deployment on the training course last week, and I thought it would be good to reiterate some of the points here. As part of the course, everybody is asked to put together an architecture for a small case study that can be decomposed into two major components. One is a scheduled/batch component responsible for processing some input files; the other is a user interface that lets users modify some of the configuration used during the processing step. We've seen many different architectural approaches to this case study, although it's natural for people to think of the two parts as components that are deployed separately. One typical example is a Java web application providing the user interface, with a separate standalone Java program initiated by cron for the batch processing. Another option would be an ASP.NET application and a Windows service.
We have a similar situation on one of my current projects. The main part of the system is an n-tier web application where the ASP.NET pages use WCF middle-tier services, which in turn communicate with either a backend system or a SQL Server database. There's also a batch element to the system, where a job scheduled on the SQL Server box looks in the database and performs a very small amount of batch processing (in this case, generating files for onward transmission to another system). At a predefined interval, SQL Server starts up a standalone program that looks for work to do, performs that work and then shuts down. When the interval next elapses, the job is started up again.
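The shape of such a job is worth spelling out, because its simplicity is the whole point. Here's a minimal sketch in Java (class and method names are illustrative, and the placeholder data stands in for a real database query): the scheduler starts the process, it finds any pending work, processes it, and exits.

```java
import java.util.List;

public class BatchJob {

    public static void main(String[] args) {
        // Look for work to do; in the real system this would be a
        // database query rather than a hard-coded list.
        List<String> pendingFiles = findPendingWork();
        for (String file : pendingFiles) {
            generateOutputFile(file); // perform the batch step
        }
        // No loop, no daemon thread: the process simply exits, and the
        // scheduler (cron, SQL Server Agent, etc.) starts it again at
        // the next interval, picking up anything left over.
    }

    static List<String> findPendingWork() {
        return List.of("order-123.xml", "order-124.xml"); // placeholder data
    }

    static void generateOutputFile(String name) {
        System.out.println("processed " + name);
    }
}
```

Because the program holds no state between runs, a crash mid-run costs nothing more than a delay until the next scheduled start.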
In both of these cases, the architecture works, and the use of a scheduled job means that we don't need to worry so much about failover and fault tolerance. If the job fails or crashes for some reason, the next run of the schedule will restart the process, meaning that we don't (necessarily) need to worry about transient problems. We do need to make sure that the process initiating the scheduled job is always running, but typically this is something like an operating system or database, which are themselves usually monitored. One trade-off with this approach, though, is monitoring of the job itself. Because such jobs are normally standalone programs that simply get executed on a schedule, they have only a very short window in which they are active and can be proactively monitored.
Another trade-off is that we've introduced a slightly more complicated deployment process. Okay, so it's not complicated to install a standalone application in the right place and schedule it to run, but we do have another *step* in the deployment process. In the case of our real-world system, we've put a lot of effort into making the build and deployment process as streamlined as possible for the majority of the application, but sometimes we even forget to deploy the scheduled job! Partly this is because it rarely changes, but it's also because it sits on the edge of the overall architecture. It *is* an important component, but it's not immediately apparent if it stops working because (for example) the underlying database schema has changed. We tend to notice the website not working, though.
We're currently working on some new functionality that will see the introduction of another scheduled job into the architecture. In this case, we want to send e-mail from our application and have chosen to write e-mails into the database immediately so that they're captured, with a separate job sending them on a schedule. With an existing scheduled job in place, it's very tempting to follow the same approach and build another standalone application that hooks into the same scheduler mechanism. But in doing this, we'd have two components that we could potentially forget about, and we'd have introduced yet another step into the deployment process.
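The store-then-send idea itself is straightforward. Here's a sketch in Java, with an in-memory list standing in for the database table (all names are illustrative, not our actual code): the application writes e-mails immediately, and the scheduled job periodically sends anything not yet sent.

```java
import java.util.ArrayList;
import java.util.List;

public class OutgoingEmailQueue {

    record Email(String to, String body, boolean sent) {}

    // Stand-in for the database table that captures outgoing e-mail.
    private final List<Email> table = new ArrayList<>();

    // Called by the application: capture the e-mail immediately.
    public void enqueue(String to, String body) {
        table.add(new Email(to, body, false));
    }

    // Called by the scheduled job: send anything not yet sent and
    // mark it as sent, so a re-run doesn't send duplicates.
    public int sendPending() {
        int sentCount = 0;
        for (int i = 0; i < table.size(); i++) {
            Email e = table.get(i);
            if (!e.sent()) {
                deliver(e); // an SMTP call in the real system
                table.set(i, new Email(e.to(), e.body(), true));
                sentCount++;
            }
        }
        return sentCount;
    }

    private void deliver(Email e) {
        System.out.println("sending to " + e.to());
    }
}
```

Because the e-mails are captured durably before sending, a failed run of the job simply leaves them waiting for the next run, which fits the same fault-tolerance story as the file-generation job.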
This is one of those occasions where you can do some refactoring at the architecture level and simplify the deployment model by moving the scheduled jobs into our existing middle tier. After all, it's a long-running process with a very well understood build and deployment process, plus we have some bespoke diagnostics to monitor the status of the services it provides. We moved these standalone jobs into the middle tier by creating a very simple "timer service" framework (a wrapper around the platform-provided timers) that kicks off a couple of timers to run the existing code every time an interval elapses. The benefits of doing this are that we've (a) simplified the deployment process and (b) gained the ability to integrate these scheduled services with our existing middle-tier diagnostics.
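A minimal sketch of that timer-service idea, assuming Java's `ScheduledExecutorService` as the platform-provided timer (our real system is .NET, and the class and method names here are illustrative): a thin wrapper that runs existing job code inside the long-running process, catching failures so one bad run doesn't cancel the schedule.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class TimerService {

    private final ScheduledExecutorService scheduler =
            Executors.newScheduledThreadPool(2);

    // Schedule an existing job (the same code that used to live in a
    // standalone program) to run at a fixed interval.
    public ScheduledFuture<?> schedule(String name, Runnable job,
                                       long intervalSeconds) {
        return scheduler.scheduleAtFixedRate(() -> {
            try {
                job.run();
            } catch (Exception e) {
                // Hook point for the existing middle-tier diagnostics;
                // swallowing here keeps the schedule alive after a failure.
                System.err.println("job '" + name + "' failed: " + e.getMessage());
            }
        }, 0, intervalSeconds, TimeUnit.SECONDS);
    }

    public void shutdown() {
        scheduler.shutdown();
    }
}
```

The existing batch code barely changes: instead of a `main` method invoked by the external scheduler, it becomes a `Runnable` registered with the timer service at middle-tier start-up, and the external scheduling step disappears from the deployment process.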
Coming back to the case study in the training course: merging the batch and UI components into a single process is a great way to simplify the deployment for what is essentially a very simple system. This also provides additional benefits when the operations and support team aren't familiar with the technology, perhaps because you're introducing something new to them. Logically, the batch and UI components are very separate in their behaviour and responsibilities, but this is one of those occasions where bringing the logical design back to reality can help you identify opportunities for simplification. The logical-to-physical mapping doesn't have to be one-to-one.
Simon is an independent consultant specializing in software architecture, and the author of Software Architecture for Developers (a developer-friendly guide to software architecture, technical leadership and the balance with agility). He’s also the creator of the C4 software architecture model and the founder of Structurizr, which is a collection of open source and commercial tooling to help software teams visualise, document and explore their software architecture.