Designing Maintainable Systems

The Upgrade Process

I'm currently involved in a project to upgrade a third-party piece of software, and it's apparent that the upgrade process was not considered when the software was originally designed. This became obvious when we totaled up the time required to perform the upgrade, configure it and test it post-release - it came to over three days of work. This did not even take into account any rollback time (which, fortunately, is simplified these days by the use of virtualisation).

The software is used heavily from Monday to Friday, so we wanted to upgrade over a weekend. The vendor suggested we perform the upgrade on a parallel system and then get the users to re-enter into the new system all the data that was missed - you can imagine how well that would have gone down. This would also mean trying to run post-release regression tests on two systems that are live, in use and out of sync.

Software almost always needs updating or upgrading (unless it's control software for a deep space probe!). The ability to upgrade, and the consequences of doing so, should be considered as part of the design and development process. Questions to ask include:

  • Can an upgrade be performed in parallel to a live, running system and how does a switchover occur?
  • Will a system need to be taken down for any upgrades and for how long? How does this affect your Service Level Agreements?
  • How easy will any upgrade be to roll back? Errors occur!
  • Can you upgrade parts of the systems or does everything have to be done at once?
  • What is the effect on users? Will they need to log out first, for example? Will they lose any work if they fail to follow your procedures?
  • How easy will it be to test the upgraded system to determine success? Your notice of failure shouldn't be an angry user phone call.
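The last question, how to determine success without waiting for a user to complain, can be sketched as a small post-upgrade smoke-check runner. This is only an illustration: `run_smoke_checks` and the check names are hypothetical, and in practice each check would call into the real system (a login, a report, a known query) rather than return a constant.

```python
from typing import Callable, Dict, List


def run_smoke_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each named check and return the names of the checks that failed.

    A check fails if it returns a falsy value or raises an exception.
    """
    failures = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            # A crashing check counts as a failure, not as a crash of the runner.
            ok = False
        if not ok:
            failures.append(name)
    return failures


if __name__ == "__main__":
    # Hypothetical checks; real ones would exercise the upgraded system.
    failed = run_smoke_checks({
        "login page responds": lambda: True,
        "year-end report runs": lambda: True,
    })
    print("Upgrade OK" if not failed else f"Failed checks: {failed}")
```

An empty result means every check passed; anything else is a reason to consider rolling back before the users arrive on Monday.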

Some simple tools can make all the difference. Most of my work is on financial applications, and I like to run regression reports between systems at important points, e.g. end of year. However, it's often very difficult to get data out of systems to perform simple comparisons!
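As a minimal sketch of such a regression report, assume each system can export its end-of-year balances as a mapping from account identifier to amount. The function name `compare_balances`, the account identifiers and the figures below are all made up for illustration.

```python
from decimal import Decimal
from typing import Dict, List, Optional, Tuple

Difference = Tuple[str, Optional[Decimal], Optional[Decimal]]


def compare_balances(
    old: Dict[str, Decimal],
    new: Dict[str, Decimal],
    tolerance: Decimal = Decimal("0"),
) -> List[Difference]:
    """Return (account, old_value, new_value) for every account that differs.

    An account missing from one system is reported with None on that side.
    """
    differences = []
    for account in sorted(set(old) | set(new)):
        old_value = old.get(account)
        new_value = new.get(account)
        if old_value is None or new_value is None:
            differences.append((account, old_value, new_value))
        elif abs(old_value - new_value) > tolerance:
            differences.append((account, old_value, new_value))
    return differences


if __name__ == "__main__":
    # Made-up figures standing in for exports from the two systems.
    old_system = {"ACC-1": Decimal("100.00"), "ACC-2": Decimal("250.50")}
    new_system = {"ACC-1": Decimal("100.00"), "ACC-2": Decimal("250.55")}
    for account, before, after in compare_balances(old_system, new_system):
        print(f"{account}: old={before} new={after}")
```

`Decimal` is used rather than `float` so that monetary amounts compare exactly; the optional tolerance is there for systems that round differently.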

Sensible configuration management is often missing. If I've upgraded and configured new features in my pre-production environment, I really shouldn't have to repeat the process from scratch in production. Manual processes are prone to error; ideally, once I've prepared for an upgrade, I should just hit a 'go' button and sit back.
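One way to approach this, sketched under the assumption that configuration can be exported as simple key/value pairs, is to compute the changes to promote from pre-production to production while leaving environment-specific settings alone. The function `plan_promotion` and the setting names are hypothetical.

```python
from typing import Any, Dict, Set


def plan_promotion(
    preprod: Dict[str, Any],
    prod: Dict[str, Any],
    environment_specific: Set[str],
) -> Dict[str, Any]:
    """Return the settings that must change in production to match pre-production.

    Keys listed in environment_specific (hostnames, credentials and so on)
    are never promoted.
    """
    changes = {}
    for key, value in preprod.items():
        if key in environment_specific:
            continue
        if key not in prod or prod[key] != value:
            changes[key] = value
    return changes


if __name__ == "__main__":
    # Hypothetical settings for illustration only.
    preprod = {"db_host": "preprod-db", "new_feature": True, "page_size": 50}
    prod = {"db_host": "prod-db", "page_size": 25}
    print(plan_promotion(preprod, prod, environment_specific={"db_host"}))
```

The resulting plan can be reviewed in advance and then applied by a script, which is about as close to a 'go' button as a sketch like this gets.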

In my experience very few software developers are aware of IT Service Management (ITSM/ITIL). In particular we should be aware of the Change Management, Release Management and Configuration Management roles that support staff have. If you want to read about ITSM/ITIL then the wiki page is a good place to start.

Some of the processes of ITSM may strike agile developers as heavyweight, but this doesn't stop you developing the system in an agile manner; it just means that the system can be deployed within a formal environment.

An architect should be aware of how the software fits into the organisation. So remember that your ‘users’ aren't just the end users but also the support staff who'll be maintaining your system for the next ten years!

About the author

Robert Annett works in financial services and has spent many years creating and maintaining trading systems. He knows far more about low latency data systems and garbage collection than is good for anyone. He likes to think of himself as a pragmatist who loves technology but uses what's appropriate rather than what's cool.

When not poring over data connections or tormenting interviewees with circular reference questions, Robert can be found locked in his shed with an impressive collection of woodworking tools.

E-mail : robert.annett at codingthearchitecture.com


Couldn't agree more...

A production failure can be very expensive, in terms of both financial cost and reputation. So going into a deployment with doubts about whether it's viable, or with uncertainty about whether it was successful, may be reckless. Using your DR environment for the upgrade might be appropriate -- providing you've got a fast and reliable rollback procedure!

We've expended a lot of effort in trying to automate deployment but also tried to remember that the test phases aren't just a test of the functionality: they also test the processes that support them.

I'm pleased to see there's a track devoted to "dev and ops - a single team" at this year's QCon. Your SLA depends on these things, and it's an important facet of the user (or client) experience.

Re: Designing Maintainable Systems

Umm - even control software for deep space probes needs updating/upgrading! There was an interesting paper at last year's Ada UK conference about modifying the code for the Cassini-Huygens probe to Saturn so that the science could be modified when unexpected conditions were encountered.

Re: Designing Maintainable Systems

Fair enough! Can anyone think of a type of software that doesn't need updating?
