Is caching an 'Architectural Smell'?

Kent Beck introduced the concept of "Code Smells" while working on Martin Fowler's famous Refactoring book, and I think most people would agree with many of the stinks he identified. Many of us probably also use tools such as checkstyle to automatically identify things like excessively long methods, dead code, etc. If you are not familiar with the concept, please have a quick read of the link above, but the basic premise is that:

A code smell is a surface indication that usually corresponds to a deeper problem in the system.

We have to remember, though, that a 'smell' doesn't mean the code is bad, just that it's worth investigating and justifying.

We can take the concept to the next layer of abstraction and identify a number of "Architectural Smells". A recent blog article touched upon one of mine - the (over) use of Caches.

I've had terrible trouble with caches in the past. They can introduce bugs that are difficult to reproduce because they only become visible with particular operation timings. They are similar to the bugs you find in concurrent systems, where the issue only occurs every few thousand operations and isn't present when you attach a debugger or add logging. Like all performance tuning, a cache should be introduced AFTER you have determined that there is a problem. However, caches can be added so easily that developers throw them in whenever they can. Of course, if your cache hit rate is low then performance can actually degrade after adding a cache, because every miss pays for the cache lookup as well as the trip to the source.
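To put rough numbers on that last point, here is a minimal sketch in Python. It assumes a simple look-aside cache where a miss costs the cache lookup plus the call to the source. The function name and the latency figures are invented for illustration and are not measurements from a real system.

```python
def expected_latency_ms(hit_rate, cache_ms, source_ms):
    """Average read latency for a look-aside cache.

    A hit costs only the cache lookup; a miss pays for the failed
    lookup and then the call to the original source.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return hit_rate * cache_ms + miss_rate * (cache_ms + source_ms)


# Hypothetical figures: a 2 ms remote cache lookup in front of a 10 ms query.
for hit_rate in (0.05, 0.5, 0.95):
    latency = expected_latency_ms(hit_rate, cache_ms=2.0, source_ms=10.0)
    print(f"hit rate {hit_rate:.0%}: {latency:.1f} ms (uncached: 10.0 ms)")
```

Under this model the cache only pays for itself once the hit rate exceeds cache_ms / source_ms, which is 20% for the made-up figures above. Below that, every read is slower on average than going straight to the source.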

Maybe you agree with the above, maybe not (and I know I'll be flamed for saying it), but why do I consider caches to be an Architectural Smell?

In a perfect system the business logic will always have access to the data it needs. The access (local or remote) will fit comfortably into the non-functional requirements and the data it uses will be from the primary source/system of record and not be stale.

Back in the real world, the system is used in ways it was never designed for, by many more users than anticipated, and they can't wait for anything.

The temptation is to introduce a cache at each layer where there is an issue. Caches can be very easy to introduce (Spring will allow you to do this with a couple of lines of configuration for your data access components) and the user's perception of responsiveness can improve dramatically. Is it a free lunch? If you look closely at the options available with caching systems you'll see all sorts of settings that you might associate with databases, which is not surprising as a cache is really a mini database. Have you considered data staleness, dirty reads, dirty writes, update schedules? Will all clients of the data see the same data at the same time? Can updates be missed? Does it listen for updates or poll? Is data coalesced, grouped or skipped? Depending on the use of the data you might answer these questions and decide that caching is an effective and accurate solution - great! If it's not, then the cache will introduce the kind of bugs I described.
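The staleness question is easy to see in a few lines. The following is only a sketch, assuming a time-to-live cache that polls nothing and hears about no updates. TtlCache, the prices dictionary and the hand-cranked clock are all hypothetical names made up for this example.

```python
class TtlCache:
    """Tiny read-through cache with a fixed time-to-live (illustrative only)."""

    def __init__(self, loader, ttl_seconds, clock):
        self._loader = loader      # fetches the value from the system of record
        self._ttl = ttl_seconds
        self._clock = clock        # injected so the example is deterministic
        self._entries = {}         # key -> (value, time it was loaded)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]        # may be stale: the source is not consulted
        value = self._loader(key)
        self._entries[key] = (value, now)
        return value


# Hypothetical system of record and a fake clock we can advance by hand.
prices = {"ACME": 100}
now = [0.0]
cache = TtlCache(loader=prices.__getitem__, ttl_seconds=30, clock=lambda: now[0])

first = cache.get("ACME")   # miss: loads the value from the source
prices["ACME"] = 105        # the source changes behind the cache's back
stale = cache.get("ACME")   # hit: the cache still returns the old value
now[0] = 31.0               # move past the TTL
fresh = cache.get("ACME")   # expired: reloads from the source

print(first, stale, fresh)
```

Whether a read that is up to 30 seconds old is acceptable depends entirely on what the data is used for, which is why it has to be an explicit decision rather than a default.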

Either way it is still an Architectural Smell. Perhaps the best solution is to re-examine how data is distributed and accessed throughout the system. For example:

  • Maybe a monolithic database sitting at the center of the system isn't the best solution and perhaps you need multiple databases with different responsibilities? (Issues with monolithic, remote databases are a common reason for needing caches).
  • Maybe an asynchronous messaging system with multiple messages being processed would work better than a single request/response system?
  • Perhaps data associated with a request should be sent through the system with the request itself (enriched request).
  • Should some data (e.g. static) be explicitly kept locally rather than requested and cached?
  • Should some data have its encoding changed? Moving from/to XML is very time consuming.
  • Can data be requested in larger or smaller blocks to reduce overheads? Calling a database in a loop is a common problem.
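As an illustration of the last bullet, here is a small sketch using Python's built-in sqlite3 module. The trades table, its contents and both function names are invented for the example. With an in-memory database the difference is negligible, but against a remote database each query in the loop is a separate network round trip.

```python
import sqlite3

# Hypothetical schema and data, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (id INTEGER PRIMARY KEY, symbol TEXT)")
conn.executemany(
    "INSERT INTO trades (id, symbol) VALUES (?, ?)",
    [(1, "ACME"), (2, "INIT"), (3, "ZED")],
)


def symbols_in_loop(conn, trade_ids):
    """One query per id: N round trips to the database."""
    result = {}
    for trade_id in trade_ids:
        row = conn.execute(
            "SELECT symbol FROM trades WHERE id = ?", (trade_id,)
        ).fetchone()
        if row is not None:
            result[trade_id] = row[0]
    return result


def symbols_in_batch(conn, trade_ids):
    """One query for all ids: a single round trip."""
    if not trade_ids:
        return {}
    # Only "?" markers are interpolated into the SQL; values stay parameters.
    placeholders = ", ".join("?" for _ in trade_ids)
    rows = conn.execute(
        f"SELECT id, symbol FROM trades WHERE id IN ({placeholders})",
        list(trade_ids),
    ).fetchall()
    return dict(rows)


print(symbols_in_loop(conn, [1, 3, 4]))
print(symbols_in_batch(conn, [1, 3, 4]))
```

Both functions return the same mapping. Databases limit how many parameters a single statement may carry, so very large id lists would need to be split into chunks, which is the "larger or smaller blocks" trade-off in miniature.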

I appreciate that this will involve a lot more work than a few lines of configuration, but it may help architectures to evolve logically rather than become a series of hacks and bolt-ons. Introducing a cache is an architectural decision and not a coding one.

What are your favourite Architectural Smells we should all look for? I've already mentioned another of mine - "XML everywhere".

About the author

Robert Annett

Robert works in financial services and has spent many years creating and maintaining trading systems. He knows far more about low latency data systems and garbage collection than is good for anyone. He likes to think of himself as a pragmatist who loves technology but uses what's appropriate rather than what's cool.

When not poring over data connections or tormenting interviewees with circular reference questions, Robert can be found locked in his shed with an impressive collection of woodworking tools.

E-mail : robert.annett at codingthearchitecture.com


Re: Is caching an 'Architectural Smell'?

Interesting post. I've experienced many mind-boggling bugs caused by misused caches, so this same question has crossed my mind more than once. Sadly, the answer is, as always, "it depends". Still, listing frequent mistakes and generally prompting people to think about this problem, like this article attempts to do, is a very good thing.

Re: Is caching an 'Architectural Smell'?

Oh yes. Caching reeks, specifically the "add-on" cache where you can bolt a cache onto something to "improve" performance. Caches that work are designed into the system as a core part of its design - like processor caches, or disk subsystem caches. You can't configure them, you can't turn them off, and you might not even know they are there. Adding a cache into an existing pipeline because you have to fix a problem is definitely a smell.

Re: Is caching an 'Architectural Smell'?

No, caching is not an architecture smell. What you're describing is people treating caches as a silver bullet to help with bad architecture. That's bad. The way I understand the term, a smell is something that you should pay attention to because there's a higher than average likelihood that closer inspection will uncover a flaw. You can argue that that's the case because there's so much badly architected code around, but accepting that argument quickly leads down the slippery slope of someone writing a blog post about how avoiding all caches will make the whole world better. And that couldn't be further from the truth. The point is, just because a tool can be misused doesn't make it sensible to call it a smell. Otherwise, every tool in our toolbox is a strong candidate for a smell.

Re: Is caching an 'Architectural Smell'?

I definitely agree with you. Caching is an optimization, so when it is done prematurely it often causes more harm than good. Also, there are really two types of caches, read-only and write-through, and using the correct one is important. Throwing a cache into the mix when it gets very few hits slows down the system and wastes memory. All optimization techniques can be used to speed things up, but they can also slow them down. They need to be used with caution and understanding. Paul.
