Code Metrics

Driving up quality with automated review

I've recently been involved in refactoring an application to alter it from its successful, 'tactical' implementation to a more 'strategic' codebase. Broadly speaking, this means improving the quality of the code.

This 'quality' requirement was obviously key and, like many non-functional requirements, was initially defined a bit too vaguely. We expanded it into more tangible goals - increase developer productivity, reduce UAT overhead, and so on. It was very tempting (and I've certainly done it in the past) to simply define the actions that would meet these goals based on personal bug-bears in the code. Instead we opted to identify the metrics that represented these NFRs and commit to improving those. This would help identify where we should spend time and, crucially, provide a measure that demonstrated we were being effective.

The following metrics, some of which I'll cover in a bit more detail later, were applied:

  • total lines of code
  • non-commenting source statements (NCSS)
  • cyclomatic complexity number (CCN)
  • lines of duplicate code
  • unit test coverage
  • coding standards violations

Some were effective, some not. Overall, though, I've been pleasantly surprised by how easy they were to work into the automated build and the insight even a simple set of metrics can provide. They've also proven to be a very useful way of communicating progress to (non-technical) stakeholders in an empirical way. I don't need to waffle on about code changes anymore - the line on the graph drops over time, the line represents code, code represents cost of ownership.

Total lines of code

This is the metric that's often dismissed as too simplistic and misleading. The actual value of the line count doesn't really express cost of ownership very well, but I've found it useful nonetheless. The shape of the total line count graph doesn't change much when comments and whitespace are removed, so it demonstrates the overall trend effectively. The reason I still use it alongside NCSS is that absolutely everyone understands it!

Non-Commenting Source Statements (NCSS)

As its name implies, NCSS counts the source statements that aren't commentary; comments and blank lines are ignored. You could view it as the weight of the compiled code, or the amount of source code that actually does stuff.
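As a rough, hypothetical illustration (the exact counting rules vary slightly from tool to tool), the snippet below spans a dozen or so lines, but only the declarations and statements contribute to NCSS - the Javadoc, the comments and the blank lines count for nothing:

    // Hypothetical example of NCSS counting: each declaration and statement
    // is one unit; comments and blank lines are ignored.
    public class NcssExample {                     // counts: class declaration

        /** Javadoc inflates the line count but not the NCSS figure. */
        public int clamp(int value, int max) {     // counts: method declaration
            // this comment is ignored entirely
            if (value > max) {                     // counts: statement
                return max;                        // counts: statement
            }
            return value;                          // counts: statement
        }
    }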

I must admit that I've not found it much more insightful than counting lines of code, although it has had its moments. No one outside the development team seems to care about it very much when a simple line count metric is available.

Cyclomatic Complexity Number (CCN)

In essence, CCN measures how many paths there are through a particular piece of code. Conditions and loops add to the CCN of a method. This can be viewed as indicating the number of test cases that are required to cover the method.
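As a hypothetical example (tools differ slightly over what they treat as a decision point), the method below starts at a CCN of 1 and picks up a point for each branch, so at least four test cases are needed to exercise every path:

    // Hypothetical example: CCN starts at 1 for the method and increases by one
    // for each decision point (if, loop, case, catch and, in some tools, && / ||).
    public class CcnExample {

        // CCN of roughly 4: the method (1), the null check (2),
        // the loop (3) and the inner condition (4).
        public int countPositiveEvens(int[] values) {
            if (values == null) {                  // +1
                return 0;
            }
            int count = 0;
            for (int v : values) {                 // +1
                if (v > 0 && v % 2 == 0) {         // +1 (some tools also count the &&)
                    count++;
                }
            }
            return count;
        }
    }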

I'm not sold on an argument as simplistic as 'reducing average method CCN equates to a reduction in TCO'. After all, TCO is about the total cost of ownership, not the average cost of ownership. High complexity is clearly something to be avoided, but not necessarily because avoiding it will result in less code or fewer tests!

CCN has definitely been worthwhile graphing over time. It does a very effective job of pointing out where the code has turned to spaghetti or has multiple responsibilities. It's a metric that's appealing to developers as there is a clear correlation between the numbers and how much they like the code. It was strangely fun to pick one of the top ten methods (by CCN) and try to get it out of the top ten.

Duplicate lines of code

I use a tool called CPD (the Copy-Paste Detector) to identify where duplication occurs in the code. It's not particularly sophisticated - it isn't very aware of the language and is therefore mainly doing a textual comparison. Despite this limitation, it identified that about a fifth of my codebase had been copied verbatim from somewhere else.

Since the remedy to most copy/paste operations is to delete one of the copies, there is often a very quick improvement in this metric when you run through, finding references and deleting. I found the improvement to be dramatic but short-lived, however: once the duplicate code had been identified and removed there was apparently nothing more to do.
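As a deliberately trivial, made-up sketch of the usual fix: two verbatim copies of the same check collapse into one shared helper, the former call sites delegate to it, and the duplicate line count drops immediately:

    // Hypothetical before/after: the null/blank check used to be copied verbatim
    // into both forms; now a single helper survives and the callers delegate to it.
    public final class InputChecks {

        private InputChecks() {
            // utility class, not instantiable
        }

        // The single surviving copy of the previously duplicated block.
        public static String requireNonBlank(String value, String fieldName) {
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalArgumentException(fieldName + " must not be blank");
            }
            return value.trim();
        }
    }

    class CustomerForm {
        String name(String raw) {
            return InputChecks.requireNonBlank(raw, "name");        // was an inline copy
        }
    }

    class OrderForm {
        String reference(String raw) {
            return InputChecks.requireNonBlank(raw, "reference");   // was an inline copy
        }
    }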

CPD missed quite a lot of duplication since it was fooled by differences in commentary and string literals. Fortunately, in many cases, we'd already been tipped off by duplicate neighbouring code. Unfortunately, removing this duplication didn't alter the metric and perhaps therefore gave the impression of being a waste of time. Fortunately, *phew*, removal of code was always reflected in the line count and NCSS metrics.

 

I've found code metrics to be effective, especially when you pick a very small set of demonstrably worthwhile ones. Duplication is sometimes a quick win, but CCN is my current favourite, especially if you can explain it to your audience without their eyes glazing over.

I'd be interested to hear if anyone has any particular favourites.



Re: Code Metrics

Interesting stuff - and I can see the value of CCN. How did you measure it though?

Re: Code Metrics

We have an Ant script that uses the JavaNCSS task to generate NCSS and CCN metrics:

http://www.kclee.de/clemens/java/javancss/

The XSL that generates the report has been modified to list the top 100 methods by CCN as well as show the distribution of complexity across low (<=3), medium (<=7), high (<=10) and very high (>10) methods.

Re: Code Metrics

It's good to hear you've found some of these metrics useful.

We recently incorporated all the metrics available as simple Maven2 plugins into our continuous integration system, although we haven't spent time analysing the results yet.

I'll be doing this over the next couple of weeks so I'll get back to you on our experiences.

Currently we're migrating an external component into our codebase.

We're using coverage tools and metrics and have found them very useful indeed. We use Emma integrated into the IDE for real-time TDD coverage, Cobertura for CI unit testing coverage, and a combination of Selenium with Emma for integration test coverage.

The coverage tools keep us very focused, stop the team writing unnecessary tests, and give me a quick overview of the current test effort.

Re: Code Metrics

I once compared Cyclomatic Complexity (CC) with Essential Complexity (EC), and found that EC was a better estimator of true complexity of functions. The problem with CC is that it gives a function containing just a long but absolutely trivial switch statement a high complexity number. EC does not.
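For example, something like this deliberately trivial sketch scores high on cyclomatic complexity (one point per case) yet its essential complexity stays at the minimum, because the control flow is completely structured:

    // Hypothetical illustration: a long but well-structured switch inflates the
    // cyclomatic number, while essential complexity remains 1 because there is
    // no unstructured branching to reduce.
    public class ComplexityExample {

        public String dayName(int day) {
            switch (day) {
                case 1:  return "Monday";
                case 2:  return "Tuesday";
                case 3:  return "Wednesday";
                case 4:  return "Thursday";
                case 5:  return "Friday";
                case 6:  return "Saturday";
                case 7:  return "Sunday";
                default: return "unknown";
            }
        }
    }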

More info here:

http://hissa.nist.gov/HHRFdata/Artifacts/ITLdoc/235/chaptera.htm

Mats

Re: Code Metrics

A trivial switch statement still needs a lot of tests to cover it, so there's value in knowing how branched the code is. However, essential complexity is definitely a valuable metric for identifying where the system's functionality is truly being implemented (and helping to redistribute it accordingly).

No doubt a combination of CCN and ECN would provide the best of both worlds.

P.S. Neil, I look forward to the day when my project uses Maven and has automated acceptance tests!! Sounds like your new job has got things right.

Re: Code Metrics

Try changing your copy and paste detection tool to IntelliJ (if you haven't already ;). I just ran it on our codebase and the results are simply staggering.
That's probably as much a statement about our codebase as it is about IntelliJ's smartness, but it anonymizes all names (class, method, variable, whatever...), which finds those duplications you'd miss at first glance.
It also associates a cost with each duplicate, 'which is the arbitrary units being the code block size calculated using the additive algorithm; generally, the larger the code piece, the higher the figure'.
If the block is in the same class you can also go into that chunk of code and extract a new method from it, wherein IntelliJ destroys the duplicates automagically.
The fact that it anonymizes compilation units also means you can spot duplications which can be removed through parameterization - this is a significant step up from dumb duplicate detection.
It also has a sub-expression anonymizer, but this significantly slows down the detection process. I may try it overnight to see what joy it yields.
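For example (a contrived sketch), the sort of pair that name anonymization flags often differs only in identifiers and a literal, so the fix is a single parameterized method:

    // Hypothetical near-duplicate exposed by name anonymization: two methods that
    // were identical apart from their names and a literal rate collapse into one
    // parameterized method, with the originals kept as thin delegating wrappers.
    public class DiscountCalculator {

        public double applyDiscount(double price, double rate) {
            if (price <= 0) {
                return 0;
            }
            return price - (price * rate);
        }

        public double applyStaffDiscount(double price) {
            return applyDiscount(price, 0.15);     // was a standalone copy with 0.15 inline
        }

        public double applyLoyaltyDiscount(double price) {
            return applyDiscount(price, 0.05);     // was a standalone copy with 0.05 inline
        }
    }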
I'm off to see if I can get a LOC reduction bonus from my boss...

Re: Code Metrics

Glad to hear that CPD is working well for you!  FWIW, it's also got an ignoreLiterals flag that you can set to make it ignore literal values. Ditto for ignoreIdentifiers.

Re: Code Metrics

For all our Java projects, we have been consistently using the following tools (with some of them integrated with our CruiseControl / Maven based build servers):

  • PMD
  • Checkstyle
  • FindBugs
  • JCSC
  • DocCheck
  • JDepend
  • eMetrics
  • Code Analysis Plugin
  • AppPerfect Code Analysis

Given our experience in improving - and constantly maintaining - the quality standards of the software being built, I would recommend these tools be part of every serious Java software development effort.

Re: Code Metrics

I'd be quite interested to hear of your experiences of the Code Analysis Plugin. I've found CAP to be useful on a few rare occasions when I'm researching a particularly risky refactoring (basically doing dependency analysis). If I'm being honest I've found its metrics to be of no use at all to me yet, though.

Re: Code Metrics

It's not clear what axes your graph has: metrics vs. what?

Re: Code Metrics

I graph metrics over time. This lets anyone who's interested track the trend of each metric (or compare between metrics). Without the history of each metric it's quite hard to say whether you've made any difference to the code through refactoring.

This is a different goal from setting a threshold for each metric to ensure quality is maintained at each milestone; it's about continual improvement of quality.

Re: Code Metrics

Thanks.

Re: Code Metrics

We have been using Resource Standard Metrics by M Squared Technologies to analyze Java and C#. Nice tool. It does baseline differences for these types of code metrics.
