<?xml version="1.0"?>
<rss version="2.0">
<channel>
  <title>Coding the Architecture - optimization tag</title>
  <link>http://www.codingthearchitecture.com/tags/optimization/</link>
  <description>Software architecture for developers</description>
  <language>en</language>
  <copyright>Coding the Architecture</copyright>
  <lastBuildDate>Mon, 21 May 2012 09:41:00 GMT</lastBuildDate>
  <generator>Pebble (http://pebble.sourceforge.net)</generator>
  <docs>http://backend.userland.com/rss</docs>
  
  
  <item>
    <title>What is Significant?</title>
    <link>http://www.codingthearchitecture.com/2007/12/15/what_is_significant.html</link>
    
      
        <description>
          &lt;p&gt;
Recently I wrote a quick blog entry about taking metrics for optimisation. I suggested that an optimisation should only be included if the improvement was significant, but how do you define significant?
&lt;/p&gt;

&lt;p&gt;
You may think that &#039;significant&#039; is just a matter of opinion, but it has a very specific meaning in statistics: see
&lt;a href=&#034;http://en.wikipedia.org/wiki/Statistical_significance&#034;&gt;Wikipedia&#039;s description&lt;/a&gt;. You can read through the maths, but it basically comes down to &#034;a result is called statistically significant if it is unlikely to have occurred by chance&#034;.
&lt;/p&gt;

&lt;p&gt;
This is really important and something that performance testers and optimisers often forget. For example...
&lt;/p&gt;
&lt;p&gt;
Imagine that you perform some kind of performance test on your system or code. This could measure anything, e.g. response latency or throughput per unit of time, but we&#039;ll assume here that it&#039;s units processed in 10 minutes. The figure you get is 20. You spend a day modifying a piece of code you think will affect the performance, retest, and get 22. A 10% improvement: pretty good.
&lt;/p&gt;
&lt;p&gt;
You hand the new code over to a colleague who also does a test. She says that it&#039;s worse by 5%. Slander! You take it to your boss who says there is no difference...
&lt;/p&gt;
&lt;p&gt;
What we&#039;ve done is perform three tests on each of the old and new systems. Let&#039;s list them and perform seven others as well:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;
Old: 20 20 22 19 19 21 19 19 19 22
New: 22 19 22 20 20 19 19 19 21 19
&lt;/pre&gt;
&lt;/p&gt;
&lt;p&gt;
Now it&#039;s obvious what happened (although it probably was before). Your test does not produce constant figures, even without changes. Both sets of results have a range of 3 (19 to 22), a mean of 20 and a sample variance of about 1.56.
&lt;/p&gt;
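&lt;p&gt;
Those summary figures can be checked with a few lines of Python. This is only a sketch: the helper name summarise is made up for illustration, and it assumes the variance in question is the sample variance (dividing by n - 1), which works out at 14/9, or about 1.556, for both sets.
&lt;/p&gt;
&lt;pre&gt;
```python
import statistics

# The ten runs listed above for the old and new code.
old = [20, 20, 22, 19, 19, 21, 19, 19, 19, 22]
new = [22, 19, 22, 20, 20, 19, 19, 19, 21, 19]


def summarise(runs):
    """Return (range, mean, sample variance) for a list of run results."""
    spread = max(runs) - min(runs)
    # statistics.variance divides by n - 1 (sample variance).
    return spread, statistics.mean(runs), statistics.variance(runs)


for label, runs in (("Old", old), ("New", new)):
    spread, mean, var = summarise(runs)
    # Both lines report range=3 mean=20.0 variance=1.56
    print(f"{label}: range={spread} mean={mean:.1f} variance={var:.2f}")
```
&lt;/pre&gt;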
&lt;p&gt;
If you had performed ten runs on the original code first, you would have realised that a single result of 22 for the new code is not significant, as it&#039;s within the range of the previous figures.
&lt;/p&gt;
&lt;p&gt;
Performing the test multiple times on the new code as well gives you two sets of results to compare, which increases your confidence in judging whether the change is significant. You can test this statistically (but the maths is beyond the scope of this blog entry).
&lt;/p&gt;
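&lt;p&gt;
For anyone who wants to try it without much maths, one possible approach is a permutation test: pool both sets of results, shuffle them many times, and count how often a random split produces a difference in means at least as large as the one observed. This is a sketch of one technique, not necessarily the test this entry has in mind, and the function name, number of rounds and seed are all assumptions chosen for illustration.
&lt;/p&gt;
&lt;pre&gt;
```python
import random


def permutation_p_value(a, b, rounds=10000, seed=0):
    """Estimate how often chance alone gives a mean difference this large."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        if diff >= observed:
            hits += 1
    # A small value suggests the difference is unlikely to be chance.
    return hits / rounds


old = [20, 20, 22, 19, 19, 21, 19, 19, 19, 22]
new = [22, 19, 22, 20, 20, 19, 19, 19, 21, 19]
print(permutation_p_value(old, new))
```
&lt;/pre&gt;
&lt;p&gt;
For the figures above the two means are identical, so the observed difference is zero, every shuffle matches or beats it, and the function returns 1.0: no evidence at all that the change made a difference.
&lt;/p&gt;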
&lt;p&gt;
Just to leave you with a challenge: the project I&#039;m currently working on has a task that we wish to optimise, but it takes ten hours to run even on a grid of several hundred machines. How do we run realistic, pre-production tests that we know are statistically significant?
&lt;/p&gt;

        </description>
      
      
    
    
    
    <category>How do you deliver software architecture?</category>
    
    <comments>http://www.codingthearchitecture.com/2007/12/15/what_is_significant.html#comments</comments>
    <guid isPermaLink="true">http://www.codingthearchitecture.com/2007/12/15/what_is_significant.html</guid>
    <pubDate>Sat, 15 Dec 2007 21:15:27 GMT</pubDate>
  </item>
  
  </channel>
</rss>

