<?xml version="1.0"?>
<rss version="2.0">
<channel>
  <title>Coding the Architecture - kroper</title>
  <link>http://www.codingthearchitecture.com/authors/kroper/</link>
  <description>Software architecture for developers</description>
  <language>en</language>
  <copyright>Coding the Architecture</copyright>
  <lastBuildDate>Mon, 09 Jan 2012 09:02:08 GMT</lastBuildDate>
  <generator>Pebble (http://pebble.sourceforge.net)</generator>
  <docs>http://backend.userland.com/rss</docs>
  
  
  <item>
    <title>The Other Interface</title>
    <link>http://www.codingthearchitecture.com/2009/02/07/the_other_interface.html</link>
    
      
        <description>
          &lt;p&gt;
One of the most succinct definitions of a technical architect is: a technologist who is responsible for a system meeting its Non-Functional Requirements.
&lt;/p&gt;&lt;p&gt;
What are often perceived as the most interesting NFRs relate to performance, stability and availability.  However, recently I&#039;ve been paying a lot of attention to perhaps the least glamorous of all the non-functionals: supportability.  In a mature system, the lion&#039;s share of the time it takes to fix a fault is taken up by diagnosing where the fault lies.  Once you&#039;ve diagnosed it, fixing it is often trivial (testing the fix less so, but that&#039;s a discussion for another day).
&lt;/p&gt;&lt;p&gt;
So how do you decrease this diagnosis time?  It boils down to logging and monitoring.  There are some excellent monitoring tools available, and I&#039;ve seen some good home-grown applications, which provide a very informative real-time view of what&#039;s going on under the hood of a process (for Java systems, JMX greatly facilitates rolling your own, although you get a lot out of the box with Sun&#039;s Java distribution these days).  Historical concerns about monitoring tools slowing processes down have all but disappeared: such tools are used on the most latency-sensitive of trading systems.  While it&#039;s relatively easy to recognise a good monitoring tool, a good approach to logging is less self-evident.
&lt;/p&gt;&lt;p&gt;
I&#039;ve encountered dramatically different views on application logging.  At one extreme is the view that the log of a healthy long-running process should be short and readable, no bigger than one screen from top to bottom.  At the other is the view that a log file should be exhaustive, often gigabytes in size, so that carefully designed post-processing scripts (yes, not just grep) can build a picture of what was going on in the process at a given point in time, or in response to a given event.
&lt;/p&gt;&lt;p&gt;
The best approach will depend on the nature of your system and how it is supported.  I&#039;m currently working on a system supported by several different teams; the development team forms a third or even fourth level of support.  Therefore what the system dumps out in its logs feeds into human processes: messages logged at Error level should require manual intervention and possibly escalation to the next level of support, whereas warnings and below should be ignorable.
&lt;/p&gt;&lt;p&gt;
Everyone who can change the code needs to be aware of this, so a logging policy needs to be defined, published and enforced.  Ideally this policy will make your system as close to self-diagnosing as possible.  Without one, the black art of knowing which errors can be ignored, or where to look if a process fails with no log information at all, can hugely increase the support costs of the system.  It slows the resolution of support incidents, steepens the learning curve for new joiners, makes testing more difficult, and reduces software quality by hiding or delaying the discovery of bugs.
&lt;/p&gt;&lt;p&gt;
If there is one approach which is relevant to all logging policies, it is this: don&#039;t cry wolf, and don&#039;t die quietly.  To put it another way:
&lt;ul&gt;
&lt;li&gt;
Messages logged as Errors / Severe / Fatal should actually be problems with the system, and should not be ignorable.
&lt;/li&gt;&lt;li&gt;
When the system fails, if there is scope in the code to log the current state, this should be done whenever possible.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/p&gt;&lt;p&gt;
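As a rough illustration of those two rules, here is a minimal Java sketch using the standard java.util.logging API.  The class, method and logger names are hypothetical and not taken from any real system: a recoverable problem is logged as a warning, while a genuine failure is logged as Severe together with the state needed to diagnose it.
&lt;/p&gt;&lt;pre&gt;
```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LoggingPolicySketch {
    // Hypothetical logger name, for illustration only.
    private static final Logger LOG = Logger.getLogger("orders");

    // Don't cry wolf: bad input with a safe fallback is a warning, not an error.
    public static int parseQuantity(String raw, int fallback) {
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            LOG.warning("Unparseable quantity '" + raw + "', using fallback " + fallback);
            return fallback;
        }
    }

    // Don't die quietly: log the state we hold before letting the failure propagate.
    public static int process(String orderId, String rawQuantity) {
        try {
            int quantity = parseQuantity(rawQuantity, 1);
            if (quantity == 0) {
                throw new IllegalStateException("quantity must not be zero");
            }
            return quantity;
        } catch (RuntimeException e) {
            LOG.log(Level.SEVERE,
                    "Order failed: id=" + orderId + ", rawQuantity=" + rawQuantity, e);
            throw e;
        }
    }

    public static void main(String[] args) {
        System.out.println(process("A-1", "3"));
        System.out.println(process("A-2", "abc"));
    }
}
```
&lt;/pre&gt;&lt;p&gt;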
This may sound obvious, but I&#039;m finding out, to my cost, that applying even such a simple logging policy to a mature system after the fact can be very expensive.
&lt;/p&gt;&lt;p&gt;
There&#039;s perhaps no right answer as to what makes an ideal application log, but there are many wrong answers.  The worst of all is to ignore this interface to your system.  So define your logging standard at the same time as you define your other non-functional requirements, and similarly enforce it as the system evolves.
&lt;/p&gt;

        </description>
      
      
    
    
    
    <category>How do you define software architecture?</category>
    
    <comments>http://www.codingthearchitecture.com/2009/02/07/the_other_interface.html#comments</comments>
    <guid isPermaLink="true">http://www.codingthearchitecture.com/2009/02/07/the_other_interface.html</guid>
    <pubDate>Sat, 07 Feb 2009 14:53:00 GMT</pubDate>
  </item>
  
  <item>
    <title>The Enid Blyton effect</title>
    <link>http://www.codingthearchitecture.com/2008/03/12/the_enid_blyton_effect.html</link>
    
      
        <description>
          &lt;p&gt;

An architect&#039;s &lt;a href=&#034;http://www.codingthearchitecture.com/2007/07/31/role_profile_for_software_architects.html&#034;&gt;role&lt;/a&gt; often includes defining the use of development tools and process.
&lt;/p&gt;&lt;p&gt;
One such tool which I value greatly is a wiki.
&lt;/p&gt;&lt;p&gt;
For those of us used to developing with an in-team wiki, it&#039;s very hard to imagine not using one.  Of course, there are many different knowledge sharing systems which can do the same job, but a few months ago I joined a team which not only had no wiki, but had no adequate replacement.  A collection of documents scattered across an online drive, email folders, source control and individual hard disks lacked not only structure, but also the low-level, frequently changing detail which is vital knowledge for members of the development team.
&lt;/p&gt;&lt;p&gt;

After championing the introduction of a wiki, I was pleased to see it in active use by the team.  It had become a vital resource for new joiners and was used as a continual reference, with almost every team member having added useful pages.
&lt;/p&gt;&lt;p&gt;

However, I quickly noticed what I call an &lt;a href=&#034;http://en.wikipedia.org/wiki/Enid_blyton&#034;&gt;Enid Blyton&lt;/a&gt; effect creeping in.  Some developers&#039; suppressed desires to be authors surreptitiously surfaced, and reams of borderline-relevant material were posted.  For example, a developer would create a page on an obscure business topic which he was far from an authority on, and yet never get round to posting key configuration parameters about a particular build process he had improved.  Some of the more obscure features of the particular wiki implementation -- polls, graphs, emoticons -- were explored at length, but added no useful information.  Deep hierarchies of structure were introduced, presumably on the assumption that the author would come back later to flesh out an enormous topic, but with the end result of simply hiding any useful content behind 7 pages which needed to be clicked through.
&lt;/p&gt;&lt;p&gt;

There was a concern that in trying to improve communication in the team, I had simply distracted developers from productive output by giving them a new toy with little guidance.
&lt;/p&gt;&lt;p&gt;

&lt;a href=&#034;http://www.economist.com/printedition/displaystory.cfm?story_id=10789354&#034;&gt;The battle for Wikipedia&#039;s soul&lt;/a&gt; story in this week&#039;s issue of The Economist outlines a similar problem facing Wikipedia contributors: how do you censor the content of a wiki so as to keep it relevant, without suppressing the enthusiasm of the contributors?  The rather more dour terms of inclusionists versus deletionists were used.
&lt;/p&gt;&lt;p&gt;

My experience in this case was to err on the inclusionist side.  Remove superfluous structure where necessary, but generally let people put up anything which they think relevant.  The initial spike of extraneous content leveled off, and the overall relevance level remained high.
&lt;/p&gt;&lt;p&gt;
My own instincts are still to post to a wiki, even if it&#039;s just a personal page, rather than writing down information in a log book which I think I might need later.  Like they say, you can&#039;t grep dead trees.
&lt;/p&gt;
        </description>
      
      
    
    
    
    <category>How do you share software architecture?</category>
    
    <comments>http://www.codingthearchitecture.com/2008/03/12/the_enid_blyton_effect.html#comments</comments>
    <guid isPermaLink="true">http://www.codingthearchitecture.com/2008/03/12/the_enid_blyton_effect.html</guid>
    <pubDate>Wed, 12 Mar 2008 19:30:00 GMT</pubDate>
  </item>
  
  <item>
    <title>The joy of sets</title>
    <link>http://www.codingthearchitecture.com/2008/02/13/the_joy_of_sets.html</link>
    
      
        <description>
          &lt;p&gt;
I&#039;ve recently seen impressive performance gains in a data-centric process, which is a generic enough concept to be of general interest.
&lt;/p&gt;&lt;p&gt;
Imagine a system which consolidates the trades done in 10 different branches of a supermarket chain.  We receive these trades on a batch basis: a file per branch every night.  This file contains de-normalised rows of data containing information about a customer (e.g. identified through their loyalty card number), a product, and how much was purchased.  We need to convert this data into a normalised schema where product and customer have their own set of relational tables.  Our database doesn&#039;t store an exhaustive set of every product sold by the supermarket chain, so we only bother to store information about products when they appear in the input files.  Let&#039;s also assume we&#039;re doing this by hand through stored procedures -- no fancy ETL tools here.
&lt;/p&gt;&lt;p&gt;
The iterative-programmer&#039;s style of dealing with this problem is likely to be looping through the input file, and having conditional checks on each row, e.g.:
&lt;/p&gt;&lt;p&gt;
For each row in the input file:
&lt;ul&gt;
&lt;li&gt;
if the product doesn&#039;t exist in our database, insert the product
&lt;/li&gt;
&lt;li&gt;
if the customer doesn&#039;t exist, insert the customer
&lt;/li&gt;
&lt;li&gt;
finally, insert the price and quantity which was purchased
&lt;/li&gt;
&lt;/ul&gt;
&lt;/p&gt;&lt;p&gt;
We apply the above algorithm to each file we receive from each supermarket branch we are importing data for, and, because major relational databases pride themselves on their ability to deal with concurrent access, we have a thread (or process) per branch, where each thread imports all its trades in its own transaction.  We kick them off all at once, hence finishing in approximately as much time as it takes to import the branch with the most data.  Right?
&lt;/p&gt;&lt;p&gt;
Wrong.  When certain conditions arise, this process serialises so it takes 10 times longer than expected, as if we&#039;d done each branch one at a time.  (In fact, with naive error handling, this approach may never work due to race conditions).
&lt;/p&gt;&lt;p&gt;
If one product appears for the first time in 2 different branches, let&#039;s say the West London branch and the South Manchester branch, then whichever branch encounters that product first in the feed file will insert it.  When the other branch then tries to insert this product, it needs to wait for the first branch&#039;s transaction to commit or roll back before knowing if it can create the same row (in our database we naturally have primary key constraints to ensure each product is represented by 1 unique row).  So the second branch waits for the whole of the first branch&#039;s transaction to commit!
&lt;/p&gt;&lt;p&gt;
When you&#039;re writing PL/SQL and you&#039;re thinking at the level of cursors and if-then-else statements, the problems outlined above may not seem obvious.  If you&#039;re purely a Java/C#/C++ programmer you may start thinking along the lines of &#034;well, perhaps we should reduce the size of our transaction&#034;, or &#034;perhaps we can ignore this row and come back to it&#034;.  These are also illustrations of thinking about the problem in the wrong way.
&lt;/p&gt;&lt;p&gt;
The correct approach is not to consider the input data as a succession of rows, but rather as a set of sets: identify the set of products you are importing, the set of customers, and the set of prices and quantities.  Logic can then be applied to each set as a whole: identify the union of new customers and the union of new products across all branches, and insert these once, up front.  The price and quantity information can then be inserted concurrently, since no two branches will compete to create the same product or customer row.
&lt;/p&gt;&lt;p&gt;
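The real work here belongs in SQL, but the shape of the idea can be sketched in a few lines of Java.  This is only an in-memory analogue under stated assumptions: the Row record, the newProducts method and the sample data are all hypothetical, and records need Java 16 or later.  It derives the set of new products across every branch in one pass, so that a single writer can insert them before any concurrent work starts.
&lt;/p&gt;&lt;pre&gt;
```java
import java.util.Arrays;

public class SetBasedImportSketch {
    // Hypothetical de-normalised input row: one purchase line from a branch file.
    public record Row(String branch, String customer, String product, int quantity) {}

    // The distinct products in the input that are not already known to the database.
    public static String[] newProducts(Row[] rows, String[] known) {
        var existing = Arrays.asList(known);
        return Arrays.stream(rows)
                .map(Row::product)
                .distinct()
                .filter(p -> !existing.contains(p))
                .sorted()
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        Row[] rows = {
            new Row("West London", "c1", "tea", 2),
            new Row("South Manchester", "c2", "tea", 1),
            new Row("South Manchester", "c2", "milk", 3),
        };
        // Phase 1, single writer: work out every new product once, across all branches.
        String[] toInsert = newProducts(rows, new String[] {"milk"});
        System.out.println(Arrays.toString(toInsert));
        // Phase 2 (not shown): each branch inserts its own price and quantity rows
        // concurrently, with no branch needing to create a product row.
    }
}
```
&lt;/pre&gt;&lt;p&gt;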
The overhead of the pre-processing required is quickly recouped by the excellent performance achievable once the problem has been reduced in such a way that the lion&#039;s share of the work is embarrassingly parallel.
&lt;/p&gt;&lt;p&gt;
The original problem here arose again due to a &lt;a href=&#034;http://www.codingthearchitecture.com/2008/01/08/the_clash_of_the_paradigms.html&#034;&gt;paradigm clash&lt;/a&gt;: procedural programming, with constructs such as those at your fingertips in languages like PL/SQL, encourages you to think procedurally about problems which are probably better solved using relational algebra, which ANSI SQL captures pretty well.  Of course, PL/SQL, T-SQL and the like will always have their place.  But if you consider yourself an iterative or object oriented programmer, and you find yourself writing stored procedures, the most elegant solutions will often be achieved if you approach the problem thinking in sets, not loops.
&lt;/p&gt;

        </description>
      
      
    
    
    
    <category>How do you define software architecture?</category>
    
    <comments>http://www.codingthearchitecture.com/2008/02/13/the_joy_of_sets.html#comments</comments>
    <guid isPermaLink="true">http://www.codingthearchitecture.com/2008/02/13/the_joy_of_sets.html</guid>
    <pubDate>Wed, 13 Feb 2008 17:30:00 GMT</pubDate>
  </item>
  
  <item>
    <title>JVM Lies: The OutOfMemory Myth</title>
    <link>http://www.codingthearchitecture.com/2008/01/14/jvm_lies_the_outofmemory_myth.html</link>
    
      
        <description>
          &lt;P&gt;
There are times when an OutOfMemoryError means exactly what it says.  Try adding new objects to an ArrayList in a while(true) loop and you&#039;ll see what I mean.
&lt;/P&gt;&lt;P&gt;
However, there are times when it doesn&#039;t.
&lt;/P&gt;&lt;P&gt;
Recently, when I saw a vital supporting application of our system throwing an OutOfMemoryError in production, my first instinct was to increase the &lt;code&gt;-Xmx&lt;/code&gt; switch from the existing 2GB.  Let&#039;s whack on an extra gig, why not.  That will give us at least 6 months until we start worrying about the logical 4GB limit of a 32-bit process&#039;s addressable space.
&lt;/P&gt;&lt;P&gt;
I expect I am not alone in having the knee-jerk reaction that any application&#039;s memory problems can be solved by cranking up the heap.  I blame James Gosling, or whoever decided that the JRE 1.1 JVM&#039;s heap should default to 64M.  Even at the start of my Java programming career in 1998 I remember quickly running out of heap space, and needing to look up what this &lt;a href=&#034;http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp&#034;&gt;non-standard&lt;/a&gt; &lt;code&gt;-Xmx&lt;/code&gt; switch did.  Increasing this value made these problems just disappear.
&lt;/P&gt;&lt;P&gt;
However, instead of doing the obvious and increasing the -Xmx, I added extra GC debugging output and attempted to replicate the problem.  We have plenty of spare memory on our hardware, so any time spent on such an obvious issue is arguably a waste: there was important business functionality I could be delivering instead of messing around with JVM switches.  Being at times more stubborn than is good for me, though, I insisted on understanding exactly what was going on.  In particular:
&lt;/P&gt;&lt;ol&gt;
&lt;li&gt;Why was similar behaviour not occurring in the test environment?&lt;BR/&gt;

	I am blessed with a test environment whose hardware and data volumes are comparable to production.  A rare treat, I appreciate, but an invaluable one for situations such as this.  Well, it turns out the answer to this question was straightforward: it was occurring there too.  The flaw was with our monitoring of this environment.  Abashed, I made a mental note to improve our application monitoring and moved on to question 2.
	&lt;/li&gt;
&lt;li&gt;Why were we running out of memory?&lt;BR/&gt;

	Data volumes increase in the system on a monthly basis, so the answer to this question may seem self-evident. Without correct monitoring and re-tuning, our JVMs are &lt;i&gt;expected&lt;/i&gt; to run out of memory.  This isn&#039;t necessarily an architectural flaw, it&#039;s simply about allocating the right amount of memory for the current data volumes.  However, I had to be sure it was the heap that we were running out of.
&lt;/li&gt;&lt;/ol&gt;
&lt;P&gt;
Depending on the flavour of JVM, an OutOfMemoryError can indicate a shortage of memory in one of several areas.  These broader concepts are common to generational GC algorithms across the major JVM vendors including Sun, IBM and BEA, although the specifics I refer to below relate to the &lt;a href=&#034;http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html&#034;&gt;Sun Hotspot GC model&lt;/a&gt;.
&lt;/P&gt;&lt;ul&gt;
&lt;li&gt;
The first is the tenured generation.  This is usually what I mean when I say &#034;the heap&#034;.  Memory is segmented into several generations, however it is when the tenured generation is full, and cannot be expanded any further, that the JVM considers itself OutOfMemory.
&lt;/li&gt;&lt;li&gt;
The second is the permanent generation.  This does not resize during the lifetime of the application, regardless of how much free space may exist in the rest of the heap, but remains at whatever it was originally set to (the default is typically 64MB, depending on the JVM version and mode).  Should this prove too small, then the JVM will throw an OOME even if there&#039;s plenty of heap left.  Adding the &lt;code&gt;-XX:+PrintHeapAtGC&lt;/code&gt; switch will tell you if this is the case.
&lt;/li&gt;&lt;li&gt;
The third possibility is your operating system is out of memory, e.g. you&#039;ve asked for a 2GB heap on a box with 1GB RAM and 512MB swap space (not a typical server, admittedly, but serves as an example).
&lt;/li&gt;&lt;/ul&gt;
&lt;P&gt;
In my case I was primarily investigating which of the first two scenarios above was occurring (I knew we had enough spare memory on the box itself), so I was somewhat surprised to find out it was neither.
&lt;/P&gt;
&lt;ul&gt;&lt;li&gt;
Another possibility is native components are hogging your 4GB ceiling.  Native code competes with the JVM to use the 4GB of addressable space in your application.  If these components are memory hungry, your app will be starved of addressable space, even if it hasn&#039;t actually used up all the heap you&#039;ve given it yet.  This may manifest itself during the workings of the Hotspot JIT compiler, which itself is a native component, as the Just In Time compiler uses some of your process&#039;s space to compile methods to native code at runtime.  Should these memory requirements push the addressable space required in the process above 4GB, then you get an OOME thrown which the 1.4 JVM logs as:
&lt;code&gt;
Exception in thread &#034;CompilerThread0&#034; java.lang.OutOfMemoryError: requested&lt;BR/&gt;
Exception in thread &#034;main&#034; java.lang.OutOfMemoryError: requested 32756 bytes for ChunkPool::allocate. Out of swap space?&lt;BR/&gt;
&lt;/code&gt;
&lt;/li&gt;&lt;/ul&gt;
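&lt;P&gt;
The JVM-managed areas described above can also be inspected from inside a running process.  What follows is a minimal sketch, assuming a Java 5 or later JVM with the standard java.lang.management API; the class name is hypothetical, and the pool names printed vary by vendor, version and collector.  Memory allocated by native libraries outside these pools will not show up here, which is one reason the heap figures can look healthy while the process is short of addressable space.
&lt;/P&gt;&lt;pre&gt;
```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class MemoryPoolReport {
    // One line per pool: name, type (heap or non-heap), bytes used and maximum.
    public static String describe(MemoryPoolMXBean pool) {
        MemoryUsage usage = pool.getUsage();
        if (usage == null) {
            return pool.getName() + " (pool no longer valid)";
        }
        long max = usage.getMax(); // -1 means the maximum is undefined
        return pool.getName() + " [" + pool.getType() + "] used=" + usage.getUsed()
                + " max=" + (max == -1 ? "undefined" : String.valueOf(max));
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(describe(pool));
        }
    }
}
```
&lt;/pre&gt;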
&lt;P&gt;
The process hadn&#039;t used all the space available to it when I saw this error -- the Java heap had plenty of room left unused.  However for addressing purposes this space was considered consumed.
&lt;/P&gt;&lt;P&gt;
So, what to do about the above error?  Increasing the heap allocation actually exacerbates this problem! It decreases the headroom the compiler, and other native components, have to play with.
&lt;/P&gt;&lt;P&gt;
So the solution to my problem was:&lt;/P&gt;
&lt;ol&gt;
	&lt;li&gt;&lt;i&gt;reduce&lt;/i&gt; the heap allocated to the JVM.
	&lt;/li&gt;&lt;li&gt;
	remove the memory leaks caused by &lt;a href=&#034;http://www.codingthearchitecture.com/2008/01/08/the_clash_of_the_paradigms.html&#034;&gt;native objects not being freed in a timely fashion&lt;/a&gt;.
	&lt;/li&gt;
&lt;/ol&gt;
&lt;P&gt;
Or just use a 64-bit JVM.
&lt;/P&gt;
        </description>
      
      
    
    
    
    <comments>http://www.codingthearchitecture.com/2008/01/14/jvm_lies_the_outofmemory_myth.html#comments</comments>
    <guid isPermaLink="true">http://www.codingthearchitecture.com/2008/01/14/jvm_lies_the_outofmemory_myth.html</guid>
    <pubDate>Mon, 14 Jan 2008 13:39:00 GMT</pubDate>
  </item>
  
  <item>
    <title>The Clash of the Paradigms</title>
    <link>http://www.codingthearchitecture.com/2008/01/08/the_clash_of_the_paradigms.html</link>
    
      
        <description>
          &lt;p&gt;
I recently came across a subtle anti-pattern which has caused me some pain.  This occurred in a successful application used extensively throughout a large financial corporation.  It is a library, which is implemented in C++, but usable from several different programming languages, including Java and C#.
&lt;/p&gt;&lt;p&gt;
The Java port is a thin layer of Java objects which are used as proxies making JNI calls to underlying C++ objects.
&lt;/p&gt;&lt;p&gt;
At first glance this is an understandable design: the vast majority of the API&#039;s source code is written in one language, and the thin layers used to port the API are presumably easy to maintain.
&lt;/p&gt;&lt;p&gt;
But it all goes to pot when you consider garbage collection.  How does a native object know that it is no longer needed?  Well, when there are no more references to the proxy, of course!  But there&#039;s the rub: the only way a proxy can implicitly determine it is not needed is through a finalize method*.  This method is &lt;a href=&#034;http://java.sun.com/javase/6/docs/api/java/lang/Object.html#finalize()&#034;&gt;clearly documented&lt;/a&gt; to offer no guarantees of when it is invoked; this is likely to be at the mercy of the garbage collector, which in itself implies that at best several potentially-infrequent garbage collection cycles are required before such methods are called.
&lt;/p&gt;&lt;p&gt;
Consider the case when you have many thin Java objects which each create enormous native data structures: the JVM may not invoke a GC cycle on the modest memory usage of the Java objects, and in the meantime zombie native components may have used up all the swap space available.
&lt;/p&gt;&lt;p&gt;
The only reasonable alternative is to expose a delete() method in the Java API which is to be explicitly called when the application knows that the native object is no longer required, which in turn frees the native memory used.  However, for Java programmers used to their memory-managed sandboxes this goes strongly against the grain.  The 3rd party native objects, so innocently wrapped up in Java proxies, quickly spread like a virus throughout large tracts of code.  The package in question also, for performance reasons, exposes objects at a relatively fine-grained level, further facilitating its own objects&#039; insinuation throughout the application.  What&#039;s left is a soup of objects, of which some are POJOs, and some hide native objects and must have their lifecycles carefully tracked to be cleaned up appropriately.  Get it wrong and free an object prematurely, and you core-dump the JVM and bring down your application spectacularly, irrevocably and catastrophically.  Eliminating memory leaks in your Java code has become a major task.
&lt;/p&gt;&lt;p&gt;
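To make the explicit-release idea concrete, here is a minimal sketch of such a proxy.  It is not the library&#039;s actual API: the class name is hypothetical, the long field merely stands in for a pointer to a native object, and the close() method plays the part of delete() so that the try-with-resources statement (Java 7 and later) can call it.  Guarding every call against a freed handle turns a would-be core dump into an ordinary exception.
&lt;/p&gt;&lt;pre&gt;
```java
public class NativeProxySketch implements AutoCloseable {
    // Stand-in for a pointer to the native object; 0 means already freed.
    private long handle;

    public NativeProxySketch(long handle) {
        this.handle = handle;
    }

    // Stand-in for a JNI call; refuses to touch a freed native object.
    public long value() {
        if (handle == 0) {
            throw new IllegalStateException("native object already freed");
        }
        return handle;
    }

    public boolean isFreed() {
        return handle == 0;
    }

    // Explicit release; safe to call more than once.
    @Override
    public void close() {
        if (handle != 0) {
            // A real wrapper would free the native memory here.
            handle = 0;
        }
    }

    public static void main(String[] args) {
        NativeProxySketch escaped;
        try (NativeProxySketch proxy = new NativeProxySketch(42L)) {
            System.out.println(proxy.value());
            escaped = proxy;
        }
        // The native object was released when the try block ended.
        System.out.println(escaped.isFreed());
    }
}
```
&lt;/pre&gt;&lt;p&gt;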
Admittedly, the impact of the library can be reduced by carefully considered abstractions in client code.  But I can&#039;t help but blame the designers of the API who enforce their paradigm of programming on all clients of their system.  What may be an excellent object-oriented API in C++ does not make a good API in Java.  The lesson here is for architects not only to consider the functionality and aesthetics of their API, but also the programming paradigm which suits their clients.
&lt;/p&gt;&lt;p&gt;
*Tony Printezis has published an &lt;a href=&#034;http://java.sun.com/developer/technicalArticles/javase/finalization/&#034;&gt;alternative for finalizers by using weak references&lt;/a&gt;.  As the author says, it involves implementing an algorithm, running on its own thread, duplicating functionality provided by the garbage collector, in application code.

&lt;/p&gt;
        </description>
      
      
    
    
    
    <category>How do you define software architecture?</category>
    
    <comments>http://www.codingthearchitecture.com/2008/01/08/the_clash_of_the_paradigms.html#comments</comments>
    <guid isPermaLink="true">http://www.codingthearchitecture.com/2008/01/08/the_clash_of_the_paradigms.html</guid>
    <pubDate>Tue, 08 Jan 2008 10:29:00 GMT</pubDate>
  </item>
  
  </channel>
</rss>

