The Clash of the Paradigms

The consequences of going native

I recently came across a subtle anti-pattern which has caused me some pain. It occurred in a successful product used extensively throughout a large financial corporation: a library which is implemented in C++, but usable from several other programming languages, including Java and C#.

The Java port is a thin layer of Java objects which are used as proxies making JNI calls to underlying C++ objects.

At first glance this is an understandable design: the vast majority of the API's source code is written in one language, and the thin layers used to port the API are presumably easy to maintain.

But it all goes to pot when you consider garbage collection. How does a native object know that it is no longer needed? Well, when there are no more references to the proxy, of course! But there's the rub: the only way a proxy can implicitly determine that it is not needed is through a finalize method*. This method is clearly documented to offer no guarantee of when it is invoked; invocation is at the mercy of the garbage collector, which means that, at best, several potentially infrequent garbage collection cycles are required before such methods are called.

Consider the case when you have many thin Java objects which each create enormous native data structures: the JVM may not invoke a GC cycle on the modest memory usage of the Java objects, and in the meantime zombie native components may have used up all the swap space available.

The only reasonable alternative is to expose a delete() method in the Java API, to be called explicitly when the application knows that the native object is no longer required, which in turn frees the native memory used. However, for Java programmers used to their memory-managed sandboxes this goes strongly against the grain. The third-party native objects, so innocently wrapped up in Java proxies, quickly spread like a virus throughout large tracts of code. The package in question also, for performance reasons, exposes objects at a relatively fine-grained level, which makes it even easier for its objects to insinuate themselves throughout the application. What's left is a soup of objects, of which some are POJOs, and some hide native objects and must have their lifecycles carefully tracked to be cleaned up appropriately. Get it wrong and free an object prematurely, and you core-dump the JVM and bring down your application spectacularly, irrevocably and catastrophically. Eliminating memory leaks in your Java code has become a major task.
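As a rough illustration, here is a minimal sketch of what a proxy with an explicit delete() might look like. It is not the library's actual API: NativeProxy, its handle() accessor and FakeNativeHeap are hypothetical names, and FakeNativeHeap merely stands in for the C++ side so that the example runs without JNI. The sketch makes delete() idempotent and rejects use after deletion, which turns a premature free into a Java exception rather than a crash, at least for calls that go through the proxy.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the C++ side (an assumption); a real port would use JNI native methods. */
final class FakeNativeHeap {
    private static final Map<Long, Integer> LIVE = new HashMap<>();
    private static long nextHandle = 1;

    static synchronized long allocate(int bytes) {
        long handle = nextHandle++;
        LIVE.put(handle, bytes);
        return handle;
    }

    static synchronized void free(long handle) {
        if (LIVE.remove(handle) == null) {
            // In real native code a double free would likely crash the process.
            throw new IllegalStateException("double free of handle " + handle);
        }
    }

    static synchronized int liveCount() {
        return LIVE.size();
    }
}

/** Hypothetical thin proxy whose native memory must be released explicitly. */
final class NativeProxy implements AutoCloseable {
    private long handle; // 0 means "already deleted"

    NativeProxy(int bytes) {
        handle = FakeNativeHeap.allocate(bytes);
    }

    /** Returns the native handle, refusing to hand out a freed one. */
    long handle() {
        if (handle == 0) {
            throw new IllegalStateException("proxy used after delete()");
        }
        return handle;
    }

    /** Frees the native memory; calling it more than once is harmless. */
    void delete() {
        if (handle != 0) {
            FakeNativeHeap.free(handle);
            handle = 0;
        }
    }

    @Override
    public void close() {
        delete();
    }
}
```

Because the sketch also implements AutoCloseable, a caller could write try (NativeProxy p = new NativeProxy(1024)) { ... } and let the compiler generate the cleanup, but that only helps when the object's lifetime fits inside a single block.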

Admittedly, the impact of the library can be reduced by carefully considered abstractions in client code. But I can't help but blame the designers of the API who enforce their paradigm of programming on all clients of their system. What may be an excellent object-oriented API in C++ does not make a good API in Java. The lesson here is for architects not only to consider the functionality and aesthetics of their API, but also the programming paradigm which suits their clients.

*Tony Printezis has published an alternative to finalizers which uses weak references. As the author says, it involves implementing, in application code, an algorithm which runs on its own thread and duplicates functionality provided by the garbage collector.
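To give a flavour of the reference-based approach, here is a minimal sketch using phantom references and a reference queue from java.lang.ref. It is not the published algorithm, only an illustration of the general idea; HandleReaper, register(), drain() and startDaemon() are hypothetical names, and the nativeFree callback stands in for whatever JNI call would release the native object.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongConsumer;

/** Hypothetical helper that frees native handles once their proxies are unreachable. */
final class HandleReaper {
    /** Remembers the native handle without keeping the proxy itself alive. */
    private static final class HandleRef extends PhantomReference<Object> {
        final long handle;

        HandleRef(Object proxy, long handle, ReferenceQueue<Object> queue) {
            super(proxy, queue);
            this.handle = handle;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The references must stay strongly reachable, or they are collected too.
    private final Set<HandleRef> pending = ConcurrentHashMap.newKeySet();
    private final LongConsumer nativeFree;

    HandleReaper(LongConsumer nativeFree) {
        this.nativeFree = nativeFree;
    }

    void register(Object proxy, long handle) {
        pending.add(new HandleRef(proxy, handle, queue));
    }

    int pendingCount() {
        return pending.size();
    }

    private void release(Reference<?> ref) {
        HandleRef handleRef = (HandleRef) ref;
        if (pending.remove(handleRef)) {
            nativeFree.accept(handleRef.handle);
        }
    }

    /** Frees whatever the collector has enqueued so far; returns how many. */
    int drain() {
        int released = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            release(ref);
            released++;
        }
        return released;
    }

    /** Starts the dedicated thread that blocks until references are enqueued. */
    Thread startDaemon() {
        Thread thread = new Thread(() -> {
            try {
                while (true) {
                    release(queue.remove());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "handle-reaper");
        thread.setDaemon(true);
        thread.start();
        return thread;
    }
}
```

Note that this still depends on the collector noticing that a proxy is unreachable, so it does nothing for the case where small Java proxies pin large native structures and no collection is triggered.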

About the author

Kenneth Roper is a development team leader at a tier-1 investment bank. He is interested in applications with low-latency requirements or large memory footprints. He spends a lot of time reading garbage collection logs and snow reports.

E-mail : kenneth.roper at codingthearchitecture.com


Re: The Clash of the Paradigms

This is a common anti-pattern in finance. Often a pricing library is written in C/C++ for 'speed', because the original developer/quant sees that a single calculation runs faster on their local machine when written in C. What they fail to realise is that when the library runs hundreds of thousands of calculations on large, multiprocessor boxes, memory management and efficient, easy threading become much more important than, for example, a 10% speedup in a loop.

When this kind of scaling is taken into account, well-written Java or C# code tends to out-perform C++.

A representative reference architecture and appropriate metrics would help in these cases!

Re: The Clash of the Paradigms

How does SWT handle this? Isn't that what SWT is, JNI wrappers to C on the OS? Or am I just exposing my ignorance?

Re: The Clash of the Paradigms

SWT (and AWT, and therefore Swing, for that matter) addresses this problem via the semantics of the API. This fits in quite well with the way people reason about GUI programming: you display components by adding them to a top-level container, and when you are finished with that container you dispose() of it, which in turn frees the resources used by the sub-components. However, if your library doesn't support a logical dependency hierarchy (such as containers and their sub-components), then you need to think harder about it. One idea would be to enforce such a dependency hierarchy, e.g. through the use of a contrived Context object, which tracks associated components and releases them all when complete. There's probably a design pattern describing this idea but its name escapes me for the time being.
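A minimal sketch of such a Context object might look like the following. NativeContext, track() and dispose() are hypothetical names chosen for illustration, and each Runnable stands in for whatever call would actually free the native object.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical context that owns every native-backed object created under it. */
final class NativeContext implements AutoCloseable {
    private final Deque<Runnable> releasers = new ArrayDeque<>();
    private boolean disposed;

    /** Registers a component together with the action that frees its native side. */
    <T> T track(T component, Runnable releaser) {
        if (disposed) {
            throw new IllegalStateException("context already disposed");
        }
        releasers.push(releaser);
        return component;
    }

    /** Releases everything in reverse order of registration; safe to call twice. */
    void dispose() {
        disposed = true;
        while (!releasers.isEmpty()) {
            releasers.pop().run();
        }
    }

    @Override
    public void close() {
        dispose();
    }
}
```

Releasing in reverse order is a design choice borrowed from the container analogy: objects created later may depend on ones created earlier, so they go first.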

Re: The Clash of the Paradigms

Well, if you consider garbage collection a cross-cutting concern, you could try to abstract it out with AOP, so that only your aspects make those delete() calls.

Re: The Clash of the Paradigms

To me that sounds like a terrible idea, but then again I'm no fan of aspects at the best of times. My first point would be: is it even possible to set up a pointcut which detects when an object is no longer referenced? I suspect not. If you were to re-design your usage of the API so that every native object has its complete lifecycle within a well-defined method call, then you would have cracked the design problem anyway, and whether you free these objects via aspects or in-line in the business code would be a separate question. There's more weight to your argument in this case.
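Scoping a native object to a single method call could be sketched roughly as follows, here written in-line rather than with aspects. ScopedBuffer and its with() method are hypothetical, and the counter merely simulates native allocations so that the example runs without JNI.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical native-backed object that only exists inside a call to with(). */
final class ScopedBuffer {
    // Simulated count of live native allocations (an assumption for the sketch).
    static final AtomicInteger LIVE = new AtomicInteger();

    private boolean released;

    private ScopedBuffer() {
        LIVE.incrementAndGet();
    }

    int size() {
        if (released) {
            throw new IllegalStateException("buffer escaped its scope");
        }
        return 42; // placeholder for a value read from native memory
    }

    private void release() {
        released = true;
        LIVE.decrementAndGet();
    }

    /** Creates a buffer, lends it to the caller's code, and always frees it afterwards. */
    static <R> R with(Function<ScopedBuffer, R> body) {
        ScopedBuffer buffer = new ScopedBuffer();
        try {
            return body.apply(buffer);
        } finally {
            buffer.release();
        }
    }
}
```

A caller would write something like ScopedBuffer.with(b -> b.size() * 2) and never see a delete() call; the private constructor means the only way to obtain a buffer is through the scope that frees it.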
