
Wednesday, March 31, 2010

Java - Cache (frequently used immutable objects) using ConcurrentHashMap.

One of the better ways to reduce the impact of garbage collection in a Java application is to create fewer new objects, which reduces the amount of garbage produced.

Let us take the example of a university admission application. There are two possible titles (Master and Bachelor) and two possible majors (Computer and Electrical). Let us call the combination of title and major a Degree.

Assume that thousands of applications are submitted every day. Each application is for one particular Degree, e.g. Master-Computer, Master-Electrical, Bachelor-Computer...

Instead of creating a new Degree object for each and every application, we can cache the Degree objects. When a particular Degree is requested, look for it in the cache. If the cache doesn't contain it, create it, put it in the cache and then return it. Since the cache is built on ConcurrentHashMap, it is also thread safe, i.e. even when more than one thread runs the Degree.valueOf() method for the same "title" and "major" strings, all of those threads end up using one and the same cached Degree instance.
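The lookup described above can be sketched roughly as follows. This is not the original code of the post, only a minimal illustration: the "title_major" string key, the field names and the DegreeCacheDemo class are assumptions made for the example. It uses the get/putIfAbsent idiom, so if two threads race on a missing entry, the losing thread may briefly construct a duplicate that is thrown away, but both threads get back the same cached instance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Immutable value object; instances are only handed out through valueOf().
final class Degree {
    // Cache keyed by "title_major" (assumption: '_' never occurs inside a title or major).
    private static final ConcurrentMap<String, Degree> CACHE =
            new ConcurrentHashMap<String, Degree>();

    private final String title;
    private final String major;

    private Degree(String title, String major) {
        this.title = title;
        this.major = major;
    }

    static Degree valueOf(String title, String major) {
        String key = title + "_" + major;
        Degree cached = CACHE.get(key);
        if (cached != null) {
            return cached; // fast path: the instance already exists
        }
        Degree created = new Degree(title, major);
        // putIfAbsent returns the previous value if another thread won the race.
        Degree previous = CACHE.putIfAbsent(key, created);
        return previous != null ? previous : created;
    }

    @Override
    public String toString() {
        return title + "_" + major;
    }
}

class DegreeCacheDemo {
    public static void main(String[] args) {
        Degree a = Degree.valueOf("Master", "Computer");
        Degree b = Degree.valueOf("Master", "Computer");
        // Both calls return the same cached object, so == is true.
        System.out.println(a + " same instance? " + (a == b));
    }
}
```

Note that building the string key allocates a small object on every call; a nested map (title first, then major) would avoid that at the cost of a little more code. On Java 8 and later, ConcurrentHashMap.computeIfAbsent() can replace the get/putIfAbsent pair and applies the creating function at most once per key.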

It is clear that we produce less garbage with caching. But there is another advantage: we can check whether two Degree objects are equal with reference equality (==) rather than the equals() method, i.e. we can write Master_Computer==Master_Computer_ANOTHER instead of Master_Computer.equals(Master_Computer_ANOTHER). On my machine == is 9 times faster than equals(), which saves some CPU cycles.

Have a look at the following code to see how to build a cache of objects (Degree) that have two String properties. Complete code here.

The main method is Degree.valueOf(), where all the caching is done.

This is just an example. The same technique can be used in telco application servers running under loads of thousands of TPS. E.g. consider a header/value type of protocol: instead of creating a particular header thousands of times per second, we can reuse the existing cached header, thereby freeing CPU and RAM for other useful processing.



......

        Degree MASTER_COMPUTER = Degree.valueOf(MASTER, COMPUTER);
        Degree MASTER_ELECTRICAL = Degree.valueOf(MASTER, ELECTRICAL);
        Degree BACHELOR_COMPUTER = Degree.valueOf(BACHELOR, COMPUTER);
        Degree BACHELOR_ELECTRICAL = Degree.valueOf(BACHELOR, ELECTRICAL);
......

        // we ask the cache for values that were already created by the lines above.
        // Therefore it returns the existing instances; nothing is created anew.
        System.out.println("\nNo more new constructions for existing entries...");
        Degree MASTER_COMPUTER_1 = Degree.valueOf(MASTER, COMPUTER);

......
        // now ask for something that is not there; the cache creates it and stores it
        System.out.println("\nNew constructions for non-existing entries...");
        Degree.valueOf("Phd", "Computer");

......    
        // one more advantage of caching: we can compare the references of two objects instead of calling equals()
        // this is because all requests to Degree.valueOf(MASTER, COMPUTER) always return the very same object
        System.out.println("\nFaster equality check...");

        

Output
New Degree object: Master_Computer
New Degree object: Master_Electrical
New Degree object: Bachelor_Computer
New Degree object: Bachelor_Electrical

No more new constructions for existing entries...
Returning existing instance of Master_Computer

New constructions for non-existing entries...
New Degree object: Phd_Computer

Faster equality check...
Checked with equals().
Checked with reference equality.

Thursday, March 11, 2010

A peek into JDK7 - ForkJoinTask example (RecursiveAction example, ForkJoinPool example)

Consider tasks like sorting an array or doing complex math on each and every element of an array. E.g. we want to increment by 1 all the elements of the array {0,1,2,3,4,5,6,7,8,9}.

The simplest way is to loop over the entire array and do array[i]=array[i]+1. However, this runs in a single thread.

But what if we could take advantage of multi-core CPUs, i.e. break the array into two halves and give them to two threads? The first thread operates on the left half of the array (thereby modifying the array entries to {1,2,3,4,5,......}), whereas the second thread operates on the right half (thereby modifying the array entries to {......,6,7,8,9,10}).

The dots (......) stand for the part of the array that the corresponding thread does not know about. It doesn't have to care about it. It is not part of its job!

This is where JDK7's ForkJoinTask comes into play. We give it a complex task to execute, and along with that we specify a threshold. If the task's size is greater than the threshold, the task divides itself into subtasks, fork()s them and waits for them to finish by join()ing them. Hence the name ForkJoinTask. ForkJoinTask has two subclasses: RecursiveAction (which returns no result) and RecursiveTask (which returns one).

Here is an example. applyAlgorithm() is the CPU-intensive method in which each element of the array is modified. When the array is bigger than the threshold (5000), it is divided into two halves, which are handled in parallel by the threads available in the ForkJoinPool.
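The divide-and-fork step can be sketched as below. This is not the benchmark code of this post, only a minimal illustration of a RecursiveAction: the class names IncrementTask and IncrementDemo, the index-range splitting and the simple "+1" work are assumptions made for the example. It also uses the java.util.concurrent classes that ship with JDK7 and later instead of the jsr166y.jar preview used in the runs below.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Increments every element in array[from, to) by 1, splitting large ranges in two.
class IncrementTask extends RecursiveAction {
    private static final int THRESHOLD = 5000; // same threshold value as in the text

    private final int[] array;
    private final int from; // inclusive
    private final int to;   // exclusive

    IncrementTask(int[] array, int from, int to) {
        this.array = array;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: do the work directly in the current thread.
            for (int i = from; i < to; i++) {
                array[i] = array[i] + 1;
            }
            return;
        }
        // Too big: split into two halves that touch disjoint index ranges.
        int mid = (from + to) >>> 1;
        IncrementTask left = new IncrementTask(array, from, mid);
        IncrementTask right = new IncrementTask(array, mid, to);
        left.fork();     // schedule the left half asynchronously
        right.compute(); // process the right half in this thread
        left.join();     // wait for the left half to finish
    }
}

class IncrementDemo {
    public static void main(String[] args) {
        int[] data = new int[20000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        ForkJoinPool pool = new ForkJoinPool(); // defaults to one worker per available processor
        pool.invoke(new IncrementTask(data, 0, data.length));
        pool.shutdown();
        System.out.println(data[0] + " ... " + data[data.length - 1]); // prints "1 ... 20000"
    }
}
```

Because the two halves never overlap, the subtasks need no locking; the threshold only controls how finely the work is split.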

The following are the results from two different machines: one with 16 cores and the other with 2 cores. In both cases the parallel execution is well ahead of the single-threaded numbers.

You can download the Java code here.

1. On a 16 core machine
myServer $ java -cp test/jsr166y.jar:. ForkJoinAlgoritmicTask
Number of processor available: 16
Array size: 10000000
Treshhold: 5000
Number of runs: 5
 
Parallel processing time: 198
Parallel processing time: 69
Parallel processing time: 64
Parallel processing time: 61
Parallel processing time: 59

Number of steals: 579

Sequential processing time: 438
Sequential processing time: 437
Sequential processing time: 436
Sequential processing time: 436
Sequential processing time: 437

2. On a 2 core machine
muruga-Study$java -cp jsr166y.jar:. -Xms1G -Xmx1G ForkJoinAlgoritmicTask
Number of processor available: 2
Array size: 10000000
Treshhold: 5000
Number of runs: 5

Parallel processing time: 227
Parallel processing time: 206
Parallel processing time: 226
Parallel processing time: 203
Parallel processing time: 208
Number of steals: 12

Sequential processing time: 385
Sequential processing time: 385
Sequential processing time: 385
Sequential processing time: 385
Sequential processing time: 385

Wednesday, March 10, 2010

Adapting the NetBeans default license template.

After participating in the JavaEE6 codecamp, I somehow fell in love with NetBeans.


But whenever I created a new Java file, the editor kept adding the following default license template to the file.

/*
* To change this template, choose Tools | Templates
* and open the template in the editor.
*/

If you would like to avoid this, go to "Tools->Templates". The "Template Manager" window will pop up; there, expand the "Licenses" folder, select "Default License" and click the "Open in Editor" button (at the bottom).

There you can customize the license text, or you can just delete the contents.

Hope this helps someone.