Twitter

Thursday, March 11, 2010

A peek into JDK7 - ForkJoinTask example (RecursiveAction example, Forkjoinpool example)

Consider tasks like sorting an array, doing a complex math on each and every element in an array. Eg. we want to increment by 1 all the elements in the array {0,1,2,3,4,5,6,7,8,9}.

The simplest way is to loop over the entire array and do array[i]=array[i]+1. However this will run in a single thread.

But what if we can take advantage of multi-core CPUs, i.e. break the array into two halves and give it to two threads. So that the first thread operates on the left-half (thereby modifying the array entries to {1,2,3,4,5,......}) of the array whereas the second thread operates on the second-half of the array (thereby modifying the array entries to {......,6,7,8,9,10}).

The ....s means that the corresponding thread doesn't know what is there. It doesn't have to care. It is not part of its job!

This is where the JDK7's ForkJoinTask comes into the play. We give a complex task to be executed. Along with that we also have to specify a threshold. If the task's size is greater than the threshold then the task divides itself and fork()s them and wait for them to finish by join()ing. Hence the name ForkJoinTask. There are two implementation of ForkJoinTask - RecursiveAction and RecursiveTask.

Here is an example. The applyAlgorithm() is the CPU intensive method where each element in the array is modified. When the array is bigger than 5000 (threshold), then the array is divided into two and the two new arrays are handled in parallel by the threads available in the ForkJoinPool.

Following are the results from 2 different machines. One on a 16 core machine and another on a dual core machine. In both the cases the parallel execution is well ahead the single threaded numbers.

You can download the java code here.

1. On a 16 core machine
myServer $ java -cp test/jsr166y.jar:. ForkJoinAlgoritmicTask
Number of processor available: 16
Array size: 10000000
Treshhold: 5000
Number of runs: 5
 
Parallel processing time: 198
Parallel processing time: 69
Parallel processing time: 64
Parallel processing time: 61
Parallel processing time: 59

Number of steals: 579

Sequential processing time: 438
Sequential processing time: 437
Sequential processing time: 436
Sequential processing time: 436
Sequential processing time: 437

2. On a 2 core machine
muruga-Study$java -cp jsr166y.jar:. -Xms1G -Xmx1G ForkJoinAlgoritmicTask
Number of processor available: 2
Array size: 10000000
Treshhold: 5000
Number of runs: 5

Parallel processing time: 227
Parallel processing time: 206
Parallel processing time: 226
Parallel processing time: 203
Parallel processing time: 208
Number of steals: 12

Sequential processing time: 385
Sequential processing time: 385
Sequential processing time: 385
Sequential processing time: 385
Sequential processing time: 385

No comments:

Post a Comment