Monday 12 January 2015

Java8 Sorting - Performance Pitfall

Java 8 brings all the goodness of lambdas to enable us to program using a declarative style. But is it really free? And should we be concerned about the price we have to pay for the new programming goodies?

Here's an example where we might have to worry.

Consider sorting instances of this simple class:
private static class MyComparableInt{
        private int a,b,c,d;

        public MyComparableInt(int i) {
            a = i%2;
            b = i%10;
            c = i%1000;
            d = i;
        }

        public int getA() { return a; }
        public int getB() { return b; }
        public int getC() { return c; }
        public int getD() { return d; }
}
We can sort using the new Java 8 declarative syntax as below:

List<MyComparableInt> mySortedComparableList = 
      myComparableList.stream()
       .sorted(Comparator.comparing(
         MyComparableInt::getA).thenComparing(
         MyComparableInt::getB).thenComparing(
         MyComparableInt::getC).thenComparing(
         MyComparableInt::getD))
       .collect(Collectors.toList());

Or we can sort the old way (still using lambdas) using this code:

List<MyComparableInt> mySortedComparableList = 
         myComparableList.stream()
           .sorted(MyComparableIntSorter.INSTANCE)
           .collect(Collectors.toList());

public enum MyComparableIntSorter implements 
     Comparator<MyComparableInt>{
        INSTANCE;

        @Override
        public int compare(MyComparableInt o1, MyComparableInt o2) {
            int comp = Integer.compare(o1.getA(), o2.getA());
            if(comp==0){
                comp = Integer.compare(o1.getB(), o2.getB());
                if(comp==0){
                    comp = Integer.compare(o1.getC(), o2.getC());
                    if(comp==0){
                        comp = Integer.compare(o1.getD(), o2.getD());
                    }
                }
            }
            return comp;
        }
 }
When I ran this test with 10m objects the sort took ~6.5s using the declarative 
syntax but only 1.5s using the old syntax.  That's almost 4 times as long!
So where is the time going? Presumably it's in the overhead for marshalling
between the thenComparing methods.
Interestingly,  if you try exactly the same test but replace int for String the 
times change as follows.  Again for 10m objects, using the new syntax takes
~11.5s whilst the old syntax takes ~7s.  Using String, when the marshalling is less
significant the new syntax takes only 1.5 times as long as the old syntax.

When I ran this in YourKit I ended up with a not so useful CPU analysis:

When I profiled this in FlightRecorder you can see a more useful analysis (because
it doesn't rely as much on safe points).

The time is spent firstly in calls to virtual methods through the lambdas and secondly converting ints to Integers.

In my follow up post I will be explaining, using a step by step optimisation, how the code matches the analysis in Flight Recorder.

In summary, whilst the new syntax looks great and is wonderfully expressive,
if you are worried about performance you should stick to the old syntax.
Once again it seems that there is no such thing as a free lunch!  


6 comments:

  1. Thanks for the article! Can you source of the tests somewhere so that we can review them and try them ourselves?

    ReplyDelete
    Replies
    1. Er. that was: Can you please upload the test source somewhere? Need... coffee....

      Delete
  2. Gareth
    Thanks for your feedback.
    I'm working on part 2 of this article which hopefully you will enjoy. After that I hope to release the code.

    ReplyDelete
  3. Espresso exceptions offer a way to method synchronous glitches for instance sections by means of absolutely nothing and out-of-range variety indexes, and this article provides major realistic rules on the employ. This content incorporates whenever to use Exceptions, if you neither of them catch or toss exceptions and (importantly) whenever to use standard Espresso exceptions. Half-___ (java order)

    ReplyDelete
  4. Your first code is bad in performance because you use Comparator.comparing instead of Comparator.comparingInt

    ReplyDelete