Tuesday, 22 December 2015

How long does it take the jvm to effect escape analysis? Maybe longer than you think.

This post looks at escape analysis, in particular how long it takes for the jvm to effect escape analysis in a running program. I make some observations but don't have all the explanation at this point.

By way of introduction let's take a detour to look at a little known and even less used flag (which we'll see is a good thing) in the jvm, -Xcomp

The behaviour for this flag is defined in the jvm documentation as:




-Xcomp





Forces compilation of methods on first invocation. By default, the Client VM (-client) performs 1,000 interpreted method invocations and the Server VM (-server) performs 10,000 interpreted method invocations to gather information for efficient compilation. Specifying the -Xcomp option disables interpreted method invocations to increase compilation performance at the expense of efficiency.


At first sight this seems to be an excellent option.  A shortcut to warming up the jvm through 10,000 cycles - we can get the code to compile straight away. Shouldn't we always enable this option by default?

But the documentation does warn that this will be 'at the expense of efficiency'.

The jvm learns about code behaviour in the 10,000 warmup cycles so that when it comes to compiling it compiles in the most efficient way possible.  Compiling the code right away will mean that yes the code is indeed compiled but that the compiled code may not be the most efficient. You can read more about it in this blogpost - but that's not really the subject of this post.

Something else that doesn't happen if you use -Xcomp is escape analysis. This is actually rather surprising as the jvm shouldn't need to learn about whether escape analysis is possible by running the program.  This should be evident by a static analysis of the code.

Have a look at this code (I was inspired by the ideas in this blog):


We need to make sure that the program runs without a gc (I suggest these flags):

-verbosegc -Xmx4g -Xms4g

When the program waits for input carry out a heap dump to see how many Optional object have been created. Then hit any key to resume the program.

To perform a heap dump first run jps to identify the pid of the program then run:

jmap -histo pid | head
  
Do this once without the -Xcomp flag and once with -Xcomp flag.

Without the -Xcomp flag:

After first iteration:


num     #instances         #bytes  class name
----------------------------------------------
   1:           644      123241360  [I
   2:        234496        3751936  java.util.Optional
   3:          6582         791968  [C
   4:          2717         540712  [B
   5:          4786         114864  java.lang.String
   6:           662          75144  java.lang.Class
   7:          1349          63912  [Ljava.lang.Object;

After second iteration:

num     #instances         #bytes  class name
----------------------------------------------
   1:           644      116687472  [I
   2:        644095       10305520  java.util.Optional
   3:          6583         792056  [C
   4:          2717         540712  [B
   5:          4787         114888  java.lang.String
   6:           662          75144  java.lang.Class
   7:          1349          63912  [Ljava.lang.Object;

All subsequent iterations are the same no further objects are created:

There's clearly escape analysis kicking in after 234k iterations - not sure why it should take so long, usually (for example with compiling code) 10k iterations is enough? Also in the second iteration it creates another ~400k objects before escape analysis kick in which is also a bit mysterious.

With the -Xcomp flag

After the first iteration:


num     #instances         #bytes  class name
----------------------------------------------
   1:           653      153880352  [I
   2:       1000001       16000016  java.util.Optional
   3:          7397         834744  [C
   4:          2717         540728  [B
   5:          5685         136440  java.lang.String
   6:           672          76208  java.lang.Class
   7:          1349          63912  [Ljava.lang.Object;

After the second iteration:


num     #instances         #bytes  class name
----------------------------------------------
   1:           654      159354896  [I
   2:       2000001       32000016  java.util.Optional
   3:          7398         834832  [C
   4:          2717         540728  [B
   5:          5686         136464  java.lang.String
   6:           672          76208  java.lang.Class
   7:          1349          63912  [Ljava.lang.Object;

After each iteration the number of Optional objects goes up by 1m.



Summary

  • -Xcomp is a switch that should almost certainly never be used in production. I can imagine some scenarios where you might want to play around with disabling the interpreter but those would be very specific edge cases. 
  • It seems to take at least 200K iteration for escape analysis to be effective.  So you need to allow longer than the 10k iterations for a full warm up.
  • There is also another phase where after escaping out objects it seems to need to do this again. This needs further understanding. 
  • If you slow the program down a bit by doing some work in between the calls to create the Optional the number of objects reduces.  For example I found that a call to Math.sin reduces the Optional objects by about 50%.

6 comments:

  1. Hello

    Did you try without tiered compilation?

    I don't see why, once compiled, EA behavior would change...

    Cheers

    ReplyDelete
    Replies
    1. Hello - always good to hear from you.

      Yes tried with -XX:-TieredCompilation and got exactly the same behaviour. In my last test 58k instances of Optional in the first 1m iterations and 292k instances of Optional in the second 1m - after that no more allocation - all a bit strange.

      Delete
  2. Hi
    -Xcomp triggers compilation at first run but I think compilation is done in the background. So some of it have time to run interpreted (?)
    -Xbatch makes the compilation synchronous. Used with -Xcomp we should not have interpreted code executed.
    Can you try?
    Cheers
    GG

    ReplyDelete
  3. Hi
    -Xcomp triggers compilation at first run but I think compilation is done in the background. So some of it have time to run interpreted (?)
    -Xbatch makes the compilation synchronous. Used with -Xcomp we should not have interpreted code executed.
    Can you try?
    Cheers
    GG

    ReplyDelete
    Replies
    1. Thanks for the suggestion but -Xbatch makes no difference to running either with _Xcomp or without.

      Delete
  4. This comment has been removed by a blog administrator.

    ReplyDelete