Friday 15 April 2016

JLBH Examples 3 - The Affects of Throughput on Latency

In this post:

  • A discussion about the effects of thoughput on latency
  • How use JLBH to measure TCP loopback
  • Adding probes to test both halves of the TCP round trip
  • Watching the effect of increasing throughput on latency
  • Understanding that you have to drop throughput to achieve good latencies at high percentiles.


In the post we saw the effects of accounting for coordinated omission or measuring the effect a delay to one iteration will have an affect on subsequent iterations.

Intuitively we understand that throughput will affect latency. It seems natural that as we raise throughput we will also raise latency. 

Going into a very crowded shop will affect how fast you can select and purchase your goods. On the other hand consider a very rarely frequented shop. It could be that in such a shop the shopkeeper is away from the till having a tea break and that your purchase will be delayed as you wait for him to put down his cup of tea and make his way to the counter where he can server you.

This is exactly what you find when running benchmarks and vary throughput.  

In general as you raise throughput latency increases but at some point when throughput drops below a certain threshold latency can increase as well.

The code below times a round trip TCP call over loopback.

We add two probes:

  • client2server - the amount of time taken to complete the first half of the round trip
  • server2client - the amount of time taken to complete the second half of the trip

These probes do not take into account co-ordinated omission - it is only the end to end time that takes coordinated omission into account.

Here's the code for the benchmark:


And here are the results when run at a throughput of 20,000 iterations / second:

Warm up complete (50000 iterations took 2.296s)
-------------------------------- BENCHMARK RESULTS (RUN 1) ---------Run time: 2.5s
Correcting for co-ordinated:true
Target throughput:20000/s = 1 message every 50us
End to End: (50,000)                            50/90 99/99.9 99.99 - worst was 34 / 2,950  19,400 / 20,450  20,450 - 20,450
client2server (50,000)                          50/90 99/99.9 99.99 - worst was 16 / 26  38 / 72  287 - 336
server2clientProbe (50,000)                     50/90 99/99.9 99.99 - worst was 16 / 27  40 / 76  319 - 901
OS Jitter (26,960)                              50/90 99/99.9 99.99 - worst was 9.0 / 16  44 / 1,340  10,220 - 11,800
--------------------------------------------------------------------
-------------------------------- BENCHMARK RESULTS (RUN 2) ---------
Run time: 2.5s
Correcting for co-ordinated:true
Target throughput:20000/s = 1 message every 50us
End to End: (50,000)                            50/90 99/99.9 99.99 - worst was 42 / 868  4,590 / 5,110  5,370 - 5,370
client2server (50,000)                          50/90 99/99.9 99.99 - worst was 20 / 27  38 / 92  573 - 2,560
server2clientProbe (50,000)                     50/90 99/99.9 99.99 - worst was 19 / 27  38 / 72  868 - 1,740
OS Jitter (13,314)                              50/90 99/99.9 99.99 - worst was 9.0 / 16  32 / 96  303 - 672
--------------------------------------------------------------------
-------------------------------- BENCHMARK RESULTS (RUN 3) ---------
Run time: 2.5s
Correcting for co-ordinated:true
Target throughput:20000/s = 1 message every 50us
End to End: (50,000)                            50/90 99/99.9 99.99 - worst was 34 / 152  999 / 2,160  2,290 - 2,290
client2server (50,000)                          50/90 99/99.9 99.99 - worst was 17 / 26  36 / 54  201 - 901
server2clientProbe (50,000)                     50/90 99/99.9 99.99 - worst was 16 / 25  36 / 50  225 - 1,740
OS Jitter (14,306)                              50/90 99/99.9 99.99 - worst was 9.0 / 15  23 / 44  160 - 184
---------------------------------------------------------------------------------------------------- SUMMARY (end to end)---------------
Percentile   run1         run2         run3      % Variation   var(log)
50:            33.79        41.98        33.79        13.91       
90:          2949.12       868.35       151.55        75.92       
99:         19398.66      4587.52       999.42        70.53       
99.9:       20447.23      5111.81      2162.69        47.62     99.99:      20447.23      5373.95      2293.76        47.24       
worst:      20447.23      5373.95      2293.76        47.24       
--------------------------------------------------------------------
-------------------------------- SUMMARY (client2server)------------
Percentile   run1         run2         run3      % Variation   
50:            16.13        19.97        16.90        10.81       
90:            26.11        27.14        26.11         2.55       
99:            37.89        37.89        35.84         3.67       
99.9:          71.68        92.16        54.27        31.76       
99.99:        286.72       573.44       200.70        55.32       
worst:        335.87      2555.90       901.12        55.04       
--------------------------------------------------------------------
-------------------------------- SUMMARY (server2clientProbe)-------
Percentile   run1         run2         run3      % Variation   
50:            16.13        18.94        16.13        10.43       
90:            27.14        27.14        25.09         5.16       
99:            39.94        37.89        35.84         3.67       
99.9:          75.78        71.68        50.18        22.22       
99.99:        319.49       868.35       225.28        65.55       
worst:        901.12      1736.70      1736.70         0.00       
--------------------------------------------------------------------

What should happen is that:

client2server + server2client ~= endToEnd

And this more of less what happens at the 50th percentile

Take the second run for the purposes of this demonstration:

19.97 + 18.94 ~= 41.98

If that was all you were measuring you might say that there would be no problem running 20k/s messages through my machine.

However my laptop clearly can't handle this volume and if we look at the second run again for the 90th percentile.

27.14 + 27.14 !~= 868.35

And as you move up the percentiles it just gets worse and worse...

If however I change the throughput to 5k messages / second I see this on the 90th percentile:

32.23 + 29.38 ~= 62.46

So we see that if you want to achieve low latencies at the high percentiles you have to drop your throughput to the correct level.

This is why it so important that we are able to vary throughputs with JLBH.

8 comments:

  1. It is very good articles with sample TCP code. Can you advice we have many thread hang state and required to restart application.

    Java Training in Chennai | JEE Training in Chennai | Spring Training in Chennai

    ReplyDelete
  2. That is a great comparison of the two engines. And thanks for the code. Your post was a huge help to me. Thanks a lot.

    ReplyDelete
  3. Torrentsformac
    A free download hub to activate, crack your softwares for windows and you mac. you can download keygen, generator and activators to crack your desired TriDef Crack

    ReplyDelete
  4. Dress up your own characters and gacha offer it for you! Gacha life download
    is a great game that advances imagination and narrating in kids.

    ReplyDelete
  5. I really like you words and appreciate your article post. Really Cool please keep writing.
    hitachi 1.5 ton 4 star inverter ac

    ReplyDelete