Monday 18 April 2016

JLBH Examples 4 - Benchmarking QuickFix vs ChronicleFix

In this post:

  • Using JLBH to test QuickFIX
  • Observing how QuickFIX latencies degrade through the percentiles
  • Comparing QuickFIX with Chronicle-FIX


As mentioned in the Introduction to JLBH, the primary reason JLBH was created was to measure the Chronicle-FIX engine.

We use all the features of JLBH, particularly the throughput lever and the accounting for coordinated omission, to get some realistic timings for QuickFIX.
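
To make the coordinated omission point concrete, here is a minimal, self-contained sketch. It is not JLBH's implementation, and the class and method names are hypothetical. It assumes a single-threaded benchmark in which a slow task delays the tasks queued behind it. At a given throughput every task has an intended start time, and measuring from that intended time (rather than from whenever the task actually got to start) keeps the queueing delay in the results.

```java
import java.util.Arrays;

// Hypothetical illustration only: simulates service times instead of timing real work.
final class CoordinatedOmissionSketch {

    /** Latencies measured from each task's actual start: queueing delay is hidden. */
    static long[] naiveLatencies(long[] serviceTimesNs) {
        return serviceTimesNs.clone();
    }

    /**
     * Latencies measured from the intended start time implied by the target throughput.
     * A slow task delays the ones behind it, and that delay is charged to them.
     */
    static long[] correctedLatencies(long[] serviceTimesNs, int throughputPerSec) {
        long intervalNs = 1_000_000_000L / throughputPerSec;
        long[] latencies = new long[serviceTimesNs.length];
        long previousEndNs = 0;
        for (int i = 0; i < serviceTimesNs.length; i++) {
            long intendedStartNs = i * intervalNs;
            // A task cannot start before the previous one has finished.
            long actualStartNs = Math.max(intendedStartNs, previousEndNs);
            long endNs = actualStartNs + serviceTimesNs[i];
            latencies[i] = endNs - intendedStartNs;
            previousEndNs = endNs;
        }
        return latencies;
    }

    public static void main(String[] args) {
        // 10,000/s means one task every 100us; the second task stalls for 1ms.
        long[] serviceTimesNs = {20_000, 1_000_000, 20_000, 20_000};
        System.out.println("naive:     " + Arrays.toString(naiveLatencies(serviceTimesNs)));
        System.out.println("corrected: " + Arrays.toString(correctedLatencies(serviceTimesNs, 10_000)));
    }
}
```

In the naive view only one of the four tasks looks slow. In the corrected view the two tasks stuck behind the stall are also reported as slow, which is what a caller sending at 10,000/s would actually experience.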

Later in this article we will look at some results for Chronicle-FIX, but first let's look at benchmarking QuickFIX, the open source implementation of a FIX engine.

This is the scenario we will be benchmarking:
  • The client creates a NewOrderSingle which is then passed to the server.  
  • The server parses the NewOrderSingle 
  • The server creates an ExecutionReport which is sent back to the client.
  • The client receives the ExecutionReport
End to end time is measured from when the client starts creating the NewOrderSingle to when the client receives the ExecutionReport. 

Note: We need to keep the start time that the benchmark was called with right through the program. In order to do this we use a trick and store the start time in the ClOrdID tag.
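
A minimal sketch of that trick, without any FIX engine involved. The class and method names are hypothetical, and it assumes the server copies the ClOrdID (tag 11) from the NewOrderSingle into the ExecutionReport unchanged, so the client can read the start time back when the report arrives.

```java
// Hypothetical illustration only: no real FIX messages are built or parsed here.
final class ClOrdIdTimestampSketch {

    /** FIX tag number for ClOrdID. */
    static final int CL_ORD_ID_TAG = 11;

    /** Client, before sending: encode the benchmark start time as the ClOrdID value. */
    static String encodeStartTime(long startTimeNs) {
        return Long.toString(startTimeNs);
    }

    /** Client, on receiving the ExecutionReport: recover the start time and compute the latency. */
    static long endToEndNs(String echoedClOrdId, long receivedAtNs) {
        return receivedAtNs - Long.parseLong(echoedClOrdId);
    }

    public static void main(String[] args) {
        long startTimeNs = System.nanoTime();
        String clOrdId = encodeStartTime(startTimeNs);

        // Stand-in for the round trip: the value read from the ExecutionReport
        // is assumed to be the same string that was sent in the NewOrderSingle.
        String echoedClOrdId = clOrdId;

        long latencyNs = endToEndNs(echoedClOrdId, System.nanoTime());
        System.out.println(CL_ORD_ID_TAG + "=" + clOrdId + " end-to-end ns: " + latencyNs);
    }
}
```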

If you want to get the benchmark running on your servers you should clone this GitHub repo; all jars and configuration files are set up there.

For the sake of this post here's the code for the benchmark.



These are the results that I see running on my server, an Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz. All latencies are in microseconds.

At a throughput of 2,000/s


Percentile   run1         run2         run3      % Variation
50:           270.34       270.34       233.47         9.52
90:           352.26       335.87      1867.78        75.25
99:          6684.67      4849.66     69206.02        89.84
99.9:       13369.34     12845.06    163577.86        88.67
99.99:      81788.93     20447.23    163577.86        82.35
worst:     111149.06     98566.14    163577.86        30.54
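
The post does not define the % Variation column. The figures in these tables are consistent with the formula below applied to run2 and run3 only, with run1 left out. This is inferred from the numbers rather than taken from a documented definition, so treat it as an assumption; the class name is hypothetical.

```java
// Hypothetical illustration only: the formula is inferred from the published tables.
final class PercentVariationSketch {

    /** Variation across all runs except the first: 100 * (max - min) / (max + min / 2). */
    static double percentVariation(double... runs) {
        if (runs.length < 2) {
            throw new IllegalArgumentException("need at least two runs");
        }
        double max = Double.NEGATIVE_INFINITY;
        double min = Double.POSITIVE_INFINITY;
        for (int i = 1; i < runs.length; i++) { // skip the first run
            max = Math.max(max, runs[i]);
            min = Math.min(min, runs[i]);
        }
        return 100 * (max - min) / (max + min / 2);
    }

    public static void main(String[] args) {
        // 50th and 90th percentile rows of the 2,000/s table above.
        System.out.println(percentVariation(270.34, 270.34, 233.47));
        System.out.println(percentVariation(352.26, 335.87, 1867.78));
    }
}
```

Because the table values are themselves rounded to two decimal places, the recomputed figures can differ from the published ones in the last digit.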

At a throughput of 10,000/s


Percentile   run1         run2         run3      % Variation
50:           184.32       176.13       176.13         0.00 
90:           573.44       270.34       249.86         5.18 
99:         19398.66      2686.98      5111.81        37.56 
99.9:       28835.84      7733.25      7995.39         2.21 
99.99:      30932.99      9699.33      9175.04         3.67 
worst:      30932.99      9699.33      9175.04         3.67 

The median (50th percentile) here is ~200us, but as you go up through the percentiles the latencies really begin to degrade.

A lot of this is due to the amount of garbage that is being created! You can see this by running the benchmark with the JVM flag -verbosegc.
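
If you would rather check this from inside the program than read GC logs, a rough alternative is to compare the JVM's collection counts before and after a workload. This sketch uses the standard java.lang.management API; the class name and the string-building workload are hypothetical stand-ins and are not taken from the benchmark.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: the workload is a stand-in, not the FIX benchmark.
final class GcCountSketch {

    /** Sum of collection counts over all collectors the JVM reports. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Builds and discards short-lived strings and byte arrays, returning a checksum. */
    static long allocateGarbage(int messages) {
        long checksum = 0;
        for (int i = 0; i < messages; i++) {
            String message = "35=D|11=" + i + "|55=EURUSD|";
            byte[] bytes = message.getBytes(StandardCharsets.US_ASCII);
            checksum += bytes.length;
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalGcCount();
        long checksum = allocateGarbage(5_000_000);
        long after = totalGcCount();
        System.out.println("checksum " + checksum + ", collections during run: " + (after - before));
    }
}
```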

Note that the latencies for 10k/s are slightly better than at 2k/s. This is rather counterintuitive, as we would expect latencies to increase as throughput increases, as they do in the 50k run below where the latencies are much worse. The explanation for the 10k / 2k discrepancy is that as throughput increases the CPU caches and other resources can stay hotter and therefore, up to a point, latencies actually get better as throughput increases. This is another reason why it is so important to measure your latencies at a realistic throughput for your application.

In fact, when you up the throughput to 50,000/s even your 90th percentile (1 in every 10 iterations) is completely blown and you end up with multi-millisecond delays.

At a throughput of 50,000/s

Percentile   run1         run2         run3      % Variation   var(log)
50:           176.13       176.13       176.13         0.00       11.82
90:         12845.06     29884.42      3604.48        82.94       21.01
99:         34603.01     94371.84     17301.50        74.81       25.26
99.9:       42991.62     98566.14     25690.11        65.41       25.84
99.99:      45088.77     98566.14     27787.26        62.94       25.93
worst:      45088.77     98566.14     27787.26        62.94       25.93

The issues here are not just the median time (assuming that ~200us is too slow for you) but, even more worryingly, the way the numbers degrade as throughput is raised and as you study the higher percentiles.

Let's compare this with Chronicle-FIX. The test is run for exactly the same scenario and on the same machine.

The results look like this:

At a throughput of 2,000/s

Percentile   run1         run2         run3      % Variation   
50:            16.90        16.90        16.90       0.00    
90:            18.94        18.94        18.94       0.00    
99:            26.11        30.21        23.04       17.18    
99.9:          35.84        39.94        33.79       10.81    
99.99:        540.67       671.74       401.41       65.41    
worst:        638.98      1081.34       606.21       61.59    

At a throughput of 10,000/s

Percentile   run1         run2         run3      % Variation   
50:            16.90        16.90        16.13         3.08     
90:            18.94        18.94        18.94         0.00     
99:            26.11        22.02        20.99         3.15     
99.9:          88.06        33.79        83.97        49.75     
99.99:        999.42       167.94       802.82        71.59     
worst:       1146.88       249.86       966.66        65.67

At a throughput of 50,000/s


Percentile   run1         run2         run3      % Variation  
50:            15.62        15.10        15.62         2.21     
90:            17.92        16.90        16.90         0.00        
99:            22.02        30.21        29.18         2.29        
99.9:         120.83       352.26        33.79        86.27       
99.99:        335.87       802.82        96.26        83.03       
worst:        450.56       901.12       151.55        76.73

The Chronicle-FIX median is around 16us, which is 12 times faster than QuickFIX.

But it's more than that, because most of that time is in the TCP round trip. When you measure the TCP time (see the last post, JLBH Examples 3 - The Affects of Throughput on Latency), it turns out to be ~10us.

Therefore, if you deduct the TCP times you get:
  • QuickFIX        200 - 10 = 190us
  • Chronicle-FIX    16 - 10 = 6us
  • Chronicle-FIX is over 30 times faster than QuickFIX
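
The arithmetic behind the two ratios can be checked with a few lines. This is only a worked example using the rounded figures quoted in this post (~200us, ~16us, ~10us of TCP); the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical illustration only: uses the rounded figures quoted in the post.
final class SpeedupSketch {

    /** Ratio of two latencies after removing a cost that both sides pay equally. */
    static double speedup(double slowUs, double fastUs, double sharedOverheadUs) {
        return (slowUs - sharedOverheadUs) / (fastUs - sharedOverheadUs);
    }

    public static void main(String[] args) {
        // Including the TCP round trip: 200 / 16
        System.out.println(String.format(Locale.ROOT, "%.1f", speedup(200, 16, 0)));  // prints 12.5
        // Excluding ~10us of TCP: 190 / 6
        System.out.println(String.format(Locale.ROOT, "%.1f", speedup(200, 16, 10))); // prints 31.7
    }
}
```
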
And, as has been demonstrated, if you care about your upper percentiles then it gets much worse than that.

For completeness it should be noted that the server this was benchmarked on was reasonably noisy. It had latency spikes of around 400us, which accounts for the larger numbers that show up in the higher percentiles.

Also, this test uses loopback TCP, which puts a huge stress on the Linux kernel. In fact strange things happen when you push the throughput up really high (you can try this with the simple TCP test), so this is not the best way to test Chronicle-FIX. It is only used as a comparison against QuickFIX.

Using Chronicle-FIX, if you measure the parsing of a FIX message into its data model, including logging, on a tuned server, you actually see this profile, tested across throughput profiles from 10,000/s to 200,000/s:

Percentile   run1         run2         run3         run4         run5
50:             1.01         1.01         1.01         1.01         1.06
90:             1.12         1.12         1.12         1.12         1.12
99:             1.38         1.31         1.44         1.31         2.11
99.9:           2.88         2.88         2.88         2.88         4.03
99.99:          3.26         3.14         3.39         3.14         6.02
worst:          5.25         6.27        22.02        20.99        18.94

