Wednesday 28 October 2015

Let's pause for a Microsecond

A lot of benchmarks in low latency Java applications involve having to measure a system under a certain load. This requires maintaining a steady throughput of events into the system as opposed to pumping events into a system at full throttle with no control whatsoever.

One of the tasks I often have to do is pause a producer thread for a short period in between events. Typically this amount of time will be single digit microseconds.

So how do you pause a Thread for this amount of time? Most Java developers think instantly of Thread.sleep(). But that's not going to work because Thread.sleep() only goes down to milliseconds, and a millisecond is three orders of magnitude longer than the microsecond pause we need.

I saw an answer on StackOverflow pointing the user to TimeUnit.MICROSECONDS.sleep() in order to sleep for less than a millisecond. This is clearly incorrect. To quote from the JavaDoc:

Performs a Thread.sleep using this time unit. This is a convenience method that converts time arguments into the form required by the Thread.sleep method.

So you're not going to be able to get better than a 1 millisecond pause, similar to Thread.sleep(1). (You can prove this by trying the example code below.)
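The original listing is not reproduced here, so the following is only a minimal sketch of how the claim could be checked. The class and method names are my own, and the measured figure will depend on your operating system and JVM version.

```java
import java.util.concurrent.TimeUnit;

class TimeUnitSleepCheck {

    /** Average wall-clock time, in nanoseconds, of TimeUnit.MICROSECONDS.sleep(1). */
    static long averageSleepNanos(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            try {
                TimeUnit.MICROSECONDS.sleep(1); // delegates to Thread.sleep
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while sleeping", e);
            }
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        System.out.println("Requested 1,000 ns, measured average: "
                + averageSleepNanos(200) + " ns");
    }
}
```

If the measured average is far above 1,000 ns, the sleep is not honouring the microsecond request.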

The reason for this is that this method of pausing, namely putting a thread to sleep and waking it up, is never going to be fast or accurate enough to go lower than a millisecond.

Another question we should be asking at this point is: how accurate is Thread.sleep(1) anyway? We'll come back to this later.

Another option when we want to pause for a microsecond is to use LockSupport.parkNanos(x). Using the following code to park for 1 microsecond actually takes ~10us. It's way better than TimeUnit.sleep() / Thread.sleep() but not really fit for purpose. At 100us and above it does get into the same ballpark, with roughly a 50% variation.
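The original listing is not reproduced here. A minimal sketch of such a measurement, with class and method names of my own choosing, might look like this. Note that parkNanos is allowed to return early (spuriously), so treat the average as indicative only.

```java
import java.util.concurrent.locks.LockSupport;

class ParkNanosCheck {

    /** Average wall-clock time, in nanoseconds, of LockSupport.parkNanos(requestedNanos). */
    static long averageParkNanos(long requestedNanos, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            LockSupport.parkNanos(requestedNanos); // may return early or late
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        // Requested pauses of 1us, 10us and 100us, as in the discussion above.
        for (long requested : new long[] {1_000, 10_000, 100_000}) {
            System.out.println("Requested " + requested + " ns, measured average: "
                    + averageParkNanos(requested, 1_000) + " ns");
        }
    }
}
```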


The answer to our problems is to use System.nanoTime(). By busy waiting on a call to System.nanoTime() we will be able to pause for a single microsecond. We'll see the code for this in a second but first let's understand the accuracy of System.nanoTime(). Critically, how long does it take to perform the call to System.nanoTime()?

Here's some code that will do exactly this:
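The original listing is not reproduced here, so this is a minimal sketch under my own naming. It is a crude loop rather than a proper benchmark harness, so expect the result to be noisy, especially before the JIT has warmed up.

```java
class NanoTimeCost {

    /** Average cost, in nanoseconds, of a single System.nanoTime() call. */
    static double averageCallNanos(int iterations) {
        long sink = 0; // accumulate the results so the calls cannot be optimised away
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += System.nanoTime();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.println("unlikely"); // keeps 'sink' observably used
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        averageCallNanos(1_000_000); // warm-up run
        System.out.println("System.nanoTime() takes ~"
                + averageCallNanos(1_000_000) + " ns per call");
    }
}
```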



The numbers will vary from one machine to another. On my MBP I get ~40 nanoseconds.

That tells us that we should be able to measure to an accuracy of around 40 nanoseconds. Therefore, measuring 1 microsecond (1000 nanoseconds) should easily be possible.

This is the busy waiting approach 'pausing' for a microsecond:
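The original listing is not reproduced here. A minimal sketch of a busy wait, using names of my own, could be written as follows. The loop spins on System.nanoTime() until the deadline has passed, and the comparison is written as a subtraction so that it stays correct even if the nanoTime value wraps around.

```java
class BusyWait {

    /** Spins on System.nanoTime() until at least 'nanos' nanoseconds have elapsed. */
    static void pauseNanos(long nanos) {
        long deadline = System.nanoTime() + nanos;
        while (System.nanoTime() - deadline < 0) {
            // busy wait: the thread stays on the CPU for the whole pause
        }
    }

    /** Average measured duration, in nanoseconds, of pauseNanos(nanos). */
    static long averagePauseNanos(long nanos, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            pauseNanos(nanos);
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        averagePauseNanos(1_000, 10_000); // warm-up run
        System.out.println("Requested 1,000 ns, measured average: "
                + averagePauseNanos(1_000, 10_000) + " ns");
    }
}
```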


The code waits for a microsecond and then times how long it has waited. On my machine I get 1,115 nanoseconds which is ~90% accurate.

As you wait longer the accuracy increases: 10 microseconds takes 10,267 nanoseconds which is ~97% accurate, and 100 microseconds takes 100,497 nanoseconds which is ~99.5% accurate.
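For reference, the percentages quoted above can be reproduced with a little arithmetic. This sketch assumes that "accuracy" means the requested duration divided by the measured duration, which is my reading of the figures rather than a definition given in the post.

```java
import java.util.Locale;

class PauseAccuracy {

    /** Requested duration as a percentage of the measured duration. */
    static double accuracyPercent(long requestedNanos, long measuredNanos) {
        return 100.0 * requestedNanos / measuredNanos;
    }

    public static void main(String[] args) {
        long[][] samples = {
            {1_000, 1_115},      // 1us requested
            {10_000, 10_267},    // 10us requested
            {100_000, 100_497},  // 100us requested
        };
        for (long[] sample : samples) {
            System.out.println(String.format(Locale.ROOT, "%d ns requested, %d ns measured: %.1f%%",
                    sample[0], sample[1], accuracyPercent(sample[0], sample[1])));
        }
        // prints 89.7%, 97.4% and 99.5% respectively
    }
}
```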

What about Thread.sleep(1), how accurate is that?

Here's the code for that:
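The original listing is not reproduced here. A minimal sketch of the measurement, again with names of my own, might be:

```java
class ThreadSleepCheck {

    /** Average wall-clock time, in nanoseconds, of Thread.sleep(1). */
    static long averageSleepNanos(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            try {
                Thread.sleep(1); // request a 1 millisecond pause
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while sleeping", e);
            }
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        System.out.println("Requested 1,000,000 ns, measured average: "
                + averageSleepNanos(1_000) + " ns");
    }
}
```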


The average time in nanoseconds for a 1 millisecond sleep is 1,295,509. That's only ~75% accurate. It's probably good enough for nearly everything but if you want an exact millisecond pause you are far better off with a busy wait. Of course you need to remember that busy waiting, by definition, keeps your thread busy and will cost you a CPU.

Summary Table

Pause Method (measured time in us)   1us      10us     100us    1000us/1ms   10,000us/10ms
TimeUnit.sleep()                     1284.6   1293.8   1295.7   1292.7       11865.3
LockSupport.parkNanos()              8.1      28.4     141.8    1294.3       11834.2
BusyWaiting                          1.1      10.1     100.2    1000.2       10000.2


Conclusions

  • The only way to pause for a microsecond is by busy waiting
  • If you want to pause for anything less than a millisecond accurately you need to busy wait
  • LockSupport only begins to get accurate at 100us
  • System.nanoTime() takes ~40ns
  • Thread.sleep(1) is only 75% accurate
  • Busy waiting for 10us or more is almost 100% accurate
  • Busy waiting will tie up a CPU 





Friday 16 October 2015

Dynamic Java Code Injection

In this post we're going to look at how to dynamically load Java code into a running JVM. The code might be completely new or we might want to change the functionality of some existing code within our program.

(Before we start you might be wondering why on earth anyone might want to do this. The obvious example is for something like a rules engine. A rules engine would want to offer the ability for users to add or change rules without having to restart the system. You could do this by injecting DSL scripts as rules which would be called by your rules engine. The real problem with such an approach is that the DSL scripts would have to be interpreted making them exceedingly slow to run. Injecting actual Java code which can then be compiled and run in the same way as any other code in your program will be orders of magnitude more efficient.

At Chronicle we are using this very idea at the heart of our new microsecond micro-services/algo container.)


The library we are going to use is the open source Chronicle library Java-Runtime-Compiler.

As you will see from the code below, the library is exceedingly simple to use - in fact it really only takes a couple of lines. Create a CachedCompiler and then call loadFromJava. (See the documentation here for the actual simplest use case.)

The program listed below does the following:
  1. Creates a thread which calls compute on a Strategy every second. The inputs to the Strategy are 10 and 20.
  2. Loads a strategy which adds two numbers together
  3. Waits 3s
  4. Loads a strategy which deducts one number from the other
This is the full code listing:
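The original listing is not reproduced here. As a partial stand-in, the sketch below shows only the proxy mechanism that makes the swap possible. The names Strategy, compute and StrategyProxy, and the underlying field, come from the description and output in this post; everything else is my own assumption. In particular the two strategies are plain lambdas standing in for the classes that the real program compiles and loads at runtime, so no dynamic compilation happens here.

```java
interface Strategy {
    int compute(int a, int b);
}

/** Forwards to whichever strategy is currently loaded. */
class StrategyProxy implements Strategy {
    // volatile so the calling thread sees a newly loaded strategy promptly
    private volatile Strategy underlying;

    void setUnderlying(Strategy underlying) {
        this.underlying = underlying;
    }

    @Override
    public int compute(int a, int b) {
        Strategy current = underlying;
        // Nothing loaded yet: return a sentinel value
        return current == null ? Integer.MIN_VALUE : current.compute(a, b);
    }
}

class StrategySwapDemo {
    public static void main(String[] args) {
        StrategyProxy proxy = new StrategyProxy();
        System.out.println(proxy.compute(10, 20)); // -2147483648

        proxy.setUnderlying((a, b) -> a + b);      // stand-in for the adding strategy
        System.out.println(proxy.compute(10, 20)); // 30

        proxy.setUnderlying((a, b) -> a - b);      // stand-in for the subtracting strategy
        System.out.println(proxy.compute(10, 20)); // -10
    }
}
```

The caller only ever holds a reference to the proxy, which is why the code behind it can be replaced without the caller noticing.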


This is the output, with an explanatory comment above each group of values:

The strategy has not been loaded yet. underlying in the StrategyProxy is null so Integer.MIN_VALUE is returned
-2147483648
The adding strategy has been loaded 10+20=30
30
30
30
After 3s the subtracting strategy is loaded. It replaces the adding strategy. 10-20=-10
-10
-10
-10
-10
-10

Note that in the code we created a new ClassLoader and a new CachedCompiler each time we loaded the Strategy.  The reason for this is that a ClassLoader can only have one instance of a particular class loaded at any one time.

If you were only using this library to load new code you would do it like this, without creating a ClassLoader (i.e. using the default ClassLoader) and using the CachedCompiler.


Class aClass = CompilerUtils.CACHED_COMPILER.loadFromJava(className, javaCode);

Friday 9 October 2015

Chronicle-Wire Tutorial (Part 3): Serialising Code

Before reading this, Part 3 of the Chronicle-Wire tutorial, I'm assuming that you are comfortable with at least Part 1 (the basics) of the tutorial.

So far we've looked at how to serialise data as objects as well as serialising data in the form of documents. Now we're going to look at how we can serialise code.

There are two ways this can be done: with lambdas and with enums.

Serialising code with lambdas

One of the coolest features of Java 8 is lambdas, which provide a means to pass functions (essentially code) from one part of your application to be run in other parts of your application.

Being able to serialise code is really useful, for example, if you want to write some code on a client which gets executed on the server. Or, for map reduce type problems where you break up a problem and distribute the code so that it can be run on many machines. 

This example demonstrates how this can be done, in this case using the class SerializableFunction. The lambda String::toUpperCase is serialised to Bytes using both TextWire and BinaryWire.  The Bytes are then deserialised into a Function which is applied to a string "hello world".
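The original listing, which uses Chronicle-Wire, is not reproduced here. To illustrate the underlying idea only, the sketch below round-trips the same lambda using plain java.io serialisation instead of TextWire or BinaryWire. The nested SerializableFunction interface is a local stand-in that I define for the example, not the Chronicle class of the same name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

class LambdaSerialisationSketch {

    /** Local stand-in: a Function that is also Serializable. */
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable {
    }

    static byte[] serialise(SerializableFunction<String, String> function) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(function); // written as a java.lang.invoke.SerializedLambda
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    @SuppressWarnings("unchecked")
    static Function<String, String> deserialise(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Function<String, String>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serialises String::toUpperCase, deserialises it and applies it to the input. */
    static String roundTrip(String input) {
        SerializableFunction<String, String> toUpperCase = String::toUpperCase;
        return deserialise(serialise(toUpperCase)).apply(input);
    }

    public static void main(String[] args) {
        System.out.println("hello world -> " + roundTrip("hello world"));
        // prints: hello world -> HELLO WORLD
    }
}
```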



The output of this program is:

----------Testing with TextWire--------------------
Text Wire representation of serialised function
toUpperCase: !SerializedLambda {
  cc: !type net.openhft.engine.chronicle.demo.WireDemoLambdas,
  fic: net/openhft/chronicle/core/util/SerializableFunction,
  fimn: apply,
  fims: (Ljava/lang/Object;)Ljava/lang/Object;,
  imk: 5,
  ic: java/lang/String,
  imn: toUpperCase,
  ims: ()Ljava/lang/String;,
  imt: (Ljava/lang/String;)Ljava/lang/String;,
  ca: [
  ]
}

hello world -> HELLO WORLD

----------Testing with BinaryWire--------------------
Binary Wire representation of serialised function
00000000 CB 74 6F 55 70 70 65 72  43 61 73 65 B6 10 53 65 ·toUpper Case··Se
00000010 72 69 61 6C 69 7A 65 64  4C 61 6D 62 64 61 82 1E rialized Lambda··
00000020 01 00 00 C2 63 63 BC 31  6E 65 74 2E 6F 70 65 6E ····cc·1 net.open
00000030 68 66 74 2E 65 6E 67 69  6E 65 2E 63 68 72 6F 6E hft.engi ne.chron
00000040 69 63 6C 65 2E 64 65 6D  6F 2E 57 69 72 65 44 65 icle.dem o.WireDe
00000050 6D 6F 4C 61 6D 62 64 61  73 C3 66 69 63 B8 34 6E moLambda s·fic·4n
00000060 65 74 2F 6F 70 65 6E 68  66 74 2F 63 68 72 6F 6E et/openh ft/chron
00000070 69 63 6C 65 2F 63 6F 72  65 2F 75 74 69 6C 2F 53 icle/cor e/util/S
00000080 65 72 69 61 6C 69 7A 61  62 6C 65 46 75 6E 63 74 erializa bleFunct
00000090 69 6F 6E C4 66 69 6D 6E  E5 61 70 70 6C 79 C4 66 ion·fimn ·apply·f
000000a0 69 6D 73 B8 26 28 4C 6A  61 76 61 2F 6C 61 6E 67 ims·&(Lj ava/lang
000000b0 2F 4F 62 6A 65 63 74 3B  29 4C 6A 61 76 61 2F 6C /Object; )Ljava/l
000000c0 61 6E 67 2F 4F 62 6A 65  63 74 3B C3 69 6D 6B 05 ang/Obje ct;·imk·
000000d0 C2 69 63 F0 6A 61 76 61  2F 6C 61 6E 67 2F 53 74 ·ic·java /lang/St
000000e0 72 69 6E 67 C3 69 6D 6E  EB 74 6F 55 70 70 65 72 ring·imn ·toUpper
000000f0 43 61 73 65 C3 69 6D 73  F4 28 29 4C 6A 61 76 61 Case·ims ·()Ljava
00000100 2F 6C 61 6E 67 2F 53 74  72 69 6E 67 3B C3 69 6D /lang/St ring;·im
00000110 74 B8 26 28 4C 6A 61 76  61 2F 6C 61 6E 67 2F 53 t·&(Ljav a/lang/S
00000120 74 72 69 6E 67 3B 29 4C  6A 61 76 61 2F 6C 61 6E tring;)L java/lan
00000130 67 2F 53 74 72 69 6E 67  3B C2 63 61 82 00 00 00 g/String ;·ca····
00000140 00                                               ·                

hello world -> HELLO WORLD

Serialising code with enums

Using lambdas to serialise code gives the developer maximum flexibility in terms of what can be serialised but there are also a couple of drawbacks.
  1. The serialised lambda is very bulky (see the print out of the bytes above)
  2. There is no control over what can be serialised.  If you are using serialisation to send messages across the wire between client and server you might want to have more control over the code that can be executed by the client on the server. 
Using enums addresses these shortcomings.

The code below does exactly the same as the code we saw in the lambda example in the first section. It takes the code to transform a string to upper case and serialises it. The code to transform the string is stored in an enum which implements Function.
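The original listing, which uses Chronicle-Wire, is not reproduced here. The sketch below illustrates the idea without Wire: the function lives in an enum that implements Function, and only the constant's name is sent. The enum name StringFunctions and the constant TO_UPPER_CASE are taken from the program output in this post; the serialise and deserialise helpers are hypothetical and simply use the enum name as the wire form.

```java
import java.util.function.Function;

class EnumFunctionSketch {

    /** Predefined functions: only these can ever be requested by name. */
    enum StringFunctions implements Function<String, String> {
        TO_UPPER_CASE {
            @Override
            public String apply(String s) {
                return s.toUpperCase();
            }
        }
    }

    /** Hypothetical wire form: just the name of the enum constant. */
    static String serialise(StringFunctions function) {
        return function.name();
    }

    /** Looks the constant up by name; unknown names are rejected. */
    static Function<String, String> deserialise(String name) {
        return StringFunctions.valueOf(name); // throws IllegalArgumentException if unknown
    }

    public static void main(String[] args) {
        String wireForm = serialise(StringFunctions.TO_UPPER_CASE);
        System.out.println(wireForm); // TO_UPPER_CASE
        System.out.println("hello world -> " + deserialise(wireForm).apply("hello world"));
        // prints: hello world -> HELLO WORLD
    }
}
```

Because the lookup is by name against a fixed set of constants, the receiving side can only ever run code that was defined in advance.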




This is the output from that program

----------Testing with TextWire--------------------

Text Wire representation of serialised function
toUpperCase: !StringFunctions TO_UPPER_CASE

hello world -> HELLO WORLD

----------Testing with BinaryWire--------------------

Binary Wire representation of serialised function
00000000 CB 74 6F 55 70 70 65 72  43 61 73 65 B6 0F 53 74 ·toUpper Case··St
00000010 72 69 6E 67 46 75 6E 63  74 69 6F 6E 73 ED 54 4F ringFunc tions·TO
00000020 5F 55 50 50 45 52 5F 43  41 53 45                _UPPER_C ASE     


hello world -> HELLO WORLD

Straight away we see the difference in verbosity between the self-describing lambda and the predefined enum.

Enum ->  toUpperCase: !StringFunctions TO_UPPER_CASE
Lambda -> toUpperCase: !SerializedLambda {
  cc: !type net.openhft.engine.chronicle.demo.WireDemoLambdas,
  fic: net/openhft/chronicle/core/util/SerializableFunction,
  fimn: apply,
  fims: (Ljava/lang/Object;)Ljava/lang/Object;,
  imk: 5,
  ic: java/lang/String,
  imn: toUpperCase,
  ims: ()Ljava/lang/String;,
  imt: (Ljava/lang/String;)Ljava/lang/String;,
  ca: [
  ]
}

Since all the enums have been pre-defined there are no nasty surprises as to what code is going to be run.

On the downside of course you lose the flexibility of ad hoc lambdas.

Note the line:

ClassAliasPool.CLASS_ALIASES.addAlias(StringFunctions.class, "StringFunctions");

As explained earlier this means that "StringFunctions" is mapped to the class StringFunctions.

Without this line (without class aliasing) the output:

toUpperCase: !StringFunctions TO_UPPER_CASE

would be

toUpperCase: !net.openhft.engine.chronicle.demo.WireDemoEnums$StringFunctions TO_UPPER_CASE

Apart from being more verbose and harder to read, it would also be a problem when passing the message between languages, which is one of the reasons to use Chronicle-Wire serialisation.

Summary

Use enums when you know you need maximum efficiency and/or control. Use lambdas when you need flexibility. In practice you will probably mix both for different parts of your application.

Chronicle-Wire Tutorial (Part 2): Working with Documents

In Part 1 we saw the basics of how to use Chronicle-Wire to serialise and deserialise objects and some of the benefits of using Wire.

In this post I want to show you some more advanced features around serialising data in the form of documents.

Serialising documents

Rather than serialising a whole object with Wire (which is what we did in the previous post with Person) it is possible that you might want to group a number of data items together in an ad hoc object and serialise that with Wire.

You can do this with the document feature of Chronicle.

This should be clearer by looking at this example.



The output from this program is:


--------TextWire Demo--------------
Data serialised with TextWire:
!data: {
  name: dan,
  age: 44
}

Data deserialised:
Name:dan
Age:44

---------BinaryWire Demo--------------
Data serialised with BinaryWire:
00000020 34 34 0A 7D 0A 18 00 00  00 C4 64 61 74 61 82 0E 44·}···· ··data··
00000030 00 00 00 C4 6E 61 6D 65  E3 64 61 6E C3 61 67 65 ····name ·dan·age
00000040 2C                                               ,                
Data deserialised:
Name:dan
Age:44

So you have the ability to create ad hoc objects in the form of documents.

But you can go a step further with this:

Creating real objects on the fly

When writing a document you have the ability to give it a 'type'.  This is done by calling the method typePrefix() as you can see in the code below.



This is the output from the program:

---------TextWire Demo--------------
Data serialised with TextWire:
Kdata: !chronicle.demo.Person {
  name: dan,
  age: 44
}

Data deserialised:
Person{name='dan', age=44}

---------BinaryWire Demo--------------
Data serialised with BinaryWire:
00000040 6E 2C 0A 20 20 61 67 65  3A 20 34 34 0A 7D 0A 42 n,·  age : 44·}·B
00000050 00 00 00 C4 64 61 74 61  B6 28 6E 65 74 2E 6F 70 ····data ·(net.op
00000060 65 6E 68 66 74 2E 65 6E  67 69 6E 65 2E 63 68 72 enhft.en gine.chr
00000070 6F 6E 69 63 6C 65 2E 64  65 6D 6F 2E 50 65 72 73 onicle.d emo.Pers
00000080 6F 6E 82 0E 00 00 00 C4  6E 61 6D 65 E3 64 61 6E on······ name·dan
00000090 C3 61 67 65 2C                                   ·age,            
Data deserialised:
Person{name='dan', age=44}


The code for the serialisation is almost the same as the one we used in the last example except that this time it sets typePrefix() to the Person class. This is the same Person we saw in the previous post. Code listing below:




Because we know the type of the object, we are able to deserialise using the method typedMarshallable() into a Java object.  In this example we have created a Person object.

Note: Wire has the concept of a ClassAliasPool which allows you to use shortened names or aliases rather than the fully qualified class name. This is important as it can make your data shorter and easier to read, both of which are goals of Chronicle-Wire.

Deserialising documents without creating objects

One of the goals of Chronicle in general and Chronicle-Wire in particular is to aim for zero object creation.  Reusing objects is key to achieving this goal.

In the code below we see how to deserialise data from the document directly into an existing object. (For the sake of brevity I'm only going to use TextWire).
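The original Chronicle-Wire listing is not reproduced here. To illustrate the principle only, the sketch below reads two records into the same Person instance using a plain ByteBuffer. The byte layout (a one-byte length, an ASCII name, a four-byte age) is entirely hypothetical and has nothing to do with Wire's formats, and the second record is made-up sample data.

```java
import java.nio.ByteBuffer;

class ReuseSketch {

    /** Mutable holder that is filled in place on every read. */
    static final class Person {
        final StringBuilder name = new StringBuilder();
        int age;

        void readFrom(ByteBuffer in) {
            name.setLength(0);             // clear and reuse the existing StringBuilder
            int length = in.get();         // hypothetical format: 1-byte name length
            for (int i = 0; i < length; i++) {
                name.append((char) in.get()); // ASCII only in this sketch
            }
            age = in.getInt();             // followed by a 4-byte age
        }
    }

    /** Writes one record in the hypothetical format read by Person.readFrom. */
    static void write(ByteBuffer out, String name, int age) {
        out.put((byte) name.length());
        for (int i = 0; i < name.length(); i++) {
            out.put((byte) name.charAt(i));
        }
        out.putInt(age);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        write(buffer, "dan", 44);
        write(buffer, "ann", 31); // made-up second record
        buffer.flip();

        Person person = new Person(); // allocated once, reused for every record
        person.readFrom(buffer);
        System.out.println(person.name + " " + person.age); // dan 44
        person.readFrom(buffer);
        System.out.println(person.name + " " + person.age); // ann 31
    }
}
```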




Working with deserialised data

Along the same lines as we saw above you can also deserialise directly into a lambda that can be used to manipulate or use that data.

Take a look at this example that tests the value of the deserialised data:




The interesting thing to note here is how, when deserialising, the object() method can be employed to use the data. In this case we are asserting to prove we have the correct data but in a real application other more meaningful tasks would be performed.

Summary

Hopefully this tutorial has introduced you to the power of using documents within Chronicle-Wire.  Creating ad hoc objects, parts of objects, and real objects from their constituent data parts are features that can make your code easier to write, easier to debug and most of all, make your code faster. 

Thursday 8 October 2015

Chronicle-Wire Tutorial (Part 1): The Basics

Chronicle-Wire is a library written by Chronicle that allows a Java developer to serialise Java objects. It's a library we developed to support our higher level products like Chronicle Queue and Chronicle Map. However the library has applications in any code that uses serialisation.

At this point you're probably wondering what's new about serialisation and why we need another library. Serialisation is hardly a novel concept in Java. In fact Serializable has been in Java since JDK 1.1, so almost forever :)

The real innovation behind Wire is that it abstracts away the implementation of the serialisation to a pluggable Wire implementation.  The idea is that your objects need only describe what is to be serialised not how it should be serialised. This is done by the objects (the POJOs that are to be serialised) implementing the Marshallable interface.  

It is only at the point of serialisation that you decide how that serialisation is actually implemented by selecting a particular Wire implementation to provide to the process.

Let's look at an example of this which will hopefully make this concept much clearer:
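The original listing is not reproduced here. The sketch below is not the Chronicle-Wire API. It is a deliberately tiny imitation, with hypothetical SketchWire and SketchMarshallable interfaces of my own, that shows the separation being described: the object says what to write, and the chosen wire decides how it is written.

```java
/** Hypothetical stand-in for a wire: decides HOW fields are written. */
interface SketchWire {
    void write(String field, String value);
    void write(String field, int value);
}

/** Hypothetical stand-in for Marshallable: decides WHAT is written. */
interface SketchMarshallable {
    void writeTo(SketchWire wire);
}

final class SketchPerson implements SketchMarshallable {
    private final String name;
    private final int age;

    SketchPerson(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public void writeTo(SketchWire wire) {
        wire.write("name", name);
        wire.write("age", age);
    }
}

/** Human readable: one "field: value" line per field. */
final class TextSketchWire implements SketchWire {
    private final StringBuilder out = new StringBuilder();

    @Override
    public void write(String field, String value) {
        out.append(field).append(": ").append(value).append('\n');
    }

    @Override
    public void write(String field, int value) {
        write(field, Integer.toString(value));
    }

    @Override
    public String toString() {
        return out.toString();
    }
}

/** Compact: drops the field names and relies on field order. */
final class CompactSketchWire implements SketchWire {
    private final StringBuilder out = new StringBuilder();

    @Override
    public void write(String field, String value) {
        if (out.length() > 0) {
            out.append(',');
        }
        out.append(value);
    }

    @Override
    public void write(String field, int value) {
        write(field, Integer.toString(value));
    }

    @Override
    public String toString() {
        return out.toString();
    }
}

class WireSketchDemo {
    public static void main(String[] args) {
        SketchPerson person = new SketchPerson("dan", 44);

        TextSketchWire text = new TextSketchWire();
        person.writeTo(text);
        System.out.print(text);      // name: dan, then age: 44, one per line

        CompactSketchWire compact = new CompactSketchWire();
        person.writeTo(compact);
        System.out.println(compact); // dan,44
    }
}
```

SketchPerson is identical in both cases; only the wire passed to writeTo changes.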


This is the output from the program:

-----------TEXT WIRE------------

Person to serialise: Person{name='dan', age=44}
Text Wire prints:
name: dan
age: 44
Deserialised person: Person{name='dan', age=44}

-----------BINARY WIRE------------

Person to serialise: Person{name='dan', age=44}
Text Wire prints:
00000000 C4 6E 61 6D 65 E3 64 61  6E C3 61 67 65 2C       ·name·da n·age,  
Deserialised person: Person{name='dan', age=44}

-----------RAW WIRE------------

Person to serialise: Person{name='dan', age=44}
Text Wire prints:
00000000 03 64 61 6E 2C                                   ·dan,            

Deserialised person: Person{name='dan', age=44}

What should be clear here is that the class Person is in no way responsible for how its data is serialised.  That is done by the various implementations of Wire.

  • TextWire - serialises to text for a humanly readable format
  • BinaryWire - serialises to a self describing binary format
  • RawWire - serialises to a compact binary format

Person is only responsible for choosing the data that is to be serialised and describing the type of that data. You will notice that Wire has a very large list of types (e.g. int8(), int16()) which allow for the maximum efficiency that can be achieved by certain Wire implementations.

Whilst in the example above Person is serialised to Bytes (for more information on Bytes see here) you can actually serialise to whatever format you want. All you have to do is to implement the Wire interface. (It is not necessary to provide an implementation for methods that are not appropriate to your format.)

As well as obvious serialisation formats such as JSON, YAML and CSV, you can also create some rather bizarre ones. For example I created an LDAPWire which serialises objects into Attributes that can be stored in an LDAP database. In this case I only implemented the text() methods of Wire.

This chart (scroll down) compares a couple of the Wire implementations with the competition. As you can see, it compares favourably with SBE and Cap'n Proto.