Wednesday, 27 January 2016

Starting out with ChronicleMap: Using Off Heap Memory

Introduction

This post introduces ChronicleMap and explains how, by using off-heap memory, you can do things you might never expect a Java program to be able to do. Using off-heap memory is very un-Java-like. If you're following the debate about UNSAFE in Java 9 you will appreciate just how controversial it is. Nevertheless, it provides the power to do some really useful things. ChronicleMap harnesses the power of UNSAFE, taming it and making it 'SAFE' for the rest of us to use.

Let's demonstrate with a really simple use case for ChronicleMap.

Scenario: Your server accepts requests from many clients. You want to keep a count of the requests from each client. You must ensure that every time the request count is incremented the number is persisted to disk, so that the data is not lost if your process crashes.

Traditional approach: Create a table in a database with two columns, clientID and requestCount. Every time you get a request from a client, fetch the client's requestCount from the database, increment it and save the result back into the database. The actual database interaction can be abstracted somewhat by having data beans using Hibernate or the equivalent. But then you have to deal with configuration, and you still have to set up a database...

Wouldn't it be nice if: We could just save the data to a java.util.Map, and that would be all the configuration we would need.

Enter ChronicleMap: ChronicleMap is just like a normal Java HashMap except that rather than saving its data to on-heap memory it saves its data to off-heap memory. Since the off-heap memory is backed by a memory-mapped file, all the data gets flushed to disk and persisted by the OS.

What's the difference between on- and off-heap memory: On-heap memory is memory 'managed' by the JVM. Most importantly, all data that is allocated on heap is subject to garbage collection. (This has its pros and cons, but more on that another time.) This memory is entirely local to the JVM, and data in on-heap memory will be lost when the JVM dies. Off-heap memory is unmanaged memory. It can be shared between processes and persisted through memory-mapped files.
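To make the idea of persisting through a memory-mapped file concrete, here is a minimal sketch that uses only the standard java.nio API. It is not how ChronicleMap is implemented; the class name MappedCounterDemo, the method incrementPersistedCounter and the file name are made up for illustration. It keeps a single long in a mapped file and increments it on every call.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Illustrative only: a single counter stored off heap in a memory-mapped file.
public class MappedCounterDemo {

    // Increments the long stored at offset 0 of the file and returns the new value.
    static long incrementPersistedCounter(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // Map 8 bytes of the file; a new or empty file is extended and reads as zeros.
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
            long next = buffer.getLong(0) + 1;
            buffer.putLong(0, next);
            // Ask the OS to write the mapped region to the storage device now.
            buffer.force();
            return next;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file location, chosen only for this sketch.
        Path file = Paths.get(System.getProperty("java.io.tmpdir"), "mapped-counter-demo.dat");
        System.out.println("Counter is now " + incrementPersistedCounter(file));
    }
}
```

Running the program repeatedly prints a larger number each time, because the value lives in the file rather than on the heap of any one JVM.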

Enough of the theory: Let's see the code for this in practice.

To add Chronicle-Map to your project just add this Maven dependency:
<dependency>
    <groupId>net.openhft</groupId>
    <artifactId>chronicle-map</artifactId>
    <version>2.4.12</version>
</dependency>

It takes exactly 2 lines to write the code for our scenario - have a look at the code below:


The first line creates the map and the second updates it as you would any other java.util.concurrent.ConcurrentMap.

Each time you run the program you will notice that the number of user requests for "user1" is incremented.

A note to explain the parameters to the constructor: To achieve the greatest performance, ChronicleMap does not resize. (Apart from anything else, ChronicleMap was written as a performant data store for low latency systems, but that's not what we are really concentrating on in this post.) Therefore, to enable the Map to reserve the correct amount of disk space, we must specify the maximum number of entries and also give it a rough idea of the length of the Strings in the key. (You shouldn't worry too much about overestimating the number of entries, as disk space is only taken up lazily, when it is actually required.)
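To see why a non-resizing store needs those two numbers up front, here is a rough sketch of a fixed-capacity counter table in a memory-mapped file, written with the standard library only. It is a toy and not ChronicleMap's actual layout or API; FixedSlotCounters, its parameters and the slot format are assumptions made for illustration, and it is not thread safe.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Illustrative only: a fixed-size table of (key, count) slots in a mapped file.
public class FixedSlotCounters {
    private final int maxEntries;
    private final int maxKeyBytes;
    private final int slotSize; // 4-byte key length + key bytes + 8-byte count
    private final MappedByteBuffer buffer;

    // The same maxEntries and maxKeyBytes must be used every time the file is opened.
    FixedSlotCounters(Path file, int maxEntries, int maxKeyBytes) throws IOException {
        this.maxEntries = maxEntries;
        this.maxKeyBytes = maxKeyBytes;
        this.slotSize = Integer.BYTES + maxKeyBytes + Long.BYTES;
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The whole table is reserved here and never grows afterwards.
            buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0,
                    (long) maxEntries * slotSize);
        }
    }

    // Increments the count for the key and returns the new value.
    long increment(String key) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        if (keyBytes.length == 0 || keyBytes.length > maxKeyBytes) {
            throw new IllegalArgumentException("key must be 1.." + maxKeyBytes + " bytes");
        }
        int start = Math.floorMod(key.hashCode(), maxEntries);
        for (int i = 0; i < maxEntries; i++) { // linear probing
            int offset = ((start + i) % maxEntries) * slotSize;
            if (buffer.getInt(offset) == 0) {
                // Empty slot (stored key length 0): claim it for this key.
                buffer.putInt(offset, keyBytes.length);
                for (int j = 0; j < keyBytes.length; j++) {
                    buffer.put(offset + Integer.BYTES + j, keyBytes[j]);
                }
            } else if (!slotHoldsKey(offset, keyBytes)) {
                continue; // slot belongs to another key, try the next one
            }
            int countOffset = offset + Integer.BYTES + maxKeyBytes;
            long next = buffer.getLong(countOffset) + 1;
            buffer.putLong(countOffset, next);
            return next;
        }
        throw new IllegalStateException("table is full: " + maxEntries + " entries");
    }

    private boolean slotHoldsKey(int offset, byte[] keyBytes) {
        if (buffer.getInt(offset) != keyBytes.length) {
            return false;
        }
        for (int j = 0; j < keyBytes.length; j++) {
            if (buffer.get(offset + Integer.BYTES + j) != keyBytes[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file location and sizes, chosen only for this sketch.
        Path file = Paths.get(System.getProperty("java.io.tmpdir"), "fixed-slot-counters.dat");
        FixedSlotCounters counters = new FixedSlotCounters(file, 1000, 32);
        System.out.println("user1 -> " + counters.increment("user1"));
    }
}
```

The file size is simply maxEntries multiplied by the slot size, which is why both the entry count and the key length have to be known before the file is created.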

A deeper look into off-heap memory: Hopefully that was simple enough. We've seen how you can use ChronicleMap as an implementation of java.util.Map, and that all inserts and updates to that map are saved to disk because ChronicleMap is backed by off-heap memory.

But let's say you don't want to write the number of requests back to the map each time you want it saved to disk. What you might want is to update userRequests and have that number persisted without having to update the Map. (It is the equivalent of having an AtomicLong stored in a HashMap.)

To do this, rather than using a Long to hold the user requests you should use LongValue, where LongValue is just a wrapper for a long. Crucially, though, LongValue must be backed by off-heap memory. For this reason you can't just create an instance of LongValue using new, as you would with normal on-heap memory; you need to use a factory provided by Chronicle.

It's much easier to understand this by examining the code below:

The Map is built in exactly the same way as in the simple example above, except that the value type is LongValue, not Long.

The variable userRequests is created by generating a direct (off-heap) instance using the call DataValueClasses.newDirectInstance(). (If you want to understand more about the internals of this code, run the program in a debugger and step into addValue() on line 28.) All the data written into this instance variable is saved off heap and therefore persisted.

Because userRequests is backed by off-heap memory, we can increment it on line 28 and do not have to store it back into the ChronicleMap for the data to be persisted.

One thing to point out is the method call acquireUsing() on the Map on line 25, which allows us to fetch the data in the Map using a pre-existing instance variable. This means that no allocation takes place, which allows for a zero-GC program. (This is important because GC is the enemy of predictable low latency systems.)
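The general pattern behind an off-heap value object that can be reused without allocation is often called a flyweight. The sketch below shows the idea with a plain direct ByteBuffer from the standard library. LongView and pointAt are hypothetical names invented for this illustration; they are not Chronicle classes, and unlike the real thing this addValue is not atomic.

```java
import java.nio.ByteBuffer;

// Illustrative only: a reusable view over a long that lives in off-heap memory.
public class LongViewDemo {

    // Hypothetical flyweight: it holds no value itself, only a position in a buffer.
    static final class LongView {
        private ByteBuffer buffer;
        private int offset;

        // Re-points this view at another location; no new object is created.
        LongView pointAt(ByteBuffer buffer, int offset) {
            this.buffer = buffer;
            this.offset = offset;
            return this;
        }

        long getValue() {
            return buffer.getLong(offset);
        }

        // Writes straight into the underlying memory, so there is nothing to "put back".
        long addValue(long delta) {
            long next = buffer.getLong(offset) + delta;
            buffer.putLong(offset, next);
            return next;
        }
    }

    public static void main(String[] args) {
        // Direct buffers are allocated outside the Java heap and start zeroed.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(2 * Long.BYTES);

        LongView view = new LongView();                // allocated once
        view.pointAt(offHeap, 0).addValue(1);          // first counter
        view.pointAt(offHeap, Long.BYTES).addValue(5); // same object, second counter

        // Both updates are visible in the buffer without any write-back step.
        System.out.println(offHeap.getLong(0) + " " + offHeap.getLong(Long.BYTES)); // prints "1 5"
    }
}
```

Because the view only stores a buffer reference and an offset, the same object can be pointed at one entry after another, which is what keeps the hot path free of garbage.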

Summary

ChronicleMap uses off heap memory to enable:

  • persistence (as we have seen in this post)
  • IPC (inter-process communication): you can share the ChronicleMap between more than one JVM.
  • Zero GC (important for real-time systems)
If you need any of the above you might consider using ChronicleMap in your code!

5 comments:

  1. Hi Daniel,

    but the LongValue is an instance, not a reference (it is produced by DataValueClasses.newDirectInstance), so to store it in the map you'll need to call put again anyway, and then the atomicity of the operation will be gone if multiple threads try to call it for the same user...
    Maybe acquireContext is better for this kind of task...

  2. Hi there,
    Thanks for your comment
    Even though it's an instance, the LongValue will store its data in off-heap memory. Run the program yourself and see what happens.

    1. Hi Daniel,

      I've tried your example, but I don't agree with you, sorry :(
      If you add:

      LongValue lastUserRequests = demoMap.getUsing("user1", DataValueClasses.newDirectInstance(LongValue.class));

      just before the print, to retrieve the last inserted value in the map for that user, you'll find that it does not hold 1 or whatever value you were expecting (depending on how many times you have run the program, e.g. if you were hoping to find an incremented value every time).

      The previous example appears to work because you were reading from the same value object in which you had stored the 1 (or the last read value + 1)...

    2. Hi Francesco

      Thanks for your reply.

      You are correct that in this instance it would have been better to use newDirectReference instead of newDirectInstance. Because the variable 'userRequests' is first used to 'get' from the map, it is wasteful to allocate memory, which is what happens with an instance.
      However, other than being slightly wasteful, the program is still functionally correct using the instance.

