Thursday 30 April 2015

Lessons Learnt Translating 25k Lines of C# into Java

I've recently completed a project converting a complex financial application from C# to Java. The reasons for the port were for the most part non-technical; it was a strategic move for the business concerned.

It was an interesting experience and I learnt a few lessons along the way that might be useful to share.

1. Construct language neutral tests over the existing system.

I'll start with perhaps the most important lesson of all. When porting a system, and this could be any port for any reason, there must be criteria to determine whether the port has been successful. The best way to do this is to construct a full set of tests around the original system that can be 'exported without change' to the new system. So, for example, it's no good having a suite of JUnit tests if you want to move a system from Java to a different language that doesn't support JUnit. I can't stress enough how important it was that the tests, and any changes to them, could literally be copied from the old system to the new system without intervention.

Another problem with JUnit tests is that they are often firmly linked to the existing implementation. Since the implementation is going to be rewritten, the tests are not portable between implementations.

The strategy we chose, and which worked extremely well, was to use Cucumber tests. There are bindings for Cucumber in nearly all languages, it is well supported by IDEs (at least by both IntelliJ and Visual Studio), and as a bonus the tests are human readable, so you can involve non-technical users in building up the tests in preparation for the port. (As an aside, we had an attempt at getting the users to define the requirements for the new system by documenting everything the old system did and building tests round those requirements, but that, unsurprisingly, was completely unsuccessful. It's far better to build up test cases based on your existing implementation than to try to invent them for the new system!)

Using Cucumber was a real success, and we created a new test every time there was a discrepancy between the systems. By the time we had finished we had around 1000 scenarios and we felt confident that the new system was correct. It gave us the solid foundations we needed to continue developing the additional features and refactorings in the new system.

2. Try and automate as much of the translation as possible.

When faced with 25k+ lines of C# it's a pretty daunting task to think of hand translating every line into Java. Fortunately there are tools out there that are enormously helpful. The product we used was from Tangible Software Solutions. For a couple of hundred dollars it saved literally hundreds of man hours. It's not perfect by any means, but it will give you the structure of the Java code (which matters because C# partial classes allow the code for a class to be split across more than one file) and make a pretty good attempt at giving you workable Java. In our case hardly any of the generated code actually compiled, but it was a really good head start. My analogy would be to early attempts at OCR. You could scan in a document, but when you opened it in an editor you would find red underlinings against many words which had not been recognised correctly. It was a matter of going through all the red underlinings and working out what the word should have been. Much the same is true of the code produced by the automated translation: when it was pulled into an IDE there were many compiler errors. Sometimes the tool left in the original C# and said that the translation could not be done automatically. To its credit the tool always erred on the side of caution: where it did produce Java, it did not get it wrong, which was important.

3. Don't rush the translation

After you have run the automated translation you will need to go back to the code and fix the compile errors by hand. If I had my time again I would spend 10 times longer making sure that every change I made to the code was absolutely correct. Since I wasn't an expert in C# I sometimes made assumptions as to how the C# libraries worked. Those assumptions were not always correct, and I sometimes paid a heavy penalty debugging scenarios where, had I been more careful in the original translation, there would never have been a problem. It's definitely worth spending time reading through the C# API of the classes you are translating. I found this especially important when using Date and DateTime objects.
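The post doesn't say which date differences caused trouble, so the following is only an illustration of the kind of assumption that can go wrong, not a record of the actual bugs. In C#, `new DateTime(2015, 4, 30)` is 30 April. The legacy Java `GregorianCalendar` constructor takes a zero-based month, so the same arguments mean 30 May, while the Java 8 `LocalDate.of` takes a one-based month like C#. The class and method names below are made up for the example.

```java
import java.time.LocalDate;
import java.time.Month;
import java.util.Calendar;
import java.util.GregorianCalendar;

public class DateMonthSketch {

    // Legacy API: the month argument is zero-based, so 4 means May.
    static Calendar legacy(int year, int month, int day) {
        return new GregorianCalendar(year, month, day);
    }

    // java.time (Java 8): the month argument is one-based, so 4 means April,
    // which matches the C# DateTime constructor.
    static LocalDate modern(int year, int month, int day) {
        return LocalDate.of(year, month, day);
    }

    public static void main(String[] args) {
        Calendar cal = legacy(2015, 4, 30);
        System.out.println(cal.get(Calendar.MONTH) == Calendar.MAY); // prints true

        LocalDate date = modern(2015, 4, 30);
        System.out.println(date.getMonth() == Month.APRIL); // prints true
    }
}
```

A line-by-line translation that keeps the numeric arguments unchanged compiles in both cases, which is exactly why this sort of difference is worth checking against the API documentation.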

It's also worth spending time learning the Visual Studio IDE.  When debugging side by side it will save time in the long run if you know how to use your IDE properly.

4. Use Java 8

Apart from all the obvious reasons to use Java 8 (it's the latest version of Java so why not use it...), the Stream API maps nicely onto C# LINQ. The syntax is a little different, for example Java uses '->' and C# uses '=>', but using the new Java 8 features really helps keep the code comparable, which all helps when debugging further down the line.
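To give a flavour of how closely the two line up, here is a small sketch. The `Trade` class and its fields are hypothetical, invented for this example rather than taken from the ported application. The C# query is shown in the comment and the Java 8 equivalent below it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamLinqSketch {

    // Hypothetical domain class, invented for this illustration.
    static final class Trade {
        final String currency;
        final double notional;

        Trade(String currency, double notional) {
            this.currency = currency;
            this.notional = notional;
        }
    }

    // C# (LINQ) version of the same query:
    //   trades.Where(t => t.Currency == "USD")
    //         .Select(t => t.Notional)
    //         .OrderBy(n => n)
    //         .ToList();
    static List<Double> usdNotionalsSorted(List<Trade> trades) {
        return trades.stream()
                .filter(t -> t.currency.equals("USD")) // Where
                .map(t -> t.notional)                  // Select
                .sorted()                              // OrderBy (natural order)
                .collect(Collectors.toList());         // ToList
    }

    public static void main(String[] args) {
        List<Trade> trades = Arrays.asList(
                new Trade("USD", 250.0),
                new Trade("EUR", 100.0),
                new Trade("USD", 50.0));
        System.out.println(usdNotionalsSorted(trades)); // prints [50.0, 250.0]
    }
}
```

Keeping the Java pipeline in the same shape as the C# query makes it much easier to step through both versions side by side.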

5. Be careful of unintended behaviour  

There are certain features of languages that you shouldn't rely on but might work all the same. Let me demonstrate with an example on which I spent far too much time. The C# code was using a Dictionary, which the code generator correctly translated to a HashMap. Both are unordered maps. However, even though Dictionary is unordered by contract (there is also an OrderedDictionary), when iterating through the Dictionary it seemed to preserve the insertion order. This was not the case with HashMap, and since the order of elements was material to the result, we found discrepancies which were hard to debug. The solution was to replace all instances of HashMap with LinkedHashMap, which does preserve insertion order.
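The difference is easy to reproduce. This is a minimal sketch with made-up keys: `LinkedHashMap` is documented to iterate in insertion order, whereas `HashMap` makes no guarantee, so its output is deliberately not shown as an expected value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapOrderSketch {

    // Inserts the keys in the given order and returns the order in which
    // the map hands them back during iteration.
    static List<String> iterationOrder(Map<String, Integer> map, String... keys) {
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        String[] keys = {"zebra", "apple", "mango", "banana"};

        // LinkedHashMap: iteration follows insertion order.
        Map<String, Integer> linked = new LinkedHashMap<>();
        System.out.println(iterationOrder(linked, keys)); // prints [zebra, apple, mango, banana]

        // HashMap: order depends on hash codes and table size and is not
        // specified, so code must not rely on whatever this prints.
        Map<String, Integer> hashed = new HashMap<>();
        System.out.println(iterationOrder(hashed, keys));
    }
}
```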

6. Don't refactor too early

The code produced by the code generator is not pretty. In fact it's pretty horrific to look at, breaking nearly every rule regarding naming conventions etc. It's tempting to tidy up as you go along. Resist that temptation until all your tests have passed. You can always tidy up later. Refactoring, even renaming, can introduce bugs, especially in a code base with which you are, by definition, not familiar. Also, you might decide to re-run the code generator somewhere down the line, and all your tidying up will at best need to be merged and at worst have been a waste of time.

Conclusion

Translating even a fairly complicated program from C# to Java is not impossible, even if you're not that familiar with C#. Using the correct tools and techniques and, critically, having reliable and repeatable tests will make all the difference to the success of your project.

See the follow-up article describing the performance gains achieved by the code port.
