Interfacing Fortran, C, C++, and Python: Roadmap for migrating and modularizing legacy code

Overview

Teaching: 10 min
Exercises: 0 min
Questions
  • A big, untested, monolithic legacy code is in front of you - what now?
Objectives
  • Offer a step-by-step recipe for migrating and modularizing legacy code.

Suggestion for a roadmap

1) Add end-to-end test

  • This means feeding the code with input, waiting for the result, and comparing the result against a reference.
  • This should be scripted.
  • Do not touch the code before you have a test as safeguard.
  • Try introducing some bugs and verify that the test catches them. If it does not, spend more time on the testing; it will pay off.
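
A scripted end-to-end test could look like the minimal sketch below. The function names `outputs_match` and `run_end_to_end` are hypothetical, and the sketch assumes the program reads its input from standard input and writes its result to standard output; a real legacy code may use input and output files instead. A Python one-liner stands in for the legacy executable.

```python
import math
import subprocess
import sys


def outputs_match(actual, reference, rel_tol=1e-9):
    """Compare two outputs token by token; numbers are compared with a tolerance."""
    actual_tokens = actual.split()
    reference_tokens = reference.split()
    if len(actual_tokens) != len(reference_tokens):
        return False
    for a, r in zip(actual_tokens, reference_tokens):
        try:
            if not math.isclose(float(a), float(r), rel_tol=rel_tol):
                return False
        except ValueError:
            # At least one token is not a number: require an exact match.
            if a != r:
                return False
    return True


def run_end_to_end(command, input_text, reference_text):
    """Run the program, feed it input on stdin, and compare stdout to the reference."""
    result = subprocess.run(
        command, input=input_text, capture_output=True, text=True, check=True
    )
    return outputs_match(result.stdout, reference_text)


if __name__ == "__main__":
    # Stand-in for the legacy executable (assumption): doubles the number it reads.
    legacy_command = [sys.executable, "-c", "print(2.0 * float(input()))"]
    ok = run_end_to_end(legacy_command, "21.0\n", "42.0\n")
    print("PASSED" if ok else "FAILED")
```

Comparing numbers with a tolerance rather than as strings keeps the test stable across compilers and platforms. To check that the test is a real safeguard, change the reference (or the stand-in program) and confirm that the script reports a failure.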

2) If possible, add unit tests

  • It is unrealistic to add unit tests for every single function.
  • Test central functions and functions which are important entry points between modules/sections.
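
As an illustration, unit tests for one central function might look like the sketch below. The function `kinetic_energy` is a hypothetical stand-in for a routine extracted from the legacy code. The tests are plain functions with assertions, a convention that test runners such as pytest can collect.

```python
import math


def kinetic_energy(mass, velocity):
    """Hypothetical central function extracted from the legacy code."""
    if mass < 0:
        raise ValueError("mass must be non-negative")
    return 0.5 * mass * velocity**2


def test_kinetic_energy_known_value():
    # Compare against a value computed by hand: 0.5 * 2 * 3^2 = 9.
    assert math.isclose(kinetic_energy(2.0, 3.0), 9.0)


def test_kinetic_energy_rejects_negative_mass():
    # Invalid input should fail loudly instead of returning nonsense.
    try:
        kinetic_energy(-1.0, 1.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for negative mass")
```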

3) Measure test coverage, iterate until it is sufficient

  • Use tools like gcov.
  • Use platforms like Coveralls.
  • Identify untested code and dead wood.
  • Iterate with steps 1 and 2.

4) Define interfaces

  • Identify important entry points between modules/sections.
  • Again, this is an iterative process.

5) Isolate modules behind interfaces

  • Make interface functions public and make all other functions private.
  • This is a painful step since you will identify many dependencies.

6) Refine interfaces

  • Iterate steps 4 to 6.

7) Localize global data

Bad:

my_parameter = ...

def function1(...):
    # uses my_parameter
    return ...

def function2(...):
    # uses my_parameter
    return ...

Better:

def get_my_parameter():
    ...
    return my_parameter

def function1(my_parameter, ...):
    # uses my_parameter
    return ...

def function2(my_parameter, ...):
    # uses my_parameter
    return ...

Discuss where the state is located in the two above examples.
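
The "Better" pattern can be made concrete with the runnable sketch below. All names (`get_scale_factor`, `scale`, `scale_all`) are hypothetical and chosen only for illustration.

```python
def get_scale_factor():
    """Single place where the parameter is created (assumption: a fixed value;
    in a real code it might be read from an input file)."""
    return 2.0


def scale(value, scale_factor):
    # The result depends only on the arguments, not on hidden module-level data.
    return value * scale_factor


def scale_all(values, scale_factor):
    return [scale(v, scale_factor) for v in values]


if __name__ == "__main__":
    # The caller owns the state and passes it down explicitly.
    scale_factor = get_scale_factor()
    print(scale(3.0, scale_factor))
    # A test can pass any value without modifying global data.
    print(scale_all([1.0, 2.0], scale_factor=10.0))
```

Because the functions receive everything they need as arguments, they can be tested in isolation and called with different parameters in the same run.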

8) Build modules separately into libraries

  • This will expose dependencies.

9) Test modules separately

  • This speeds up the edit-test-commit loop and helps expose dependencies and sharpen the API.
  • The minimum is to test the API.

10) Minimize dependencies

  • The fewer dependencies, the better.
  • All dependencies should be documented, since they are in fact part of the API.

11) Maximize cohesion

Bad:

def does_a_or_b(..., option_b=False):
    return ...

Better:

def does_a(...):
    return ...

def does_b(...):
    return ...
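
A concrete, hypothetical instance of the same idea is sketched below: a flag-controlled function that hides two unrelated behaviors, followed by two functions that each do one thing.

```python
# Bad: one function, two behaviors selected by a flag.
def convert_temperature(value, to_fahrenheit=False):
    if to_fahrenheit:
        return value * 9.0 / 5.0 + 32.0
    return (value - 32.0) * 5.0 / 9.0


# Better: each function has a single purpose, visible in its name.
def celsius_to_fahrenheit(celsius):
    return celsius * 9.0 / 5.0 + 32.0


def fahrenheit_to_celsius(fahrenheit):
    return (fahrenheit - 32.0) * 5.0 / 9.0
```

With the split version, a call site states what it does without the reader having to look up the meaning of a boolean argument, and each function can be tested and documented on its own.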

12) Document and version interfaces

  • It is often unrealistic to document all functions.
  • Interfaces need to be versioned and documented.
  • While inner functions may change, interfaces hopefully do not change often.

13) Outsource modules into their own repositories

  • Sufficiently large independent units should track their own development history.
  • This allows them to be incorporated into other projects without duplicating code.

14) Include external repositories with the help of CMake and possibly git submodules/subtrees

Key points

  • Introduce testing early as a safeguard.

  • Use code coverage analysis tools.

  • Cut the code at important interfaces.

  • Build and test modules separately to identify hidden dependencies.